Industry Expertise

Enterprise Analytics & Industrial IoT

Industrial clients continue to look for ways to increase safety, decrease failures, and optimize productivity. With the advent of IoT, sensors are everywhere. This constant flow of real-time data offers the promise of using data science to automate and optimize. The unique challenge in this space is deploying AI at the source of the data: at the edge. Our products have helped clients in energy, manufacturing, and agriculture reduce costs and increase productivity.

Generate geospatial visualizations to understand realtime fluid distribution

Software engineering to translate time series data into actionable insights

The Challenge

Our client uses electromagnetic fields to track fluid distribution during oil and gas well completion. They are able to display single-stage fluid patterns and needed software engineering assistance to display multiple stages concurrently.

Outcome

Using software and data engineering, we designed a Python-based web application that converts real-time data into visualizable patterns. Our client's engineering team has incorporated our enhancements into their platform.
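
Below is a minimal sketch of how readings from multiple completion stages might be overlaid in a single geospatial view with Python; the data files, column names, and plotting library are illustrative assumptions, not the client's actual implementation.

```python
# Minimal sketch (illustrative only): overlay fluid-distribution readings from
# several completion stages in one geospatial figure. File names and column
# names (easting, northing, intensity) are hypothetical.
import pandas as pd
import plotly.express as px


def plot_stages(frames):
    """Combine per-stage readings into a single scatter view, colored by stage."""
    combined = pd.concat(
        [df.assign(stage=stage) for stage, df in frames.items()],
        ignore_index=True,
    )
    return px.scatter(
        combined,
        x="easting",
        y="northing",
        color="stage",
        size="intensity",
        title="Fluid distribution across completion stages",
    )


if __name__ == "__main__":
    # Hypothetical per-stage exports from the sensing platform.
    frames = {s: pd.read_csv(f"stage_{s}.csv") for s in (1, 2, 3)}
    plot_stages(frames).show()
```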

Unify global user data to enable realtime analytics

Build ETL (extract, transform, and load) pipelines and create a realtime unified store of analytics-ready data from sources including web, social media, and app servers

The Challenge

Our client, an enterprise software platform company, has more than 40,000 installations of their log management software and needed to better understand global customer software usage. 

Outcome

We built automated data pipelines to extract, transform, and load data from multiple sources including web, social media, and application servers, creating a realtime unified store of data in S3 ready for analytics. We designed an architecture with S3 as the data lake, Athena as the query engine, CloudFormation to orchestrate nightly data aggregation, and realtime visualizations on Amazon QuickSight.
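
As one illustration of this architecture, the sketch below runs an Athena query against the S3 data lake from Python with boto3; the database, table, and bucket names are assumptions for the example, not the client's actual resources.

```python
# Illustrative only: query the S3-backed data lake through Athena via boto3.
import time

import boto3

athena = boto3.client("athena", region_name="us-east-1")


def run_query(sql, database, output_location):
    """Start an Athena query, poll until it finishes, and return the result rows."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]


rows = run_query(
    "SELECT region, COUNT(*) AS events FROM usage_events GROUP BY region",  # hypothetical table
    database="unified_analytics",                     # hypothetical Athena/Glue database
    output_location="s3://example-athena-results/",   # hypothetical results bucket
)
```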

Automatically discover themes from 6 million new web articles published daily

Develop specialized natural language processing (NLP) models to extract and organize newly published web content for a digital PR firm

The Challenge

Our client, a digital PR firm, needed to automate the discovery of concepts in articles newly published to the web each day, so they could share those concepts with their customers and drive realtime PR campaigns. Further, these 6 million articles needed to be processed within 1 hour.

Outcome

We built data engineering pipelines to preprocess text by lemmatizing it and removing stop words. Our team developed a specialized natural language processing (NLP) model to process, classify, and cluster web-based articles based on primary purpose and content. We were able to extract the most common themes present across the full set of new articles, and we optimized and parallelized the pipeline to process 6 million articles daily within 1 hour.
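
A minimal sketch of the preprocessing and clustering stages described above follows, using spaCy for lemmatization and stop-word removal and scikit-learn for TF-IDF clustering; the model choice, feature count, and cluster count are assumptions, not the production configuration.

```python
# Illustrative only: lemmatize, drop stop words, then cluster articles by TF-IDF.
import spacy
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Small English pipeline; parser and NER disabled for speed (assumed setup).
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])


def preprocess(text):
    """Lemmatize and remove stop words and punctuation."""
    doc = nlp(text)
    return " ".join(
        tok.lemma_.lower() for tok in doc if not (tok.is_stop or tok.is_punct)
    )


def cluster_articles(articles, n_clusters=50):
    """Group articles into candidate themes via TF-IDF + mini-batch k-means."""
    cleaned = [preprocess(a) for a in articles]
    vectorizer = TfidfVectorizer(max_features=50_000)
    X = vectorizer.fit_transform(cleaned)
    model = MiniBatchKMeans(n_clusters=n_clusters, random_state=0)
    labels = model.fit_predict(X)
    return labels, model, vectorizer
```

Reaching 6 million articles per hour required parallelizing this kind of pipeline across many workers; the sketch shows only the per-article logic.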

Automate data engineering normalization for public real estate data

Use data science techniques to auto-normalize and unify data for an organization focused on showing the impact of governmental policy decisions on real estate values

The Challenge

Our client focuses on showing the impact of governmental policy decisions on real estate values. Their business model depends on real estate transaction data contained in deeds from county- and town-level recorders, each with varying data formats. To support the growth and scalability of their business, our client needed an automated process to ingest these varying formats and normalize them into a unified format.

Outcome

Our team built data engineering pipelines to ingest semi-structured, heterogeneous data files and trained machine learning models to infer data types and auto-normalize them to a standard data model. The models classify fields based on morphology, string matching, and value distribution, assigning a data type by identifying the most similar distribution in past data. We worked with the client's engineering team to incorporate the data engineering pipelines and automated normalization algorithms into their platform.
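
The sketch below illustrates the general idea of inferring column types from string morphology so raw columns can be mapped to a standard model; the regular expressions, thresholds, and target field names are assumptions for the example, not the trained production models.

```python
# Illustrative only: vote on a column's type from the shape of its values,
# then rename recognized columns to a standard data model.
import pandas as pd

# Hypothetical morphology patterns for fields commonly found in deed records.
PATTERNS = {
    "parcel_id": r"\d{2}-\d{3}-\d{4}$",
    "sale_price": r"\$?\d[\d,]*(\.\d{2})?$",
    "sale_date": r"\d{1,2}/\d{1,2}/\d{4}$",
}


def infer_column_type(values):
    """Return the best-matching field name, or 'unknown' if nothing fits well."""
    sample = values.dropna().astype(str).head(500)
    if sample.empty:
        return "unknown"
    scores = {name: sample.str.match(pat).mean() for name, pat in PATTERNS.items()}
    best, score = max(scores.items(), key=lambda kv: kv[1])
    return best if score > 0.8 else "unknown"


def normalize(df):
    """Rename columns whose type could be inferred to the standard schema."""
    mapping = {col: infer_column_type(df[col]) for col in df.columns}
    return df.rename(columns={c: t for c, t in mapping.items() if t != "unknown"})
```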

Optimize a manufacturing process with anomaly detection

Train predictive models on manufacturing process time-series data to predict and prevent future faults

The Challenge

Our client, an international glass manufacturer, needed to improve product quality in their manufacturing process. They were experiencing product faults caused by inconsistencies in the manufacturing process and needed to understand where in the process the faults were being introduced to enable intervention and prevention.

Outcome

Our team aggregated time series data from 250 sensors in the manufacturing process, including flow rates, temperature, and energy expenditure. Feature engineering focused on aggregating and summarizing sensor data over various time increments. Machine learning experimentation covered linear models, advanced regression, clustered feature reduction, neural networks, sliding-window averaging, automated peak detection, and time series clustering. We identified the most predictive models, enabling the client to significantly reduce product faults.
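
The sketch below gives a flavor of the sliding-window feature engineering and peak detection mentioned above, using pandas and SciPy on a time-indexed sensor DataFrame; window sizes, thresholds, and column names are illustrative, not the tuned production values.

```python
# Illustrative only: rolling-window features and spike detection for sensor data.
# `df` is assumed to be a DataFrame of sensor readings with a DatetimeIndex.
import pandas as pd
from scipy.signal import find_peaks


def window_features(df, window="5min"):
    """Summarize each sensor over rolling time windows (mean, std, min, max)."""
    rolled = df.rolling(window)
    feats = pd.concat(
        {stat: getattr(rolled, stat)() for stat in ("mean", "std", "min", "max")},
        axis=1,
    )
    feats.columns = [f"{col}_{stat}" for stat, col in feats.columns]
    return feats


def anomalous_peaks(series, prominence=3.0):
    """Flag timestamps where a sensor spikes well above its recent baseline."""
    baseline = series.rolling("30min").mean()
    spread = series.rolling("30min").std()
    z = ((series - baseline) / spread).fillna(0)
    peaks, _ = find_peaks(z.to_numpy(), prominence=prominence)
    return series.index[peaks]
```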
