Industry Expertise

Enterprise Analytics & Industrial IoT

Industrial clients continue to look for ways to increase safety, decrease failures, and optimize productivity. With the advent of IoT, sensors are everywhere, and this constant flow of real-time data offers the promise of using data science to automate and optimize. The unique challenge in this space is deploying AI at the source of the data, at the edge. Our products have helped clients in energy, manufacturing, and agriculture reduce costs and increase productivity.

Generate geospatial visualizations to understand real-time fluid distribution

Software engineering to translate time series data into actionable insights

The Challenge

Our client uses electromagnetic fields to track fluid distribution during oil and gas well completion. They could already display single-stage fluid patterns and needed software engineering assistance to display multiple stages concurrently.

Outcome

Using software and data engineering, we designed a Python-based web application that converts real-time data into geospatial visualizations of fluid distribution across multiple stages. Our client’s engineering team has incorporated our enhancements into their platform.
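
A minimal sketch of the multi-stage overlay idea, using Plotly as a stand-in for the client’s actual visualization stack; the column names and map layer are illustrative assumptions, not the client’s schema:

import pandas as pd
import plotly.graph_objects as go

def plot_stages(readings: pd.DataFrame) -> go.Figure:
    # readings: real-time measurements with 'stage', 'lat', 'lon', and
    # 'signal' columns (illustrative schema only).
    fig = go.Figure()
    # One trace per completion stage so all stages render concurrently
    # and can be toggled individually in the legend.
    for stage, grp in readings.groupby("stage"):
        fig.add_trace(go.Scattermapbox(
            lat=grp["lat"], lon=grp["lon"],
            mode="markers",
            marker=dict(size=10, opacity=0.6),
            name=f"Stage {stage}",
            text=grp["signal"],
        ))
    fig.update_layout(
        mapbox_style="open-street-map",
        mapbox_center={"lat": readings["lat"].mean(), "lon": readings["lon"].mean()},
        mapbox_zoom=13,
    )
    return fig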

Unify global user data to enable real-time analytics

Build ETL (extract, transform, and load) pipelines to create a real-time, unified store of analytics-ready data from sources including web, social media, and app servers

The Challenge

Our client, an enterprise software platform company, has more than 40,000 installations of their log management software and needed to better understand how customers around the world use it.

Outcome

We built automated data pipelines to extract, transform, and load data from multiple sources, including web, social media, and application servers, creating a real-time, unified store of analytics-ready data in S3. The architecture uses S3 as the data lake, Athena as the query engine, CloudFormation to orchestrate nightly data aggregation, and Amazon QuickSight for real-time visualizations.
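
A minimal sketch of the query layer in this architecture, using boto3 to run an Athena query over the S3 data lake; the database, table, and bucket names are illustrative placeholders:

import time
import boto3

# Athena query layer over the S3 data lake; QuickSight dashboards read the
# aggregated results. All names below are illustrative placeholders.
athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="""
        SELECT install_id,
               date_trunc('day', event_time) AS day,
               count(*) AS events
        FROM usage_events            -- hypothetical table of usage logs
        GROUP BY install_id, date_trunc('day', event_time)
    """,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = response["QueryExecutionId"]

# Poll until Athena finishes; the result set lands in S3 for downstream use.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

In the deployed system, CloudFormation scheduled this kind of aggregation as a nightly job and QuickSight read the aggregated output for real-time visualization.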

Automatically discover themes from 6 million new web articles published daily

Develop specialized natural language processing (NLP) models to extract and organize newly published web content for a digital PR firm

The Challenge

Our client, a digital PR firm, needed to automate the discovery of concepts in articles newly published to the web each day and share them with their customers to drive real-time PR campaigns. Further, these 6 million daily articles had to be processed within one hour.

Outcome

We built data engineering pipelines to pre-process the raw text, lemmatizing tokens and removing stop words. Our team then developed a specialized natural language processing (NLP) model to process, classify, and cluster web articles by primary purpose and content, which let us extract the most common themes across the full set of new articles. We optimized and parallelized the pipeline to process the 6 million daily articles within one hour. Our client’s engineering team incorporated the data engineering pipelines and NLP algorithms into their product platform.
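
A minimal sketch of the preprocess-then-cluster approach, with spaCy and scikit-learn standing in for the production NLP models; the cluster count and feature limit are illustrative:

import spacy
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Lightweight English pipeline; the parser and NER are not needed here.
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])

def preprocess(text: str) -> str:
    # Lemmatize and drop stop words, punctuation, and whitespace tokens.
    doc = nlp(text)
    return " ".join(
        t.lemma_.lower() for t in doc
        if not (t.is_stop or t.is_punct or t.is_space)
    )

def discover_themes(articles: list[str], n_themes: int = 50, top_terms: int = 10) -> list[list[str]]:
    cleaned = [preprocess(a) for a in articles]
    # Vectorize and cluster; each cluster approximates one theme.
    vectorizer = TfidfVectorizer(max_features=50_000)
    X = vectorizer.fit_transform(cleaned)
    km = MiniBatchKMeans(n_clusters=n_themes, random_state=0).fit(X)
    # The highest-weighted terms near each centroid summarize the theme.
    terms = vectorizer.get_feature_names_out()
    return [
        [terms[j] for j in center.argsort()[-top_terms:][::-1]]
        for center in km.cluster_centers_
    ]

In production, the preprocessing and scoring ran in parallel across workers (for example via batched nlp.pipe calls) to meet the one-hour window.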

Automate data normalization for public real estate data

Use data science techniques to auto-normalize and unify data for an organization focused on showing the impact of governmental policy decisions on real estate values

The Challenge

Our client focuses on showing the impact of governmental policy decisions on real estate values. Their business model depends on real estate transaction data contained in deeds from county- and town-level recorders, each with varying data formats. To support the growth and scalability of their business, our client needed an automated process to ingest data in these varying formats and normalize it into a unified format.

Outcome

Our team built data engineering pipelines to ingest semi-structured, heterogeneous data files and trained machine learning models to infer data types and automatically normalize them to a standard, unified data model. The models classify fields based on their morphology, string matching, and value distributions, assigning each field the data type whose distribution in past data it most closely matches. We worked with the client’s engineering team to incorporate the data engineering pipelines and automated normalization algorithms into their platform.
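
A minimal sketch of the type-inference idea, using hand-picked morphology, string-matching, and distribution features with a scikit-learn classifier standing in for the production models; the feature set and type labels are illustrative:

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def column_features(values: pd.Series) -> list[float]:
    # Morphology, string-matching, and distribution features for one column.
    s = values.dropna().astype(str)
    lengths = s.str.len()
    return [
        s.str.match(r"^\$?\d[\d,]*(?:\.\d+)?$").mean(),         # looks like a dollar amount
        s.str.match(r"^\d{1,4}[-/]\d{1,2}[-/]\d{1,4}").mean(),  # looks like a date
        s.str.isnumeric().mean(),                               # purely numeric strings
        lengths.mean(),
        lengths.std(ddof=0),                                    # string-length distribution
        s.nunique() / max(len(s), 1),                           # uniqueness ratio
    ]

def fit_type_classifier(labeled_columns: list[tuple[pd.Series, str]]) -> RandomForestClassifier:
    # labeled_columns: (column values, known type) pairs from previously
    # normalized deeds, e.g. (prices, "sale_price"), (dates, "recording_date").
    X = np.array([column_features(col) for col, _ in labeled_columns])
    y = [label for _, label in labeled_columns]
    return RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)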

Optimize a manufacturing process with anomaly detection

Train predictive models on manufacturing process time-series data to predict and prevent future faults

The Challenge

Our client, an international glass manufacturer, needed to improve product quality in their manufacturing process. They were experiencing product faults caused by inconsistencies in the process and needed to understand where the faults were being introduced in order to intervene and prevent them.

Outcome

Our team aggregated one year of time series data from 250 sensors, each capturing readings every 20 seconds at different locations in the manufacturing process, covering flow rates, temperature, energy expenditure, and more. Fault counts were derived from product photos captured every 10 minutes. Data engineering allowed us to create new features by aggregating and summarizing sensor data over various time increments and calculating differences between sensors. We ran extensive machine learning experiments, including linear models, advanced regression, clustered feature reduction, neural networks and deep learning, sliding-window averaging, automated peak detection, and time series clustering, and identified the models most predictive for identifying and preventing future faults. We provided the results of the data engineering and model experimentation to the client’s engineering team, and based on the insights, the client was able to modify the manufacturing process and significantly reduce product faults.
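
A minimal sketch of the feature-engineering and modeling step, assuming a hypothetical DataFrame of 20-second sensor readings and a series of 10-minute fault counts; the sensor column names are placeholders:

import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

def build_features(sensors: pd.DataFrame, faults: pd.Series) -> tuple[pd.DataFrame, pd.Series]:
    # sensors: 20-second readings indexed by timestamp, one column per sensor.
    # faults: fault counts per 10-minute window, derived from the product photos.
    # Summarize every sensor over the same 10-minute windows as the fault counts.
    agg = sensors.resample("10min").agg(["mean", "std", "min", "max"])
    agg.columns = ["_".join(col) for col in agg.columns]
    # Differences between sensors at adjacent points in the process
    # ("furnace_temp" and "forming_temp" are placeholder column names).
    agg["furnace_minus_forming_temp"] = (
        sensors["furnace_temp"].resample("10min").mean()
        - sensors["forming_temp"].resample("10min").mean()
    )
    return agg, faults.reindex(agg.index)

def fit_fault_model(features: pd.DataFrame, faults: pd.Series) -> GradientBoostingRegressor:
    # One of several model families explored during experimentation.
    return GradientBoostingRegressor(random_state=0).fit(features.fillna(0), faults.fillna(0))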
