Our client ran an ECG preprocessing and feature extraction pipeline in Matlab. The Matlab version required costly licensing and was difficult to deploy at scale in the cloud. The pipeline was also bespoke to a single clinical trial and could not be generalized to new clinical trials with differently structured data and study designs. It lacked input validation, missing-data handling, multi-lead support, datetime tracking, and other functionality needed to run on any new incoming dataset.
We converted the Matlab pipeline to Python, a free, cloud-friendly language. Unit tests verified that the Python outputs exactly matched the Matlab outputs on example data. We added functionality so the pipeline generalizes across different input structures and varying levels of input data quality. The Python version was deployed at scale in the cloud within the client's internal system, which builds from a git repository, imports the module, and accepts inputs from S3. The resulting transformer was designed to be accessed and deployed through their user interface.
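As an illustration of the kind of parity check described above, here is a minimal sketch using only the Python standard library. It is not the client's actual test suite: the function name `assert_features_match`, the feature names, and the sample values are all hypothetical. It assumes each pipeline's output can be represented as a mapping from feature name to a list of floats, with NaN marking missing data. Both tolerances default to zero, so the comparison is exact unless a tolerance is passed explicitly.

```python
import math
from typing import Mapping, Sequence


def assert_features_match(
    reference: Mapping[str, Sequence[float]],
    candidate: Mapping[str, Sequence[float]],
    rel_tol: float = 0.0,
    abs_tol: float = 0.0,
) -> None:
    """Raise AssertionError if two feature tables differ.

    `reference` stands in for values exported from the Matlab pipeline and
    `candidate` for values produced by the Python port (both hypothetical).
    With both tolerances at zero, values must be exactly equal.
    """
    # Both outputs must contain the same set of features.
    mismatched = set(reference) ^ set(candidate)
    if mismatched:
        raise AssertionError(f"feature names differ: {sorted(mismatched)}")

    for name, ref_values in reference.items():
        new_values = candidate[name]
        if len(ref_values) != len(new_values):
            raise AssertionError(
                f"{name}: length {len(new_values)} != {len(ref_values)}"
            )
        for i, (ref, new) in enumerate(zip(ref_values, new_values)):
            # Treat NaN in both outputs as a match (missing data in both).
            if math.isnan(ref) and math.isnan(new):
                continue
            if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
                raise AssertionError(f"{name}[{i}]: {new!r} != {ref!r}")


# Made-up example values, for illustration only.
matlab_output = {"rr_interval_ms": [812.0, 790.0, float("nan")]}
python_output = {"rr_interval_ms": [812.0, 790.0, float("nan")]}
assert_features_match(matlab_output, python_output)  # passes silently
```

A check like this can be called from any unit test framework. Passing a small `rel_tol` would relax the comparison if bit-for-bit equality across the two languages proved too strict for a given feature.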
This generalized, GxP-compliant ECG pipeline can be used to support digital biomarker discovery across many clinical trials and indications.