TL;DR:
- AI models can predict fetal well-being using cardiotocography (CTG) signals.
- Deep learning methods may improve CTG interpretation and help reduce its high false-positive rates.
- Combining fetal heart rate (FHR) and uterine contractions (UC) data enhances model performance.
The Future of Fetal Monitoring
Imagine a world where artificial intelligence (AI) can accurately predict the well-being of a fetus during pregnancy and labour. That prospect is moving closer to reality: recent advances in AI and deep learning could change how fetal monitoring is done, with the potential to improve neonatal outcomes and reduce the burden on healthcare providers. In this article, we look at recent work on developing and evaluating machine learning models for cardiotocography (CTG) interpretation.
Understanding Cardiotocography (CTG)
Cardiotocography (CTG) is a crucial technique used during pregnancy and labour to monitor fetal well-being. It records the fetal heart rate (FHR) and uterine contractions (UC); in external monitoring, FHR is typically captured with a Doppler ultrasound transducer and UC with a pressure-sensitive tocodynamometer. CTG can be done continuously or intermittently, with sensors placed externally or internally.
Currently, healthcare providers interpret CTG recordings using guidelines from organisations like the National Institute of Child Health and Human Development (NICHD) or the International Federation of Gynecology and Obstetrics (FIGO). These guidelines define patterns in CTG and FHR traces that may indicate fetal distress.
The Role of AI in Improving CTG Interpretation
Despite its widespread use, CTG interpretation is complex and subjective, leading to high false-positive rates and intra- and inter-observer variability. This is particularly challenging in low-resource settings where access to skilled interpreters is limited.
Enter AI. Recent research has focused on using deep learning methods to improve CTG interpretation. These methods take the physiological time series directly as input, rather than relying on hand-engineered features, which allows a more comprehensive analysis than traditional feature extraction techniques.
Developing and Evaluating Deep Learning Models for CTG
In a recent paper titled “Development and evaluation of deep learning models for cardiotocography interpretation,” researchers developed end-to-end neural network-based models to predict measures of fetal well-being. These models were trained on an open-source CTG dataset, the CTU-UHB Intrapartum Cardiotocography Database, which includes 552 FHR and UC CTG signal pairs.
Model Architecture
The researchers began with the CTG-net network architecture, which convolves the paired FHR and UC input signals temporally before conducting a depthwise convolution to learn the relationship between them. They then explored several methodological configurations, including architecture and hyperparameter optimization, single-input variants (for example, FHR only), and the addition of clinical metadata.
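The two-stage idea can be illustrated with a minimal plain-Python sketch. This is not the CTG-net implementation: the function names `temporal_conv` and `depthwise_combine`, the kernel, and the channel weights are all hypothetical, and in a real network the weights would be learned rather than fixed.

```python
def temporal_conv(signal, kernel):
    """Slide one filter along the time axis of a single channel (valid mode)."""
    k = len(kernel)
    return [
        sum(signal[t + i] * kernel[i] for i in range(k))
        for t in range(len(signal) - k + 1)
    ]


def depthwise_combine(fhr_features, uc_features, w_fhr, w_uc):
    """Mix the two channels at each time step using one weight per channel."""
    return [w_fhr * f + w_uc * u for f, u in zip(fhr_features, uc_features)]


# Toy inputs: illustrative values only, not real CTG data.
fhr = [140.0, 142.0, 141.0, 138.0, 135.0, 137.0]
uc = [10.0, 12.0, 20.0, 35.0, 30.0, 15.0]

# Stage 1: the same temporal filter runs over each channel separately.
smoothing_kernel = [0.25, 0.5, 0.25]  # hypothetical, fixed for illustration
fhr_t = temporal_conv(fhr, smoothing_kernel)
uc_t = temporal_conv(uc, smoothing_kernel)

# Stage 2: a cross-channel step relates FHR to UC at each time step.
mixed = depthwise_combine(fhr_t, uc_t, w_fhr=1.0, w_uc=-0.5)  # hypothetical weights
print(len(mixed))
```

Keeping the temporal step separate from the cross-channel step means the first stage only has to describe each signal over time, while the second stage is the only place where FHR and UC interact.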
Pre-processing and Pre-training
To improve data quality, the researchers created a pre-processing pipeline that included imputing missing measurements and downsampling, with random cropping and additive multiscale noise used for data augmentation. This generated a large dataset for pre-training and training the models.
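A rough sketch of such a pipeline in plain Python might look like the following. The details are assumptions rather than the researchers' method: linear interpolation stands in for the imputation step, summing Gaussian noise at several standard deviations is only one plausible reading of "multiscale" noise, and every function name and parameter value here is hypothetical.

```python
import random


def impute_missing(signal):
    """Fill None gaps by linear interpolation; edge gaps copy the nearest valid value."""
    known = [i for i, v in enumerate(signal) if v is not None]
    if not known:
        raise ValueError("signal has no valid samples")
    out = list(signal)
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = signal[right]
        elif right is None:
            out[i] = signal[left]
        else:
            frac = (i - left) / (right - left)
            out[i] = signal[left] + frac * (signal[right] - signal[left])
    return out


def random_crop(signal, length, rng):
    """Pick a random contiguous segment of the requested length."""
    start = rng.randrange(len(signal) - length + 1)
    return signal[start:start + length]


def add_noise(signal, scales, rng):
    """Add zero-mean Gaussian noise drawn at each of several standard deviations."""
    return [v + sum(rng.gauss(0.0, s) for s in scales) for v in signal]


def downsample(signal, factor):
    """Keep every `factor`-th sample."""
    return signal[::factor]


rng = random.Random(0)  # seeded so the augmentation is repeatable
raw = [140.0, None, 144.0, 143.0, None, None, 137.0, 138.0]  # toy FHR trace with gaps
clean = impute_missing(raw)
crop = random_crop(clean, length=6, rng=rng)
augmented = add_noise(crop, scales=[0.1, 0.5], rng=rng)
reduced = downsample(augmented, factor=2)
print(len(clean), len(crop), len(reduced))
```

Because cropping and noise are random, running the augmentation steps repeatedly on the same recording yields many distinct training examples, which is how a small dataset can be stretched for pre-training.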
Intermittent versus Continuous CTG Use Cases
CTG use comes in two primary formats: intermittent and continuous. In high-resource settings, continuous CTGs are used to monitor fetal heart rate throughout labour. In low-resource settings, intermittent CTGs are often used, covering only about 30 minutes at any point during labour.
The researchers simulated intermittent settings by splitting the 90-minute signals in the dataset into 30-minute signals and training and evaluating the model at different time points. This showed how the choice of time point for training and evaluation affects model performance.
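The splitting step can be sketched in a few lines of Python. The 4 Hz sampling rate, the non-overlapping windows, and the helper name `split_into_windows` are assumptions made for illustration; the article does not state how the windows were cut.

```python
def split_into_windows(signal, sampling_hz, window_minutes):
    """Split a signal into consecutive, non-overlapping windows of fixed duration."""
    window_len = int(sampling_hz * 60 * window_minutes)
    n_windows = len(signal) // window_len  # any trailing partial window is dropped
    return [signal[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


SAMPLING_HZ = 4  # assumed sampling rate, not stated in the article
signal_90_min = [0.0] * (SAMPLING_HZ * 60 * 90)  # placeholder 90-minute trace

windows = split_into_windows(signal_90_min, SAMPLING_HZ, window_minutes=30)
print(len(windows), len(windows[0]))  # 3 windows of 7200 samples each
```

Each window can then be treated as a separate intermittent recording, so a model can be trained on one time point and evaluated on another.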
Predicting Objective and Subjective Ground Truth Labels
The researchers used three outcome labels from the dataset:
- Arterial umbilical cord blood pH: An objective measurement that tracks fetal acidosis, an indication of fetal distress.
- Apgar score: A subjective measure recorded by a clinician after delivery that reflects the general health of the newborn.
- Abnormal label: A combined label that is positive if either the Apgar or the pH result was abnormal.
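As a sketch of how the three labels relate, the snippet below derives them from a pH value and an Apgar score. The cutoffs are hypothetical placeholders, since the article does not give the thresholds the researchers used, and it does not say which Apgar timing (1-minute or 5-minute) was taken.

```python
PH_THRESHOLD = 7.05   # hypothetical cutoff: lower pH is treated as abnormal
APGAR_THRESHOLD = 7   # hypothetical cutoff: lower Apgar is treated as abnormal


def make_labels(ph, apgar):
    """Return the pH label, the Apgar label, and the combined abnormal label."""
    ph_abnormal = ph < PH_THRESHOLD
    apgar_abnormal = apgar < APGAR_THRESHOLD
    return {
        "ph_abnormal": ph_abnormal,
        "apgar_abnormal": apgar_abnormal,
        "abnormal": ph_abnormal or apgar_abnormal,  # positive if either is abnormal
    }


print(make_labels(ph=7.25, apgar=9))
print(make_labels(ph=7.00, apgar=8))
```

With these placeholder cutoffs, the second example is flagged as abnormal overall even though its Apgar score is in the normal range, because the combined label only needs one of the two measures to be abnormal.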
Evaluating Model Prediction Robustness
The researchers performed several comparisons to evaluate model performance, including:
- Performance on the dataset versus the state-of-the-art CTG-net model.
- Apgar versus pH classification tasks.
- FHR-only versus FHR+UC.
- Base model using the last 30 minutes of labour (continuous case) versus intermittent measurements.
- Base model of FHR+UC versus FHR+UC+Metadata.
- Subgroup performance of the base model (FHR+UC) with subgroups determined by binarizing clinical metadata.
The results showed that combining FHR+UC achieved the highest model performance for both pH and Apgar classification. The pre-training step enabled the highest model performance, and adding clinical metadata slightly improved model performance for pH but less so for Apgar.
Subgroup Evaluations
The researchers found significant differences in baseline performance between subgroups with frequent and infrequent UC signal gaps for pH prediction, and between subgroups with frequent and infrequent FHR signal gaps for Apgar prediction. With metadata, the performance disparities observed with pH prediction were mitigated. However, including metadata increased the AUROC performance disparities for demographic and clinical subgroups on this task.
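A subgroup comparison of this kind can be sketched as follows: binarize one metadata field at a threshold, then compute AUROC separately for each side. The records, the `maternal_age` field, and the threshold are made-up illustrations, not values from the study, and the pairwise AUROC below is a simple reference implementation suited only to small samples.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_subgroup(records, metadata_key, threshold):
    """Binarize one metadata field at `threshold` and score each subgroup separately."""
    groups = {"below": [], "at_or_above": []}
    for r in records:
        name = "at_or_above" if r[metadata_key] >= threshold else "below"
        groups[name].append(r)
    return {
        name: auroc([r["label"] for r in rows], [r["score"] for r in rows])
        for name, rows in groups.items()
    }


# Toy records: hypothetical field name and values, for illustration only.
records = [
    {"maternal_age": 25, "label": 1, "score": 0.9},
    {"maternal_age": 28, "label": 0, "score": 0.2},
    {"maternal_age": 30, "label": 0, "score": 0.4},
    {"maternal_age": 36, "label": 1, "score": 0.6},
    {"maternal_age": 38, "label": 0, "score": 0.7},
    {"maternal_age": 40, "label": 1, "score": 0.8},
    {"maternal_age": 41, "label": 0, "score": 0.3},
]

result = auroc_by_subgroup(records, "maternal_age", threshold=35)
print(result)  # {'below': 1.0, 'at_or_above': 0.75}
```

The gap between the two subgroup scores is the kind of disparity the paragraph above describes; comparing that gap with and without metadata as a model input shows whether metadata narrows or widens it.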
Open CTG Model for Research Use Cases
The researchers are currently exploring open-sourcing their models, hoping that other researchers and stakeholders can build on this work with their own datasets to evaluate it for their clinical use cases.
Limitations and Future Work
The study had limitations that constrain the generalizability of the findings, notably its reliance on a single open dataset of 552 recordings. Future investigations should involve a larger and more diverse dataset sourced from maternity centers worldwide, encompassing varied clinical contexts, demographics, and outcomes. Additionally, further work is needed to understand how such prediction algorithms can be optimally integrated into clinical workflows to improve neonatal outcomes.
The Promise of AI in Fetal Monitoring
The development and evaluation of deep learning models for CTG interpretation hold immense promise for improving fetal monitoring and neonatal outcomes. By leveraging AI, healthcare providers can gain objective interpretation assistance, reducing the burden and potentially improving fetal outcomes.