July 18, 2025

Day 39 - Advancing ECG Diagnostics Through Multimodal AI Fusion

By Ayomide Jeje

What I Learned

Today at my internship, I deepened my understanding of multimodal AI model integration for ECG signal analysis by exploring and implementing early, late, and intermediate fusion strategies. The goal was to build a more accurate diagnostic system by combining different representations of ECG data — specifically, time-domain signals, frequency-domain features, and spectrogram images. The day began with a review session focused on the theoretical differences between the three fusion techniques. Early fusion involves concatenating features before model input, which can be powerful but requires precise alignment. Late fusion combines the outputs of independently trained models, offering flexibility and modularity. Intermediate fusion — the most complex of the three — merges internal representations at hidden layers, allowing for richer interactions between modalities. After reviewing the pros and cons of each, I started with early fusion. I normalized the time-domain and frequency-domain data to ensure compatibility, then concatenated them before feeding the combined vector into a new model. However, I noticed that the training accuracy plateaued quickly, possibly due to the imbalance in the number of features contributed by each modality. Next, I shifted to late fusion. I trained two separate models — one on time-domain data using a 1D CNN, and another on 2D spectrograms using a 2D CNN — and then combined their output probabilities using a weighted average. This approach yielded slightly better performance and revealed the strength of modular specialization: each model was able to optimize for its specific data representation.

Blockers

No blockers.

Reflection

Today at my internship, I found myself diving deeper into the complexities of building intelligent systems for ECG analysis. What started as a technical task — implementing early, late, and intermediate fusion models — turned out to be a meaningful exercise in patience, learning, and problem-solving. I began the day revisiting the theoretical underpinnings of fusion techniques. While I had read about them before, today felt different. I wasn’t just learning definitions — I was applying them to real medical data with real implications. Early fusion, where features are combined before entering the model, seemed straightforward at first, but in practice, it revealed how tricky feature alignment can be. I tried merging time- and frequency-domain data, but quickly noticed performance limitations. That failure, though frustrating, reminded me that a model is only as good as the relationships it can extract — and sometimes, forcing different data types together too early weakens those relationships.