July 17, 2025
Day 38 - Fusing Intelligence: Advancing ECGNet Through Multimodal Integration
What I Learned
Today at my internship, our research took a significant step forward as we began experimenting with model fusion strategies for our project, ECGNet — a hybrid deep learning system designed to diagnose cardiovascular conditions using ECG data. After weeks of building and training separate models on different feature types — time-domain signals (1D-CNN), frequency-domain representations (1D-CNN + Transformer), and spectrograms (2D-CNN) — the focus has now shifted toward combining these models in a way that maximizes diagnostic performance. We explored three primary approaches: early fusion, intermediate fusion, and late fusion — each offering a different perspective on how to integrate insights across modalities. Throughout the day, I implemented these fusion schemes, debugged shape mismatches, and evaluated performance shifts across our validation set. I also began preparing a comparative analysis to help us decide which fusion strategy provides the best trade-off between accuracy, complexity, and explainability. Today’s work made me realize that the heart of deep learning innovation lies not only in individual models but in how we intelligently bring them together. Fusion isn’t just a technical task — it’s a design philosophy that mirrors the complexity of real-world signals. As ECGNet evolves, this fusion phase may be the key that unlocks truly robust, multimodal cardiac diagnostics.
Blockers
No blockers.
Reflection
Today’s internship experience marked a turning point in my journey with the ECGNet project. After weeks of working with individual models trained on time-domain, frequency-domain, and spectrogram-based ECG data, we began exploring something far more complex and intellectually stimulating — model fusion. This shift wasn’t just technical; it pushed me to think more deeply about how different perspectives (or in this case, modalities) can be brought together to solve a single problem more effectively.The most intellectually demanding part of the day was intermediate fusion, where we tried to combine the hidden features — the “thought process” — of each model midway through their forward pass. This method required me to modify architectures, extract internal representations, and design a system where the fused features could meaningfully interact. It was challenging, but I enjoyed it. It felt like I was no longer just training models, but orchestrating a conversation between them