July 24, 2025
Day 43 - Enhancing Multimodal Robustness + Research Sync-Up
What I Learned
Today marked another meaningful stride in my internship journey. I started the day by meeting with my professor to review progress on our multimodal ECG classification system. During our discussion, we went over the three model pipelines (the 1D CNN for time-domain signals, the 2D CNN for spectrograms, and the hybrid 1D CNN + Transformer for frequency-domain signals) and aligned on upcoming goals for evaluating real-world performance.

One of the major tasks I worked on today was simulating noise in the multimodal validation sets. The goal was to test the robustness and generalization capacity of each model by exposing it to signal distortions that mimic real-world clinical conditions. I introduced Gaussian noise into both the time-domain and spectrogram inputs, calibrating the noise level so it stays realistic without overpowering the signal (a minimal sketch of this kind of noise injection is included below). This required reshaping and validating each input pipeline independently. I also prepared the hybrid (1D CNN + Transformer) inputs to handle these augmented variants, making sure the signal structure wasn't disrupted.

After implementing the noise-injected evaluation, I ran inference and recorded the fused predictions from the weighted ensemble model. Early results suggest that while accuracy on clean data remains high, performance varies noticeably under noisy conditions, which opens up important conversations around regularization and augmentation strategies.
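For reference, here is a minimal sketch of the kind of noise injection I'm describing. The function name, the 20 dB SNR target, and the array shapes are placeholders rather than the exact values from our pipeline; the idea is simply to scale zero-mean Gaussian noise to a target signal-to-noise ratio so the distortion challenges the models without drowning out the ECG morphology.

```python
import numpy as np

def add_gaussian_noise(x: np.ndarray, snr_db: float = 20.0, seed: int = 0) -> np.ndarray:
    """Add zero-mean Gaussian noise scaled to a target SNR (in dB).

    Works for both 1D time-domain ECG segments and 2D spectrogram arrays,
    since the scaling is based on the overall signal power.
    """
    rng = np.random.default_rng(seed)
    signal_power = np.mean(x ** 2)
    # Convert the target SNR from dB to a linear ratio, then derive the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(loc=0.0, scale=np.sqrt(noise_power), size=x.shape)
    return x + noise

# Example with placeholder shapes: a batch of time-domain ECG windows
# and a batch of spectrograms.
ecg_batch = np.random.randn(8, 1, 3600)      # (batch, channels, samples)
spec_batch = np.random.rand(8, 1, 128, 128)  # (batch, channels, freq bins, time frames)

noisy_ecg = add_gaussian_noise(ecg_batch, snr_db=20.0)
noisy_spec = add_gaussian_noise(spec_batch, snr_db=20.0)
```

Keeping the calibration in SNR terms (rather than a fixed standard deviation) is what lets the same augmentation apply sensibly to both the raw signals and the spectrograms.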
Blockers
No blockers.
Reflection
Today was a solid blend of technical work and academic alignment. I had a meeting with my professor this morning to discuss how far the project has come and what's next. We talked through the current status of the three models: the 1D CNN for time-domain ECG signals, the 2D CNN for spectrograms, and the hybrid 1D CNN + Transformer for frequency-domain inputs. It was helpful to get clarity on expectations, especially as we move toward evaluation and robustness testing.

After the meeting, I focused on something I've been thinking about for a while: making the models stronger against real-world noise. In actual hospital environments, ECG readings aren't perfect; they come with all sorts of noise, from electronic interference to patient movement. To simulate that, I added Gaussian noise to both the time-domain and spectrogram validation datasets. Getting the noise level right was tricky; it had to challenge the models without completely distorting the signal. Once that was set up, I passed the noisy data through all three models and fused the predictions to test how well the ensemble could handle it (a rough sketch of that fusion step is at the end of this entry). It was interesting to see the small shifts in accuracy, especially in the hybrid model. It made me realize how important it is to train and test models in conditions that mimic reality, not just ideal lab setups.

Overall, today reminded me that building good AI isn't just about hitting accuracy numbers. It's about resilience: making sure your model still performs when the world isn't perfect. That's something I'm really starting to appreciate as this project progresses.
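For completeness, here is a minimal sketch of what the weighted fusion over the three models' softmax outputs could look like. The helper name, the weights, and the class/sample counts are illustrative assumptions, not the values we actually use.

```python
import numpy as np

def fuse_predictions(prob_list, weights):
    """Fuse per-model class probabilities with a weighted average.

    prob_list : list of arrays, each shaped (num_samples, num_classes),
                e.g. softmax outputs from the 1D CNN, 2D CNN, and hybrid model.
    weights   : one non-negative weight per model; normalized to sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    fused = sum(w * p for w, p in zip(weights, prob_list))
    return fused.argmax(axis=1)  # final class prediction per sample

# Example with placeholder probabilities: three models, 5 samples, 4 classes.
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(4), size=5) for _ in range(3)]
labels = fuse_predictions(probs, weights=[0.4, 0.3, 0.3])  # hypothetical weights
print(labels)
```

Comparing the fused predictions on clean versus noise-injected validation data is essentially the robustness check described above.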