July 11, 2025
Day 34 - Building the 2D CNN Model for ECG Spectrogram Classification
What I Learned
Today, I officially began working on the 2D Convolutional Neural Network (2D CNN) model for ECG signal classification using spectrogram images generated from the PTB-XL dataset. This marks the next phase of the project, shifting from 1D time-series and frequency-domain models to image-based deep learning approaches. The first part of the day was focused on reviewing the spectrogram preprocessing pipeline to ensure the input data was correctly formatted and normalized for the CNN. I confirmed the resolution, dimensions, and color channels of the spectrogram images, and verified that the labels were correctly aligned with each image. After that, I began building the initial 2D CNN architecture using PyTorch. The architecture currently includes multiple convolutional and max-pooling layers, ReLU activation functions, and dropout for regularization. I also added flattening and fully connected layers to support classification across the five target ECG categories.
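To make the architecture description concrete, here is a minimal sketch of what such a model could look like in PyTorch. It assumes 3-channel spectrogram images resized to 128x128 and five output classes; the layer widths, dropout rate, and input size are illustrative assumptions, not the exact values used in the project.

```python
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    """Minimal 2D CNN for 5-class ECG spectrogram classification (illustrative sizes)."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            # Block 1: 3 x 128 x 128 -> 16 x 64 x 64
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            # Block 2: 16 x 64 x 64 -> 32 x 32 x 32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            # Block 3: 32 x 32 x 32 -> 64 x 16 x 16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p=0.5),            # dropout for regularization
            nn.Linear(64 * 16 * 16, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_classes),  # five target ECG categories
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Quick shape check on a dummy batch of 8 spectrograms
if __name__ == "__main__":
    model = SpectrogramCNN()
    dummy = torch.randn(8, 3, 128, 128)
    print(model(dummy).shape)  # expected: torch.Size([8, 5])
```

The dummy forward pass at the end mirrors the kind of sanity check described above: confirming that image dimensions, channels, and the number of output classes all line up before training begins.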
Blockers
No blockers.
Reflection
Today marked the beginning of my work on the 2D CNN model for ECG classification, and it felt like stepping into a new chapter of the project. After weeks of working with 1D time-series and frequency-domain data, switching to image-based learning using spectrograms pushed me to think differently — more like a computer vision task than a traditional signal processing one. I spent the first part of the day making sure the spectrogram images were clean, properly shaped, and correctly labeled. It was a reminder of how crucial data preparation is, especially in medical AI where even small misalignments can skew results. Once I had the data ready, I moved on to building the initial CNN architecture in PyTorch. I started simple: convolutional and pooling layers, ReLU activations, dropout, and a fully connected output layer. As I built, I found myself comparing different architectural styles — from classic VGG blocks to more modern ResNet ideas — and asking myself what balance of complexity and performance made sense for our dataset size.
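For reference, the two architectural styles mentioned above differ mainly in how layers are composed. The sketch below contrasts a VGG-style block (stacked 3x3 convolutions followed by pooling) with a ResNet-style residual block (a shortcut connection added to the convolutional path); it is a generic illustration of the two patterns, not the project's chosen design.

```python
import torch
import torch.nn as nn

# VGG-style block: stacked 3x3 convolutions, then downsampling via max-pooling.
def vgg_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

# ResNet-style block: identity shortcut added back onto the convolutional path.
class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # skip connection eases training of deeper networks

# Shape check on a dummy feature map
x = torch.randn(2, 16, 64, 64)
print(vgg_block(16, 32)(x).shape)  # torch.Size([2, 32, 32, 32])
print(ResidualBlock(16)(x).shape)  # torch.Size([2, 16, 64, 64])
```

For a dataset the size of PTB-XL's spectrogram set, the trade-off is roughly depth and capacity versus overfitting risk, which is the balance the reflection above is weighing.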