This repository contains a hands-on exploration of Convolutional Neural Networks (CNNs) for image classification using the MNIST handwritten digit dataset.
This assignment guides you through building, training, and analyzing CNNs to understand how architectural choices impact model performance. You'll experiment with different configurations, compare CNNs to traditional feedforward networks, and reflect on how specialized architectures could enhance real-world projects.
- Build and train CNNs for image classification
- Experiment with architectural hyperparameters (filter counts, kernel sizes)
- Visualize model training performance through accuracy and loss curves
- Compare CNN performance against feedforward neural networks
- Understand when to apply specialized architectures vs. traditional models
DSMII-Assignment-7/
├── README.md # This file
└── starter_notebook.ipynb # Main assignment notebook
The starter_notebook.ipynb includes:
- Setup & Data Loading: Import libraries and load the MNIST dataset
- Baseline CNN: Build a 2-layer convolutional network with standard hyperparameters (a sketch of this baseline follows this list)
- Filter Count Experiments: Test networks with fewer (16, 32) and more (64, 128) filters
- Kernel Size Experiments: Explore 5x5 kernels and mixed kernel configurations
- Best Architecture: Train your optimal configuration based on experiments
- Performance Visualization: Plot accuracy and loss curves over training epochs
- Feedforward Comparison: Build a traditional neural network and compare results
- Final Project Reflection: Consider how specialized architectures apply to your project
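The following is a minimal sketch of the kind of baseline CNN the notebook asks for, assuming the Keras Sequential API and the 32/64-filter, 3x3-kernel configuration described below; the helper name `build_baseline_cnn` and the dense layer sizes are illustrative, and your TODO code may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_baseline_cnn(filters1=32, filters2=64, kernel_size=(3, 3)):
    """Two convolutional blocks followed by a small dense classifier."""
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),          # MNIST grayscale images
        layers.Conv2D(filters1, kernel_size, activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(filters2, kernel_size, activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),       # illustrative hidden size
        layers.Dense(10, activation="softmax"),    # one class per digit 0-9
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The filter-count and kernel-size experiments can reuse the same function by passing different arguments (e.g. `build_baseline_cnn(16, 32)` or `kernel_size=(5, 5)`).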
- Python 3.x
- VS Code with Jupyter extension installed
- TensorFlow, NumPy, Pandas, and Matplotlib
- Open the workspace in VS Code
- Open starter_notebook.ipynb
- Install dependencies by running the first code cell (the pip install command)
- Execute cells sequentially, completing the TODO sections
- Complete analysis questions in markdown cells throughout the notebook
The notebook uses TensorFlow/Keras to build neural networks and trains them on the MNIST dataset (28x28 grayscale images of handwritten digits 0-9).
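As a rough sketch of that setup step, the data can be loaded and normalized with Keras as follows; the exact preprocessing in the starter notebook's first cells may be organized differently, and the scaling/reshaping shown here is an assumption.

```python
import tensorflow as tf

# Load the MNIST digits: 60,000 training and 10,000 test images
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1] and add a channel dimension
x_train = (x_train / 255.0).reshape(-1, 28, 28, 1)
x_test = (x_test / 255.0).reshape(-1, 28, 28, 1)

print(x_train.shape)  # (60000, 28, 28, 1)
```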
- Baseline CNN: 32 and 64 filters with 3x3 kernels
- Filter Variations: Compare 16/32 vs 64/128 filter configurations
- Kernel Variations: Test 5x5 kernels and mixed kernel sizes
- Architecture Comparison: CNN vs. feedforward network on image data (see the sketch after this list)
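For the architecture comparison, a minimal sketch of a feedforward baseline is shown below, assuming a simple two-hidden-layer dense network on flattened 28x28 images; the helper name `build_feedforward` and the layer widths (128, 64) are illustrative, not the notebook's required values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_feedforward():
    """Fully connected network that ignores the 2D structure of the image."""
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Flatten(),                        # collapse pixels into one vector
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# For a fair comparison, train both models with the same call, e.g.:
# history = model.fit(x_train, y_train, epochs=5, validation_split=0.1)
```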
- Trained CNN models with varying architectures
- Performance metrics (accuracy, loss, training time)
- Visualization plots showing model learning over epochs (see the plotting sketch after this list)
- Comparison table of different model architectures
- Written analysis of architectural choices and their impact
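A minimal sketch for the learning-curve plots, assuming you kept the `History` object returned by `model.fit(..., validation_split=0.1)`; the function name `plot_history` and the figure layout are illustrative.

```python
import matplotlib.pyplot as plt

def plot_history(history):
    """Plot training/validation accuracy and loss over epochs, side by side."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

    ax_acc.plot(history.history["accuracy"], label="train")
    ax_acc.plot(history.history["val_accuracy"], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()

    ax_loss.plot(history.history["loss"], label="train")
    ax_loss.plot(history.history["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    plt.tight_layout()
    plt.show()
```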
Complete all TODO sections in the notebook, including:
- Model building and training code
- Analysis responses in markdown cells
- Final project reflection
- Push your completed work to GitHub and submit the repository link