striderzz/ML3-CNN-MLP-CIFARMNIST


# Deep Learning on MNIST and CIFAR-10 (MLP & CNN)

This repository contains my implementation and evaluation of **Multilayer Perceptrons (MLPs)** and **Convolutional Neural Networks (CNNs)** on the **MNIST** and **CIFAR-10** datasets using **PyTorch**.  
This project was completed as **Project 3 for CS 6375: Machine Learning**.

## 📌 Project Objectives
- Implement MLP and CNN models from scratch using PyTorch
- Explore **multiple architectures** with increasing depth
- Perform **validation-based hyperparameter tuning**
- Compare the performance of **MLPs vs CNNs**
- Evaluate final models on held-out **test sets**

The full project specification is provided in the assignment PDF (`project_3.pdf`).

---

## 📂 Repository Structure

    .
    ├── MLP.ipynb          # Multilayer Perceptron experiments (MNIST & CIFAR-10)
    ├── cnn_final.ipynb    # Convolutional Neural Network experiments
    ├── project_3.pdf      # Official assignment description
    └── README.md


---

## 🧠 Models Implemented

### 1️⃣ Multilayer Perceptrons (MLP)
MLPs were trained on **flattened images** from both datasets.

Architectures explored:
- **Shallow MLP**: 1 hidden layer (e.g., 128 units)
- **Medium MLP**: 3 hidden layers (e.g., 512 → 256 → 128)
- **Deep MLP**: 5+ hidden layers
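
The three depths above could be produced by one small builder. The following is a minimal sketch, not the code from `MLP.ipynb`: the helper name `make_mlp`, the ReLU activations, the optional dropout, and the deep-MLP layer widths are all assumptions made for illustration.

```python
import torch
from torch import nn


def make_mlp(input_dim, hidden_sizes, num_classes=10, dropout=0.0):
    """Build Flatten -> [Linear -> ReLU -> (Dropout)] per hidden size -> Linear."""
    layers = [nn.Flatten()]  # (N, C, H, W) -> (N, C*H*W)
    prev = input_dim
    for width in hidden_sizes:
        layers.append(nn.Linear(prev, width))
        layers.append(nn.ReLU())
        if dropout > 0:
            layers.append(nn.Dropout(dropout))
        prev = width
    layers.append(nn.Linear(prev, num_classes))  # raw logits, no softmax
    return nn.Sequential(*layers)


# Flattened input sizes: MNIST 1*28*28 = 784, CIFAR-10 3*32*32 = 3072.
shallow = make_mlp(784, [128])
medium = make_mlp(784, [512, 256, 128])
# Hypothetical widths for a 5-hidden-layer model; the README only says "5+".
deep = make_mlp(3072, [1024, 512, 256, 128, 64], dropout=0.2)

logits = medium(torch.randn(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 10])
```

The models return raw logits, which is the input `nn.CrossEntropyLoss` expects.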

### 2️⃣ Convolutional Neural Networks (CNN)
CNNs were trained directly on image tensors.

Architectures explored:
- **Baseline CNN**: 2 convolutional layers + pooling + FC
- **Enhanced CNN**: Batch Normalization + Dropout
- **Deeper CNN**: 3+ convolutional layers with normalization and dropout
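
A rough sketch of how the baseline and enhanced variants might differ only by two switches. This is not the code from `cnn_final.ipynb`: the class name `SmallCNN`, the channel counts (32 and 64), the 3×3 kernels, and the placement of dropout before the final layer are assumptions.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Two conv blocks followed by one fully connected classifier layer."""

    def __init__(self, in_channels, image_size, num_classes=10,
                 use_batchnorm=False, dropout=0.0):
        super().__init__()

        def block(c_in, c_out):
            layers = [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)]
            if use_batchnorm:
                layers.append(nn.BatchNorm2d(c_out))
            layers += [nn.ReLU(), nn.MaxPool2d(2)]  # pooling halves H and W
            return layers

        self.features = nn.Sequential(*block(in_channels, 32), *block(32, 64))
        feat_size = image_size // 4  # two 2x2 poolings: 28 -> 7, 32 -> 8
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),
            nn.Linear(64 * feat_size * feat_size, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


baseline = SmallCNN(in_channels=1, image_size=28)  # MNIST-shaped input
enhanced = SmallCNN(in_channels=3, image_size=32,  # CIFAR-10-shaped input
                    use_batchnorm=True, dropout=0.5)

print(baseline(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 10])
```

A deeper variant would add further `block(...)` calls and adjust `feat_size` for each extra pooling step.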

---

## βš™οΈ Datasets & Preprocessing
- **MNIST** (28Γ—28 grayscale images)
- **CIFAR-10** (32Γ—32 RGB images)

Preprocessing steps:
- Loaded using `torchvision.datasets`
- Pixel normalization applied
- Training data split into **training / validation** subsets, with the **test** set held out for final evaluation
  - MNIST: 50k train / 10k validation
  - CIFAR-10: 45k train / 5k validation
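
One way to get a reproducible split of these sizes is `torch.utils.data.random_split` with a seeded generator. The sketch below is an assumption about how such a split could be done, not the notebooks' code; `split_train_val` is a hypothetical helper, and a tiny stand-in dataset replaces the real download so the example runs offline.

```python
import torch
from torch.utils.data import TensorDataset, random_split


def split_train_val(dataset, val_size, seed=0):
    """Split a dataset into (train, val) subsets using a fixed seed."""
    train_size = len(dataset) - val_size
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, val_size], generator=generator)


# Stand-in with the same length as the 60,000-image MNIST training set.
fake_mnist = TensorDataset(
    torch.zeros(60_000, 1),
    torch.zeros(60_000, dtype=torch.long),
)
train_set, val_set = split_train_val(fake_mnist, val_size=10_000)
print(len(train_set), len(val_set))  # 50000 10000
```

With the real data, `fake_mnist` would be replaced by something like `torchvision.datasets.MNIST(root, train=True, download=True, transform=...)`, and `val_size=5_000` would give the 45k / 5k CIFAR-10 split.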

---

## 🔍 Hyperparameter Tuning
For each architecture and dataset, multiple configurations were tested using a validation set.

Tuned parameters:
- Learning rate: `{0.01, 0.001, 0.0001}`
- Batch size: `{32, 64, 128}`
- Optimizer: `SGD`, `Adam`
- Dropout rate: `{0.2, 0.5}`

Instead of exhaustive search, **10–12 meaningful configurations** were explored per architecture.  
The best model was selected based on **validation accuracy**, then retrained on combined training + validation data.
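
The values listed above form a grid of 3 × 3 × 2 × 2 = 36 combinations, so 10–12 configurations cover roughly a third of it. The sketch below shows one possible way to pick a subset and select a winner. Random sampling is an assumption (the configurations may well have been chosen by hand), and `sample_configs`, `select_best`, and the `evaluate` callback are hypothetical names.

```python
import itertools
import random

SEARCH_SPACE = {
    "lr": [0.01, 0.001, 0.0001],
    "batch_size": [32, 64, 128],
    "optimizer": ["SGD", "Adam"],
    "dropout": [0.2, 0.5],
}


def sample_configs(space, n, seed=0):
    """Draw n distinct configurations from the full grid, reproducibly."""
    keys = list(space)
    grid = [dict(zip(keys, values))
            for values in itertools.product(*space.values())]
    return random.Random(seed).sample(grid, n)


def select_best(configs, evaluate):
    """Return (config, score) with the highest validation accuracy.

    `evaluate` is a caller-supplied function that trains a model with the
    given config and returns its validation accuracy.
    """
    scored = [(cfg, evaluate(cfg)) for cfg in configs]
    return max(scored, key=lambda pair: pair[1])


configs = sample_configs(SEARCH_SPACE, n=12)
print(len(configs))  # 12
```

After selection, the winning configuration would be used to retrain on the combined data, for example by wrapping the two subsets in `torch.utils.data.ConcatDataset`.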

---

## 📊 Results
Results are reported in the same format as required by the assignment:
- Validation Accuracy (± standard deviation)
- Runtime (minutes)
- Final Test Accuracy

Separate result tables were produced for:
- **MNIST**
- **CIFAR-10**
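
A small sketch of how one row of such a table could be computed. It assumes the standard deviation is the sample standard deviation over several validation accuracies for one architecture (for example, across tuned configurations); the README does not say which runs are aggregated. The function names are hypothetical and the numbers are placeholders, not results from this project.

```python
import statistics


def summarize(accuracies):
    """Return (mean, sample standard deviation) of a list of accuracies."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean, std


def format_row(name, accuracies, runtime_min, test_acc):
    """Format one Markdown table row: name, val acc ± std, runtime, test acc."""
    mean, std = summarize(accuracies)
    return (f"| {name} | {mean:.2f} ± {std:.2f} "
            f"| {runtime_min:.1f} | {test_acc:.2f} |")


# Placeholder values for illustration only.
row = format_row("example-model", [90.0, 92.0, 94.0],
                 runtime_min=1.5, test_acc=91.0)
print(row)  # | example-model | 92.00 ± 2.00 | 1.5 | 91.00 |
```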

Key observation:
- CNNs significantly outperform MLPs, especially on **CIFAR-10**, due to their ability to capture spatial features.

---

## 🚀 How to Run

### Requirements
```bash
pip install torch torchvision matplotlib numpy
```

### Run MLP Experiments
```bash
jupyter notebook MLP.ipynb
```

### Run CNN Experiments
```bash
jupyter notebook cnn_final.ipynb
```

**Note:** A GPU significantly reduces training time; Google Colab is recommended.


---

## 📝 Key Takeaways
- Deeper models do not guarantee better performance without proper regularization
- CNNs are far more effective than MLPs for image-based tasks
- Batch normalization and dropout improve stability and generalization
- Validation-based tuning is critical for fair model comparison
