Legacy Projects
CIFAR-10 Image Classification with Deep Learning
(2021)
Summary: This project explores deep learning for a computer vision task on the CIFAR-10 dataset, a standard benchmark for 10-class image classification. The project proceeds in two phases. The initial phase involves experimenting with a basic Convolutional Neural Network (CNN) containing just two convolutional layers using ReLU and Sigmoid activations, followed by experiments with Multi-Layer Perceptrons (MLPs), to demonstrate the importance of convolutional layers for image data and the relative performance of different activation functions. This initial CNN achieved only 63% accuracy. The second phase focuses on developing a more complex CNN architecture, drawing inspiration from VGG-style networks in terms of depth and filter sizes, with the goal of exceeding 80% accuracy on a personal computer equipped with a single GPU. By incorporating techniques like batch normalisation and multi-scale convolutional filters, the final model achieved 88% accuracy on the CIFAR-10 test set, surpassing the initial goal and demonstrating the effectiveness of the chosen approach within the given resource constraints.
Problem: This project aimed to develop an effective deep learning model for the CIFAR-10 dataset within the constraints of a personal computer equipped with a single GTX 1070 GPU. The key challenges included:
Demonstrating the Importance of Convolutional Layers: Initial experiments with a basic two-layer CNN achieved only 63% accuracy. Subsequent trials with Multi-Layer Perceptrons (MLPs) performed even worse, clearly demonstrating the necessity of convolutional layers for effectively extracting features from image data. This comparison motivated the focus on a more advanced approach.
Achieving High Accuracy with Limited Resources: While established architectures like VGG-16/19 can achieve 90–93% accuracy on CIFAR-10 (and even higher with data augmentation), training such computationally intensive models from scratch was impractical on the available hardware. The challenge was to design a model that approached this level of performance while remaining feasible to train with limited resources.
Optimising Training Strategies: To maximise performance within the resource constraints, the project employed techniques like batch normalisation and multi-scale convolutional filters to improve accuracy and training stability. The target was to achieve an accuracy exceeding 80% without relying on extensive and complex training pipelines.
Solution:
To address the challenges outlined in the previous section, the project implemented the following solution:
- Custom CNN Architecture: A custom CNN architecture was developed, drawing inspiration from the Inception model in its use of multiple convolutional layers with varying filter sizes (3x3, 5x5, 7x7, and 9x9). This multi-scale approach aimed to capture features at different levels of detail (a minimal sketch appears after this list). The architecture incorporated:
  - An initial multi-scale feature extraction stage (four parallel convolutions with 3x3 to 9x9 kernels), followed by three convolutional blocks with varying combinations of operations (convolutions, concatenations, batch normalisation, and activations) and a final feature concatenation stage (two parallel convolutions).
  - Leaky ReLU activation functions (negative slope = 0.1) after the first set of concatenated convolutional layers (act) for improved gradient flow.
  - Batch normalisation layers (bn1 and bn2) strategically placed after the concatenation and activation operations to stabilise training and act as regularisation.
  - Max pooling layers (maxpool1 and maxpool2) to reduce spatial dimensions and introduce translation invariance.
  - A series of fully connected layers (fc1 to fc5) with ReLU activations, culminating in a final output layer with 10 neurons for the 10 CIFAR-10 classes.
- Training Optimisation: The model was trained for 40 epochs using the following setup (a code sketch of the loop and schedule follows below):
  - Batch size of 16 to manage memory usage on the GTX 1070 GPU.
  - Cross-entropy loss function, suitable for multi-class classification.
  - Stochastic Gradient Descent (SGD) optimiser with momentum (0.9).
  - A dynamic learning rate schedule to fine-tune the model during training: the learning rate was initially set to 0.001 for the first 20 epochs; after the validation loss plateaued, it was reduced to 3e-4 for epochs 21-30, and finally to 1e-4 for epochs 31-40 to facilitate finer adjustments and potentially escape local minima.
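For illustration, here is a minimal PyTorch sketch of the kind of multi-scale stage described above. The 3x3/5x5/7x7/9x9 kernel set, the Leaky ReLU slope, and the use of batch normalisation follow the description; the channel counts and the exact concatenate-activate-normalise ordering are assumptions, not a copy of the repository code.

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Parallel convolutions at several kernel sizes, concatenated channel-wise."""

    def __init__(self, in_channels: int, branch_channels: int):
        super().__init__()
        # One branch per kernel size; padding = k // 2 keeps the spatial size
        # unchanged, so branch outputs can be concatenated along the channel axis.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, branch_channels, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7, 9)
        ])
        self.act = nn.LeakyReLU(negative_slope=0.1)    # slope from the write-up
        self.bn = nn.BatchNorm2d(branch_channels * 4)  # one BN over the concatenation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.bn(self.act(out))

# Example: a batch of 16 CIFAR-10 images (3x32x32) -> 64 feature maps.
features = MultiScaleBlock(in_channels=3, branch_channels=16)(torch.randn(16, 3, 32, 32))
```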
This strategy allowed the model to achieve 88% accuracy on the CIFAR-10 test set, surpassing the initial target of 80% and demonstrating the effectiveness of the chosen architecture and training regimen within the given resource constraints. The use of batch normalisation proved sufficient for regularisation, eliminating the need for dropout.
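The training setup above translates almost directly into PyTorch. Below is a hedged sketch with the schedule hard-coded at the stated epochs; model and train_loader are assumed to exist, and in the actual runs the reductions were triggered by watching the validation loss rather than being pre-planned.

```python
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # multi-class classification loss
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)  # model: assumed defined

for epoch in range(40):
    # Stepped schedule: 1e-3 (epochs 1-20) -> 3e-4 (21-30) -> 1e-4 (31-40).
    if epoch == 20:
        for group in optimizer.param_groups:
            group["lr"] = 3e-4
    elif epoch == 30:
        for group in optimizer.param_groups:
            group["lr"] = 1e-4
    for images, labels in train_loader:  # DataLoader with batch_size=16 (assumed)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```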
Impact: This project provided valuable practical experience in designing and training deep learning models within resource constraints. It solidified my understanding of convolutional layers and their crucial role in image data processing, along with the performance benefits of architectural innovations like multi-scale feature extraction. I also gained proficiency in optimisation techniques, including dynamic learning rate schedules and batch normalisation, for stabilising training and maximising accuracy. Successfully building and fine-tuning a custom CNN architecture to achieve 88% accuracy on CIFAR-10 with limited resources showcased my ability to balance computational efficiency with performance, a critical skill for deep learning research and real-world applications.
GitHub repo: github.com/lzrdGreen/Models-for-CIFAR-10
Relevant skills: Python, PyTorch, Scikit-Learn, matplotlib, numpy, pandas

Loss for training and validation sets.
Application of BERT, a Transformer-based language model, to checking the grammatical correctness of English sentences
(September 2021)
Summary: This project tackled the challenge of grammatical error detection in English using Natural Language Processing (NLP) by fine-tuning a pre-trained BERT model with the CoLA dataset. The implementation was completed on a personal computer with a single GTX 1070 GPU, demonstrating the accessibility of advanced NLP techniques without requiring high-performance computing clusters. By validating the model's performance on real-world examples, the project showcased BERT's potential for nuanced linguistic tasks, contributing to understanding fine-tuning techniques and paving the way for practical applications like grammar checkers and language learning tools.
Problem: The core problem addressed by this project is the difficulty of accurately and efficiently detecting grammatical errors in English sentences. While humans can generally identify many grammatical mistakes, the process is time-consuming and prone to inconsistencies. At the time of this project (September 2021), Natural Language Processing (NLP) was advancing rapidly, with large language models (LLMs) like BERT demonstrating promising capabilities in various language-based tasks. However, many introductory NLP applications focus on simpler classification tasks like sentiment analysis, which, thanks to readily available pre-trained models, have become remarkably streamlined. For instance, models like cardiffnlp/twitter-roberta-base-sentiment-latest (fine-tuned on a large dataset of tweets labelled with sentiment) from the Hugging Face Model Hub can be employed directly for accurate sentiment classification with only a simple downstream classifier such as logistic regression, without requiring further training. Grammatical error detection, a more nuanced challenge, presents an opportunity to explore the potential of LLMs more deeply. Inspired by Chris McCormick's work utilising the Corpus of Linguistic Acceptability (CoLA) dataset and his Jupyter Notebook demonstrating fine-tuning of a standard pre-trained 12-layer uncased BERT model (released in 2018), this project aimed to replicate and evaluate this fine-tuning process on a personal computing setup (single GTX 1070 GPU). This addressed the need for more sophisticated NLP tasks beyond basic sentiment analysis and explored the practical application of cutting-edge LLMs on readily available hardware.
Solution: This project implemented a solution based on fine-tuning a pre-trained BERT model for binary classification of grammatical acceptability using the CoLA dataset. The key steps involved:
Data Preprocessing: The CoLA dataset, containing sentences labelled as grammatically acceptable or unacceptable, was loaded into a pandas DataFrame and tokenised using BertTokenizer. This crucial step mapped each sentence to a sequence of numerical tokens that the BERT model could understand.
Fine-tuning BERT: A pre-trained BERT model was fine-tuned specifically for this task. A BertForSequenceClassification model, with a single linear layer on top of the BERT output, was used for binary classification (grammatically correct or incorrect). This process adapted the pre-trained model's knowledge to the specific task of grammatical error detection.
Training: The model was trained using the AdamW optimiser, a common choice for fine-tuning transformer models. This leveraged the pre-trained weights of BERT while optimising the model's parameters on the CoLA data (a minimal sketch appears after this list).
Evaluation: The model's performance was primarily evaluated on real-world examples from a reputable source: Michael Swan's "Practical English Usage", an Oxford University Press publication. This approach was chosen because metrics like MCC or F1 are difficult to interpret in isolation. Two pairs of sentences (one grammatically correct and one incorrect in each pair) were used to validate the model's generalisation ability beyond the CoLA training data. The model correctly classified every sentence in both pairs, providing qualitative evidence of its effectiveness and practical applicability.
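As a rough illustration of the same pipeline using the Hugging Face transformers API (not the project's actual notebook, which followed Chris McCormick's tutorial), a single fine-tuning step might look like this; the learning rate and the toy sentence pair are assumptions:

```python
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

# The standard 12-layer uncased BERT with a single linear classification head.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = AdamW(model.parameters(), lr=2e-5)  # typical fine-tuning rate (assumption)

# Toy CoLA-style batch: 1 = grammatically acceptable, 0 = unacceptable.
sentences = ["She has lived here for years.", "She has lived here since years."]
labels = torch.tensor([1, 0])

batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)  # returns loss and logits in one pass
outputs.loss.backward()
optimizer.step()

# After fine-tuning, acceptability is read off the argmax of the logits.
with torch.no_grad():
    predictions = model(**batch).logits.argmax(dim=-1)
```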
Impact: This project demonstrated the feasibility of using powerful LLMs like BERT for complex NLP tasks like grammatical error detection, even with limited computational resources. The impact extends beyond a simple demonstration of technology, contributing to the understanding and application of fine-tuning techniques, validating the real-world applicability of BERT for grammatical error detection, and opening avenues for future research and development in language technology. Specifically, the project made the following contributions:
Proof of Concept: While BERT is widely used for various NLP tasks, this project specifically showcased its effectiveness in the less common domain of grammatical correctness classification. This contributes to a broader understanding of BERT's versatility and its potential for addressing nuanced linguistic challenges.
Practical Validation: By moving beyond standard dataset evaluation and testing the model on real-world examples from a reputable grammar resource (Michael Swan's "Practical English Usage"), the project validated the model's ability to generalise to unseen data and address real-world linguistic challenges, a key contribution that highlights the practical utility of fine-tuned language models in improving writing quality and language learning.
Advanced understanding of fine-tuning: The project deepened the author's understanding of the intricacies of fine-tuning pre-trained models, including data preprocessing, model selection, training procedures, and evaluation strategies. This contributes to a growing body of knowledge on effective fine-tuning practices and provides a valuable learning experience.
Potential for real-world applications: While this project was a proof-of-concept, it demonstrates the potential for developing practical applications such as advanced grammar checkers, language learning tools that provide detailed feedback on grammatical errors, and improved machine translation systems that produce more grammatically sound output.
Accessibility: By successfully implementing the fine-tuning on a personal computer with a single GTX 1070 GPU, the project made this advanced NLP technique more accessible to a wider audience, demonstrating that high-performance computing clusters are not always necessary.
Reproducibility: By replicating existing work (while adapting the evaluation method), the project contributed to the reproducibility of NLP research and provided a valuable learning resource for others interested in exploring LLMs.
GitHub repo: English Grammar Tester
Relevant skills: Python, PyTorch, Scikit-Learn, matplotlib, numpy, pandas