Building a Neural Network from scratch

In this series, we will build a neural network from scratch. The posts require no prior knowledge of the subject; we start from zero. At the end, we will have implemented a basic neural network that can recognize handwritten digits with 97% accuracy.

To get there, we will start with a much simpler dataset and then expand incrementally. On the way to digit recognition, I will explain the parts of a neural network in great detail.

Code on GitHub

Part 1 - The Neuron

15 October 2016
neural-networks
theory
This is the first post of a series about understanding Deep Neural Networks. We start with the core component of artificial neural networks - the neuron. We will use a single artificial neuron to learn a simple dataset.
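
To give a taste of where part 1 ends up, here is a minimal sketch of a single neuron - one weight, no bias - learning y = 2x by gradient descent. It is an illustration of the idea; the details may differ from the post's implementation:

```python
import numpy as np

# Toy dataset: the neuron should learn y = 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0  # the neuron's single weight
for _ in range(50):
    prediction = w * x
    error = prediction - y
    # Gradient of (half) the mean squared error with respect to w.
    gradient = (error * x).mean()
    w -= 0.1 * gradient

print(w)  # converges towards 2.0
```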

Part 2 - Flexible Neurons

16 October 2016
neural-networks
theory
In this post, we extend the neuron's flexibility by adding a bias to handle datasets with a constant offset. We also use a learning rate to prevent exploding weights.
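
Sketched in the same style as above, the two additions might look like this - a bias term b and an explicit learning rate (again an illustration, not necessarily the post's exact code):

```python
import numpy as np

# Toy dataset with a constant offset: y = 2 * x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0
learning_rate = 0.1  # small steps keep the weights from exploding

for _ in range(500):
    prediction = w * x + b  # the bias shifts the line away from the origin
    error = prediction - y
    w -= learning_rate * (error * x).mean()
    b -= learning_rate * error.mean()

print(w, b)  # approaches w = 2.0, b = 1.0
```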

Part 3

neural-networks
theory
This post explains why we should not use linear regression to solve classification problems. It also describes what we can do instead to let our neuron tackle classification tasks.
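
One common fix, sketched below, is to squash the neuron's output into (0, 1) with the sigmoid function, so it can be read as a class probability. This shows the general technique, not necessarily the post's exact approach:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# A linear output w * x + b can take any real value; passed through
# the sigmoid it becomes a number we can read as a probability.
z = np.array([-4.0, 0.0, 4.0])
print(sigmoid(z))  # roughly [0.018, 0.5, 0.982]
```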

Part 4

neural-networks
theory
We observed that a single neuron fails to learn datasets that are not linearly separable, such as the XOR dataset. In this post, we expand from a single neuron to a network of neurons that can learn more complex functions.
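
To illustrate why a network succeeds where a single neuron fails, here is a sketch with hand-picked (not learned) weights: two hidden neurons acting roughly as OR and AND, combined into XOR by the output neuron. The specific weights are my own illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

W1 = np.array([[20.0, 20.0], [20.0, 20.0]])  # hidden neuron 1: OR, neuron 2: AND
b1 = np.array([-10.0, -30.0])
W2 = np.array([[20.0], [-40.0]])             # output: OR and not AND = XOR
b2 = np.array([-10.0])

hidden = sigmoid(X @ W1 + b1)
output = sigmoid(hidden @ W2 + b2)
print(output.round(3))  # approximately [[0], [1], [1], [0]]
```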

Part 5

neural-networks
theory
This post explains the math behind the update rules for the weights during backpropagation. In other words, it explains how and why neural networks learn. After that, we will implement the update rules - in just 8 lines of code.
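
As a preview, here is a sketch of those update rules applied to the XOR dataset from part 4 - the whole learning step is the handful of lines inside the loop. Variable names and details are my own, not necessarily the post's:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The XOR dataset from the previous part.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)  # input -> 3 hidden neurons
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)  # hidden -> output neuron
learning_rate = 1.0

for _ in range(5000):
    hidden = sigmoid(X @ W1 + b1)                    # forward pass
    output = sigmoid(hidden @ W2 + b2)
    d_out = (output - y) * output * (1 - output)     # error at the output
    d_hid = (d_out @ W2.T) * hidden * (1 - hidden)   # error pushed backwards
    W2 -= learning_rate * hidden.T @ d_out           # the update rules
    b2 -= learning_rate * d_out.sum(axis=0)
    W1 -= learning_rate * X.T @ d_hid
    b1 -= learning_rate * d_hid.sum(axis=0)

print(output.round(2))  # typically close to [[0], [1], [1], [0]]
```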

Part 6

neural-networks
theory
In this post, we will experiment with our neural network. We will test out values for hyperparameters such as the learning rate and the number of hidden neurons.
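
Such an experiment can be as simple as a grid search over the hyperparameters. A sketch, reusing the small XOR network from above (the post runs this kind of experiment against its own network and datasets):

```python
import itertools
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def train(learning_rate, n_hidden, steps=2000, seed=1):
    rng = np.random.default_rng(seed)
    W1, b1 = rng.normal(size=(2, n_hidden)), np.zeros(n_hidden)
    W2, b2 = rng.normal(size=(n_hidden, 1)), np.zeros(1)
    for _ in range(steps):
        hidden = sigmoid(X @ W1 + b1)
        output = sigmoid(hidden @ W2 + b2)
        d_out = (output - y) * output * (1 - output)
        d_hid = (d_out @ W2.T) * hidden * (1 - hidden)
        W2 -= learning_rate * hidden.T @ d_out
        b2 -= learning_rate * d_out.sum(axis=0)
        W1 -= learning_rate * X.T @ d_hid
        b1 -= learning_rate * d_hid.sum(axis=0)
    return ((output - y) ** 2).mean()  # final training loss

# Try every combination and compare the resulting loss.
for lr, n_hidden in itertools.product([0.1, 1.0, 5.0], [2, 4, 8]):
    print(f"learning rate {lr:<4}  hidden neurons {n_hidden}  "
          f"loss {train(lr, n_hidden):.4f}")
```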

Part 7

neural-networks
theory
In this post, we will finally tackle digit recognition. For that, we extend the output layer of our network with the softmax function and cross-entropy loss, which enables us to output predictions for multiple classes.
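
The two new ingredients in a sketch - these are the standard formulations; the post derives them and integrates them into the network:

```python
import numpy as np

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow.
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(probs, target_class):
    # Negative log probability of the correct class: small when the
    # network is confident and right, large when it is confidently wrong.
    return -np.log(probs[target_class])

# Raw network outputs ("logits") for the ten digit classes 0-9.
logits = np.array([1.0, 0.2, 0.1, 2.5, 0.3, 0.0, 0.5, 0.1, 0.2, 0.4])
probs = softmax(logits)

print(probs.sum())              # 1.0 - a valid probability distribution
print(probs.argmax())           # 3  - the predicted digit
print(cross_entropy(probs, 3))  # the loss if the true label is 3
```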

Part 8

neural-networks
theory
In this post, we continue with digit recognition and try to get closer to the benchmark of 99.8% accuracy. This post covers validation, which we can use to reduce overfitting, and batch learning, which speeds up the training phase.
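
Both ideas in a sketch - a held-out validation set and mini-batch updates. The array sizes and batch size here are placeholders; the post applies this to the real MNIST data:

```python
import numpy as np

# Stand-in dataset: 1000 examples with 784 pixels each (MNIST-sized).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 784))
y = rng.integers(0, 10, size=1000)

# Hold out 20% as a validation set to detect overfitting:
# the network never trains on these examples.
split = int(0.8 * len(X))
X_train, y_train = X[:split], y[:split]
X_val, y_val = X[split:], y[split:]

# Batch learning: one weight update per mini-batch of examples
# instead of one per example.
batch_size = 32
for epoch in range(3):
    order = rng.permutation(len(X_train))
    for start in range(0, len(X_train), batch_size):
        batch = order[start:start + batch_size]
        x_batch, y_batch = X_train[batch], y_train[batch]
        # ...forward pass, backpropagation and weight update on the batch...
    # After each epoch, accuracy on X_val / y_val shows whether the
    # network is improving on data it has never seen.
```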