uzairarif5/CNNForDigits
Coding a neural network to identify handwritten numbers

Here's a description of my project files and folders:

  • initWeightAndBiases.py:
    • Contains the function that initializes the weights and biases used by the neural network. The function runs when you run the file.
    • Contains the sizes of the input layer, second hidden layer, third hidden layer and the kernels.
  • /dataStore:
    • Stores the weights and biases made from initWeightAndBiases.py in txt and npy files. The npy file type is used by the neural network.
  • getNumbers.py:
    • Contains functions to get images from MNIST and ownDataset.
    • Some images are banned and will not be used. Banned images are stored in the bannedNumIndices array.
  • showNumber.py:
    • Contains showImgsOnPlt, which shows images on a matplotlib graph.
    • Running this file lets you view images from MNIST or ownDataset.
  • kernel.py:
    • For CNN, this is the file that applies a kernel to an image.
    • Running this file allows you to select an image, and see the "kerneled" images.
    • Keep in mind that the kernel does not get flipped, so what is being done is actually cross-correlation.
  • downPool.py:
    • Contains a function to downsize an image by half (using max pooling) and apply ReLU to it.
    • Running this file lets you select an image and downsize it twice.
  • convolution.py:
    • Does filtering (from kernel.py) and downsizing (from downPool.py) on a range of images.
    • Running this file lets you select a range of images and apply convolution to them.
    • Convolution is currently very slow; any recommendations for optimizing it are appreciated.
  • trainCNN.py:
    • The main file that contains the actual neural network.
    • Understanding the back propagation can be confusing; for more detail, see /imagesForBackPro.
    • Run this file to see the neural network in action. While training, selected images will be shown using showImgsOnPlt from showNumber.py.
    • The neural network will use either sigmoid or reLu (the leaky version); the choice is given as a user input. Keep in mind that when I say "reLu", it's not actually reLu, because the function divides the final result by the number of nodes in the previous layer. For example, a node in layer N gets its value by multiplying all the nodes in layer N-1 with the weights and summing them, then adding the bias, then clipping the result to 0.01 if it's less than 0.01, and then dividing it by the number of nodes in layer N-1.
    • Several prompts get asked during runtime, some of them include:
      • "Press 1 to initialize weights and biases": Either use the last saved values or make new ones.
      • "Press 1 to use a new learning rate"
    • Program summary:
      • Select images from MNIST. Either use neighboring images (with starting index and ending index chosen by the user), or select randomly (the number of images is chosen by the user).
      • Ask the user whether to initialize new weights and biases; otherwise, use the saved values.
      • Set init variables (input array, hidden layers array, output array, correct ans array).
      • Start training.
      • The weights and biases will keep updating [MAX_UPDATES] times using the same images as long as loss is above [MIN_ERR].
      • After training is done, the user has the option to save the weights and biases, train again using the same images or train again with different images.
  • /imagesForBackPro:
    • Contains my notes on my back propagation formulas (helps in understanding the back propagation algorithm in trainCNN.py).
  • trainCNNEntireDatasetSigmoid.py:
    • Like trainCNN.py, but some choices are pre-selected.
    • It chooses the first [BATCH_SIZE] images from MNIST, updates the weights once, then chooses the next [BATCH_SIZE] images, updates the weights once again, and so on, until the entire dataset is used. This is known as one epoch.
    • At the end, the user will be given the option to save the new weights and biases.
    • The first [BATCH_SIZE] images will be shown using matplotlib.
  • trainCNNEntireDatasetRelu.py:
    • Like trainCNNEntireDatasetSigmoid.py, but uses ReLU instead of sigmoid.
  • testImage.py:
    • Allows you to pick test images from MNIST, and test the neural network.
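To make the kernel.py step concrete, here is a minimal sketch of cross-correlation (sliding the kernel over the image without flipping it). The function name apply_kernel and the plain-loop approach are mine for illustration; the repo's actual code is vectorized:

```python
import numpy as np

def apply_kernel(image, kernel):
    """Cross-correlation: slide the kernel over the image WITHOUT flipping it."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # elementwise product of the kernel with the patch under it, summed
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out
```

Flipping the kernel before sliding would turn this into true convolution; since the kernels here are learned anyway, the distinction doesn't affect training.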
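The downPool.py step (halve each dimension with 2x2 max pooling, then apply ReLU) could be sketched as follows. The name down_pool_relu and the leaky slope of 0.01 are my assumptions, not the repo's exact code:

```python
import numpy as np

def down_pool_relu(image, alpha=0.01):
    """Halve each dimension with 2x2 max pooling, then apply leaky ReLU."""
    h, w = image.shape
    # group pixels into 2x2 blocks and take the max of each block
    pooled = image[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
    # leaky ReLU: keep positive values, scale negative ones by alpha
    return np.where(pooled > 0, pooled, alpha * pooled)
```

Applying this twice to a 28x28 image yields the 7x7 feature maps mentioned in the update notes.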
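The modified "reLu" node computation described for trainCNN.py, as I read it, amounts to the following (node_value is a hypothetical helper, not a function from the repo):

```python
import numpy as np

def node_value(prev_layer, weights, bias):
    """Weighted sum plus bias, clipped below at 0.01, then divided by the
    size of the previous layer (the repo's modified 'reLu', as described)."""
    z = np.dot(prev_layer, weights) + bias
    z = max(z, 0.01)            # clip: anything below 0.01 becomes 0.01
    return z / len(prev_layer)  # divide by the number of nodes in layer N-1
```

The division keeps node values from growing with layer width, at the cost of no longer being a standard activation.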
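The batch-by-batch epoch described for trainCNNEntireDatasetSigmoid.py boils down to a loop like this (one_epoch and update_weights are hypothetical names, not the repo's):

```python
def one_epoch(images, batch_size, update_weights):
    """Walk the dataset in consecutive batches, updating weights once per batch.
    One full pass over all images is one epoch."""
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        update_weights(batch)  # one gradient update per batch
```

The final batch may be smaller than batch_size if the dataset size is not an exact multiple.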

As of update 8.1, the model was trained with leaky ReLU and is 94% accurate on the test data. If you want to inform me about any errors, or give me any suggestions, feel free to message me on my LinkedIn (linkedin.com/in/uzair0845).

Updates

update 8.3:

  • Using keras to get the MNIST dataset. The original mnist package doesn't seem to be working.

update 8.2:

  • Added images to explain kernel stuff.

update 8.1:

  • Relu is changed to Leaky ReLU.
  • There was an error in how the convolution is calculated and how the kernel's back propagation updates are done. That is fixed now... I think.

update 7.2:

  • In trainCNN.py, for choosing a new learning rate, the 0 to 1 range restriction is removed.
  • In trainCNNEntireDatasetRelu.py, added accuracyArr variable which stores the accuracy throughout the training and can be saved to /dataStore/accuracy.txt.

update 7.1:

  • In trainCNN.py, added the option to use Relu.
  • Renamed trainCNNEntireDataset.py to trainCNNEntireDatasetSigmoid.py.
  • Added trainCNNEntireDatasetRelu.py which is like trainCNNEntireDatasetSigmoid.py, but uses Relu instead of sigmoid.
  • Drawing Board now takes a 196x196 drawing array and reduces it to 28x28 by using mean pooling.
  • First kernels array now has 6 kernels instead of the previous 4.

update 6.3:

  • Kernels now have biases as well. They are now used in forward propagation and get updated in backward propagation.

update 6.2:

  • Added testImage.py, where you can pick test images from MNIST, and test the neural network. As of this update, the neural network is around 95% accurate.

update 6.1:

  • Kernels are now initialized randomly and get saved in /dataStore. I didn't delete the pre-set fixed kernels I had before in kernels.py; I kept them for the sake of testing. The kernels also have biases.
  • In trainCNN.py and trainCNNEntireDataset.py, kernel values also get updated in back propagation. Also, images will be convoluted on every loop instead of doing it once at the start.
  • In convolution.py, added convGPU function which is like doubleConvGPU, but does convolution only once. Also, removed doubleConvGPU function.
  • In trainCNN.py and trainCNNEntireDataset.py, the learning rate used to get multiplied once to the repeatedCalArr1 variable, but now it gets multiplied on the weights and biases individually.
  • In downPool.py, added the downPoolAndReluGPU function, which takes an image and does downPool and relu vectorized (using the GPU). It also returns a list of 3x3 matrices; each 3x3 matrix holds the sum of all the pixel values that "passed" downPool and relu, along with their neighboring pixels. For example, passing five 28x28 images will return 14x14 images, but also five 3x3 matrices: the middle value of each 3x3 matrix is the sum of all the pixel values that passed downPool and relu and ended up in the 14x14 image, while the edges of the 3x3 matrix contain the passed pixel values' neighbors. These 3x3 matrices are supposed to make back propagation easier.
  • In downPool.py, added the function downPoolAndReluGPUForPassedMatrix, which takes the passed matrix and the filtered image as input, and maxpools the passed matrix at the same indices where the filtered image gets maxpooled. This is used to keep track of which pixels in the original 28x28 image ended up in the final 7x7 image. This is supposed to make back propagation easier.

update 5.2:

  • In drawingBoard.py, the drawn image is multiplied by 255 before being convoluted, and then the filtered images are divided by 255 after convolution.
  • In trainCNNEntireDataset.py, the accuracy now represents the average accuracy of all batches passed so far in the current epoch, instead of using only the last batch.

update 5.1:

  • Added L2 Regularization.
  • Added applyKernelsGPU function in kernels.py, which uses GPU.
  • Added doubleConvGPU in convolution.py which is like doubleConv but uses applyKernelsGPU.
  • Removed trainNN.py, which was like trainCNN.py but without the convolution.
  • Deleted ownDataset.py, which used to contain my own custom dataset. That also means I removed the corresponding code in trainCNN.py and getNumbers.py.
  • trainCNNEntireDataset.py now uses test data at the end to test accuracy.
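For reference, the L2 regularization added in this update typically changes the weight update like so. This is a generic sketch; update_weights, lr, and lam are my placeholder names, not identifiers from the repo:

```python
import numpy as np

def update_weights(weights, grad, lr=0.1, lam=0.001):
    """Gradient step with L2 regularization: the penalty lam/2 * ||w||^2
    contributes an extra lam * w term to the gradient."""
    return weights - lr * (grad + lam * weights)
```

The extra lam * weights term shrinks large weights on every update, which helps reduce overfitting.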

update 4.1:

  • Deleted autoTrainCNN.py.
  • Added trainCNNEntireDataset.py, where the user doesn't select images; instead, the first 200 images are used, then the weights are updated once, then the next 200 images are used, then the weights are updated once again, and so on, until the entire dataset is used.
  • Added drawingBoard.py, which shows a graphical user interface where you can draw numbers and see the output using the last saved weights and biases.
  • In trainNN.py, trainCNN.py and trainCNNEntireDataset.py, neural network node values are now saved in arrInput.txt, arrHidden1.txt, arrHidden2.txt and arrOutput.txt, instead of arr12.txt, arr23.txt, arr34.txt and arrOut.txt.
  • In convolution.py, the print messages indicating the start and end of convolution are now optional.
  • In showNumber.py, getting the images is now done using the getImagesFromMNIST function in getNumbers.py.
  • In getNumbers.py, the function getImagesFromMNIST returns images in float32 instead of uint16.

update 3.2:

  • Prompt in getNumbers.py changed from "Choose the number to use" to "Number of images to use".
  • Added autoTrainCNN.py, which is like trainCNN.py, but some options are pre-selected. It chooses 1500 random images from MNIST, runs 1000 updates, then chooses 1500 random images again. This keeps repeating [NUM_OF_TRAINING] times. At the end, the user is given the option to save the new weights and biases.

update 3.1:

  • The maxpooling function and the filtering function are now vectorized, so convolution is now faster.
  • In trainCNN.py, I have changed the way progress is shown: instead of printing a new line on each update, there is now a progress bar shown on one line. That line also shows the current error value and the number of correct predictions.
  • In trainCNN.py, added the option to view the output layer of last input array in the training loop.
  • trainNN.py does not show images in matplotlib anymore.
  • In showNumber.py and getNumbers.py, I added an option to view images from pre-selected indices.

update 2.1: Added kernel.py, downPool.py, convolution.py and trainCNN.py

update 1.2: Made small changes to readme.md

update 1.1: First main update. Commits before this were to initialize my repository.

About

Using CNN for recognizing hand-written digits
