AI Probably
AI Probably's Blog


Creating a simple neural network using TensorFlow

Master the basics of a Neural Network, and build your first neural network using TensorFlow.

AI Probably
Aug 20, 2021

9 min read

Want to build your first artificial neural network but do not know where to start? You might be surprised to learn that neural networks are not all that complex! You can learn all of the steps required by following this short lesson. Let us start with the fundamentals, shall we?

What are Neural Networks?

Neural networks are the models that deep learning algorithms are built on. Like any other machine learning model, they simply strive to produce accurate predictions. What distinguishes them is their capacity to learn from massive amounts of data and, given enough of it, to predict targets with high accuracy.

So, let us see the definition of neural networks.

Artificial neural networks are a collection of nodes, inspired by brain neurons, that are linked together to form a network.

[Figure: neural network architecture]

In general, a neural network consists of an input layer, one or more hidden layers, and an output layer, all linked together to produce outputs. The activation function shown here is a hyperparameter that will be addressed later in the blog.

[Figure: elements of a neural network]

P and Q are input neurons in the above diagram, and w1 and w2 are the weights.

It is important to note that w1 and w2 measure the strength of the connections made to the center neuron, which sums up all of its inputs. Here, b is a constant known as the bias. This is what it looks like mathematically:

SUM = w1 * P + w2 * Q + b

As long as the sum exceeds zero, the output is one, or 'yes'; otherwise, it is zero, or 'no'. Each input in a neural network is linked with a weight.
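
As a rough sketch of the formula above, the neuron can be written as a small Python function. The function name and the example values are made up for illustration; only the weighted sum and the greater-than-zero rule come from the text.

```python
def neuron_output(p, q, w1, w2, b):
    """Weighted sum of two inputs plus bias, passed through a step rule."""
    total = w1 * p + w2 * q + b   # SUM = w1 * P + w2 * Q + b
    return 1 if total > 0 else 0  # 1 means 'yes', 0 means 'no'


# Hypothetical inputs and weights, chosen only for illustration.
print(neuron_output(p=1.0, q=2.0, w1=0.5, w2=-0.4, b=0.1))  # sum is about -0.2
print(neuron_output(p=1.0, q=2.0, w1=0.5, w2=0.4, b=0.1))   # sum is about 1.4
```

The first call lands below zero and answers 'no'; flipping the sign of w2 pushes the sum above zero and the answer becomes 'yes'.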


The steepness of the activation function rises as the weight increases. In other words, the weight determines how quickly the activation function will fire, whereas the bias is used to delay activation.

Bias is added to the equation just like the intercept in a linear equation. Hence, bias is a constant that helps the model fit the provided data in the best possible way.
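
To see the effect of weight and bias in numbers, here is a minimal sketch. It assumes a sigmoid as the smooth activation (the text does not name one at this point), and the helper names and input values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def activation(x, w, b):
    """Sigmoid applied to a single weighted input plus bias."""
    return sigmoid(w * x + b)


# Larger weight -> steeper curve: the output at x = 1 is pushed closer to 1.
shallow = activation(1.0, w=1.0, b=0.0)
steep = activation(1.0, w=5.0, b=0.0)

# Negative bias -> the curve shifts right, so the same input activates less.
delayed = activation(1.0, w=5.0, b=-5.0)

print(shallow, steep, delayed)
```

With the bias set to -5, the input x = 1 sits exactly at the midpoint of the curve, which is one way to read "bias delays activation".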

Implementing deep learning algorithms can be daunting, but Google's TensorFlow makes obtaining data, training models, serving predictions, and improving future outcomes more accessible.

What is TensorFlow?

TensorFlow is a free and open-source framework built by the Google Brain team for numerical computing and solving complex machine learning problems. It brings a wide variety of deep learning models and methods together under one shared programming model. TensorFlow enables developers to construct dataflow graphs, which are data structures that describe how data moves through a set of processing nodes. Each node in the graph represents a mathematical operation, and each edge between nodes is a tensor, which is a multidimensional array.

Now, let us build our first neural network with TensorFlow.

We will be using a vehicle detection dataset from Kaggle. This is a binary classification problem: the idea is to develop a model that can distinguish between pictures with and without cars. The code here runs in a Kaggle notebook; you may alternatively use Google Colab.

Click here to download the dataset!

The dataset consists of 8792 images of vehicles and 8968 images of non-vehicles. So, let us begin by importing the dataset:

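import os

import numpy as np
import pandas as pd

# Walk the Kaggle input folder and print each directory with its file count
# (printing every single file path would flood the notebook).
for dirname, _, filenames in os.walk('/kaggle/input'):
    print(dirname, len(filenames))

maindir = "../input/vehicle-detection-image-set/data"
os.listdir(maindir)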

Output: ['vehicles', 'non-vehicles']

vehicle_dir = "../input/vehicle-detection-image-set/data/vehicles"
nonvehicle_dir = "../input/vehicle-detection-image-set/data/non-vehicles"
vehicle = os.listdir(maindir+"/vehicles")
non_vehicle = os.listdir(maindir+"/non-vehicles")
print(f"Number of Vehicle Images: {len(vehicle)}")
print(f"Number of Non Vehicle Images: {len(non_vehicle)}")

Output: Number of Vehicle Images: 8792 | Number of Non-Vehicle Images: 8968

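Let us display an image from vehicle_dir:

import cv2
import matplotlib.pyplot as plt

# Pick five random file names and load the first one.
vehicle_img = np.random.choice(vehicle,5)
img = cv2.imread(vehicle_dir+'/'+vehicle_img[0])
# OpenCV loads images as BGR; matplotlib expects RGB.
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()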

Output: vehicle image

There are several open-source libraries for computer vision and image processing, and OpenCV is one of the largest. It can detect objects, people, and even human handwriting in pictures and videos, and it supports Python, C++, and Java, among other languages.

Calling cv2.imread() reads an image from a file. OpenCV stores the color channels in BGR order, which is why the code above converts the image to RGB before displaying it.
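
The cv2.cvtColor(img, cv2.COLOR_BGR2RGB) call used in the code simply reorders the color channels. The NumPy-only sketch below mimics that reordering on a tiny made-up image, so it runs without OpenCV; the pixel values are hypothetical.

```python
import numpy as np

# A hypothetical 1x2 image in OpenCV's BGR channel order:
# one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels, which is the same
# reordering that BGR-to-RGB conversion performs on a 3-channel image.
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # the blue pixel, now stored as (R, G, B)
print(rgb[0, 1])  # the red pixel, now stored as (R, G, B)
```

Skipping this step does not break anything, but matplotlib would show blue cars as red and vice versa.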

We will also check an image from nonvehicle_dir:

# cv2 and matplotlib are already imported above.
nonvehicle_img = np.random.choice(non_vehicle,5)
img = cv2.imread(nonvehicle_dir+'/'+nonvehicle_img[0])
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()

Output: non-vehicle image

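We now load every image into the train list and its label into the test list:

train = []  # image arrays
test = []   # matching labels
import tqdm

for i in tqdm.tqdm(vehicle):
    img = cv2.imread(vehicle_dir+'/'+ i)
    img = cv2.resize(img,(150,150))
    train.append(img)
    test.append("Vehicle")

# Assumption: non_vehicle1 is a subset of 8792 non-vehicle images, which
# matches the 17584 (= 2 x 8792) rows in the output further down.
# Taking the first 8792 file names is a guess.
non_vehicle1 = non_vehicle[:len(vehicle)]

for i in tqdm.tqdm(non_vehicle1):
    img = cv2.imread(nonvehicle_dir+'/'+ i)
    img = cv2.resize(img,(150,150))
    train.append(img)
    test.append("Non Vehicle")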

Output: Picture5.png

With the tqdm library, we can create command-line progress bars and GUI progress bars. We use these progress indicators to check whether the loop is getting stuck somewhere, so that we can deal with it right away.

train = np.array(train)
test = np.array(test)
train.shape, test.shape

Output: ((17584, 150, 150, 3), (17584,))

The train dataset contains the image arrays, whereas the test dataset contains the label of each respective image.

Let us check the train and test datasets:

train

Output: Picture7.png

test

Output: array(['Vehicle', 'Vehicle', 'Vehicle', ..., 'Non Vehicle', 'Non Vehicle', 'Non Vehicle'], dtype='<U11')

The test data, unfortunately, consists of string labels. First, we have to convert them into numeric data; for instance, vehicles will be labeled as 1, and non-vehicles will be labeled as 0.

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
test = le.fit_transform(test)
test


Output: array( [ 1, 1, 1, . . . . , 0, 0, 0 ] )
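
Why does "Vehicle" become 1 and "Non Vehicle" become 0? LabelEncoder numbers the classes in sorted order. The sketch below reproduces that idea with plain NumPy on a handful of made-up labels; it illustrates the mapping and is not the encoder's actual implementation.

```python
import numpy as np

labels = np.array(["Vehicle", "Vehicle", "Non Vehicle", "Vehicle", "Non Vehicle"])

# np.unique returns the sorted distinct labels and, for each element,
# the index of its label in that sorted list.
classes, encoded = np.unique(labels, return_inverse=True)

print(classes)  # sorted class names; position in this list is the code
print(encoded)  # one integer code per original label
```

Because "Non Vehicle" sorts before "Vehicle", it gets code 0 and "Vehicle" gets code 1.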

Now this will make our life easier!

The next step involves splitting the images (train) into x_train and x_test, and the labels (test) into y_train and y_test. But before that, we need to shuffle both arrays. Shuffling is nothing more than rearranging the items of an array; passing both arrays to a single shuffle call applies the same reordering to each, so every image stays paired with its label.

from sklearn.utils import shuffle
train,test = shuffle(train, test)

from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(train,test,test_size=0.2,random_state = 50)
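
The key point of shuffling two arrays together is that both receive the same reordering. Here is a minimal NumPy sketch of that idea, using small stand-in arrays and an arbitrary seed; all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=50)  # seed chosen arbitrarily

images = np.array([10, 11, 12, 13, 14])  # stand-ins for image arrays
labels = np.array([1, 1, 1, 0, 0])       # the label of each stand-in

# One permutation, applied to both arrays, keeps every pair aligned.
order = rng.permutation(len(images))
images_shuffled = images[order]
labels_shuffled = labels[order]

# Stand-ins 10 to 12 are the "vehicles" (label 1); check the pairing survived.
for img, lab in zip(images_shuffled, labels_shuffled):
    assert lab == (1 if img <= 12 else 0)

print(images_shuffled, labels_shuffled)
```

Shuffling the two arrays separately would scramble the pairing and leave the model training on wrong labels.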

Model Training:

import tensorflow as tf
model = tf.keras.models.Sequential()

The Sequential API is the simplest way to construct and run a model in Keras. A sequential model lets us build a network layer by layer.

An important thing to note is that, right now, each image is a multidimensional array. We want the images to be flat rather than n-dimensional. If we were using something more powerful, like a CNN, we might not need this step. But in this case, we definitely want to flatten them. For this, we can use one of the layers built into Keras:

model.add(tf.keras.layers.Flatten())
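
To picture what flattening does to the data, here is a small NumPy sketch. It uses a batch of four blank stand-in images with the 150x150x3 shape from this post; the variable names are made up for illustration.

```python
import numpy as np

# A batch of 4 blank stand-in images, each 150 x 150 pixels with 3 channels.
batch = np.zeros((4, 150, 150, 3), dtype=np.uint8)

# Flattening keeps the batch axis and merges the rest into one long vector.
flat = batch.reshape(len(batch), -1)

print(flat.shape)  # each image becomes 150 * 150 * 3 = 67500 values
```

So every image reaches the first dense layer as a single row of 67500 numbers.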


The next essential step is building the layers. We will create five Dense layers, four hidden and one output, to obtain the result.

model.add(tf.keras.layers.Dense(128, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(64, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(32, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(16, activation = tf.nn.relu))

model.add(tf.keras.layers.Dense(2, activation = tf.nn.softmax))  # Output Layer

The next most important thing is to decide the number of neurons in the hidden layers. We typically use systematic experimentation to determine what works best for our particular dataset. As a rule of thumb, the number of hidden neurons decreases in succeeding layers, as the network condenses the input into the patterns that identify the target labels.
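
One way to get a feel for these choices is to count the trainable parameters each layer adds. The sketch below does the arithmetic for the layer sizes used in this post, assuming the images are flattened to 150 * 150 * 3 values before the first Dense layer; the helper name is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """One weight per input per unit, plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [128, 64, 32, 16, 2]  # units per Dense layer in this post
n_inputs = 150 * 150 * 3            # flattened image length (assumed input)

total = 0
for units in layer_sizes:
    params = dense_params(n_inputs, units)
    print(f"Dense({units}): {params:,} parameters")
    total += params
    n_inputs = units                # this layer's outputs feed the next one

print(f"Total: {total:,}")
```

Almost all of the parameters sit in the first layer, because it is connected to every pixel value of the flattened image.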

I have taken 128 neurons for the first layer, and for the subsequent layers, I have used 64, 32, and 16 neurons, respectively. You are free to experiment with these parameters. With a softmax output, the output layer has one neuron per class; in our case, that is 2. Softmax is the usual activation for the last layer of a classification network, and it turns the output into a probability distribution over the targets.
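
Here is a minimal pure-Python sketch of softmax, to show how two raw scores turn into a probability distribution. The input scores are hypothetical; in the real network they would come from the last Dense layer.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtracting the max avoids overflow
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs for the classes (non-vehicle, vehicle).
probs = softmax([0.5, 2.9])
print(probs)
```

The larger score receives the larger probability, and the two values always add up to 1.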

An activation function handles the internal processing of a neuron: it defines the output of a node given an input or a collection of inputs. We already know that an artificial neuron computes the weighted sum of its inputs and then adds a bias. That net value may range from -Inf to +Inf. On its own, the neuron has no way to bound the value and so cannot determine the firing pattern. Thus, the activation function determines whether or not a neuron should be activated.

ReLU function

There are many activation functions, such as sigmoid and tanh. The rectified linear unit, or ReLU, is one of the most widely used in deep learning models. If it receives a negative input, it returns 0; if it receives a positive input, it returns that value unchanged. As a result, it may be written as:

f (x) = max (0, x)
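
The formula translates to Python almost word for word. This is a tiny illustrative sketch; the sample inputs are arbitrary.

```python
def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)


print([relu(x) for x in (-3.0, -0.5, 0.0, 2.0)])  # [0.0, 0.0, 0.0, 2.0]
```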

model.compile(optimizer = 'adam',
               loss = 'categorical_crossentropy',
               metrics = ['accuracy'])

We will use adam as our optimization algorithm for iteratively updating the weights.

Epochs - the number of times our training dataset passes through our neural network. It is a hyperparameter (a parameter whose value is used to regulate the learning process).
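
To make the epoch count concrete, the arithmetic below estimates how many weight-update steps training involves. It assumes the 17584 images reported earlier, the 0.2 test split, and a batch size of 32 (the Keras default when model.fit is given none); the variable names are made up.

```python
import math

n_images = 17584   # dataset size reported earlier in the post
test_size = 0.2
batch_size = 32    # Keras default for model.fit when none is given
epochs = 10

n_test = math.ceil(n_images * test_size)  # train_test_split rounds the test share up
n_train = n_images - n_test
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_test)
print(steps_per_epoch, steps_per_epoch * epochs)
```

Under these assumptions, one epoch is 440 batches, so ten epochs amount to 4400 weight updates. The same arithmetic gives 110 batches for the test set, which matches the 110/110 shown in the evaluation output.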

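Categorical_crossentropy - each predicted value is compared to the actual output of 0 or 1, and a score/loss is computed depending on how much it differs from the true value.

# Assumption: the labels are one-hot encoded at this point, since
# categorical_crossentropy requires it and y_test is passed through
# np.argmax later on.
y_train = tf.keras.utils.to_categorical(y_train, 2)
y_test = tf.keras.utils.to_categorical(y_test, 2)

model.fit(x_train, y_train, epochs = 10)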
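
To show how the categorical cross-entropy score behaves, here is a minimal pure-Python sketch for a single sample. The function is a simplified illustration, not the Keras implementation, and the predicted probabilities are hypothetical.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Loss for one sample: -sum(true_i * log(pred_i))."""
    # Terms with true_i == 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


vehicle = [0.0, 1.0]  # one-hot target: class 1 (vehicle)

confident_right = categorical_crossentropy(vehicle, [0.08, 0.92])
unsure = categorical_crossentropy(vehicle, [0.5, 0.5])
confident_wrong = categorical_crossentropy(vehicle, [0.78, 0.22])

print(confident_right, unsure, confident_wrong)
```

The loss is small when the model puts most of its probability on the correct class and grows as the probability drifts toward the wrong one.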

Output: output

You can see that with each epoch, the loss decreases and the model's accuracy increases. The highest accuracy reached was approximately 92%.

We can also evaluate the model on the test data:

val_loss, val_acc = model.evaluate(x_test, y_test)
print(val_loss, val_acc)

Output: 110/110 [==============================] - 1s 12ms/step - loss: 0.2342 - accuracy: 0.9338


predictions = model.predict(x_test)
print(predictions)

Output: [[8.3872803e-02 9.1612726e-01] [7.1111350e-07 9.9999928e-01] [1.9085113e-03 9.9809152e-01] ... [9.9999988e-01 1.5221100e-07] [7.8120285e-01 2.1879715e-01] [3.9260790e-08 1.0000000e+00]]

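Wow, that looks like a mess! Each row holds the predicted probabilities for class 0 (non-vehicle) and class 1 (vehicle). Let us simplify it by keeping only the more likely class: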

y_pred = model.predict(x_test)
y_pred = np.argmax(y_pred,axis=1)

Output: array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1])
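
To see what np.argmax does here, the sketch below applies it to four rows copied from the prediction output shown earlier (the first three rows and one of the last ones). Only NumPy is needed; the variable names are made up.

```python
import numpy as np

# Rows copied from the printed predictions:
# column 0 = P(non-vehicle), column 1 = P(vehicle).
sample_predictions = np.array([
    [8.3872803e-02, 9.1612726e-01],
    [7.1111350e-07, 9.9999928e-01],
    [1.9085113e-03, 9.9809152e-01],
    [7.8120285e-01, 2.1879715e-01],
])

# axis=1 picks, for every row, the column holding the largest probability.
sample_labels = np.argmax(sample_predictions, axis=1)
print(sample_labels)  # [1 1 1 0]
```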

We can also cross-check the actual values against the predicted values:

y_test = np.argmax(y_test,axis=1)

Output: array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1])
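
Comparing the two arrays by eye only goes so far. A minimal sketch of an element-wise comparison is shown below; the labels are hypothetical, with one deliberate mismatch so the result is not trivially 100%.

```python
import numpy as np

# Hypothetical labels for illustration, with one mismatch at index 2.
y_true = np.array([1, 1, 1, 0, 1, 0, 1, 1])
y_hat = np.array([1, 1, 0, 0, 1, 0, 1, 1])

matches = y_true == y_hat   # True wherever prediction and label agree
accuracy = matches.mean()   # share of True values

print(f"{matches.sum()} of {len(matches)} correct, accuracy = {accuracy:.3f}")
```

Applied to the real y_test and y_pred, the same two lines would give the fraction of test images the model labels correctly.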

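That's amazing; the values match almost everywhere! To get a better look, try:

sample_idx = np.random.choice(range(len(x_test)))
# Show the image (converted from OpenCV's BGR to RGB) with both labels.
plt.imshow(cv2.cvtColor(x_test[sample_idx], cv2.COLOR_BGR2RGB))
plt.xlabel(f"Actual: {y_test[sample_idx]} | Predicted: {y_pred[sample_idx]}")
plt.show()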

Output: Picture10.png

print("y_test value: ",y_test[sample_idx])
print("y_pred value: ",y_pred[sample_idx])

Output: y_test value: 1 | y_pred value: 1

Kudos! You have built your first neural network using TensorFlow!

We hope you liked this blog and found it useful! Check out our other blogs below! Also, subscribe to our YouTube channel for more great videos and live projects.