AI Educademy

🧠 AI Sprouts • Intermediate • ⏱️ 35 min read

Introduction to Neural Networks — How AI Thinks in Layers

The Brain-Inspired Algorithm 👋

In the last lesson, you learned about decision trees, KNN, and linear regression. They're powerful — but they struggle with truly complex tasks like recognising faces, understanding language, or generating images.

For those tasks, we need a different tool: neural networks — algorithms loosely inspired by how your brain works.

[Image: a biological neuron on the left, an artificial neuron on the right]
Neural networks borrow ideas from biology — but they're not actual brains!

From Neurons to Artificial Neurons 🔬

Your brain has about 86 billion neurons. Each one:

  1. Receives signals from other neurons
  2. Processes those signals (adds them up)
  3. Decides whether to fire (send a signal onwards)

An artificial neuron does the same thing, but with numbers:

  1. Receives numbers as inputs (data)
  2. Multiplies each input by a weight (how important is this input?)
  3. Adds them all up
  4. Decides whether to "fire" using an activation function
Inputs          Weights        Sum         Activation      Output
─────           ──────         ───         ──────────      ──────
x₁ = 0.5  ──→  w₁ = 0.8  ─┐
                           ├──→ 0.85 ──→ f(0.85) ──→  0.70
x₂ = 0.3  ──→  w₂ = 1.5  ─┘

Sum = (0.5 × 0.8) + (0.3 × 1.5) = 0.40 + 0.45 = 0.85
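To double-check the arithmetic, here is the same neuron in a few lines of Python (a minimal sketch; f is assumed here to be the sigmoid function, which maps 0.85 to about 0.70):

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))

# Inputs and weights from the diagram
inputs = [0.5, 0.3]
weights = [0.8, 1.5]

# Steps 1-3: multiply each input by its weight, then add
total = sum(x * w for x, w in zip(inputs, weights))
print(round(total, 2))           # 0.85

# Step 4: the activation function decides the output
print(round(sigmoid(total), 2))  # 0.7
```

Swap in a different activation function and the same weighted sum produces a different output — that choice is covered later in this lesson.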
🤔 Think about it:

Think of weights like volume knobs on a mixing board. Each input is a different instrument. The weights control how loud each instrument is. The neuron blends them together, and the activation function decides if the combined sound is loud enough to pass through to the speakers.


Layers: How Neurons Organise 🏗️

A single neuron can't do much — just like a single brain cell can't think. The power comes from organising neurons into layers.

The three types of layers

1. Input Layer — The eyes and ears 👀

  • Receives raw data (pixel values, numbers, words)
  • One neuron per input feature
  • Doesn't do any computation — just passes data forward

2. Hidden Layer(s) — The thinking brain 🧠

  • Where the actual learning happens
  • Each neuron finds a pattern in the data
  • More hidden layers = ability to find more complex patterns
  • Called "hidden" because you don't directly see their inputs or outputs

3. Output Layer — The answer 💡

  • Produces the final prediction
  • For classification: one neuron per category (cat, dog, bird)
  • For regression: one neuron with a number (predicted price)
INPUT LAYER      HIDDEN LAYER 1    HIDDEN LAYER 2     OUTPUT LAYER
(3 features)     (4 neurons)       (4 neurons)        (2 classes)

  [x₁] ──────→  [h₁] ──────→     [h₅] ──────→       [Cat: 0.92]
  [x₂] ──────→  [h₂] ──────→     [h₆] ──────→       [Dog: 0.08]
  [x₃] ──────→  [h₃] ──────→     [h₇]
                 [h₄] ──────→     [h₈]

Note: In reality, EVERY neuron in one layer connects
to EVERY neuron in the next layer (fully connected).
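The "fully connected" idea can be sketched in plain Python: a layer is just a list of neurons, and each neuron computes a weighted sum over all of the layer's inputs. (The weights below are arbitrary illustrative numbers, not a trained network.)

```python
def relu(x):
    return max(0, x)

def dense_layer(inputs, weights, biases):
    """One fully connected layer: each neuron sees ALL the inputs."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        outputs.append(relu(total))
    return outputs

# 3 input features feeding 4 hidden neurons
x = [0.5, -1.2, 0.3]
W = [[0.2, 0.4, -0.5],   # one row of weights per neuron
     [0.7, -0.1, 0.9],
     [-0.3, 0.8, 0.1],
     [0.5, 0.5, 0.5]]
b = [0.1, 0.0, -0.2, 0.3]

hidden = dense_layer(x, W, b)
print(hidden)  # 4 activations, one per hidden neuron
```

Stacking several of these layers, each feeding its outputs into the next, gives the network pictured above.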
🤯

The term "deep learning" simply means a neural network with many hidden layers — typically more than two. GPT-3 stacks 96 layers! More layers let the network learn increasingly abstract patterns: edges → shapes → objects → scenes.


Activation Functions — The Decision Makers 💧

After a neuron adds up its inputs, it needs to decide: should I fire or not? That's the job of the activation function.

The water flow analogy 🚿

Imagine a water pipe with a valve:

  • Water flows in (the weighted sum of inputs)
  • The valve controls what comes out (the activation function)
  • Some valves let everything through; others block most flow

Common activation functions

ReLU (Rectified Linear Unit) — The simple gate

  • If the input is positive → let it through unchanged
  • If the input is negative → block it (output zero)
  • Like a one-way valve: water flows forward but never backward
def relu(x):
    return max(0, x)

# Examples
print(relu(3.5))   # 3.5 (positive → passes through)
print(relu(-2.0))  # 0 (negative → blocked)

Sigmoid — The gentle squisher

  • Squishes any input into a range between 0 and 1
  • Perfect for probabilities: "There's a 0.87 chance this is a cat"
  • Like a dimmer switch — gently controls the flow
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Examples
print(sigmoid(5.0))    # 0.993 (very confident "yes")
print(sigmoid(0.0))    # 0.500 (completely uncertain)
print(sigmoid(-5.0))   # 0.007 (very confident "no")

Softmax — The vote counter

  • Used in the output layer for classification
  • Takes multiple values and converts them to probabilities that add up to 1
  • Like an election: every class gets a vote share

Raw outputs:   [2.0, 1.0, 0.5]
After softmax: [0.63, 0.23, 0.14]  → Cat (63%), Dog (23%), Bird (14%)
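Softmax is short enough to sketch directly: exponentiate each score, then divide by the total so everything sums to 1.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.5])
print([round(p, 2) for p in probs])   # [0.63, 0.23, 0.14]
print(round(sum(probs), 10))          # 1.0
```

Exponentiating first keeps every share positive and makes bigger scores win by a larger margin, which is exactly the "vote share" behaviour described above.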
💡

Why do we need activation functions? Without them, a neural network — no matter how many layers — would just be fancy linear regression. Activation functions introduce non-linearity, which lets the network learn curves, edges, and complex patterns instead of only straight lines.


Forward Propagation — Data Flows Through 🌊

Now let's see how data actually moves through a neural network. This process is called forward propagation — and it's simpler than it sounds.

Think of it as a relay race:

  1. Input layer receives the data and passes it to the first hidden layer
  2. Each neuron in the hidden layer multiplies, adds, and activates
  3. The results pass to the next layer
  4. This continues until we reach the output layer
  5. The output layer gives us the final prediction

A concrete example

Let's classify whether a fruit is an apple or orange using two features: weight (grams) and colour intensity (0–1 scale).

Input: weight = 150g, colour = 0.8 (orange-ish)

Step 1 — Input layer passes values forward:
  x₁ = 150 (weight)
  x₂ = 0.8 (colour)

Step 2 — Hidden neuron 1 computes:
  sum = (150 × 0.01) + (0.8 × 2.0) + bias(0.1) = 1.5 + 1.6 + 0.1 = 3.2
  output = ReLU(3.2) = 3.2

Step 3 — Hidden neuron 2 computes:
  sum = (150 × -0.005) + (0.8 × 1.5) + bias(0.3) = -0.75 + 1.2 + 0.3 = 0.75
  output = ReLU(0.75) = 0.75

Step 4 — Output layer computes from hidden outputs:
  apple_score  = (3.2 × -0.5) + (0.75 × 0.8) = -1.0
  orange_score = (3.2 × 0.7) + (0.75 × 0.3) = 2.465

Step 5 — Softmax converts to probabilities:
  Apple:   3%
  Orange: 97%  ← Prediction: Orange! 🍊
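The whole forward pass can be run end-to-end in a short Python sketch using the same made-up weights as the worked example:

```python
import math

def relu(x):
    return max(0, x)

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    return [e / sum(exps) for e in exps]

# Input features
weight_g, colour = 150, 0.8

# Hidden layer (weights and biases from the worked example)
h1 = relu(weight_g * 0.01 + colour * 2.0 + 0.1)    # 3.2
h2 = relu(weight_g * -0.005 + colour * 1.5 + 0.3)  # 0.75

# Output layer scores
apple_score = h1 * -0.5 + h2 * 0.8    # -1.0
orange_score = h1 * 0.7 + h2 * 0.3    # 2.465

# Softmax turns the two scores into probabilities
apple_p, orange_p = softmax([apple_score, orange_score])
print(f"Apple: {apple_p:.0%}, Orange: {orange_p:.0%}")
```

Try changing the input to a heavy, dull-coloured fruit and watch the probabilities shift towards apple.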

Backpropagation — Learning from Mistakes 🔄

Forward propagation gives us a prediction. But what if it's wrong? That's where backpropagation comes in — the process by which a neural network learns.

The dart-throwing analogy 🎯

Imagine you're learning to throw darts blindfolded:

  1. Throw a dart (forward propagation — make a prediction)
  2. Someone tells you how far off you were (calculate the error)
  3. Adjust your aim — a little left, a little higher (update the weights)
  4. Throw again — this time you're closer!
  5. Repeat thousands of times until you're hitting near the bullseye

That's backpropagation in a nutshell:

Forward propagation:  Data → Network → Prediction
                                          ↓
Compare:              Prediction vs Actual Answer = Error
                                          ↓
Backpropagation:      Error flows BACKWARD through the network
                                          ↓
Update:               Adjust weights to reduce the error
                                          ↓
Repeat:               Do this thousands of times!

Why "backward"?

The error signal starts at the output and flows backward through the network, layer by layer. Each neuron learns: "How much did I contribute to the error? What should I adjust?"

  • Output layer adjusts first — it's closest to the error
  • Then hidden layer 2 adjusts based on the output layer's feedback
  • Then hidden layer 1 adjusts based on hidden layer 2's feedback
  • The input layer doesn't adjust — it just provides data
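Backpropagation in miniature: for a single sigmoid neuron with squared error, the chain rule tells each weight how much it contributed to the mistake, and gradient descent nudges it in the opposite direction. This sketch uses illustrative numbers of my own, not values from the lesson:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# One neuron learning that input [1.0, 0.5] should output 1.0
x = [1.0, 0.5]
target = 1.0
w = [0.1, -0.2]   # start with small arbitrary weights
b = 0.0
lr = 0.5          # learning rate: how big each adjustment is

for step in range(300):
    # Forward pass: make a prediction
    z = w[0] * x[0] + w[1] * x[1] + b
    pred = sigmoid(z)

    # Measure the error, then compute its gradient via the chain rule
    error = (pred - target) ** 2
    dz = 2 * (pred - target) * pred * (1 - pred)

    # Backward pass: each weight adjusts by its share of the blame
    w[0] -= lr * dz * x[0]
    w[1] -= lr * dz * x[1]
    b    -= lr * dz

print(f"final prediction: {pred:.3f}")   # close to the target 1.0
```

A real network repeats this weight-by-weight blame assignment for every neuron in every layer, flowing from the output back to the first hidden layer.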
🤔 Think about it:

Imagine a factory assembly line making defective products. To fix the problem, you trace backward from the final product through each station: "Was the packaging wrong? Was the painting wrong? Was the raw material wrong?" Each station adjusts its process. That's backpropagation — tracing errors backward to fix each part of the network.


Interactive: Trace Data Through a Network 🔍

Let's trace data through a tiny network step by step. We have a 3-layer network (input → hidden → output) that classifies whether an email is spam.

Features:

  • xโ‚ = number of exclamation marks (5)
  • xโ‚‚ = contains the word "free" (1 = yes)

Weights (pre-trained):

Input → Hidden:          Hidden → Output:
  h₁: w₁=0.6, w₂=0.9     out: w₃=0.7, w₄=0.5
  h₂: w₁=0.3, w₂=0.4

Biases: b₁=0.1, b₂=0.2, b_out=-0.5

Let's trace it:

import math

# Inputs
x1, x2 = 5, 1

# Hidden neuron 1
z1 = (x1 * 0.6) + (x2 * 0.9) + 0.1     # = 3.0 + 0.9 + 0.1 = 4.0
h1 = max(0, z1)                        # ReLU → 4.0

# Hidden neuron 2
z2 = (x1 * 0.3) + (x2 * 0.4) + 0.2     # = 1.5 + 0.4 + 0.2 = 2.1
h2 = max(0, z2)                        # ReLU → 2.1

# Output neuron
z_out = (h1 * 0.7) + (h2 * 0.5) - 0.5  # = 2.8 + 1.05 - 0.5 = 3.35
output = 1 / (1 + math.exp(-z_out))    # Sigmoid → 0.966

print(f"Hidden layer: h1={h1}, h2={h2}")
print(f"Raw output: {z_out:.2f}")
print(f"Spam probability: {output:.1%}")   # 96.6% → SPAM! 🚫

What happened:

  1. Both hidden neurons activated strongly — lots of exclamation marks + "free" is suspicious
  2. The output neuron combined these signals
  3. Sigmoid converted the result to a probability: 96.6% chance of spam

Try modifying the inputs: What happens with x₁=0 (no exclamation marks) and x₂=0 (no "free")? The probability drops dramatically! That's the network using what it learned.
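Re-running the trace with both features zeroed out (a sketch reusing the same weights as the code above) shows the effect:

```python
import math

# Same pre-trained weights, but a "clean" email: no !, no "free"
x1, x2 = 0, 0

h1 = max(0, (x1 * 0.6) + (x2 * 0.9) + 0.1)   # only the bias remains → 0.1
h2 = max(0, (x1 * 0.3) + (x2 * 0.4) + 0.2)   # → 0.2
z_out = (h1 * 0.7) + (h2 * 0.5) - 0.5        # 0.07 + 0.1 - 0.5 = -0.33
output = 1 / (1 + math.exp(-z_out))          # Sigmoid of a negative number

print(f"Spam probability: {output:.1%}")     # about 42% → not flagged
```

With no suspicious features, only the small biases reach the output neuron, and the negative output bias pulls the probability below 50%.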


The Big Picture 🗺️

Here's how everything connects:

Data (Lesson 1) → Algorithm (Lesson 2) → Neural Network (Lesson 3)
     📊                  🧮                      🧠

Ingredients        Cooking method          The master chef
The fuel           The engine              The power plant

Neural networks are just a type of algorithm — but a remarkably powerful one. They power:

  • 🗣️ ChatGPT — understanding and generating language
  • 🎨 DALL-E / Midjourney — creating images from text
  • 🚗 Self-driving cars — seeing and understanding the road
  • 🔬 AlphaFold — predicting protein structures in biology
  • 🎵 Spotify — recommending your next favourite song
🤯

The basic idea of neural networks was proposed in 1943 — over 80 years ago! But they only became practical in the 2010s when we had enough data and fast-enough computers (GPUs) to train large networks. Sometimes great ideas just need to wait for technology to catch up.


Quick Recap 🎯

  1. Artificial neurons receive inputs, multiply by weights, add them up, and apply an activation function
  2. Neural networks have three layer types: input (receives data), hidden (learns patterns), output (makes predictions)
  3. Activation functions (ReLU, Sigmoid, Softmax) add non-linearity — without them, networks can only learn straight lines
  4. Forward propagation pushes data through the network to produce a prediction
  5. Backpropagation sends errors backward through the network to adjust weights and improve
  6. Neural networks power modern AI — from chatbots to self-driving cars

What's Next? 🚀

Congratulations — you now understand the three pillars of AI: data, algorithms, and neural networks! In upcoming lessons, we'll explore AI tools and APIs you can use right away, and dive into responsible AI — because building AI that works is only half the job. Building AI that's fair is the other half. 🌱

Lesson 3 of 30
โ†Algorithms Explained โ€” The Recipes of AI๐ŸŒณ AI Branchesโ†’