AI Educademy
🧮 AI Sprouts • Beginner • ⏱️ 30 min read

Algorithms Explained — The Recipes of AI

From Data to Decisions 👋

In the previous lesson, you learned that data is the fuel of AI. But fuel alone doesn't drive a car — you need an engine. In AI, that engine is called an algorithm.

By the end of this lesson, you'll understand three fundamental algorithms and know when to use each one.

[Diagram] Data goes in, the algorithm processes it, predictions come out: an algorithm transforms raw data into useful predictions.

What is an Algorithm? 🤔

An algorithm is simply a set of step-by-step instructions to solve a problem.

You already follow algorithms every day:

  • ๐Ÿณ A cooking recipe โ€” "Heat oil, add onions, stir for 3 minutes..."
  • ๐Ÿ—บ๏ธ Directions to school โ€” "Walk north, turn left at the park, cross the bridge..."
  • ๐Ÿ”ข Long division โ€” A step-by-step process you learned in maths class

In AI, algorithms are the step-by-step instructions a computer follows to find patterns in data and make predictions.

🤔
Think about it:

Think about how you decide what to wear each morning. You probably check the weather, think about your plans, consider what's clean โ€” that's an algorithm! You follow a series of steps (even if unconsciously) to reach a decision. AI algorithms do the same thing, just with data instead of gut feelings.
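That morning routine really is an algorithm, and it's small enough to write down as code. A playful sketch — the function name, rules, and thresholds here are invented for illustration:

```python
def choose_outfit(raining, temperature_c, formal_event):
    # Each question narrows the choice, just like the morning routine
    if formal_event:
        return "suit"
    if raining:
        return "raincoat and boots"
    if temperature_c < 10:
        return "warm jacket"
    return "t-shirt"

print(choose_outfit(raining=True, temperature_c=8, formal_event=False))
# → raincoat and boots
```

Change the inputs and the same fixed steps produce a different decision — which is all an algorithm is.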


Algorithm 1: Decision Trees 🌳

A decision tree is one of the most intuitive algorithms in AI. It makes decisions by asking a series of yes/no questions — just like the game "20 Questions."

How it works

Imagine you're deciding whether to play outside:

Is it raining?
├── Yes → Stay inside 🏠
└── No → Is it above 15°C?
    ├── Yes → Play outside! ⚽
    └── No → Wear a jacket and play outside 🧥

That's a decision tree! Each node asks a question, each branch follows an answer, and each leaf gives a final decision.

A real ML example

Suppose we want to predict whether someone will buy a product:

Age greater than 25?
├── Yes → Has bought before?
│   ├── Yes → WILL BUY ✅ (95% confidence)
│   └── No → Price under £20?
│       ├── Yes → WILL BUY ✅ (70% confidence)
│       └── No → WON'T BUY ❌ (80% confidence)
└── No → Student discount available?
    ├── Yes → WILL BUY ✅ (60% confidence)
    └── No → WON'T BUY ❌ (75% confidence)
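In practice, nobody draws a tree like this by hand — the algorithm learns the questions from examples. Here's a minimal scikit-learn sketch; the purchase data, labels, and `max_depth` setting are all made up for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up training data: each row is [age, bought_before, price]
X = [[30, 1, 25], [40, 1, 15], [22, 0, 10], [28, 0, 30],
     [19, 0, 18], [45, 0, 50], [33, 1, 40], [20, 1, 12]]
y = [1, 1, 1, 0, 0, 0, 1, 1]  # 1 = will buy, 0 = won't buy

# Learn the yes/no questions from the examples
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Print the learned tree, one question per line
print(export_text(tree, feature_names=["age", "bought_before", "price"]))
```

The printed tree won't match the diagram above exactly: the algorithm picks whichever questions best split the data it was actually given.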

Why decision trees are great

  • ✅ Easy to understand — you can literally draw them on paper
  • ✅ Explainable — you can trace exactly why a decision was made
  • ✅ Work with both numbers and categories — age (number) or colour (category)

When they struggle

  • โŒ Can overfit โ€” memorise the training data instead of learning general patterns
  • โŒ A single tree can be inaccurate on complex problems
  • โŒ Small changes in data can create a completely different tree
๐Ÿคฏ

Random Forests — one of the most powerful ML algorithms — simply combine hundreds of decision trees and let them "vote" on the answer. It's like asking 500 people for directions and going with the majority. This simple idea dramatically improves accuracy!
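That voting idea is a single setting in scikit-learn. A quick sketch on the classic iris flower dataset — the dataset choice, split, and seeds are ours, just for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Split a sample dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.3, random_state=42)

# 500 trees, each trained on a random sample of the data;
# the forest's prediction is the majority vote of its trees
forest = RandomForestClassifier(n_estimators=500, random_state=42)
forest.fit(X_train, y_train)
print(f"Accuracy: {forest.score(X_test, y_test):.2%}")
```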


Algorithm 2: K-Nearest Neighbors (KNN) 🏘️

KNN is the "ask your neighbors" algorithm. Its logic is beautifully simple: things that are similar tend to be close together.

The intuition

Imagine you move to a new city and want to find a good restaurant. What do you do? You ask your nearest neighbors for recommendations! If 3 out of 5 neighbors recommend Italian food, you'd probably try Italian.

KNN works exactly the same way:

  1. Take a new, unknown data point
  2. Find the K closest data points in the training data (K is a number you choose)
  3. Let those neighbors vote on the answer
  4. Go with the majority

A visual example

Imagine a graph with red dots (cats) and blue dots (dogs), based on weight and height:

Height
  |   🔴  🔴
  |  🔴  ❓  🔴      ← What is ❓?
  |    🔴   🔵
  |  🔵  🔵
  |   🔵   🔵  🔵
  +----------------→ Weight

If K=3, we find the 3 nearest neighbors to ❓. If 2 are red (cat) and 1 is blue (dog), we predict: cat! 🐱
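The whole procedure fits in a few lines of plain Python. A from-scratch sketch, with made-up cat and dog (weight, height) measurements:

```python
import math
from collections import Counter

# Made-up training data: (weight, height) -> label
points = [((4, 25), "cat"), ((5, 24), "cat"), ((6, 26), "cat"),
          ((20, 50), "dog"), ((25, 55), "dog"), ((30, 60), "dog")]

def knn_predict(new_point, k=3):
    # Steps 1-2: sort training points by distance, keep the K closest
    nearest = sorted(points, key=lambda p: math.dist(p[0], new_point))[:k]
    # Steps 3-4: let the neighbors vote and go with the majority
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((5, 23)))   # close to the cats → cat
print(knn_predict((27, 58)))  # close to the dogs → dog
```

Notice there is no training step at all: every prediction re-measures the distances from scratch.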

Choosing K

  • K too small (e.g., K=1) → Sensitive to noise — one weird data point changes everything
  • K too large (e.g., K=100) → Too general — distant points that aren't really relevant get a vote
  • Sweet spot → Usually K=3, 5, or 7 works well. For two-class problems, pick an odd K so the vote can't tie!
In scikit-learn, the whole algorithm is a few lines:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a sample dataset and split it into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.3, random_state=42)

# Create and train a KNN model with K=5
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Predict on new data
prediction = knn.predict(X_test)
print(f"Accuracy: {knn.score(X_test, y_test):.2%}")
💡

KNN is a "lazy learner" — it doesn't actually learn anything during training! It just memorises all the data and does the real work at prediction time by calculating distances. This makes training instant but predictions slow on large datasets.


Algorithm 3: Linear Regression 📈

Linear regression is the art of fitting a line through data points. It's used when you want to predict a number (not a category).

The intuition

Think about this: the more hours you study, the higher your test score tends to be. If you plotted this on a graph, you'd see data points trending upward. Linear regression draws the best-fitting line through those points.

Score
100 |                    *   *
 80 |              *  *
 60 |        *  *
 40 |     *
 20 |  *
    +-------------------------→ Hours studied

The line lets you predict: "If I study for 7 hours, I'll probably score around 85."

The equation

Every line can be described as:

y = mx + b
  • y = what we're predicting (test score)
  • x = input feature (hours studied)
  • m = slope (how steep the line is)
  • b = intercept (where the line crosses the y-axis)

The algorithm finds the best values for m and b so the line is as close as possible to all the data points.

from sklearn.linear_model import LinearRegression
import numpy as np

# Study hours and test scores
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)
scores = np.array([20, 35, 45, 55, 65, 75, 82, 90])

# Fit the model
model = LinearRegression()
model.fit(hours, scores)

# Predict score for 5.5 hours of study
predicted = model.predict([[5.5]])
print(f"Predicted score for 5.5 hours: {predicted[0]:.1f}")
print(f"Slope (m): {model.coef_[0]:.2f}")
print(f"Intercept (b): {model.intercept_:.2f}")
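For a single input feature there is even a closed-form answer: the best-fitting slope is how much x and y vary together, divided by how much x varies on its own, and the intercept then follows from the means. Checking it by hand with NumPy on the same study data:

```python
import numpy as np

hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
scores = np.array([20, 35, 45, 55, 65, 75, 82, 90])

# Least-squares slope and intercept, computed directly
m = np.sum((hours - hours.mean()) * (scores - scores.mean())) \
    / np.sum((hours - hours.mean()) ** 2)
b = scores.mean() - m * hours.mean()
print(f"m = {m:.2f}, b = {b:.2f}")
```

These values agree with the `coef_` and `intercept_` that `LinearRegression` finds, since it solves the same least-squares problem.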

When linear regression works — and when it doesn't

  • ✅ Great when the relationship is roughly a straight line
  • ✅ Fast, simple, and easy to interpret
  • ❌ Terrible when the relationship is curved or complex
  • ❌ Sensitive to outliers — one extreme point can tilt the whole line

When to Use Which Algorithm? 🧭

| Question | Decision Tree | KNN | Linear Regression |
|----------|:---:|:---:|:---:|
| Predicting a category (cat/dog)? | ✅ | ✅ | ❌ |
| Predicting a number (price, score)? | ✅ | ✅ | ✅ |
| Need to explain the decision? | ✅✅ | ❌ | ✅ |
| Have a very large dataset? | ✅ | ❌ | ✅ |
| Relationship is a straight line? | ❌ | ❌ | ✅✅ |
| Don't know the relationship shape? | ✅ | ✅ | ❌ |

🤔
Think about it:

There's no single "best" algorithm. The right choice depends on your data, your problem, and what you need. A doctor diagnosing diseases might prefer a decision tree because it can explain why it made a diagnosis. A weather app predicting temperature might use linear regression because the relationship with historical data is roughly linear.


Interactive: Build a Movie Recommender 🎬

Let's build a simple decision tree for recommending films! Think about how you choose films:

Do you want something funny?
├── Yes → Do you like animated films?
│   ├── Yes → Watch "Inside Out 2" 🎭
│   └── No → Do you want something family-friendly?
│       ├── Yes → Watch "Home Alone" 🏠
│       └── No → Watch "The Grand Budapest Hotel" 🏨
└── No → Do you like action?
    ├── Yes → Do you prefer superheroes?
    │   ├── Yes → Watch "Spider-Man: Across the Spider-Verse" 🕷️
    │   └── No → Watch "Top Gun: Maverick" ✈️
    └── No → Do you want a true story?
        ├── Yes → Watch "Hidden Figures" 🚀
        └── No → Watch "Interstellar" 🌌

Try this yourself: Add more branches! What about genre, mood, length, or language? Every question you add makes the tree more personalised. This is exactly how recommendation algorithms start โ€” simple rules that get refined with data.

# A simple decision tree in code
def recommend_movie(funny, animated, family_friendly,
                    action, superhero, true_story):
    if funny:
        if animated:
            return "Inside Out 2 🎭"
        elif family_friendly:
            return "Home Alone 🏠"
        else:
            return "The Grand Budapest Hotel 🏨"
    else:
        if action:
            if superhero:
                return "Spider-Man: Across the Spider-Verse 🕷️"
            else:
                return "Top Gun: Maverick ✈️"
        else:
            if true_story:
                return "Hidden Figures 🚀"
            else:
                return "Interstellar 🌌"

# Try it!
print(recommend_movie(funny=False, animated=False, family_friendly=False,
                      action=True, superhero=True, true_story=False))
# Output: Spider-Man: Across the Spider-Verse 🕷️

Quick Recap 🎯

  1. An algorithm is a step-by-step procedure for solving a problem
  2. Decision trees ask yes/no questions to reach a decision — explainable and visual
  3. KNN classifies by asking the K nearest neighbors to vote — simple but lazy
  4. Linear regression fits a line through data to predict numbers — fast but limited to linear relationships
  5. No algorithm is universally best โ€” choose based on your data and problem
  6. Real AI systems often combine multiple algorithms for better results

What's Next? 🚀

You've now met three classic algorithms. In the next lesson, we'll explore neural networks — the powerful, brain-inspired algorithms behind modern AI breakthroughs like ChatGPT, image generation, and self-driving cars. Get ready to think in layers! 🧠

โ†Datasets and Data โ€” The Fuel of AIIntroduction to Neural Networks โ€” How AI Thinks in Layersโ†’