AI EducademyAIEducademy
AcademicsLabBlogAbout
Sign In
AI EducademyAIEducademy

Free AI education for everyone, in every language.

Learn

  • Academics
  • Lessons
  • Lab
  • Dashboard
  • About

Community

  • GitHub
  • Contribute
  • Code of Conduct

Support

  • Buy Me a Coffee โ˜•

Free AI education for everyone

MIT Licence. Open Source

Programsโ€บโœ๏ธ AI Sketchโ€บLessonsโ€บArrays & Hashing โ€” The Building Blocks of AI
๐Ÿ“Š
AI Sketch โ€ข Beginnerโฑ๏ธ 15 min read

Arrays & Hashing โ€” The Building Blocks of AI

Arrays & Hashing

Every AI system โ€” from recommendation engines to search indexes โ€” is built on two primitives: arrays and hash maps. Master these and you master the foundation.


How Arrays Work

An array stores elements in contiguous memory โ€” each slot is one step from the next.

Array memory layout showing contiguous cells with index labels and hash map with key-to-bucket mapping
Arrays store data in contiguous memory (left). Hash maps use a hash function to map keys to bucket positions (right).

Key characteristics:

  • O(1) access โ€” jump directly to any index
  • O(n) insert/delete โ€” shifting elements is expensive
  • Cache-friendly โ€” sequential memory = fast CPU reads
  • Fixed size (in most languages) โ€” must know capacity upfront
๐Ÿ’ก

Arrays are the backbone of tensors in machine learning. A 2D array is a matrix, a 3D array is a tensor โ€” the core data structure of every neural network.


How Hash Maps Work

A hash map converts a key into an index using a hash function, then stores the value in that bucket.

The process:

  1. Input: "apple" (key)
  2. Hash: hash("apple") % size โ†’ bucket 0
  3. Store: bucket 0 = {apple: 3}
  4. Lookup: O(1) average โ€” go straight to the bucket

Key characteristics:

  • O(1) average lookup, insert, delete
  • O(n) worst case โ€” when all keys collide into one bucket
  • Unordered โ€” no guaranteed iteration order
  • Keys must be hashable โ€” immutable types work best
๐Ÿค”
Think about it:

Why can't you use a list as a dictionary key in Python? Because lists are mutable โ€” if you change the list after hashing, the hash value changes but the bucket doesn't, breaking the lookup.


Pattern 1: Two Sum

Given an array and a target, find two numbers that add up to the target.

Brute force: Check every pair โ†’ O(nยฒ) Hash map approach: For each number, check if target - num exists โ†’ O(n)

Step by step with nums = [2, 7, 11, 15], target = 9:

| Step | Current | Need (9 - current) | In Map? | Action | |------|---------|---------------------|---------|--------| | 1 | 2 | 7 | No | Store {2: 0} | | 2 | 7 | 2 | Yes! | Return [0, 1] |

๐Ÿ’ก

This pattern โ€” "store what you've seen, check what you need" โ€” appears in dozens of interview problems.


Pattern 2: Frequency Counting

Count how often each element appears.

def frequency_count(nums):
    freq = {}
    for num in nums:
        freq[num] = freq.get(num, 0) + 1
    return freq

# [1, 2, 2, 3, 3, 3] โ†’ {1: 1, 2: 2, 3: 3}

Used in:

  • Top K Frequent Elements โ€” count then sort by frequency
  • Valid Anagram โ€” compare character frequencies
  • Group Anagrams โ€” group words with same sorted characters

Pattern 3: Grouping by Key

Group anagrams: ["eat","tea","tan","ate","nat","bat"]

Strategy: Sort each word โ†’ use sorted version as hash key.

| Original | Sorted (Key) | Group | |----------|-------------|-------| | "eat" | "aet" | ["eat", "tea", "ate"] | | "tan" | "ant" | ["tan", "nat"] | | "bat" | "abt" | ["bat"] |


The AI Connection: Feature Stores

In production ML systems, feature stores use hash maps at their core:

  • User ID โ†’ feature vector โ€” instant lookup for real-time recommendations
  • Product ID โ†’ embeddings โ€” similarity search in recommendation engines
  • Cache layer โ€” avoid recomputing expensive features
๐Ÿคฏ

Netflix's recommendation engine performs millions of hash map lookups per second to match users with content. Each user's viewing history is stored as a hash map of {movie_id: rating}, enabling O(1) feature retrieval during model inference.


Complexity Cheat Sheet

| Operation | Array | Hash Map | |-----------|-------|----------| | Access by index | O(1) | โ€” | | Search by value | O(n) | O(1) avg | | Insert at end | O(1)* | O(1) avg | | Insert at start | O(n) | O(1) avg | | Delete | O(n) | O(1) avg | | Space | O(n) | O(n) |

*amortised for dynamic arrays

๐Ÿค”
Think about it:

When would you choose an array over a hash map? When you need ordered data, cache performance, or minimal memory overhead. Hash maps trade space for speed โ€” each entry has overhead for the hash, key, and pointer.

Lesson 1 of 30 of 3 completed
โ†Back to programStrings & Pattern Matching โ€” How AI Reads Textโ†’