Understanding Basic Hashing: From Brute‑Force to Maps and Unordered Maps
Introduction
- The video is part of the Strivers A‑to‑Z DSA course, touted as one of India’s most in‑depth data‑structures and algorithms series.
- The instructor also promotes the GoRush X contest by Newton School (₹10 lakh prize pool) and encourages registration.
Brute‑Force Counting
- Problem: Given an array, answer many queries of the form “how many times does X appear?”.
- Naïve solution: For each query, scan the whole array and count matches.
- Pseudocode:
function count(arr, x): cnt = 0 for i from 0 to arr.length‑1: if arr[i] == x: cnt += 1 return cnt - Time complexity: O(n) per query → O(n·Q) for Q queries. With n and Q up to 10⁵, this can reach 10¹⁰ operations (≈100 seconds), which is too slow.
Array Hashing for Numbers (Frequency Array)
- Idea: Pre‑compute the frequency of every possible value.
- Create an auxiliary array
freqof sizemaxValue+1(initialised to 0). - Single pass over the original array:
freq[arr[i]] += 1. - Answer each query in O(1) by returning
freq[x]. - Overall complexity: O(n + Q).
- Memory limits:
- Inside
main()you can allocate at most ~10⁶ integers. - Declaring globally extends the limit to ~10⁷.
- For values up to 10⁹ or larger, a plain frequency array is impossible.
Character Hashing
- When the data consists of characters, the same technique applies.
- For lowercase letters only, use an array of size 26 and index with
c - 'a'. - For the full ASCII set, use size 256 and rely on implicit conversion of a
charto its ASCII code. - Pseudocode example:
freq[26] = {0} for each ch in string: freq[ch - 'a'] += 1 // query answer = freq[queryChar - 'a'] - Complexity remains O(n + Q) with negligible memory.
Maps and Unordered Maps (C++ STL)
map- Stores key‑value pairs in sorted order.
- Insertion and lookup: O(log n).
- Missing keys return default‑constructed value (0 for integers).
- Uses only the keys that actually appear, saving memory compared to a full frequency array.
unordered_map- Hash‑based, average‑case O(1) for insert and lookup.
- Worst‑case degrades to O(n) due to collisions.
- No ordering guarantee.
- Preferred when average performance matters; fall back to
mapif you encounter time‑limit issues. - Example usage (frequency counting):
unordered_map<int,int> freq; for (int x : arr) freq[x]++; cout << freq[query] << '\n';
Division Method and Collisions
- When the key range is huge, a simple hash function such as
key % M(M = table size) is used. - The division method maps a large number to a smaller bucket index.
- Collision occurs when different keys produce the same bucket index.
- Common resolution: chaining – each bucket holds a linked list of all keys that map to it.
- In practice, STL containers handle collisions internally; you only need to know the concept for interviews.
Practical Tips & Homework
- Implement a program that reads an array, pre‑computes frequencies (using array,
map, orunordered_map), and answers Q queries. - Extend it to find the element with the highest and lowest frequency.
- Submit your solution in the comments for peer review.
Summary of When to Use Which Technique
| Situation | Recommended Structure |
|---|---|
| Small, bounded integer range (≤ 10⁶) | Frequency array |
| Small alphabetic range (lowercase) | 26‑size array (or 256 for full ASCII) |
| Large or sparse integer range | map (sorted) or unordered_map (average O(1)) |
| Need guaranteed O(log n) and ordering | map |
| Need fastest average look‑up and can tolerate rare worst‑case | unordered_map |
The lecture concludes the “basic hashing” module and signals the move to step 2 of the course.
Hashing transforms repeated linear scans (O(n·Q)) into a single pre‑computation followed by constant‑time queries (O(n+Q)). Use a simple frequency array for small, bounded ranges, ASCII tricks for characters, and STL map/unordered_map for large or sparse keys, keeping collision handling and complexity trade‑offs in mind.
Frequently Asked Questions
Who is take U forward on YouTube?
take U forward is a YouTube channel that publishes videos on a range of topics. Browse more summaries from this channel below.
Does this page include the full transcript of the video?
Yes, the full transcript for this video is available on this page. Click 'Show transcript' in the sidebar to read it.
queries. With n and Q up to 10⁵, this can reach 10¹⁰ operations (≈100 seconds), which is too slow. ### Array Hashing for Numbers (Frequency Array) - Ide
Pre‑compute the frequency of every possible value. - Create an auxiliary array `freq` of size `maxValue+1` (initialised to 0). - Single pass over the original array: `freq[arr[i]] += 1`. - Answer each query in O(1) by returning `freq[x]`. - Overall complexity: O(n + Q). - Memory limits: * Inside `main()` you can allocate at most ~10⁶ integers. * Declaring globally extends the limit to ~10⁷. * For values up to 10⁹ or larger, a plain frequency array is impossible.
queries. - Extend it to find the element with the highest and lowest frequency. - Submit your solution in the comments for peer review. ### Summary of When to Use Which Technique | Situation | Recommended Structure | |-----------|-----------------------| | Small, bounded integer range (≤ 10⁶) | Frequency array | | Small alphabetic range (lowercase) | 26‑size array (or 256 for full ASCII) | | Large or sparse integer range | `map` (sorted) or `unordered_map` (average O(1)) | | Need guaranteed O(log n) and ordering | `map` | | Need fastest average look‑up and can tolerate rare worst‑case | `unordered_map` | --- The lecture concludes the “basic hashing” module and signals the move to step 2 of the course. Hashing transforms repeated linear scans (O(n·Q)) into
single pre‑computation followed by constant‑time queries (O(n+Q)). Use a simple frequency array for small, bounded ranges, ASCII tricks for characters, and STL `map`/`unordered_map` for large or sparse keys, keeping collision handling and complexity trade‑offs in mind.