Mo’s Algorithm – Step by Step Guide
A powerful trick to handle complex range queries in competitive programming

I break down complex programming and DSA topics into simple, beginner-friendly explanations. Currently learning, building, and sharing my journey in tech.
Mo’s Algorithm is a technique used to answer range queries efficiently on arrays
You are given :
An array A
Many queries: [L, R]
Each query asks something like :
sum of elements
number of distinct elements
Use Mo's when :
Queries are offline (you know all queries beforehand)
No updates in array
Query depends on range (L, R)
You can :
Add an element
Remove an element
Idea of Mo’s Algorithm :
Instead of solving queries one by one, we :
1. Reorder queries
2. Process them smartly
3. Reuse previous computations
Core Concept :
We maintain a current range [curL, curR]
Initially :
curL = 0
curR = -1 (empty range)
Then for each query [L, R] :
We expand/shrink range using :
Add element → when expanding
Remove element → when shrinking
Understanding Pointer Movement in Mo’s Algorithm
In Mo’s Algorithm, we maintain a current range [curL, curR] and adjust it to match each query range [L, R]
Instead of recomputing everything from scratch, we move the pointers step by step and update the answer using two functions :
add(index)→ include an elementremove(index)→ exclude an element
// Expand Right
while(curR < R)
{
curR = curR + 1;
add(curR);
}
/*
If our current right pointer is behind the required R,
we move it forward and add new elements into our range.
*/
// Shrink Right
while(curR > R)
{
remove(curR);
curR = curR - 1;
}
/*
If our right pointer goes beyond R,
we remove extra elements and move it back.
*/
// Expand Left
while(curL > L)
{
curL = curL - 1;
add(curL);
}
/*
If current left pointer is ahead of L,
we move it backward and include new elements.
*/
// Shrink Left
while(curL < L)
{
remove(curL);
curL = curL + 1;
}
/*
If left pointer is behind L,
we remove elements and move it forward.
*/
Query Sorting :
We divide array into blocks : blocksize = sqrt(N)
Each query( [L,R] ) belongs to a block : block = L / blocksize
Sorting Rule :
/*
[L1,R1] -> Query 1
[L2,R2] -> Query 2
*/
if(block(L1) != block(L2))
sort by block(L)
else
sort by R
Example : Array A = [1, 2, 1, 3, 2]
Queries : 0 - Indexed
Q1: [0, 2]
Q2: [1, 3]
Q3: [0, 4]
Block Size : sqrt(5) ≈ 2
Array indices :
Index : 0 1 | 2 3 | 4
Block : 0 0 | 1 1 | 2
Sort Queries :
Query | L | R | Block |
Q1 | 0 | 2 | 0 |
Q2 | 1 | 3 | 0 |
Q3 | 0 | 4 | 0 |
Block 0 → indices [0, 1]
All queries have L = 0 or 1
So all fall in same block -> Block 0
Sorted by R : Q1 → Q2 → Q3
Why Sorted By R?
Inside the same block :
L values are already close (0 or 1 here)
So moving L pointer cost is small anyway
So we focus on minimizing R movement
Why sorting by L is Not enough?
Inside the same block :
All queries already have similar L (0 or 1)
So sorting by L does not create meaningful order
Example if sorted by L :
Q1: [0,2]
Q3: [0,4]
Q2: [1,3]
Now look at R :
R movement :
0 → 2 → 4 → 3
Still goes forward then backward
Not optimal
Why sorting by R fixes it
If we sort by R :
Q1: [0,2]
Q2: [1,3]
Q3: [0,4]
R movement : 0 → 2 → 3 → 4
Smooth increasing order
No backtracking
Why L movement is "less important"
Inside the same block:
Block size = √N = 2
So L can only vary within a small range (0 to 1)
Max movement of L = very small (bounded by block size)
In your case :
L moves: 0 → 1 → 0 (only ±1) ---Cheap cost
Case : Sort by R
R : 0 → 2 → 3 → 4
L : 0 → 1 → 0
Small L cost
Optimized R cost
Case : Sort by L only
R : 0 → 4 → 2 → 3
L : 0 → 0 → 1
R moves back and forth → expensive
L is fine (but doesn’t matter much)
→ We allow small inefficiency in L to avoid large inefficiency in R
Problem : Count DISTINCT elements
We maintain:
freq[x] → frequency of element
distinct_count
Start :
curL = 0
curR = -1
distinct = 0
Process Q1: [0,2]
Expand :
add(0) → 1 → distinct = 1
add(1) → 2 → distinct = 2
add(2) → 1 → already exists
Answer = 2
Process Q2: [1,3]
Adjust :
remove(0) → 1 removed → freq=2-1=1 → still exists
add(3) → 3 → distinct = 3
Answer = 3
Process Q3: [0,4]
Adjust :
add(0) → 1 → freq increases
add(4) → 2 → already exists
Answer = 3
Add & Remove Functions :
Add :
freq[A[pos]]++
if freq becomes 1 → new element → distinct++
Remove :
freq[A[pos]]--
if freq becomes 0 → element gone → distinct--
Why do we sort queries like this?
Problem without sorting
If we process queries in random order :
The window [L, R] jumps randomly
We keep adding/removing many elements again and again
Time becomes O(N × Q) → too slow
We want to : Minimize movement of pointers (curL, curR)
Because every movement = add/remove operation = cost
Step 1 : Divide into blocks
blocksize = sqrt(N)
block = L / blocksize
Why? We group queries with similar L values together.
So instead of jumping L randomly:
- We process queries where L is in the same range
Example : N = 10 → blocksize ≈ 3
Blocks : [0 – 2], [3 – 5], [6 – 8], [9 – ...]
Step 2 : Sort by block of L
if block(L1) != block(L2)
sort by L
Why?
Queries in same block → similar L
So curL moves very little
Reduces left pointer movement
Step 3 : Sort by R (inside same block)
else
sort by R
Why?
Inside a block:
L is already similar
Now we want R to move smoothly
So :
R: 2 → 5 → 8 → 10 (smooth increase)
Instead of:
R: 10 → 2 → 9 → 1
Reduces right pointer movement
#include<bits/stdc++.h>
using namespace std;
const int MAXN=1000005;
int A[MAXN];
int freq[MAXN];
int answer[MAXN];
int distinct=0;
struct Query
{
int L,R,idx;
};
int blocksize;
bool cmp(Query a, Query b)
{
int blocka = a.L / blocksize;
int blockb = b.L / blocksize;
if (blocka != blockb)
return blocka < blockb;
return a.R < b.R;
}
void add(int pos)
{
int val = A[pos];
freq[val]++;
if (freq[val] == 1)
{
distinct++;
}
}
void remove_(int pos)
{
int val = A[pos];
freq[val]--;
if (freq[val] == 0)
{
distinct--;
}
}
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
cin>>n;
for(int i=0;i<n;i++)
cin>>A[i];
int q;
cin>>q;
vector<Query> queries(q);
for(int i=0;i<q;i++)
{
cin >> queries[i].L >> queries[i].R;
queries[i].idx = i;
}
blocksize = sqrt(n);
sort(queries.begin(), queries.end(), cmp);
int curL = 0, curR = -1;
for(auto &q : queries)
{
int L = q.L;
int R = q.R;
// Expand Right
while(curR < R)
{
curR = curR + 1;
add(curR);
}
// Shrink Right
while(curR > R)
{
remove_(curR);
curR = curR - 1;
}
// Expand Left
while(curL > L)
{
curL = curL - 1;
add(curL);
}
// Shrink Left
while(curL < L)
{
remove_(curL);
curL = curL + 1;
}
answer[q.idx]=distinct;
}
for(int i=0;i<q;i++)
{
cout<<answer[i]<<"\n";
}
return 0;
}
Time Complexity : O((N + Q) √N)
Why time = O((N + Q) √N) ?
1. R pointer
For each block, R moves from 0 → N
Number of blocks = √N
So : R work = N × √N
2. L pointer
For each query, L moves at most √N
Total queries = Q
So : L work = Q × √N
Total = R work + L work
= N√N + Q√N
= (N + Q)√N
We are using this function :
bool cmp(Query a, Query b)
{
int blocka = a.L / blocksize;
int blockb = b.L / blocksize;
if (blocka != blockb)
return blocka < blockb;
return a.R < b.R;
}
But this is much more powerful :
bool cmp(Query a, Query b)
{
int blocka = a.L / blocksize;
int blockb = b.L / blocksize;
if (blocka != blockb)
return blocka < blockb;
if (blocka & 1)
return a.R > b.R;
else
return a.R < b.R;
}
Reason :-
if (blocka & 1)
return a.R > b.R;
else
return a.R < b.R;
Why do we use it?
Normally we sort like this :
return a.R < b.R;
But this causes a problem
Problem :
At the end of one block:
- R goes to right end (large value)
Then next block:
- R suddenly jumps back to small value
This causes huge unnecessary movement
Idea of this trick :
We alternate sorting direction:
Block | Sorting of R |
Even block (0,2,4...) | Increasing (→) |
Odd block (1,3,5...) | Decreasing (←) |
Visualization :
Without optimization :
Block 0 → R: 1 → 5 → 9 (increasing)
Block 1 → R: 2 → 6 → 10 (increasing)
Movement:
9 → 2 (big jump back)
Low – High – Low – High
With optimization:
Block 0 → R: 1 → 5 → 9 (increasing)
Block 1 → R: 10 → 6 → 2 (decreasing)
Movement:
9 → 10 (small move)
Low – High - High – Low
What does (blocka & 1) mean?
blocka & 1
Checks if block is odd or even
If blocka is odd → result = 1
If blocka is even → result = 0
Example:
blocka = 2 → even → (2 & 1 = 0)
blocka = 3 → odd → (3 & 1 = 1)
Link To Sqrt Decomposition : Sqrt Decomposition
Core Difference Between Mo’s Algorithm & Sqrt Decomposition :
Use Sqrt Decomposition when :
Queries are simple (sum, min)
Online queries (must answer immediately)
Updates are needed
Use Mo’s Algorithm when :
Queries are complex
Offline allowed
No updates (or very limited)
Feature | Sqrt Decomposition | Mo’s Algorithm |
Query type | Simple (sum, min) | Complex (distinct, freq) |
Queries | Online | Offline |
Updates | Supported | Not supported |
Idea | Precompute blocks | Reorder queries |



