Skip to main content

Command Palette

Search for a command to run...

Mo’s Algorithm – Step by Step Guide

A powerful trick to handle complex range queries in competitive programming

Updated
10 min read
Mo’s Algorithm – Step by Step Guide
M

I break down complex programming and DSA topics into simple, beginner-friendly explanations. Currently learning, building, and sharing my journey in tech.

Mo’s Algorithm is a technique used to answer range queries efficiently on arrays

  • You are given :

    • An array A

    • Many queries: [L, R]

  • Each query asks something like :

    • sum of elements

    • number of distinct elements

Use Mo's when :

  • Queries are offline (you know all queries beforehand)

  • No updates in array

  • Query depends on range (L, R)

  • You can :

    • Add an element

    • Remove an element

Idea of Mo’s Algorithm :

Instead of solving queries one by one, we :

1.   Reorder queries

2.   Process them smartly

3.   Reuse previous computations

Core Concept :
We maintain a current range [curL, curR]

Initially :
curL = 0
curR = -1 (empty range)

Then for each query [L, R] :

We expand/shrink range using :

  • Add element → when expanding

  • Remove element → when shrinking

Understanding Pointer Movement in Mo’s Algorithm

In Mo’s Algorithm, we maintain a current range [curL, curR] and adjust it to match each query range [L, R]

Instead of recomputing everything from scratch, we move the pointers step by step and update the answer using two functions :

  • add(index) → include an element

  • remove(index) → exclude an element

// Expand Right
while(curR < R) 
{
    curR = curR + 1;
    add(curR);
}
/*
If our current right pointer is behind the required R,
we move it forward and add new elements into our range.
*/


// Shrink Right
while(curR > R) 
{
    remove(curR);
    curR = curR - 1;
}
/*
If our right pointer goes beyond R,
we remove extra elements and move it back.
*/


// Expand Left
while(curL > L) 
{
    curL = curL - 1;
    add(curL);
}
/*
If current left pointer is ahead of L,
we move it backward and include new elements.
*/


// Shrink Left
while(curL < L) 
{
    remove(curL);
    curL = curL + 1;
}
/*
If left pointer is behind L,
we remove elements and move it forward.
*/

Query Sorting :
We divide array into blocks : blocksize = sqrt(N)
Each query( [L,R] ) belongs to a block : block = L / blocksize

Sorting Rule :

/*
[L1,R1] -> Query 1
[L2,R2] -> Query 2
*/

if(block(L1) != block(L2))
    sort by block(L)
else
    sort by R

Example : Array A = [1, 2, 1, 3, 2]

Queries : 0 - Indexed
Q1: [0, 2]
Q2: [1, 3]
Q3: [0, 4]

Block Size : sqrt(5) ≈ 2

Array indices :

Index :  0  1  |  2  3  |  4
Block :  0  0  |  1  1  |  2

Sort Queries :

Query

L

R      

Block

Q1

0

2

0

Q2

1

3

0

Q3

0

4

0

Block 0 → indices [0, 1]
All queries have L = 0 or 1
So all fall in same block -> Block 0
Sorted by R : Q1 → Q2 → Q3

Why Sorted By R?
Inside the same block :
L values are already close (0 or 1 here)
   So moving L pointer cost is small anyway
So we focus on minimizing R movement

Why sorting by L is Not enough?
Inside the same block :

  • All queries already have similar L (0 or 1)

  • So sorting by L does not create meaningful order

Example if sorted by L :
Q1: [0,2]
Q3: [0,4]
Q2: [1,3]

Now look at R :

R movement :
0 → 2 → 4 → 3

Still goes forward then backward
Not optimal

Why sorting by R fixes it
If we sort by R :
Q1: [0,2]
Q2: [1,3]
Q3: [0,4]

R movement : 0 → 2 → 3 → 4

Smooth increasing order
No backtracking

Why L movement is "less important"
Inside the same block:

  • Block size = √N = 2

  • So L can only vary within a small range (0 to 1)

Max movement of L = very small (bounded by block size)
In your case :
L moves: 0 → 1 → 0  (only ±1) ---Cheap cost


Case : Sort by R

R : 0 → 2 → 3 → 4  
L : 0 → 1 → 0

Small L cost
Optimized R cost


Case : Sort by L only

R : 0 → 4 → 2 → 3  
L : 0 → 0 → 1

R moves back and forth → expensive
L is fine (but doesn’t matter much)

→ We allow small inefficiency in L to avoid large inefficiency in R


Problem : Count DISTINCT elements
We maintain:
freq[x] → frequency of element
distinct_count


Start :
curL = 0
curR = -1
distinct = 0


Process Q1: [0,2]
Expand :
add(0) → 1 → distinct = 1
add(1) → 2 → distinct = 2
add(2) → 1 → already exists
Answer = 2


Process Q2: [1,3]
Adjust :
remove(0) → 1 removed → freq=2-1=1 → still exists
add(3) → 3 → distinct = 3
Answer = 3


Process Q3: [0,4]
Adjust :
add(0) → 1 → freq increases
add(4) → 2 → already exists
Answer = 3


Add & Remove Functions :
Add :
freq[A[pos]]++
if freq becomes 1 → new element → distinct++

Remove :
freq[A[pos]]--
if freq becomes 0 → element gone → distinct--


Why do we sort queries like this?

Problem without sorting
If we process queries in random order :

  • The window [L, R] jumps randomly

  • We keep adding/removing many elements again and again

  • Time becomes O(N × Q) → too slow

We want to : Minimize movement of pointers (curL, curR)
Because every movement = add/remove operation = cost


Step 1 : Divide into blocks
blocksize = sqrt(N)
block = L / blocksize
Why? We group queries with similar L values together.
So instead of jumping L randomly:

  • We process queries where L is in the same range

Example : N = 10 → blocksize ≈ 3

Blocks : [0 – 2], [3 – 5], [6 – 8], [9 – ...]


Step 2 : Sort by block of L
if block(L1) != block(L2)
    sort by L

Why?

  • Queries in same block → similar L

  • So curL moves very little

Reduces left pointer movement


Step 3 : Sort by R (inside same block)
else
    sort by R

Why?
Inside a block:

  • L is already similar

  • Now we want R to move smoothly

So :
R: 2 → 5 → 8 → 10 (smooth increase)
Instead of:
R: 10 → 2 → 9 → 1
Reduces right pointer movement


#include<bits/stdc++.h>
using namespace std;

const int MAXN=1000005;

int A[MAXN];
int freq[MAXN];
int answer[MAXN];

int distinct=0;

struct Query 
{
    int L,R,idx;
};

int blocksize;

bool cmp(Query a, Query b) 
{
    int blocka = a.L / blocksize;
    int blockb = b.L / blocksize;

    if (blocka != blockb)
        return blocka < blockb;

    return a.R < b.R;
}

void add(int pos) 
{
    int val = A[pos];
    freq[val]++;
    if (freq[val] == 1) 
    {
        distinct++;
    }
}

void remove_(int pos) 
{
    int val = A[pos];
    freq[val]--;
    if (freq[val] == 0) 
    {
        distinct--;
    }
}

int main() 
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n;
    cin>>n;

    for(int i=0;i<n;i++)
        cin>>A[i];

    int q;
    cin>>q;

    vector<Query> queries(q);

    for(int i=0;i<q;i++) 
    {
        cin >> queries[i].L >> queries[i].R;
        queries[i].idx = i;
    }

    blocksize = sqrt(n);

    sort(queries.begin(), queries.end(), cmp);

    int curL = 0, curR = -1;

    for(auto &q : queries) 
    {
        int L = q.L;
        int R = q.R;

        // Expand Right
        while(curR < R) 
        {
            curR = curR + 1;
            add(curR);
        }
        
       
        // Shrink Right
        while(curR > R) 
        {
            remove_(curR);
            curR = curR - 1;
        }
        
        // Expand Left
        while(curL > L) 
        {
            curL = curL - 1;
            add(curL);
        }
        
        
        // Shrink Left
        while(curL < L) 
        {
            remove_(curL);
            curL = curL + 1;
        }

        answer[q.idx]=distinct;
    }

    for(int i=0;i<q;i++) 
    {
        cout<<answer[i]<<"\n";
    }

    return 0;
}

Time Complexity : O((N + Q) √N)


Why time = O((N + Q) √N) ?

1. R pointer

  • For each block, R moves from 0 → N

  • Number of blocks = √N

So : R work = N × √N


2. L pointer

  • For each query, L moves at most √N

  • Total queries = Q

So : L work = Q × √N

Total = R work + L work
      = N√N + Q√N
      = (N + Q)√N


We are using this function :

bool cmp(Query a, Query b)
{

    int blocka = a.L / blocksize;
    int blockb = b.L / blocksize;

    if (blocka != blockb)
        return blocka < blockb;

    return a.R < b.R;
}

But this is much more powerful :

bool cmp(Query a, Query b)
{
    int blocka = a.L / blocksize;
    int blockb = b.L / blocksize;

    if (blocka != blockb)
        return blocka < blockb;

    if (blocka & 1)
        return a.R > b.R;
    else
        return a.R < b.R;
}

Reason :-

if (blocka & 1)
    return a.R > b.R;
else
    return a.R < b.R;

Why do we use it?
Normally we sort like this :
return a.R < b.R;
But this causes a problem

Problem :
At the end of one block:

  • R goes to right end (large value)

Then next block:

  • R suddenly jumps back to small value

This causes huge unnecessary movement

Idea of this trick :
We alternate sorting direction:

Block

Sorting of R

Even block (0,2,4...)

Increasing (→)

Odd block (1,3,5...)

Decreasing (←)

Visualization :

Without optimization : 
Block 0 → R: 1 → 5 → 9      (increasing)
Block 1 → R: 2 → 6 → 10    (increasing)

Movement:
9 → 2 (big jump back)

Low – High – Low – High


With optimization:
Block 0 → R: 1 → 5 → 9     (increasing)
Block 1 → R: 10 → 6 → 2   (decreasing)

Movement:
9 → 10 (small move)

Low – High - High – Low

What does (blocka & 1) mean?
blocka & 1
Checks if block is odd or even

  • If blocka is odd → result = 1

  • If blocka is even → result = 0

Example:
blocka = 2 → even → (2 & 1 = 0)
blocka = 3 → odd  → (3 & 1 = 1)


Link To Sqrt Decomposition : Sqrt Decomposition


Core Difference Between Mo’s Algorithm &  Sqrt Decomposition :

Use Sqrt Decomposition when :

  • Queries are simple (sum, min)

  • Online queries (must answer immediately)

  • Updates are needed

Use Mo’s Algorithm when :

  • Queries are complex

  • Offline allowed

  • No updates (or very limited)

Feature

Sqrt Decomposition

Mo’s Algorithm

Query type

Simple (sum, min)

Complex (distinct, freq)

Queries

Online

Offline

Updates

Supported

Not supported

Idea

Precompute blocks

Reorder queries