Skip to main content

Command Palette

Search for a command to run...

Sweep Line Technique : The Secret Weapon for Range & Geometry Problems , Part-1

Optimize Event Processing and Interval Queries Like a Pro

Updated
9 min read
Sweep Line Technique : The Secret Weapon for Range & Geometry Problems , Part-1
M

I break down complex programming and DSA topics into simple, beginner-friendly explanations. Currently learning, building, and sharing my journey in tech.

Interval :

An Interval is a continuous range of values between a start point and an end point on a line
Example :
[1, 5) means the interval starts at 1 and goes up to 5, but does not include 5


Event :

An Event is a specific point where an interval starts or ends.
In Sweep Line problems, each interval is converted into two events:

  • Start Event → when the interval becomes active

  • End Event → when the interval becomes inactive

Example :
For interval [1, 5):

  • Start Event = 1

  • End Event = 5


Sweep Line :

The Sweep Line Technique is a strategy used to solve interval or geometric problems by converting them into sorted start and end events and scanning them in order with an imaginary moving line to maintain active counts efficiently.

This might feel confusing at first — don’t worry, once you see the example, it becomes very clear.

Each interval is broken into two events :

  • Start Event → +1 (an interval becomes active)

  • End Event → −1 (an interval becomes inactive)

All these events are then sorted.
After sorting, we move the imaginary sweep line from left to right and keep a running count of active intervals.

  • When we see +1, we increase the count.

  • When we see −1, we decrease the count.

While scanning :

  • If the count becomes 2, it means 2 intervals are active (overlapping) at that point.

  • If the count becomes 3, then 3 intervals overlap, and so on.

This method is efficient because instead of checking every interval against every other interval, we only process ordered events once, reducing time complexity and making it suitable for large datasets.

Sweep Line = Processing Events In Order
Sweep Line Works Only If Events Are in Sorted Order

Example: Sweep Line on X–Y Axis
We have 3 intervals on the X-axis :
[1, 5) → includes 1, but does Not include 5
[3, 7) → includes 3, but does Not include 7
[6, 9) → includes 6, but does Not include 9

We want to know :
How many intervals overlap at every moment?

Step 1 — Draw the Intervals on X-axis

Step 2 — Sweep Line Moving Left → Right
We place a vertical sweep line and move it across the X axis.
Here’s how it looks at each important point.
At X = 1

At X = 3

At X = 5

Final Overlap Summary

X PositionActive Intervals
11 (A)
32 (A, B)
51 (B)
62 (B, C)
71 (C)

Sweep Line Interpretation

  • You created events :

    • Start of interval = +1

    • End of interval = -1

  • You sorted them :
    1(+1), 3(+1), 5(-1), 6(+1), 7(-1), 9(-1)

  • As the vertical sweep line moves, you update the count of active intervals.

Very Simple Example
[1, 4)
[2, 6)

Events :
1 start +1
4 end -1
2 start +1
6 end -1

After sorting :
1 start → count=1
2 start → count=2
4 end → count=1
6 end → count=0

You processed events in order, so the “sweep” correctly tells how many intervals are active at every point.



+1 when an interval becomes active
−1 when an interval stops being active

Case 1 :
Intervals are [L, R) (Left inclusive, Right exclusive)
This is what your overlap table actually uses

Rule

  • +1 at L

  • −1 at R

Why?

  • Interval starts contributing at L

  • Interval stops contributing exactly at R

Case 2 :
Intervals are [L, R] (Both inclusive)
Rule

  • +1 at L

  • −1 at R + 1

Why?

  • Interval is still active at R

  • It must stop after R


Question :

You are given N intervals on the number line.
Each interval is of the form [L, R) (inclusive of L, and ending exactly at R).

You are also given an integer K.

Your task is to compute the total duration of all points on the number line that are covered by at least K intervals simultaneously.

In other words:

At every point x, check how many intervals cover x.
If that count ≥ K, then that point contributes to the answer.
Sum the lengths of all such portions.

Input

  • First line: two integers N and K

  • Next N lines: each contains two integers L and R, representing an interval [L, R).

Output

  • Print a single integer :
    The total length of the union of all positions where the number of active intervals is ≥ K.

Explanation :-

3 2 → n,k
1 5
3 7
4 6

Why next point − current point ?

When the overlap count is ≥ 2 at some position, it means from this point until the next event point, the overlap stays the same. So the whole gap is valid.
That gap has a length, and its length is next point − current point.

#include<bits/stdc++.h>
using namespace std;
int main()
{
    int n,k;
    cin>>n>>k;
    vector<pair<int,int>> events;
    for(int i=0;i<n;i++)
    {
        int l,r;
        cin>>l>>r;
        events.push_back({l,+1});
        events.push_back({r,-1});
    }
    sort(events.begin(),events.end());

    int cnt=0;
    int ans=0;
    for(int i=0;i<events.size();i++)
    {
        cnt+=events[i].second;
        if(i+1< events.size() && cnt>=k)
        {
            ans += events[i+1].first - events[i].first;
        }
    }
    cout<<ans<<endl;

    return 0;
}

If The Queries Are Offline(Offline queries mean you don’t have to process each query immediately —
you can see all the queries first, then reorder, sort, or preprocess them in any way you like before answering.)

What If They Are Online(Online queries mean you must process each query immediately in the exact order they are given —
you cannot see or change future queries before answering the current one.)

Question :

Sweep Line Works Only If Events Are in Sorted Order

Problem Statement :
-------------------
N intervals (N <= 10^5)
Each interval: [Li, Ri, Xi] (<= 10^5 each)
Each interval has Xi value associated with it.

Q queries (<= 10^5 each)
Y K
K Value <= 10

For each query point Y, find all intervals that cover Y.
Take their Xi values, choose the top K values, and add them.
If there are fewer than K intervals, just add all the values you have.

N
L R X (N Times)
Q
Y K (Q Times)

5
1 9 1
5 7 2
3 10 3
11 19 5
13 15 6
5
2 2
4 5
6 2
14 2
14 1

Ans: 
1
4
5
11
6
2
1 10 1
5 15 2
3
7 3
12 2
2 1

1_________________10(X = 1)

        5________________________15(X = 2)

________________________________________
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17  -> Axis
  |         |           |
 2,1       7,3        12,2
 y,k       y,k        y,k
 3rd       1st        2nd  -> Queries
 1         3          2    -> Answers For Each Query


0 7 3    -> 0,1,2 Are Indexing
1 12 2
2 2 1

Sorting Based On Y Value

2 2 1    -> 0,1,2 Are Indexing
0 7 3
1 12 2

Why Do We Sort on Y Values (Query Points) ?

We sort queries by Y value because the Sweep Line moves in one direction — left to right on the number axis.
If queries are not sorted, the sweep line would have to jump back and forth, which breaks the whole efficiency idea.

Main Reason :

The sweep line technique works only when we process positions in increasing order.

  • Intervals start at L

  • Intervals end at R

  • Queries happen at Y

All these lie on the same axis.
So we must process them in sorted order of position.

What Happens If We Don’t Sort?
Imagine queries come like this :

Y = 14
Y = 2
Y = 6

If we answer 14 first :

When we answer a query at Y = 14, we pretend our sweep line has already moved from 0 to 14 on the number line.

While moving:

  • When we pass an interval start (L) → we add it.

  • When we pass an interval end (R) → we remove it.

So by the time we reach 14:

  • Many intervals were added (because their start was before 14).

  • Some intervals were removed (because they already ended before 14).

Now if we go back to 2 :

  • We would need to undo removals

  • Re-add old intervals

  • Recalculate everything

This becomes messy and slow — almost like brute force again.

With Sorting :
If queries are sorted:

Y = 2 → 6 → 14

The sweep line only moves forward:

  • Add intervals when they start.

  • Remove intervals when they end.

  • At each Y, just read the current active set.

No backtracking. No recomputation.

    #include <bits/stdc++.h>
    using namespace std;
    int getSumTopK(multiset<int>& st, int k) 
    {
        int sum = 0;
        int currk = 0;
        for(auto it = st.rbegin(); it != st.rend() && currk < k; it++, currk++) 
        {
            sum += *it;
        }
        return sum;
    }
    signed main() 
    {
        /*
            add     -> 0
            remove  -> 2
            query   -> 1
        */
        int n;
        cin >> n;
        vector< pair< pair<int, int>, pair<int, int> > > events;
        for(int i = 0; i < n; i++) 
        {
            int l, r, x;
            cin >> l >> r >> x;
            events.push_back( { {l,0},{x,i} } );
            events.push_back( { {r,2},{x,i} } );     
        }

        int q;
        cin >> q;

        for(int i = 0; i < q; i++) 
        {
            int y, k;
            cin >> y >> k;
            events.push_back( { {y,1},{k,i} } );      
        }

        sort(events.begin(), events.end()); // l r y were used for sorting so that events are in sequence now
        int ans[q]; // for outputing ans

        multiset<int> st;

        for(int i = 0; i < events.size(); i++) 
        {
            // { { _ , _ } , { _ , _ } }  

            int point = events[i].first.first;
            int type = events[i].first.second;
            int metaData = events[i].second.first;   
            int idx = events[i].second.second;

            if(type == 0) 
            {
                st.insert(metaData);  // here metaData is value of the segment, add to the multiset
            } 
            else if(type == 2) 
            {
                st.erase(st.find(metaData)); // here metaData is value of the segment end here, remove from the multiset
            } 
            else if(type == 1) 
            {
                int k = metaData; // for query type, metaData is K value
                ans[idx] = getSumTopK(st, k);
            }
        }

        // print the ans in correct sequence
        for(int i = 0; i < q; i++) 
        {
            cout << ans[i] << "\n";
        }
        return 0;
    }

    /*
    2
    1 10 1
    5 15 2
    3
    7 3
    12 2
    2 1

    3
    2
    1
    */