NumberOfDiscIntersections O(N) solution

Task

The task is to count overlapping discs given an array of radii

We draw N discs on a plane. The discs are numbered from 0 to N − 1. An array A of N non-negative integers, specifying the radiuses of the discs, is given. The J-th disc is drawn with its center at (J, 0) and radius A[J].

We say that the J-th disc and K-th disc intersect if J ≠ K and the J-th and K-th discs have at least one common point (assuming that the discs contain their borders).

The figure below shows discs drawn for N = 6 and A as follows:
A[0] = 1 A[1] = 5 A[2] = 2 A[3] = 1 A[4] = 4 A[5] = 0

There are eleven (unordered) pairs of discs that intersect, namely:

discs 1 and 4 intersect, and both intersect with all the other discs;

disc 2 also intersects with discs 0 and 3.

Write a function: int solution(vector<int> &A);

that, given an array A describing N discs as explained above, returns the number of (unordered) pairs of intersecting discs. The function should return −1 if the number of intersecting pairs exceeds 10,000,000.

Write an efficient algorithm for the following assumptions:

N is an integer within the range [0..100,000];

each element of array A is an integer within the range [0..2,147,483,647].

Solution 1

If two discs A and B, where A is to the left of B, overlap then the right hand edge of A is at the same point or to the right of the left hand edge of B.

This suggests the following algorithm:


Create a list of the left hand edges.
Create a list of the right hand edges.
for each left hand edge (lh) 
	overlap := number of right hand (rh) edges >= lh edge
	count +=  overlap

There are a couple of flaws in the above algorithm, viz:

It counts a disc as overlapping itself
It counts discs as overlapping where A is to the right of B, and the rh of A is therefore to the right of the lh of B

Fortunately it is easy to calculate how many intersections are wrongly counted like this, and correct for them.

For A[0], it wrongly counts N items (itself and the following N-1 discs to the right), for A[1] it adds N-1 items, etc.

This means the additional intersections counted are: N + (N-1) + … +1 = N(N+1)/2

The above algorithm can be optimised by using sorted lists, like this:


lhes:= a sorted list of the left hand edges
rhes:= a sorted list of the right hand edges
rhindex := 0
count:= 0
for lhs = each left hand edge, from smallest  to largest
	while (rhes[rhindex++] < lhs); // note final rh edge has to be >= last lh edge, so this will always terminate
	rhEdgesCount := rhes.size – rhindex
	count += rhEdgesCount 
return (count – N(N+1)/2)

The following implements this. Note, we have to subtract the adjustment before the loop, so we can break out early if count > 10,000,000


#include <algorithm>
#include <cassert>
#include <vector>

int solution(vector<int> &A) {
  const int MAX_COUNT = 10000000;
  // create sorted arrays of left-hand edges, and rh edges
  vector<int64_t> lhes;
  vector<int64_t> rhes;
  ulong N = A.size();
  for (size_t i = 0; i < N; i++) {
    // for all prev discs – is rhs intersecting with lhs?
    long lhs = static_cast<long>(i) – A[i];
    lhes.push_back(lhs);
    long rhs = A[i] + static_cast<long>(i);
    rhes.push_back(rhs);
  }

  std::sort(lhes.begin(), lhes.end());
  std::sort(rhes.begin(), rhes.end());

  long count = 0;
  // now we remove ones where circle centre is to right or same pos. of circle. all of these will be counted in the loop
  // below sum = n(n+1)/2. note includes each item as intersecting itself, so is N + (N-1) + …
  count -= (N + 1) * N / 2;
  size_t rhIndex = 0;

  // for each lh edge, count number of rh edges >= it
  for (size_t i = 0; i < lhes.size(); i++) {
    int64_t lhs = lhes[i];
    while (rhes[rhIndex] < lhs)
      rhIndex++;
    if (rhIndex == rhes.size())
      break;
    // add all circles where rh edge >= current left hand edge
    count += rhes.size() – rhIndex;
    if (count > MAX_COUNT)
      return -1;
  }
  assert(count <= MAX_COUNT);
  return static_cast<int>(count);
}

Results

This scores 100% on codility for both correctness and performance. The complexity is O(N * log(N)) and is dominated by the sorting.

Correctness	100%
Performance	100%
Complexity	O(N * log(N))

Solution 2

However, I was not happy with the above, the inner loop (lines 32-37) seems a bit messy.

So I googled for other solutions and found this: stackoverflow.com/a/16814894 by user Толя.

While this uses a different algorithm than the above, it has a neat hack:

For right hand edges that are >=N it is not necessary to store the exact value, only that they are >=N (because it is not possible for a lh edge to be >=N).

This means the rh sides can be stored in an array [0..N-1], with the elements of the array having a count of the number of edges at that point.

The same can be done with the lh edges.

Applying this to the above, we don’t need to sort the arrays, and the count of rh edges can be calculated by counting down the array keeping a running total.

This gives us this algorithm:


clear lhcount, rhcount
For radius in A
	lhe := max(0,centre – radius)
	lhcount[lhe]++
	rhe:= min(N-1,centre+radius)
	rhcount[rhe]++
count := 0
rhSum := 0
for i := N-1 downto 0 
	rhSum += rhcount[i]
	count += lhcount[i] * rhsum
return (count – N(N+1)/2)

The implementation is below


#include <cassert>
int solution(vector<int> &A) {
  const int MAX_COUNT = 10000000;
  ulong N = A.size();
  // create arrays of frequency of left-hand edges, and rh edges
  vector<int> lhes(N,0);
  vector<int> rhes(N,0);
  for (size_t i = 0; i < N; i++) {
    // for all discs – count instances of edges at points
    ulong lhs = static_cast<ulong>(max(0, static_cast<int>(i) – A[i]));
    lhes[lhs]++;
    ulong rhs = min(N – 1, static_cast<ulong>(A[i]) + i);
    rhes[rhs]++;
  }
  long count = 0;
  // now we remove ones where circle center is to right or same pos. of circle. all of these will be counted by loop
  // below sum = n(n+1)/2. note includes 0 and N so  N + (N-1) + …
  count -= (N + 1) * N / 2;
  ulong rhSum = 0;
  for (long i = N – 1; i >= 0; i–) {
    // for all prev discs – is rhs intersecting with lhs?
    rhSum += rhes[i];
    count += lhes[i] * rhSum;
    if (count > MAX_COUNT)
      return -1;
  }

  assert(count <= MAX_COUNT);
  return static_cast<int>(count);
}

Results

This scores 100% on codility for both correctness and performance. The time complexity is O(N).

Correctness	100%
Performance	100%
Complexity	O(N)

Other Solutions

The only other O(N) solutions I found are variations of the one by Толя.

This works as follows:

Firstly, build arrays of the number of lh edges and rh edges at each position 0..N-1, as above
Then iterate from 0..N-1, using these arrays to keep track of how many discs are overlapping at the current point.
For each lh edge, add on the number of overlapping discs to the count
If there are > 1 lh edges at a point, we also need to add on the number of new overlaps caused by these new discs overlapping each other. For r new discs this is r(r-1)/2 (the number of combinations of 2 items out of r).

This is my implementation in C++:


#include <cassert>
int solution(vector<int> &A) {
  const int MAX_COUNT = 10000000;
  size_t N = A.size();
  // create arrays of frequency of left-hand edges, and rh edges
  vector<int> lhes(N,0);
  vector<int> rhes(N,0);
  for (size_t i = 0; i < N; i++) {
    // for all discs – count instances of edges at points
    int lhs = max(0, static_cast<int>(i) – A[i]);
    lhes[lhs]++;
    int rhs = min(N – 1, static_cast<int>(A[i]) + i);
    rhes[rhs]++;
  }
  int count = 0;
  int overlapping = 0;
  for (size_t i = 0; i < N; i++) {
    // find number of new discs at this point
    int newDiscs = lhes[i];
    count += newDiscs * overlapping;
    count += newDiscs * (newDiscs – 1) / 2;
    if (count > MAX_COUNT)
      return -1;
    // update with number of new discs, minus number now passed
    overlapping += newDiscs – rhes[i];
  }
  assert(count <= MAX_COUNT);
  return count;
}

Results

Codility Results

Correctness	100%
Performance	100%
Complexity	O(N)

These blogs also have some variation of Толя’s algorithm

stackoverflow solution in Java

In C++

In Java

These blogs have a solution that involves sorting lists of edges (as in solution 1) and then iterating through, counting overlaps (as in Толя’s algorithm)

Overengineered Java solution

In Java

Overcomplicated C++ version

In Python

Test Function

This code tests the solution


int main() {
  vector<int> A9(100000, 0);
  assert(solution(A9) == 0);

  vector<int> A2{};
  assert(solution(A2) == 0);

  vector<int> A1{1, 5};
  assert(solution(A1) == 1);

  vector<int> A7{1, 5, 2, 1};
  assert(solution(A7) == 5);

  vector<int> A6{1, 5, 2};
  assert(solution(A6) == 3);

  vector<int> A{1, 5, 2, 1, 4, 0};
  assert(solution(A) == 11);

  vector<int> A3{1, 5, 2, 1, 4, 0, 0};
  assert(solution(A3) == 13);
  vector<int> A4{1, 5, 2, 1, 4, 0, 0, 0};
  assert(solution(A4) == 14);
  vector<int> A5{1, 5, 2, 1, 4, 0, 0, 0, 0};
  assert(solution(A5) == 15);

  vector<int> A8{1, 2147483647, 0};
  assert(solution(A8) == 2);
}

NumberOfDiscIntersections O(N) solution

Task

Solution 1

Results

Solution 2

Results

Other Solutions

Test Function

Index of solutions to Codility Lessons

Leave a Reply Cancel reply