## Wednesday, June 23, 2010

### Sum to X

Given a large file, having N (N is very large) positive integers, how will you find a pair of numbers that add up to x.

1. Divide integers in c classes using their log c leftmost bits. Assume there exist (Ah << c) | Al +
(Bh << c ) | Bl = X. For each choice of Ah we
compute Bh and go through the sequence creating
one hash table H for elements with high bits = Ah
and one sequence Z for elements with high bits = Bh.
Next, iterate over Z and query H for (X-Z[i]).
Using 32 bits integers and c = 2, it takes 4 passes.
Using 64 bits it will be surely longer, but one
can make some statistics.

Is it what you intended?

2. Suppose that we can store M ints in memory.
Partition the large file into N/(2*M) files, each
one holding M/2 ints. Before writing these M/2 integers, perform a quicksort in memory and write in sorted order. Hence we have created N/(2*M) sorted runs.

Allocate two arrays A[M/2], B[M/2]

for (i = 0; i < N/(2*M); i++) {

Load ints of file i into array A

for (j = i + 1; j < N/(2*M); j++) {
Load ints of file j into array B

for (k = 0; k < M/2; k++) {
f = BinarySearch in array B for the number x-A[k]

if (f) {
printf("found");
break;
}
}
}
}