Wednesday, June 23, 2010

Sum to X

Given a large file, having N (N is very large) positive integers, how will you find a pair of numbers that add up to x.


  1. Divide integers in c classes using their log c leftmost bits. Assume there exist (Ah << c) | Al +
    (Bh << c ) | Bl = X. For each choice of Ah we
    compute Bh and go through the sequence creating
    one hash table H for elements with high bits = Ah
    and one sequence Z for elements with high bits = Bh.
    Next, iterate over Z and query H for (X-Z[i]).
    Using 32 bits integers and c = 2, it takes 4 passes.
    Using 64 bits it will be surely longer, but one
    can make some statistics.

    Is it what you intended?

  2. Suppose that we can store M ints in memory.
    Partition the large file into N/(2*M) files, each
    one holding M/2 ints. Before writing these M/2 integers, perform a quicksort in memory and write in sorted order. Hence we have created N/(2*M) sorted runs.

    Allocate two arrays A[M/2], B[M/2]

    for (i = 0; i < N/(2*M); i++) {

    Load ints of file i into array A

    for (j = i + 1; j < N/(2*M); j++) {
    Load ints of file j into array B

    for (k = 0; k < M/2; k++) {
    f = BinarySearch in array B for the number x-A[k]

    if (f) {