Divide integers in c classes using their log c leftmost bits. Assume there exist (Ah << c) | Al + (Bh << c ) | Bl = X. For each choice of Ah we compute Bh and go through the sequence creating one hash table H for elements with high bits = Ah and one sequence Z for elements with high bits = Bh. Next, iterate over Z and query H for (X-Z[i]). Using 32 bits integers and c = 2, it takes 4 passes. Using 64 bits it will be surely longer, but one can make some statistics.
Suppose that we can store M ints in memory. Partition the large file into N/(2*M) files, each one holding M/2 ints. Before writing these M/2 integers, perform a quicksort in memory and write in sorted order. Hence we have created N/(2*M) sorted runs.
Allocate two arrays A[M/2], B[M/2]
for (i = 0; i < N/(2*M); i++) {
Load ints of file i into array A
for (j = i + 1; j < N/(2*M); j++) { Load ints of file j into array B
for (k = 0; k < M/2; k++) { f = BinarySearch in array B for the number x-A[k]
Divide integers in c classes using their log c leftmost bits. Assume there exist (Ah << c) | Al +
ReplyDelete(Bh << c ) | Bl = X. For each choice of Ah we
compute Bh and go through the sequence creating
one hash table H for elements with high bits = Ah
and one sequence Z for elements with high bits = Bh.
Next, iterate over Z and query H for (X-Z[i]).
Using 32 bits integers and c = 2, it takes 4 passes.
Using 64 bits it will be surely longer, but one
can make some statistics.
Is it what you intended?
Suppose that we can store M ints in memory.
ReplyDeletePartition the large file into N/(2*M) files, each
one holding M/2 ints. Before writing these M/2 integers, perform a quicksort in memory and write in sorted order. Hence we have created N/(2*M) sorted runs.
Allocate two arrays A[M/2], B[M/2]
for (i = 0; i < N/(2*M); i++) {
Load ints of file i into array A
for (j = i + 1; j < N/(2*M); j++) {
Load ints of file j into array B
for (k = 0; k < M/2; k++) {
f = BinarySearch in array B for the number x-A[k]
if (f) {
printf("found");
break;
}
}
}
}