Wednesday, July 28, 2010

Set intersection in an efficient way?

If you are given a set of 1000 integers in a set A , and 10,000,000 integers in a set B, how would you create a set C that would only contain numbers that are both in A and B?

1 comment:

  1. 1. sort the numbers of set A
    2. sort the numbers of set B
    3. perform ordered list intersection in a way
    similar to the inverted list intersection
    (i.e. maintain two pointers and compare the
    items pointed by the pointers. Progress the
    pointer pointing to the smallest int, or
    progress both pointers when a common item is
    4. For more efficiency place skip pointers
    on the second sorted list every sqrt(10^6)
    items. For every item in list A, perform
    a binary search for the correct skip pointer
    and place pointer B in the skip pointer
    you've found.