Antonio Gulli's coding playground: Selecting the median in an array

Wednesday, December 30, 2009

Selecting the median in an array

Ok, this is a typical "easy" interview question: you are given an array of integers, select the median value. What is the optimal solution?

6 comments:

codingplaygroundDecember 30, 2009 at 8:51 AM
Various solutions are:

1) Sort ant take the element in the middle; O(n log n) and it takes additional space O(n)

2) QuickSelect which you find in any Algo book is linear in average, but still takes O(n) space

3) A linear solution given by Cormern and other, which I cited in a previous post. It still takes additional O(n) space.

What if I want to have this in place?
ReplyDelete
Replies
GiroTontiJanuary 2, 2010 at 12:04 PM
It can be done in O(n log n) time and O(1) additional space using binary search.

Assume that the median is contained in the range [l,r]. With a single scan of the array we can sample a random value among the elements of the array contained in [l,r]. Then we just count the number of values less than x in order to decide if the median is in [l,x] or [x,r]. This solution does not modify the array.
ReplyDelete
Replies
codingplaygroundJanuary 4, 2010 at 5:52 AM
that's classical quickselect. Any linear time solution ?
ReplyDelete
Replies
GiroTontiJanuary 5, 2010 at 4:54 AM
Hum, probably I misunderstood the question... is the algorithm allowed to modify the input array?
If the answer is no then achieving O(n) time in the RAM model is probably an open problem and is actually impossible in the comparison-based model (see T.Chan SODA09 on that problem). If the answer is yes, the classic selection algorithm inspired by quicksort is O(n) expected time and is already inplace.
ReplyDelete
Replies
codingplaygroundJanuary 5, 2010 at 9:42 AM
No it is not allowed
ReplyDelete
Replies
GiroTontiJanuary 13, 2010 at 6:29 AM
In practice, if the integers are 32-bit words, i would use a radix-based approach. With a single scan of the array build a table with 256 entries, such that T[i] stores the number of elements whose topmost 8 bits are i. Then you can trivially locate the bucket containing the median, which gives its first 8 bits. Reiterating the same scheme compute the median with 4 scan of the array and 256 words of extra space. Clearly, the size of the table need to be tuned to make the algorithm as fast as possible in practice.
ReplyDelete
Replies

Add comment