## Monday, September 3, 2012

### Given a stream compute the running average and the running median

#### 1 comment:

1. Here my approach:

Keep reading from the stream and store the values until you collect N values (N is decided by the user) into a temp array A. For each element "a" from the stream do:

1.1) Keep calculating the AVG of the read values so far by summing and dividing by the current number of elements read from the stream.

1.2) Maintain a MAX-heap and a min-heap and a variable k where top(MAX-heap) < k < top(min-heap). Once you read the next element "a" from the stream you compare it against k and if it is lesser than k you put "a" into MAX-heap else into min-heap. If the size of the two heaps differs by more than 1, then you pop the top elements of the heaps (let's call them M and m), and push M, m and k into the smaller heap. Then you pop from it the top element and assign k to that element. Now, the two heaps have the same size and k is the median.

As you read the N+k (k>0) value do:

2.1) print AVG and k as the current running average and median.

2.2) Maintain the AVG current by subtracting A[i]/N and adding a/N. Then a[i] = a and i = (i + 1) % N (i starts from 0).

2.3) maintain k as the current median applying the strategy described in 1.2.