Random commentary about Machine Learning, BigData, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search
Data stream is a very interesting topic. I recommend reading [Data Streams: Algorithms and Applications] by Muthukrishna
Assuming that the symbols in the stream are from an infinite vocabulary (else you can just collect the frequencies of each symbol and pick the max), the idea is the following:1) keep a map with 2 keys2) process the next element N in the stream3) if N is in the map then map[N]++4) if N is not in the map then replace the element with the lowest value in the map with map[N]=15) at any time you can return the value with the highest value in the mapTo make the implementation easier you could use a 2 nodes max-heap to keep track of the element with highest value in the map.The idea is that if at any time there has been an element that appeared more the 50% of the times in the stream, then this element will stay in the map with the highest value. If no element appeared more than 50% of the times so far then the result is meaningless.