Saturday, May 29, 2010

Friday, May 28, 2010

Apple beats Microsoft


The maker of Mac computers, the iPhone, and the iPad tablet beats Microsoft as the most valued technology firm in the world on the stock market.

Wednesday, May 26, 2010

Business Model for Facebook

Facebook is one of the most interesting companies out there. They went from a few million users up to five hundred million in a couple of years. I forecast they will break the billion mark by Q2 2012.

In addition, running the business is not as expensive as search. They need to store user profiles and a lot of images. Video is outsourced to YouTube and real-time updates are not so expensive (how much history do they maintain?).

Anyway, running a business requires making money. So here is my call: What business model do you suggest for Facebook to make money?

Content ads never worked for search. Could they work for FB? Or what else?

Tuesday, May 25, 2010

Count numbers

How do you count the number of ways a number can be expressed as a sum of two or more positive integers?

For example, if the number is 5, count = 6, i.e. 1+1+1+1+1, 2+1+1+1, 2+2+1, 3+1+1, 4+1, 3+2.
Note that 2+3 is the same as 3+2.
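One possible way to approach the count is a small dynamic program over the allowed parts. This is only a sketch under the assumption that the summands are positive integers and that order does not matter; the function name count_sums is made up for illustration.

```python
def count_sums(n: int) -> int:
    """Count the ways to write n as an unordered sum of two or more positive integers."""
    # ways[t] = number of unordered sums equal to t using the parts seen so far.
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        # Processing parts one at a time counts each unordered sum exactly once.
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    # Drop the trivial "sum" that consists of n alone.
    return ways[n] - 1 if n >= 1 else 0
```

The table has n + 1 entries and the double loop runs in O(n^2) time, which is fine for interview-sized inputs.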

Monday, May 24, 2010

Detecting the dominant (>50%) symbol in a stream.

You are given a stream that you cannot hold in memory. At each instant you want to determine the dominant symbol observed in the stream so far (i.e. the symbol appearing more than 50% of the time).
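A classic fit for this kind of question is the Boyer-Moore majority vote, sketched below. It keeps one candidate and one counter. The caveat is that the candidate is guaranteed to be right only when a dominant symbol really exists; confirming that would need a second pass, which a stream does not allow. The function name dominant_candidates is an assumption of this sketch.

```python
from typing import Hashable, Iterable, Iterator, Optional


def dominant_candidates(stream: Iterable[Hashable]) -> Iterator[Optional[Hashable]]:
    """After each symbol, yield the only possible dominant symbol so far (or None)."""
    candidate: Optional[Hashable] = None
    count = 0
    for symbol in stream:
        if count == 0:
            # No surviving candidate: the new symbol becomes the candidate.
            candidate, count = symbol, 1
        elif symbol == candidate:
            count += 1
        else:
            # A different symbol cancels one occurrence of the candidate.
            count -= 1
        # count == 0 means no symbol can exceed 50% of the prefix seen so far.
        yield candidate if count > 0 else None
```

For instance, list(dominant_candidates("abb")) walks through the prefixes "a", "ab", and "abb" using constant memory.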

Sunday, May 23, 2010

Count in a range in O(1)

Given n integers in the range 0 to k, answer any query about how many of the n integers fall into a range [a, b] in O(1) time. Make your own assumptions and build your own indexing data structures.

PS: I used to ask this question during my interviews. Not any longer ;-)
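One set of assumptions that makes O(1) queries possible: k is small enough to afford an array of k + 2 counters, and O(n + k) preprocessing is acceptable. Under those assumptions a prefix-count table is enough. The names build_prefix_counts and count_in_range are illustrative only.

```python
from typing import List, Sequence


def build_prefix_counts(values: Sequence[int], k: int) -> List[int]:
    """Return prefix where prefix[i] is the number of values strictly less than i."""
    prefix = [0] * (k + 2)
    for v in values:
        prefix[v + 1] += 1  # assumes 0 <= v <= k
    for i in range(1, k + 2):
        prefix[i] += prefix[i - 1]
    return prefix


def count_in_range(prefix: List[int], a: int, b: int) -> int:
    """Number of indexed values v with a <= v <= b, assuming 0 <= a <= b <= k."""
    return prefix[b + 1] - prefix[a]
```

Each query is a single subtraction, so the cost per query does not depend on n or k.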

Saturday, May 22, 2010

Sort a dictionary of variable-length words

Assume you have a dictionary of k words. They have variable lengths, but the total number of symbols counted if we juxtapose all the words is n. Give an optimal sorting algorithm.

PS: this is a tricky question, with practical implications in information retrieval.
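One direction worth exploring is a trie: inserting all words touches each of the n symbols once, and a pre-order walk that visits children in alphabet order emits the words sorted. The sketch below assumes a fixed lowercase alphabet (my assumption, not part of the question), which makes the total cost linear in n for a constant alphabet size. It is not necessarily the intended answer.

```python
from typing import Dict, Iterable, List

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed fixed alphabet
ALPHABET_SET = set(ALPHABET)


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.words: List[str] = []  # words that end exactly at this node


def trie_sort(words: Iterable[str]) -> List[str]:
    """Sort words lexicographically using a trie over a fixed alphabet."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            if ch not in ALPHABET_SET:
                raise ValueError(f"symbol {ch!r} is outside the assumed alphabet")
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.words.append(word)

    result: List[str] = []
    stack = [root]  # explicit stack avoids recursion limits on long words
    while stack:
        node = stack.pop()
        result.extend(node.words)  # a prefix comes before its extensions
        # Push in reverse order so the smallest symbol is popped first.
        for ch in reversed(ALPHABET):
            child = node.children.get(ch)
            if child is not None:
                stack.append(child)
    return result
```

A comparison sort would instead pay for every string comparison, and a comparison between two long words with a long shared prefix is not a constant-time operation.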

Friday, May 21, 2010

The perfect interview: Make your own assumptions

I am doing a lot of interviews these days, so I wonder if there is a way to conduct the "perfect" interview. I say, probably not. How can I understand how good a candidate is in just one hour? Certain people are good at communication, other people are shy. Certain people are good team workers, other people are good as individual contributors. Certain people are good with numbers and symbolic computations, other people are good at describing methodologies. Certain people can have a bad day, other people prefer to leave for other places. And so on... So you want me to understand all these factors in just one hour? I say: no way. Give me a couple of weeks to start with.

Anyway, interviews are important and you need to find a solution. So I follow three golden rules:

1) Many judgments are better than one. The candidate should be evaluated by many independent interviewers in a loop. It is better if the interviewers express no judgment until the loop is closed, to avoid influencing each other;

2) I always ask myself: "Can I work with this candidate? Would (s)he help me in solving the problems we face day by day?"

My interviews revolve around problem solving (you read my blog so you know this), a lot of algorithmic questions ;-), a lot of C++ coding and design patterns. In addition, machine learning, retrieval, and data mining are my areas of expertise, so do expect to get some questions here. I am not very impressed if you know all the recent academic papers or the books. I am very interested in your intuitions. In fact, my third question is the most important one:

3) "How creative is this candidate? How much can we learn from him or her in the future?"

The most interesting part of the interview is when we can discuss hard problems applied to real life and to very large datasets (up to petabytes of data). I describe the problem in one or two sentences and then tell the candidate:

Make your own assumptions

Tuesday, May 18, 2010

Sort again

We have an N-element array with k distinct keys. Sort this array without using any extra memory.

Monday, May 17, 2010


Given a BST of n nodes, find two nodes whose sum is equal to a number k in O(n) time and constant space.

Sunday, May 16, 2010

Google and the WI-FI Mapping

Interesting post about Google Wi-Fi mapping, by way of Alessio. He suggested that this is due to the need to geolocate mobile users. I hope that they do not want to make this alternative use found on YouTube

Saturday, May 15, 2010


A new UX on top of the Facebook OpenGraph search API --

Friday, May 14, 2010

Why is Facebook's "Like" button a real game changer?

These are some points I discussed with a friend of mine over a good coffee.

Facebook "Like" is a real game changer for two different reasons:

1) FB enlarged the base of its data sources. Every time a user pushes the "Like" button on a partner site, they will know.

2) FB enlarged the base of its data sources. Every time a user loads an external page on a partner site that includes the "Like" button, they will know, even if the user does not push the button.

We both agreed that 2) is the more important piece of information, because it tells you a lot about real-time traffic.

Thursday, May 13, 2010

Partition a set (a bit harder)

Partition a set of numbers into two sets such that the difference between their sums is minimum and they have an equal number of elements.

Tuesday, May 11, 2010

Common substrings

Find the longest common subsequence of N given strings, each having a length between 0 and M.

Monday, May 10, 2010

Evolution of Search

For a long time I thought that Search was a mature market, with Google and Microsoft the only two players remaining to fight.

Well, I was wrong. Facebook has a lot of data to search and they are the only ones who can mine it. Try searching for the volcano situation. Strangely enough, they are not giving much emphasis to this feature. So far ...

Sunday, May 9, 2010

Data analysis is the language of this age

Metric, data, numbers. Every theory must start with a measure.

Saturday, May 8, 2010

Friday, May 7, 2010

Minimum in two lists

You are given two sorted lists of size m and n. Give an O(log m + log n) time algorithm for computing the kth smallest element in the union of the two lists.
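One way to stay within that bound is to binary search on how many of the k smallest elements come from the shorter list. The sketch below assumes random-access sequences (Python lists rather than linked lists), a 1-based k, and treats the "union" as a multiset, so duplicates are counted. The function name kth_smallest is illustrative.

```python
from typing import Sequence


def kth_smallest(a: Sequence[int], b: Sequence[int], k: int) -> int:
    """Return the k-th smallest (1-based) element among two sorted sequences."""
    if not 1 <= k <= len(a) + len(b):
        raise ValueError("k is out of range")
    if len(a) > len(b):
        a, b = b, a  # search over the shorter sequence
    # i elements are taken from a and k - i from b.
    lo = max(0, k - len(b))
    hi = min(k, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        j = k - i
        if a[i] < b[j - 1]:
            lo = i + 1  # a[i] belongs to the k smallest: take more from a
        else:
            hi = i
    i, j = lo, k - lo
    # The answer is the largest element of the chosen prefixes a[:i] and b[:j].
    candidates = []
    if i > 0:
        candidates.append(a[i - 1])
    if j > 0:
        candidates.append(b[j - 1])
    return max(candidates)
```

The loop halves an interval of size at most min(m, n), so the running time is O(log min(m, n)), which is within the requested O(log m + log n).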

Thursday, May 6, 2010

Optimal merge and operator AND in search

Given k sorted lists, merge them in optimal time. Assume that the total number of elements is n. Why is this useful for implementing the AND operator in a search engine?

(This is one of those questions that explain why basic algorithmic knowledge is fundamental.)
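A common answer is a heap-based k-way merge in O(n log k). One plausible reading of the link to AND: if each query term has a posting list of document ids sorted in increasing order, then a document matches all k terms exactly when its id shows up k times in a row in the merged output. The sketch below assumes integer ids and no duplicates inside a single list; the names k_way_merge and intersect_postings are made up for illustration.

```python
import heapq
from typing import Iterator, List, Optional, Sequence


def k_way_merge(lists: Sequence[Sequence[int]]) -> Iterator[int]:
    """Yield all elements of the sorted input lists in sorted order."""
    # Heap entries: (value, index of the source list, position in that list).
    heap = [(lst[0], idx, 0) for idx, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    while heap:
        value, idx, pos = heapq.heappop(heap)
        yield value
        if pos + 1 < len(lists[idx]):
            heapq.heappush(heap, (lists[idx][pos + 1], idx, pos + 1))


def intersect_postings(postings: Sequence[Sequence[int]]) -> List[int]:
    """Ids present in every posting list (each list sorted, without duplicates)."""
    k = len(postings)
    result: List[int] = []
    previous: Optional[int] = None
    run = 0
    for doc_id in k_way_merge(postings):
        run = run + 1 if doc_id == previous else 1
        previous = doc_id
        if run == k:
            result.append(doc_id)  # seen once in each of the k lists
    return result
```

The heap never holds more than k entries, so memory stays at O(k) on top of the input. Python's standard library also offers heapq.merge for the merge step alone.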

Wednesday, May 5, 2010

What direction is the stack growing?

You are working on a machine / compiler and you want to determine if the stack is growing towards increasing or decreasing addresses. What strategy would you use?

Tuesday, May 4, 2010

Bartz in London

I must confess that I like her very much: "I don't need everybody to think I am an asshole. You think it's so much fun answering your questions? If I didn't think there was a good bottle of white wine at the end of it – I probably wouldn't do"

Monday, May 3, 2010

Facebook Searches Double – Words per Search to 3.5

The number of searches conducted on Facebook doubled in the last year to 650 million. The average number of words per search has reached 3.5.

Sunday, May 2, 2010

Google acquired a 3d desktop company

I wrote about my wish to invest in a 3D desktop company. Google acquired one such company, but it is not true 3D: it simulates 3D in a 2D space.

Saturday, May 1, 2010

OneRiot is indexing public Facebook data

Now, of course, we’re only showing (indeed, only have access to) data that has been shared publicly by Facebook users. A user can restrict the visibility of these Likes on their Facebook profile. However, we’d be sidestepping the issue if we didn’t recognize that some users might be concerned that stuff they have shared on Facebook can now pop up on services like ours. Given that, we are rolling out this feature as a very limited bucket test today to assess users’ reactions and gather feedback. We love the new feature. And if users do too then we’ll roll it out to everyone at an appropriate speed.

Facebook silently added an open search API, change q=