Sunday, January 30, 2011

Numbers of transactions in Facebook (on mysql)

Those are numbers of transactions in Mysql for Facebook (see long presentation)

Saturday, January 29, 2011

Find the longest increasing substring

Please know the answer for my interviews: A substring is made up of contiguous symbols. Therefore the invariant is a[j] > a[k] for j>k. If the invariant is respected than increase a counter where you maintain the current longest substring observed so far. If the invariant is not respect then reset the counter. Also, when you reset you need to compare the current longest substring observed before the rest with the longest observed in the past. Finally, you may want to maintain the starting point.

Now tell me what is the complexity of this algo?

Friday, January 28, 2011

Find the longest increasing subsequence

Please know the answer for my interviews ;-) A subsequence is made up of non contiguous symbols. A solution can be found in O(n * log n), where n is the length of the string S[1...n] by using binary search

Thursday, January 27, 2011

find the longest palindromic substring.

Longest common substring with your reverse, look at the mirror dude.

Wednesday, January 26, 2011

Urls and costs

You have a certain numbers of URLs N, grouped by domain. Each URL u_j has an associated weight w_j. You want to pick X URLs from N, but when you pick the theurl u_j in d_i then you need to pick all the urls for that domain. How can you pick exactly X urls and minimize the associated cost?

Tuesday, January 25, 2011

Array and indeces

You are given an array A[1..n] and array B[1...n], build an array C containing the position in B where a[i] occurs. Complexity?

Monday, January 24, 2011

Find the largest sub-matrix (part second)

You have a positive integer matrix of size mxm . Find the largest sub-matrix containing the largest numbers in al the 4 corners. Solve using additional space, what is the complexity. Solve without using additional space, what is the complexity.

Sunday, January 23, 2011

Search with wildcards

You are given a collection of n documents D. The collection is static (i.e. no new documents are added to the list) but is very large (e.g. it cannot fit in memory).

You need to retrieve all the documents matching queries where wild cards are allowed. A wild card * will match any sequence of characters, and a wile card ? will match a single character.

Implement a C++ solution.

Quite Difficult.

Wednesday, January 19, 2011

Maximum sum. with positive and negative integers

Given an array of positive and negative numbers, find the maximum sum in the array

Tuesday, January 18, 2011

Maximum sum, with no adjacent elements

Given an integer array consisting of positive numbers only, find maximum possible sum of elements such that there are no 2 consecutive elements in the sum.

Monday, January 17, 2011

Old interview question: phonebook in C++

Implement a phone book in C++. You have a list of names and numbers and you can search by name, by prefix name, by number. New numbers and names can be added over time. A name can have multiple numbers associated and a number can have multiple names associated.

Optional: consider how to serialize the data structures, by using STL & Boost.

Sunday, January 16, 2011

Off topic: being 40, and a buch of news

Being 40, and got a new car with 7 seats
Being 40, and got a new place to stay with a large garden (the second one with a garden)
Being 40, and got a new baby
Being 40, and got a new MBA course to start
Being 40, and hacking my coding project just for fun

Being 40, and want to have an impact more than ever

Saturday, January 15, 2011

Find the largest sub-matrix (part second)

You have a positive integer matrix of size mxm . Find the largest sub-matrix containing the largest numbers in al the 4 corners. Solve using additional space, what is the complexity. Solve without using additional space, what is the complexity.

Friday, January 14, 2011

A special kind of bucketized list

You are given a very long list of elements, partition the list in optimal times so that at the beginning there are all the elements repeating one time, followed by all the elements repeating two times, followed by all the elements repeating three times, and so on and so forth.

Thursday, January 13, 2011

Find the largest submatrix

You have a binary matrix of size mxm . Find the largest sub-matrix containing all 0 in al the 4 corners. Solve using additional space, what is the complexity. Solve without using additional space, what is the complexity.

Tuesday, January 11, 2011

Collinear points 1/11/11

There are n points in 2d space with they (x,y) co-ordinates. Find the maximum set of points that are collinear?

PS: hard one if you want to be under cubic complexity. It is what you should expect for 1/11/11 -- those are some sort of segments where collinear points can lie.

Monday, January 10, 2011

Is this an open problem: smallest base that makes a number palindrome?

You have a number n, find a base b >1 such that n is palindrome in that base. I don't know the solution.

Sunday, January 9, 2011

Infinite for loop on FB Json answers

If you submit an ajax query to Facebook like

http://www.facebook.com/ajax/typeahead/first_degree.php?__a=1&filter[0]=user&no_cache=0&stale_ok=0&viewer=&token=&lfe=1

they will return a valid json enclosed in a piece of code which goes on infinite loop.

Kinda of naive filter for people poking around and doing naive eval of json fragments? This happens for all the json queries submitted to Facebook.
for (;;);{"error":0,"errorSummary":"","errorDescription":"","errorIsWarning":false,"silentError":0,"payload":{

Saturday, January 8, 2011

Given two lists find the k-th sum

You are given two lists A and B of size n and m respectively and a number k < n * m, find the k-th sum (a+b), where a is in A and b is in B.

Please solve this problem if you come to my interviews

Friday, January 7, 2011

An old interview question: counting in a list

You have a list of names which is so large that you cannot fit in memory and it is too expensive to sort. Count the number of duplicates in optimal space and time.

Thursday, January 6, 2011

Bit generators

you have an unbiased generator G1 which generate 0 and 1 with probability 1/2. Build a new generator G2 which generates 0 with probability p and 1 with probability 1-p.

Wednesday, January 5, 2011

Build a tree from a matrix

An n-ary tree is represented in a matrix form and A[i,j] = 1 if j is the ancestor of i, otherwise A[i,j] = 0. Given this construct the tree.

Tuesday, January 4, 2011

next in an array

Design a data structure to find the next higher element for each element in an array.

Monday, January 3, 2011

Reverse check

Given two lists, check if one is reverse of other. Do it without using extra space and without changing the lists.

Sunday, January 2, 2011

A interesting reading to start the year: type systems

A lovely and long old goldies about type systems. What is a weak, strong, static, dynamic ts? And also, what are more esotheric things such as duck ts?

For long time my preferences went to static and strong type systems, then I am become a fan of dynamic ones. What about you?

Enjoy this lovely reading for your 1/1/11 day

Saturday, January 1, 2011

Internet in numbers 2010

Originally seen here. So there are 3 emails account out of 1 Facebook account, which is amazing if you consider that email was born around 70 and Facebook just 6 years ago but has already 600millions registered users. Also, we send 294billion emails per day, and we share 30 billions pieces of information per month on facebook.

Email

  • 107 trillion – The number of emails sent on the Internet in 2010.
  • 294 billion – Average number of email messages per day.
  • 1.88 billion – The number of email users worldwide.
  • 480 million – New email users since the year before.
  • 89.1% – The share of emails that were spam.
  • 262 billion – The number of spam emails per day (assuming 89% are spam).
  • 2.9 billion – The number of email accounts worldwide.
  • 25% – Share of email accounts that are corporate.

Websites

  • 255 million – The number of websites as of December 2010.
  • 21.4 million – Added websites in 2010.

Web servers

  • 39.1% – Growth in the number of Apache websites in 2010.
  • 15.3% – Growth in the number of IIS websites in 2010.
  • 4.1% – Growth in the number of nginx websites in 2010.
  • 5.8% – Growth in the number of Google GWS websites in 2010.
  • 55.7% – Growth in the number of Lighttpd websites in 2010.

Web server market share

Domain names

  • 88.8 million – .COM domain names at the end of 2010.
  • 13.2 million – .NET domain names at the end of 2010.
  • 8.6 million – .ORG domain names at the end of 2010.
  • 79.2 million – The number of country code top-level domains (e.g. .CN, .UK, .DE, etc.).
  • 202 million – The number of domain names across all top-level domains (October 2010).
  • 7% – The increase in domain names since the year before.

Internet users

  • 1.97 billion – Internet users worldwide (June 2010).
  • 14% – Increase in Internet users since the previous year.
  • 825.1 million – Internet users in Asia.
  • 475.1 million – Internet users in Europe.
  • 266.2 million – Internet users in North America.
  • 204.7 million – Internet users in Latin America / Caribbean.
  • 110.9 million – Internet users in Africa.
  • 63.2 million – Internet users in the Middle East.
  • 21.3 million – Internet users in Oceania / Australia.

Social media

  • 152 million – The number of blogs on the Internet (as tracked by BlogPulse).
  • 25 billion – Number of sent tweets on Twitter in 2010
  • 100 million – New accounts added on Twitter in 2010
  • 175 million – People on Twitter as of September 2010
  • 7.7 million – People following @ladygaga (Lady Gaga, Twitter’s most followed user).
  • 600 million – People on Facebook at the end of 2010.
  • 250 million – New people on Facebook in 2010.
  • 30 billion – Pieces of content (links, notes, photos, etc.) shared on Facebook per month.
  • 70% – Share of Facebook’s user base located outside the United States.
  • 20 million – The number of Facebook apps installed each day.

Web browsers

Web browser market share

Videos

  • 2 billion – The number of videos watched per day on YouTube.
  • 35 – Hours of video uploaded to YouTube every minute.
  • 186 – The number of online videos the average Internet user watches in a month (USA).
  • 84% – Share of Internet users that view videos online (USA).
  • 14% – Share of Internet users that have uploaded videos online (USA).
  • 2+ billion – The number of videos watched per month on Facebook.
  • 20 million – Videos uploaded to Facebook per month.

Images

  • 5 billion – Photos hosted by Flickr (September 2010).
  • 3000+ – Photos uploaded per minute to Flickr.
  • 130 million – At the above rate, the number of photos uploaded per month to Flickr.
  • 3+ billion – Photos uploaded per month to Facebook.
  • 36 billion – At the current rate, the number of photos uploaded to Facebook per year.

Data sources and notes: Spam percentage from MessageLabs (PDF). Email user numbers and counts from Radicati Group (the number of sent emails was their prediction for 2010, so it’s very much an estimate). Website numbers from Netcraft. Domain name stats from Verisign and Webhosting.info. Internet user numbers and distribution from Internet World Stats. Facebook stats from Facebook and Business Insider. Twitter stats from Twitter (and here), TwitterCounter and TechCrunch. Web browser stats from StatCounter. YouTube video numbers from Google. Facebook video numbers from GigaOM. US online video stats from Comscore and the Pew Research Center. Flickr image numbers from Flickr. Facebook image numbers from this blog.