## Wednesday, December 25, 2013

### Egg drops

Given k eggs and n floors in a building, find the minimum number of trials required to determine the lowest floor from which when we drop an egg it does not break. Provide a Dynamic Programming solution.

## Tuesday, December 24, 2013

### Bridge Matching

n cities are sitting on the northern bank of a river and n cities are sitting on the southern bank. You can build a bridge only between cities with the same number. Provide a dynamic programming solution.

## Monday, December 23, 2013

### LBS

Given an array of integers find the longest bitonic sequence. Provide a dynamic programming solution

## Sunday, December 22, 2013

### Box Stacking

Given a set of 3d boxes compute the largest stack of boxes. A box can be stacked only on the top of another box with lager base. Provide a dynamic programming solution for stacking the boxes

## Saturday, December 21, 2013

### Boolean Parentization

given a Boolean expression of and, or, xor, true, false, find the number of ways to parenthesize and evaluate to true. Provide a dynamic programming solution

## Friday, December 20, 2013

### Box Stacking

Given a set of 3d boxes compute the largest stack of boxes. A box can be stacked only on the top of another box with lager base.

## Thursday, December 19, 2013

### Maximum value contiguous subsequence

given a real numbers’ array find a contiguous subsequence with max sum. Provide a dynamic programming solution

## Wednesday, December 18, 2013

### Balanced partition

given an array of integers between 0 and M divide the integers into two set such that difference of sums is minimized. Provide a dynamic programming solution

## Tuesday, December 17, 2013

### Scheduling

given n jobs each of which with a processing time, a profit and a deadline maximize the profit. Provide a dynamic programming solution

## Monday, December 16, 2013

### Print the binary representation of an unsigned

## Sunday, December 15, 2013

## Saturday, December 14, 2013

### Find the position with only one bit set in an unsigned

## Friday, December 13, 2013

### given an unsigned int swap odds and even bits

## Thursday, December 12, 2013

## Wednesday, December 11, 2013

### Add two numbers without using arithmetic operators

## Tuesday, December 10, 2013

## Saturday, November 9, 2013

### Chapter 3 is done

This chapter is about Graph computation and there are classical algorithms (BFS, DFS, MST). MST is implementing the classical Prim's algo but using Fibonacci heap available in Boost. Here you find the first version of the graph code

## Wednesday, October 30, 2013

### Writing a programming / algorithmic book

I wanted to take on a new challenge and start writing a book about algorithms and design. The intent is to have Simplicity by Design (SD): every Chapter is a collection of 15-30 questions, followed by the solution, followed by a C++ code snippet which can be easily re-used.

So far, I have drafts for 3 chapters: bit programming, dynamic programming, and graphs. Here an example of graph code with very classical questions: implement a graph, implement bfs, implement dfs, implement topological sort, I already have code for other graph problems. Dynamic programming has more mathematical content and I will publish an excerpt next weeks.

A number of questions:

So far, I have drafts for 3 chapters: bit programming, dynamic programming, and graphs. Here an example of graph code with very classical questions: implement a graph, implement bfs, implement dfs, implement topological sort, I already have code for other graph problems. Dynamic programming has more mathematical content and I will publish an excerpt next weeks.

A number of questions:

- Should I publish code for public review? If so, can i still use it for the book?
- Should I publish questions for public review?

I think that this will be one of my side projects off work for the next year or so.

## Sunday, September 29, 2013

### Create your Team

you have a pool of players. Each player ha a set of skills. For creating a team you need to hire the minimum number of players who have together a predetermined set of skills

## Wednesday, September 25, 2013

### P0 - what people are saying

Here some previous postings

### Find i = n^3 + m^3

how can a number be expressed as sum of two cubes?

### 52 cards

You have 52 cards half of them are red and half of them are black. You have a given amount of money M and every time you can bet k < M money. If you predict the right colour you will win 2k money, if not you will loose k. What is the best strategy?

## Tuesday, September 24, 2013

## Monday, September 23, 2013

### Conflits in k keywords

A keyword is made up of n bits. Given k keywords generated with uniform probability what is the probability to have collision? Provide an evaluation and write the code.

## Sunday, September 22, 2013

### Radix Sort -

Write a radix sort implementation in C++

## Saturday, September 21, 2013

### PageZero and Ted

I enjoyed attending Ted and give a talk about PageZero. While I was listening a number of very inspiring talks, I asked to myself would this PageZero really help users?

So I started to write {Ted} and immediately PageZero suggested {Ted Talks} as first suggestions and {Speakers} Speakers as su-bintent. That's exactly what I was looking for!!! I am impressed.

## Friday, September 20, 2013

### Buying with constrains

You have a bag with capacity C and can insert items with value wi which will occupy space si. You cannot exceed the capacity. Write an algorithm in C++

## Thursday, September 19, 2013

### Project PageZero

SearchBox never changed since 1993, when the first search engine was created. Bing is disrupting the box by bridging the gap with the World of Entities. Names, Web Sites, and Generic Entities are directly accessible and understood as soon as you type.

http://www.slideshare.net/antoniogulli/project-page-zero-bi

http://www.slideshare.net/antoniogulli/project-page-zero-bi

## Wednesday, September 18, 2013

## Tuesday, September 17, 2013

### k balls, n floors

There is a building with n floors and you have k balls. Balls are identical and when you drop them they can break or not. You need to figure out the highest floor of the building an egg can be dropped without breaking.

## Monday, September 16, 2013

### In a network of n nodes elect a leader

what is the algorithm you would suggest?

## Sunday, September 15, 2013

### In a stream of integers find the ones that appears at least k times

Optimal in space and time

## Saturday, September 14, 2013

## Friday, September 13, 2013

### Design Patterns : C++ full collection of Gamma's patterns

### Design Patterns : C++ full collection of Gamma's patterns

Full collection of Gamma's patterns in Full collection of Gamma's patterns in c++:

- Creational: Abstract Factory, Builder, Factory, Prototype, Object Pool, Singleton,
- Structural: Adapter, Bridge, Composite, Decorator, Facade, Flyweight, Proxy patterns,
- Behavioral: Chain of Responsibility, Command, Iterator, Mediator, Memento, Observer, State, Strategy, Template, Visitor, and Interpreter

## Thursday, September 12, 2013

### Capped payrolls

In a company there are different payrolls pi, and we want to find a cap C such that the total amount of money paid is not above a maxim M. How can you determine C?

## Wednesday, September 11, 2013

### Int 16 bit - compute the sqrt

Given an integer in 16 bits compute the sqrt

## Tuesday, September 10, 2013

### Given an array of sorted lists merge them

in optimal time and space

## Monday, September 9, 2013

## Sunday, September 8, 2013

## Saturday, September 7, 2013

### Given a BST and two nodes find the common ancestor

## Friday, September 6, 2013

### How can you detect the endianess of your computer?

### Given an array of integers longest non decreasing

Compute and print the longest non decreasing sequence of integers

## Thursday, September 5, 2013

### Given a list of words, detect the longest word

made by concatenating other words in the list

## Wednesday, September 4, 2013

### Given a stream of integers select k random entries

## Tuesday, September 3, 2013

## Monday, September 2, 2013

## Sunday, September 1, 2013

### Given two numbers a and b

Swap them with no temp variable. How many way do you know for achieving that?

## Saturday, August 31, 2013

## Friday, August 30, 2013

### What is the probability of dropping m balls in the same bin?

Suppose there are n bins

## Thursday, August 29, 2013

### Load balancing

We have a set of task i={1, ..., n} and each one has a cost b_i, how can we assigned to m servers so that the load is balanced?

## Wednesday, August 28, 2013

## Tuesday, August 27, 2013

### Given an array of 1 million entries

Pick the smallest int.

## Monday, August 26, 2013

## Sunday, August 25, 2013

### Triangle and Ants

On a triangle, there are 3 ants what is the probability that they collide when they move

## Saturday, August 24, 2013

### In a number of 64 bits

Swap odd and even bits.

## Sunday, August 18, 2013

### What is wrong with this code?

string FindAddr( listemps, string name ) { for( list ::iterator i = emps.begin(); i != emps.end(); i++ ) { if( *i == name ) { return i->addr; } } return ""; }

## Friday, August 16, 2013

### generate strings

Write a program that generates from the string "abcdefghijklmnopqrstuvwxyz{" the following:
a
bcb
cdedc
defgfed
efghihgfe
fghijkjihgf
ghijklmlkjihg
hijklmnonmlkjih
ijklmnopqponmlkji
jklmnopqrsrqponmlkj
klmnopqrstutsrqponmlk
lmnopqrstuvwvutsrqponml
mnopqrstuvwxyxwvutsrqponm
nopqrstuvwxyz{zyxwvutsrqpon

## Thursday, August 15, 2013

### Add two integers without using '+'

How can you add two integers without using + operator

## Wednesday, August 7, 2013

### Swap two integers

with no temp. How many ways do you know?

### how many ways do you know to swap variables?

without using temp vars?

## Sunday, August 4, 2013

### Use a dictionary

Given a string like "thisisalongstringwithmisakenoten" and a dictionary of words, split the string so that the number of non matched words are minimized. Try to optimize the code

## Saturday, August 3, 2013

### Implement a smartpointer class

using templates in C++

Smartpointer s1(T * ptr)

Smartpointer s1(Smartpointer & s2)

and the assigment

s1=s2

Smartpointer s1(T * ptr)

Smartpointer s1(Smartpointer & s2)

and the assigment

s1=s2

## Friday, August 2, 2013

### Allocate a matrix of ints reducing the fragmentation

## Wednesday, July 31, 2013

### Shuffle an array of n elements

Suppose to have a uniform random generator in [0, 1)

## Tuesday, July 30, 2013

### Intersect two list of integers

what is the complexity

## Monday, July 29, 2013

## Monday, July 22, 2013

### Bing vs Google suggestions

Spot the difference on {royal baby} - which on has better suggestions the unbranded search engine or the other one?

## Friday, July 19, 2013

### What they've said about new Knowledge Suggestions @ Bing

#####
"This is one area where Bing has a clearly better user experience than Google, if you’re looking to get actionable results from the actual search box. While Google will put a Pitbull Google+ profile in its autosuggest feature, it does little with suggestions for any of the examples Bing provides"

That’s not to say Google’s Knowledge Graph results don’t provide ways for users to distinguish what they’re looking for, but you have to get through the search query before you can tell Google what you’re actually looking for (like the Harry Potter books vs. movies, for example). Of course, you could always specify in your actual query."

#####
Bing Autosuggest Expands to Take on Instant Search, Knowledge Graph In One Fell Swoop - http://t.co/nPS4Z6qufz

#####
Bing has taken search suggestions, autocomplete, and semantic search to a whole new level. -- http://t.co/nPS4Z6qufz

##### I find the experience pretty darn pleasant. Controversial article but our share is increasing. http://t.co/8SryXw61Eb via @MarketingPilgrm

#####
PCMag talks about the new searchbox with Knowledge understanding -http://t.co/pTf0J2xpk6

#####
Techcrunch talks about Bing Autosuggest - http://t.co/0DgvxFq8OU

#####

According to webpronews we clearly beats Google - http://t.co/v5HIA9jE1Q

According to webpronews we clearly beats Google - http://t.co/v5HIA9jE1Q

#####
Searchengineland talks about the searchbox disruption - http://t.co/vLbOjnoKSs

#####
Microsoft is pushing a bit further by surfacing these differences directly below the search box. -- http://t.co/LcQRmNU7BR

##### Help vague Bing users get off of Microsoft's landing page and on with their lives.http://t.co/FeC6kax9EJ via @engadget

Other articles:

**Bing Autosuggest**Categories Revamped

UberGizmo (blog)

Microsoft expands

**Bing**’s

**autosuggest**for people to include animals, brands, movies, albums, places, and more

thenextweb.com

Marketing
Pilgrim

MediaPost

IN

## Wednesday, July 17, 2013

### Disrupting the searchbox

Today,

I published a blog on Microsoft search and it has been quoted by many

I published a blog on Microsoft search and it has been quoted by many

**“**

**This is one area where Bing has a clearly better user experience than Google, if you’re looking to get actionable results from the actual search box. While Google will put a Pitbull Google+ profile in its autosuggest feature, it does little with suggestions for any of the examples Bing provides.**

That’s
not to say Google’s Knowledge Graph results don’t provide ways for users to
distinguish what they’re looking for, but you have to get through the search
query before you can tell Google what you’re actually looking for (like the
Harry Potter books vs. movies, for example). Of course, you could always
specify in your actual query.”

## Sunday, July 14, 2013

### SQL-like engines on the top of hadoop map/reduce

### Apache Kafka

A nice system for distributed logging from multiple platform - apache kafka

## Thursday, July 11, 2013

### Building large scale systems at Twitter

Very interesting article for building large systems at Twitter. Worth reading it, use of sharding on the top of standard mysql makes the trick (

https://twitter.com/zipkinproject

http://www.youtube.com/watch?v=5cKTP36HVgI

http://highscalability.com/blog/2011/12/19/how-twitter-stores-250-million-tweets-a-day-using-mysql.html

http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html

**gizzard**- for sharding**,****flockdb -**for graph db)https://twitter.com/zipkinproject

http://www.youtube.com/watch?v=5cKTP36HVgI

http://highscalability.com/blog/2011/12/19/how-twitter-stores-250-million-tweets-a-day-using-mysql.html

http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html

## Tuesday, July 9, 2013

## Monday, July 8, 2013

### Othello

Othello is a game played on a square board, with black and white pieces. If a white/black piece is surrounded by a black/white one (either left/right or up/down) then the piece will change color. Players play in turn until they have valid moves left. Implement Othello following Object Oriented criteria (e.g. what is a piece, what is a board, what is a game, who keeps the score, what is a valid move).

### represent a double in [0,1) as binary

First need to understand what is the binary representation for a double in [0,1). Let's see an analogy. If the number is an integer such as 123 I would probably make something like this

2 ^ 7 = 128 > 123 too large

123 - 2 ^ 6 = 59 -> set bit 1

59 - 2 ^ 5 = 27 -> set bit 1

27 - 2 ^ 4 = 11 -> set bit 1

11 - 2 ^ 3 = 3 -> set bit 1

3 - 2 ^ 2 < 0 -> set bit 0

3 - 2 ^ 1 = 1 -> set bit 1

1 1 1 1 0 1 1

### SearchCharm - Unified Search Experience in Windows 8.1

I am using windows 8.1 preview since many weeks without any problem. Search experience brings together the online and the offline worlds and it allows users to look for information virtually everywhere.

Do I want to access

Do I want to access

**Traditional Web Search**Do I want to access

**cnn?**Easy. I start to insert the query and I get both the web site and some relevant suggestions including**cnn news**and**cnn money.**Suggestions are generated by my group in London.

**Navigating windows**Do I want to access

**windows control panel?**No problem, I just use the same interface and search paradigm.**Cloud search**
What if I want to access my files stored in the cloud? Here I have my

**cv**saved on Microsoft Skydrive and accessing it is as easy as searching via the charm**Local search**
What if I want to access my files stored in my local computer? Here I have some videos

**saved on my local computer and accessing them is as easy as searching via the charm**## Sunday, July 7, 2013

### Given a very large binary tree T1 and a small binary T2

check whether T2 is contained in T1. Discuss your solution

## Saturday, July 6, 2013

### Given a binary tree, create a list of lists containing all the nodes in one level

write some c++ code / STL. Optimal in space and time

## Friday, July 5, 2013

## Wednesday, July 3, 2013

### Search as you type with snapshots and disambiguation

Exciting day today! Bing is shipping an innovative search feature aiming at helping users to complete their tasks, as soon as they start to type their queries.

Another example is the query

Now let's suppose that the intent is not ambiguous. For instance consider the keyword

And what if there are two persons with the same name? In this case, we offer the opportunity to select the desired person as in this example for the query prefix

One more example. Let's consider the query

Hope you enjoy it - this is the beginning.

Consider the keyword

However, we believe that the search engine can help the user to express very quickly this intent. How? In Bing we use the power of Satori (our Microsoft Knowledge Graph) to enrich our suggestions with additional information useful for disambiguating different meanings associated with the query. In this way, the users can select their own most appropriate choice.

**"pitbull".**This is a very interesting query because the user can be interested in getting either results about the famous singer (a person) or about the dog (an animal). For this example, we believe no instantaneous decision can be taken on the user's behalf and that only the user has a complete understanding of the intent expressed with a query.However, we believe that the search engine can help the user to express very quickly this intent. How? In Bing we use the power of Satori (our Microsoft Knowledge Graph) to enrich our suggestions with additional information useful for disambiguating different meanings associated with the query. In this way, the users can select their own most appropriate choice.

Another example is the query

**{pluto}.**Is the user interested in the planet or is she interested in the cartoon? Those are very different kinds of entities which happens to have the same name but Bing understands the difference and gives to the user the opportunity to select the desired choice.

Now let's suppose that the intent is not ambiguous. For instance consider the keyword

**{michele obama},**we believe that it is very valuable to have a rich suggestion like the following one.

And what if there are two persons with the same name? In this case, we offer the opportunity to select the desired person as in this example for the query prefix

**{rachel nich}**

One more example. Let's consider the query

**{u2}.**Here, we offer the opportunity to explore all the entities / artists that are playing with the pop band

And what if you search for

**{harry potter}?**In this case Bing provides a snapshot of the filmography, a reference to the book and a reference to the character.Hope you enjoy it - this is the beginning.

## Sunday, May 12, 2013

## Friday, May 10, 2013

### Stocastic Gradient Descent

Added a stocastic gradient descent to the linear regression code

## Thursday, May 9, 2013

### Learning linear regression with gradient descend

Last week I restarted an old and good behavior (see A collection of algos and data structures published here). Every day, I take an well known algorithm and code it in boost and C++. Nothing else, just pure training and geeky fun. The only constrain is the time limit of 45mins, after a running session in St. James park.

Here you have the code for linear regression with gradient descent in C++, boost, and ublas.

Here you have the code

Here you have the code for linear regression with gradient descent in C++, boost, and ublas.

**Linear regression**is an approach to modeling the relationship between a scalar dependent variable y and one or more explanatory variables denoted X. Ublas is a powerful set of c routines for efficient matrix and vector computations, Boost:: numeric provides an elegant C++ way of using ublas based on templates.Here you have the code

## Wednesday, April 3, 2013

### Scaling requests with MemCached

A nice paper from Facebook about the use of memcached for scaling requests. Classical UDP connection-less techniques, are mixed with TCP coalescing via mcrouter. Also, they implement flow control, new mechanism for cache miss, and excess of requests on a particular key. The concept of pool is quite interesting.

## Monday, April 1, 2013

### Implement a binary search

with minimum number of comparisons

## Saturday, March 30, 2013

## Friday, March 29, 2013

### Find largest sum in a subarray

Given an array that may contain both positive and negative integers, find the sum of contiguous subarray of numbers which has the largest sum.

Hint: there is a divide & conquer solution in O(nlogn) and one cool solution in O(n)

## Thursday, March 28, 2013

### Given a sorted array which is rotated N times(N is unknown), search a number in it

Hint: binary search with modification

## Wednesday, March 27, 2013

### Find the maxium integer in an array of size N, in O(N) and additional space O(1)

It's easy to make this in O(N) and O(N), or in O(N) and O(K) where K is the maximum int value contained in the array. However, this is requiring O(N) and O(1).

Hint: one way of doing this is to modify the original array.

Hint: one way of doing this is to modify the original array.

## Tuesday, March 26, 2013

### My talk at ECIR2013

Gave an invited talk @ ECIR2013 and covered:

- Bing, as a platform, new trends for Microsoft search
- Bing suggestion technologies, an overview
- Bing & Cosmos, massive data mining models compared to Hadoop, Map & Reduce, Hive, Pig

## Monday, March 25, 2013

### Posted a blog about autosuggest for Bing

Last week, I posted a blog about autosuggest for Bing

## Sunday, March 24, 2013

### Given n sets, intersect them. preserve the duplication

## Saturday, March 23, 2013

### Find the row with max num of 1s

Given a matrix nxm where all the rows are 0/1 and sorted. Find the row with max num of 1s.

## Friday, March 22, 2013

## Thursday, March 21, 2013

## Wednesday, March 20, 2013

## Monday, March 11, 2013

### Sorted array and rotated

Given a sorted array of distinct elements, and the array is rotated at an unknown position. Find minimum element in the array.

## Sunday, March 10, 2013

### Given an array of positive integers, select two sub-sequences having minimum difference

Hint: one element can be either in one sub-sequence or in the other, so this is a typical recursive problem

## Saturday, March 9, 2013

### A sorted array with duplicates. Count #key

In O(logN)

## Friday, March 8, 2013

## Thursday, March 7, 2013

### Design a data structure where min, max are O(1) , delete and insert are O(logN)

Hint: min is O(1) for an min-heap, delete and insert are O(log N). Same for max. Question is how to make it for both min and max

## Wednesday, March 6, 2013

### Circular pumps

Given a circular trace with pumps and a car running on a trace. Each pump has a well known distance from the origin, and the amount of kilometers you can run if you stop and refuel your car. Find a suitable starting point so that the car will not run out of fuel and the number of stops are minimal.

## Tuesday, March 5, 2013

### Special sort

Given an array of numbers, arrange them in a way that generate the largest value. For instance, if the given numbers are {34, 345, 348, 50}, the arrangement 5034834534 gives the largest value.

## Monday, March 4, 2013

### Visit a BST in level order

Visit a BST in level order.

Visit a BST in reverse level order

Visit a BST in reverse level order

## Sunday, March 3, 2013

## Saturday, March 2, 2013

## Friday, March 1, 2013

### Find a pair in a BST with sum 0

Given a BST of integers find a pair with sum 0.

## Tuesday, February 12, 2013

### Class of problems with Facebook Graph search

Several classes of problems with Facebook graph search have been reported. Here there is a list:

Plus, they frequently give no search results for suggested queries

Plus, they frequently give no search results for suggested queries

## Thursday, February 7, 2013

## Tuesday, February 5, 2013

### Facebook naive graph search and the need of proper query understanding algos

Search engines use multiple data sources to help users in their query formulation. Among those sources three are very important:

- The web graph, which includes all the web pages published on the net. Mining the content of those pages and their connections can be very useful for detecting topics/entities and for understanding the correlation among those topics/entities;
- The query graph, which includes all the queries submitted to a search engine in the past. Mining this graph can be useful for detecting what are the needs expressed by past users and how those needs change over the time;
- The blog and news graph, which are useful for detecting fresh content;

Facebook is very interesting for search because they have access to other sources of data including:

Today, the most important financial news is Dell's acquisition and every single newspaper or financial web site report the news. Bing (and Google) both have this information in their suggestion, while Facebook is completely missing it. It seems like they are not analyzing the news graph (3rd source in the above list)

In the below example, the last two suggestions are provided by Bing. Facebook just tries to complete the prefix {microsoft and} with stuff such as {London} and {Pisa}, which are popular in my social graph but have

So, If you spend sometime analyzing Facebook suggestion it's clear that they create synthetic suggestions trying to complete every query prefix with stuffs popular in my social graph. This approach is naive because they DON'T seem to understand the meaning of what the user is looking for. Here there are some more examples:

Again, "Happy Diwali", "Spotify", "Microsoft" and "London" are popular in my social graph but they have NO correlation with Tom Cruise. Still, Facebook is naively generating those suggestions.

A proper search engine understands the right correlation by analyzing the Web graph, the query graph and the news graph something that Facebook is missing at the moment.

- The friend graph, which keeps track of all the friendship relations for each user;
- The content graph, which includes all the postings and the content produced by your friends.

**Freshness**Today, the most important financial news is Dell's acquisition and every single newspaper or financial web site report the news. Bing (and Google) both have this information in their suggestion, while Facebook is completely missing it. It seems like they are not analyzing the news graph (3rd source in the above list)

In the below example, the last two suggestions are provided by Bing. Facebook just tries to complete the prefix {microsoft and} with stuff such as {London} and {Pisa}, which are popular in my social graph but have

__NO__correlation with Microsoft and with today's news.**Query understanding**So, If you spend sometime analyzing Facebook suggestion it's clear that they create synthetic suggestions trying to complete every query prefix with stuffs popular in my social graph. This approach is naive because they DON'T seem to understand the meaning of what the user is looking for. Here there are some more examples:

Again, "Happy Diwali", "Spotify", "Microsoft" and "London" are popular in my social graph but they have NO correlation with Tom Cruise. Still, Facebook is naively generating those suggestions.

A proper search engine understands the right correlation by analyzing the Web graph, the query graph and the news graph something that Facebook is missing at the moment.

There are many web sites out there which clearly say that tom cruise is correlated with {scientology}, {katie holmes} (his wife), {suri} (his daughter), and {cameron diaz} (since they performed in many films together), and {nicole kidman} (his past wife). However, Facebook does not capture the intent of user query and does not understand the correlation among those entities. Let's see another example:

Last suggestion is from Bing, while Facebook keeps suggesting the same {Happy Diwali} stuff which is popular in my social graph. Again, they are missing a quality access to the 2nd and 3rd sources of information in my above list.

Bing (and Google) analyse the Web graph, the query graph, the news graph and many other sources and therefore they detect the right correlations among entities (strangely enough Google misses the correlation between {Obama} and {Michelle}).

Please note that if you try the above examples you may get different suggestions from Facebook because those are personalized with stuff popular in your social graph. However, it's easy to reproduce similar naive suggestions when you understand the pattern.

## Friday, February 1, 2013

### Can you search a list with length n in less than O(n)?

## Thursday, January 24, 2013

### Facebook Graph Search and problems with synthetic suggestions

I spent sometime playing with the suggestions mined by Facebook for their recent Graph Search. It seems to me that many of them are synthetic e.g. they are generated by mining the friends's graph and not the queries past queries submitted by users.

This approach seems to create many nonsensical suggestions. In this example, I was searching {

Another example: I am an Italian guy and for me Milan is in Italy (btw, this Milan is way more popular on the Web that the one in Tennessee). Plus, none of those Milan cities are near to London. Web suggestions are nailing the correct location "il duomo" is central area in Milan, Italy.

Here there is another example of nonsensical suggestion. I assume that there are no Google employees who work for Microsoft at the same time. However, both the companies are quite popular in my social network hence the synthetic suggestion.

Now, let's consider another class of defects. There are queries which are not directly answered and the proposed suggestions are quite unrelated. In this example, I was searching for {actors who won the oscar} and Facebook graph search suggestions are clearly unrelated since I am not interested in "actors who like Oscar wilde"

Here there is another example of unrelated suggestion. I am searching {visit the taja} and this is not related to "People named 'taja' who work at Visit Holland". In this case the semantic of the query is quite not understood.

If you try to replicate my examples, please note that you could get a different set of suggestions because those are generated by using your personal friend graph. However, it would be easy to generate similar situations once you understand the pattern.

See also my past postings:

This approach seems to create many nonsensical suggestions. In this example, I was searching {

*hotels in new york city close*} and the top suggestions are "*hotels in new york city close to London",*which is clearly a no sense. It seems to me that they are using some kind of frequent itemset algorithm and they notice that I have many friends in London, hence the synthetically generated nonsensical suggestion. As you see, Bing suggestions are not suffering this problem and correct locations are shown.Another example: I am an Italian guy and for me Milan is in Italy (btw, this Milan is way more popular on the Web that the one in Tennessee). Plus, none of those Milan cities are near to London. Web suggestions are nailing the correct location "il duomo" is central area in Milan, Italy.

Here there is another example of nonsensical suggestion. I assume that there are no Google employees who work for Microsoft at the same time. However, both the companies are quite popular in my social network hence the synthetic suggestion.

Now, let's consider another class of defects. There are queries which are not directly answered and the proposed suggestions are quite unrelated. In this example, I was searching for {actors who won the oscar} and Facebook graph search suggestions are clearly unrelated since I am not interested in "actors who like Oscar wilde"

Here there is another example of unrelated suggestion. I am searching {visit the taja} and this is not related to "People named 'taja' who work at Visit Holland". In this case the semantic of the query is quite not understood.

If you try to replicate my examples, please note that you could get a different set of suggestions because those are generated by using your personal friend graph. However, it would be easy to generate similar situations once you understand the pattern.

See also my past postings:

## Friday, January 18, 2013

### Spelling, Spelling, Spelling. Facebook Graph Search and Bing Web Search

Continuing with my personal analysis on the Facebook Graph Search and Bing Web Search's integration, here is another example of why you can get the best of two worlds. In this case, I will discuss misspelled words, a common problem for English (see for instance this funny article: "Why misspelled names are so common and what journalists are doing to prevent them").

Let's suppose that users start to type the word {

Another example is {

Bing has a significant investment in building a state-of-art speller and those technologies are also leveraged by our team in London in securing the right suggestions. In this context, the additional problem is dealing with the mispelled fragments represented by the query prefix, correct the errors and, at the same time, predict which the more relevant suggestions are, and we need to do this every time new characters are typed in real time!

Can you imagine how fast the predictions and corrections need to be?

Let's suppose that users start to type the word {

**swarzin**}: their intent is to get information about the actor, but they are unsure of the correct spelling. As you see in the below image, Bing’s Speller processes tens of millions of data points mined from searches, web pages, clicks and user actions to help you find the right “**Schwarzenegger**".Another example is {

**terramis**} a delicious Italina cake of which the correct spelling is {**tiramisu**} - technically, {terramis} is a mispelled query prefix for the full query{tiramisu}. Other mispelled prefixes are**{bufet warr}**and one of the myriad variants of {**britnay s**}. In all of these cases, Bing nails the correct suggestions and are integrated into Facebook typeahead search.Bing has a significant investment in building a state-of-art speller and those technologies are also leveraged by our team in London in securing the right suggestions. In this context, the additional problem is dealing with the mispelled fragments represented by the query prefix, correct the errors and, at the same time, predict which the more relevant suggestions are, and we need to do this every time new characters are typed in real time!

Can you imagine how fast the predictions and corrections need to be?

## Thursday, January 17, 2013

### Facebook Graph Search, with Bing Web Search

Many people asked what are the benefits of merging Facebook internal search (aka Facebook Graph Search), with Bing Web Search results. This is my poster query {Conrad Bain} just died, and Facebook knows many people with that name from their internal data, while Bing nails the name of the actor performing 'Different Strokes' series.

Try your own queries and give us feedback

Try your own queries and give us feedback

Subscribe to:
Posts (Atom)