Some days ago the world press was abuzz with the study on the Facebook friend graph, that found the average distance between active Facebook users to be 4.74, i.e., almost 5, meaning that there are on average 4 Facebook linked friends separating one Facebook user from another. See also brief summary on the Brede Wiki.There are standard algorithms on the shelve to compute the distance for small graphs, but because the Facebook graph is so huge you/they need special algorithms. First author Lars Backstrom employed at Facebook (that gave a keynote at 8th Extended Semantic Web Conference about the Facebook friend suggester) had the Facebook data and got hold on an algorithm from Milano researchers that could handle the 871 million active Facebook users and their 69 milliards friendship links. In a previous study the Milano researchers examined the “spid”, i.e., the variance-to-mean ratio of the distances. They claim that “spid larger than one are to be considered ‘web-like’, whereas networks with a spid smaller than one are to be considered ‘properly social’ and demonstrated that on a number of social and web networks. The Facebook study found a spid on 0.08.
I am confused somewhat by the notion of six degrees of separation. Firstly, does “degrees of separation” mean the number of persons (nodes) or the number of friendships (edge) between a source person and a target persen? Backstrom a Co. “will assume that ‘degree of separation’ is the same as ‘distance minus one’.”, that is, we are counting the persons (nodes) between source person and target person. Another problem is whether the “six” refers to
- the average distance between all pairs,
- the maximum of the average distance for each person,
- the maximum distance between all pairs (the diameter), or
- the average eccentricity; the eccentricity being the maximum distance for each person to any other person.
If you look on the first sentence on the present version of the Wikipedia article I think it alludes to the first interpretation. Playwright John Guare’s six degrees seem rather to be the third interpretation.With the co-authorship graph from the Brede Wiki I can computate these different distances. The co-authors are not fully connected but the largest connected components has 665 names, which resolve to somewhat below 665 people (I got uncorrected problems with, e.g., “Saffron A. Willis-Owen”/”Saffron A. G. Willis-Owen”). On this graph I find the mean distance to be 5.65, the mean eccentricity to be 9.37 and the diameter 12. Computing the spid I get 0.73, i.e., a “social network” according to the Milano criterion. I wonder why the average Facebook distance is so low. Jon Kleinberg mentions “weak ties”. Some of my Facebook friends are linked to public figures in Denmark. Could it be that Facebook users tend to connect with famous persons and that these famous people tend to act as hubs? Another phenomenon that I think I noticed on Facebook is that when people travel abroad and have a cursory acquaintanceship they tend to friendship on Facebook, perhaps as a kind of token and reminder. Are such brief encounters actually there and important for the low average distance?
(2012-01-16: Language correction)