Did Pinski’s and Narin’s ‘basic research’ have any influence on PageRank?


In a letter to the Danish newspaper Politiken, a group of young researchers (Anders Søgaard, Rebecca Adler-Nissen, Steffen Dalsgaard, Vibe Gedsø Frøkjær, Kristin Veel, Sune Lehmann and Kresten Lindorff-Larsen) argued against letting business get too much influence over the universities. One of their examples was the Pinski-Narin paper:

“A good example of basic research that has made a huge economic difference is Gabriel Pinski’s and Francis Narin’s article about citation analysis from 1976. That article made possible the PageRank algorithm, which is still used in Google Search. According to some statistics, Google Search can account for 2 percent of the GDP of the world, all because of research into how researchers cite each other’s articles” (translated from Danish)

The specific paper is Citation influence for journal aggregates of scientific publications: theory, with application to the literature of physics published in ‘Information Processing & Management’. In this paper the two researchers set up a citation matrix corresponding to a graph where the nodes are “publishing entities” such as “journals, institutions, individuals, fields of research, geographical subdivisions or levels of research methodology”. They perform an eigenvalue computation to find the ‘influence’ of the publishing entities. The method is demonstrated on the citation network between physics journals.
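To make the eigenvalue idea concrete, here is a minimal sketch in Python. It is not the exact normalization used in the Pinski-Narin paper: it simply assumes that each entity splits its outgoing citations into shares and that influence flows along those shares until the scores stop changing (power iteration for the leading left eigenvector). The function name `influence_scores` and the three-journal toy matrix are made up for illustration.

```python
def influence_scores(citations, iterations=200, tol=1e-12):
    """Power iteration on a row-normalized citation matrix.

    citations[i][j] is the number of citations from entity i to entity j.
    Assumption: every entity cites at least one other entity.
    """
    n = len(citations)

    # Turn raw citation counts into the share of i's references going to j.
    shares = []
    for row in citations:
        total = sum(row)
        if total == 0:
            raise ValueError("every entity must give at least one citation")
        shares.append([count / total for count in row])

    # Start with equal influence and repeatedly pass it along the citations.
    w = [1.0 / n] * n
    for _ in range(iterations):
        new_w = [sum(w[i] * shares[i][j] for i in range(n)) for j in range(n)]
        norm = sum(new_w)
        new_w = [x / norm for x in new_w]
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w


# Hypothetical citation counts between three journals (not data from the paper).
toy = [
    [0, 3, 1],
    [2, 0, 2],
    [1, 1, 0],
]
scores = influence_scores(toy)
print(scores)
```

For this toy matrix the scores settle at roughly 0.33, 0.39 and 0.28, so the second journal, which receives the largest share of the others' references, comes out as the most influential.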

Is Pinski-Narin basic research, and did it influence Brin and Page in developing PageRank?

Interestingly, the two researchers were not university researchers. According to the information in the article, Pinski and Narin worked in the company “Computer Horizons, Inc” as President and Research Advisor, and the work was supported by the National Science Foundation.

The Pinski-Narin paper is cited by Jon Kleinberg in his Hubs, authorities, and communities paper from December 1999. Pinski-Narin is also cited by Kleinberg’s Authoritative Sources in a Hyperlinked Environment that was published as an IBM research report in May 1997, i.e., a company report.

Brin’s and Page’s famous article The anatomy of a large-scale hypertextual Web search engine (written while they were students at Stanford University) does not mention Pinski and Narin. So were they not aware of it? Initially I thought so.

However, Brin’s and Page’s paper cites Kleinberg’s ‘Authoritative Sources in a Hyperlinked Environment’, which has information about Pinski-Narin, so if Brin and Page read Kleinberg’s paper they must have known about Pinski-Narin, at least by the latter part of 1997.

The Brin-Page paper is from the Seventh International World Wide Web Conference, which was held in April 1998 with a submission deadline in December 1997. The trail of PageRank leads further back to Lawrence Page’s patent US 6285999, with a filing date in January 1998 and a priority date in January 1997. This patent cites Pinski-Narin. It is not clear when the citation was added to the patent. I suppose it could have been added at any point between the writing process leading up to the priority date in 1997 and the publication date in 2001. I have not been able to find information about whether the Pinski-Narin paper influenced Page in developing PageRank, but in late 1997 they must have been aware of the paper, so it is not at all unlikely that they were inspired by it. However, as an argument for keeping business out of universities the PageRank/Pinski-Narin issue seems a poor example, because Pinski-Narin came from a company.

The entire field of scientometrics has depended quite heavily on data from the Science Citation Index (SCI), a database from the company ‘Institute for Scientific Information’. Indeed, the Pinski-Narin paper used data from SCI. The scientometrics field is still dominated by commercial interests: Thomson Reuters now owns SCI, Elsevier has Scopus and Google has Google Scholar. Also note that CiteSeer/ResearchIndex was developed, not by a university, but by the American research branch of the Japanese company NEC. And in turn (according to Wikipedia) SCI was “heavily influenced” by the non-academic Shepard’s Citations.

Interestingly, Massimo Franceschet has written on the history of PageRank in “PageRank: Standing on the shoulders of giants”, tracing it back to Wassily W. Leontief in 1941. Wikipedia’s PageRank article also mentions Yanhong Li’s work Toward a qualitative search engine and US 5920859. At the time Li worked for a company, “GARI Software/IDD Information Services”, and he later cofounded Baidu.

It may be worth noting the lack of references in the Pinski-Narin paper. It has no citation to, e.g., Leo Katz’ 1953 paper or Leontief. Perhaps they were unaware of the research in the other areas?

Although PageRank can be said to depend on university-based basic research, such as that of the German-speaking mathematicians Oskar Perron, Ferdinand Georg Frobenius and Richard von Mises, the work in the Computer Horizons company is not an example of university-based basic research.

One final note: Though some academics may see PageRank as an example of basic numerical research yielding a company of great economic value, I see it as only one component in Google’s success. The application of low-cost Linux computers together with a non-intrusive, quick-responding interface may well explain more of the success. Linux, inspired by the academic MINIX, is mostly a non-academic endeavor.
