Suppose you want to measure the performance of individual researchers in a university department. Which variables can you get hold of, and how relevant would they be for measuring academic performance?
Here is my take on it:
- Google Scholar citation numbers. Google Scholar records the total number of citations, the h-index and the i10-index, as well as the same numbers for a fixed period.
- Scopus citation numbers.
- Twitter. The number of tweets and the number of followers would be relevant.
One issue here is that the number of tweets may not be relevant to academic performance, and it is also susceptible to manipulation. Interestingly, there has been a comparison between Twitter numbers and standard citation counts, with a coefficient between the two numbers named the Kardashian index.
- Wikidata and Wikipedia presence. Whether Wikidata has an item for the researcher, the number of Wikipedia articles about the researcher, the number of bytes they span, and the number of the researcher's articles recorded in Wikidata. There is an API to get these numbers, and, interestingly, Wikidata can record a range of other identifiers for Google Scholar, Scopus, Twitter, etc., which would make it a convenient open database for keeping track of researcher identifiers across sites of scientometric relevance.
The number of citations in Wikipedia to the work of a researcher would be interesting to have, but is somewhat more difficult to obtain automatically.
The Wikipedia and Wikidata numbers are somewhat open to manipulation.
- Stackoverflow/Stackexchange points in relevant areas. The question-and-answer sites under the Stackexchange umbrella include a range of sites that are of academic interest. In my area these are, e.g., Stackoverflow and Cross Validated.
- GitHub repositories and stars.
- Publication download counts. For instance, my department has a repository with papers, and the backend keeps track of statistics. The most downloaded papers tend to be introductory material and overviews.
- ResearchGate numbers: Publications, reads, citations and impact points.
- ResearcherID (Thomson Reuters) numbers: total articles in publication list, articles with citation data, sum of the times cited, average citations per article, h-index.
- Microsoft Academic Search numbers.
- Count in the dblp computer science bibliography (the Trier database).
- Count of listings in arXiv.
- Counts in Semantic Scholar.
- ACM digital library counts.
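Two of the numbers in the list, the h-index and the i10-index, are easy to recompute yourself once you have per-paper citation counts from any of the sources. Here is a minimal sketch in Python; the function names and the example counts are made up for illustration and do not come from any of the services mentioned.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical per-paper citation counts for one researcher.
example_counts = [25, 12, 10, 8, 3, 3, 1, 0]
print("h-index:", h_index(example_counts))
print("i10-index:", i10_index(example_counts))
```

The same two functions can be run on counts from different sources (Google Scholar, Scopus, and so on) to see how much the indices depend on the database behind them.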
Mr. XKCD, Randall Munroe, has just written about Einstein and his relativity theories in the ten hundred words language.
A few weeks ago I submitted a research application where the popular science part was written with the ten hundred words. It is an interesting exercise to formulate your research without all the jargon and buzzwords.
The English version is here:
Imagine numbers from all studies presented in books and papers put into a computer and carefully put up so everyone from all over the world quickly can see them. So everyone can add new numbers by themselves. So everyone can show the numbers against each other to see if they agree or not between studies. We will study ways to make this possible. We start from brain studies and take out numbers from brain study papers and put them into a computer store. We need to find a way to do this fast and do it exactly. We need to decide on a way to handle the numbers in the computer so taking numbers in and out of the store is easy and the form is easy for others to understand. We need to find a way to make the computer understand whether the studies agree or not. And a way to find what is the cause when they do not agree. And finally we need to find a way to show whether the numbers agree or not to people from the whole world sitting at their computers.
It was constructed with http://splasho.com/upgoer5/
My Danish translation was:
Forestil dig tal fra alle studier fra artikler lagt ind i en computer, og omhyggeligt lagt op så alle fra hele verden hurtigt kan se dem. Så alle selv kan komme med nye tal. Så alle kan vise tallene op mod hinanden og se om de stemmer overens mellem studierne. Vi vil undersøge måder for at gøre dette muligt. Vi begynder med hjernestudier og tager tal fra artikler om hjernen og lægger dem ind i computeren. Vi vil undersøge hvordan man kan gøre det hurtigt og præcist. Det er nødvendigt at bestemme en måde at håndtere tallene i computeren på, så det er nemt både at få tallene ind og ud. Det er nødvendigt at finde en måde så computeren forstår om studierne stemmer overens eller ikke, og hvad årsagen er hvis de ikke gør. Til sidst vil vi finde en måde at vise om tallene stemmer overens så alle folk fra hele verden kan se det fra deres computer.
Several years ago we started a research project, Responsible Business in the Blogosphere, together with, among others, members from the Corporate Social Responsibility (CSR) group at the Copenhagen Business School (CBS). The research project looked at social media, companies and their corporate social responsibility. The start of the project coincided with the ascent of Twitter, and a number of our research publications from the project considered data analysis of Twitter messages. Among them were my “A new ANEW: Evaluation of a word list for sentiment analysis in microblogs”, about the development and evaluation of my sentiment analysis word list AFINN, and “Good Friends, Bad News – Affect and Virality in Twitter”, with an analysis of information diffusion on Twitter, that is, retweets.
Strategies of Legitimacy Through Social Media: The Networked Strategy is our latest published work in the project. It describes a pharmaceutical company adopting Twitter for communication of CSR-related topics. It is a longitudinal case study with interviews of the people behind the company Twitter account and data mining of tweets. Itziar Castelló, Michael Etter and I authored the paper.
While I participated neither in the interviews nor in the interesting analysis of that information, I did a sentiment analysis and topic mining of the tweets that we collected from the company Twitter account and by searching for the company name via the Twitter search API. The results are displayed in Table I and Figure 2 of the paper.
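For readers unfamiliar with word-list sentiment analysis, the basic idea can be sketched in a few lines of Python. AFINN assigns each listed word an integer valence between -5 and +5, and a text is scored by summing the values of the words it contains. The tiny word list below is a made-up stand-in with illustrative scores, not the actual AFINN entries, and the tokenization is deliberately crude; it is not the exact procedure used in the paper.

```python
import re

# Hypothetical miniature word list for illustration only.
# The real AFINN list is far larger and its scores may differ.
TOY_WORD_SCORES = {
    "good": 3,
    "great": 3,
    "happy": 3,
    "bad": -3,
    "terrible": -3,
    "sad": -2,
}


def sentiment(text, word_scores=TOY_WORD_SCORES):
    """Sum the valence scores of all listed words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(word_scores.get(word, 0) for word in words)


for tweet in ["Good friends, bad news", "A great and happy day", "Nothing to report"]:
    print(tweet, "->", sentiment(tweet))
```

Words missing from the list simply count as zero, which is why the coverage of the word list matters so much for short texts like tweets.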
A note from the paper that I find interesting comments on the issues faced by the company as they developed the social media method:
“… the institutional orientation to hierarchical processes requiring approval for all forms of external communication; and the establishment of fixed working hours that ended at 4pm local time coexisting alongside a policy that customer complaints must be resolved within 48 hours, which prevented SED managers from conducting real-time conversations over the Twitter platform.”
Our paper argues for “a new, networked legitimacy strategy” for stakeholder engagement in social media with “nonhierarchical, non-regulated participatory relationships”.
Strategies of Legitimacy Through Social Media: The Networked Strategy is available gratis in September 2015.
Val McDermid, apparently an author of some standing as a writer of untrue crime novels, has written a true crime walkthrough of forensics topics interweaving real-life cases and comments. The fine selection of topics has so little overall progressive narrative that most of the chapters could have been permuted without loss of coherence. If there is a base for the book, it is a fascination with and awe of modern forensics. She is a good writer. Perhaps her crime novels have trained her in writing clear prose. She does not delve into academic technicalities that could perhaps have been interesting.
She has based her book on other books as well as a good number of interviews with a broad range of forensics experts. A few of these come from the University of Dundee: forensic chemist Niamh Nic Daéid and forensic anthropologist Sue Black.
I find McDermid's view of the fallibility of forensics balanced, drawing forth cases where presumed experts lack self-critique. Bernard Spilsbury and a U.S. ballistics expert, Thomas Quirk, are criticized. For Roy Meadow, McDermid presents aspects of the tragic Sally Clark case that I do not recall having read before: The appeal was not prompted by Meadow’s evidence but by the failure of pathologist Alan Williams to disclose blood test results.
I do sometimes find that popular science writing lacks an appropriate level of critique of the material. McDermid is one of the better writers, but I do find one case where she oversteps the confidence we should have in science. Here is what she writes on page 164: “We already know, for instance about the existence of a ‘warrior gene’ – present mainly in men – which is linked with violent and impulsive behaviour under stress”. When I read “We know” I get mad, and when I read ‘warrior gene’ I get extra mad. Behavioral genetics is a mess full of red herrings. A recent meta-analysis of the warrior gene polymorphism MAOA-uVNTR and antisocial behavior (“Candidate Genes for Aggression and Antisocial Behavior: A Meta-analysis of Association Studies of the 5HTTLPR and MAOA-uVNTR”) reports a 95% confidence interval of 0.98-1.32 while, interestingly, giving a very low p-value (0.00000137). The strange difference between confidence interval and p-value is discussed in the paper and presently goes over my head. What seems reasonably certain is the large amount of between-study heterogeneity. Any talk of a warrior gene needs to acknowledge the uncertainty.
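The mismatch between the interval and the p-value can be made concrete with a back-of-the-envelope check. The sketch below assumes that the interval is an ordinary 95% interval for a ratio measure, symmetric on the log scale, and that the p-value would come from the same normal approximation. Those are my assumptions, not something taken from the paper, and the function name is made up.

```python
import math


def p_value_from_ci(lower, upper, z_crit=1.959964):
    """Recover point estimate, z and two-sided p from a 95% CI of a ratio.

    Assumes the interval is symmetric on the log scale (normal approximation).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    log_estimate = (log_lower + log_upper) / 2
    standard_error = (log_upper - log_lower) / (2 * z_crit)
    z = log_estimate / standard_error
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return math.exp(log_estimate), z, p


estimate, z, p = p_value_from_ci(0.98, 1.32)
print(f"estimate={estimate:.3f}  z={z:.2f}  p={p:.3f}")
```

Under these assumptions an interval of 0.98-1.32 corresponds to a p-value of roughly 0.09, several orders of magnitude away from 0.00000137. So the reported interval and p-value can hardly stem from the same simple test, which is consistent with the paper needing to discuss the difference.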
There are certainly more elements to forensics than McDermid presents. A Danish newspaper has recently run a story about cell phone tower records used in courtroom cases. A person carrying a powered cell phone reveals his/her location, but only with a certain accuracy. Cell phones may not necessarily select the nearest cell tower. From my own experience I know that my cell phone can select cell towers in countries other than the one where I am located, e.g., my cell phone in Nordsjælland in Denmark can easily select a cell tower in Sweden 15 to 20 kilometers or more away, and my cell phone in Romania switched to a Ukrainian cell tower perhaps 20 kilometers or more away. The U.S. state of Oregon has seen the case of Lisa Marie Roberts, who on bad advice from her lawyer pleaded guilty in 2004 because of supposedly critically important cell tower evidence. In 2013 she was freed.
I was struck by one of the stories presented that originates from the book of criminal lawyer Alex McBride. A surveillance camera records a case of apparently straightforward violence, but McBride is able to get his client off by threatening to use another part of the camera recording showing a policeman mishandling a person in a case of wrongful arrest. The prosecution dropped the charge for the original case. It does not seem fair to the victim of the original crime that the criminal can go free just because another crime is committed. To me it looks like a kind of corruption and extortion.
(Review also available on LibraryThing)
Back in the 1990s I spent considerable computer time training and optimizing artificial neural networks. It was hot then. Then around year 2000 artificial neural networks became unfashionable, with Gaussian processes and support vector machines taking over. During the 2000s computers got faster, and some engineers turned to see what graphics processing units (GPUs) could do besides rendering graphics for computer games. GPUs are fast at matrix computations, which are central in artificial neural network computations. According to Jürgen Schmidhuber, Oh and Jung’s 2004 paper “GPU implementation of neural networks” seems to be the first to describe the use of GPUs for neural network computation, but it was perhaps not until Dan Ciresan from Politehnica University of Timisoara began using GPUs that interesting advances came: In Schmidhuber’s lab he trained a GPU-based deep neural network system for traffic sign classification and managed to get superhuman performance in 2011.
Deep learning, i.e., computation with many-layered neural network systems, was already then taking off and is now broadly applied. The training of a system to play computer games (classic Atari 2600 games) is perhaps the most illustrative example of how flexible and powerful modern neural networks are. So in limited domains deep neural networks are presently taking large steps.
A question is whether this will continue and whether we will see artificial intelligence systems having more general superhuman capabilities. Nick Bostrom’s book ‘Superintelligence’ presupposes so and then starts to discuss “what then”.
Bostrom’s book, written from the standpoint of an academic philosopher, can be regarded as an elaboration of the classic Vernor Vinge essay “The coming technological singularity: how to survive in the post-human era” from 1993. It is generally thought that if or when an artificial intelligence becomes nearly as intelligent as humans, it will be able to improve itself, and once improved it will be able to improve yet more, resulting in a quick escalation (Vinge’s ‘singularity’) with the artificial intelligence system becoming much more intelligent than humans (Bostrom’s ‘superintelligence’). Bostrom lists surveys among experts showing that the median estimate for the arrival of human-level intelligence is between 2040 and 2050, and a share of experts even believe the singularity will appear in the 2020s.
The book lacks solid empirical work on the singularity. The changes around the industrial revolution are discussed a bit, and the horse in society in the 20th century is mentioned: After widespread use for transport, its function for humans was taken over by human-constructed machines and the horses were sent to the butcher. Horses in the developed world are now mostly used for entertainment purposes. There are various examples in history where a more ‘advanced’ society competes with an established, less developed one: Neanderthals/modern humans, the age of colonization. It is possible that a superintelligence/human encounter will be quite different though.
The book discusses a number of issues from a theoretical and philosophical point of view: ‘the control problem’, ‘singleton’, equality, strategies for uploading values to the superintelligent entity. It is unclear to me if a singleton is what we should aim at. In capitalism, a monopoly does not necessarily seem to be good for society, and market-economy societies put up regulation against monopolies. Even with a superintelligent singleton it appears to me that the system can run into problems when it tries to handle incompatible subgoals, e.g., an ordinary desktop computer, as a singleton, may have individual processes that require a resource which is not available because another process is using it.
Even if the singularity is avoided there are numerous problems facing us in the future: warbots as autonomous machines with killing capability, do-it-yourself kitchen-table bioterrorism, generally intelligent programs and robots taking our jobs. Major problems with IT security occur nowadays with nasty ransomware. The development of intelligent technologies may foster further inequality where a winner-takes-all company will reap all benefits.
Bostrom’s take-home message is that superintelligence is a serious issue that we do not know how to tackle, so please send more money to superintelligence researchers. It is worth alerting society about the issue. There is general awareness of some long-term issues in the evolution of society, such as demographics, future retirement benefits, natural resource depletion and climate change. It seems that the development in information technology might be much more profound and require much more attention than, say, climate change. I found Bostrom’s book a bit academically verbose, but I think the book has quite important merit as a coherent work setting up the issue for the major task we have at hand.
The Science and Engineering Indicators 2012 from the National Science Foundation was commented on in Discovery. A direct link to the report is here. The public knowledge of science was surveyed across countries. Respondents were supposed to answer yes or no to around 12 scientific questions, and those who gave the right answers were declared “science literate”.
It is interesting to turn the survey on its head and ask whether the ground truth answers are actually correct. It is dangerous to be too dogmatic in science, as great paradigmatic changes may go against established “common wisdom”. Here is my attempt at arguing for the opposite of the supposedly correct answer:
- “The center of the earth is very hot.” This answer is supposed to be true, but what does “very hot” mean? My oven is very hot when it is above 200 degrees. Wikipedia presently states the inner core temperature to be around 5,700 K. This temperature is not “very hot” compared to the temperature at the center of the Sun. It is also important to note that it is not a necessary truth that the earth is hot. In the future, as the natural radioactivity of the earth decays, the center temperature will gradually decrease. The center could become very cold. But note that the life of the Sun has some effect on the temperature of the earth. According to Wikipedia the Sun’s luminosity is increasing and will make the surface temperature hot, perhaps “very hot” depending on the definition.
- “The continents have been moving their location for millions of years and will continue to move”. This is supposed to be true, but does, e.g., Eurasia move, or Antarctica? It seems mostly so based on a NASA image, though some areas in Antarctica do not move very much. There is no natural law saying that the continents are supposed to move, so there is no guarantee that they will continue to do so. Given that we will see a gradual cooling of the Earth, should we not at some point expect the continents to freeze in their position and no longer move? So should the answer be false? Yes?
- “Does the Earth go around the Sun, or does the Sun go around the Earth?” The answer is supposed to be “earth around sun”: A classic fallacy. You can put your coordinate system at the Earth's center, and the Sun will go around the Earth. Of course, as the Sun is heavier than the Earth you will have a tendency to say that the Earth goes around the Sun. Introducing the barycenter in the question may be better.
- “All radioactivity is man-made”. (False) The default answer is hard to refute.
- “Electrons are smaller than atoms.” (True) Wikipedia tells us that for the atom “the boundary is not a well-defined physical entity”. The classical electron radius is 2.82×10⁻¹⁵ m. Wikipedia claims the “size of an 11 MeV proton” is less than that value. With a bit of stretching you could say that a hydrogen atom is smaller than an electron. I do not know much about physics at that level, but my impression is that you should be careful when using classical physical concepts at atomic levels.
- “Lasers work by focusing sound waves.” (False) Usually lasers do not work by focusing sound waves. It is interesting to wonder whether it is possible to produce a laser this way. For a start read about sonoluminescence. I am not knowledgeable enough to say it cannot be done, but your next reading could be “Laser sonoluminescence in water under increased hydrostatic pressure” or single-bubble sonoluminescence.
- “The universe began with a huge explosion”. (True) Most scientists would say that the Big Bang happened at some time, but you can read “The pre-bang universe has become the latest frontier of cosmology” in Scientific American. You can also read “Sir Roger Penrose has changed his mind about the Big Bang. He now imagines an eternal cycle of expanding universes where matter becomes energy and back again in the birth of new universes and so on and so on” in an introduction to a documentary. It should certainly not be written in stone that “the universe began with a huge explosion”.
- “The cloning of living things produces genetically identical copies” (True). Do they? I suppose not necessarily. I guess you could genetically engineer the clone with new genes. I do not know much about cloning, but I imagine that something could go wrong in the transcription process, e.g., you can read “clones created from somatic cells will have shortened telomeres and therefore reach a state of senescence more rapidly” in one random article I found. This sounds to me as if clones are not identical copies.
- “It is the father’s gene that decides whether the baby is a boy or a girl” (True). “Decides” is a word that in my eyes indicates agency, whatever that is. I hardly think that the gene has any cognitive capabilities to engage in a process of selection. “Determine” may be a better word. And “decides” no. It may also be worth reading the Wikipedia article “Temperature-dependent sex determination”, starting off with “Temperature-dependent sex determination (TSD) is a type of environmental sex determination in which the temperatures experienced during embryonic development determine the sex of the offspring.” Another article worth reading is “Female foeticide in India” with “MacPherson estimates that 100,000 abortions every year continue to be performed in India solely because the fetus is female.”
- “Ordinary tomatoes do not contain genes, while genetically modified tomatoes do” (false). That is hard to refute. I suppose that with treatment (gamma rays, heat?) you can kill genes in tomatoes.
- “Antibiotics kill viruses as well as bacteria”. (False) This is not necessarily false. You could easily imagine antibiotics that were also engineered to kill viruses. With opportunistic infections you could say that antibiotics kill viruses indirectly. There are scientific experiments that examine whether antibiotics have an effect on virus diseases, see, e.g., “Antibiotics for bronchiolitis in babies”. It could very well be dangerous to ignore the possibility that antibiotics could be involved in fighting a virus disease.
- “Human beings, as we know them today, developed from earlier species of animals.” (True) Yet again we have an issue of words: “Developed”? “Evolved” is probably a better word. “Develop” may entail some form of agency. It may also be worth mentioning that certain aspects of human society and of being human seem to have been created de novo by humans, e.g., humor, clothes, music, …
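The barycenter remark for the Earth/Sun question can be checked with a little arithmetic. For two bodies, the common center of mass lies at a distance a·m₂/(m₁+m₂) from the heavier body, where a is the separation. The sketch below uses rounded textbook values for the masses and the mean distance and ignores all other planets, so it is only a two-body illustration.

```python
# Rounded reference values; the two-body setup is a simplifying assumption.
SUN_MASS_KG = 1.989e30
EARTH_MASS_KG = 5.972e24
SUN_EARTH_DISTANCE_KM = 1.496e8
SUN_RADIUS_KM = 6.957e5


def barycenter_distance_from_primary(m_primary, m_secondary, separation):
    """Distance from the primary's center to the two-body center of mass."""
    return separation * m_secondary / (m_primary + m_secondary)


distance_km = barycenter_distance_from_primary(
    SUN_MASS_KG, EARTH_MASS_KG, SUN_EARTH_DISTANCE_KM
)
print(f"Sun-Earth barycenter: about {distance_km:.0f} km from the Sun's center")
print("Inside the Sun:", distance_km < SUN_RADIUS_KM)
```

With these numbers the Sun-Earth barycenter sits a few hundred kilometers from the Sun's center, deep inside the Sun. Strictly speaking, both bodies go around that point.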
Overall the survey creators did not get many answers correct. Better luck next time.
I have just received a citation alert from the Google Scholar system as I was cited in http://firstmonday.org/ojs/index.php/fm/article/view/3203/3019
Interestingly, the alert did not come from the First Monday journal directly but from a paper on firstmonday.insurancetribe.com (see the excerpt below). To me it seems that insurancetribe.com is abusing First Monday material on their site. Their URL redirects to homesecurityfix.com. This must be spam.
[HTML] Font Size Current Issue Atom logo
http://scholar.google.com/scholar_url?url=http://firstmonday.insurancetribe.com/ojs/index.php/fm/article/view/3203/3019
D Geifman, DR Raban, R Sheizaf
Abstract Prediction Markets are a family of Internet–based social
which use market price to aggregate and reveal information and opinion
audiences. The considerable complexity of these markets inhibited the
full realization of *…*
When I last checked, Google Scholar redirected to the spam site. However, I cannot find the insurancetribe version among the indexed versions now.
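A check like this could be automated for future alerts. The sketch below pulls the target address out of the `url` parameter of a `scholar_url` link, as seen in the excerpt above, and tests whether its host belongs to the domain you expect. The function names are made up for illustration, and whether Google Scholar always uses this link format is an assumption on my part.

```python
from urllib.parse import parse_qs, urlparse


def target_host(scholar_link):
    """Return the host of the address in the link's 'url' query parameter."""
    query = parse_qs(urlparse(scholar_link).query)
    targets = query.get("url")
    if not targets:
        return None
    return urlparse(targets[0]).hostname


def is_on_domain(host, expected_domain):
    """True if host is the expected domain or one of its subdomains."""
    if host is None:
        return False
    return host == expected_domain or host.endswith("." + expected_domain)


alert_link = (
    "http://scholar.google.com/scholar_url?url="
    "http://firstmonday.insurancetribe.com/ojs/index.php/fm/article/view/3203/3019"
)
host = target_host(alert_link)
print(host, "-> on firstmonday.org:", is_on_domain(host, "firstmonday.org"))
```

Note that the comparison is on the end of the host name, so a look-alike host that merely starts with "firstmonday." is not accepted as the journal's own site.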