
The Wikidata scholarly profile page



Recently Lambert Heller wrote an overview piece on websites for scholarly profile pages: “What will the scholarly profile page of the future look like? Provision of metadata is enabling experimentation”. There he tabulated the features of the various online sites that offer scholarly profile pages. These sites include (with links to my entries): ORCID, ResearchGate, Mendeley, Pure and VIVO (I don’t know these two), Google Scholar and Impactstory. One site missing from the equation is Wikidata. It can produce scholarly profile pages too. The default Wikidata editing interface may not present the data in a nice way (Magnus Manske’s Reasonator does better), but much of the functionality needed for a scholarly profile page is there.

In terms of the features listed by Heller, I will here list the possible utilization of Wikidata:

  1. Portrait picture: The P18 property can record a Wikimedia Commons image related to a researcher. For instance, you can see a nice photo of neuroimaging professor Russ Poldrack.
  2. Researcher’s alternative names: This is possible with the alias functionality in Wikidata. Poldrack is presently recorded with the canonical label “Russell A. Poldrack” and the alternative names “Russell A Poldrack”, “R. A. Poldrack”, “Russ Poldrack” and “R A Poldrack”. It is straightforward to add more variations.
  3. IDs/profiles in other systems: There are absolutely loads of these links in Wikidata. To name a few deep linking possibilities: Twitter, Google Scholar, VIAF, ISNI, ORCID, ResearchGate, GitHub and Scopus. Wikidata is very strong in interlinking databases.
  4. Papers and similar: Papers are represented as items in Wikidata and these items can link to the author via P50. The reverse link is possible with a SPARQL query. Furthermore, on the researcher’s item it is possible to list main works with the appropriate property. Full texts can be linked with the P953 property. PDFs of papers with a compatible license can be uploaded to Wikimedia Commons and/or included in Wikisource.
  5. Uncommon research product: I am not sure what this is, but the developer of software services is recorded in Wikidata. For instance, for the neuroinformatics database OpenfMRI it is specified that Poldrack is the creator. Backlinks are possible with SPARQL queries.
  6. Grants, third party funding: Well, there is a sponsor property, but how it should be used for researchers is not clear. With the property, you can specify that a paper or research project was funded by an entity. For the paper The Center for Integrated Molecular Brain Imaging (Cimbi) database you can see that it is funded by the Lundbeck Foundation and Rigshospitalet.
  7. Current institution: Yes. The employer and affiliation properties are there for you. You can see an example of an incomplete list of people affiliated with research sections at my department, DTU Compute, here, automagically generated by Magnus Manske’s Listeria tool.
  8. Former employers, education etc.: Yes. There is a property for employer, for affiliation and for education. With qualifiers you can specify the dates of employment.
  9. Self assigned keywords: Well, as a Wikidata contributor you can create new items, and you can use these items to specify your field of work or to label your paper with a main theme.
  10. Concept from controlled vocabulary: Whether Wikidata is a controlled vocabulary is up for discussion. Wikidata items can be linked to controlled vocabularies, e.g., Dewey’s, so there you can get some degree of control. For instance, the concept “engineer” in Wikidata is linked to the BNCF, NDL, GND, ROME, LCNAF, BNF and FAST.
  11. Social graph of followers/friends: No, that is really not possible on Wikidata.
  12. Social graph of coauthors: Yes, that is possible. With Jonas Kress’ work on D3-enabled graph rendering you get on-the-fly graph rendering in the Wikidata Query Service. You can see my coauthor graph here (it is wobbly at the moment; there is some D3 parameter that needs a tweak).
  13. Citation/attention metadata from platform itself: No, I don’t think so. You can get page view data from somewhere on the Wikimedia sites. You can also count the number of citations on-the-fly, to an author, to a paper, etc.
  14. Citation/attention metadata from other sources: No, not really.
  15. Comprehensive search to match/include own papers: Well, perhaps not. Or perhaps. Magnus Manske’s sourcemd and quickstatement tools allow you to copy-paste a PMID or DOI into a form field and press two buttons to grab bibliographic information from PubMed and a DOI source. One-click full paper upload is not well-supported, to my knowledge. Perhaps Daniel Mietchen knows something about this.
  16. Forums, Q&A, etc.: Well, yes and no. You can use the discussion pages on Wikidata, but these pages are perhaps mostly for discussion of editing, rather than the content of the described item. Perhaps Wikiversity could be used.
  17. Deposit own papers: You can upload appropriately licensed papers to Wikimedia Commons or perhaps Wikisource. Then you can link them from Wikidata.
  18. Research administration tools: No.
  19. Reuse of data from outside the service: You better believe it! Although Wikidata is there to be used, a mass download from the Wikidata Query Service can run into timeout problems. To navigate the structure of an individual Wikidata item, you need programming skills, at least for the moment. If you are really desperate you can download the Wikidata dump and Blazegraph and try to set up your own SPARQL server.
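
The “reverse link” from a researcher to his or her papers (item 4) could be sketched in Python roughly as below. This is only a minimal sketch under assumptions: the function names, the User-Agent string and the default limit are my own inventions for illustration, and Q42 in the usage note is just a placeholder item identifier, not a researcher from the list above. Only the query builder is exercised here; the fetching function needs network access and may run into the timeout problems mentioned in item 19.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_works_query(author_qid, limit=100):
    """Return a SPARQL query listing works whose author (P50) is the given item."""
    # Hypothetical sanity check: accept only identifiers of the form Q<digits>.
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError("expected a Wikidata item identifier such as 'Q42'")
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def fetch_works(author_qid):
    """Run the query over HTTP and return a list of (work URI, label) pairs."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": build_works_query(author_qid), "format": "json"})
    # The User-Agent value is an arbitrary example string.
    request = urllib.request.Request(
        url, headers={"User-Agent": "profile-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [(binding["work"]["value"], binding["workLabel"]["value"])
            for binding in data["results"]["bindings"]]
```

Calling build_works_query("Q42") returns the query text, which can also be pasted directly into the Wikidata Query Service web interface.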


So what can we use Wikicite for?



Wikicite is a term for the combination of bibliographic information and Wikidata. While Wikipedia often records books of some notability, it rarely records bibliographic information for works of less notability, i.e., individual scientific articles and books where little third-party information (reviews, literary analyses, etc.) exists. This is not the case with Wikidata. Wikidata is now beginning to record lots of bibliographic information for “lesser works”. What can we use this treasure trove for? Here are a few of my ideas:

  1. Wikidata may be used as a substitute for a reference manager. I record my own bibliographic information in a big BibTeX file and use the bibtex program together with latex when I generate a scientific document with references. It might very well be that the job of the BibTeX file may be taken over by Wikidata. So far we have, to my knowledge, no proper program for extracting the data in Wikidata and formatting it for inclusion in a document. I have begun a “wibtex” program for this but have only reached 44 lines so far. It remains to be seen whether this is a viable avenue: whether the structure of Wikidata is good and convenient enough to record data for formatting references, or whether Wikidata is too flexible or too restricted for this kind of application.
  2. Wikidata may be used for “lists of publications” of individual researchers, institutions, research groups and sponsors. Nowadays, I keep a list of publications on a webpage, in a latex document and on Google Scholar. My university has a separate list, and sometimes when I write a research application I need to format the data for inclusion in a Microsoft Word document. A flexible program on top of Wikidata could make dynamic lists of publications.
  3. Wikidata may be used to count citations. During the Wikicite 2016 Berlin meeting I suggested the P2860 property and Tobias quickly created it. P2860 allows us to describe citations between items in Wikidata. Though we managed to use the property a bit for scientific articles during the meeting, it has really been James Hare who has been running with the ball. Based on public citation data he has added hundreds of thousands of citations. At the moment this is of course only a very small part of the total number of citations. There are probably tens of millions of scientific papers, each having tens, if not hundreds, of citations, so with the 499,750 citations that James Hare reported on 11 September 2016, we are still far from covering the field: James Hare tweeted that Web of Science claims to have over 1 milliard (billion) citations. The citation counts may be compared to a whole range of context data in Wikidata: author, affiliated institution, journal, year of publication, gender of author and sponsor (funding agency), so we can get, e.g., most cited Dane (or one affiliated with a Danish institution), most cited woman with an image, etc.
  4. Wikidata may be used as a hub for information sources. Individual scientific articles may point to further resources, such as raw or result data. I myself have, for instance, added links to the neuroinformatics databases OpenfMRI, NeuroVault and Neurosynth, and Wikidata now records all papers recorded in OpenfMRI, as far as I can determine. Wikidata is then able to list, say, all OpenfMRI papers or all OpenfMRI authors with Magnus Manske’s Listeria tool.
  5. Wikicite information in Wikidata may be used to support claims in Wikidata itself. As Dario Taraborelli points out this would allow queries like “all statements citing journal articles by physicists at Oxford University in the 1970s”.
  6. Wikidata may be used for scientometric analyses other than counting, e.g., generation of coauthor graphs and cocitation graphs giving context to an author or paper. The bubble chart above shows statistics for journals of papers in OpenfMRI generated with the standard Wikidata Query Service bubble chart visualization tool.
  7. Wikidata could be used for citations in Wikipedia. This may very well be problematic, as a large Wikipedia article could have hundreds of references and each reference needs to be fetched from Wikidata generating lots of traffic. I tried a single citation on the “OpenfMRI” article (it has later been changed). Some form of inclusion of Wikidata identifier in Wikipedia references could further Wikipedia bibliometrics, e.g., determine the most cited author across all Wikipedias.
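
To make the reference manager idea (item 1) a bit more concrete, here is a minimal sketch of the formatting half of such a program. It is not the “wibtex” program itself: the function name, the flat record layout and the example record are all hypothetical, and a real program would first have to pull the values out of a Wikidata item (for instance title from P1476, author from P50, DOI from P356).

```python
def to_bibtex(key, record):
    """Format a flat bibliographic record as a BibTeX @article entry.

    `record` is a hypothetical dictionary layout, not a Wikidata structure.
    """
    fields = [
        ("author", " and ".join(record["authors"])),
        # Extra braces protect the capitalization of the title in BibTeX.
        ("title", "{" + record["title"] + "}"),
        ("journal", record.get("journal")),
        ("year", record.get("year")),
        ("doi", record.get("doi")),
    ]
    lines = ["@article{" + key + ","]
    for name, value in fields:
        if value:  # Skip fields that are missing or empty.
            lines.append("  {} = {{{}}},".format(name, value))
    lines.append("}")
    return "\n".join(lines)


# Made-up example record, for illustration only.
example = {
    "authors": ["Jane Doe", "John Smith"],
    "title": "An example paper",
    "journal": "Journal of Examples",
    "year": 2016,
}
entry = to_bibtex("Doe2016Example", example)
```

Whether Wikidata’s structure maps this cleanly onto BibTeX fields in general is exactly the open question raised in item 1.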

My daily life


London-based Danish comedian Sofie Hagen last year premiered her “Bubblewrap” standup/storytelling act. In part of the act she read her own fanfiction, written when she was young and hot on a boyband. Hagen’s self-ironic commentary and further elaborations of teenage (or preteenage?) troubles made for one of the finest standup performances I have seen.

I myself have gotten hold of my own early writing: an exercise book from the “Danish” course, where the hand-in would sometimes be my favorite topic: “fristil”, “freestyle”. Here you could select your own topic and let your imagination run away. One hand-in that I discovered was “Min hverdag” (“My daily life”) from an exercise book from 6th grade (I was perhaps 12 years old), a short story with a self-confident carelessness detailing what has become a common theme in my life: running late.

Here is my (present) English translation. Enjoy:

My daily life.

I just made a time machine when I got home from school. Yesterday I made a car powered by water, but that broke.

When I was finished with the time machine, I set it to the year 1872, 7 Savile Row, London, England. I went into the time machine and pressed the button which started the time machine. Everything went black. One minute passed, and then suddenly I was standing right in front of Phileas Fogg. He did not seem to be particularly surprised. I turned around, and there was Passepartout. I said hello in Danish. They understood that well, as I had an interpreter machine with me, that I had made a couple of weeks ago. They too said hello. Passepartout stood with a valise. Phileas Fogg and Passepartout went out of the street door and that I did too. That was lucky as they locked the door behind them. I asked if I could come along. Phileas Fogg said yes. We went to the end of Savile Row precisely as in “Around the World in Eighty Days”. But they went further and towards a store. I asked, whether they were not going around the world. Mr. Fogg answered that they had been around the world, so I did not want to play this game anymore. I pressed the button and flew back to the present.

I was a bit annoyed as I did not get around the world. I wanted to try again but then the time machine broke. I could of course have repaired it. I did not want to use the 5 minutes that it would take to repair it, as I had homework to do. I could of course put a robot to do it. But on the other hand: You also need to learn something.

The teacher remarked “Excellent. But where did you get the topic from?” :)

Here is the Danish original slightly edited:

Min hverdag.

Jeg lavede lige en tidsmaskine, da jeg kom hjem fra skole. I går lavede jeg en bil, der gik på vand. Men den gik i stykker.

Da jeg var færdig med tidsmaskinen, stillede jeg den på året 1872, Saville-row nr. 7 London, England. Jeg gik ind i tidsmaskinen og trykkede på knappen som startede tidsmaskinen. Alt blev sort. Der gik 1 minut, og så pludselig stod jeg lige foran Phileas Fogg. Han så ikke ud til at være særligt overrasket. Jeg vendte mig om, og der stod Passepartout. Jeg sagde goddag på dansk. Det kunne de godt forstå fordi jeg havde en oversættermaskine på mig, som jeg havde lavet for nogle uger siden. De sagde også goddag. Passepartout stod med en vadsæk. Phileas Fogg og Passepartout gik ud af gadedøren og det gik jeg også. Det var heldigt fordi de låste døren efter sig. Jeg spurgte om jeg måtte komme med. Phileas Fogg sagde ja. Vi gik ned for enden af Saville-row, præcis som i “Jorden rundt i 80 dage”. Men de gik videre og over mod en butik. Jeg spurgte, om de ikke skulle jorden rundt. Mr. Fogg svarede, at det havde de været, så jeg gad ikke at lege dette her mere. Jeg trykkede på knappen og jeg fløj tilbage til nutid.

Jeg var lidt sur fordi jeg ikke var kommet jorden rundt. Jeg ville prøve igen, men så gik tidsmaskinen i stykker. Jeg kunne selvfølgelig have repareret den. Jeg gad ikke at bruge de 5 minutter, der skulle til at reparere den, fordi jeg havde lektier for. Jeg kunne selvfølgelig sætte en robot til det. Men på den anden side: man skal jo også lære noget.

Neuroinformatics coauthor network – so far


neuroinformatics coauthor network 2016-06-28

Screenshot of the neuroinformatics coauthor network, so far. Only the big cluster is shown. The network is rendered with Jonas Kress’ default setup, querying the Wikidata Query Service (WDQS).

Backup of directory

At times I do not recall the command to back up a directory. So here it is, for my own reference:

$ rsync -au /home/fnielsen/Pictures/ /media/fnielsen/a4f11e07-3f63-45cc-bcab-e7d135c14b9c/backup/Billeder/

The -a is the standard archive option. The -u is ‘update’ (“skip files that are newer on the receiver”).
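
If the backup is to be run from a script rather than recalled by hand, the same command could be wrapped in a few lines of Python. This is just a sketch: the function names are made up, and the --dry-run switch (a standard rsync option that only reports what would be transferred) is my own addition to the command above.

```python
import subprocess


def rsync_backup_command(source, destination, dry_run=False):
    """Build the argument list for an 'rsync -au' backup of a directory."""
    command = ["rsync", "-au"]
    if dry_run:
        # Only report what would be copied; nothing is written.
        command.append("--dry-run")
    command += [source, destination]
    return command


def run_backup(source, destination, dry_run=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        rsync_backup_command(source, destination, dry_run), check=True)
```

Note that the trailing slash on the source directory matters to rsync: with it, the contents of the directory are copied into the destination rather than the directory itself.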

Page rank of scientific papers with citation in Wikidata – so far


A citation property was created just a few hours ago, and as of writing it has still not been deleted. It means we can describe citation networks, e.g., among scientific papers.

So far we have added a few citations, mostly from papers about Zika. And now we can plot the citation network or compute network measures such as page rank.

Below is a Python program that puts everything together with sparql, Pandas and NetworkX:

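import sparql
import networkx as nx
from pandas import DataFrame

# Citation pairs: ?source cites ?target via the P2860 property
statement = """
select ?source ?sourceLabel ?target ?targetLabel where {
  ?source wdt:P2860 ?target .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
"""

# The endpoint URL is missing in the text as it stands; presumably it is
# the Wikidata Query Service at https://query.wikidata.org/sparql
service = sparql.Service('')
response = service.query(statement)
df = DataFrame(response.fetchall(),
    columns=response.variables)

# Convert the returned label objects to plain strings
# (the original used Python 2's unicode; str is the Python 3 equivalent)
df.sourceLabel = df.sourceLabel.astype(str)
df.targetLabel = df.targetLabel.astype(str)

# Directed graph with an edge from the citing to the cited paper
g = nx.DiGraph()
g.add_edges_from(((row.sourceLabel, row.targetLabel)
    for n, row in df.iterrows()))

pr = nx.pagerank(g)
sorted_pageranks = sorted((rank, title)
    for title, rank in pr.items())[::-1]

# Print the ten highest-ranked papers with truncated titles
for rank, title in sorted_pageranks[:10]:
    print("{:.4} {}".format(rank, title[:40]))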

The result:

0.02647 Genetic and serologic properties of Zika
0.02479 READemption-a tool for the computational
0.02479 Intrauterine West Nile virus: ocular and
0.02479 Internet encyclopaedias go head to head
0.02479 A juvenile early hominin skeleton from D
0.01798 Quantitative real-time PCR detection of 
0.01755 Zika virus. I. Isolations and serologica
0.01755 Genetic characterization of Zika virus s
0.0175 Potential sexual transmission of Zika vi
0.01745 Zika virus in Gabon (Central Africa)--20
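
For readers without NetworkX at hand, the idea behind page rank can be sketched with a plain power iteration in pure Python. To be clear, this is a swapped-in illustration and not the NetworkX implementation: the function, the damping value of 0.85, the fixed number of iterations and the toy citation edges are all assumptions of mine.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a list of (citing, cited) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread evenly over all papers.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        # Each paper passes its rank on in equal shares to the papers it cites.
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Made-up toy network: A and B cite C, and C cites D.
toy_edges = [("paper A", "paper C"), ("paper B", "paper C"),
             ("paper C", "paper D")]
ranks = pagerank(toy_edges)
```

In the toy network the rank flows along the citations, so the paper at the end of the chain collects the most, and the ranks always sum to one.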

Occupations of persons from Panama Papers

Can we get an overview of the occupations of the persons associated with the Panama Papers? Well … that might be difficult, but we can get a biased plot by using the listing in Wikidata, where persons associated with the Panama Papers seem to be tagged and where their occupation(s) are listed. It produces the plot below.


It is fairly straightforward to construct such a bubble chart given the new plotting capabilities in the Wikidata Query Service. Dutch Wikipedian Gerard Meijssen seems to have been the one who has entered the information in Wikidata linking Panama Papers to persons via the ‘significant event’ property. How completely he has managed to do this so far I do not know. Our Danish Wikipedian Ole Palnatoke Andersen set up a page on the Danish Wikipedia at Diskussion:Panama-papirerne/Wikidata, tabulating the persons with the nice Listeria tool of Magnus Manske. Modifying Ole’s SPARQL query we can get the count of occupations for the persons associated with the Panama Papers in Wikidata.

SELECT ?occupationLabel (COUNT(DISTINCT ?person) AS ?count) WHERE {
  ?person wdt:P793 wd:Q23702848 ; wdt:P106 ?occupation .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} GROUP BY ?occupationLabel
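
What the count(distinct ?person) with grouping in the query above computes can be mimicked in a few lines of Python. This is only a sketch on made-up rows: the function name and the person identifiers are hypothetical and do not come from Wikidata.

```python
from collections import defaultdict


def count_occupations(rows):
    """Count distinct persons per occupation from (person, occupation) pairs."""
    persons_by_occupation = defaultdict(set)
    for person, occupation in rows:
        # A set ignores duplicate rows, which mirrors 'count(distinct ?person)'.
        persons_by_occupation[occupation].add(person)
    counts = {occupation: len(persons)
              for occupation, persons in persons_by_occupation.items()}
    # Largest groups first; ties are ordered alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up rows; a person with two occupations is counted once in each group.
rows = [
    ("Q1001", "politician"),
    ("Q1001", "politician"),
    ("Q1001", "businessperson"),
    ("Q1002", "politician"),
    ("Q1003", "association football player"),
]
occupation_counts = count_occupations(rows)
```

A person with several occupations contributes to several bubbles, so the counts do not add up to the number of persons.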

Some people may see that politicians are the largest group, but that might simply be an artifact of the notability criterion of Wikidata: Only people who are somewhat notable or are linked to something notable are likely to be included in Wikidata, e.g., the common businessman/woman may not (yet?) be represented in Wikidata.

The bubble chart cuts off letters of the occupation words: ‘murd’ is murderer. Joaquín Guzmán has his occupation set to murderer in Wikidata, without a source…