Month: September 2016
Recently Lambert Heller wrote an overview piece on websites for scholarly profile pages: “What will the scholarly profile page of the future look like? Provision of metadata is enabling experimentation“. There he tabularized the features of the various online sites having scholarly profile pages. These sites include (with links to my entries): ORCID, ResearchGate, Mendeley, Pure and VIVO (don’t know these two), Google Scholar and Impactstory. One site missing from the equation is Wikidata. It can produce scholarly profile pages too. The default Wikidata editing interface may not present the data in a nice way – Magnus Manske’s Reasonator – better, but very much of the functionality is there to make a scholarly profile page.
In terms of the features listed by Heller, I will here list the possible utilization of Wikidata:
- Portrait picture: The P18 property can record Wikimedia Commons image related to a researcher. For instance, you can see a nice photo of neuroimaging professor Russ Poldrack.
- Researchers alternative names: This is possible with the alias functionality in Wikidata. Poldrack is presently recorded with the canonical label “Russell A. Poldrack” and the alternative names “Russell A Poldrack”, “R. A. Poldrack”, “Russ Poldrack” and “R A Poldrack”. It is straightforward to add more variations
- IDs/profiles in other systems: There are absolutely loads of these links in Wikidata. To name a few deep linking posibilities: Twitter, Google Scholar, VIAF, ISNI, ORCID, ResearchGate, GitHub and Scopus. Wikidata is very strong in interlinking databases.
- Papers and similar: Papers are presented as items in Wikidata and these items can link to the author via P50. The reverse link is possible with a SPARQL query. Futhermore, on the researcher’s items it is possible to list main works with the appropriate property. Full texts can be linked with the P953 property. PDF of papers with an appropriate compatible license can be uploaded to Wikimedia Commons and/or included in Wikisource.
- Uncommon research product: I am not sure what this is, but the developer of software services is recorded in Wikidata. For instance, for the neuroinformatics database OpenfMRI it is specified that Poldrack is the creator. Backlinks are possible with SPARQL queries.
- Grants, third party funding. Well there is a sponsor property but how it should be utilized for researchers is not clear. With the property, you can specify that paper or research project were funded by an entity. For the paper The Center for Integrated Molecular Brain Imaging (Cimbi) database you can see that it is funded by the Lundbeck Foundation and Rigshospitalet.
- Current institution: Yes. Employer and affiliation property is there for you. You can see an example of an incomplete list of people affiliated with research sections at my department, DTU Compute, here, – automagically generated by the Magnus Manske’s Listeria tool.
- Former employers, education etc.: Yes. There is a property for employer and for affiliation and for education. With qualifiers you can specify the dates of employment.
- Self assigned keywords: Well, as a Wikidata contributor you can create new items and you can use these items for specifying field of work of to label you paper with main theme.
- Concept from controlled vocabulary: Whether Wikidata is a controlled vocabulary is up for discussion. Wikidata items can be linked to controlled vocabularies, e.g., Dewey’s, so there you can get some controlness. For instance, the concept “engineer” in Wikidata is linked the BNCF, NDL, GND, ROME, LCNAF, BNF and FAST.
- Social graph of followers/friends: No, that is really not possible on Wikidata.
- Social graph of coauthors: Yes, that is possible. With Jonas Kress’ work on D3 enabling graph rendering you got on-the-fly graph rendering in the Wikidata Query Service. You can see my coauthor graph here (it is wobbly at the moment, there is some D3 parameter that need a tweak).
- Citation/attention metadata from platform itself: No, I don’t think so. You can get page view data from somewhere on the Wikimedia sites. You can also count the number of citations on-the-fly, – to an author, to a paper, etc.
- Citation/attention metadata from other sources: No, not really.
- Comprehensive search to match/include own papers: Well, perhaps not. Or perhaps. Magnus Manske’s sourcemd and quickstatement tools allow you to copy-paste a PMID or DOI in a form field press two buttons to grap bibliographic information from PubMed and a DOI source. One-click full paper upload is not well-supported, – to my knowledge. Perhaps Daniel Mietchen knows something about this.
- Forums, Q&A, etc.: Well, yes and no. You can use the discussion pages on Wikidata, but these pages are perhaps mostly for discussion of editing, rather than the content of the described item. Perhaps Wikiversity could be used.
- Deposit own papers: You can upload appropriately licensed papers to Wikimedia Commons or perhaps Wikisource. Then you can link them from Wikidata.
- Research administration tools: No.
- Reuse of data from outside the service: You better believe! Although Wikidata is there to be used, a mass download from the Wikidata Query Service can run into timeout problems. To navigate the structure of individual Wikidata item, you need programming skills, – at least for the moment. If you are really desperate you can download the Wikidata dump and Blazegraph and try to setup your own SPARQL server.
Wikicite is a term for the combination of bibliographic information and Wikidata. While Wikipedia often records books of some notability it rarely records bibliographic information of less notability, i.e., individual scientific articles and books where there little third-party information (reviews, literary analyses, etc.) exists. This is not the case with Wikidata. Wikidata is now beginning to record lots of bibliographic information for “lesser works”. What can we use this treasure trove for? Here are a few of my ideas:
- Wikidata may be used as a substitute for a reference manager. I record my own bibliographic information in a big BIBTeX file and use the bibtex program together with latex when I generate a scientific document with references. It might very well be that the job of the BIBTeX file with bibliographic information may be taken over by Wikidata. So far we have, to my knowledge, no proper program for extracting the data in Wikidata and formatting it for inclusion in a document. I have begun a “wibtex” program for this, and only reached 44 lines so far, and it remains to be seen whether this is a viable avenue, whether the structure of Wikidata is good and convenient enough to record data for formatting references or that Wikidata is too flexible or too restricted for this kind of application.
- Wikidata may be used for “list of publications” of individual researchers, institutions, research groups and sponsor. Nowadays, I keep a list of publication on a webpage, in a latex document and on Google Scholar. My university has a separate list and sometimes when I write a research application I need to format the data for inclusion in a Microsoft Word document. A flexible program on top of Wikidata could make dynamic lists of publications
- Wikidata may be used to count citations. During the Wikicite 2016 Berlin meeting I suggested the P2860 property and Tobias quickly created it. The P2860 allows us to describe citations between items in Wikidata. Though we managed to use the property a bit for scientific articles during the meeting, it has really been James Hare that has been running with the ball. Based on public citation data he has added hundreds of thousands of citations. At the moment this is of course only a very small part of the total number of citations. There are probably tens of millions of scientific papers with each having tens, if not hundreds of citations, of citations, so with the 499,750 citations that James Hare reported on 11 September 2016, we are still far from covering the field: James Hare tweeted that Web of Science claims to have over 1 milliard (billion) citations. The citation counts may be compared to a whole range of context data in Wikidata: author, affiliated institution, journal, year of publication, gender of author and sponsor (funding agency), so we can get, e.g., most cited Dane (or one affiliated with a Danish institution), most cited woman with an image, etc.
- Wikidata may be used as a hub for information sources. Individual scientific articles may point to further ressources, such as raw or result data. I myself have, for instance, added links to the neuroinformatics databases OpenfMRI, NeuroVault and Neurosynth, where Wikidata records all papers recorded in OpenfMRI, as far as I can determine. Wikidata is then able to list, say, all OpenfMRI papers or all OpenfMRI authors with Magnus Manske’s Listeria tool.
- Wikicite information in Wikidata may be used to support claims in Wikidata itself. As Dario Taraborelli points out this would allow queries like “all statements citing journal articles by physicists at Oxford University in the 1970s”.
- Wikidata may be used for other scientometrics analyses than counting, e.g, generation of coauthor graphs and cocitation graphs giving context to an author or paper. The bubble chart above shows statistics for journals of papers in OpenfMRI generated with the standard Wikidata Query Service bubble chart visualization tool.
- Wikidata could be used for citations in Wikipedia. This may very well be problematic, as a large Wikipedia article could have hundreds of references and each reference needs to be fetched from Wikidata generating lots of traffic. I tried a single citation on the “OpenfMRI” article (it has later been changed). Some form of inclusion of Wikidata identifier in Wikipedia references could further Wikipedia bibliometrics, e.g., determine the most cited author across all Wikipedias.