Wikicite

Find titles of all works published by DTU Cognitive Systems in 2017

Posted on Updated on

Find titles of all works published by DTU Cognitive Systems in 2017! How difficult can that be, to identify all titles of works from a research organization? With Wikidata and the Wikidata Query Service (WDQS) at hand it shouldn’t be that difficult to do. Nevertheless, I ran into a few hitches:

  1. There is what we can call the “Nathan Churchill Problem”: Nathan Churchill was at one point affiliated with our research section Cognitive Systems and wrote papers, e.g., together with our Morten Mørup. One paper clearly identifies him as affiliated with our section. Searching the DTU website yields no homepage for him though. He is now at St. Michael’s Hospital, Toronto according to a newer paper. So is he no longer affiliated with the Cognitive Systems section? That’s somewhat difficult to establish with credible and citable sources. If he is not, then any simple SPARQL query on the WDQS for Cognitive Systems papers will yield his new papers, which shouldn’t be counted as Cognitive Systems section papers. If we could point to a source that indicates whether his affiliation with our section has ended, we could add a qualifier to the P1416 property in his Wikidata entry and extend the SPARQL query. What I ended up doing was to explicitly filter out two of Churchill’s publications with the ugly line “FILTER(?work != wd:Q42595201 && ?work != wd:Q36384548)”. The problem is of course not confined to Churchill. For instance, Scholia currently lists new publications by our Søren Hauberg on the Scholia page for DIKU, a department where he was previously affiliated. We discussed the affiliation problem a bit in the Scholia paper, see page 253 (page 17).
  2. Datetime datatype conversion with xsd:dateTime. The filter on date is done with this line: “FILTER(?publication_datetime >= "2017-01-01"^^xsd:dateTime)”. Something like “FILTER(?publication_datetime >= xsd:dateTime(2017))” does not work.
  3. Missing data. It is difficult to establish how complete the Wikidata listing is for our section with respect to publications. Scraping Google Scholar, PubMed and our local university database of publications could be a possibility, but this is far from streamlined with the tools I have developed.
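The date filter from item 2 can be generated instead of typed by hand. Here is a minimal Python sketch, assuming we want both a lower and an upper bound so that the filter covers exactly one calendar year (the query in this post only uses the lower bound). The function name `year_filter` is made up for illustration, and the literals use the full xsd:dateTime lexical form with a time part.

```python
def year_filter(variable, year):
    """Return a SPARQL FILTER restricting `variable` to one calendar year.

    Hypothetical helper: the bounds are typed literals, since casting a
    bare integer with xsd:dateTime(2017) does not work.
    """
    start = f'"{year:04d}-01-01T00:00:00Z"^^xsd:dateTime'
    end = f'"{year + 1:04d}-01-01T00:00:00Z"^^xsd:dateTime'
    return f"FILTER({variable} >= {start} && {variable} < {end})"


print(year_filter("?publication_datetime", 2017))
```

The upper bound only matters once works dated 2018 or later show up in Wikidata.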

The full query is listed below and the result is available from this link. Currently, 48 results are returned.

#defaultView:Table
SELECT ?workLabel 
WITH {
  SELECT 
    ?work (MIN(?publication_datetime) AS ?datetime)
  WHERE {
    # Find CogSys work
    ?researcher wdt:P108 | wdt:P463 | wdt:P1416/wdt:P361* wd:Q24283660 .
    ?work wdt:P50 ?researcher .
    ?work wdt:P31 wd:Q13442814 .
    
    # Nathan Churchill seems no longer to be affiliated!?
    FILTER(?work != wd:Q42595201 && ?work != wd:Q36384548)
    
    # Filter to year 2017
    ?work wdt:P577 ?publication_datetime .
    FILTER(?publication_datetime >= "2017-01-01"^^xsd:dateTime)
  }
  GROUP BY ?work 
} AS %results
WHERE {
  INCLUDE %results
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,da,de,es,fr,ja,nl,ru,zh". }
}
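If you want the titles outside the WDQS web interface, a small script can do the job. The sketch below only builds the request URL and unpacks a response in the standard SPARQL JSON results format. It does not contact the server, and the sample response with its two titles is made up. The helper names `build_wdqs_url` and `extract_labels` are my own for this example.

```python
import json
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_wdqs_url(query):
    """Return a GET URL asking the WDQS for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def extract_labels(result, variable="workLabel"):
    """Pull the values of one variable out of a SPARQL JSON result."""
    bindings = result["results"]["bindings"]
    return [row[variable]["value"] for row in bindings if variable in row]


# Hypothetical response, shaped like the SPARQL 1.1 JSON results format.
SAMPLE = json.loads("""
{
  "head": {"vars": ["workLabel"]},
  "results": {"bindings": [
    {"workLabel": {"type": "literal", "xml:lang": "en", "value": "Hypothetical title A"}},
    {"workLabel": {"type": "literal", "xml:lang": "en", "value": "Hypothetical title B"}}
  ]}
}
""")

for title in extract_labels(SAMPLE):
    print(title)
```

An actual fetch would pass the URL from `build_wdqs_url` to an HTTP client of your choice.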


Can you scrape Google Scholar?


With the WikiCite project, the bibliographic information on Wikidata is increasing rapidly, with Wikidata describing 9.3 million scientific articles and 36.6 million citations. As far as I can determine most of the work is currently done by James Hare and Daniel Mietchen. Mietchen’s Research Bot has over 11 million edits on Wikidata while Hare has 15 million edits. For entering data into Wikidata from PubMed you can basically walk your way through the PMIDs starting with “1” with the Fatameh tool. Hare’s reference work can take advantage of a web service provided by the National Institutes of Health. For instance, a URL such as https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pmc&linkname=pmc_refs_pubmed&retmode=json&id=5585223 will return a JSON-formatted result with citation information. This specific URL is apparently what Hare used to set up P2860 citation information in Wikidata, see, e.g., https://www.wikidata.org/wiki/Q41620192#P2860. CrossRef may be another resource.
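The URL above is easy to build for other articles. A minimal sketch, assuming the `id` parameter is the numeric PubMed Central identifier as in the example. The function name `pmc_references_url` is made up for illustration, and nothing is fetched here.

```python
from urllib.parse import urlencode

ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def pmc_references_url(pmc_id):
    """Build the elink URL listing the PubMed references of a PMC article."""
    params = {
        "dbfrom": "pmc",
        "linkname": "pmc_refs_pubmed",
        "retmode": "json",
        "id": str(pmc_id),
    }
    return ELINK + "?" + urlencode(params)


print(pmc_references_url(5585223))
```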

Beyond these resources, we could potentially use Google Scholar. A former terms of service/EULA of Google Scholar stated that: “You shall not, and shall not allow any third party to: […] (j) modify, adapt, translate, prepare derivative works from, decompile, reverse engineer, disassemble or otherwise attempt to derive source code from any Service or any other Google technology, content, data, routines, algorithms, methods, ideas design, user interface techniques, software, materials, and documentation; […] “crawl”, “spider”, index or in any non-transitory manner store or cache information obtained from the Service (including, but not limited to, Results, or any part, copy or derivative thereof); (m) create or attempt to create a substitute or similar service or product through use of or access to any of the Service or proprietary information related thereto”. Here, “create or attempt to create a substitute or similar service” is a stopping point.

The Google Scholar terms document now seems to have been superseded by the all-embracing Google Terms of Service. This document seems less restrictive: “Don’t misuse our Services” and “You may not use content from our Services unless you obtain permission from its owner or are otherwise permitted by law.” So it may or may not be ok to crawl and/or use/republish the data from Google Scholar. See also a StackExchange question and another StackExchange question.

The Google robots.txt limits automated access with the following relevant lines:

Disallow: /scholar
Disallow: /citations?
Allow: /citations?user=
Disallow: /citations?*cstart=
Allow: /citations?view_op=new_profile
Allow: /citations?view_op=top_venues
Allow: /scholar_share

“/citations?user=” means that a bot is allowed to access the user profiles. Google Scholar user identifiers may be recorded in Wikidata with a dedicated property, so you could automatically access Google Scholar user profiles from the information in Wikidata.
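The reading of the robots.txt lines can be checked with a few lines of Python. This is a sketch under assumptions: the most specific (longest) matching rule wins with ties going to “Allow”, “*” is a wildcard, and the listed lines apply to our bot. The function names and test paths are mine, and the standard library’s urllib.robotparser is deliberately not used since, as far as I know, it takes the first matching rule in file order and does not expand wildcards.

```python
import re

# The rules quoted above, as (directive, pattern) pairs.
RULES = [
    ("disallow", "/scholar"),
    ("disallow", "/citations?"),
    ("allow", "/citations?user="),
    ("disallow", "/citations?*cstart="),
    ("allow", "/citations?view_op=new_profile"),
    ("allow", "/citations?view_op=top_venues"),
    ("allow", "/scholar_share"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Longest matching rule decides; no matching rule means allowed."""
    best_length, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            if len(pattern) > best_length or (len(pattern) == best_length and allow):
                best_length, allowed = len(pattern), allow
    return allowed


for path in ["/citations?user=gQVuJh8AAAAJ",
             "/citations?user=gQVuJh8AAAAJ&cstart=20",
             "/scholar?q=wikidata"]:
    print(path, is_allowed(path))
```

Under these assumptions the profile page itself is allowed, while a profile URL with a “cstart=” paging parameter and the ordinary “/scholar” search pages are not.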

So if there is some information you can get from Google Scholar, is it worth it?

The Scholia code now adds a googlescholar.py module with some preliminary Google Scholar processing attempts. There is command-line based scraping of a researcher profile. For instance,

python -m scholia.googlescholar get-user-data gQVuJh8AAAAJ

It ain’t working too well. As far as I can determine you need to page with JavaScript to get more than the initial 20 results (it would be interesting to examine the Publish or Perish software to see how a larger set of results is obtained). Not all bibliographic metadata is available for each item on the Google Scholar page. As far as I can see: No DOI. No PubMed identifier. The author list may be abbreviated with an ellipsis (‘…’). Matching a Google Scholar item with data already present in Wikidata does not seem that straightforward.

It is worth remembering that Wikidata has the P4028 property to link to Google Scholar articles. There ain’t many items using it yet though: 31. It was suggested by Vladimir Alexiev back in May 2017, but it seems that I am the only one using the property. Bot access to the link target provided by P4028 is – as far as I can see from the robots.txt – not allowed.

Do we have a final schema for Wikicite?


No, Virginia, we do not have a final schema for Wikicite IMHO.

Wikicite is a project that focuses on sources in the Wikimedia universe. Currently, the most active part of Wikicite is the setup of bibliographic data from scientific articles in Wikidata with the tools of Magnus Manske, the Fatameh duo and the GeneWiki people; in particular James Hare, Daniel Mietchen and Magnus Manske have been active in automatic and semi-automatic setup of data. Jakob Voß’ statistics say we have – as of mid-October 2017 – metadata from almost 10 million publications in Wikidata and over 36 million recorded citations between the described works.

Given that so many bibliographic items have been set up in Wikidata it may be worth asking whether we actually have a schema for the setup of this data. While we surely have a sort-of convention that tools and editors follow, it is not complete and probably up for change.

Here are some Wikicite-related schema issues:

  1. What instance is a scientific article? Most tools use instance of Q13442814, currently “scientific article” in English. But what is this? In English “scientific” means something different than the usual translation into Danish (“videnskabelig”) or German (“wissenschaftlicher”), and these words are used in the labels of Q13442814. “Scientific” usually only entails natural science, leaving out social science and the humanities (while “videnskabelig”/“wissenschaftlicher” entails social science and humanities too). An attempt to fix this problem is to call these articles “scholarly articles”. It is interesting to think that one of the most used classes in Wikidata – if not the most used class – has a language ambiguity. I see no reason to restrict Q13442814 to only the English sense of science. It is all too difficult to distinguish between scientific disciplines: Think of computational humanities.
  2. What about the ontology of scientific work? Currently, Q13442814 is set as a subclass of academic journal articles, but this is not how we use it, as conference articles in proceedings are set to Q13442814. Is a so-called abstract a “scientific article”? “Abstracts” appear, e.g., in neuroimaging conferences, where they are fully referenceable items published in proceedings or supplementary journal issues.
  3. What are the instances of scientific article in Wikidata describing? A work or an edition? What happens if the article is reprinted (it happens to important work)? Should we then create a new item? Or amend the old item? If we create a new item then how should we link the two? Should we create a third item as a work item? Should items in preprint archives have their own item? Should that issue depend on whether the preprint version and the canonical version are more or less the same?
  4. How do we represent the language of an article? There are two generally used properties: original language of work and language of the work. There is a discussion about deleting one of them.
  5. How do we represent an author? Today an author can be linked to the article via the P50 property. However, the author label may be different from the name written in the article (we may refer to this issue as the “Natalie Portman Problem” as she published a scientific article as “Natalie Hershlag”). P1932 as a qualifier to P50 may be used to capture the way that the name is represented in the article, which is a possible solution. Recently, Manske’s author name resolver has started to copy the short author name to the qualifier under P50. For referencing, there is still the problem that the referencing software would likely need to determine the surname, and this is not trivial for authors with suffixes and Spanish authors with multiple surnames.
  6. How do we record the affiliation of a paper? Publicly funded universities and other research entities would like to make statistics on, for instance, the paper production, but this is not possible to do precisely with today’s Wikidata as papers are usually not affiliated with organizations, only indirectly through the author affiliation. And the author affiliation might change as the author moves between different institutions. We already noted this problem in the first article we wrote about Scholia.
  7. How do we record the type of scientific publication? There are various subtypes, e.g., systematic review, original article, erratum, “letter”, etc. Or the state of the article: submitted, under review, peer-reviewed, not peer-reviewed. The “genre” and the “instance of” properties can be used, but I have seen no ruling convention.
  8. How do we record what software and which datasets have been used in the article, e.g., for digital preservation? Currently, we are using “used” (P2283). But should we have dedicated properties, e.g., “uses software”? Do we have a schema for datasets and software?
  9. How do we record the formatting of the title, e.g., case? Bibliographic reference management software may choose to capitalize some words. In BibTeX you have the possibility to format the title using LaTeX commands. Detailed formatting of titles in Wikidata is currently not done, and I do not believe we have dedicated properties to handle such cases.
  10. How do we manage journals that change titles? For instance, for BMJ we have several items covering the name changes: Q546003, Q15746654, and Q28464921. Is this how we should do it? There is the P156 property to connect subsequent versions.
  11. How should we handle series of conference proceedings? A particular article can be “published in” a proceedings and such a proceedings may be part of a “series” that is a “conference proceedings series”. However, according to my recollection some/one(?) Wikidata bot may link articles directly as “published in” the conference proceedings series: they can have ISSNs and look like ordinary scientific journals.
  12. When is an article published? You have a number of publishers setting a formal publication date in the future for an article that is actually published prior to that formal date. In Wikidata there is to my knowledge only a single property for publication date. Preprints yield other publication dates.
  13. A minor issue is P820, arXiv classification. According to documentation it should be used as a qualifier to P818, the arXiv identifier property. Embarrassingly, I overlooked that and the Scholia arXiv extraction program and Quickstatement generator outputs it/them as a proper property.
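For item 13, this is roughly what the corrected output could look like. It is a sketch, assuming the tab-separated QuickStatements format where qualifier pairs follow the main statement on the same line. The function name is made up and is not the actual Scholia code, the item is the Wikidata sandbox item, and the arXiv identifier and classification are invented values.

```python
def arxiv_quickstatement(item, arxiv_id, classification):
    """Return one QuickStatements line: P818 with P820 as its qualifier.

    Hypothetical helper, not the Scholia implementation.
    """
    fields = [item, "P818", f'"{arxiv_id}"', "P820", f'"{classification}"']
    return "\t".join(fields)


# Q4115189 is the Wikidata sandbox item; the other values are made up.
print(arxiv_quickstatement("Q4115189", "1234.56789", "cs.DL"))
```

The buggy variant would instead emit two separate lines, one for P818 and one for P820, each as a statement on the item.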

Update:

Do we have a schema for datasets and software? Well, yes, Virginia. For software Katherine Thornton & Co. have produced Modeling the Domain of Digital Preservation in Wikidata.

Some statistics on scholarly data in Wikidata


The Wikicite initiative has spawned a lot of work on bibliographic/source information in Wikidata. Particularly scholarly bibliographic information has been added to Wikidata. Recently James Hare announced that we have over 3 million citations recorded in Wikidata, mostly due to automated additions made by Hare himself.

With the tools of Magnus Manske and James Hare that are presently central to the growth of scholarly bibliographic data on Wikidata, we do not get a direct link to the author items of Wikidata. Such information presently needs to be added manually or in a semi-automated fashion. Sponsor/funding information is not added automatically either, except for a US organization where James Hare added this information.

So how much data do we have in Wikidata when we ask if the data is linked to other Wikidata items? Below are a few queries to the Wikidata Query Service that attempt to answer some aspects of this question.

Scientific articles

How many items do we have in Wikidata that describe a scientific article and that are linked to an author item?

SELECT (COUNT(DISTINCT ?work) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
}

The query returns 45’253.

How many scientific articles are there with one or more author items and no author name string (indicating that the author linking may be complete)?

SELECT (COUNT(DISTINCT ?work) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  FILTER NOT EXISTS { ?work wdt:P2093 ?authorname }
}

This query gives 3’567.

How many items do we have in Wikidata that are claimed to be a scientific article?

SELECT (COUNT(DISTINCT ?work) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
}

This query gives 677’630.
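The three counts are easier to judge as fractions. A small worked computation with the numbers reported above (the dictionary keys are my own labels):

```python
# Counts reported by the three queries above.
counts = {
    "scientific articles": 677630,
    "with at least one author item": 45253,
    "with author items and no author name string": 3567,
}


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


total = counts["scientific articles"]
for label, count in counts.items():
    print(f"{label}: {count} ({share(count, total):.2f} %)")
```

So roughly 7 percent of the scientific article items link to at least one author item, and only about half a percent look completely author-linked.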

Scientific authors

How many authors are in Wikidata that have written a scientific article?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
}

The query returns 10’193.

How many authors are in Wikidata that have written a scientific article and where the gender is indicated?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?author wdt:P21 ?gender .
}

This query gives 8’853.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
}

This query returns 6’586.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
}

This query returns 5’614.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items and where the genders of both the citing and the cited author are known?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
  ?author wdt:P21 ?gender .
  ?cited_author wdt:P21 ?cited_gender .
}

This query gives 4’730.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items and where the genders of both the citing and the cited author are known and where there is no author name string in either the work or the cited work (indicating that the work and the cited work may be completely linked with respect to author name)?

SELECT (COUNT(DISTINCT ?author) AS ?count)
WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
  ?author wdt:P21 ?gender .
  ?cited_author wdt:P21 ?cited_gender .
  FILTER NOT EXISTS { ?work wdt:P2093 ?authorname }
  FILTER NOT EXISTS { ?cited_work wdt:P2093 ?cited_authorname }
}

This query gives only 551.
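The author queries above differ only in which patterns are stacked on top of each other, so they can be generated from one list. A sketch of that idea in Python; the function name `count_query` is made up, and the script only builds the query text.

```python
# The triple patterns and filters used in the author queries above.
PATTERNS = [
    "?work wdt:P31 wd:Q13442814 .",
    "?work wdt:P50 ?author .",
    "?work wdt:P2860 ?cited_work .",
    "?cited_work wdt:P50 ?cited_author .",
    "?author wdt:P21 ?gender .",
    "?cited_author wdt:P21 ?cited_gender .",
    "FILTER NOT EXISTS { ?work wdt:P2093 ?authorname }",
    "FILTER NOT EXISTS { ?cited_work wdt:P2093 ?cited_authorname }",
]


def count_query(patterns, variable="?author"):
    """Compose a COUNT(DISTINCT ...) query from a list of patterns."""
    body = "\n".join("  " + pattern for pattern in patterns)
    return (f"SELECT (COUNT(DISTINCT {variable}) AS ?count)\n"
            f"WHERE {{\n{body}\n}}")


# The first two patterns give the plain author count,
# the full list gives the most restrictive query.
print(count_query(PATTERNS[:2]))
print(count_query(PATTERNS))
```

The gender-only query is the one combination that is not a prefix of the list: it corresponds to `PATTERNS[:2] + [PATTERNS[4]]`.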

Sponsor/funders

Sponsors of scientific articles ordered by number of citations.

SELECT ?number_of_citations ?sponsorLabel
WITH {
  SELECT (COUNT(?citing_work) AS ?number_of_citations) ?sponsor
  WHERE {
    ?work wdt:P859 ?sponsor .
    ?work wdt:P31 wd:Q13442814 .
    ?citing_work wdt:P2860 ?work .
  }
  GROUP BY ?sponsor
} AS %result
WHERE {
  INCLUDE %result
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?number_of_citations)
LIMIT 5

This query gives National Institute for Occupational Safety and Health, Lundbeck Foundation, The Danish Council for Strategic Research, National Institute of Allergy and Infectious Diseases, University of Wisconsin–Madison.

The Wikidata scholarly profile page


[Figure: coauthor graph (my_coauthors)]

Recently Lambert Heller wrote an overview piece on websites for scholarly profile pages: “What will the scholarly profile page of the future look like? Provision of metadata is enabling experimentation”. There he tabulated the features of the various online sites having scholarly profile pages. These sites include (with links to my entries): ORCID, ResearchGate, Mendeley, Pure and VIVO (don’t know these two), Google Scholar and Impactstory. One site missing from the equation is Wikidata. It can produce scholarly profile pages too. The default Wikidata editing interface may not present the data in a nice way (Magnus Manske’s Reasonator does better), but very much of the functionality is there to make a scholarly profile page.

In terms of the features listed by Heller, I will here list the possible utilization of Wikidata:

  1. Portrait picture: The P18 property can record Wikimedia Commons image related to a researcher. For instance, you can see a nice photo of neuroimaging professor Russ Poldrack.
  2. Researcher’s alternative names: This is possible with the alias functionality in Wikidata. Poldrack is presently recorded with the canonical label “Russell A. Poldrack” and the alternative names “Russell A Poldrack”, “R. A. Poldrack”, “Russ Poldrack” and “R A Poldrack”. It is straightforward to add more variations.
  3. IDs/profiles in other systems: There are absolutely loads of these links in Wikidata. To name a few deep linking possibilities: Twitter, Google Scholar, VIAF, ISNI, ORCID, ResearchGate, GitHub and Scopus. Wikidata is very strong in interlinking databases.
  4. Papers and similar: Papers are represented as items in Wikidata and these items can link to the author via P50. The reverse link is possible with a SPARQL query. Furthermore, on the researcher’s item it is possible to list main works with the appropriate property. Full texts can be linked with the P953 property. PDFs of papers with an appropriate compatible license can be uploaded to Wikimedia Commons and/or included in Wikisource.
  5. Uncommon research product: I am not sure what this is, but the developer of software services is recorded in Wikidata. For instance, for the neuroinformatics database OpenfMRI it is specified that Poldrack is the creator. Backlinks are possible with SPARQL queries.
  6. Grants, third party funding: Well, there is a sponsor property but how it should be utilized for researchers is not clear. With the property, you can specify that a paper or research project was funded by an entity. For the paper The Center for Integrated Molecular Brain Imaging (Cimbi) database you can see that it is funded by the Lundbeck Foundation and Rigshospitalet.
  7. Current institution: Yes. The employer and affiliation properties are there for you. You can see an example of an incomplete list of people affiliated with research sections at my department, DTU Compute, here, automagically generated by Magnus Manske’s Listeria tool.
  8. Former employers, education etc.: Yes. There is a property for employer and for affiliation and for education. With qualifiers you can specify the dates of employment.
  9. Self assigned keywords: Well, as a Wikidata contributor you can create new items and you can use these items for specifying your field of work or to label your paper with a main theme.
  10. Concept from controlled vocabulary: Whether Wikidata is a controlled vocabulary is up for discussion. Wikidata items can be linked to controlled vocabularies, e.g., Dewey’s, so there you can get some degree of control. For instance, the concept “engineer” in Wikidata is linked to the BNCF, NDL, GND, ROME, LCNAF, BNF and FAST.
  11. Social graph of followers/friends: No, that is really not possible on Wikidata.
  12. Social graph of coauthors: Yes, that is possible. With Jonas Kress’ work on D3 enabling graph rendering, you get on-the-fly graph rendering in the Wikidata Query Service. You can see my coauthor graph here (it is wobbly at the moment, there is some D3 parameter that needs a tweak).
  13. Citation/attention metadata from platform itself: No, I don’t think so. You can get page view data from somewhere on the Wikimedia sites. You can also count the number of citations on-the-fly, – to an author, to a paper, etc.
  14. Citation/attention metadata from other sources: No, not really.
  15. Comprehensive search to match/include own papers: Well, perhaps not. Or perhaps. Magnus Manske’s sourcemd and quickstatement tools allow you to copy-paste a PMID or DOI into a form field and press two buttons to grab bibliographic information from PubMed and a DOI source. One-click full paper upload is not well supported, to my knowledge. Perhaps Daniel Mietchen knows something about this.
  16. Forums, Q&A, etc.: Well, yes and no. You can use the discussion pages on Wikidata, but these pages are perhaps mostly for discussion of editing, rather than the content of the described item. Perhaps Wikiversity could be used.
  17. Deposit own papers: You can upload appropriately licensed papers to Wikimedia Commons or perhaps Wikisource. Then you can link them from Wikidata.
  18. Research administration tools: No.
  19. Reuse of data from outside the service: You better believe it! Although Wikidata is there to be used, a mass download from the Wikidata Query Service can run into timeout problems. To navigate the structure of individual Wikidata items, you need programming skills, at least for the moment. If you are really desperate you can download the Wikidata dump and Blazegraph and try to set up your own SPARQL server.
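The coauthor graph from item 12 boils down to pairing up the authors of each work. A minimal Python sketch of that step, assuming we already have a mapping from works to their P50 authors, e.g., from a SPARQL query. The work and author names below are placeholders, and `coauthor_edges` is a made-up name.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(work_authors):
    """Count, for each pair of authors, the number of shared works."""
    edges = Counter()
    for authors in work_authors.values():
        # Sorting makes (a, b) and (b, a) the same undirected edge.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


# Placeholder data standing in for work/author pairs from Wikidata.
works = {
    "work 1": ["author A", "author B", "author C"],
    "work 2": ["author A", "author B"],
    "work 3": ["author C"],
}

for (first, second), weight in sorted(coauthor_edges(works).items()):
    print(first, "--", second, weight)
```

The edge weights could then be handed to whatever graph drawing tool is at hand.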

 

So what can we use Wikicite for?


[Figure: bubble chart of journal statistics for OpenfMRI papers (openfmri-journal-statistics-2016-09-19)]

Wikicite is a term for the combination of bibliographic information and Wikidata. While Wikipedia often records books of some notability, it rarely records bibliographic information of less notability, i.e., individual scientific articles and books where little third-party information (reviews, literary analyses, etc.) exists. This is not the case with Wikidata. Wikidata is now beginning to record lots of bibliographic information for “lesser works”. What can we use this treasure trove for? Here are a few of my ideas:

  1. Wikidata may be used as a substitute for a reference manager. I record my own bibliographic information in a big BibTeX file and use the bibtex program together with latex when I generate a scientific document with references. It might very well be that the job of the BibTeX file with bibliographic information may be taken over by Wikidata. So far we have, to my knowledge, no proper program for extracting the data in Wikidata and formatting it for inclusion in a document. I have begun a “wibtex” program for this, and only reached 44 lines so far. It remains to be seen whether this is a viable avenue: whether the structure of Wikidata is good and convenient enough to record data for formatting references, or whether Wikidata is too flexible or too restricted for this kind of application.
  2. Wikidata may be used for “list of publications” of individual researchers, institutions, research groups and sponsors. Nowadays, I keep a list of publications on a webpage, in a latex document and on Google Scholar. My university has a separate list and sometimes when I write a research application I need to format the data for inclusion in a Microsoft Word document. A flexible program on top of Wikidata could make dynamic lists of publications.
  3. Wikidata may be used to count citations. During the Wikicite 2016 Berlin meeting I suggested the P2860 property and Tobias quickly created it. P2860 allows us to describe citations between items in Wikidata. Though we managed to use the property a bit for scientific articles during the meeting, it has really been James Hare that has been running with the ball. Based on public citation data he has added hundreds of thousands of citations. At the moment this is of course only a very small part of the total number of citations. There are probably tens of millions of scientific papers, each having tens, if not hundreds, of citations, so with the 499,750 citations that James Hare reported on 11 September 2016, we are still far from covering the field: James Hare tweeted that Web of Science claims to have over 1 milliard (billion) citations. The citation counts may be compared to a whole range of context data in Wikidata: author, affiliated institution, journal, year of publication, gender of author and sponsor (funding agency), so we can get, e.g., most cited Dane (or one affiliated with a Danish institution), most cited woman with an image, etc.
  4. Wikidata may be used as a hub for information sources. Individual scientific articles may point to further resources, such as raw or result data. I myself have, for instance, added links to the neuroinformatics databases OpenfMRI, NeuroVault and Neurosynth, where Wikidata records all papers recorded in OpenfMRI, as far as I can determine. Wikidata is then able to list, say, all OpenfMRI papers or all OpenfMRI authors with Magnus Manske’s Listeria tool.
  5. Wikicite information in Wikidata may be used to support claims in Wikidata itself. As Dario Taraborelli points out this would allow queries like “all statements citing journal articles by physicists at Oxford University in the 1970s”.
  6. Wikidata may be used for scientometric analyses other than counting, e.g., generation of coauthor graphs and cocitation graphs giving context to an author or paper. The bubble chart above shows statistics for journals of papers in OpenfMRI generated with the standard Wikidata Query Service bubble chart visualization tool.
  7. Wikidata could be used for citations in Wikipedia. This may very well be problematic, as a large Wikipedia article could have hundreds of references and each reference needs to be fetched from Wikidata generating lots of traffic. I tried a single citation on the “OpenfMRI” article (it has later been changed). Some form of inclusion of Wikidata identifier in Wikipedia references could further Wikipedia bibliometrics, e.g., determine the most cited author across all Wikipedias.
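To make the reference manager idea in item 1 a bit more concrete, here is a sketch of the formatting step: turning a dictionary of bibliographic fields, as one might assemble from a Wikidata item, into a BibTeX entry. This is not the “wibtex” code. The function name, the field selection and the example entry are all made up, and the title is wrapped in an extra pair of braces on the assumption that we want BibTeX styles to leave its capitalization alone.

```python
def to_bibtex(key, entry):
    """Format a dictionary of bibliographic fields as a BibTeX @article."""
    lines = []
    if "author" in entry:
        lines.append("  author = {" + " and ".join(entry["author"]) + "}")
    if "title" in entry:
        # Double braces protect the case of the title.
        lines.append("  title = {{" + entry["title"] + "}}")
    for name in ("journal", "year", "doi"):
        if name in entry:
            lines.append("  " + name + " = {" + str(entry[name]) + "}")
    return "@article{" + key + ",\n" + ",\n".join(lines) + "\n}"


# Made-up entry standing in for data extracted from a Wikidata item.
example = {
    "author": ["Ann Author", "Bob Builder"],
    "title": "A hypothetical title about fMRI",
    "journal": "Journal of Examples",
    "year": 2016,
}

print(to_bibtex("author2016", example))
```

The harder part, getting the author names and the title formatting right from the Wikidata statements, is exactly where the open schema questions bite.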