HACK4DK 2017

HACK4DK is an annual event in Copenhagen that brings together cultural nerds and computer nerds to build interesting things with cultural data. I have participated since the very beginning, including this year’s HACK4DK, which took place at ENIGMA, a to-be museum in Østerbro, Copenhagen.

The winning project among the roughly 19 projects this year was Tin Toy, a neat augmented reality application using images from the toy collection of Holstebro Museum. I believe they used the AR.js JavaScript library. There is a YouTube video that attempts to capture the attractiveness of the project.

The result of my struggles with the A-Frame JavaScript library is available online. Under the name “Virtual Gallery of Denmark”, it was supposed to be a virtual reality environment presenting Danish art. The end result became a somewhat less dynamic but meditative environment, with textured panels flying around in a virtual environment and with sound from old re-recorded phonograph recordings in the Ruben Collection, made available by the Royal Library in Aarhus.


I did not rely on the data provided at the event, but used data from cultural institutions that was already uploaded to Wikimedia Commons and whose metadata was described on Wikidata. Both the images of the paintings (which were from Skagens Museum) and the sound were available on Wikimedia Commons and well-annotated on Wikidata.

The images were fetched with SPARQL queries to the Wikidata Query Service and API calls to the Wikimedia Commons API. As such, it is fairly easy to change the virtual environment to use other files, which I did afterwards: The Giersing-Bach-Ishizaka-Nielsen virtual environment uses images on Wikimedia Commons where Wikidata records the artist as being Harald Giersing. Here the sound is from Kimiko Ishizaka’s Open Goldberg Variations project.
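
As a rough illustration of that kind of lookup, the Python sketch below builds a Wikidata Query Service request for works by a given creator that have an image, and a Wikimedia Commons API request for a scaled image URL. It is not the code behind the virtual environments: the function names, the default width and the placeholder QID in the usage note are assumptions for illustration. Only the Wikidata properties P170 (creator) and P18 (image) and the two public endpoints are taken as given.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def painting_image_query(creator_qid, limit=50):
    """Return SPARQL selecting works by a creator (P170) that have an image (P18)."""
    return (
        "SELECT ?work ?image WHERE {\n"
        f"  ?work wdt:P170 wd:{creator_qid} .\n"
        "  ?work wdt:P18 ?image .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def wdqs_url(sparql):
    """Return a GET URL that asks the Wikidata Query Service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


def commons_thumbnail_url(filename, width=1024):
    """Return a Commons API URL that reports a scaled-down URL for one file."""
    params = {
        "action": "query",
        "titles": "File:" + filename,
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": width,
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)
```

Calling `wdqs_url(painting_image_query("Q0000"))`, with `Q0000` replaced by the artist's real Wikidata identifier, gives a URL that can be fetched with any HTTP client. Requesting a smaller `width` is one possible way to lower the download size of the textures.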


While A-Frame models are supposed to run straight from the web browser on smartphones, my models seem to have hefty hardware requirements, since the images have quite high resolutions. It takes over 10 seconds on my computer to download all the image and sound files associated with the models. Nevertheless, with a strong computer, a big screen and good headphones, it is quite interesting to view and hear as the paintings and sound fly by.


Danish words for snow

According to Laura Martin, Franz Boas may have been the first to point to the relative richness of Eskimo words for snow (“Eskimo Words for Snow”: A Case Study in the Genesis and Decay of an Anthropological Example. American Anthropologist, 88(2):418). Boas listed aput, qana, piqsirpoz, and qimuqsuq. English may have snow, hail, sleet, ice, icicle, slush, and snowflake, as listed in the English Wikipedia article on Eskimo words for snow. There seem to be more than that, e.g., firn. Danish is not (as) polysynthetic as Eskimo, but it has lots of compounds, which makes it possible to create a good number of words for snow. Most of these words derive from sne and is.

Update 2017-09-13: Added skosse.

Word | Translation | Explanation
bræ | | large mass of ice
bundis | | ice at the bottom of the ocean/sea
drivis | | ice floating on the water, either “havis” or “søis”
firn | firn | snow older than a year
flodis | | ice from a river
fnug | snowflake |
frostsne | | snow below freezing, as opposed to tøsne
fygesne | drifting snow |
gletsjer/gletscher | glacier |
gletscheris | | ice in/from a glacier
grå is | | first stage of “ungis”, according to DMI
gråhvid is | | second stage of “ungis”, according to DMI
hagl | hail | precipitation of small pellets of ice
haglkorn | hailstone | small pellet of ice
iglo/igloo | igloo |
havis | sea ice | ice in the ocean/sea
indlandsis | | Indlandsisen is the big “iskappe” in Greenland
is | ice | frozen water that is (usually) transparent
isbarriere | | the edge of an “isshelf”, according to DMI
isblok | | block of ice
isflade | | sheet of ice
isflage | floe |
isfront | | the edge of an “isshelf”
isfod | | ice frozen to the coast or (second meaning) the ice below the water
iskalot | | ice-covered area near the poles
iskant | | the edge of a floe
iskappe | ice cap | very large connected mass of snow, e.g., the one in Greenland
iskorn | | see also “kornsne”
isbræ | | large mass of ice, the same as “bræ”
islag | | layer of ice, not the same as “isslag”
isrand | | the edge of a floe
isshelf | | floating glacier
isskorpe | | layer of ice on top of water or snow
isskruning | ice pack |
isslag | glaze, black ice, freezing rain | raindrops below freezing that become ice when hitting the ground or a structure
isstykke | | a piece of ice
istap | icicle |
isterning | ice cube |
isvand | ice water | water with ice in it, usually for drinking
julesne | Christmas snow | snow falling or lying during Christmas
kunstsne | artificial snow | snow artificially made
lavine | avalanche |
nysne | | snow that has recently fallen, as opposed to firn
pakis | | “drivis” with a high concentration, according to DMI
polaris | | sea ice that has survived at least one summer melting
puddersne | powder snow | light snow
rim | hard rime | “white ice that forms when the water droplets in fog freeze to the outer surfaces of objects”, according to English Wikipedia
sjap | slush |
sjapis | slush ice |
slud | sleet | a mixture of rain and falling snow
sne | snow | used about falling snow and snow on the ground
snebold | snowball | snow formed as a ball, often used to throw in a snowball fight
snebunke | | pile of snow
snebyge | snow shower |
snedrive | snowdrift |
snedrys | | small amount of precipitation of snow
snedække | | layer/cover of snow
snefnug | snowflake |
snefog | snowdrift |
snefygning | | snow in strong wind
snehule | | snow formed as a cave for fun or survival, see also “igloo”
snehytte | | more or less the same as an “iglo”
snekorn | snow grain |
snelag | | layer of snow
snemand | snowman | snow formed as a sculpture of a human
snemark | | field of snow
snemasse | | mass of snow
snesjap | slush |
sneskred | avalanche | snow falling down a slope
snestorm | snowstorm |
snevejr | snow | weather with falling snow
tyndis | thin ice |
sjap | sleet |
søis | “lake ice” |
tøris | | (“tøris” is usually “dry ice”)
tøsne | melting snow | snow that is melting
ungis | | sea ice between “tyndis” and “vinteris”, according to DMI

Some information about Scholia


Scholia is mostly a web service, developed on GitHub in an open source fashion. It was inspired by discussions at the WikiCite 2016 meeting in Berlin. Anyone can contribute as long as their contribution is under the GPL.

I started writing the Scholia code back in October 2016, according to the initial commit. Since then, Daniel Mietchen and Egon Willighagen in particular have joined in, and Egon has lately been quite active.

Users can download the code and run the web service from their own computer if they have a Python Flask development environment. Otherwise, anyone with an Internet connection should be able to view the canonical Scholia web site.

So what does Scholia do? The initial “application” was a “static” web page with a researcher profile/CV of myself based on data extracted from Wikidata, and it is still available. I also added a static page for my research section, DTU Cognitive Systems, showing scientific paper production and a coauthor graph. It is available as well.

The Scholia web application was an extension of these initial static pages, so that a profile page for any researcher or any organization could be made on the fly. Profile pages now exist not just for authors and organizations, but also for works, venues (journals or proceedings), series, publishers, sponsors (funders) and awards. We also have “topics” and individual pages showing specialized information about chemicals, proteins, diseases and biological pathways. A rudimentary search interface is implemented.

The content of the Scholia web pages, with plots and tables, is made from queries to the Wikidata Query Service, the extended SPARQL endpoint provided by the Wikimedia Foundation. We also pull in text from the introductions of articles in the English Wikipedia. We modify the table output of the Wikidata Query Service so that individual items displayed in table cells link back to other items in Scholia.

Egon Willighagen, Daniel Mietchen and I have described Scholia and Wikidata for scientometrics in the 16-page workshop paper “Scholia and scientometrics with Wikidata”. The screenshots shown in the paper have been uploaded to Wikimedia Commons. These and other Scholia media files are available on a category page.

Working with Scholia has been a great way to explore what is possible with SPARQL and Wikidata. One plot that I like is the “Co-author-normalized citations per year” plot on the organization pages. Here the citations to works authored by authors affiliated with the organization in question are counted, normalized for the number of coauthors, and organized in a colored bar chart with respect to year of publication. The colored bar charts have been inspired by the “LEGOLAS” plots of Shubhanshu Mishra and Vetle Torvik.
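
To make the normalization concrete, here is a small Python sketch of one way such a number could be computed: each work contributes its citation count divided by its number of authors to the total for its publication year. Scholia computes the plot with SPARQL, and its exact normalization may differ. The function name, the dictionary keys and the example numbers are made up for illustration.

```python
from collections import defaultdict


def normalized_citations_per_year(works):
    """Sum citations divided by number of authors, grouped by publication year."""
    totals = defaultdict(float)
    for work in works:
        n_authors = work["n_authors"]
        if n_authors < 1:
            # Skip works without author information instead of dividing by zero.
            continue
        totals[work["year"]] += work["citations"] / n_authors
    return dict(sorted(totals.items()))


# Invented example data: three works published in two different years.
example_works = [
    {"year": 2015, "citations": 12, "n_authors": 3},
    {"year": 2015, "citations": 4, "n_authors": 1},
    {"year": 2016, "citations": 9, "n_authors": 2},
]

print(normalized_citations_per_year(example_works))
# prints {2015: 8.0, 2016: 4.5}
```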

Part of the Python Scholia code also works as a command-line script for reference management in the LaTeX/BibTeX environment, using Wikidata as the backend. I have used this Scholia scheme for a couple of scientific papers I wrote in 2017. The particular script is currently not well developed, so users would need to be indulgent.

Scholia relies on users adding bibliographic data to Wikidata. Tools from Magnus Manske are a great help, as are Fatameh by “T Arrow” and “Tobias1984” and the WikidataIntegrator of the GeneWiki people. Daniel Mietchen, James Hare and a user called “GZWDer” have been very active in adding much of the scientific bibliographic information, and we are now past 2.3 million scientific articles on Wikidata. You can count them with this link.
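
Such a count can also be made programmatically. The Python sketch below sends a counting query to the Wikidata Query Service and reads the number from the JSON result. The helper names and the User-Agent string are assumptions made for this example, and Q13442814 is the Wikidata item used for scientific articles in the queries elsewhere on this blog. The number returned will differ from the one quoted above, since Wikidata keeps growing.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

COUNT_QUERY = """
SELECT (COUNT(?work) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
}
"""


def build_request(sparql):
    """Build a GET request that asks the query service for SPARQL JSON results."""
    url = "https://query.wikidata.org/sparql?" + urlencode({"query": sparql})
    headers = {
        "Accept": "application/sparql-results+json",
        # Hypothetical identifier; replace it with one describing your own script.
        "User-Agent": "article-count-example/0.1",
    }
    return Request(url, headers=headers)


def parse_count(results):
    """Extract the integer bound to ?count from a SPARQL JSON result."""
    return int(results["results"]["bindings"][0]["count"]["value"])


def count_scientific_articles():
    """Run the count query against the live service (needs network access)."""
    with urlopen(build_request(COUNT_QUERY)) as response:
        return parse_count(json.load(response))
```

The network call is kept in its own function so that the query building and result parsing can be tried without contacting the service. A full count over millions of items may be slow or time out.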

My h-index as of June 2017: Coverage of researcher profile sites

The coverage of different researcher profile sites and their citation statistics varies. Google Scholar seems to be the site with the largest coverage; it even crawls and indexes my slides. The open Wikidata is far from there, but may be the only one with machine-readable free access and advanced search.

Below are the citation statistics in the form of the h-index from six different services.

h Service
28 Google Scholar
27 ResearchGate
22 Scopus
22(?) Semantic Scholar
18 Web of Science
8 Wikidata

Semantic Scholar does not give an overview of the citation statistics, and the count is somewhat hidden on the individual article pages. I attempted as best as I could to determine the value, but it might be incorrect.
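
For a service that only shows per-article citation counts, the h-index can be worked out by hand or with a few lines of code: it is the largest h such that h articles have at least h citations each. The Python sketch below implements this standard definition. The function name and the citation counts in the example are invented and are not my actual numbers.

```python
def h_index(citation_counts):
    """Return the largest h such that h articles have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            # Counts are sorted in descending order, so no later article can qualify.
            break
    return h


# Invented example: four articles have at least four citations each.
print(h_index([10, 8, 5, 4, 3]))
# prints 4
```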

I compiled similar statistics on 8 May 2017 and reported them in the Wikicite slides (page 42). During the one and a half months since that count, the number for Scopus has changed from 20 to 22.

Semantic Scholar is run by the Allen Institute for Artificial Intelligence, a non-profit research institute, so they may be interested in opening up their data for search. An API does not, to my knowledge, exist (yet?), but they have a gentle robots.txt. It is also possible to download the full Semantic Scholar corpus. (Thanks to Vladimir Alexiev for bringing this corpus to my attention.)

When does an article cite you?

Google Scholar alerted me to a recent citation of my work from Teacher-Student Relationships, Satisfaction, and Achievement among Art and Design College Students in Macau, a paper published in the Journal of Education and Practice, a journal of repute unknown to me.

In the references, I see a listing of Persistence of Web References in Scientific Research, where I was among the coauthors. So in which context is this paper cited? It seems strange that an article about link rot is cited by an article about teacher-student relationships… Indeed, I cannot find the reference in the body text when I search for the first author’s last name (“lawrence”).

Indeed, there are several other items in the reference list that I cannot find: Joe Smith’s “One of Volvo’s core values”, Strunk et al.’s “The element of style” and Van der Geer’s “The art of writing a scientific article”. Notably, the first four references are out of order in the otherwise alphabetically sorted list of references, so there must be an error. Perhaps it is an error arising from a copy-and-paste mistake?

In this case, I would say that, even though I am listed, I am not actually cited by the article. The “fact” of whether it is a citation or not is important to discuss if we want to record the citation in Wikidata, where “Persistence of Web References in Scientific Research” is recorded with the item Q21012586 (see also the Scholia entry). Possibly we could record the erroneous citation and then use the Wikidata deprecated rank facility: “Value is known to be wrong but (used to be) commonly believed”.

Some statistics on scholarly data in Wikidata

The Wikicite initiative has spawned a lot of work on bibliographic/source information in Wikidata. Scholarly bibliographic information in particular has been added. Recently, James Hare announced that we have over 3 million citations recorded in Wikidata, mostly due to automated additions made by Hare himself.

With the tools of Magnus Manske and James Hare, which are presently central to the growth of scholarly bibliographic data on Wikidata, we do not get a direct link to the author items of Wikidata. Such information presently needs to be added manually or in a semi-automated fashion. Sponsor/funding information is not added automatically either, except for a US organization where James Hare added this information.

So how much data do we have in Wikidata when we ask if the data is linked to other Wikidata items? Below are a few queries to the Wikidata Query Service that attempt to answer some aspects of this question.

Scientific articles

How many items do we have in Wikidata that describe a scientific article and that are linked to an author item?

SELECT (COUNT(DISTINCT ?work) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
}

The query returns 45’253.

How many scientific articles have one or more author items and no author name string (indicating that the author linking may be complete)?

SELECT (COUNT(DISTINCT ?work) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  FILTER NOT EXISTS { ?work wdt:P2093 ?authorname }
}

This query gives 3’567.

How many items do we have in Wikidata that are claimed to be a scientific article?

SELECT (COUNT(DISTINCT ?work) AS ?count) WHERE { ?work wdt:P31 wd:Q13442814 . }

This query gives 677’630.
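
Putting the three counts above in relation to each other shows how small the linked part still is. The Python sketch below is only arithmetic on the numbers reported in this post; the variable and function names are made up for the example.

```python
# Counts reported by the three queries above.
scientific_articles = 677630
with_author_item = 45253
with_author_items_only = 3567


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


linked = share(with_author_item, scientific_articles)
complete = share(with_author_items_only, scientific_articles)

print(f"{linked:.1f}% of the articles link to at least one author item")
print(f"{complete:.1f}% link to author items and have no author name string")
# prints 6.7% and 0.5%, respectively
```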

Scientific authors

How many authors are in Wikidata that have written a scientific article?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
}

The query returns 10’193.

How many authors are in Wikidata that have written a scientific article and where the gender is indicated?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?author wdt:P21 ?gender .
}

This query gives 8’853.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
}

This query returns 6’586.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
}

This query returns 5’614.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items and where the genders of both the citing and the cited author are known?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
  ?author wdt:P21 ?gender .
  ?cited_author wdt:P21 ?cited_gender .
}

This query gives 4’730.

How many authors are there in Wikidata that have written a scientific article and where the scientific article is recorded as having made one or more citations and the cited work is recorded with one or more author items and where the genders of both the citing and the cited author are known and where there is no author name string in either the work or the cited work (indicating that the work and the cited work may be completely linked with respect to author names)?

SELECT (COUNT(DISTINCT ?author) AS ?count) WHERE {
  ?work wdt:P31 wd:Q13442814 .
  ?work wdt:P50 ?author .
  ?work wdt:P2860 ?cited_work .
  ?cited_work wdt:P50 ?cited_author .
  ?author wdt:P21 ?gender .
  ?cited_author wdt:P21 ?cited_gender .
  FILTER NOT EXISTS { ?work wdt:P2093 ?authorname }
  FILTER NOT EXISTS { ?cited_work wdt:P2093 ?cited_authorname }
}

This query gives only 551.


Sponsors of scientific articles ordered by number of citations.

SELECT ?number_of_citations ?sponsorLabel
WITH {
  SELECT (COUNT(?citing_work) AS ?number_of_citations) ?sponsor
  WHERE {
    ?work wdt:P859 ?sponsor .
    ?work wdt:P31 wd:Q13442814 .
    ?citing_work wdt:P2860 ?work .
  }
  GROUP BY ?sponsor
} AS %result
WHERE {
  INCLUDE %result
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?number_of_citations)

This query gives National Institute for Occupational Safety and Health, Lundbeck Foundation, The Danish Council for Strategic Research, National Institute of Allergy and Infectious Diseases, University of Wisconsin–Madison.

How to quickly generate word analogy datasets with Wikidata

One popular task in computational linguistics/natural language processing is the word analogy task: Copenhagen is to Denmark as Berlin is to …?

With queries to the Wikidata Query Service (WDQS) it is reasonably easy to generate word analogy datasets in whatever (Wikidata-supported) language you like. For instance, for capitals and countries, a WDQS SPARQL query that returns results in Danish could go like this:

select
  ?country1Label ?capital1Label
  ?country2Label ?capital2Label
where {
  ?country1 wdt:P36 ?capital1 .
  ?country1 wdt:P463 wd:Q1065 .
  ?country1 wdt:P1082 ?population1 .
  filter (?population1 > 5000000)
  ?country2 wdt:P36 ?capital2 .
  ?country2 wdt:P463 wd:Q1065 .
  ?country2 wdt:P1082 ?population2 .
  filter (?population2 > 5000000)
  filter (?country1 != ?country2)
  service wikibase:label
    { bd:serviceParam wikibase:language "da". }
}
limit 1000

Follow this link to get to the query and press “Run” to get the results. It is possible to download the table as a CSV file (see under “Download”). One issue to note is that you get multiple entries for countries with multiple capital cities, e.g., Sydafrika (South Africa) is listed with Pretoria, Kapstaden (Cape Town) and Bloemfontein.
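
One possible way to post-process the downloaded CSV file is sketched below in Python: it reads the rows, drops every country listed with more than one capital, and returns the remaining rows as analogy tuples. It assumes the CSV header uses the variable names from the query (country1Label and so on). The function names and the three sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def read_rows(csv_text):
    """Parse CSV text with a header line into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def drop_multi_capital_countries(rows):
    """Keep only rows where both countries are listed with exactly one capital."""
    capitals = defaultdict(set)
    for row in rows:
        capitals[row["country1Label"]].add(row["capital1Label"])
        capitals[row["country2Label"]].add(row["capital2Label"])
    return [
        row for row in rows
        if len(capitals[row["country1Label"]]) == 1
        and len(capitals[row["country2Label"]]) == 1
    ]


def analogies(rows):
    """Return (capital1, country1, capital2, country2) tuples."""
    return [
        (r["capital1Label"], r["country1Label"], r["capital2Label"], r["country2Label"])
        for r in rows
    ]


# Small made-up sample in the assumed CSV layout.
sample_csv = """country1Label,capital1Label,country2Label,capital2Label
Danmark,København,Tyskland,Berlin
Sydafrika,Pretoria,Danmark,København
Sydafrika,Kapstaden,Danmark,København
"""

rows = drop_multi_capital_countries(read_rows(sample_csv))
print(analogies(rows))
# prints [('København', 'Danmark', 'Berlin', 'Tyskland')]
```

Dropping such countries is just one choice. Keeping a single preferred capital per country would retain more rows, but requires deciding which capital to prefer.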