HACK4DK 2017


HACK4DK is an annual event in Copenhagen that brings together cultural nerds and computer nerds to build interesting things with cultural data. I have participated since the very beginning, including this year’s HACK4DK, which took place at ENIGMA, a to-be museum in Østerbro, Copenhagen.

The winning project among around 19 projects this year was Tin Toy, a neat augmented reality application using images from the toy collection of Holstebro Museum. I believe they used the AR.js JavaScript library. There is a YouTube video that attempts to capture the attractiveness of the project.

The result of my struggles with the A-Frame JavaScript library is available on this page. Under the name “Virtual Gallery of Denmark” it was supposed to be a virtual reality environment presenting Danish art. The end result became a somewhat less dynamic but meditative environment, with textured panels flying around in a virtual space and with sound from re-recorded old phonograph recordings in the Ruben Collection made available by the Royal Library in Aarhus.


I did not rely on the data provided at the event, but used data from the cultural institutions that had already been uploaded to Wikimedia Commons and whose metadata was described on Wikidata. Both the images of the paintings (which were from Skagens Museum) and the sound were available on Wikimedia Commons and well annotated on Wikidata.

The images were fetched with SPARQL queries to the Wikidata Query Service and API calls to the Wikimedia Commons API. As such, it is fairly easy to change the virtual environment to use other files, which I did afterwards: the Giersing-Bach-Ishizaka-Nielsen virtual environment uses images on Wikimedia Commons where Wikidata records the artist as being Harald Giersing. Here the sound is from Kimiko Ishizaka’s Open Goldberg Variations project.
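
As a rough illustration of the kind of query involved, here is a minimal Python sketch that builds a SPARQL query for works by a given creator that have an image, together with the corresponding Wikidata Query Service URL. It is not the code behind the virtual environments. The function names are made up for this example, and the Wikidata identifier passed in is only a placeholder that you would replace with the identifier of the artist you are interested in. P170 is the Wikidata property for creator and P18 the property for image.

```python
import re
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_image_query(creator_qid, limit=100):
    """Return a SPARQL query for works by a creator that have an image.

    creator_qid is a Wikidata item identifier such as "Q42"
    (a placeholder here, not the identifier of any particular painter).
    """
    if not re.fullmatch(r"Q\d+", creator_qid):
        raise ValueError("Expected a Wikidata identifier like 'Q42'")
    return (
        "SELECT ?work ?workLabel ?image WHERE {\n"
        f"  ?work wdt:P170 wd:{creator_qid} .\n"   # P170: creator
        "  ?work wdt:P18 ?image .\n"               # P18: image on Commons
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def wdqs_url(query):
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_image_query("Q42", limit=5)
url = wdqs_url(query)
```

Changing the environment to another artist then amounts to changing the identifier given to the query.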


While A-Frame models are supposed to run straight from the web browser on smartphones, my models seem to have hefty hardware requirements: the images have quite high resolutions. It takes over 10 seconds on my computer to download all the image and sound files associated with the models. Nevertheless, with a strong computer, a big screen and good headphones, it is quite interesting to view and hear as the paintings and sound fly by.


How to quickly generate word analogy datasets with Wikidata


One popular task in computational linguistics/natural language processing is the word analogy task: Copenhagen is to Denmark as Berlin is to …?

With queries to Wikidata Query Service (WDQS) it is reasonably easy to generate word analogy datasets in whatever (Wikidata-supported) language you like. For instance, for capitals and countries, a WDQS SPARQL query that returns results in Danish could go like this:

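select
  ?country1Label ?capital1Label
  ?country2Label ?capital2Label
where { 
  # First country: has a capital (P36), is a member of (P463)
  # the United Nations (Q1065) and has a population (P1082) above 5 million
  ?country1 wdt:P36 ?capital1 .
  ?country1 wdt:P463 wd:Q1065 .
  ?country1 wdt:P1082 ?population1 .
  filter (?population1 > 5000000)
  # Second country: same constraints
  ?country2 wdt:P36 ?capital2 .
  ?country2 wdt:P463 wd:Q1065 .
  ?country2 wdt:P1082 ?population2 .
  filter (?population2 > 5000000)
  # Do not pair a country with itself
  filter (?country1 != ?country2)
  # Danish labels
  service wikibase:label
    { bd:serviceParam wikibase:language "da". }  
}
limit 1000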

Follow this link to get to the query and press “Run” to get the results. The table can be downloaded in CSV format (see under “Download”). One issue to note is that countries with multiple capital cities get multiple entries, e.g., Sydafrika (South Africa) is listed with Pretoria, Kapstaden (Cape Town) and Bloemfontein.
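
The downloaded CSV can be turned into analogy lines with a few lines of Python. The following is only a sketch: it assumes the CSV header uses the four variable names from the query, and the function name and the "keep the first capital seen for each country" rule are my own choices for this example. Which capital is kept depends on the row order, so it is an arbitrary rather than a principled way of handling countries with several capitals.

```python
import csv
import io


def analogy_lines(csv_text, one_capital_per_country=True):
    """Turn query results in CSV form into 'capital1 country1 capital2 country2' lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    first_capital = {}  # country label -> first capital label seen
    lines = []
    for row in reader:
        pairs = [
            (row["country1Label"], row["capital1Label"]),
            (row["country2Label"], row["capital2Label"]),
        ]
        if one_capital_per_country:
            # Skip rows that use a capital other than the first one seen
            keep = True
            for country, capital in pairs:
                if first_capital.setdefault(country, capital) != capital:
                    keep = False
            if not keep:
                continue
        (country1, capital1), (country2, capital2) = pairs
        lines.append(f"{capital1} {country1} {capital2} {country2}")
    return lines


# Small made-up sample in the same layout as the downloaded table
sample = """country1Label,capital1Label,country2Label,capital2Label
Danmark,København,Tyskland,Berlin
Sydafrika,Pretoria,Danmark,København
Sydafrika,Kapstaden,Danmark,København
"""

deduplicated = analogy_lines(sample)
everything = analogy_lines(sample, one_capital_per_country=False)
```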

Mixed indexing with integer index in Pandas DataFrame


Indexing in Python’s Pandas can at times be tricky. Here is an example with mixed indexing (.ix) with integer index:

I ran into the issue when I wanted to index with an integer in one of the methods of a DataFrame representing EEG data.
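
The original example is not reproduced here, so the following is a small sketch of the pitfall with made-up data. With an integer index, .ix treated integers as labels rather than positions, and .ix has since been removed from Pandas (in version 1.0), so the sketch uses the explicit .loc (label) and .iloc (position) indexers instead. The column name "signal" and the values are hypothetical.

```python
import pandas as pd

# Made-up data: the integer index labels are deliberately not in positional order
df = pd.DataFrame({"signal": [0.1, 0.2, 0.3]}, index=[2, 1, 0])

# Label-based: the row whose index label is 0 (the last row here)
by_label = df.loc[0, "signal"]

# Position-based: the first row, whatever its label is
by_position = df.iloc[0]["signal"]

# Mixed: row by position, column by name, without .ix
mixed = df.iloc[0, df.columns.get_loc("signal")]
```

Here by_label and by_position differ because label 0 sits in the last row, which is exactly the kind of confusion an integer index invites.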

Hull level coloring of a cortical surface representation



Hull level coloring of a cortical surface representation constructed by Heather Drury and David Van Essen.

I have just rediscovered my old surface coloring function from the 2003 version of the Brede Toolbox. It can color a surface according to hull level. Here it is with a modified cortical surface representation provided by Heather Drury and David Van Essen.

Matlab code with the Brede Toolbox:

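% Load the cortical surface
S = brede_sur_drury;
% Compute a color for each part of the surface according to hull level
color = brede_sur_color(S, 'style', 'rgb');
% Plot the colored surface in a new figure
figure, brede_ta3_frame, brede_ta3_sur(S, 'color', color);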

and then followed by

print -dpng hulllevelcoloring.png

Zipf plot for word counts in Brown corpus



There are various ways of plotting the distribution of highly skewed (heavy-tailed) data, e.g., with a histogram with logarithmically-spaced bins on a log-log plot, or by generating a Zipf-like plot (rank-frequency plot) like the above. This figure uses token count data from the Brown corpus as made available in the NLTK package.

For fitting the Zipf curve, a simple SciPy-based approach is suggested on Stack Overflow by “Evert”. More complicated power-law fitting is implemented in the Python package powerlaw, described in Powerlaw: a Python package for analysis of heavy-tailed distributions, which is based on the paper by Clauset and colleagues.
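
The data behind such a rank-frequency plot can be computed with the standard library alone. The sketch below is not the code used for the figure and not the Stack Overflow approach: it sorts the token counts and estimates the slope with an ordinary least-squares line in log-log space, which is only a rough estimate compared to the fitting in the powerlaw package. The function names and the toy token list are made up; with NLTK installed and the corpus downloaded, the tokens could instead come from nltk.corpus.brown.words().

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, count) pairs with rank 1 for the most frequent token."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))


def loglog_slope(rank_freq):
    """Least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank, _ in rank_freq]
    ys = [math.log(count) for _, count in rank_freq]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy tokens whose counts (12, 6, 4, 3) follow count = 12 / rank exactly
tokens = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
pairs = rank_frequency(tokens)
slope = loglog_slope(pairs)
```

For the toy data the slope is -1 up to rounding, the classical Zipf exponent. Plotting the pairs on log-log axes gives the rank-frequency plot.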

Git: multiuser and multiple accounts


We are still struggling somewhat with Git for multi-developer development of the Smartphone Brain Scanner code. It may well be an RTFM problem. We presumably have the GitHub Smartphone Brain Scanner code set up.

However, we also have private department Git accounts managed with gitolite, which brings some problems.

If you have multiple computers, each with a different public key, then you need extra tricks to be able to clone, push and pull from all computers. Here are the steps that got me working:

  1. I send one of my public keys to our department system administrators, who then set up an account with their specially developed script.
  2. With the account set up I can clone my gitolite-admin repository: git clone <git username>@<department git server>:gitolite-admin.
  3. The keydir at gitolite-admin/keydir/<git username>.pub is supposed to contain my public key from one of the computers. In a subdirectory I can put my public key from another computer, e.g., cp <public key from other computer> gitolite-admin/keydir/<name of other computer>/<git username>.pub
  4. This is followed by the git commands git add, git commit and git push.
  5. Specify in ~/.ssh/config the username for the git server: under Host <department git server> put User <git username>.

To give other users access to a repository I create, I have tried:

  1. In gitolite-admin/conf/gitolite.conf I added a line such as @sbs2 = <my git username> <another user's git username> <a third user> and then under repo sbs2-Brain3D I added RW+ = @sbs2.

One user has reported that it now allows him to read and write in the repository, while cloning still is a problem for another user…

(originally published on Tumblr two months ago: Git: multiuser and multiple accounts)

Hack4dk contributions


Ten projects were shown at the final showdown:

  • Kræn’s (with help from Emma) Natmus Mosaic with autocropping and search facility. Code available from GitHub
  • Kim Bach and … Game with Daell’s Varehus catalogue: try to guess the decade
  • Henrik, Andreas and …(?), mail art mail box.
  • Mobile app with public art. List of nearby art shown with photos and on maps.
  • Public art search engine
  • Search engine and viewer for Copenhagen Police “Mandtaller”. Machine vision in JavaScript.
  • Rasmus Erik: Wikipedia link visualization, image extract with classification of the Police images, quiz with decade.
  • Heat map movie through time of Copenhageners
  • SMK image visualization.
  • Steen Thomassen: Join the Danish Wikipedia and the Danish film database, e.g., for Tommy Kenter. The useful tool is running from Wikimedia Labs server.

The winner was the heat map movie, with Kræn’s Natmus Mosaic viewer as runner-up.