So what can we use Wikicite for?

openfmri-journal-statistics-2016-09-19

Wikicite is a term for the combination of bibliographic information and Wikidata. While Wikipedia often records books of some notability, it rarely records bibliographic information for less notable works, i.e., individual scientific articles and books for which little third-party information (reviews, literary analyses, etc.) exists. This is not the case with Wikidata. Wikidata is now beginning to record lots of bibliographic information for “lesser works”. What can we use this treasure trove for? Here are a few of my ideas:

  1. Wikidata may be used as a substitute for a reference manager. I record my own bibliographic information in a big BibTeX file and use the bibtex program together with latex when I generate a scientific document with references. It might very well be that the job of the BibTeX file may be taken over by Wikidata. So far we have, to my knowledge, no proper program for extracting the data in Wikidata and formatting it for inclusion in a document. I have begun a “wibtex” program for this but have only reached 44 lines so far. It remains to be seen whether this is a viable avenue: whether the structure of Wikidata is good and convenient enough to record data for formatting references, or whether Wikidata is too flexible or too restricted for this kind of application.
  2. Wikidata may be used for “lists of publications” of individual researchers, institutions, research groups and sponsors. Nowadays, I keep a list of publications on a webpage, in a latex document and on Google Scholar. My university has a separate list, and sometimes when I write a research application I need to format the data for inclusion in a Microsoft Word document. A flexible program on top of Wikidata could make dynamic lists of publications.
  3. Wikidata may be used to count citations. During the Wikicite 2016 Berlin meeting I suggested the P2860 property and Tobias quickly created it. The P2860 property allows us to describe citations between items in Wikidata. Though we managed to use the property a bit for scientific articles during the meeting, it has really been James Hare who has been running with the ball. Based on public citation data he has added hundreds of thousands of citations. At the moment this is of course only a very small part of the total number of citations. There are probably tens of millions of scientific papers, each having tens, if not hundreds, of citations, so with the 499,750 citations that James Hare reported on 11 September 2016, we are still far from covering the field: James Hare tweeted that Web of Science claims to have over 1 milliard (billion) citations. The citation counts may be compared to a whole range of context data in Wikidata: author, affiliated institution, journal, year of publication, gender of author and sponsor (funding agency), so we can get, e.g., the most cited Dane (or one affiliated with a Danish institution), the most cited woman with an image, etc.
  4. Wikidata may be used as a hub for information sources. Individual scientific articles may point to further resources, such as raw or result data. I myself have, for instance, added links to the neuroinformatics databases OpenfMRI, NeuroVault and Neurosynth, and Wikidata now records all papers recorded in OpenfMRI, as far as I can determine. Wikidata is then able to list, say, all OpenfMRI papers or all OpenfMRI authors with Magnus Manske’s Listeria tool.
  5. Wikicite information in Wikidata may be used to support claims in Wikidata itself. As Dario Taraborelli points out, this would allow queries like “all statements citing journal articles by physicists at Oxford University in the 1970s”.
  6. Wikidata may be used for scientometric analyses other than counting, e.g., generation of coauthor graphs and cocitation graphs giving context to an author or paper. The bubble chart above shows statistics for journals of papers in OpenfMRI, generated with the standard Wikidata Query Service bubble chart visualization tool.
  7. Wikidata could be used for citations in Wikipedia. This may very well be problematic, as a large Wikipedia article could have hundreds of references and each reference needs to be fetched from Wikidata, generating lots of traffic. I tried a single citation on the “OpenfMRI” article (it has later been changed). Including Wikidata identifiers in Wikipedia references in some form could further Wikipedia bibliometrics, e.g., determining the most cited author across all Wikipedias.
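
As a rough illustration of the first idea, here is a minimal Python sketch of the formatting half of such a program. It is not the “wibtex” program itself. It assumes the bibliographic fields have already been extracted from a Wikidata item into a flat dictionary; the function name, the record and the citation key are all hypothetical.

```python
def bibtex_entry(key, fields, entry_type="article"):
    """Format a flat dict of bibliographic fields as one BibTeX entry."""
    lines = ["@{}{{{},".format(entry_type, key)]
    for name, value in fields.items():
        if isinstance(value, (list, tuple)):
            # BibTeX separates multiple authors with " and ".
            value = " and ".join(value)
        lines.append("  {} = {{{}}},".format(name, value))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical record, shaped as if already extracted from a Wikidata item.
record = {
    "author": ["Jane Doe", "John Smith"],
    "title": "An example article",
    "journal": "Journal of Examples",
    "year": 2016,
}
print(bibtex_entry("doe2016", record))
```

The harder part, mapping Wikidata properties and qualifiers onto these fields, is exactly where the question of whether Wikidata is too flexible or too restricted would show up.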

My daily life

London-based Danish comedian Sofie Hagen last year premiered her “Bubblewrap” standup/storytelling act. In part of the act she read her own fanfiction, written when she was young and hot on a boyband. Hagen’s self-ironic commentary and further elaborations on teenage (or preteenage?) troubles made for one of the finest standup performances I have seen.

I myself have gotten hold of my own young writing: an exercise book from the “Danish” course, where the hand-in would sometimes be my favorite topic, “fristil” (“freestyle”). Here you could select your own topic and let your fantasy run away. One hand-in that I discovered was “Min hverdag” (“My daily life”) from an exercise book from 6th grade (I was perhaps 12 years old), a short story with a self-confident carelessness detailing what has become a common theme in my life: running late.

Here is my (present) English translation. Enjoy:

My daily life.

I just made a time machine when I got home from school. Yesterday I made a car powered by water, but that broke.

When I was finished with the time machine, I set it to the year 1872, 7 Savile Row, London, England. I went into the time machine and pressed the button which started the time machine. Everything went black. One minute passed, and then suddenly I was standing right in front of Phileas Fogg. He did not seem to be particularly surprised. I turned around, and there was Passepartout. I said hello in Danish. They understood that well, as I had an interpreter machine with me, that I had made a couple of weeks ago. They too said hello. Passepartout stood with a valise. Phileas Fogg and Passepartout went out of the street door and that I did too. That was lucky as they locked the door behind them. I asked if I could come along. Phileas Fogg said yes. We went to the end of Savile Row precisely as in “Around the World in Eighty Days”. But they went further and towards a store. I asked, whether they were not going around the world. Mr. Fogg answered that they had been around the world, so I did not want to play this game anymore. I pressed the button and flew back to the present.

I was a bit annoyed as I did not get around the world. I wanted to try again but then the time machine broke. I could of course have repaired it. I did not want to use the 5 minutes that it would take to repair it, as I had homework to do. I could of course put a robot to do it. But on the other hand: You also need to learn something.


The teacher remarked “Excellent. But where did you get the topic from?” :)

Here is the Danish original slightly edited:

Min hverdag.

Jeg lavede lige en tidsmaskine, da jeg kom hjem fra skole. I går lavede jeg en bil, der gik på vand. Men den gik i stykker.

Da jeg var færdig med tidsmaskinen, stillede jeg den på året 1872, Saville-row nr. 7 London, England. Jeg gik ind i tidsmaskinen og trykkede på knappen som startede tidsmaskinen. Alt blev sort. Der gik 1 minut, og så pludselig stod jeg lige foran Phileas Fogg. Han så ikke ud til at være særligt overrasket. Jeg vendte mig om, og der stod Passepartout. Jeg sagde goddag på dansk. Det kunne de godt forstå fordi jeg havde en oversættermaskine på mig, som jeg havde lavet for nogle uger siden. De sagde også goddag. Passepartout stod med en vadsæk. Phileas Fogg og Passepartout gik ud af gadedøren og det gik jeg også. Det var heldigt fordi de låste døren efter sig. Jeg spurgte om jeg måtte komme med. Phileas Fogg sagde ja. Vi gik ned for enden af Saville-row, præcis som i “Jorden rundt i 80 dage”. Men de gik videre og over mod en butik. Jeg spurgte, om de ikke skulle jorden rundt. Mr. Fogg svarede, at det havde de været, så jeg gad ikke at lege dette her mere. Jeg trykkede på knappen og jeg fløj tilbage til nutid.

Jeg var lidt sur fordi jeg ikke var kommet jorden rundt. Jeg ville prøve igen, men så gik tidsmaskinen i stykker. Jeg kunne selvfølgelig have repareret den. Jeg gad ikke at bruge de 5 minutter, der skulle til at reparere den, fordi jeg havde lektier for. Jeg kunne selvfølgelig sætte en robot til det. Men på den anden side: man skal jo også lære noget.

Neuroinformatics coauthor network – so far

neuroinformatics coauthor network 2016-06-28

Screenshot of the neuroinformatics coauthor network, so far. Only the big cluster is shown. The network is drawn with Jonas Kress’s default setup querying WDQS.

Backup of directory

At times I do not recall the command to back up a directory. So here it is, for my own reference:

$ rsync -au /home/fnielsen/Pictures/ /media/fnielsen/a4f11e07-3f63-45cc-bcab-e7d135c14b9c/backup/Billeder/

The -a is the standard archive option. The -u is ‘update’ (“skip files that are newer on the receiver”).
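
If the backup is to be scripted, the same command can be assembled in Python. This is only a sketch: the function name and the destination path in the example are made up, and the dry-run default (rsync’s -n, combined with -v so that the files are listed) is my own cautious choice rather than part of the command above.

```python
def rsync_backup_command(source, destination, dry_run=True):
    """Build the rsync argument list for an archive-and-update backup."""
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    if not source.endswith("/"):
        source += "/"
    command = ["rsync", "-a", "-u"]
    if dry_run:
        # -n makes no changes; -v lists the files that would be transferred.
        command += ["-n", "-v"]
    return command + [source, destination]


# Hypothetical destination, for illustration only.
command = rsync_backup_command("/home/fnielsen/Pictures", "/tmp/backup/Billeder/")
print(" ".join(command))
# To run it for real: import subprocess; subprocess.run(command, check=True)
```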

Page rank of scientific papers with citation in Wikidata – so far

A citation property was created just a few hours ago and, as of writing, has still not been deleted. It means we can describe citation networks, e.g., among scientific papers.

So far we have added a few citations, mostly from papers about Zika. And now we can plot the citation network or compute network measures such as page rank.

Below is a Python program that does all of this with SPARQL, Pandas and NetworkX:

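import sparql  # assumed to be the sparql-client package
import networkx as nx
from pandas import DataFrame

# Fetch every citation (P2860) statement with English labels for both ends
statement = """
select ?source ?sourceLabel ?target ?targetLabel where {
  ?source wdt:P2860 ?target .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
"""

service = sparql.Service('https://query.wikidata.org/sparql')
response = service.query(statement)
df = DataFrame(response.fetchall(),
    columns=response.variables)

# Convert the returned label objects to plain strings
# (on Python 2, use unicode instead of str)
df.sourceLabel = df.sourceLabel.astype(str)
df.targetLabel = df.targetLabel.astype(str)

# Directed graph with one edge from each citing paper to each cited paper
g = nx.DiGraph()
g.add_edges_from(((row.sourceLabel, row.targetLabel)
    for n, row in df.iterrows()))

# Compute page rank and sort with the highest-ranked paper first
pr = nx.pagerank(g)
sorted_pageranks = sorted((rank, title)
    for title, rank in pr.items())[::-1]

# Show the top 10, truncating titles to 40 characters
for rank, title in sorted_pageranks[:10]:
    print("{:.4} {}".format(rank, title[:40]))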

The result:

0.02647 Genetic and serologic properties of Zika
0.02479 READemption-a tool for the computational
0.02479 Intrauterine West Nile virus: ocular and
0.02479 Internet encyclopaedias go head to head
0.02479 A juvenile early hominin skeleton from D
0.01798 Quantitative real-time PCR detection of 
0.01755 Zika virus. I. Isolations and serologica
0.01755 Genetic characterization of Zika virus s
0.0175 Potential sexual transmission of Zika vi
0.01745 Zika virus in Gabon (Central Africa)--20
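
To see what the page rank numbers express, here is a small pure-Python sketch of the power iteration behind such a computation, run on a toy citation network. It is not the NetworkX implementation. The damping factor of 0.85 matches the NetworkX default, while the fixed number of iterations and the paper names A to D are my own choices for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration page rank on a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread over all papers.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        # Each paper passes its rank on in equal shares to the papers it cites.
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Toy data: papers A, B and C all cite D, and A also cites B.
edges = [("A", "D"), ("B", "D"), ("C", "D"), ("A", "B")]
ranks = pagerank(edges)
for title, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print("{:.4} {}".format(score, title))
```

Papers that nobody cites all end up with the same minimal score, which is one way several papers in a sparse network can share an identical value.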

Occupations of persons from Panama Papers

Can we get an overview of the occupations of the persons associated with the Panama Papers? Well … that might be difficult, but we can get a biased plot by using the listing in Wikidata, where persons associated with the Panama Papers seem to be tagged and where their occupations are listed. It produces the plot below.

PanamaPapersOccupations

It is fairly straightforward to construct such a bubble chart given the new plotting capabilities in the Wikidata Query Service. Dutch Wikipedian Gerard Meijssen seems to have been the one who entered the information in Wikidata linking the Panama Papers to persons via the ‘significant event’ property. How completely he has managed to do this so far I do not know. Our Danish Wikipedian Ole Palnatoke Andersen set up a page on the Danish Wikipedia at Diskussion:Panama-papirerne/Wikidata, tabulating the persons with the nice Listeria tool of Magnus Manske. Modifying Ole’s SPARQL query we can get the count of occupations for the persons associated with the Panama Papers in Wikidata.

SELECT ?occupationLabel (count(distinct ?person) as ?count) WHERE {
  ?person wdt:P793 wd:Q23702848 ; wdt:P106 ?occupation .
  service wikibase:label { bd:serviceParam wikibase:language "en" . }
} group by ?occupationLabel
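
If the result is wanted as a sorted table rather than a bubble chart, the response can be post-processed in Python. The sketch below assumes the response has been fetched in the standard SPARQL JSON results format and already parsed into a dictionary. The function name is hypothetical and the sample response is made up for illustration, so its numbers are not real data.

```python
def occupation_counts(results):
    """Return (occupation, count) pairs, largest first, from SPARQL JSON."""
    pairs = []
    for binding in results["results"]["bindings"]:
        label = binding["occupationLabel"]["value"]
        # Values arrive as strings in the JSON results format.
        count = int(binding["count"]["value"])
        pairs.append((label, count))
    # Sort by descending count, then alphabetically for ties.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Made-up response for illustration only; the numbers are not real data.
sample = {
    "head": {"vars": ["occupationLabel", "count"]},
    "results": {"bindings": [
        {"occupationLabel": {"type": "literal", "value": "businessperson"},
         "count": {"type": "literal", "value": "2"}},
        {"occupationLabel": {"type": "literal", "value": "politician"},
         "count": {"type": "literal", "value": "3"}},
        {"occupationLabel": {"type": "literal", "value": "actor"},
         "count": {"type": "literal", "value": "2"}},
    ]},
}
for label, count in occupation_counts(sample):
    print(count, label)
```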

Some people may see that politicians are the largest group, but that might simply be an artifact of the notability criterion of Wikidata: Only people who are somewhat notable or are linked to something notable are likely to be included in Wikidata, e.g., the common businessman/woman may not (yet?) be represented in Wikidata.

The bubble chart truncates the occupation labels: ‘murd’ is murderer. Joaquín Guzmán has his occupation set to murderer in Wikidata, without a source…

On Henrik Krüger’s ‘Sømænd i Helvede’ (‘Sailors in Hell’)

Strange that an enormous catastrophe with over a thousand killed can be dismissed as a tiny fraction of World War II’s sea of horror. In a way, the German surprise attack on the Italian port of Bari in 1943, where they hit Allied ships loaded with conventional ammunition and mustard gas bombs, seems like a parallel to Henrik Krüger’s book about it. Although the incident is referred to as Little Pearl Harbor, the attack does not occupy a major place in the literature on World War II. Nor has Krüger’s book attracted much attention. Krüger published the book himself through the on-demand publisher Skriveforlaget, and I found it by chance in a sale at the local library for probably no more than 10 kroner.

I myself was surprised to read that poison gas had not only been experimented with during World War II, but that a large number of poison gas bombs had also been manufactured and transported to Europe for storage, just in case. Krüger argues that several people died as a result of the secrecy surrounding the cargo of poison gas, gas that had rained down on soldiers and sailors after the ammunition ships had exploded. The reason we have heard so little about the attack is perhaps that it simply took its place in the line of the war’s ordinary deaths. It happened in less than an hour on 2 December 1943. That same night, according to A.C. Grayling’s account, over 400 bombers were sent against Berlin, and the night after over 500 against Leipzig, where Grayling notes 1,717 dead. One grows thoughtful on hearing the German language among tourists whose ancestors two generations back may have suffered in the hell of the firebombs.

Krüger writes that it is a story that has never been told. Krüger does, however, rely on English-language books. Where he earns merit is through the Danish angle, for which he has interviewed several Danes connected with the ship named Lars Kruse. With this he commemorates the quiet, heroic effort of the Danish sailors.

From LibraryThing.