On climate strike

I am not on climate strike, but out of respect for our enormous dependency on electricity, I had planned to avoid electricity for 24 hours.

So far not so good.

I stayed late Thursday to complete a paper submission, among other things, and that ran into early Friday. The desktop computer and the light were switched on, so let's start at wake-up time instead…

My bedside clock is electrical. It has a battery, so potentially I could let it go off-grid instead of switching it off. A room in my home is without windows. I have candles in the room but they do not light up much… Breakfast includes milk from the refrigerator. Any use of water is presumably dependent on electrical water pumps somewhere along the pipes. My smartphone can, like any other mobile phone, be off the power grid. My electrically powered home wifi is typically on, but I could have used 4G, which would give me off-grid Internet access.

Off to work: my bike does not require on-grid electricity. The back light is battery powered, the front light runs on a dynamo. However, the room with the bike is without natural light. It is necessary to switch on the light or bring your own light to find the bike.

My employer has physical access control with a card. The door was not locked at the hour I arrived, so I was able to get in… The coffee machine is electrical, the heater for warm water is electrical… My desk telephone is electrical.

For work, I had plenty of paper (which these letters are written on) and I printed several articles to read.

Paying for lunch is an issue: at the local street food you pay with MobilePay or contactless card. I do not recall seeing customers paying with cash. While MobilePay and card do not require on-grid electricity on your part, the recipient may depend on on-grid electricity at their end, and the card-processing company certainly does.

There is no non-electrical light at my employer, which means that around 17:00 things get complicated. At around 18:00 I gave up. Until then I had read about two and a half papers and written two sections for a possible paper, as well as checked my email via the off-grid smartphone.

Back home I switched on the light out of old habit. Having switched it off again, I went to buy something for dinner. I have an electrical stove, so cooking hot food would be impossible. I bought bread and salmon, which did not require heating. Back home I managed to find candles, LED candles and a bright sun-charged lamp. With these lights, my battery radio and my off-grid smartphone I managed to eat and entertain myself for the rest of the evening.

Venezuela has had a blackout with major effects. Developed societies have become so dependent on electricity.

An occasional switch-off may give us a sense of awe for modern electrical technology, an appreciation of previous generations' ability to struggle through the darkness, and a respect for the light.


Coming Scholia, WikiCite, Wikidata and Wikipedia sessions

In the coming months I will give three different talks on Scholia, WikiCite, Wikidata, Wikipedia et al.:

  • 3 October 2018 in DGI-byen, Copenhagen, Denmark as part of the Visuals and Analytics that Matter conference, the concluding conference for the DEFF-sponsored project Research Output & Impact Analyzed and Visualized (ROIAV).
  • 7 November 2018 in Mannheim as part of the Linked Open Citation Database (LOC-DB) 2018 workshop.
  • 13 December 2018 at the library of the Technical University of Denmark as part of Wikipedia – a media for sharing knowledge and research, an event for researchers and students (and still in the planning phase).

In September I presented Scholia as part of the Workshop on Open Citations. The slides, titled Scholia as of September 2018, are available here.

Fru Astrid Grib by Thit Jensen

Little sister Thit goes all in with murder and death in a psychological portrait of a love-stricken 28-year-old woman, where the attempts at language à la big brother never quite take off. A great deal of art and melodrama, where 40 pages let a woman go from the madness of infatuation to madness. The love is violent, unrequited, unruly, excessive but also unexpressed; quite a contrast to the brother's schoolmasterly relationship to love.

From LibraryThing.

A viewpoint on a viewpoint on Wikipedia’s neutral point of view

I recently looked into what we have of Wikipedia research from Denmark and discovered several papers that I did not know about. I have now added some to Wikidata, so that Scholia can show a list of them.

Among the papers was one from Jens-Erik Mai titled Wikipedians' knowledge and moral duties. Starting from the English Wikipedia's Neutral Point of View (NPOV) policy, he stresses a dichotomy between the subjective and the objective and argues for a rewrite of the policy. Mai claims the policy has an absolutistic center and a relativistic edge, corresponding to an absolutistic majority view and relativistic minority views.

As a long-time Wikipedia editor, I find Mai's exposition too theoretical. I miss good examples, i.e., cases where the NPOV fails, and I cannot see in what concrete way the NPOV policy should be changed to accommodate Mai's critique. I am not sure that Wikipedians distinguish so much between the objective and the subjective; the key dichotomy is verifiable vs. not verifiable, that is, whether the statements in Wikipedia are supported by reliable sources. In terms of center-edge, I came to think of events associated with conspiracy theories. Here the “center” view could be the conventional view, while the conspiracy views are the edge. It is difficult for me to accommodate a standpoint that conspiracy theories should be accepted as equal to the conventional view. Nor is it clear to me that the center is uncontested and uncontroversial. Wikipedia, like a newspaper, has the ability to represent opposing viewpoints. This is done by attributing each viewpoint to the reliable sources that express it. For instance, central to the description of how a film was received are quotations from reviews in major newspapers and by notable reviewers.

I don’t see the support for the claim that the NPOV policy assumes a “politically dangerous ethical position”. On the contrary, Wikipedia has now, after the increase of fake news, been called the “last bastion”. The example given in The Atlantic post is the recent social media fuss with respect to Sarah Jeong, where Wikipedians arrive at a text with “shared facts about reality.”

Scholia is more than scholarly profiles

Scholia, a website originally started as a service to show scholarly profiles from data in Wikidata, is actually not just for scholarly data.

Scholia can also show bibliographic information for “literary” authors and journalists.

An example that I have begun on Wikidata is for the Danish writer Johannes V. Jensen, whose works pose a very interesting test case for Wikidata, because the interrelation between the works and editions can be quite complicated, e.g., newspaper articles being merged into a poem that is then published in an edition that is then expanded and re-printed… The scholarly and journalistic work about Johannes V. Jensen can also be recorded in Wikidata. Scholia currently records 30 entries about Johannes V. Jensen, and that does not necessarily include works about works written by Johannes V. Jensen.

An example of a bibliography of a journalist is that of Kim Wall. Her works almost always address unique topics, which makes them fairly relevant as sources in Wikipedia articles. Examples include an article on a special modern Chinese wedding tradition, Fairy Tale Romances, Real and Staged, and an article on furries, It’s not about sex, it’s about identity: why furries are unique among fan cultures.

An interesting feature of most of Wall’s articles is that she lets the interviewee have the final word by adding a quotation as the very final paragraph. That is also the case with the two examples linked above. I suppose that says something about Wall’s generous journalistic approach.



Addressing “addressing age-related bias in sentiment analysis”

Algorithmic bias is one of the hot topics of research at the moment. There are observations of trained machine learning models that display sexism. For instance, the paper “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings” (Scholia entry) neatly shows in its title one example of bias in word embeddings, which are shallow machine learning models trained on a large corpus of text.

A recent report investigated ageism bias in a range of sentiment analysis methods, including my AFINN word list: “Addressing age-related bias in sentiment analysis” (Scholia entry). The researchers scraped sentences from blog posts, extracted those sentences containing the word “old” and excluded the sentences where the word did not refer to the age of a person. They then replaced “old” with the word “young” (apparently “older” and “oldest” were also considered somehow). The example sentences they ended up with were, e.g., “It also upsets me when I realize that society expects this from old people” and “It also upsets me when I realize that society expects this from young people”. These sentences (242 in total) were submitted to 15 sentiment analysis tools and statistics were computed “using multinomial log-linear regressions (via the R package nnet […])”.

I was happy to see that my AFINN was the only one in Table 4 surviving the test for all regression coefficients being non-significant. However, Table 5 with implicit age analysis showed some bias in my word list.

But after a bit of thought I wondered how there could be any kind of bias in my word list. The paper lists an exponentiated intercept coefficient of 0.733 with a 95% confidence interval from 0.468 to 1.149 for AFINN. But if I examine what my afinn Python package reports for the words “old”, “older”, “oldest”, “young”, “younger” and “youngest”, I get all zeros, i.e., these words are not scored as either positive or negative:


>>> from afinn import Afinn
>>> afinn = Afinn()
>>> afinn.score('old')
0.0
>>> afinn.score('older')
0.0
>>> afinn.score('oldest')
0.0
>>> afinn.score('young')
0.0
>>> afinn.score('younger')
0.0
>>> afinn.score('youngest')
0.0

It is thus strange that there can be any form of bias, even a non-significant one. For instance, for the two example sentences “It also upsets me when I realize that society expects this from old people” and “It also upsets me when I realize that society expects this from young people” my afinn Python package scores them both with the sentiment -2. This value comes solely from the word “upsets”. There can be no difference between any of the sentences when you exchange the word “old” with “young”.
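The argument can be illustrated with a minimal Python sketch of word-list scoring. This is not the afinn package: the toy lexicon below is an assumption that contains only the single scored word mentioned above, and `score` and `swap_age_word` are hypothetical helpers written for the illustration.

```python
import re

# Toy stand-in for a sentiment word list (NOT the real AFINN list).
# Only "upsets" is scored, with the value -2 reported above; the age
# words are absent and therefore count as 0.
TOY_LEXICON = {"upsets": -2}


def score(text, lexicon=TOY_LEXICON):
    """Sum the valences of all words in the text; unknown words score 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def swap_age_word(sentence):
    """Replace the whole word 'old' with 'young', as in the paired sentences."""
    return re.sub(r"\bold\b", "young", sentence)


OLD = ("It also upsets me when I realize that society "
       "expects this from old people")
YOUNG = swap_age_word(OLD)

if __name__ == "__main__":
    print(score(OLD), score(YOUNG))
```

With a purely additive scorer like this, swapping one unscored word for another unscored word cannot change the sum, which is why both sentences of a pair receive the same score.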

In their implicit analysis of bias, where they use a word embedding, some bias could possibly creep in somewhere with my word list, although it is not clear to me how this would happen.

The question is then what happens in the analysis. Does the multinomial log-linear regression give a questionable result? Could it be that I misunderstand a fundamental aspect of the paper? While some data seem to be available here, I cannot identify the specific sentences they used in the analysis.

Frequent elements among the best Danish films

Bo Green Jensen has written the book De 25 bedste danske film (The 25 Best Danish Films), which includes, among others, Vredens Dag, Kundskabens træ, Babettes gæstebud and Den eneste ene. I have just entered this short list of 25 films, published in 2002, into Wikidata via the “catalog” property. Once that is done, the Wikidata Query Service can be used, with a SPARQL database query, to find elements that recur among the films. Such a SPARQL query could look like this:

# Count, for each value, the statements that link it to a film in the catalog
SELECT (COUNT(?item) AS ?count) ?value ?valueLabel WHERE {
  ?item wdt:P972 wd:Q12307844 .
  ?item ?property ?value .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],da,en". }
}
GROUP BY ?value ?valueLabel
HAVING (COUNT(?item) > 1)
ORDER BY DESC(?count)

This version counts films and orders the elements by the number of films in which each element occurs. The information in Wikidata is probably not entirely complete. With Magnus Manske's Listeria tool, however, one can have a table constructed which shows that each individual film is reasonably well covered.
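The grouping and filtering that such a query performs can be mimicked locally. The following is a minimal Python sketch under stated assumptions: the triples are a tiny hand-typed excerpt based on films and persons named in this post, not data fetched from Wikidata; the property names are informal labels rather than Wikidata identifiers; and `frequent_values` is a hypothetical helper.

```python
from collections import Counter

# (film, property, value) triples typed in by hand for illustration only.
TRIPLES = [
    ("Otte akkorder", "cast member", "Bodil Kjer"),
    ("John og Irene", "cast member", "Bodil Kjer"),
    ("Mød mig på Cassiopeia", "cast member", "Bodil Kjer"),
    ("Strømer", "cast member", "Bodil Kjer"),
    ("Strømer", "cast member", "Finn Nielsen"),
    ("Johnny Larsen", "cast member", "Finn Nielsen"),
    ("Babettes gæstebud", "cast member", "Finn Nielsen"),
    ("Strømer", "director", "Anders Refn"),
    ("Strømer", "screenwriter", "Anders Refn"),
    ("De frigjorte", "director", "Erik Clausen"),
]


def frequent_values(triples, min_count=2):
    """Count triples per value, keep values with at least min_count,
    and return them ordered by descending count."""
    counts = Counter(value for _film, _prop, value in triples)
    return [(value, n) for value, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    for value, n in frequent_values(TRIPLES):
        print(n, value)
```

Note that a value linked to the same film through two different properties is counted twice in this sketch, just as a count over matching statements would do, so the count is a number of statements rather than strictly a number of distinct films.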

The SPARQL is available here and the result can be seen here.

Not surprisingly, one of the elements found for all 25 films is that they are listed in De 25 bedste danske film. That is sort of a tautology… Going further down in frequency, we find that Bodil Kjer and Anne Marie Helger are the highest-ranked persons.

Bodil Kjer is probably mostly associated with black-and-white films from the 1940s and 1950s (in the list she appears as an actress in Otte akkorder, John og Irene and Mød mig på Cassiopeia), but she also made her mark later in her career, partly as a fragile lady in Strømer, partly in the first Danish Oscar-winning feature film. She is no surprise.

What I find surprising is that Anne Marie Helger has 5 elements and is thereby the second-highest person on the list. She is an actress in Strømer, Johnny Larsen, of course Koks i kulissen, and Erik Clausen's De frigjorte. She also figures as screenwriter on Christian Braad Thomsen's film.

A notch further down come Erik Balling, Ebbe Rode, Ib Schønberg and Anders Refn. Balling is producer on two films on the list and was responsible for both direction and screenplay on Poeten og Lillemor. Anders Refn is film editor on two and also had a double role with direction and screenplay for Strømer.

My namesake Finn Nielsen is on the list in connection with three films: Strømer, Johnny Larsen and Babettes gæstebud. Incidentally, he also gave a fine (“fin(n)” in Danish) performance in Kærlighedens smerte, which did not make the list because the director is already represented with Kundskabens træ.

Sweden is listed as co-production country on four films. This is particularly the case for films of recent years, but the first such film is actually Sult, which is from the 1960s.

And by the way, Bodil Kjer should be counted one extra time: as an extra 26th entry, Bo Green Jensen lists the Far til fire series. In this series there is a toy elephant named Bodil Kjer…