“Overzealous business types”?


The University of Copenhagen and its problematic dismissal of the notable scientist Hans Thybo have now landed in a Nature editorial: “Corporate culture spreads to Scandinavia”. Its concluding claim is that “the threat is the colonization of universities by overzealous business types” (a threat to academic freedom).

Interestingly, although the law requires the majority of the university board members to come from outside the university (not necessarily from business), the university management usually has an academic background. This is also the case for the management around Hans Thybo:

  1. The head of department for Hans Thybo is Claus Beier, see “Hans Thybos institutleder om fyringssagen”. Beier has a PhD and is a professor with a long series of publications on climate change, as can be seen on Google Scholar.
  2. The dean is John Renner Hansen, see “KU spildte ½ million på konsulentundersøgelse af Thybo for misbrug af forskningsmidler”. He is also a researcher and claims to have “Approximately 600 publications in international refereed journals”.
  3. The head of the university is Ralf Hemmingsen, whom I know as a notable researcher in psychiatry.

I am not convinced by the arguments in the Nature editorial, which sets up “business types” against academics. I think the case should rather be seen against the background of the Milena Penkowa case and another story about the possible abuse of research funds at the Copenhagen University Hospital, see “Ny sag om fusk med penge til forskning”.

Guess which occupation is NOT the most frequent among persons from the Panama Papers


POLITICIAN! Politician is not a very frequent occupation among people in the Panama Papers. This may come as a surprise to those who have studied a bubble chart I put in a post on my blog. A sizeable portion of blog readers, tweeters and probably also Facebook users seem to have seriously misunderstood it. The crucial problem with the chart is that it is made from data in Wikidata, which contains only a very limited selection of persons from the Panama Papers. Let me give some background and detail the problem:

  1. Open Knowledge Foundation Danmark hosted a two-hour meetup in Cafe Nutid, organized by Niels Erik Kaaber Rasmussen, the day after the release of the Panama Papers. We were around 10 data nerds sitting with our laptops, and with the provided links most if not all of us started downloading the Panama Papers data files with the names and company information. Some tried installing the Neo4J database, which may help in querying the data.
  2. I originally spent most of my time at the cafe looking through the data by simple means. I used something like ‘egrep -i denmark’ on the officers.csv file. This quick command will likely pull out most of the Danish people in the released Panama Papers. The result is a small, manageable list of no more than 70 listings. Among the names I recognized NO politician, neither Danish nor international.
  3. The Danish broadcasting company DR has had priority access to the data. It is likely that they have examined the more complete data in detail. It is also likely that if there had been a Danish politician in the Panama Papers, DR would have focused on that and broken the story. NO such story came. Thus I think it is unlikely that there are any Danish politicians in the more complete Panama Papers dataset.
  4. Among the Danish listings in the officers.csv file from the released Panama Papers we found a couple of recognizable names. One of them was Knud Foldschack. Already on Monday, the day of the release, a Danish news site had run a story about that name. One Knud Foldschack is a lawyer who has involved himself in cases for left-wing causes. Having such a lawyer mentioned in the Panama Papers was a too-good-to-be-true media story, and indeed it was not true. It turned out that Knud Foldschack has both a father and a brother with the same name, and the news site can now look forward to meeting one of the Foldschacks in court, as he wants compensation for being wrongly smeared. His brother seems to be some sort of businessman. René Bruun Lauritsen is another name in the Danish part of the Panama Papers. A person bearing that name has received unfavourable mention in Danish media. One of the stories concerned his scheme of selling semen to women who wanted to become pregnant. His unauthorized handling of semen with hand delivery earned him a minor sentence. Another scheme involved outrageous stock trading. Whether Panama-Lauritsen is the same person as Semen-Lauritsen I do not know, but one would be disappointed if such an unethical businessman were not in the Panama Papers. A third listing shares a fairly unique name with a Danish artist. To my knowledge Danish media have not run any story on that name. The overall conclusion from the small sample investigated is that politicians are not present, while the names may belong to business people and possibly an artist.
  5. Wikidata is a site in the Wikipedia family of sites. Though not well known, Wikidata is one of the most interesting projects related to Wikipedia, and in terms of main namespace pages it is far larger than the English Wikipedia. Wikidata may be characterized as the structured cousin of Wikipedia. Rather than editing in free-form natural language as you do in Wikipedia, in Wikidata you only edit predefined fields. Several thousand types of fields exist. To describe a person you may use fields such as date of birth, occupation, authority identifiers such as VIAF, homepage and sex/gender.
  6. So what is in Wikidata? Items corresponding to almost all Wikipedia articles appear in Wikidata, not just the articles in the English Wikipedia but those in every language version of Wikipedia. Apart from these items, which can be linked to Wikipedia articles, Wikidata also has a considerable number of other items. For instance, one Dutch user has created items for a great number of paintings in the National Gallery of Denmark, paintings which for the most part have no Wikipedia article in any language. Although Wikidata records an impressive number of items, it does not record everything. The number of persons in Wikidata is only 3,276,363 at the time of writing, and it rarely includes persons who have not made their mark in the media. The typical listing in the Panama Papers is a relatively unknown man. He is unlikely to appear in Wikidata, and no one adds such a person just because he or she is listed in the Panama Papers. Obviously Wikidata has an extraordinary bias towards famous persons: politicians, nobility, sports people, artists, performers of any kind, etc.
  7. Items for persons in Wikidata who also appear in the Panama Papers can indicate a link to the Panama Papers. There is no dedicated way to do this, but the ‘key event’ property has been used for it. It is apparently the noted Wikimedian Gerard Meijssen who has made most of these edits. How complete the tagging is with respect to persons in Wikidata I do not know, but Meijssen also added two Danish football players who I believe were only mentioned in Danish media. He could have relied on the English Wikipedia, which had an overview of people listed in the Panama Papers.
  8. When we have data in Wikidata, there are various ways to query the data and present them. One way is to use wiki whizkid Magnus Manske’s Listeria service with a query on any Wikipedia. Manske’s tool automagically builds a table with the information. Wikimedia Danmark chairman Ole Palnatoke Andersen had apparently discovered Meijssen’s work on Wikidata, and Palnatoke used Manske’s tool to make a table of all people in Wikidata marked with the ‘key event’ “Panama Papers”. It generates only a fairly small list, as not that many people in Wikidata are actually linked to the Panama Papers. Palnatoke also let Manske’s tool show the occupation of each person.
  9. Back to the Open Knowledge Foundation meeting in Copenhagen on Tuesday evening: I was a bit disappointed at not being able to mine any useful information from the Panama Papers dataset. So after becoming aware of Palnatoke’s table I grabbed (stole) his query statement and modified it to count the number of persons per occupation. The Wikimedia Foundation, the organization that hosts Wikipedia and Wikidata, has set up a so-called SPARQL endpoint with an associated graphical interface. It allows any Web user to make powerful queries across all of Wikidata’s many millions of statements, including the limited number of statements about the Panama Papers. The service is under continuous development and has in the past been somewhat unstable, but it is nevertheless a very interesting service. In 2016 frontend developer Jonas Kress implemented several ways to display the query result. Initially there was just a plain table view, but the interface now shows results on a map if the query result includes geocoordinates, and as a bubble chart if it includes numerical data. Other output forms implemented later are timelines, multiview and networks. Making a bubble chart with counts of occupations takes nothing more than a couple of lines in the SPARQL language and a push on the “Run” button. So the Panama Papers occupation bubble chart should be seen as a demonstration of the capabilities of Wikidata and its associated services for quick queries and visualizations rather than as a faithful representation of the occupations of people mentioned in the released Panama Papers.
  10. A sizeable portion of people misunderstood the plot and regarded it as evidence of the dark deeds of politicians. Lacking a good understanding of the technical details of Wikidata, people used their preconceived opinions about politicians to interpret the bubble chart. They were helped along the way by what was, in my opinion, a misleading title (“Panama Papers bubble chart shows politicians are most mentioned in document leak database”) and an incomplete explanation in an article in The Independent. On the other hand, Le Monde had a good critical article.
  11. I believe my own blog, where I published the plot, was not to blame. It does include the SPARQL command, so any knowledgeable person can see and modify the results himself or herself. Perhaps some people were confused by my blog describing me as a researcher and thought that this was a research result on the Panama Papers.
  12. My blog has had 20,000 views in its several years of existence. The single post with the Panama Papers bubble chart yielded a tenfold increase in the number of views over the course of a few days, my first experience with a viral post. Most referrals were from Facebook. The referral does not indicate which page on Facebook it comes from, so it is impossible to join the discussion and clarify any misunderstanding. A portion of the referrals also came from Twitter and Reddit, where I joined the discussion. I also tried to engage with social media users who used the WordPress comment feature on my blog. On Reddit I felt I got a good response, while Facebook struck me as irresponsible. Facebook boosts misconceptions and does not let me join the discussion to correct them.

    The plot of a viral post: Views on my blog around the time with the Panama Papers bubble chart publication.
  13. Is there anything I could have done? I could have deleted my two tweets and modified my blog post to introduce a warning with a stronger explanation.
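The quick egrep filter from item 2 can also be sketched in Python, which makes it easy to count the matches. This is a minimal sketch and not the command I ran at the meetup: like egrep -i it matches “denmark” case-insensitively anywhere on the line, so it assumes nothing about the column names in officers.csv, and the helper name `matching_rows` is made up for the illustration.

```python
import csv


def matching_rows(lines, needle="denmark"):
    """Return parsed CSV rows whose raw line contains `needle`, ignoring case.

    Mirrors `egrep -i denmark officers.csv`: the match is on the whole line,
    so no assumption is made about which column holds the country.
    """
    needle = needle.lower()
    hits = [line for line in lines if needle in line.lower()]
    # Parse only the matching lines so that quoted commas are handled
    return list(csv.reader(hits))
```

Calling `matching_rows(open("officers.csv", encoding="utf-8"))` and taking `len()` of the result would give the number of listings. As with egrep, a match only says that the word appears somewhere on the line, not that the person is Danish.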

Summing up my experience with the release of the Panama Papers and the subsequent viral post, I find that our politicians turn out not to be corrupt and not to deal with shady companies, except for a few cases. Rather, it seems that loads of people had preconceived opinions about their politicians and are willing to spread their ill-founded beliefs to the rest of the world. They have little technical understanding and do not question data provenance. The problems may be amplified by Facebook.

And here is the now infamous plot:


Occupations of persons from Panama Papers


Can we get an overview of the occupations of the persons associated with the Panama Papers? Well … that might be difficult, but we can get a biased plot by using the listing in Wikidata, where persons associated with the Panama Papers seem to be tagged and where their occupations are listed. It produces the plot below.


It is fairly straightforward to construct such a bubble chart given the new plotting capabilities in the Wikidata Query Service. Dutch Wikipedian Gerard Meijssen seems to have been the one who entered the information in Wikidata linking the Panama Papers to persons via the ‘significant event’ property. How completely he has managed to do this so far I do not know. Our Danish Wikipedian Ole Palnatoke Andersen set up a page on the Danish Wikipedia at Diskussion:Panama-papirerne/Wikidata, tabulating the persons with the nice Listeria tool of Magnus Manske. Modifying Ole’s SPARQL query, we can get the count of occupations for the persons associated with the Panama Papers in Wikidata.

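# Count distinct persons per occupation among items linked to the Panama Papers
SELECT ?occupationLabel (COUNT(DISTINCT ?person) AS ?count) WHERE {
  ?person wdt:P793 wd:Q23702848 ;   # 'significant event': Panama Papers
          wdt:P106 ?occupation .    # occupation
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} GROUP BY ?occupationLabel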
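
The same query can also be run outside the graphical interface. The sketch below is an illustration under assumptions rather than part of the original workflow: it assumes the public endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results format, and the function names are made up for the example. Only the parsing step works without network access.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"  # assumed public endpoint

QUERY = """
SELECT ?occupationLabel (COUNT(DISTINCT ?person) AS ?count) WHERE {
  ?person wdt:P793 wd:Q23702848 ; wdt:P106 ?occupation .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} GROUP BY ?occupationLabel
"""


def fetch_results(query=QUERY, endpoint=ENDPOINT):
    """Send the query and return the decoded SPARQL JSON results (needs network)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "occupation-count-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def occupation_counts(results):
    """Turn SPARQL JSON results into (label, count) pairs, largest count first."""
    pairs = [
        (row["occupationLabel"]["value"], int(row["count"]["value"]))
        for row in results["results"]["bindings"]
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

`occupation_counts(fetch_results())` would then give the numbers behind the bubble chart as a sorted list, with the same bias as the chart itself.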

Some people may see that politicians are the largest group, but that might simply be an artifact of the notability criterion of Wikidata: Only people who are somewhat notable or are linked to something notable are likely to be included in Wikidata, e.g., the common businessman/woman may not (yet?) be represented in Wikidata.

The bubble chart truncates the occupation labels: ‘murd’ is murderer. Joaquín Guzmán has his occupation set to murderer in Wikidata, without a source…


On Henrik Krüger’s ‘Sømænd i Helvede’


It is strange that an enormous catastrophe with over a thousand killed can be dismissed as a tiny fraction of the Second World War’s sea of horror. In a way, the German surprise attack on the Italian port of Bari in 1943, in which they hit Allied ships loaded with conventional ammunition and mustard gas bombs, seems a parallel to Henrik Krüger’s book about it. Even though the event is referred to as the Little Pearl Harbor, the attack does not occupy a major place in the literature on the Second World War. Nor has Krüger’s book attracted much attention. Krüger published the book himself through the on-demand publisher Skriveforlaget, and I found it by chance in a sale at the local library for probably no more than 10 kroner.

I was myself surprised to read that poison gas had not merely been experimented with during the Second World War, but that a large number of poison gas bombs had also been manufactured and transported to Europe for storage, just in case. Krüger argues that several people died as a result of the secrecy about the poison gas cargo, poison gas that had rained down over soldiers and sailors after the ammunition ships had exploded. The reason we have heard so little about the attack may be that it simply took its place among the war’s ordinary deaths. It happened in less than an hour on 2 December 1943. That same night, according to A.C. Grayling’s tally, over 400 bombers were sent against Berlin, and the night after over 500 against Leipzig, where Grayling notes 1,717 dead. One grows thoughtful on hearing the German language among tourists whose forebears two generations back may have suffered in the hell of the firebombs.

Krüger writes that it is a story that has never been told. He does, however, rely on English-language books. Where he earns credit is through the Danish angle, for which he has interviewed several Danes connected with the ship named Lars Kruse. With this he commemorates the quiet, heroic effort of the Danish sailors.

From LibraryThing.

Critique of Thomas Ladegaard’s ‘Palmemordet’


So another Danish book about the Palme murder has arrived, and in the run-up to the 30th anniversary. The text runs to nearly 190 pages and is thus not as long as the Jan Bondeson book translated into Danish. While other Palme murder authors often choose an angle on the material, Thomas Ladegaard avoids introducing new tendentious theories in the book. Indeed, what is put forward in the way of new material or analysis is rather meagre, and Ladegaard does not really seem to offer new opinions. Take Christer Pettersson and the assessment of his guilt. There Ladegaard stays quite close to Gunnar Wall. For sources he apparently relies mainly on previously published books and especially the report of the Granskningskommissionen. There are a few interviews with Gunnar Wall and Poutiainen. In a way it is fine that Ladegaard keeps to the middle of the road.

Given that I have written most of the Danish Wikipedia article, I was quite curious whether Ladegaard’s book is influenced by it and whether there would be a reference to Wikipedia. As I see it, it is clear that Ladegaard has read the Danish Wikipedia article. The book’s structure seems to be inspired by the structure of the Danish Wikipedia article, and in certain sections one finds reminiscences that are due either to him having read Wikipedia and had it in the back of his mind, or to us sharing the same source.

For example, on page 85 Ladegaard writes: “The evidence against Underwood was namely overwhelming in the form of DNA, footprints, ballistic tests, finds of tape and Underwood’s earlier threats and jealousy”. On Wikipedia I have written: “As grounds for its decision the appeals court held that the evidence against Underwood was overwhelming: DNA, footprints, ballistic tests, finds of tape and Underwood’s earlier threats and jealousy”.

On page 140 Ladegaard writes: “Rimborn hurried back to police headquarters, after which a description was issued of two perpetrators who possibly belonged to the Ustasa movement. In the chaotic hours after the murder this description was sent out and contributed further to the confusion.” On Wikipedia: “Rimborn went back to police headquarters, where a description was drawn up of two men, with an indication that they possibly belonged to the Ustasja movement. In the chaotic hours after the murder this description was sent around by telex at about two o’clock in the night.” My text is sourced from Bondeson, as can be seen in the reference on Wikipedia. Bondeson writes: “Rimborn immediately returns to the chaotic police headquarters, where a general alarm to the entire Swedish police is drawn up.”

Here and there are other examples where Ladegaard lets himself be inspired. As I see it, this is not plagiarism. When I write on Wikipedia I also paraphrase, perhaps too much sometimes. Should Ladegaard have cited Wikipedia? Well, normally one should refrain from that. We Wikipedians normally make an attempt to be correct, but one should regard Wikipedia as a step on the way to the “real” sources and cite those instead. One may, if one wishes, thank Wikipedia as an entry portal. I do not find Wikipedia mentioned in the appendix.

There are two problems when writers borrow here and there from Wikipedia: 1) If they do not state that their paraphrase or quotation is from Wikipedia, it can come to look as if Wikipedia has borrowed from the writer and not the other way round. 2) If the writer believes that what is on Wikipedia is correct and uses it in his text without citing Wikipedia, a future Wikipedian may unfortunately come to use the writer’s text as a source on Wikipedia. Then a circular source reference arises, from which fiction can emerge. I do not see Ladegaard offending in this latter respect.

Overall I think the book is fairly well written, and it covers all the conventional topics with a fine weighting. It was new to me to hear about the “Stay behind” network. Here Ladegaard seems to rely on Gunnar Wall’s latest book, which I have not had the opportunity to read. Another story I had not heard before was about the Swedish DC-3 aircraft shot down by the Soviet Union in 1952. One of the chapters is a small Palme biography, and Ladegaard mentions by name the police officers who have been named as involved in the case.

If I am to criticize elements of the account: Ladegaard writes on page 61 that if the witness Morelius’s time statement holds, then “Palme fell victim to a crime in which several people were involved”. I actually do not agree. One speculation that has apparently never been put forward is that the perpetrator stood at Dekorima, waited and shot Palme. The murder could, for example, have happened as a failed street robbery, not that I give that theory much weight myself. Another element is the mention of one of the scandals of the 1970s. Here the book says “Minister of Justice Lennart Geijer had made use of a prostitution ring”. I was under the impression that he was merely accused of it. I may be wrong, but on Wikipedia I have at any rate worded it as “Palme’s Minister of Justice Geijer and the opposition leader Thorbjörn Fälldin were accused of having bought sexual services.” There is a difference between using a prostitution ring and being accused of using a prostitution ring.

On page 35 Ladegaard writes “…the couple decided to cross to the opposite pavement, as Lisbet wanted to look at a dress in a shop window.” I was under the impression that it was only after they had crossed the street that Lisbet chose to stop at the shop window, but I can see that Bondeson also writes that the couple crossed the street because Lisbet wanted to. It otherwise figures in one of the fringe theories that Palme was supposed to, or wanted to, meet someone on the way. Page 148 of the Granskningskommissionen report does not mention who made the decision to cross the street.

Like so many other Palme murder authors, Ladegaard chooses to criticize the efforts of the Swedish police strongly. That seems too easy to me. At some point I may write about how the Swedish police have probably not been quite as bad as they are often painted.

From LibraryThing.

Strategies of Legitimacy Through Social Media: The Networked Strategy


Several years ago we started a research project, Responsible Business in the Blogosphere, together with, among others, members of the Corporate Social Responsibility (CSR) group at the Copenhagen Business School (CBS). The research project looked at social media, companies and their corporate social responsibility. The start of the project coincided with the ascent of Twitter, and a number of our research publications from the project considered data analysis of Twitter messages. Among them were my “A new ANEW: Evaluation of a word list for sentiment analysis in microblogs”, about the development and evaluation of my sentiment analysis word list AFINN, and “Good Friends, Bad News – Affect and Virality in Twitter”, with an analysis of information diffusion on Twitter, that is, retweets.

“Strategies of Legitimacy Through Social Media: The Networked Strategy” is our latest published work in the project. It describes a pharmaceutical company adopting Twitter for communication of CSR-related topics. It is a longitudinal case study with interviews of the people behind the company Twitter account and data mining of tweets. Itziar Castelló, Michael Etter and I authored the paper.

While I took part in neither the interviews nor the interesting analysis of that information, I did the sentiment analysis and topic mining of the tweets that we collected from the company Twitter account and by searching for the company name via the Twitter search API. The results are displayed in Table I and Figure 2 of the paper.
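
Word-list sentiment analysis of the kind AFINN supports can be sketched in a few lines. The snippet is only an illustration: the three-word lexicon and its scores are made up for the example and are not the AFINN list, the tokenization is deliberately crude, and it is not the code used for the paper.

```python
import re

# Toy valence lexicon, hypothetical values for illustration only (not AFINN)
TOY_LEXICON = {"good": 3, "bad": -3, "terrible": -4}


def sentiment(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word in `text`; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)
```

A tweet is then scored by summing the valences of its words, so that positive and negative words can cancel each other out.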

One passage from the paper that I find interesting concerns the issues the company faced as it developed its social media approach:

“… the institutional orientation to hierarchical processes requiring approval for all forms of external communication; and the establishment of fixed working hours that ended at 4pm local time coexisting alongside a policy that customer complaints must be resolved within 48 hours, which prevented SED managers from conducting real-time conversations over the Twitter platform.”

Our paper argues for “a new, networked legitimacy strategy” for stakeholder engagement in social media with “nonhierarchical, non-regulated participatory relationships”.

Strategies of Legitimacy Through Social Media: The Networked Strategy is available gratis in September 2015.

Review of Val McDermid’s “Forensics: The anatomy of crime”


Val McDermid, apparently an author of some standing as a writer of untrue crime novels, has written a true crime walkthrough of forensics topics, interweaving real-life cases and comments. The fine selection of topics has no overall progressive narrative, to such an extent that most of the chapters could have been permuted without loss of coherence. If there is a basis for the book, it is a fascination with and awe of modern forensics. She is a good writer. Perhaps her crime novels have trained her in writing clear prose. She does not delve into academic technicalities that could perhaps have been interesting.

She has based her book on other books as well as a good number of interviews with a broad range of forensics experts. A few of these come from the University of Dundee: forensic chemist Niamh Nic Daéid and forensic anthropologist Sue Black.

I find McDermid’s view of the fallibility of forensics balanced, drawing forth cases where presumed experts lack self-critique. Bernard Spilsbury and a U.S. ballistics expert, Thomas Quirk, are criticized. For Roy Meadow, McDermid presents aspects of the tragic Sally Clark case that I do not recall having read before: the appeal was prompted not by Meadow’s evidence but by the pathologist Alan Williams, who had failed to disclose blood test results. I do sometimes find that popular science writing lacks an appropriate level of critique of the material. McDermid is one of the better writers, but I do find one case where she oversteps the confidence we should have in science. Here is what she writes on page 164: “We already know, for instance about the existence of a ‘warrior gene’ – present mainly in men – which is linked with violent and impulsive behaviour under stress”. When I read “We know” I get mad, and when I read ‘warrior gene’ I get extra mad. Behavioral genetics is a mess full of red herrings. A recent meta-analysis of the warrior gene polymorphism MAOA-uVNTR and antisocial behavior (“Candidate Genes for Aggression and Antisocial Behavior: A Meta-analysis of Association Studies of the 5HTTLPR and MAOA-uVNTR”) reaches a 95% confidence interval of 0.98-1.32 while, interestingly, reporting a very low p-value (0.00000137). The strange discrepancy between confidence interval and p-value is discussed in the paper and presently goes over my head. What seems reasonably certain is that there is a lot of between-study heterogeneity. Any talk of a warrior gene needs to acknowledge the uncertainty.
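
To see why the interval and the p-value look strange together, one can back out the p-value that the interval alone would imply. This is a rough sketch under assumptions that are not taken from the meta-analysis: it treats 0.98-1.32 as a 95% interval for a ratio measure such as an odds ratio, symmetric on the log scale, and uses a normal approximation. The function name is invented for the illustration.

```python
import math


def p_value_from_ci(lower, upper, z_crit=1.959964):
    """Two-sided p-value implied by a 95% CI for a ratio (normal approximation on the log scale)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    estimate = (log_lower + log_upper) / 2               # log of the point estimate
    std_error = (log_upper - log_lower) / (2 * z_crit)   # half-width divided by 1.96
    z = estimate / std_error
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))


p = p_value_from_ci(0.98, 1.32)
print(round(p, 3))
```

Under these assumptions the interval corresponds to a p-value of roughly 0.09, nowhere near 0.00000137, so the two reported numbers presumably come from different models or tests.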

There are certainly more elements to forensics than McDermid presents. A Danish newspaper recently ran a story about cell phone tower records used in court cases. A person carrying a powered cell phone reveals his or her location, but only with limited precision. Cell phones do not necessarily select the nearest cell tower. From my own experience I know that my cell phone can select cell towers in countries other than the one I am in, e.g., my cell phone in Nordsjælland in Denmark can easily select a cell tower in Sweden 15 to 20 kilometers or more away, and my cell phone in Romania switched to a Ukrainian cell tower perhaps 20 kilometers or more away. The U.S. state of Oregon has seen the case of Lisa Marie Roberts, who on bad advice from her lawyer pleaded guilty in 2004 because of supposedly critical cell tower evidence. In 2013 she was freed.

I was struck by one of the stories presented that originates from the book of criminal lawyer Alex McBride. A surveillance camera records a case of apparently straightforward violence, but McBride is able to get his client off by threatening to use another part of the camera recording showing a policeman mishandling a person in a case of wrongful arrest. The prosecution dropped the charge for the original case. It does not seem fair to the victim of the original crime that the criminal can go free just because another crime is committed. To me it looks like a kind of corruption and extortion.

(Review also available on LibraryThing)