Month: June 2010

Any "responsible" business talk in the Danish blogosphere?


The front page of the 25 June 2010 edition of Ingeniøren, the Danish weekly engineering magazine, has a story on “issues” in some foreign mining companies: Newmont, Rio Tinto, Freeport McMoRan, GoldCorp, Vedanta and Anglo Platinum. The accusations put forward are environmental pollution, poor worker protection, forced removal of the local population, corruption and even murder. The story goes on:

Watchdog DanWatch has examined contracts and found that the Danish companies Grundfos and FLSmidth have the mining companies as customers, e.g., selling them pumps. Perhaps nothing is illegal about that. However, the two companies have stated corporate social responsibility (CSR) policies. For investors, FLSmidth writes on its website about suppliers:

In its general conditions of purchase, FLSmidth requires that subsuppliers comply with all local regulations concerning employee rights and safety and health.

And the two companies have also committed to the United Nations Global Compact, a voluntary CSR commitment they can use for bragging. In interviews, neither Kim Nøhr Skibssted from Grundfos nor Jørgen Huno Rasmussen from FLSmidth believes that the companies have any responsibility for their customers' actions. To me that seems a fundamental misunderstanding of what a CSR commitment entails, though the issue of CSR is not as clear for a customer relation as it is for a supplier relation.

In our project Responsible Business in the Blogosphere we seek to examine how CSR issues appear in social media, so how does the present story diffuse into the blogosphere?

A search on FLSmidth and on Grundfos on Twitter returns nothing in relation to the story.
Searching the Facebook graph for Grundfos and FLSmidth gets you two items (one from DanWatch), both linking to Ingeniøren. Searching with a Danish blog and news search engine, I see only Ingeniøren as well as one web page from Copenhagen Business School stating that a faculty member has been cited in Ingeniøren. Google blog search has nothing. And so far Ingeniøren’s web pages for the main story, the Grundfos interview and the FLSmidth interview all have zero comments.

It appears that very little is written about such a Danish CSR story in the blogosphere, which is in line with my general impression of other Danish CSR cases. The lack of CSR talk in the blogosphere will challenge our project.


The poetry of politicians: Lars Løkke Rasmussen’s attempt


So Donald Rumsfeld made his famous poetry, which according to the BBC was:

there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.

From a statistics point of view the distinction between the three types makes sense: Known knowns are fixed parameters in a statistical model, known unknowns are parameters to be estimated, while unknown unknowns are confounds that enter the model as ‘noise’. In most cases you hope this noise has a reasonable statistical distribution (e.g., Gaussian and not a black-swan distribution) and is not too correlated with the parameters in the statistical model you are about to estimate.
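The three roles can be made concrete with a toy model. The following is only a sketch under my own assumptions: a straight line with a slope that is taken as given (the known known), an intercept that is estimated from data (the known unknown), and Gaussian noise standing in for everything not modelled (the unknown unknowns). The function name and all numbers are made up for illustration.

```python
import random
import statistics


def estimate_intercept(xs, ys, slope):
    """Least-squares intercept when the slope is held fixed.

    With the slope fixed, the best intercept is simply the mean residual.
    """
    return statistics.fmean([y - slope * x for x, y in zip(xs, ys)])


random.seed(0)

SLOPE = 2.0            # known known: fixed parameter, never estimated
TRUE_INTERCEPT = 5.0   # known unknown: hidden from the estimator
NOISE_SD = 1.0         # spread of the unmodelled 'noise'

xs = [i / 10 for i in range(1000)]
ys = [SLOPE * x + TRUE_INTERCEPT + random.gauss(0.0, NOISE_SD) for x in xs]

intercept_hat = estimate_intercept(xs, ys, SLOPE)
print(f"estimated intercept: {intercept_hat:.2f}")
```

Because the simulated noise is Gaussian and independent of x, the estimate lands close to the true intercept. If the noise were heavy-tailed or correlated with x, that would no longer be guaranteed, which is the hope expressed above.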

Now I hear through Mikkel Wallentin that our Danish Minister of State (Prime Minister) Lars Løkke Rasmussen has attempted to follow Rumsfeld in repeated-pattern politician poetry during a discussion about tax. A diligent person has put it on YouTube, and my transcription is:

Så har vi altså valgt at lave et system hvor man skal aflevere lidt mindre end man gjorde før. Det er det vi har valgt, og det fører selvfølgelig til at dem der tjener mere og afleverer meget og nu afleverer lidt mindre – ja – de afleverer så mere mindre end dem der tjener lidt mindre og afleverer mindre men altså så afleverer mindre mindre.

With some loss in translation, my attempt at an English version is:

So we have thus chosen to make a system where you must hand in a bit less than you did before. It is what we have chosen, and of course it leads to that those who earn more and hand in a lot and now hand in a bit less – well – they hand in more less than those who earn a bit less and hand in less and thus hand in less less.

I don’t think it was a prepared speech. It seems to lack the epistemological depth of Rumsfeld, and perhaps even logic. But humorous it is, which Lars Løkke himself also seems to realize from around 0:19 in the video.

Two-way citations in MediaWiki


There have been some discussions on bibliographies and citations in wikis. Recently, Wikimedian Samuel Klein wrote an entry on the Wiki research mailing list pointing to the Wikimedia proposals WikiTextrose and Wikicite.

There is a general problem with citations: To cite the Wikicite page, “[a] fact is only as reliable as the ability to source that fact, and the ability to weigh carefully that source”. Think of the following sentence with Niels Bohr awfully miscited:

John Wayne is probably the best Flamenco dancer in the World (Bohr, 1913).

Although the MediaWiki software has an extension to structure footnotes on the individual wiki page, there is really no technical help to ensure that the stated fact is supported by the source. And we usually cannot go to the “(Bohr, 1913)” Wikipedia page and check which other wiki pages use (Bohr, 1913) as a source.

It is, however, possible to some extent to get MediaWiki to use more structured citations. In my Brede Wiki I make a wiki page for each primary source, in my case mostly scientific articles. Furthermore, the cite journal template in the Brede Wiki automatically makes wiki links based on the title parameter. So when I write “An early human brain mapping study with positron emission tomography found that the temporoparietal area was involved in spatial attention,[1]” on a page and the citation uses the cite journal template, I get a wiki link to “A PET study of visuospatial attention”. With the “What links here” link on that page it is possible to see the citations, although not in particular detail. At least it allows us to go both ways in the citations.

WikiTextrose and Wikicite go further in an attempt to ensure that facts are cited, and cited correctly.

The syntax suggested by WikiTextrose for a structured citation looks like this:

[[cite:isbn:067943593X:p11|”In the Second Century of the Christian Era, the empire of Rome comprehended the fairest part of the earth and most civilized portion of mankind.”| Gibbon describes the Roman empire at the time of the Antonines in very favorable terms.]]

Wikicite suggests a wiki syntax extension with “++fn”, e.g.,

Columbus was most likely Genoese++fn, although ++some historians claim he could have been born in other places, from the Crown of Aragon to the Kingdoms of Galicia or Portugal++fn, or in the Greek island of Chios++fn among others.

With a form interface Wikicite editors should then add the bibliographic details for each ‘fn’ instance. With an extra tool it would be possible to perform an article review on the citations, see the Wikicite review mockup.

Instead of having the bibliographic entries of the sources on the wiki itself, e.g., on Wikipedia, the proposers suggest Wikicat as a bibliographic catalog used to support Wikicite and WikiTextrose.

I have been wondering whether it would be possible to do more precise two-way citations with MediaWiki and the Semantic MediaWiki extension, and I have come up with the following scheme:

On a page with a primary source, here Scientific citations in Wikipedia, we add the bibliographic details in an infobox-like template as well as the “facts” of the article within “fact” templates:

{{paper
| title = Scientific citations in Wikipedia
| author1 = Finn Årup Nielsen
| journal = First Monday
| volume = 3
| pages = 26
| year = 2009
}}

== Facts ==
{{fact|Wikipedia uses scientific citations}}
{{fact|Scientific citations in Wikipedia shows some correlation with ordinary scientific citations}}

On a page that uses one of the facts we write, e.g.,

{{cite|Wikipedia uses scientific citations}}

== References ==

The fact template uses semantic markup and a semantic query:

# {{{1}}} [[fact::{{{1}}}| ]] (citations: {{#ask: [[Cite::{{{1}}}]] | format=list }})

The paper template here uses only semantic markup, e.g.,

[[title::{{{title}}}]] is authored by [[has author::{{{author1}}}]] and published in [[{{{journal}}}]] in [[has publication year::{{{year}}}]].

The cite template makes use of the semantic search and query:

[[Cite::{{{1}}}]]{{#tag:ref | {{#ask: [[Fact::{{{1}}}]] | ?has author | mainlabel=- | format=list | headers=hide }} ({{#ask: [[Fact::{{{1}}}]] | ?has publication year | mainlabel=- | format=list | headers=hide }}) {{#ask: [[Fact::{{{1}}}]] }} }}.

This last template uses the somewhat obscure “#tag:ref” construct since template parameters do not work in the standard application of the <ref> tag.
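To make the two directions explicit, here is a small Python sketch of the lookups that the fact and cite templates ask Semantic MediaWiki to perform. It is only a model of the idea, not of how Semantic MediaWiki stores or queries its data; the function name and the citing page name “Example article” are hypothetical.

```python
from collections import defaultdict

# Source page -> facts declared on it with {{fact|...}}
facts_by_source = {
    "Scientific citations in Wikipedia": [
        "Wikipedia uses scientific citations",
        "Scientific citations in Wikipedia shows some correlation "
        "with ordinary scientific citations",
    ],
}

# Citing page -> facts it uses with {{cite|...}} (hypothetical page name)
cites_by_page = {
    "Example article": ["Wikipedia uses scientific citations"],
}


def build_indexes(facts_by_source, cites_by_page):
    """Return (fact -> source page, fact -> list of citing pages)."""
    source_of = {}
    for source, facts in facts_by_source.items():
        for fact in facts:
            source_of[fact] = source
    cited_on = defaultdict(list)
    for page, facts in cites_by_page.items():
        for fact in facts:
            cited_on[fact].append(page)
    return source_of, dict(cited_on)


source_of, cited_on = build_indexes(facts_by_source, cites_by_page)

fact = "Wikipedia uses scientific citations"
print(source_of[fact])  # forward: which source supports the fact
print(cited_on[fact])   # backward: which pages cite the fact
```

The forward lookup corresponds to the query in the cite template (find the page holding the fact), and the backward lookup to the query in the fact template (find the pages citing it).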


From the folklore of network analysis: The Erdos-Bacon number


I have just discovered that I have an entry on IMDb through director Dola Bonfils' documentary film Tankens Anatomi (The anatomy of thought). It is from 1997, but I do not recall seeing an entry for the film or for me on IMDb before. It almost makes it easy to compute my Bacon number. The web service The Oracle of Bacon allows you to type in the names of two IMDb-listed people, and it will then find the shortest path between them. However, I don’t seem to be present in The Oracle of Bacon database. Danish entertainer, scientist and author Peter Lund Madsen also appears in the Tankens Anatomi movie, and he is present in the Oracle. Depending on the options set in The Oracle of Bacon it is possible to get to Kevin Bacon, although we need to go over, e.g., Mr Nice Guy, which is just a recorded comedy show released on video. Mr Nice Guy features Trine Dyrholm, who is a “proper” actress, and from her it gets easy, e.g., by P.O.V. to Gareth Williams and Digging to China with Kevin Bacon. So it seems that I have a Bacon number of 4.

My combined Erdos-Bacon number then drops to 7.

In our research group we have relatively low Erdos numbers since our hub, Professor Lars Kai Hansen, wrote the concisely titled paper Neural Network Ensembles with Peter Salamon, a researcher with an Erdos number of 1. The Hansen-Salamon paper from 1990 has become the most cited from our department (as far as I can determine). With Lars Kai I have written a large number of articles, e.g., Modeling of activation data in the BrainMap™ database: Detection of outliers.
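The arithmetic behind the 7 can be sketched as two shortest-path searches, one in the film graph and one in the co-authorship graph. The edges below are only the links mentioned in this post, so the graphs are tiny, and the breadth-first search is a generic textbook routine; I make no claim about how The Oracle of Bacon computes its paths.

```python
from collections import deque


def undirected(edges):
    """Build an adjacency dict from a list of (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, start, goal):
    """Number of edges on a shortest path, or None if unreachable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return None


# Shared films and videos mentioned in the post
films = undirected([
    ("Finn Årup Nielsen", "Peter Lund Madsen"),  # Tankens Anatomi
    ("Peter Lund Madsen", "Trine Dyrholm"),      # Mr Nice Guy
    ("Trine Dyrholm", "Gareth Williams"),        # P.O.V.
    ("Gareth Williams", "Kevin Bacon"),          # Digging to China
])

# Shared papers mentioned in the post
papers = undirected([
    ("Finn Årup Nielsen", "Lars Kai Hansen"),
    ("Lars Kai Hansen", "Peter Salamon"),        # Neural Network Ensembles
    ("Peter Salamon", "Paul Erdos"),             # Erdos number of 1
])

bacon_number = distance(films, "Finn Årup Nielsen", "Kevin Bacon")
erdos_number = distance(papers, "Finn Årup Nielsen", "Paul Erdos")
print(bacon_number, erdos_number, bacon_number + erdos_number)
```

With these edges the Bacon number is 4 and the Erdos number is 3, and the combined number is their sum.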

Seven is still far from the five of Kiralee Hayashi, a former gymnastics champion, former scientist and present actress. According to her LinkedIn profile she has worked at the Laboratory of Neuro Imaging (LONI), a well-known neuroimaging research group. With noted neuroimaging researcher Paul Thompson she is on the author list together with big-shot mathematician Shing-Tung Yau, who has an Erdos number of 2, according to Paul Thompson’s Wikipedia-cited Erdos number page. Their paper is Brain Surface Parameterization Using Riemann Surface Structure.

Now I have been trying to compute my Hayashi-Hayashi number. This must be 12 or less. Paul Thompson has a Hayashi-science number of one, and through In vivo evidence for post-adolescent brain maturation in frontal and striatal regions Californian Terry Jernigan gets a Hayashi-science number of 2. (See also the entry for the paper in the Brede Wiki.) Terry is also in our Danish CIMBI brain project, and with Jan Kalbitzer’s interesting neuroimaging seasonality paper Seasonal Changes in Brain Serotonin Transporter Binding in Short Serotonin Transporter Linked Polymorphic Region-Allele Carriers but Not in Long-Allele Homozygotes, where both Terry and I are on the author list, I get a Hayashi-science number of just 3!

Allowing for the documentary/video trick and with Kiralee Hayashi and Trine Dyrholm in The Oracle of Bacon I get a Hayashi-film number of 5, and my Hayashi-Hayashi number then becomes 8.

What a small world.

(minor edit: 2012-10-16)


Closed data in climate research


Peter Murray-Rust, Reader in Molecular Informatics at the University of Cambridge, reports from a meeting held at the Royal Institution on 14 June 2010 about the hacked emails from the Climatic Research Unit (CRU) that became public right before the COP15 meeting in 2009. He has summed up the meeting with an entry in his blog where he focuses on the issue of Open Data. One interesting bit in Murray-Rust’s entry is this:

” On more than one occasion the panel asserted that Climate data should only be analysed by experts and that releasing it more generally would lead to serious misinterpretations. It was also clear that on occasions data and been requested and refused. The reason appeared to be that these requests were not from established climate “experts”. This had led to the Freedom Of Information Act (FOI) being used to request Scientific Data from the unit. This had reached such a degree of polarisation that of over 100 requests only 10 had resulted in information being released by the University.”

I suppose that the CRU gets loads of annoying emails from people entrenched in a ‘climate denier’ camp, and researchers at CRU could probably spend the rest of their lives serving the endless needs of these people. But I concur with Murray-Rust that

“The CRU is effectively a publicly funded body (as far as I know there is minimal industrial funding) and I believe there is a natural moral, ethical and political imperative to make the results widely available.”

One further issue is the “should only be analysed by experts” claim. One of the major articles in climate research, Global-scale temperature patterns and climate forcing over the past six centuries, has been under close scrutiny by other researchers, resulting in two reports: one by the Committee on Surface Temperature Reconstructions for the Last 2,000 Years of the National Research Council, and another referred to as the Wegman Report. Six years after the publication of the article, a corrigendum was published. The authors of the Wegman Report write:

“We note that there is no evidence that Dr. Mann or any of the other authors in paleoclimatology studies have had significant interactions with mainstream statisticians.”

As far as I understand, the critique centers on the data processing methods around a principal component analysis.

Data analysis is hard, and one rarely sees a researcher who is an expert both in statistics and in the application area.