Several years ago we started a research project, Responsible Business in the Blogosphere, together with, among others, members of the Corporate Social Responsibility (CSR) group at the Copenhagen Business School (CBS). The research project looked at social media, companies and their corporate social responsibility. The start of the project coincided with the ascent of Twitter, and a number of our research publications from the project considered data analysis of Twitter messages. Among them were my A new ANEW: Evaluation of a word list for sentiment analysis in microblogs, about the development and evaluation of my sentiment analysis word list AFINN, and Good Friends, Bad News – Affect and Virality in Twitter, with an analysis of information diffusion on Twitter, that is, retweets.
Strategies of Legitimacy Through Social Media: The Networked Strategy is our latest published work in the project. It describes a pharmaceutical company adopting Twitter for communication of CSR-related topics. It is a longitudinal case study with interviews of the people behind the company Twitter account and data mining of tweets. Itziar Castelló, Michael Etter and I authored the paper.
While I did not participate in the interviews or in the interesting analysis of that information, I did the sentiment analysis and topic mining of the tweets that we collected from the company Twitter account and by searching for the company name via the Twitter search API. The results are displayed in Table I and Figure 2 of the paper.
A note from the paper that I find interesting comments on the issues faced by the company as they developed the social media method:
“… the institutional orientation to hierarchical processes requiring approval for all forms of external communication; and the establishment of fixed working hours that ended at 4pm local time coexisting alongside a policy that customer complaints must be resolved within 48 hours, which prevented SED managers from conducting real-time conversations over the Twitter platform.”
Our paper argues for “a new, networked legitimacy strategy” for stakeholder engagement in social media with “nonhierarchical, non-regulated participatory relationships”.
Strategies of Legitimacy Through Social Media: The Networked Strategy is available gratis in September 2015.
So Posterous has been acquired by Twitter. Great. And Posterous Spaces will remain up and running without disruption. Great.
“Twitter says that it will give users “ample notice” if it is going to make any changes to the service. We’ll take them at their word on this one, but if I was someone running a personal blog on Posterous, I would think about finding another place to host it soon.”
“So, in other words, Posterous will be available to you now, but we’ll let you know if we plan on shutting it down. That must be a fairly likely scenario to warrant that language being included in the initial announcement of the acquisition.”
Our article Good Friends, Bad News – Affect and Virality in Twitter from the Responsible Business in the Blogosphere project has now been published by Springer. It has been available for some time from arXiv and our departmental publication database.
News often seems to focus on negativity: the News values article on Wikipedia states that “Bad news is more newsworthy than good news”. So we examined the interaction between “newsness”, sentiment and retweeting in Twitter messages: are news tweets with negative sentiment retweeted more? Read about that in the article.
A related recent paper is Does Bad News Go Away Faster? which has Jon Kleinberg among the co-authors.
Our principal investigator in the Responsible Business in the Blogosphere project, Professor Mette Morsing, and Itziar Castelló have written a Danish article that mentions the Good Friends, Bad News article in relation to how companies use, and are exposed on, social media.
Unfortunately, the workshop SocialComNet 2011, where the article should have been presented, had to move to another venue, which was announced only about one month prior to the event.
I have previously blogged about sentiment analysis. Code for simple sentiment analysis with my AFINN sentiment word list is also available from the appendix in the paper A new ANEW: Evaluation of a word list for sentiment analysis in microblogs as well as ready for download. It might be a little difficult to navigate the code, so here I have made the simplest example in Python of sentiment analysis with AFINN that I could think of.
(2012-12-01: Updated link to new gist at github)
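For readers who cannot reach the gist, the idea of word-list sentiment scoring can be sketched as follows. This is not the code from the gist or the paper: it simply sums the valence of each known word, the tiny `TOY_SCORES` dictionary holds made-up placeholder scores rather than real AFINN values, and `load_word_list` assumes a tab-separated file with one `word<TAB>score` pair per line.

```python
import re

# Made-up placeholder scores for illustration only; NOT the real AFINN values.
TOY_SCORES = {"good": 3, "great": 3, "bad": -3, "terrible": -3, "wow": 4}

# Lowercase words, allowing apostrophes (e.g. "can't").
WORD_PATTERN = re.compile(r"[a-z']+")


def sentiment(text, scores=TOY_SCORES):
    """Return the sum of the valences of all known words in the text."""
    words = WORD_PATTERN.findall(text.lower())
    return sum(scores.get(word, 0) for word in words)


def load_word_list(filename):
    """Load a word list, assuming one 'word<TAB>score' pair per line."""
    scores = {}
    with open(filename, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                word, score = line.rstrip("\n").split("\t")
                scores[word] = int(score)
    return scores


if __name__ == "__main__":
    print(sentiment("Wow, what a great day"))
    print(sentiment("That was a bad, terrible idea"))
```

With a real word list on disk, one would pass `scores=load_word_list(...)` instead of the toy dictionary. Words that are not in the list count as neutral (zero).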
Affective Norms for English Words (ANEW) is one affective word list among a number of others. ANEW seems to be regarded as a sort of reference in sentiment analysis research. It records valence, arousal and dominance for 1034 words on a continuous scale between 1 and 9.
A downside of ANEW is the restricted license: You are not allowed to use it in for-profit projects. Another problem is that ANEW was not developed for modern sentiment analysis. Slang words, such as ‘shit’ and ‘wow’, do not occur in ANEW. ANEW also lacks inflection variants: It has ‘annoy’, but not ‘annoyed’, ‘annoys’, ‘annoying’ and ‘annoyingly’. You have to do, e.g., word stemming to match words against ANEW.
I began to construct my own word list when I started to do sentiment analysis of COP15 Twitter messages and “temporal sentiment analysis”. It presently lists 1480 words with their associated valence between -5 and +5. It includes inflection variants and slang words.
In the past days I have looked into the discrepancies between my list and ANEW. The figure shows a scatterplot of the valences in my list and in ANEW for the words that I can match. I stemmed both ANEW and my list and listed the words that differed in positive/negative valence. These words were: affected, aggression, aggressions, aggressive, applause, alert, alienation, brave, hard, mischief, mischiefs, profiteer, silly.
‘Brave’ and ‘applause’ I record as negative: a clear sign error in my list that I need to correct. ‘Affected’ I have as negative (it is usually used in a negative sense), while ANEW has ‘affection’ as very positive. My word stemming has a problem here. There is a similar problem with my ‘alienation’ and ‘profiteer’ compared to ANEW’s ‘alien’ and ‘profit’.
‘Aggression’ and similar words I have as negative, while ANEW has ‘aggressive’ as slightly positive, which I find strange. The same pattern occurs for ‘silly’. That is negative to me. ‘Hard’ I have as slightly negative, while ANEW has it slightly positive. In some connotations the word is used positively, but otherwise it seems to be used in mostly negative contexts.
‘Alert’ I too have as negative, while ANEW sets it as positive. On Twitter it seems mostly to be used in a negative sense, e.g.: Lost Pet Lost Pet Alert, have you seen Jake (link) or Russian Armed Forces on High Alert Over North Korea (link)
‘Mischief(s)’ I have as negative, while it is slightly positive in ANEW. WordNet has two senses of the word that are clearly negative: “reckless or malicious behavior that causes discomfort or annoyance in others” and “the quality or nature of being harmful or evil”. WordNik reports a lot of definitions with one somewhat positive: “An inclination or tendency to play pranks or cause embarrassment.” On Twitter it often seems to be used ironically with a basic positive sense.
I need to fix the errors in my word list and extend it, but I think it is a good alternative to ANEW. By the way, Sune Lehmann Jørgensen has humorously called my sentiment word list AFINN, as a pun on my name and ANEW. :-)
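The comparison described above, stemming both lists and listing the words whose polarity differs, might look roughly like the sketch below. It is not the code I used: `crude_stem` is a hypothetical stand-in for a real stemmer, the example valences are invented rather than taken from either list, and treating 5 as the neutral midpoint of ANEW's 1 to 9 scale is an assumption.

```python
def crude_stem(word):
    """Very rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ingly", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def sign_disagreements(my_list, anew, anew_midpoint=5.0):
    """Return the stems whose polarity differs between the two lists.

    my_list maps words to valences on a signed scale (negative to positive).
    anew maps words to valences on a 1 to 9 scale, centered here by
    subtracting anew_midpoint (assumed neutral point).
    """
    my_stems = {}
    for word, valence in my_list.items():
        # Keep the first valence seen for each stem.
        my_stems.setdefault(crude_stem(word), valence)

    anew_stems = {}
    for word, valence in anew.items():
        anew_stems.setdefault(crude_stem(word), valence - anew_midpoint)

    # A negative product means one list is positive and the other negative.
    return [
        stem
        for stem in sorted(my_stems.keys() & anew_stems.keys())
        if my_stems[stem] * anew_stems[stem] < 0
    ]


if __name__ == "__main__":
    # Invented example valences, not values from AFINN or ANEW.
    mine = {"annoyed": -2, "silly": -1, "happy": 3}
    theirs = {"annoy": 2.7, "silly": 7.4, "happy": 8.2}
    print(sign_disagreements(mine, theirs))
```

Such crude stemming is also where mismatches like ‘affected’ versus ‘affection’ come from: two words with different meanings can collapse to the same stem.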
MongoDB by default exports to JSON, but I discovered that it can also export to a comma-separated values (CSV) file. You need to specify a field list. An example with data from the streaming API of Twitter with numerous fields:
mongoexport -d twitter -c tweets --csv -f text,created_at,user.following -o tweets.csv
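The exported file can then be read back with Python's standard csv module. The following is a minimal sketch that assumes the CSV starts with a header row holding the field names (text, created_at, user.following); the sample row is invented for illustration.

```python
import csv
import io


def read_tweets(csv_file):
    """Read exported tweets from an open CSV file into a list of dicts.

    Assumes the first row is a header with the exported field names.
    """
    return list(csv.DictReader(csv_file))


if __name__ == "__main__":
    # Invented sample standing in for the contents of tweets.csv.
    sample = io.StringIO(
        "text,created_at,user.following\n"
        '"Hello, world",Mon Jan 01 00:00:00 +0000 2012,\n'
    )
    for tweet in read_tweets(sample):
        print(tweet["created_at"], tweet["text"])
```

For the real file one would open it with `open("tweets.csv", newline="", encoding="utf-8")` and pass the file object to `read_tweets`. Quoted fields that contain commas, as tweet text often does, are handled by the csv module.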