AFINN: A new word list for sentiment analysis on Twitter

In the Responsible Business in the Blogosphere project I have, by the sweat of my own brow, created a sentiment lexicon with 2477 English words (including a few phrases), each labeled with a sentiment strength and targeted at sentiment analysis of short texts such as those found in social media. It was constructed with the help of word lists maintained by Steve DeRose (Steven J. DeRose) and Greg Siegle.

We have used my word list for sentiment analysis on Twitter in a few studies, the most notable so far being Good Friends, Bad News – Affect and Virality in Twitter. However, we have not been quite sure how well it performs compared to other sentiment lexicons such as ANEW. I have included a number of words frequently used on the Internet that I have not found in ANEW: obscene words and Internet slang acronyms such as LOL (laughing out loud). So do these extra words make my word list better? ANEW was constructed by having multiple persons rate each word and should be much better validated than my list. So maybe that list is better?

In a simple comparison between ANEW and my list I looked at the correlation between the sentiment strengths (valences) that the two lists assign to each word. I have previously written about that issue. Such an analysis doesn’t really answer how good the lists are for sentiment analysis.

A few weeks ago Sune Lehmann mentioned that in their study they had tweets labeled for sentiment strength through Amazon Mechanical Turk (AMT). Their study was the Twittermood study (or “Pulse of the Nation” study), which was much mentioned in the media, e.g., The New Scientist and Scientific American. See also their YouTube video.

Alan Mislove had obtained 1,000 AMT-labeled tweets, each labeled by 10 AMT workers and rated from 1 to 9. Through Sune I got hold of the Mislove data.

With the Mislove data I have now made a more careful study of the performance of the different word lists and this study is now written up in the position paper A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. The version on our departmental homepage has the code listing.

When I measured the performance of my application of the word lists with a correlation coefficient (between the AMT “ground truth” and my predictions for the sentiment of each tweet), I found that my list and ANEW were well ahead of the word lists in General Inquirer and OpinionFinder. To be fair to the two latter word lists, I should say that I did not use all the information they provide for each word, only the strength and polarity. My list was slightly ahead of ANEW. Whether this difference is statistically significant I don’t know, as I didn’t get around to performing a statistical test.
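
As a rough illustration of this kind of evaluation, here is a minimal Python sketch. It is not the code from the paper. It assumes that a tweet is scored by summing the valences of its words and that the correlation is a Pearson correlation; both are assumptions made for the example. The lexicon values, tweets and ratings are invented toy data, not entries from the word list or from the Mislove data.

```python
import math
import re

# Hypothetical mini-lexicon with made-up valences, for illustration only.
TOY_LEXICON = {"good": 3, "great": 3, "bad": -3, "awful": -3, "lol": 3}


def score_text(text, lexicon):
    """Sum the valences of all lexicon words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented tweets with invented mean ratings on a 1-to-9 scale.
toy_tweets = [
    "what a great day lol",
    "this is bad",
    "awful awful service",
    "good morning",
    "nothing special here",
]
toy_ratings = [8.1, 3.0, 1.5, 6.5, 5.0]

predictions = [score_text(tweet, TOY_LEXICON) for tweet in toy_tweets]
correlation = pearson(predictions, toy_ratings)
print(predictions, round(correlation, 3))
```

Swapping in another word list only means passing a different dictionary to score_text, which is what makes the lists easy to compare on the same labeled tweets.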

I also tried the SentiStrength Web service sentiment analyzer on the 1,000 Mislove tweets. This is not just a simple word list but a program that handles, e.g., emoticons, negations and spelling. This Web service turned out to be the best, slightly ahead of my list and ANEW.

I have now distributed my 2477-word list from our department server (the zip file link). During the course of the evaluation I found a few embarrassing mistakes in my previous list: I thought it had 1480 words but it turned out that only 1468 were unique! Words were also sometimes out of alphabetical order. The new list should have no such problems.
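
Mistakes of this kind are easy to catch automatically. The following Python sketch shows one way to do it; it is only an illustration, not the check I actually ran. The function name check_word_list and the toy list are hypothetical.

```python
from collections import Counter


def check_word_list(words):
    """Return duplicated words and adjacent pairs that break alphabetical order."""
    counts = Counter(words)
    duplicates = sorted(word for word, count in counts.items() if count > 1)
    out_of_order = [(a, b) for a, b in zip(words, words[1:]) if a > b]
    return duplicates, out_of_order


# Invented toy list with one duplicate and one misplaced word.
toy_words = ["abandon", "bad", "awful", "bad", "good"]
duplicates, out_of_order = check_word_list(toy_words)

print(len(toy_words), "entries,", len(set(toy_words)), "unique")
print("duplicates:", duplicates)
print("out of order:", out_of_order)
```

Comparing len(words) with len(set(words)) is enough to reveal a discrepancy like 1480 versus 1468; the list of duplicates then shows which entries to clean up.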

(Typo correction: 2011-03-17)

(Update 2015-08-25: If you are a Python programmer you might want to take a look at my afinn Python package available here: https://github.com/fnielsen/afinn)

26 thoughts on “AFINN: A new word list for sentiment analysis on Twitter”

    P Ferrell said:
    September 7, 2011 at 5:14 am

    Thank you for making your word list public. I am using it in a basic sentiment analysis tool which will be driving expressions on a robotic face in response to textual inputs. It is certainly not a robust method and I make no claims of great success, but my children have enjoyed playing with it.

    Finn Årup Nielsen said:
    September 7, 2011 at 9:21 am

    Dear P Ferrell, Your facial expression robot is quite interesting and I am glad your children enjoy it. Thanks for telling me. If you are using Python you can take a look at http://fnielsen.posterous.com/simplest-sentiment-analysis-in-python-with-af where there is an example sentiment analysis function. However, as far as I could determine, it did not make very much difference how you apply the word list, and I guess you have already made the optimal use of it. For better performance we would need another approach.

    Tom said:
    December 22, 2011 at 11:21 am

    May I ask why your list does not include words like:
    > believe (positive… strangely enough your list does contain disbelieve!)
    > succeed (positive)
    > remote (bit negative)
    > battling (negative.. like fighting)
    > concerned (negative.. it does contain unconcerned!)
    > unlikely (negative)
    > likely (positive)
    I guess it cannot include all of them, but these seem pretty straightforward? And I pulled these from just one random news article.

    Finn Årup Nielsen said:
    January 3, 2012 at 11:16 am

    I agree that ‘succeed’ and ‘battling’ should be in the list. Thanks for these tips. ‘Concerned’ and ‘unlikely’ I have also put in. ‘Remote’ is often used as in ‘remote control’ or as a geographic reference: ‘remote Patagonia’. ‘Likely’ is difficult. Examples from Twitter: "Lots of disruption likely", "most likely I’ll freak out and have a seizure." ‘Believe’ is also difficult. My development version of the word list has now grown to 2850 items. In the future I will post an update.

    Paulo O said:
    April 1, 2012 at 4:29 pm

    Hi Finn, Where can I get the most recent afinn list?

    Finn Årup Nielsen said:
    April 2, 2012 at 12:13 pm

    The most recent word list I have put on the Web is AFINN-111, available from: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010 See also: http://neuro.imm.dtu.dk/wiki/A_new_ANEW:_evaluation_of_a_word_list_for_sentiment_analysis_in_microblogs I have extended the word list to 2849 words but have not yet checked it and put it on the Web. Thanks for your interest.

      Pablo said:
      June 3, 2014 at 1:08 pm

      Hi Finn, it keeps asking for a username/password so I can’t access the files. Do I need special credentials? Thanks

    Trip Technician said:
    October 15, 2013 at 1:48 pm

    Your work here is fantastic. I have been thinking about this and need to ask where we should go when it comes to developing a sentiment model that is more about a spectrum of different emotional colours, as opposed to a +ve / -ve linear scale. For instance this guy http://www.derose.net/steve/resources/emotionwords/ewords.html has data which I used for some Twitter analytics. Human emotionality is so complex and rich that I’m sure there are many benefits of classifying words/phrases into a bucket of, say, 5-30 key primary emotions.

    I am not wishing to seem lazy but I am hoping that someone else can do some of the work in this. If you differentiate a small number of different emotional modes sentiment analysis jumps to lightspeed in terms of the richness of inference you can make from a word-analysis approach. As humans we find it perennially engaging what others feel.

    My work on the issue is here and I hope someone finds it useful.

    http://pythonism.wordpress.com/2013/06/16/elementary-sentiment-analysis-on-a-text-using-python/

    […] this list. This is the same list I used on the course previously, and I’m very grateful to Finn Arup Nielsen for making it […]

    Nick said:
    March 2, 2016 at 2:13 pm

    Finn. Thanks very much for the AFINN list. If you’re interested I’ve made use of it in a project to perform sentiment analysis on newspaper comments described here: http://www.nickstricks.net/wp/?p=204

    naveen said:
    March 8, 2016 at 10:50 am

    Hi! I’m new to Sentiment Analysis. Can you please help in classifying comments.

      Finn Årup Nielsen responded:
      March 8, 2016 at 11:03 am

      If you know Python I suggest you use my afinn toolbox, available in the Python Package Index or here: https://github.com/fnielsen/afinn The package is reasonably usable out of the box, provided you have some knowledge of Python.

    samar said:
    May 9, 2016 at 2:35 pm

    Hi! I was going through your program at http://andybromberg.com/sentiment-analysis/. Can you please explain why you are using naive bayes if you are already giving scores to your data through AFINN(which is what we do in lexical analysis)?

      Finn Årup Nielsen responded:
      May 9, 2016 at 2:52 pm

      Note I am not Andy Bromberg, the guy that did the analysis in R, but I can try to answer anyway. I think that training a classifier may often be better than relying on a wordlist, that is, if you have a sufficient amount of training data. This is particularly so in cases where the wordlist does not fit well with the corpus, as I would believe is the case with a movie review dataset. People achieving good performance in sentiment analysis nowadays often use multiple wordlists, bag-of-words/phrases and emoticons as features for a classifier.

        samar said:
        May 9, 2016 at 3:12 pm

        oh! Extremely sorry for this confusion. Thanks for AFINN and thanks for replying. Got my answer :)

        Finn Årup Nielsen responded:
        May 9, 2016 at 3:16 pm

        No problem! :-)

        samar said:
        May 16, 2016 at 8:08 am

        Hi.
        I was going through the AFINN wordlist. I don’t understand why some words like perfect, peace, smile, stunning, which have a positive meaning, are given a negative score. I hope I am understanding the wordlist correctly.

    Finn Årup Nielsen responded:
    May 16, 2016 at 9:17 am

    You must be reading the file in a wrong way. My file on Github https://github.com/fnielsen/afinn/blob/master/afinn/data/AFINN-111.txt has, e.g., “perfect 3” and “peace 2”. I wonder how you are reading the sentiment score?

    Cristian Arias said:
    May 22, 2016 at 9:38 pm

    Hi. Can you tell me what the criterion is for the word score? An algorithm or experience? Where can I read about your study for this list?
    Thank you from Colombia

    Starstruck said:
    July 5, 2016 at 11:15 am

    […] used the syuzhet package to calculate sentiment. For this example, I am using the AFINN method, which was created to handle short texts like the terse, careless streams of thought found […]

    […] game. Sigh. After the tweets were squeaky clean, I scored the sentiment of each tweet using the AFINN lexicon which gives a sentiment score between -5 to 5 for several words. Finally, I created […]

    Mahavar anjali said:
    February 21, 2017 at 11:41 am

    I want to know the method: how did you set the valence or strength of each word?
