Entertained by scandalous deceiving melancholy, hurrah!

Posted on Updated on


I my effort to beat the SentiStrength text sentiment analysis algorithm by Mike Thelwall I came up with a low-hanging fruit killer approach, — I thought. Using the standard movie review data set of Bo Pang available in NLTK (used in research papers as a benchmark data set) I would train an NTLK classifier and compare it with my valence-labeled wordlist AFINN and readjust its weights for the words a little.

What I found, however, was that for a great number of words the sentiment valence between my AFINN word list and the classifier probability trained on the movie reviews were in disagreemet. A word such as ‘distrustful’ I have as a quite negative word. However, the classifier reports the probability for ‘positive’ to be 0.87, i.e., quite positive. I examined where the word ‘distrustful’ occured in the movie review data set:

$ egrep -ir "\bdistrustful\b" ~/nltk_data/corpora/movie_reviews/

The word ‘distrustful’ appears 3 times and in all cases associated with a ‘positive’ movie review. The word is used to describe elements of the narrative or an outside reference rather than the quality of the movie itself. Another word that I have as negative is ‘criticized’. Used 10 times in the positive moview reviews (and none in the negative) I find one negation (‘the casting cannot be criticized’) but mostly the word in a contexts with the reviewer criticizing the critique of others, e.g., ‘many people have criticized fincher’s filming […], but i enjoy and relish in the portrayal’.

The top 15 ‘misaligned’ words using my ad hoc metric are listed here:


Diff. Word AFINN Classifier
0.75 hurrah 5 0.25
0.75 motherfucker -5 0.75
0.75 cock -5 0.75
0.68 lol 3 0.12
0.67 distrustful -3 0.87
0.67 anger -3 0.87
0.66 melancholy -2 0.96
0.65 criticized -2 0.95
0.65 bastard -5 0.65
0.65 downside -2 0.95
0.65 frauds -4 0.75
0.65 catastrophic -4 0.75
0.64 biased -2 0.94
0.63 amusements 3 0.17
0.63 worsened -3 0.83


It seems that reviewers are interested in movies that have a certain amount of ‘melancholy’, ‘anger’, distrustfulness and (further down the list) scandal, apathy, hoax, struggle, hopelessness and hindrance. Whereas smile, amusement, peacefulness and gratefulness are associated with negative reviews. So are movie reviewers unempathetic schadefreudians entertained by the characters’ misfortune? Hmmm…? It reminds me of journalism where they say “a good story is a bad story”.

So much for philosophy, back to reality:

The words (such as ‘hurrah’) that have a classifier probability on 0.25 and 0.75 typically occure each only once in the corpus. In this application of the classifier I should perhaps have used a stronger prior probability so ‘hurrah’ with 0.25 would end up on around the middle of the scale with 0.5 as the probability. I haven’t checked whether it is possible to readjust the prior in the NLTK naïve Bayes classifier.

The conclusion on my Thelwallizer is not good. A straightforward application of the classifier on the movie reviews gets you features that look on the summary of the narrative rather than movie per se, so this simple approach is not particular helpful in readjustment of the weights.

However, there is another way the trained classifier can be used. Examining the most informative features I can ask if they exist in my AFINN list. The first few missing words are: slip, ludicrous, fascination, 3000, hudson, thematic, seamless, hatred, accessible, conveys, addresses, annual, incoherent, stupidity, … I cannot use ‘hudson’ in my word list, but words such as ludicrous, seamless and incoherent are surely missing.

(28 January 2012: Lookout in the code below! The way the features are constructed for the classifier is troublesome. In NLTK you should not only specify the words that appear in the text with ‘True’ you should also normally specify explicitely the words that do not appear in the text with ‘False’. Not mentioning words in the feature dictionary might be bad depending on the application)



2 thoughts on “Entertained by scandalous deceiving melancholy, hurrah!

    Harold Baize said:
    June 4, 2014 at 4:29 pm

    Thank you for the word list for sentiment analysis. I am applying it to progress notes for mental health patients, based on an example using R. I’m finding that many words have context specific meaning in mental health. As example, the word “alert” is given a -1 rating in your list. Apparently the meaning attached is that of “alarm” whereas in mental health the meaning is positive (mentally focused and aware).

    Rather than make multiple changes to your list of words, it occurs to me that we might follow your methods to develop a list specific to mental health contexts. Can you direct me to a concise description of your methodology?

      Finn Årup Nielsen responded:
      June 4, 2014 at 4:57 pm

      When I developed the word list I used my own judgement as well as searched Twitter (and sometimes read Wikipedia articles) to get examples of its use in context. I think you should get hold on an appropriate corpus for your domain and then search for the sentences where the words occurs. Then you will get a rough estimate of how often it appears and negative and positive contexts.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s