Hot or not or what: Data mining attractiveness


Hotornot2009-02-25t0130

From the media we hear that women are most attractive at 31. That claim is based on a “poll of 2,000 men and women, commissioned by the shopping channel QVC to celebrate its Beauty Month.” So this is a kind of science that is part of a company's media effort. We also see such use of science in neuromarketing research. However, in this case the results are likely to be reasonably OK.

According to Wikipedia, the web site Hot or Not has been an inspiration for both YouTube and Facebook. The site allows you to rate men and women based on their uploaded photos.

Back in 2009 I became aware of Hot or Not in a nerdish way: The computer programming book Programming Collective Intelligence uses the site as a real-life example for prediction based on annotation in the social web. Hot or Not has an API, so you can get some data from the site. You need an API key, and last time I checked you couldn’t obtain new keys, but I could use the one given in the book.

So I started to download data. You don’t get the individual ratings, only the average rating for each person as well as a bit of demographics, e.g., the age. So there is really not much you can do. The programming book tries to predict the rating based on gender, age and location (US state).

I tried to see how the rating varied with age. I managed to make a plot of a sample of men and women from Hot or Not, and the result somewhat surprised me. I was expecting a decay in rating for women and men as a function of age, with around 31 years as a good candidate for maximum rating. However, when I look at the ratings for women there is very little decay. In fact, if you fit a second-order polynomial you actually see a slight rise for older women. With unscrupulous extrapolation you would say that 100-year-old women are maximally attractive. Men have the ‘correct’ decay with a highest rating somewhere around 30 or before. But the variance within each age year is considerable compared to the differences between the yearly averages.
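The kind of second-order fit mentioned above can be sketched as follows. This is a minimal, dependency-free illustration and not the code used for the plot: the function name `fit_quadratic` is hypothetical, and the ages and ratings are made-up numbers, not Hot or Not data. With NumPy installed, `numpy.polyfit(ages, ratings, 2)` would be the usual shortcut.

```python
def fit_quadratic(ages, ratings):
    """Least-squares fit of rating = a*age**2 + b*age + c (normal equations)."""
    # Power sums of the ages and mixed sums with the ratings
    s = [sum(x ** k for x in ages) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(ages, ratings)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (c, b, a)
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


# Made-up data for illustration only (NOT downloaded from Hot or Not)
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60]
ratings = [0.001 * x * x - 0.05 * x + 8.0 for x in ages]

a, b, c = fit_quadratic(ages, ratings)
turning_age = -b / (2 * a)  # age where the fitted parabola turns

# a > 0 means the parabola opens upward, i.e. the fitted rating rises
# again for ages beyond the turning point.
print("a = %.5f, b = %.5f, c = %.3f" % (a, b, c))
print("curve turns at age %.1f, opens %s" % (turning_age, "upward" if a > 0 else "downward"))
```

A positive leading coefficient is what produces the "slight rise" at higher ages, and it is also why extrapolating such a fit far beyond the observed ages gives absurd answers.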

One explanation for the effect seen among women is that only beautiful older ladies would “dare” to upload their image, while ugly young women are not afraid. There is also the possibility that we really cannot trust the average ratings reported to us by Hot or Not. I have an account myself and have uploaded an image. At present I have a rating of 7.7 based on 206 people (the scale goes from 1 to 10). Hot or Not reports that I am “hotter than 74% of men on this site!”. When I compare 7.7 with the data I can download, the percentage does not fit: Around 90% of males score higher than my 7.7. Yet another possibility is that the way I call the Hot or Not API does not give a fair sample of the people actually in the Hot or Not database.
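The comparison between the reported percentage and the downloaded sample amounts to a simple count. Here is a small sketch, assuming the downloaded average ratings sit in a plain list; the helper name `percent_scoring_lower` and the ten sample values are invented for illustration and are not the real data.

```python
def percent_scoring_lower(sample_ratings, my_rating):
    """Percentage of people in the sample with a rating below my_rating."""
    if not sample_ratings:
        raise ValueError("sample_ratings must not be empty")
    lower = sum(1 for r in sample_ratings if r < my_rating)
    return 100.0 * lower / len(sample_ratings)


# Hypothetical sample of average ratings for men (made-up numbers)
sample = [5.0, 6.0, 7.0, 8.0, 9.0, 9.5, 9.8, 9.9, 9.1, 8.5]

share_lower = percent_scoring_lower(sample, 7.7)
print("hotter than %.0f%% of the sample" % share_lower)
```

If the site's claim and the sample agreed, this number would come out near 74 for a rating of 7.7. A value far below that points to either a biased sample or ratings that are computed differently from what the API returns.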

Hot or Not data has been used in a few scientific reports, see, e.g., “Economic principles motivating social attention in humans”, whose authors made their own ratings, and “If I’m Not Hot, Are You Hot or Not?”, which has HotorNot.com employees on the author list and thereby gained access to the site's unique data.
