So tonight we have the Eurovision Song Contest. If the inexperienced Lena Meyer-Landrut manages to avoid slipping in her dress and stays reasonably in tune without dropping the microphone, she should win by a fair margin, comparable to Nicole’s old Ein bißchen Frieden victory. Lena does not have an exceptional voice, but her approach to the song is quite special, with a voice somewhere in the area of Lisa Ekdahl, Björk and je ne sais quoi (Who is Kate Nash?). This approach makes the song stand out among the rest of the Eurovision entries, though the problem may be that the song sounds better on the radio than when performed live.
From a machine learning point of view the competition is interesting enough that it has reached all the way to statistician Andrew Gelman. Gelman and Co. have previously blogged about the bloc voting study on the song contest, but now he writes about the new machine learning competition platform Kaggle, whose first competition is Forecast Eurovision Voting, with a cash prize of USD 1000. For this forecasting competition machine learning researchers could use previous voting patterns, and I imagined using data from Twitter (but dang, yet another one of those deadlines I missed :-( ).
I wonder if Kaggle competitors can do anything against Google. Google has put up a web page with a forecast of the winner. They can use all the search data on the names of the Eurovision participants to make the prediction. It looks like quite a strong strategy, but it has failed in some cases for the semifinal: Popnation Sweden’s Anna Bergendahl missed the final even though Google had her in the top ten, while Denmark is in the final with the duo Chanée og N’evergreen even though Google has them in the bottom fourth. So what is wrong with Google? Perhaps Google does not handle the names correctly. One would think that there is room for a lot of variation in the Danish entry: Chane och N’evergreen, Channe et Nevergreen, etc., which might not be captured by Google’s filter. They also might not handle the correlation between the votes: Let’s say that Alice and Bob like Germany and Sweden with a slight preference for Germany, while Andrew likes Denmark. Andrew votes Danish, while Alice and Bob vote German, and so there are no votes for the Swedes even though two of the three like them. Furthermore, it is possible that Google searching is biased compared to Eurovision voting: For example, are people more likely to search for blondes but not as likely to vote for them?
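The Alice, Bob and Andrew scenario can be played out in a few lines of Python. This is only a toy sketch: the preference scores are invented to mirror the example, the function names are hypothetical, and the "one vote per person" rule is a simplification of how Eurovision televoting actually works. It says nothing about how Google really builds its forecast; it just shows how counting interest and counting votes can disagree.

```python
from collections import Counter

# Made-up preference scores (higher = liked more), chosen only to
# mirror the Alice/Bob/Andrew example in the text.
preferences = {
    "Alice": {"Germany": 0.9, "Sweden": 0.8},
    "Bob": {"Germany": 0.9, "Sweden": 0.8},
    "Andrew": {"Denmark": 1.0},
}


def interest_counts(prefs):
    """Count every country a person likes (a crude stand-in for search interest)."""
    counts = Counter()
    for liked in prefs.values():
        counts.update(liked.keys())
    return counts


def vote_counts(prefs):
    """Assume each person casts a single vote for their top-rated country."""
    counts = Counter()
    for liked in prefs.values():
        favourite = max(liked, key=liked.get)
        counts[favourite] += 1
    return counts


interest = interest_counts(preferences)
votes = vote_counts(preferences)

for country in ("Germany", "Sweden", "Denmark"):
    print(country, "interest:", interest[country], "votes:", votes[country])
```

In this toy setup Sweden ties with Germany on interest but ends up behind Denmark in votes, which is the kind of mismatch that could explain a search-based forecast overrating one entry and underrating another.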
By the way, should Denmark win, the Swedes actually win, since the “Danish” song is written by Swedes, and should the “German” song win, a Dane wins, since Dane John Gordon and American Julie Frost are behind that song.