wikipedia

Election of the affiliate-selected members of the Wikimedia Foundation board

Posted on April 16, 2019 Updated on April 16, 2019

The so-called affiliates, that is, Wikimedia chapters, user groups and thematic groups, have the opportunity to select two seats on the Wikimedia Foundation (WMF) Board of Trustees. Previously only chapters could select members, but from January 2019 the considerable number of user groups also gets a say. As I understand it, the aim is a broader base, perhaps especially among what are termed “emerging communities”.

The two current affiliate-selected members are former chair Christophe Henner from France and Ukrainian Nataliia Tymkiv. The communities elect three board members. These are James Heilman, Canada, Dariusz Jemielniak, Poland, and Spanish María Sefidari, who is currently chair. Compared with the affiliate-selected members, there seems to be a tendency for the community-elected members to come from large communities: English Wikipedia, Spanish Wikipedia. That does not quite hold for the Polish-elected Jemielniak, who has however made his mark with an English-language book.

The affiliate election will take place quickly during spring 2019, first with a nomination period and then the actual vote. A handful of Wikimedians act as facilitators for the election. These facilitators cannot be nominated at the same time, but they may stand if they step down from the facilitator role. My impression is that the two current members are standing again.

Wikimedia Danmark is to take part in the vote, and the question is then whom we should vote for and which criteria we should use. Henner and Tymkiv seem fine and do have experience. To what degree they are able to pound the table and come up with original, viable visions is less clear to me. Others who may be nominated include Shani Evenstein. She also seems fine.

Beyond the formal requirement of being eligible for board service, a candidate should have substantial board experience, an understanding of the Wikimedia movement, and be a reasonably accessible face in the international Wikimedia community. In addition, the candidate should be prepared to put in a good deal of unpaid work at odd hours of the day, and be aware that one works for the WMF, not for affiliates, the community or Wikipedia. Looking at the composition of the WMF board, Europe and North America are well represented, though with no one from Northern Europe. There is a physician (James Heilman), academics, the founder Jimmy Wales, one with financial experience (Tanya Capuano) and various other backgrounds. Henner seems to be the only one with technical experience (an element I would value), and one can also say that representation is missing from Latin America (although Sefidari does speak Spanish), Africa and East Asia (Esra’a Al Shafei has roots in Bahrain).

The vote is coordinated on Meta at Affiliates-selected Board seats. Guidance for voters is available at Primer for user groups. The Dutch chair Frans Grijzenhout has uploaded a handy score matrix for the candidates. The nominations also have their own page. Nominations are open until 30 April 2019. Once the nominations are in, there is a short time in April and a bit of May to question the nominees.

This entry was posted in society and tagged wikimedia, Wikimedia Danmark, wikipedia.

Airy questions for Wikimedia Strategy 2030

Posted on March 28, 2019

Wikimedia is trying to think long-term and lay out a strategy aimed at the year 2030. A draft is available from https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction

Here are some airy questions that might get people thinking:

Why should we have a strategy? Shouldn't Wikimedia just develop organically? Can we even predict much as far ahead as 2030? If we do not already know our strategy, are we not already stuck?
Are we stuck in the wiki interface?
Should we continue with the PHP MediaWiki interface as the primary software?
Why has Wikiversity not grown larger, and why does it not exist at all in Danish? Is it because people cannot be bothered to make Wikiversity? Is it because we do not know what Wikiversity is or should be? Is it because the wiki technology does not work in a teaching context? What should we change to make it work?
Why do people not make more videos? Is it too cumbersome technically? Is it too cumbersome in terms of production? How could Wikimedia help?
Why is Stack Overflow the primary place for subject-specific questions and answers? Shouldn't that have been Wikimedia?
Should the Wikimedia Foundation accept money from companies such as Google? Could that create a relationship of dependence? In Peter Gøtzsche's opinion, patient associations are influenced in an unfortunate direction because of their dependence on pharmaceutical companies. Can the Wikimedia movement run into the same problem? Do monetary donations create problems, for example in connection with lobbying on the EU copyright directive?
Why can OpenStreetMap run on a smaller budget? Is it due to a far smaller server load? Should Wikimedia scale down and choose a kind of OpenStreetMap model where the server work is better distributed to others?
“Knowledge equity” is one of two central concepts in the Wikimedia Foundation's strategy and rather hard to translate. Financial equity is what in Danish is called “egenkapital”. A related Latin word can be found in Den Store Danske; otherwise my closest thought is the obsolete Danish expression “billighed”, as in “ret og billighed” from a Danish song. We can hardly use such a word. What can we understand by “knowledge equity” in Danish?
Could Wikimedia end up in a situation like the one seen at the Cochrane Collaboration, where the professionalized part of the organization comes to outmaneuver the grassroots? What do we do to keep that from happening?
Should we be proud that the Danish Wikipedia has largely been built for free? That was mentioned the last time I asked about Wikimedia Strategy at the Danish Wikipedia's Landsbybrønd (village pump).
Knowledge as a service follows an as-a-service pattern seen in computer science, where one speaks of platform-as-a-service or software-as-a-service. What should we actually read into it? I myself have created Scholia, a website that shows scientific data from Wikidata via SPARQL queries to the Wikidata Query Service, and Ordia, which does the same for lexicographic data. As such, ideas about knowledge as a service fit in nicely, and I have also tried in vain to recall whether I was among those who proposed the concept at an international Wikimedia meeting in 2017.
Should Wikimedia engage in activism, as was seen around the vote on the EU's new copyright directive? Do we have any success stories showing that it helps?
Wikimedia Danmark has received money from the Wikimedia Foundation for, among other things, a roll-up banner. It has been used on a few occasions and has apparently been on TV. Is this how the Wikimedia Foundation should spend its money?
The visual editor seems to help many new users, but is editing Wikipedia on a smartphone not very cumbersome? Can anything at all be done about it?
Should the Wikimedia Foundation support researchers who build tools or study phenomena on Wikimedia's wikis?
Normally Wikipedia is fast, but on a slow network one finds that some things, Wikidata for example, can be frustrating to work with. Is it not frustrating to work with the wikis from countries that lack fast Internet? What can be done about it?
Linux is developed with a distributed model, as are many other software systems. Why are Wikipedia and the other Wikimedia wikis not distributed, so that forks and pull requests are easy?
How much of the Wikimedia Foundation's collected funds should be spent on events such as Wikimania?
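
As a hedged illustration of the knowledge-as-a-service question above: a Scholia-style page boils down to sending a SPARQL query to the Wikidata Query Service and rendering the result. The sketch below only builds such a request URL. The function name, the query and the example item Q42 are assumptions made for illustration and are not taken from Scholia's code.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def works_by_author_url(author_qid, limit=10):
    """Build a request URL listing works whose author (P50) is the given item."""
    query = f"""
SELECT ?work ?workLabel WHERE {{
  ?work wdt:P50 wd:{author_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Q42 is used purely as a placeholder item
print(works_by_author_url("Q42"))
```

Fetching that URL with any HTTP client returns JSON that a web page can render, which is roughly what "knowledge as a service" means in practice here.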

This entry was posted in society, technical and tagged wikimedia, wikipedia.

Coming Scholia, WikiCite, Wikidata and Wikipedia sessions

Posted on September 12, 2018

In the coming months I will give three different talks on Scholia, WikiCite, Wikidata, Wikipedia et al.:

3 October 2018 in DGI-byen, Copenhagen, Denmark, as part of the Visuals and Analytics that Matter conference, the concluding conference for the DEFF-sponsored project Research Output & Impact Analyzed and Visualized (ROIAV).
7 November 2018 in Mannheim as part of the Linked Open Citation Database (LOC-DB) 2018 workshop.
13 December 2018 at the library of the Technical University of Denmark as part of Wikipedia – a media for sharing knowledge and research, an event for researchers and students (and still in the planning phase).

In September I presented Scholia as part of the Workshop on Open Citations. The slides, titled Scholia as of September 2018, are available here.

This entry was posted in science and tagged Scholia, Wikicite, Wikidata, wikipedia.

A viewpoint on a viewpoint on Wikipedia’s neutral point of view

Posted on August 20, 2018 Updated on August 20, 2018

I recently looked into what we have of Wikipedia research from Denmark and discovered several papers that I did not know about. I have now added some to Wikidata, so that Scholia can show a list of them.

Among the papers was one from Jens-Erik Mai titled Wikipedians’ knowledge and moral duties. Starting from the English Wikipedia’s Neutral Point of View (NPOV) policy, he stresses a dichotomy between the subjective and the objective and argues for a rewrite of the policy. Mai claims the policy has an absolutistic center and a relativistic edge, corresponding to an absolutistic majority view and relativistic minority views.

As a long-time Wikipedia editor, I find Mai’s exposition too theoretical. I miss good examples, cases where the NPOV fails, and I cannot see in what concrete way the NPOV policy should be changed to accommodate Mai’s critique. I am not sure that Wikipedians distinguish so much between the objective and the subjective; the key dichotomy is verifiable vs. not verifiable, that is, whether the statements in Wikipedia are supported by reliable sources. In terms of center and edge, I came to think of events associated with conspiracy theories. Here the “center” view could be the conventional view, with the conspiracy views at the edge. It is difficult for me to accept a standpoint where conspiracy theories are treated as equal to the conventional view. Nor is it clear to me that the center is uncontested and uncontroversial. Wikipedia, like a newspaper, has the ability to represent opposing viewpoints. This is done by attributing each viewpoint to the reliable sources that express it. For instance, central to the description of how films were received are quotations from reviews in major newspapers and by notable reviewers.

I don’t see the support for the claim that the NPOV policy assumes a “politically dangerous ethical position”. On the contrary, after the rise of fake news, Wikipedia has now been called the “last bastion”. The example given in The Atlantic post is the recent social media fuss over Sarah Jeong, where Wikipedians arrived at a text with “shared facts about reality.”

This entry was posted in science and tagged denmark, wikipedia.

A question to Wikimedia Foundation and Wikimania 2014

Posted on August 13, 2014 Updated on October 1, 2014

An open question to Wikimedia Foundation and Wikimania 2014 and its organizing committee with Ed:

Almost everything with the Wikimania meeting in London in August 2014 went very well: people, talks, entertainment, organization, monkey, squirrel, etc. What I am confused about is what happened during my last hour at the meeting Sunday evening: After the buffet I had the experience of meeting two females, one of whom gave me a business card with a link to wikitranslate.org and claimed to be behind the wiki web site. The females seemed not very old; in fact, when queried, one of them claimed to be 10 years old, and when queried further, she responded that she had made the web site when 6 years old with a little help from a family member…

During the Wikimania meeting the documentary about Aaron Swartz, The Internet’s Own Boy, was shown. In that documentary we learned that Aaron Swartz was 12 years old when he created the Wikipedia-like site InfoBase. Thus prodigies can create wiki web sites when they are 12 years old. From that we can deduce that it is unlikely that a six-year-old female can produce a wiki web site. The closest explanation I can come up with for my extraordinarily vivid experience at Wikimania is then that it was a hallucination.

My question is then: How do I get rid of the hallucination? Have other Wikimania participants had similar hallucinations of meeting preteens claiming to make web sites? Or am I just getting old?

The hallucination has persisted for many days now because I still both see and feel the business card I got.

This entry was posted in humor, technical and tagged wikimania, wikimedia, wikipedia.

Sentiment colored sequential collaboration network

Posted on February 5, 2013 Updated on October 24, 2013

Sentiment colored sequential collaboration network of some of the Wikipedians editing the Wikipedia articles associated with the Lundbeck company. Red is negative sentiment, green is positive.

The “sequential collaboration network” is inspired by Analyzing the creative editing behavior of Wikipedia editors: through dynamic social network analysis. Brian Keegan has also done a similar kind of network visualization.

Sentiment analysis is based on the AFINN word list.
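
A minimal sketch of how such a network could be assembled from a revision history. It assumes that an edge runs from one editor to the editor who makes the next revision, and that an edge is colored by the summed sentiment change along it; both are assumptions for illustration, as are the function names and the toy data, and this is not the code behind the figure.

```python
from collections import Counter, defaultdict


def sequential_edges(revisions):
    """Count edges (previous editor -> next editor) between consecutive
    revisions and sum the sentiment change along each edge."""
    counts = Counter()
    change = defaultdict(float)
    for previous, current in zip(revisions, revisions[1:]):
        if previous["user"] == current["user"]:
            continue  # consecutive edits by the same editor form no edge
        edge = (previous["user"], current["user"])
        counts[edge] += 1
        change[edge] += current["sentiment"] - previous["sentiment"]
    return counts, dict(change)


def edge_color(change):
    """Green for a positive sentiment change, red for a negative one."""
    return "green" if change > 0 else "red" if change < 0 else "gray"


# Hypothetical revision history, oldest revision first
history = [
    {"user": "A", "sentiment": 0.0},
    {"user": "B", "sentiment": -1.5},
    {"user": "A", "sentiment": 0.5},
    {"user": "A", "sentiment": 0.5},
    {"user": "B", "sentiment": -1.0},
]
counts, change = sequential_edges(history)
for edge, n in counts.items():
    print(edge, n, edge_color(change[edge]))
```

The edge counts and colors can then be handed to any graph drawing tool.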

This entry was posted in programming and tagged afinn, network analysis, sentiment analysis, wikipedia.

Jean-Pierre Hombach and Amazon.com: Large-scale Wikipedia copyright infringers?

Posted on January 24, 2013 Updated on November 12, 2013

An entity calling itself “Jean-Pierre Hombach” presents itself with “I’m a German writer Comedian and short filmmaker. I’m studying media at the University of Vic.”

The profile on Twitter states “Jean-Pierre Hombach: I’m a German Hobbie writer Comedian and short filmmaker. I’m studying media at the University of Vic. Jean-Pierre speaks fluent English. Rio de Janeiro · http://goo.gl/bFdsV”. There is also a Google Plus account and a Facebook account linked.

The shortlink leads to Amazon.com that lists 17 works. All these have been published in the first part of 2012. What a prolific writer!

If you go to the Justin Bieber book on Google Books you will find “Copyright (C) Jean-Pierre Hombach. All rights reserved. ISBN 978-1-4710-8069-2”. So apparently this Hombach claims the copyright for the work.

If you go a bit further in the book you will read as the first line “Justin Drew Bieber is a Canadian pop/R&B singer, songwriter and actor.” That sounds awfully wikipedish and an examination of the book quickly reveals that this is Wikipedia! “Hombach” has simply aggregated a lot of Wikipedia articles together. If you go all the way to page 505 you will even see that the Jakarta Wikipedia article has been included in the Justin Bieber book… ehhh…?

I leave it as an exercise to the reader to examine the rest of the books of Mr. “Hombach”. You may, e.g., begin with the Bob Marley book.

Obviously the copyright does not belong to “Hombach”, but to the Wikipedia contributors. The text is licensed under CC BY-SA, and the book should state so according to the license (and be re-licensed under the same license). Otherwise it is not even copyfraud; it is simply copyright infringement.

Amazingly, a book by “Jean-Pierre” reached number 16 on the music biographies bestseller list according to the Los Angeles Times. In that book the contributors are listed in the back, and the CC license might also be there, although the page is not available to me on Google Books. Maybe he has read a bit about the CC license.

Amazon.com will gladly sell that to you for $23.90 without telling you that the author is not Jean-Pierre.

One interesting issue to note is that “Hombach” copied material on Lady Gaga by the Wikipedia hoaxster Legolas2186. Initially it confused me, as the “Hombach” book was stated to be copyrighted in 2010 while Legolas2186 first added the segment to Wikipedia in the summer of 2011.

To me the wrongful attribution, lack of proper attribution and obfuscation (wrt. copyright year) seem illegal. Wikipedia contributors to the respective works should be able to sue Hombach and Amazon.com for selling their copyrighted works that are not appropriately licensed.

Update 2013-01-25: A Google search on Jean-Pierre Hombach reveals that the works of Jean-Pierre have been used at least five times as a source in Wikipedia, i.e., we have a citation circle! One time in Belieber and one time in Decca Records.

Update 2013-01-25: Apparently Wikipedia has a page for everything http://en.wikipedia.org/wiki/Wikipedia:Republishers Thanks to Gadget850 (Ed)

This entry was posted in society and tagged copyright, wikipedia.

More on automated sentiment analysis of Danish politicians on Wikipedia

Posted on January 9, 2013 Updated on October 2, 2013

Earlier today I put up a sentiment analysis of the Danish Wikipedia text on Danish politician Ellen Trane Nørby.

Unfortunately, I could not resist the temptation to spend a bit of time on also running the analysis for some other Danish politicians. I did it for Prime Minister Helle Thorning-Schmidt, former Prime Minister Lars Løkke Rasmussen, former Foreign Minister Lene Espersen and former Minister Ole Sohn.

For Rasmussen’s article we see a neutral, factual biographical article until 2008, though with a slight increase at the end of 2007 when he became Minister of Finance. Then in May 2008 we see a drop in sentiment with the introduction of a paragraph mentioning an “issue” related to his use of county funds for private purposes. Since then the article has been extended and is now generally positive. There are some spikes in the plots. These spikes are typically vandalism that persists for a few minutes until reverted.

For Helle Thorning-Schmidt we see a gradual drop up to the election she won, and after that her article gains considerable positivity. I haven’t checked up much on this in the history, but I believe it is related to the tax issue she and her husband, movie star Stephen Kinnock, had and a number of other issues. As I remember, there was concern or discussion on the Danish Wikipedia about whether these “issues” should fill so large a portion of the article, and on 3 December 2011 a user moved the content to another page.

I believe I am one of the major perpetrators behind both the Lene Espersen and Ole Sohn articles. Both of the articles have large sections which describe negative issues (I really must work on my positivity, these politicians are not that bad). However, the sentiment analysis shows the Ole Sohn article as more positive. Maybe this is due to the “controversy” section describing that he paid “tribute” to East Germany and that his party received “support” from Moscow, i.e., my simple sentiment analysis does not understand the controversial aspect of support from communist Moscow and just thinks that “support” is positive.

When writing politicians’ articles on Wikipedia I find it somewhat difficult to identify good positive articles that can be used as sources. The sources used for the encyclopedic articles usually come from news articles, and these often have a negative bias with a focus on “issues” (political problems). Writing the Lene Espersen article I found that even the book “Bare kald mig Lene”, which I have used as a source, has a negativity bias. If I remember correctly, Espersen did not want to participate in the development of the book, presumably because she already had the notion that the writers would focus on the problematic “issues” in her career.

(2013-01-10: spell correction)

This entry was posted in programming and tagged denmark, politics, sentiment analysis, wikipedia.

Sentiment analysis of Wikipedia pages on Danish politicians

Posted on January 9, 2013 Updated on April 7, 2015

We are presently analyzing company articles on Wikipedia with simple sentiment analysis to determine whether we see any interesting patterns, e.g., whether the Wikipedia sentiment correlates with real-world attitudes and events in relation to the company. Such analyses might uncover that there was a small edit war in relation to Lundbeck articles in the beginning of December 2012. We are also able to see that the Arla Foods article was affected by the Muhammed Cartoon Crisis and the 2008 Chinese milk scandal.

In Denmark in the beginning of January 2013 there has been media buzz about Danish politicians and their staff making biased edits in the Danish Wikipedia. The story, carried forward by journalist Lars Fogt, focused initially on Ellen Trane Nørby.

It is relatively easy to turn the methods we employ for companies to Danish politicians. The sentiment analysis works by matching words to a word list labeled with “valence”. The initial word list worked only for English, but I have translated it to Danish and continuously extend it. So now one needs only to download the relevant Wikipedia history for a page and run the text through the sentiment analysis using the computer code I have already developed.
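
As a minimal sketch of the word-list matching described above: every word of a text is looked up in a valence list and the values are summed. The tiny word list and its valences are made up for illustration (the real AFINN list is far longer), and the function name is an assumption.

```python
import re

# Hypothetical miniature word list with illustrative valences
WORD_VALENCE = {"great": 3, "impressive": 3, "weak": -2, "scandal": -3}

# Split on anything that is not a word character or a hyphen
TOKEN_SPLIT = re.compile(r"[^\w-]+")


def score(text, valence=WORD_VALENCE):
    """Sum the valence of every word in the text; unlisted words count 0."""
    words = TOKEN_SPLIT.split(text.lower())
    return sum(valence.get(word, 0) for word in words)


print(score("A great commitment and an impressive election"))  # 6
print(score("A weak performance in the scandal"))  # -5
```

Running this over each revision of a page gives one score per revision, which is what gets plotted against time.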

The figure shows the sentiment for Ellen Trane Nørby’s Danish Wikipedia article through time. The largest positive jump in sentiment (the way that I measure it) comes from a user inserting content on 2 March 2011. This revision inserts, e.g., “great international commitment” and “impressive election”. Journalist Lars Fogt identified the user as a member of Ellen Trane Nørby’s staff.

Surely the simple word list approach does not work well all the time. The second largest positive jump in sentiment arises when a user deletes a part of the article for POV reasons. That part contained negative words such as svag (weak), trafficking and udsatte (exposed). The simple word list detects the deletion of the words as a positive event. However, the context in which they appeared was actually positive, e.g., “… Ellen Trane Nørby is a socially committed politician, who also fights for the weak and exposed in society, …”.
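
The failure mode can be reproduced with a toy example. The sketch below assumes a made-up three-word valence list and hypothetical revision texts; it scores two consecutive revisions and reports the jump between them.

```python
import re

# Hypothetical miniature word list with illustrative valences
WORD_VALENCE = {"weak": -2, "exposed": -1, "committed": 1}


def score(text):
    """Sum of word valences; words outside the list count as 0."""
    words = re.split(r"[^\w-]+", text.lower())
    return sum(WORD_VALENCE.get(word, 0) for word in words)


def jumps(revisions):
    """Score difference between each revision and its predecessor."""
    scores = [score(text) for text in revisions]
    return [after - before for before, after in zip(scores, scores[1:])]


revisions = [
    "She is a committed politician who fights for the weak and exposed.",
    "She is a politician.",  # the supportive clause has been deleted
]
print(jumps(revisions))  # [2]
```

Deleting a clause that praised the politician raises the score, because the word list sees only the words “weak” and “exposed” and not the context they appeared in.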

As far as I understand, journalist Lars Fogt used the Danish version of the Wikipedia Scanner provided by Peter Brodersen; see the list generated for Ellen Trane Nørby. Brodersen’s tool does not (yet?) provide an automated sentiment score, but does a good job of providing an overview of the edit history.

(2013-01-16: typo correction)

This entry was posted in programming, society and tagged afinn, denmark, politics, sentiment analysis, wikipedia.

NumPy beginner’s guide: Date formatting, stock quotes and Wikipedia sentiment analysis

Posted on June 12, 2012 Updated on October 23, 2013

Last year I acted as one of the reviewers on a book from Packt Publishing: The NumPy 1.5 Beginner’s Guide (ISBN 13 : 978-1-84951-530-6) about the numerical programming library in the Python programming language. I was “blinded” by the publisher, so I did not know that the author was Ivan Idris before the book came out. For my reviewing effort I got a physical copy of the book, an electronic copy of another book and some new knowledge of certain aspects of NumPy.

Two things that I did not know before I came across them while reviewing the book were the date formatter in the plotting library (matplotlib) and the ability to download stock quotes via a single function in matplotlib’s finance module (there is an example starting on page 171 in the book). There is a ‘candlestick’ plot function that goes well with the return value of the quotes download function.

The plot shows an example of the use of date formatting with stock quotes downloaded from Yahoo! via matplotlib together with sentiment analysis of Wikipedia revisions of the article on the Pfizer company.

	# Python 2 script as posted in 2012. The matplotlib.finance module used
	# here has since been removed from matplotlib.
	import urllib, urllib2
	import simplejson as json
	import dateutil.parser
	import datetime
	import matplotlib.dates
	import matplotlib.finance
	from matplotlib import pyplot as plt
	import nltk.corpus
	import numpy as np
	import re
	import copy


	companies = {
	    'Novo Nordisk': {'stock': 'NVO', 'wikipedia': 'Novo_Nordisk'},
	    'Pfizer': {'stock': 'PFE', 'wikipedia': 'Pfizer'}
	}


	filebase = '/home/fn/'

	# Sentiment word list
	# AFINN-111 is as of June 2011 the most recent version of AFINN
	filename_afinn = filebase + '/data/AFINN/AFINN-111.txt'
	afinn = dict(map(lambda (w, s): (unicode(w, 'utf-8'), int(s)), [
	    ws.strip().split('\t') for ws in open(filename_afinn)]))

	stopwords = nltk.corpus.stopwords.words('english')
	stopwords = dict(zip(stopwords, stopwords))


	# Word splitter pattern
	pattern_split = re.compile(r"[^\w-]+", re.UNICODE)

	def sentiment(text, norm='sqrt'):
	    """
	    Sentiment analysis.

	    Returns a dict with the keys sentiment, arousal, ambivalence,
	    positive and negative.
	    """
	    words_with_stopwords = pattern_split.split(text.lower())
	    # Exclude stopwords:
	    words = filter(lambda w: not stopwords.has_key(w), words_with_stopwords)
	    # Words that are not in the word list get the valence 0
	    sentiments = map(lambda word: afinn.get(word, 0), words)
	    keys = ['sentiment', 'arousal', 'ambivalence', 'positive', 'negative']
	    if sentiments:
	        sentiments = np.asarray(sentiments).astype(float)
	        sentiment = np.sum(sentiments)
	        arousal = np.sum(np.abs(sentiments))
	        ambivalence = arousal - np.abs(sentiment)
	        positive = np.sum(np.where(sentiments>0, sentiments, 0))
	        negative = - np.sum(np.where(sentiments<0, sentiments, 0))
	        result = np.asarray([sentiment, arousal, ambivalence, positive, negative])
	        # Normalize by the number of words
	        if norm == 'mean':
	            result /= len(sentiments)
	        elif norm == 'sum':
	            pass
	        elif norm == 'sqrt':
	            result /= np.sqrt(len(sentiments))
	        else:
	            raise ValueError("Wrong 'norm' argument")
	    else:
	        result = (0, 0, 0, 0, 0)
	    return dict(zip(keys, result))


	today = datetime.date.today()

	# Matplotlib x-axis date formatting
	days_locations = matplotlib.dates.DayLocator()
	months_locations = matplotlib.dates.MonthLocator()
	months_formatter = matplotlib.dates.DateFormatter("%Y %b")

	# Prepare URL and download for Wikipedia
	opener = urllib2.build_opener()
	opener.addheaders = [('User-agent', 'Finn Aarup Nielsen, +45 45 25 39 21')]
	urlbase = "http://en.wikipedia.org/w/api.php?"


	for company, fields in companies.items():
	    wikipedia_revisions = []
	    urlparam = {'action': 'query',
	                'format': 'json',
	                'prop': 'revisions',
	                'rvlimit': 50,
	                'rvprop': 'ids|timestamp|content',
	                'titles': fields['wikipedia']}
	    # Download up to 7 batches of 50 revisions, newest first
	    for i in range(7):
	        url = urlbase + urllib.urlencode(urlparam)
	        wikipedia_result = json.load(opener.open(url))
	        wikipedia_revisions.extend(wikipedia_result['query']['pages'].values()[0]['revisions'])
	        print("%s: %d" % (company, len(wikipedia_revisions)))
	        if 'query-continue' in wikipedia_result:
	            urlparam.update(wikipedia_result['query-continue']['revisions'])
	        else:
	            break
	    # The oldest downloaded revision sets the start date for the quotes
	    wikipedia_last_timestamp = wikipedia_revisions[-1]['timestamp']
	    wikipedia_last_datetime = dateutil.parser.parse(wikipedia_last_timestamp)
	    wikipedia_last_date = wikipedia_last_datetime.date()
	    # Add the sentiment scores to each revision
	    for n, revision in enumerate(wikipedia_revisions):
	        wikipedia_revisions[n].update(sentiment(revision['*']))
	    companies[company].update({'wikipedia_revisions': copy.deepcopy(wikipedia_revisions)})
	    companies[company].update({'quotes': matplotlib.finance.quotes_historical_yahoo(fields['stock'], wikipedia_last_date, today)})
	    xaxis_range = matplotlib.dates.date2num(wikipedia_last_date), matplotlib.dates.date2num(today)
	    # Upper subplot: stock quotes. Lower subplot: Wikipedia sentiment
	    fig = plt.figure()
	    for i in range(1, 3):
	        ax = fig.add_subplot(2, 1, i)
	        ax.xaxis.set_major_locator(months_locations)
	        ax.xaxis.set_minor_locator(days_locations)
	        ax.xaxis.set_major_formatter(months_formatter)
	        if i == 1:
	            quotes = companies[company]['quotes']
	            h = matplotlib.finance.candlestick(ax, quotes)
	            h = plt.ylabel('Stock price')
	            h = plt.title(company)
	        else:
	            x = map(lambda fs: matplotlib.dates.date2num(dateutil.parser.parse(fs['timestamp'])), wikipedia_revisions)
	            y = map(lambda fs: fs['sentiment'], wikipedia_revisions)
	            h = plt.plot(x, y)
	            h = plt.xlabel('Date')
	            h = plt.ylabel('Wikipedia sentiment')
	        h = ax.set_xlim(xaxis_range)
	    fig.autofmt_xdate()
	plt.show()
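
The listing above targets Python 2 and the libraries of its time. As a hedged, dependency-free restatement of just the summary measures computed in its sentiment function, the following Python 3 sketch takes a list of per-word valences (with 0 for words outside the word list). The name summarize and the example numbers are assumptions for illustration.

```python
import math

KEYS = ("sentiment", "arousal", "ambivalence", "positive", "negative")


def summarize(valences, norm="sqrt"):
    """Summarize per-word valences; zeros stand for words not on the list."""
    if not valences:
        return dict.fromkeys(KEYS, 0.0)
    total = sum(valences)
    arousal = sum(abs(v) for v in valences)
    positive = sum(v for v in valences if v > 0)
    negative = -sum(v for v in valences if v < 0)
    values = [total, arousal, arousal - abs(total), positive, negative]
    # Choose the divisor used to normalize for text length
    if norm == "mean":
        divisor = len(valences)
    elif norm == "sum":
        divisor = 1
    elif norm == "sqrt":
        divisor = math.sqrt(len(valences))
    else:
        raise ValueError("norm must be 'mean', 'sum' or 'sqrt'")
    return {key: value / divisor for key, value in zip(KEYS, values)}


# Four words, one with valence 3 and one with valence -2
print(summarize([3, -2, 0, 0]))
```

With the default square-root normalization, the four-word example gives a sentiment of 0.5, an arousal of 2.5 and an ambivalence of 2.0; long texts are damped less than with the mean and more than with the plain sum.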


This entry was posted in programming and tagged matplotlib, numpy, python, sentiment analysis, stockquotes, wikipedia.


Finn Årup Nielsen's blog

– research, science, technology, music, personal opinions, etc.