Moritz Stefaner's Data Visualization Projects

The next Eyeo Festival speaker was Moritz Stefaner.  He called his presentation Truth and Beauty, “the two big maxims that should guide our work”.  As for his background, he started out in the web industry during the first bubble, 1998-2002.  Then he started looking into data visualization, human factors, and the formal and computational aspects, getting a BS in cognitive science.  He became fascinated with the idea of creating information spaces that could be navigated.  Now he’s a freelancer who designs with code and considers himself a designer.  He wants to let the data speak to us, combining the functional and the aesthetic approaches.  He likes to dig into data and discover hypotheses, and hopes that his work can pass on some of that fascination.

He started by describing his visualization of deletion discussions on Wikipedia.  Should a given page be kept or deleted?  It’s debated, and ultimately an administrator makes the decision.  He gave a sample page called “biscuits and human sexuality”.  Each discussion is drawn as a thread that starts at the bottom, at the root of the tree.  Each time someone votes to delete, a red segment is added and the thread takes a turn to the right.  Each keep vote adds a green segment and a turn to the left.  The angles and the lengths decay, so the arguments at the end have a smaller impact than the ones at the beginning.  The end result looks like a plume of threads curling off in different directions, and for particular pages you can see where there might have been an inflection point in the argument.
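To make the drawing rule concrete, here is a minimal sketch of how one such thread could be computed.  This is my own reconstruction from the description, not Stefaner's code: the function name `discussion_thread`, the starting step length, the turn angle and the decay factor are all assumptions chosen for illustration.

```python
import math

def discussion_thread(votes, step=10.0, turn_deg=20.0, decay=0.95):
    """Return line segments for one deletion discussion.

    votes: sequence of "keep" / "delete" strings, in the order they were cast.
    step, turn_deg and decay are illustrative values, not taken from the talk.
    Coordinates are y-up, with the root at (0, 0) and the thread heading upward.
    """
    x, y = 0.0, 0.0
    heading = math.pi / 2            # start pointing straight up from the root
    length = step
    turn = math.radians(turn_deg)
    segments = []
    for vote in votes:
        if vote == "delete":
            heading -= turn          # clockwise, i.e. a turn to the right
            color = "red"
        elif vote == "keep":
            heading += turn          # counter-clockwise, i.e. a turn to the left
            color = "green"
        else:
            raise ValueError(f"unknown vote: {vote!r}")
        nx = x + length * math.cos(heading)
        ny = y + length * math.sin(heading)
        segments.append(((x, y), (nx, ny), color))
        x, y = nx, ny
        # Later votes get shorter segments and gentler turns.
        length *= decay
        turn *= decay
    return segments

for start, end, color in discussion_thread(["keep", "keep", "delete", "delete", "delete"]):
    print(color, start, end)
```

Each returned segment can be handed to whatever drawing tool is at hand.  A run of same-colored votes produces the curl, and a switch from green to red (or back) shows up as the inflection point.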

The project started with an email from “a random person on the Internet”: a researcher who was working on analyzing wikis, and whose group had analyzed 200,000 of these discussions.  Stefaner was interested because “they had a unique data set which was huge”.  So he just began to play with the data.  He used NodeBox, which is similar to Processing but in Python, and he likes Python.  At first he didn’t use the decaying technique, so the “threads” for the long discussions were very long.  Ultimately he forced himself to use just the top 100 from the articles that were kept and from the articles that were deleted.

The next project he showed started as a conference tool, to visualize the tweets at a given conference.  He found some existing tools that were straightforward but unattractive, so he decided to code his own.  Visually, who is the most important in the conversation?  What is the core and what is on the periphery?  Each Twitter poster’s icon represents one tweet, and if a tweet gets @replies and retweets, it gets larger.  Clicking on an icon blows it up so you can see what is going on with it.  A big challenge for the future is dealing with the streams and figuring out what data to throw away.

A map of where New Yorkers moved from and to shows data on why people moved and which boroughs people moved into or out of, but Stefaner didn’t go into too much detail.

The Better Life Index, a funded project, shows a new way to look at ranking countries.  It looks at factors like life satisfaction, governance, safety… and you can individualize it by dragging things to match your own preferences and then send your personalized visualization to a friend.  The visualization became a big part of marketing for the client.  It’s at oecdbetterlifeindex.org.
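Stefaner didn’t spell out how the personalized ranking is calculated, so the following is only a guess at the general idea: a weighted average of per-topic scores, where the weights are the user’s preferences.  The function name `rank_countries`, the placeholder countries and every number below are made up for illustration and are not OECD data.

```python
def rank_countries(scores, weights):
    """Rank countries by a weighted average of their topic scores.

    scores:  {country: {topic: score between 0 and 1}}
    weights: {topic: importance chosen by the user}
    Assumption: the index is a plain weighted mean; the real site may differ.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one topic needs a positive weight")
    index = {
        country: sum(weights[topic] * topic_scores[topic] for topic in weights) / total
        for country, topic_scores in scores.items()
    }
    # Highest index first.
    return sorted(index.items(), key=lambda item: item[1], reverse=True)

# Placeholder scores, invented for this example.
scores = {
    "Country A": {"life_satisfaction": 0.8, "governance": 0.6, "safety": 0.5},
    "Country B": {"life_satisfaction": 0.5, "governance": 0.6, "safety": 0.9},
}

safety_first = {"life_satisfaction": 1, "governance": 1, "safety": 3}
happiness_first = {"life_satisfaction": 3, "governance": 1, "safety": 1}

print(rank_countries(scores, safety_first))
print(rank_countries(scores, happiness_first))
```

With these placeholder numbers, Country B comes out on top when safety is weighted heavily and Country A when life satisfaction is, which is the kind of reshuffling the preference sliders are meant to show.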

He rejects clients whose request starts with “I want to show that…”.  He believes, like Sir Arthur Conan Doyle, that “It’s a capital mistake to theorize before one has data…”.  He prefers questions like “I want to see if…”.

I’ll add links later when I can…