Telling Stories with Data, A VisWeek 2010 Workshop

This is a guest post by Joan DiMicco, who heads the IBM Visual Communication Lab. Matt McKeon, Karrie Karahalios, and Joan hosted a workshop on Telling Stories with Data. These are the highlights.

What is a story? In a classic sense, a story has characters, events, and a progression. In our postmodern, meta-obsessed culture, we also tend to think about story in terms of the identities of the author and audience.

Now what if the story involves data? How does visualization support telling a story with data? How do journalists think about data visualization as part of their stories? How can visualization tools help data storytellers construct narratives?

At VisWeek 2010 in Salt Lake City, Matt McKeon, Karrie Karahalios, and I organized a workshop to explore this topic of Telling Stories with Data. We were initially motivated by our observation that people often use visualization to share personal perspectives and to tell stories about social situations.

For example, in May of this year, Matt was interested in exploring how Facebook’s default privacy settings have changed over time. To that end, he created a visualization to illustrate their evolution. Matt then posted the visualization to his website, along with some explanatory text that further communicated his point of view. At the time, Facebook’s privacy policy changes were a trending news topic, and the visualization spread rapidly through Twitter, Facebook, and several news blogs. Through use of animation and appropriate metaphor, this visualization told a simple yet compelling story: the information you put on Facebook is visible to larger and larger groups of people.

Yet the things that we found most striking about the reaction to this visualization went beyond the issues we tend to think about as visualization designers. Matt found that changing a single sentence in the framing text of the visualization radically altered the nature of the comments that people made. Furthermore, as the story spread, people retold it as their own story, using the data visualization to describe elements from their personal experiences with Facebook.

This suggests factors that transcend the boundaries of “Visualization Design 101” and that affect the creation, interpretation, and popularization of data-centric narratives. We organized this workshop in order to explore these topics, bringing together some great speakers and participants to hash out an agenda for research and practice around storytelling with data. Luckily, this turned out to be a theme at VisWeek 2010: Jock Mackinlay and Fernanda Viégas organized a panel around the same topic, and Ed Segel and Jeff Heer’s excellent design space survey of narrative visualization provides a much-needed stake in the ground for further research.

Context

In 2007, the InfoVis capstone by Stephen Few offered some provocative views about the disconnect of the information visualization academic community from the world of people who require visualization to understand complex data. The suggestion to the academic community was to connect with the rest of the world in more ways, and to provide better “visualization for the masses” rather than expert-only tools. Since 2007, collaborative visualization tools (e.g. Tableau and Many Eyes) and the popularization of interactive visualization in the news media have pushed visualization further into the realm of popular consciousness and enabled more people to create and interact with visualizations.

We see the theme of “storytelling with data” as a re-focusing of the discussion of “visualization for the masses” that emphasizes the communicative intent of visualization, rather than simple mass appeal. A visualization can be viewed in terms of its intended message, the characters and events within the data, and the intended progression the audience can take through the visualization. Additionally, an effective story can be understood by different audiences with varying levels of expertise with the material.

The Workshop

We organized our workshop around four themes: Storytelling Practice, Interactive Journalism, Storytelling Theory, and Tools for Storytellers. Below is a summary of the eight presentations, along with some of our thoughts. More details can be found on the workshop website, along with PDFs of each speaker’s presentation.

Storytelling Practice

We invited two data storytellers to share their ideas, questions and observations with us.

Jérôme Cukier runs the OECD factblog, a venue for popularizing the economic and development data that his organization gathers. His goal in creating a given blog post is to capture the interest of the reader within 10 seconds; visualization is an important technique for doing so. Once he has the reader’s attention, the text of the blog post describes the data in more detail, including links to download the data. Like a journalist, his approach is to start with the story he believes is in the data, rather than performing analysis to find stories. Once he has a story in mind, he deliberately limits the data he visualizes to create a simple, coherent story. The story’s purpose is to provide a concrete message that is expressed in few data points.

Matthias Shapiro is an independent visualization designer and developer; those of you who have read Beautiful Visualization may remember his chapter on storytelling. He began his talk by describing classic story structures, such as the heroic epic; he uses these models to structure the pace at which he introduces visual data into his narrative. He then shared two videos to illustrate different modes of audience engagement: the Obama Administration’s Road to Recovery video frames unemployment statistics as a near-mythic struggle between a disaster and a hero, while Matthias’ own Obama Budget Cuts Visualization uses the down-to-earth metaphor of pennies to explain Obama’s proposed budget cuts in 2009. These stories both assume a basic understanding on the part of the reader, and each slowly introduces more data, represented by visuals, as the story builds. He recommends data storytellers place the data into a context and scale that the user already understands. Finally, a narrative twist or surprise at the end tends to make for a persuasive story.

Interactive Journalism

Interactive data visualization is changing the way journalists present stories, and The New York Times is often cited as an example of its power. CNN is enabling readers to use data to tell personal stories within the framework of interactive visualization.

Manav Tanneeru and Toni Pashley from CNN.com presented several examples of their visualization work, including the powerful Home and Away interactive feature. Produced in collaboration with Stamen Design, Home and Away presents military casualties in Iraq and Afghanistan by connecting the locations of each trooper’s birth and death, along with some search and dynamic query features based on demographics. The visualization is deeply integrated with CNN’s iReport platform, allowing family and friends of the deceased to tell their personal stories in the broader narrative of the conflicts. Toni placed special emphasis on the need to rapidly engage an audience with data of this scale, and shared some of their techniques and metrics for success. Manav described some of the features of the visualization (such as deep linking to specific visual states) that enabled them to treat Home and Away as a platform for telling multiple stories in several articles while still empowering their audience to explore on their own.

Nicholas Diakopoulos of Rutgers University presented a different model of interactivity in journalism by exploring the concept of “Game-y InfoGraphics.” His projects, both in and out of the newsroom, use traditional game mechanics to engage users with data; Nick argues that this process of interaction helps people to discover stories they may not have found otherwise. One example of this is Salubrious Nation, a guessing game that encourages people to explore outside of their own demographic and geographic regions while playing the game.

One can draw an interesting contrast between the approaches of our three speakers in this session. They each have the goal of encouraging users to explore the data, yet Manav and Toni emphasized interaction techniques (such as splashes, explicit search tasks, and featured iReport content) to immediately engage the casual reader by helping them to find personally relevant data. On the other hand, Nick’s visualizations use game mechanics to encourage readers’ interest in people and places that are unfamiliar to them.

Storytelling Theory

One of our goals for the workshop was to learn what humanities scholarship might have to say about the creation of data narratives. In this session, we invited two speakers to lead a discussion of the theoretical aspects of storytelling: what kinds of stories exist, what the component elements of a story are, and how those might be embodied in a visualization.

Jessica Hullman, of the University of Michigan, used the example of the NYTimes’ How Different Groups Spend their Day to illustrate elements of narrative theory and their application to critical scholarship of data stories. A multilayered approach that identifies the agents, events, and progression in a story enables us to deconstruct their presentation in a given visualization and thus understand the design choices made. Similarly, a model of perspective can help us to isolate the narrative voice of a visualization and tease out the motivations, relationships, and social contexts that enable that visualization to tell a story effectively. These two approaches can help us to analyze existing visualizations that tell stories, and therefore develop theories that can inform the creation of more effective and engaging visual narrative devices. Jessica’s future work focuses on analyzing perspective in data visualization, and the manner in which it drives interpretation of data.

Next, Aisling Kelliher from Arizona State University gave us an overview of computational storytelling, beginning with the fundamental concepts of Aristotelian poetics (perhaps a first for an IEEE conference) and exploring how a given medium influences the styles and experiences of stories. She then went on to share some of her own work in rich media storytelling, from systems that enable people to assemble stories from their photos to visualizations that track broader trends across social media. In doing so, she constructed a map of telling stories in interactive media – ranging from individual to societal and human-authored to emergent.

Aisling’s talk sparked a discussion of characters and scale in data stories — many statistical graphics contain data that spans a broad population. The challenge of telling a story with such a dataset is identifying individuals in the visualization in such a way that their stories form part of the broader epic.

Tools for Storytellers

There are several tools available for everyday users to create visualizations, and these tools are becoming increasingly sophisticated in their support for conveying and sharing complex stories. We asked Jock Mackinlay of Tableau Software to illustrate features in Tableau Public that support storytelling. Jock shared an example from his personal experience: a comparison of two math textbooks that used multiple data visualizations coupled with a textual narrative. Jock created this feature in order to persuade his local community to select a different textbook for their school system. Jock pointed out that a highly customizable visualization appearance is critical for systems that support storytelling, because an author may use a number of layout, animation, and visual features to control the flow of a narrative.

Wesley Willett from UC Berkeley presented his tool CommentSpace as another example of how visualization tools can support users in creating stories. CommentSpace builds upon the capabilities of previous platforms like sense.us and Many Eyes by supporting more structured commenting, allowing hypothesis construction and evidence gathering. Through CommentSpace, a user can explore a fully-featured visualization and construct an argument for or against a hypothesis by tagging comments and different views of the data. Wesley’s recommendation for tools for storytellers is to support three stages of story development: exploration of the data, organization of the points, and synthesis of the story.

Conclusion

All in all, storytelling offers a compelling lens for both critiquing existing visualization systems and designing new ones. A number of issues were identified by both speakers and participants in this workshop as important to consider when using this perspective, including:

  • reconciling the open-ended nature of interactive visualizations with the fixed paths of traditional storytelling
  • identifying specific interaction techniques in visualization systems that assist with storytelling
  • applying methods from film and other time-based media to visualization design
  • identifying potential characters, events, and plot in a dataset and revealing them to an audience

We’re excited to explore these and other issues with the visualization community, and think that a focus on storytelling with data has the potential to promote increasingly sophisticated and data-literate conversations in the world at large.

My many thanks to Joan, Matt, and Karrie. Visit the workshop page for more details on telling stories with data and the presenters’ slides.

15 Comments

  • great post! As a Tufte fan and a purchaser of (an unfortunately misprinted) Beautiful Visualization, I think your closing statement we “think that a focus on storytelling with data has the potential to promote increasingly sophisticated and data-literate conversations in the world at large.” is spot on. With so many complex and complicated issues on our plates, being able to ‘see’ relationships and use data constructively will be critical to surviving the morass we are presently in.

  • I’m a big fan and advocate for data-driven creativity. Great to see this guest post crop up.

  • Thanks for the recap Joan. Wish I was there. Storytelling with data continues to be a thrilling endeavor and it’s great to get a big picture on what folks are doing. Indeed lovely key points for the community to consider in conclusion there at the end. When it comes to interactive storytelling, the value of using data is the definitive nature about it that reports clearly on the past to inform the future. I still think the visualization community is largely missing the real value it has to offer – the ‘so what’ factor. Data journalism has the unique ability to personalize the context of data and give reader level feedback that points toward relevant action. The field has an exciting future and great potential since people are most moved by truth in stories, but data journalism should fundamentally present an actionable pathway in conclusion. If a story is worth my time to not only read but also think analytically about, its payoff has to be a clear, personal decision I can make to impact and enter the story myself. That kind of anticipation will encourage future readings with similar presentation styles. This is where data journalism should intrinsically point viewers, and a direction the budding field should think more seriously about.

    I am genuinely grateful for the strides being taken and inspiration that abounds from you and the community. It’s a daily joy! Cheers all around.

  • You write:
    “…Like a journalist, [Jérôme Cukier’s ] approach is to start with the story he believes is in the data, rather than performing analysis to find stories. Once he has a story in mind, he deliberately limits the data he visualizes to create a simple, coherent story. The story’s purpose is to provide a concrete message that is expressed in few data points.”

    If so, then Mr. Cukier has the process completely backward and violates the principles of GOOD journalism. That is, if he starts with “the story” in mind, then he will typically be headed toward simply confirming stereotypes with the selective capture of data supporting his notions. On the other hand, if one starts with solid data and subsequent high-level analysis of the data, (a) the validity of “common-sense knowledge” can be tested and (b) new stories that truly inform the public of new phenomena (it’s called “news”) can be found.
    I recall a research effort a few years back by Russell Clemings at the Fresno (California) Bee. As I recall, it was widely believed that the city’s population was expanding in one particular direction. Clemings, using census data and data-mining tools like GIS, discovered that, in fact, the population in the center city was expanding. Why? Because immigrant communities were taking advantage of lower housing costs in the center city and moving multiple families into available homes and apartments. The result? Stereotypes and “everyone knows” myths at least questioned by citizens and, we hope, city administrators.

    • To be fair to Jerome, I think we oversimplified his position a bit. What this characterization of his process meant to imply is that it is VERY rare that a dataset tells a complete story.

      No dataset can adequately capture all of the qualitative factors that contribute to understanding — even an expert analyst approaching a complex dataset in a completely unbiased manner brings with them a wealth of associations, experience, and context that enable them to adequately interpret the things that they see.

      Data can only ever relate the bones of a story. Human motivations, cultural contexts, and other externalities provide the flesh. The difference between how a journalist typically approaches data (with a story in mind that they’ve gathered from conversations, research and previous experience) and how an analyst approaches it (starting with data, looking for patterns, and then needing to go outside the dataset for an explanation) is simply a matter of where you start.

      Issues like the one you raise were repeatedly brought up during the workshop. They’re valid concerns; however, I think the visualization community gets a little too wound up about confirmation bias sometimes. Data itself is also subject to bias and error, and interpretation of patterns one identifies in data is an especially treacherous process.

      When telling stories with data, inductive reasoning from a dataset and abductive reasoning *towards* a dataset are two sides of the same coin; neither absolves the storyteller from a need to be vigilant towards the truth.

    • Hi Tom, thanks for your remark.
      This is a question I got at the workshop, so thanks for giving me the opportunity to clarify.
      I work at OECD, like hundreds of economists and researchers. So, I have the luxury of working on subjects which have already been analysed by some of the most reliable experts on earth. Now, when experts talk to experts their communication style is very technical and accurate, like our books.
      With that background, how can we tell stories with our data? That’s why I say start from the story. From the mass of information we produce, we look for facts which can be reformulated in a simple, but memorable way. This gives us an angle to write an article.
      This step is really critical. If we skipped this and only released the data (which we also do), I wouldn’t leverage the analysis that has already been carried out, and we would run the risk of seeing the data misinterpreted.
      These stories are more conclusions than hypotheses. They can be counter-intuitive, like the example you give, or this one: we know that there is a strong relationship between obesity and income. Yet we cannot establish that one causes the other. And, they are checked by the original researcher.
      So the aim is not to spread stereotypes but rather to use data to support new ideas with facts.

  • Peter Kinnaird November 12, 2010 at 6:26 am

    I was fortunate enough to attend this workshop. Not only were the presentations outstanding but the working groups that developed between sessions made for interesting discussion.

    I had recently been having a discussion with my adviser about how to approach a particular visualization problem and this workshop helped clarify my ideas and articulate a coherent strategy for the problem. Jerome’s presentation was especially helpful in identifying both what we were doing wrong and how we could improve our approach.

    One of the working groups I was a part of reached a consensus that Jerome’s approach was pretty good given his work environment. We did think there was a missing stage though: consideration of the audience. We thought this could be stage 0 in Jerome’s framework or perhaps he considers it already present in stage 1. Either way, we thought the presentation might benefit from the inclusion of this idea.

    Overall it was an excellent workshop!

  • Very interesting and informative post. The interactive element is of particular interest to the Center for Digital Information http://digitalinfo.org These techniques are rarely employed by the organizations we are focused on, and we maintain they need to use them more. Many nonprofits working on all sorts of issues have a keen interest in “storytelling,” but view it a certain way (e.g., a video about a person or group, a written narrative with clear beginning, middle, and end, etc.). It is usually not heard in the same breath as “interactive” and certainly not “data.” So this post is a good way to show how visual, interactive, data-rich features can be used in support of a broader definition of “storytelling.” Thanks.