Sunday fun: The Data Storyteller

This weekend, while researching something else, I came across an enjoyable talk by Ben Wellington that was recorded at a TEDx event on Broadway this spring.  In the talk, he explains that over a year ago, he wouldn’t have known what a data storyteller was, nor would he have imagined becoming one.  That all changed after he used some data about biking accidents from the New York City Open Data Portal to create the heat map shown below, which became the subject of articles on a few New York-area websites and blogs.

The Terrifying Cycling Injury Map of NYC, 2013 Edition by Ben Wellington

The talk is entertaining and informative, and I recommend watching it to learn more about what he does with his blog and how he has turned into a data storyteller.  As he explains, though, he realized that one reason his blog had caught on was that it provided a way of combining his interests in data science, urban planning, and, of all things, improvisational comedy.

From my perspective, both the talk and his approach encapsulate the “use everything” philosophy that I wrote about in an earlier post.  As he explains, he’s not using complicated math here, but he’s asking a variety of creative and simple questions, and then trying to see what answers the data provides to those questions.

In my last post, I wrote about some ways to use some of the features of Excel and Access to get started with data analysis.  While that may seem like a very basic way to start, this talk illustrates that basic analysis can tell us a great deal if we use our imaginations and ask questions related to our interests and experiences.  Wellington even provides examples at the end of his talk of how students who were new to statistical analysis were able to use the New York City data to begin asking and answering their own questions.

Beyond those points, though, both the talk and the approach that it outlines illustrate that for people who are interested in learning to work with data, there are lots of free and accessible ways to get started.  The New York City Open Data Portal is a great resource that contains a lot of information, but there are many other large, open data sets out there if one knows where to look.  Here, for instance, is but one listing of the many sorts of public data sets that are available.  And although I plan to talk more about other kinds of data analysis software in future posts, here is a link to the open-source geographic information system software that Ben Wellington used to create his initial map of cycling accidents.

For me, though, one of the key points to take away from this talk is that if you are interested in working with data, you no longer need to make the mistake that I made for many years in thinking that you are seriously limited by the data that is readily available or even by the software that you do or don’t possess.  Even if your long-term goal is building predictive models, you can gain a lot of knowledge and expertise by practicing with other sorts of data first, and when you do, you might find that your interests take you in different directions than you originally imagined.

Have you used data to do any interesting storytelling of your own?  If your city or state has an open data portal, what sorts of questions would you be interested in answering with it?  Please share your thoughts below.

So You Want to Work with Data? The Case for Indirection

Data analytics and “Big Data” have been hot topics for organizations of all kinds for several years now, and more and more businesses and nonprofits are looking for ways to make use of them.  The rising visibility of data analytics has occurred alongside another trend that we see arise periodically in American culture, a growing insistence on practical education above all else.  Such an approach places a premium on utility and directness, as though they offer the best method for learning or applying any new skill.

I have noticed an increasing number of articles lately about the importance of practical education, articles that also display a disregard for the liberal arts.  Such a focus on practical skills is short-sighted, however, in that it doesn’t necessarily teach anything about the uncertainties of truth, the ambiguities of language, the nature of rhetoric, or the primacy of metaphor.  Directness has great value, but sometimes indirectness offers the greatest path to knowledge.

I take my cue here from Robert Frost, who in 1930 gave a talk that was later published under the title “An Education by Poetry.”  Frost observes:

What I am pointing out is that unless you are at home in the metaphor, unless you have had your proper poetical education in the metaphor, you are not safe anywhere.  Because you are not at home with figurative values:  you don’t know the metaphor in its strength and its weakness.  You don’t know how far you may expect to ride it and when it may break down with you.  You are not safe in science.  You are not safe in history.  (Frost, 334)

If, as Frost posits, all thinking is metaphorical, then part of what metaphor teaches us is how to be comfortable with its limits, and part of what it helps us recognize is how to live with uncertainty as we try to define the contours of the objects in front of us.

Robert Frost Farm 03 by TechSavi is licensed under CC BY-NC 2.0.

But how do we come to understand metaphor?  Well, of course, one way is through reading poetry, and Frost is instructive on that point, too.  In a 1954 essay entitled (appropriately enough for my purposes here) “The Prerequisites,” Frost explains that we come to understand the ambiguities of any single poem partly by reading more poetry:

A poem is best read in light of all the other poems ever written.  We read A the better to read B (we have to start somewhere; we may get very little out of A).  We read B the better to read C, C the better to read D, D the better to go back and get something more out of A.  Progress is not the aim, but circulation.  The thing is to get among the poems where they hold each other apart in their places as the stars do.  (Frost, 418)

Even in the world of data analysis, in looking for patterns and insights in columns and rows of variables, an ability to think metaphorically is a key component in figuring out workable truths.  Frost is talking about poetry above, but one hardly needs to be a master of metaphor to recognize that his description also applies to one key technique for becoming proficient in the art and science of data analysis.  In data analysis, as in reading poetry, each analysis we conduct, each variable we examine, each relationship we test, helps prepare us for the next one, and that helps us learn more, in turn, about the previous ones.

The title of my post comes from the idea of being indirect by speaking in metaphors, but as it happens, through my word choice, I’ve also unintentionally hit upon a truth about working with computers.  The late Cambridge computer scientist David Wheeler was famous for saying, “All problems in computer science can be solved through another level of indirection, except of course for the problem of too many indirections.”  Though the quote may be apocryphal (having also been attributed to Butler Lampson, who claimed Wheeler as the source), it nevertheless uses the term “indirection” metaphorically to describe the ways in which problems are broken down, by working backwards or sideways or through other levels of abstraction.  As Diomidis Spinellis writes: “The quote rings in my head on various occasions: when I am forced to talk to a secretary instead of the person I wish to communicate with, when I first travel east to Frankfurt in order to finally fly west to Shanghai or Bangalore, and—yes—when I examine a complex system’s source code.”

Without downplaying the importance of understanding statistics or statistical software to the field of data analysis, therefore, one prerequisite for working creatively with data has little to do with software or statistics.  Although one certainly needs a tolerance for numbers, formulas, spreadsheets, and databases, one essential element of working with data is a well-cultivated imagination, one with the ability to envision connections and relationships that don’t seem obvious or logical and, even better, one that is willing to ask all kinds of questions, no matter how absurd, silly, trivial, or beside the point they might seem at first.

Work Cited:

Frost, Robert.  Poetry and Prose.  Edited by Edward Connery Lathem and Lawrance Thompson.  New York:  Henry Holt, 1972.