42. How to See Data
Kim Rees, Periscopic
@periscopic, @krees
Editor’s notes
About two years ago we hired our first data scientist, plunked him down at a desk, and gave him a bunch of data. I quickly realized that even if you’re data minded, you still need guidance to make the leap to visuals... It’s a CREATIVE PROCESS. So, this talk is about that. It’s spawned from discussions with Andrew, our data scientist. It’s our very informal, by-the-seat-of-your-pants process of jumping from data to visual data.
So how do we wrap our heads around what data ACTUALLY looks like? How do we make that transition?
The first step is obvious... Choose a tool and throw your data into it. I prefer Tableau because it’s quick and dirty... I don’t program much anymore, so this keeps me from getting stymied by syntax. It’s pretty flexible and usually gives me some quick insights. It lets me see the SHAPE of the data. And if you want to use Tableau just for this challenge, you can try it for free for 30 days or use Tableau Public for free.
Our data scientist, Andrew, also likes R, Python, and D3. You get the richness of Tableau, but also the flexibility of programming custom stuff into your visualizations, including motion. Then there are always specialized tools such as Gephi, Processing, etc., for networks and other specialized datasets. If you have questions about tools, I’ll be answering questions in the chat after this presentation, so that’s a good time to talk about them.
So as an example I wanted to show this process from one of our first projects. It was called State of the Sockeye and it was about salmon. So we threw all our data into Tableau and just looked at the data in different ways. Here’s a quick chart we did about where the fish are monitored. We found that some fish were well monitored, but most had only 1 or 2 sites.
So this is what we showed in the final design. Here each center dot is the fish population and the monitoring sites are the outer rings of dots.
Again in Tableau, we showed the percent growth or decline for each population.
As well as the count of the fish over time. Color indicates an overall increase or decline.
Those two charts were combined for the final design on the site.
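If you’d rather do that step in code than in Tableau, here’s a minimal sketch of the percent growth or decline calculation and the increase/decline color. The population names, the counts, and the color choices are all made up for illustration; this is not the code behind the actual project.

```python
def percent_change(first, last):
    """Percent growth (positive) or decline (negative) from first to last."""
    if first == 0:
        raise ValueError("percent change is undefined when the starting count is 0")
    return 100.0 * (last - first) / first


def trend_color(change):
    """Pick a color for an overall increase or decline (colors are illustrative)."""
    return "green" if change >= 0 else "red"


# Hypothetical yearly counts per population (made-up names and numbers).
populations = {
    "Population A": [1200, 900, 600],
    "Population B": [300, 450, 600],
}

# For each population: (percent change from first to last year, color).
summary = {}
for name, series in populations.items():
    change = percent_change(series[0], series[-1])
    summary[name] = (change, trend_color(change))
```

The first and last values of each series drive both the number and the color, which is one simple way to combine the two charts into a single view.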
Here we’re showing the fish counts in a scatterplot in Tableau. You can see that most are down in the lower left corner, but some populations grow exceptionally well which makes it hard to see all the tiny dots in the lower left.
You really need to zoom in to see the populations.
So in the final design, we accommodated that by allowing the user to control the zoom level. Also, we used a log scale to help fit more on the screen.
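To see why the log scale helps, here’s a rough sketch that places the same counts on a linear axis and on a log axis. The counts and the 0 to 100 axis width are assumptions for illustration, not the real salmon data.

```python
import math

# Hypothetical fish counts: most populations are small, one is huge.
counts = [12, 40, 85, 150, 900, 250_000]


def linear_position(value, max_value, width=100):
    """Map a value to a 0..width axis position on a linear scale."""
    return width * value / max_value


def log_position(value, min_value, max_value, width=100):
    """Map a positive value to a 0..width axis position on a log scale."""
    span = math.log10(max_value) - math.log10(min_value)
    return width * (math.log10(value) - math.log10(min_value)) / span


lo, hi = min(counts), max(counts)
linear = [round(linear_position(c, hi), 1) for c in counts]
logged = [round(log_position(c, lo, hi), 1) for c in counts]

# On the linear axis every population except the largest lands in the
# first 1% of the width; on the log axis they spread across the range.
```

Note that a log scale only works for positive values, so zero counts would need separate handling.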
At some point, no matter what tool you’re using, you need to take a step back. You’ve been heads down in the weeds of data for a while, so you’ll need to get a new perspective on it. There’s a process for teasing life into data. This is where the magic and the hard work happens. It’s unthinking yourself out of those weeds of data.
You need to look at the thing the data represents. So if you’re dealing with the flight dataset, look at images of airplanes, airports, travel. Think about what those numbers are actually saying...
What do you see or notice that’s outside the range of your data... Or what about your data stands out?
Tomorrow is Veterans Day in the US, so I have to give a shout out to my husband, who is a vet.
If you watched Mahir and Ben last night (or whenever it was your time), you’ll know they talked a lot about drawing. I can’t agree with them more. Sketching is the absolute fastest way to find a solution. It doesn’t matter if you draw well or not, you just need to get visual ideas down on paper.... You don’t need to be a designer or artist to do it.
This is an example of my sketchbook for a project we did for the Economist Intelligence Unit. It’s about rankings of a cybersecurity index. So we started with some simple bar charts.
Then plotted the ranks out and quickly hit upon this concept of a “rank stream”
Each country is at the top and their scores follow a stream down to the bottom of the page.
The final design was a bit compromised, but here’s how it came out with their development team.
Most data is living, breathing data, meaning, it has a real world life. It’s salmon that are dying at sea, kids being bullied, people without enough money to feed their families. You need to confront the emotion of data.
Are there strong and weak relationships? Is there heavy data and light data? Here is a small graphic about pediatric deaths. Wait, that sounds rather clinical.... Actually, these are kids who are dying. Here they are represented as a bubble chart. But wait, bubbles? Okay, that says “kids” but more along the lines of “fun kids.” It doesn’t really say “dead kids.”
Here’s something more appropriate. This shows a bit of the emotion of this data.... These are paths. And at the end of each path, those kids are no longer with us. So, when we look at this, we don’t lose sight of the message. My own son was on this path, but then his doctors cured him of cancer. Much different than seeing him as part of a bubble.
Now others have known how to use emotion for a long time...
Politicians....
Environmentalists,
activists....
So why do we see data as sterile... as devoid of place or context or feelings? I challenge you to use your emotions as a starting point when looking at your data.
Here’s a project we did in 2004. That’s a long time ago, back before we even had a designer, so please don’t snicker. This was a project that showed the statistics of rape.
Here at the bottom, if you can’t read it, it says: “He was a friend. He physically forced her in a car. She didn’t fight back because she feared being killed.” We essentially constructed a story based on the statistics of rape. So each incident is real... tangible... It’s also in real-time. One rape is shown every 2 minutes – the actual rate of sexual assault at the time. So here we’re showing the statistics for all – we’re visualizing all the numbers, but we’re focusing on just one. Just one – right now. Telling that personal story can sometimes highlight something that’s otherwise overlooked or dismissed. For instance, when you read one of these stories about rape between men, you really start to see what even the small numbers mean.
Okay, so then you get to a point where you feel like... well, a little overwhelmed by all this amorphous emotion and all these qualitative things... Everything starts to feel a bit mushy. That’s when I go back to the data. This is the dance you do with data... It’s like sculpting... Just as with any other medium, you have to respond to the qualities of your data, to let it go in its own direction... then have the ability to coax it back to elegance... to cull its significance.
One quick thing you can often find in data is relationships. What are some of the relationships in a debate? Topics have keywords. Candidates speak words. Etc. That structure may exist in your data source already... Here’s a starting sketch we did when looking at our data from the presidential debates.
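If it helps to think of those relationships in code, here’s a small sketch that links candidates to topics through the keywords they speak. The candidates, topics, keywords, and sentences are all invented for illustration, and this is not how our actual debate data was structured.

```python
from collections import defaultdict

# Hypothetical topics and their keywords (made up for illustration).
topic_keywords = {
    "economy": {"jobs", "taxes", "budget"},
    "health": {"insurance", "hospitals"},
}

# Hypothetical (speaker, sentence) pairs from a debate transcript.
statements = [
    ("Candidate A", "We need jobs and lower taxes"),
    ("Candidate B", "Insurance costs are crushing families"),
    ("Candidate B", "The budget must fund hospitals"),
]


def link_candidates_to_topics(statements, topic_keywords):
    """Count, per speaker and topic, how many topic keywords were spoken."""
    links = defaultdict(lambda: defaultdict(int))
    for speaker, text in statements:
        # Normalize words: strip basic punctuation and lowercase.
        words = {w.strip(".,!?").lower() for w in text.split()}
        for topic, keywords in topic_keywords.items():
            hits = len(words & keywords)
            if hits:
                links[speaker][topic] += hits
    return {speaker: dict(topics) for speaker, topics in links.items()}


links = link_candidates_to_topics(statements, topic_keywords)
```

The result reads as “this candidate spoke this many words tied to this topic,” which is the kind of plain-language relationship you can then sketch.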
Some other explorations of that data.
Here’s our final design. It’s best to understand the relationships by describing them in real words.
Sometimes focusing on just one allows you to see the essential relationships. You don’t need to overwhelm the viewer with everything at once. This is Stanford’s Dissertation Browser. It’s basically a network, but THANKFULLY they didn’t draw a network diagram. Instead, you click on an area of study, such as CompSci in this example, and it shows how closely related that corpus of dissertations is to other areas. Here, you can see that CompSci gets tied in with a bunch of engineering, but also some chemistry, philosophy, etc.
When I click on biochemistry, I can see that it’s related to many other fields – the life sciences, but also natural sciences and engineering.
Okay, there’s so much more. This is just a high level look. Some quick notes to finish off.
This is a fantastic place to start if you’re stuck. It basically starts in the center with the type of thing you’re trying to show – comparison, relationship, distribution, etc. – then shows a decision tree for each of those things to get to the right chart. Now it’s terribly limited, but it can often get you at least headed in the right direction.
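The idea of that decision tree can be sketched as a simple lookup. The branches and chart suggestions below are my own illustrative picks, not a transcription of the chart on the slide, and like the chart itself they’re terribly limited.

```python
# A tiny, illustrative slice of a chart-chooser decision tree:
# first pick what you are trying to show, then narrow by one detail.
CHART_CHOOSER = {
    "comparison": {"over time": "line chart", "among items": "bar chart"},
    "relationship": {"two variables": "scatter plot", "three variables": "bubble chart"},
    "distribution": {"one variable": "histogram", "two variables": "scatter plot"},
}


def suggest_chart(purpose, detail):
    """Return a suggested chart type, or None if the tree has no such branch."""
    branch = CHART_CHOOSER.get(purpose)
    if branch is None:
        return None
    return branch.get(detail)


suggestion = suggest_chart("comparison", "over time")
```

A `None` result is the honest answer when the tree doesn’t cover your case, and that’s usually the cue to go back to sketching.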
Look to other data visualizations, patterns, artwork, color schemes, for inspiration. We use Pinterest to keep some boards of our inspirations... You can find them at the link here.
Sketch, sketch, sketch! Drawing is the fastest way to get concepts realized. Once you find the right visual, you can go back to your tools to create it.
Thanks. I’ll be answering questions in chat after this, so please meet me over there. Thanks!