Category Archives: Domain-Specific

Snowzilla!

After shoveling the driveway several times and burning through the Netflix queue, one way to counteract cabin fever is to hunt down some snowfall data and play around with it.  So, I found some data over at the National Weather Service that contains snowfall depth measurements collected from a variety of sources around the region at various time points during the storm.

The map shows the maximum snowfall depth at any given location recorded from Friday until Sunday.  The deepest measurements are labeled. The area near West Virginia clearly bore the brunt of the storm, but there were some areas closer to DC that came close.  Everyone pretty much got a lot of snow.

snowzilla_map

Brunel Code:

map('usa') + x(Lat) y(Lon) max(Snowfall) color(Snowfall:blues) tooltip(City,Snowfall) style("stroke-width:0;opacity:.4;size:15px") + 
map('labels')  + x(Lat) y(Lon) max(Snowfall) top(Snowfall:10)  label(Snowfall, '"') text style("font-family:Impact;fill:darkblue") tooltip(City,Snowfall)
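
As a rough illustration of what max(Snowfall) and top(Snowfall:10) appear to do in the commands above, here is a minimal plain-Python sketch: repeated reports at one location are collapsed to the deepest reading, and only the largest few are kept for labeling. The rows, town names and function names are hypothetical (this is not the National Weather Service data and not Brunel's implementation).

```python
# Hypothetical reports: (city, lat, lon, snowfall in inches). Not the real data.
reports = [
    ("Town A", 39.0, -77.5, 22.0),
    ("Town A", 39.0, -77.5, 30.5),
    ("Town B", 38.9, -77.0, 17.8),
    ("Town C", 39.4, -78.0, 40.0),
    ("Town C", 39.4, -78.0, 36.0),
]


def max_depth_by_location(rows):
    """Keep the deepest reading seen at each (city, lat, lon)."""
    deepest = {}
    for city, lat, lon, depth in rows:
        key = (city, lat, lon)
        deepest[key] = max(depth, deepest.get(key, depth))
    return deepest


def top_n(deepest, n):
    """Return the n locations with the largest maximum depth, deepest first."""
    return sorted(deepest.items(), key=lambda item: item[1], reverse=True)[:n]


deepest = max_depth_by_location(reports)
labeled = top_n(deepest, 2)  # the map above labels the top 10; 2 keeps this short
```

Only the locations in labeled would get a text label; every location in deepest would still be drawn as a point.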

Apparently there was a bit of controversy regarding the exact snowfall measurement at Washington National Airport.  To look into this, I added a timeline graph that is linked to the map so I could see the snowfall amounts at different time points.  The data are binned into roughly 2-hour intervals, and the bins are colored by the number of measurements taken within each interval.  Clicking on a bin shows its measurements on the map, and I zoomed in to the airport.  I do not appear to have all the data showing the issue, but I can see measurements in nearby areas and who took them.  Perhaps the upshot is that the difference is significant for historical and business reasons, but it probably won’t make your back feel much better.

snowzilla_zoom_map

Brunel Code:

map('usa') at(0,0,100,75) + x(Lat) y(Lon) size(Snowfall:200%) max(Snowfall) color(Source) label(Snowfall,City) tooltip(Snowfall, City, Source) interaction(filter)  at(0,0,100,75) + map(labels)  at(0,0,100,75) |  x(Time) bin(Time:20) color(#selection) opacity(#count) interaction(select) tooltip(Time)  at(0,85,100,100)
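
The binning and selection described above can be sketched in plain Python. This is a simplification under stated assumptions: it uses fixed 2-hour bins from a chosen origin, whereas Brunel's bin(Time:20) picks its own bin boundaries, and the timestamps and depths are hypothetical values rather than the actual measurements.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_start(timestamp, origin, width):
    """Return the start of the fixed-width bin that contains timestamp."""
    steps = (timestamp - origin) // width  # timedelta // timedelta gives an int
    return origin + steps * width


# Hypothetical (time, depth in inches) measurements.
origin = datetime(2016, 1, 22, 12, 0)
width = timedelta(hours=2)
measurements = [
    (datetime(2016, 1, 22, 13, 15), 1.0),
    (datetime(2016, 1, 22, 13, 50), 1.5),
    (datetime(2016, 1, 22, 16, 5), 4.0),
]

# Number of measurements per bin: this drives the shading of the timeline.
counts = Counter(bin_start(t, origin, width) for t, _ in measurements)

# "Clicking" a bin keeps only the measurements that fall inside it.
clicked = datetime(2016, 1, 22, 12, 0)
selected = [d for t, d in measurements if bin_start(t, origin, width) == clicked]
```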

Since the storm lasted nearly 36 hours, it can also be interesting to look at the depths over time.  The variable-sized paths below show that the snow generally started to really pile up Friday night and also that Maryland and West Virginia seemed to reach their peak a little bit sooner than Virginia.  The boxes that are overlaid on the paths show the number of measurements that were taken at binned time intervals.  More measurements were taken in Virginia and Maryland, and most measurements seem to have been taken towards the end of the major accumulation.

snowzilla_time

Brunel Code:

path x(Time) y(State) color(State) bin(Time:20)  size(Snowfall) max(Snowfall) legends(none) + x(Time) y(State) color(#count) bin(Time:30) style('height:20px')

Lastly, if you are familiar with the area, you’ll quickly recognize the county names.  Below is a cloud with county names sized by the max snowfall depths and colored by their state.  Counties with larger snowfall amounts appear more towards the center.  Most names are nearly the same size because everyone got a lot of snow!

snowzilla_cloud

Brunel Code:

cloud color(State) size(Snowfall:150%) label(County) max(Snowfall) sort(Snowfall)  style('.element {font-family:Impact;}')

Brunel: Open Source Visualization Language

BRUNEL is a high-level language that describes visualizations in terms of composable actions. It drives a visualization engine (D3) that performs the actual rendering and interactivity. The language aims to be as simple as possible while still describing a wide variety of charts, and to let those charts be used in Java, JavaScript, Python and R systems that want to deliver web-based interactive visualizations.


At the end of the article is a list of resources, but first, some examples. The dataset I am using for these is taken from BoardGameGeek, which I processed to create a data set describing the top 2000 games listed as of Spring 2015. Each chart below is a fully interactive visualization running in its own frame. I’ve added the Brunel description for each chart as a caption, so you can go to the Builder anytime and copy the command into the edit box to try out new things.

data('sample:BGG Top 2000 Games.csv') bubble color(rating) size(voters) sort(rating) label(title) tooltip(title, #all) legends(none) style('* {font-size: 7pt}') top(rating:100)

This shows the top 100 games, with a tooltip view for details on the games. They are packed together in a layout where the location has no strong meaning; the goal is to show as much data in as small a space as possible!
In the Builder, you can change the number in top(rating:100) to show the top 1000, 2000 … or show the bottom 100. You could also add x(numplayers) to divide up the groups by recommended number of players.

data('sample:BGG Top 2000 Games.csv') line x(published) y(categories) color(categories) size(voters:200) opacity(#selection) sort(categories) top(published:1900) sum(voters) legends(none) | data('sample:BGG Top 2000 Games.csv') bar y(voters) stack polar color(playerage) label(playerage) sum(voters) legends(none) at(15, 60, 40, 90) interaction(select:mouseover)

This example shows some live interactive features; hover over the pie chart to update the main chart. The main chart shows the number of people voting for games in different categories over time, and the pie chart shows the recommended minimum age to enjoy a game. So when you hover over ‘6’, for example, you can see that there have been no good sci-fi games for younger players in the last 10 years. Use the mouse to pan and zoom the chart (drag to pan, double-click to zoom).

data('sample:BGG Top 2000 Games.csv') treemap x(designer, mechanics) color(rating) size(#count) label(published) tooltip(#all, title) mean(rating) min(published) list(title:50) legends(none)

Head to the Builder Site to modify this. You could try:

  • change the list of fields in x(…): reorder them or use fields like ‘numplayers’, ‘language’
  • remove the ‘legends(none)’ command to show a legend
  • change size to ‘voters’ — and add a ‘sum(voters)’ command to show the total number of voters rather than just counts for each treemap tile
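
The difference between sizing tiles by #count and by sum(voters), as suggested in the last item, can be sketched in plain Python. The rows and designer names are hypothetical (not taken from the BGG file), the function name is my own, and the sketch groups by designer only to keep it short.

```python
from collections import defaultdict

# Hypothetical rows: (designer, voters for one game).
games = [
    ("Designer X", 1200),
    ("Designer X", 300),
    ("Designer Y", 5000),
]


def tile_sizes(rows):
    """Return two size measures per designer: game count and total voters."""
    counts = defaultdict(int)
    voters = defaultdict(int)
    for designer, n_voters in rows:
        counts[designer] += 1          # what sizing by the row count uses
        voters[designer] += n_voters   # what sizing by summed voters would use
    return dict(counts), dict(voters)


by_count, by_voters = tile_sizes(games)
```

Here Designer X has the larger tile by count (two games against one), while Designer Y has the larger tile by total voters.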

Do you want to know more?

Follow the links below; the gallery and cookbook examples will take you to the Brunel Builder Site, where you can create your own visualizations and grab some JavaScript code to embed them in your web pages … which is exactly how I built the above examples!

Visualizing Tennis

I’m a member of the American Statistical Association’s “Statistics in Sport” section (http://www.amstat.org/sections/sis/) and I’m also British by birth, so Andy Murray’s success at Wimbledon this year was interesting to me for two reasons. I took a look at some of the data on Murray (collected by IBM’s SlamTracker initiative — http://2013.usopen.org/en_US/slamtracker/ ) with a view to doing a little visual analysis, so now I have another reason to be interested …

I found some data on his performance over a few years leading up to Wimbledon 2013 and wanted to look at trends. Now usually I prefer to create several linked visualizations and look at them together, but for this data I found that several of the stats I was interested in worked nicely when plotted in the same system. Here’s what I came up with:

Image

From the Vaults: Maps are Just Another Element

For the Grammar of Graphics language-based approach to visualization, and therefore in the RAVE visualization system, maps are simply another element that can be used within the grammatical formulation.

Although most people consider a map a very different entity from a bar chart, all that really differs between a bar chart and a map of areas like the one included here is that instead of representing a row of data by a bar, we use a polygon (or set of polygons) on a map. Otherwise their properties ought to be the same — we can apply color, patterns, labels, transparency. We can set a summary statistic when there are multiple values for each polygon to reflect min, max, mean, median, range, or any of the regular sets of items. We can flip, transpose and panel the charts. Essentially, from the grammatical point of view, if you can do it to a bar chart, you can do it to a map. The only limitation is that whereas the sizes of the bars can be set or determined by data, the map polygons cannot, so setting sizes on the map polygons has no effect.
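
To make the summary-statistic point concrete, here is a minimal Python sketch, not RAVE code: several data rows that share a region are reduced to one value, and that one value could drive the color of either a bar or a map polygon. The region names, values and the summarize helper are all hypothetical.

```python
from statistics import mean, median

# Reducers for the summaries named above; "range" is max minus min.
SUMMARIES = {
    "min": min,
    "max": max,
    "mean": mean,
    "median": median,
    "range": lambda values: max(values) - min(values),
}


def summarize(rows, statistic):
    """Collapse (region, value) rows to one summary value per region."""
    grouped = {}
    for region, value in rows:
        grouped.setdefault(region, []).append(value)
    reduce_values = SUMMARIES[statistic]
    return {region: reduce_values(values) for region, values in grouped.items()}


# Hypothetical rows: several values for one region, a single value for another.
rows = [("Ohio", 3.0), ("Ohio", 7.0), ("Ohio", 8.0), ("Iowa", 4.0)]

mean_by_region = summarize(rows, "mean")
range_by_region = summarize(rows, "range")
```

Nothing in the sketch depends on whether the element is a bar or a polygon, which is the grammatical point: the statistic applies to the data, and the element only decides how the result is drawn.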

US Choropleth

Orthogonality is also important, so we can ask for a point element instead of a polygon. The map above does this: we’ve added a second element to a RAVE US map that conveys different data and is also a good place to put labels.