I received my voter’s pamphlet in the mail yesterday, and it inspired me to create a data visualization on voting. If you start to type, “Voting is a…” on Google, the autocomplete offers “right,” “privilege,” “civic duty,” and “waste of time.” When I was 18, my perspective was the last of these. I was apathetic; what is the point of voting if elections are never won by a single vote? Then my dad talked some sense into me (turns out dads can be right about some things). Voting is each of the first three autocompletes. I still hold the opinion that my single vote will not affect the outcome of an election, but I also believe in voicing my views and leading by example. Voting gives me the opportunity to express my wishes for how we ought to be governed. If everyone held that attitude, then the officials we elect would actually reflect the desires of the citizens, and that’s the purpose of our democracy. So get out there and vote in the 2014 Midterm Election!

This map shows voter turnout by county for the 2012 Presidential Election. I’ve calculated the percentage of voters out of the voting-age population. Other analyses on this topic sometimes use the voting-eligible population (removing non-citizens, those in prison, etc.). This would shift some of the county percentages up a few points, but the map trends would not change.

I’ve used a diverging color scheme centered on the median to show which counties have higher than typical voter turnout, and which have lower. What’s interesting here is that the balance doesn’t follow population density or political affiliation. Some places are simply more politically active than others.
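For anyone who wants to reproduce the arithmetic, here is a minimal sketch in Python of the turnout percentage and the median-centered scale described above. The county names, counts, and function names are hypothetical and only illustrate the calculation; they are not the data behind the map.

```python
from statistics import median

def turnout_pct(votes_cast, voting_age_population):
    """Turnout as a percentage of the voting-age population."""
    return 100.0 * votes_cast / voting_age_population

def center_on_median(values):
    """Offset of each value from the median (negative = below typical)."""
    mid = median(values.values())
    return {name: v - mid for name, v in values.items()}

# Hypothetical (votes cast, voting-age population) pairs, for illustration only.
counties = {
    "County A": (45_000, 90_000),
    "County B": (30_000, 50_000),
    "County C": (14_000, 40_000),
}

turnout = {name: turnout_pct(v, p) for name, (v, p) in counties.items()}
offsets = center_on_median(turnout)

for name in counties:
    print(f"{name}: {turnout[name]:.1f}% turnout, {offsets[name]:+.1f} vs. median")
```

A diverging color ramp would then map negative offsets to one hue and positive offsets to the other, with the median county sitting at the neutral midpoint.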

Data sources:


http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml (5-yr ACS 2012 - Table S0101)

Here’s a time series of video game hardware and software sales.  I’ve normalized the data by population. The tables below the graphs show the best-selling items for that year. The software is binned by hardware platform. Lighter shades correspond to handheld devices. Orange is for Nintendo consoles, blue for PlayStation, and green for Microsoft.

Though the gaming industry is not as strong as it was in 2010, it’s still a lucrative industry. In the US and Europe, home consoles now play a dominant role in the market, while in Japan, there is a greater focus on handheld platforms.

Data sources: http://www.vgchartz.com/yearly/ (gaming data)

http://www.multpl.com/sitemap (population data)

My younger brother has made several attempts to buy a house in the Bay Area, but has been outbid every time. It’s an aggressive real estate market. So I was curious to see how it compares to other markets in the country. Redfin publishes 90-day averages for the regions where it operates, so I’ve graphed these data. As you can see, San Francisco, San Jose, and Oakland appear to be in a league of their own.

Note that Redfin does not have a presence in NYC, so those data are not provided. To be consistent, I decided not to calculate them using another source. But for comparison’s sake, San Francisco homes sold for a median of $845/sq ft this summer, while NYC homes sold for a median of $497/sq ft (according to Zillow). That said, the medians in some parts of NYC are >$1500. It’s quite neighborhood dependent. Regardless, these graphs reflect the difficulty of purchasing a home in the Bay. More offers, more bidding wars, more expensive homes per sq ft. If you’re trying to buy a house there, good luck!

Data source: https://www.redfin.com/

Seeing your favorite band live is made much easier by having the musical artist come to your hometown. But not all of us are so fortunate – for some, traveling to a larger city is the only chance to see a concert. I was curious to see how closely tour locations follow the distribution of the US population. In other words, if 12% of the US population is in California, do touring musicians play about 12% of their concerts in that state? Obviously, this will vary by artist, so I selected 11 who have toured multiple times in the US.

Using historical tour locations, I calculated, for each artist, the percentage of US concerts by state. I then subtracted from that the percentage of the US population that lives in each respective state. Using subtraction (rather than division) tends to highlight states with larger populations; this is intentional, as I was less interested in states with <1% of the population and only a handful of concerts played. None of the artists has played a public concert in all 50 states!

When this method yields a positive number, it suggests that the artist may overplay in that state. That is, people living in that state may have an easier time attending a concert by the artist. A negative number indicates that the artist underplays there. For each map, the states with the most extreme positive and negative values are marked with an “O” and a “U,” respectively. In doing the analysis, it was also clear that some artists are much more likely to tour globally, while others rarely play outside of the US. So I added the bar graph to represent those data.
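The over/under calculation described above can be sketched in a few lines of Python. The concert list and population figures below are made-up toy values, and the function name is hypothetical; this only illustrates the subtraction, not the actual Songkick data.

```python
from collections import Counter

def share_difference(concert_states, population_by_state):
    """Percent of an artist's concerts in each state minus the percent
    of the population living in that state."""
    total_concerts = len(concert_states)
    total_pop = sum(population_by_state.values())
    counts = Counter(concert_states)  # missing states count as 0
    return {
        state: 100.0 * counts[state] / total_concerts - 100.0 * pop / total_pop
        for state, pop in population_by_state.items()
    }

# Toy inputs (hypothetical): populations in arbitrary units, one entry per concert.
population = {"CA": 40, "TX": 30, "VA": 10, "WY": 20}
concerts = ["CA", "CA", "VA", "VA", "VA", "TX", "VA", "CA", "TX", "VA"]

diff = share_difference(concerts, population)
most_over = max(diff, key=diff.get)    # would be marked "O" on the map
most_under = min(diff, key=diff.get)   # would be marked "U" on the map
print(diff, most_over, most_under)
```

Because the result is a difference of percentages rather than a ratio, a state with a tiny population can never produce a large negative value, which is the intentional bias toward larger states mentioned above.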

Because I’ve used all available historical data from the source, some artists may have a bias towards their hometown or where they first started gaining popularity (e.g., DMB in VA). It’s also worth noting that musical styles may play a role, as Toby Keith and Kenny Chesney have strikingly similar maps.

Data source: http://www.songkick.com/

I had a debate with my family once about the most popular wine grape.  (I thought it would be Merlot…I was wrong.)  This weekend, I was in Napa and it reminded me that I’ve been meaning to make a graphic on the subject.  So here it is - the distribution of wine grapes in California.  The land acreage dedicated to growing wine grapes is not surprising.  Napa leads the way, with 9.60% of its land area used for growing wine grapes. San Joaquin takes second at 8.20%, and Sonoma, third, at 5.97%.  Note the scale is nonlinear because so many counties are below 1%.  Most counties grow more red than white wine grapes. However, Chardonnay claims the most overall acreage.

Data source: http://www.nass.usda.gov/Statistics_by_State/California/Publications/Grape_Acreage/

There has been considerable debate lately about reclining airplane seats. Some feel it’s their right to recline. Others see it as outrageously rude. And then there are those who couldn’t care less. As a short person, I’ve never had a problem with the person in front of me reclining, but I can understand that it would be frustrating if your legs were already wedged in the space. 

The leg room we’re given on a plane varies significantly by airline, seat class, and whether the flight is short haul (less than six hours) or long haul (greater than six hours). The data source provides the seat pitch (distance between your seatback and the one in front of you) and seat width (armrest to armrest) for various seat types (e.g., standard, recliner, flat bed) and plane types (e.g., Boeing 737, Airbus A320) for all reporting airlines. Currently, the source has data for 109 airlines. For instances where a range was reported, I used the average value.

I’ve graphed the data on six separate plots because there was considerable overlap, even after jittering the points. This allows you to see clustering by seat class. The average seat pitch and width for short-haul economy class is 31.7 inches and 17.5 inches, respectively.

Data source: http://www.seatguru.com/charts/generalcharts.php

In honor of the new football season, set to kick off tomorrow, I’ve graphed some NFL data. This is similar in nature to my previous NBA shooting percentage graphic, but I’ve skipped the polar coordinates this time around.

The graph shows the career yards gained for every NFL player (1932-2013) on the y-axis, separated by rushing and receiving.  For example, Emmitt Smith’s 18,355 rushing yards lead the pack in the orange dots, but his purple dot for 3,224 receiving yards is buried somewhere in the pile.

The x-axis takes the career rushing or receiving yards and divides by the number of carries or receptions, respectively. Note that I have limited the x-axis to 0 to 30 yards per carry or catch. There are, however, some data points outside this domain. In particular, there are many players who had only a few runs, most of which were for losses (or quarterbacks who earned negative yardage when sacked). These result in negative average yards per rush. Though I haven’t included these in the plot, the effect can be seen in the cumulative distribution functions, which are based on all data, not just points in the viewable domain.
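Here is a rough Python sketch of that distinction between what is plotted and what feeds the cumulative distribution. The career totals are hypothetical, and the function names are just for illustration.

```python
def yards_per_attempt(yards, attempts):
    """Career yards divided by carries (or receptions)."""
    return yards / attempts

def empirical_cdf(values):
    """Pair each sorted value with its rank divided by the number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

# Hypothetical (career yards, attempts) pairs, for illustration only.
careers = [(1200, 300), (-6, 3), (450, 10)]
averages = [yards_per_attempt(y, a) for y, a in careers]

# Only points inside the plotted 0-30 domain would be drawn on the scatter...
visible = [avg for avg in averages if 0 <= avg <= 30]

# ...but the CDF is built from every player, including negative averages.
cdf = empirical_cdf(averages)
print(visible)
print(cdf)
```

In this toy case only one of the three points lands in the viewable domain, yet all three contribute steps to the cumulative curve.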

Data source: http://www.pro-football-reference.com/

I really enjoyed this article in The Economist on the astonishing increase in textbook prices relative to the consumer price index, so I decided to investigate some other items.  Food and apparel both had interesting results, so here are the graphs!  These use CPI for all urban consumers, and the data are seasonally adjusted.

Data source: http://www.bls.gov/cpi/data.htm

I have previously defended Portland’s weather. The point I made was that, while it absolutely rains more often (more days per year) in Portland than most other large cities, we get less rain overall (in terms of depth of water). These maps visualize the latter point using average precipitation from 30-yr normals (1981-2010). Note that almost half of Portland’s precipitation occurs in November-January, but the combined precipitation in July and August is less than 4% of the annual total. To emphasize this seasonality, I’ve also mapped normals for July and December, when Portland is drier and wetter, respectively, than most of the country.

Data source: http://www.prism.oregonstate.edu/normals/

First things first – I’m not a Mormon. I loved the musical (The Book of Mormon) and Krakauer’s Under the Banner of Heaven. While I may not agree with or even understand the LDS beliefs, I’m fascinated by their propagation capabilities (both in terms of reproduction and proselytizing). The LDS Church is such a fast-growing denomination that it makes me wonder why other religions don’t try to mimic their approach. It’s just impressive.

This graphic reveals the geographic spread of the LDS Church as a function of time. Its growth has been focused particularly in the past 30 years. Of the 143 currently operating temples, 123 (86%) were dedicated in 1983 or later. It should be noted that this graphic only shows temples, not churches. There are many more churches than temples!

Data source: http://www.ldschurchtemples.com/chronological/

I was reading some of the comments about the very sad story of Michael Brown’s killing in Ferguson, MO. One reader claimed that most homicides are white killing white or black killing black, rather than interracial. While this is true, it doesn’t diminish the impact of a horrible event like this.

To make this graph, I’ve taken a five-year average of homicide rates from the FBI crime reports. The numbers are raw; they are not normalized by the demographic breakdown of the US. Please be aware that these are based on race, not ethnicity! For the purpose of these data, the FBI considers Hispanic/Latino to be an ethnicity.

Data source: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/ (Expanded Homicide Data Table 6 for years 2008-2012) 

I recently watched several of The International 4 matches. I’d never seen Dota 2 before and, while it was entertaining, some of the games seemed to drag on a bit longer than I expected. I’m a mediocre SC2 player, and my games are over in 15 minutes. I guess zergling rushes don’t exist in Dota 2. So I searched the forums for average game length, and was surprised by how many discussions there were with a lot of individual opinions but no consensus answer. Obviously, game length is different for average players than it is for pros, but it was easier to get game length results from premier tournaments, so that’s what I’ve used here. I added League of Legends to make the graph slightly more interesting.

For SC2, I used GSL tournaments, including Up & Down, Code A, and Code S matches.

For Dota2, I used The International (2, 3, and 4), but only Captains Mode games.

I don’t know anything about LoL, so I tried to choose tournaments that had the most prize money.  That included Season 3 World Championship, PANDORA.TV Champions Winter 2013-2014, HOT6iX Champions Summer 2013, and a couple slightly smaller ones that reported game lengths.

Data sources:




The USA imprisons a lot of people; it has more than 700 prisoners per 100k population.  As a country, its imprisonment rate is second only to Seychelles (which is a small African country comprising many islands with only 90k people). The USA has 21.7% of the global prisoner population, but only 4.4% of the total global population. So is the USA legal system too effective? Or are other countries’ systems not effective enough?   

This graphic puts the issue into perspective. The area of each country’s rectangle is proportional to the total number of prisoners it has. The value for the color is normalized using each country’s population (prisoners per 100k people).

Data source: http://www.prisonstudies.org/highest-to-lowest/prison-population-total

As we all know, many of the greatest distance runners come from Kenya and Ethiopia, and the world’s fastest sprinter (Usain Bolt) is Jamaican.  How do the best runners from other countries compare?

These maps show fastest race time by country relative to the world record for 100 m, 1500 m, 10 km, and a marathon. The data are based on men’s records, and do not include wind-assisted times. A sample calculation is provided to show the meaning of the scale.  All times were divided by the world record time, and then converted to percentage slower than world record time.
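As a minimal sketch of that scale in Python: the 9.58 s figure is the men’s 100 m world record, while the 10.00 s national best and the function name are hypothetical and only there to show the arithmetic.

```python
def percent_slower(time_s, world_record_s):
    """How much slower a time is than the world record, as a percentage."""
    return (time_s / world_record_s - 1.0) * 100.0

WORLD_RECORD_100M = 9.58   # seconds (men's 100 m world record)
national_best = 10.00      # seconds (hypothetical country, for illustration)

slower = percent_slower(national_best, WORLD_RECORD_100M)
print(f"{slower:.2f}% slower than the world record")
```

A country holding the world record itself would score exactly 0% on this scale, and every other country scores a positive value.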

Unfortunately, although the source has thousands of completion times for each race, only a handful of countries are represented for each race length; gray shading means there were no data. Despite this limitation, you can still identify which regions rise to the top, and which are a bit slower.

Data source: http://www.alltime-athletics.com/men.htm

EDIT: Thanks for the feedback regarding missing countries!  Because the country codes change and multiple standards are available, several were not joined correctly in the GIS.  I’ve fixed as many as I can, so the new map has more countries displayed.  Thanks again!

Religious buildings (churches, mosques, synagogues, temples, and other places of worship) often have an intentional orientation, largely to assist with fixing the direction people face when praying. The altar in Christian churches is often pointed toward the liturgical east. Islamic mosques are traditionally oriented toward the Qibla (direction of Mecca).

For these calculations, I selected five countries that are dominated by five different religions (Thailand – Buddhism; Italy – Catholicism; Israel – Judaism; Pakistan – Islam; India – Hinduism). The shapefile containing the Israel buildings was merged with Palestine, which is predominantly Islamic. Though these could be separated, the exact border between the two countries is a bit tenuous, so I opted to leave it as a single region.

The method for the calculation is shown on the graphic. For each building footprint, a bounding rectangle is defined. This rectangle is oriented to minimize its width. The orientation of the building is then measured as the azimuth of the rectangle’s height (longer sides). Orientation is counted in both directions, so a building facing due east is also considered to face west. The plots show the frequency of a given orientation in 5° bins.
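The steps above can be approximated in plain Python. This is a sketch under stated assumptions, not the GIS workflow actually used: it assumes footprints are given as planar (x = east, y = north) vertex lists with at least three non-collinear points, and it relies on the geometric fact that a minimum-width bounding rectangle has one side flush with an edge of the convex hull. All function names are hypothetical.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def orientation_azimuth(footprint):
    """Azimuth in degrees clockwise from north, folded into [0, 180),
    of the long side of the minimum-width bounding rectangle."""
    hull = convex_hull(footprint)
    best_width, best_azimuth = math.inf, None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        length = math.hypot(x2 - x1, y2 - y1)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length
        # Signed distance of each hull vertex from the line through this edge.
        offsets = [(px - x1) * -uy + (py - y1) * ux for px, py in hull]
        width = max(offsets) - min(offsets)
        if width < best_width:
            best_width = width
            # Folding by 180 treats east-facing and west-facing as the same.
            best_azimuth = math.degrees(math.atan2(ux, uy)) % 180.0
    return best_azimuth

def bin_start(azimuth, bin_size=5):
    """Lower edge of the bin (5 degrees by default) containing the azimuth."""
    return int(azimuth // bin_size) * bin_size

# Toy footprints (hypothetical): one elongated east-west, one north-south.
east_west = [(0, 0), (10, 0), (10, 4), (0, 4)]
north_south = [(0, 0), (4, 0), (4, 10), (0, 10)]
print(orientation_azimuth(east_west), bin_start(orientation_azimuth(east_west)))
print(orientation_azimuth(north_south), bin_start(orientation_azimuth(north_south)))
```

Real footprints from OpenStreetMap extracts are in latitude and longitude, so they would need projecting to a local planar coordinate system before an azimuth computed this way is meaningful.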

As you can see, most religious buildings in these countries are aligned east-west. Pakistan is slightly north of east from Mecca, which may explain why many of the religious buildings there are oriented WSW-ENE.

Data source: http://download.geofabrik.de/