How to Make a Heatmap – a Quick and Easy Solution
Jan 21st, 2010 by analyticjournalism

Thanks to Nathan at Flowing Data:

How to Make a Heatmap – a Quick and Easy Solution

How do you make a heatmap? This came from kerimcan in the FlowingData forums, and krees followed up with a couple of good links on how to do them in R. It really is super easy. Here's how to make a heatmap with just a few lines of code, but first, a short description of what a heatmap is.

The Heatmap

In case you don't know what a heatmap is, it's basically a table that has colors in place of numbers. Colors correspond to the level of the measurement. Each column can be a different metric like above, or it can be all the same like this one. It's useful for finding highs and lows and sometimes, patterns.

On to the tutorial.

Step 0. Download R

We're going to use R for this. It's a statistical computing language and environment, and it's free. Get it for Windows, Mac, or Linux. It's a simple one-click install for Windows and Mac. I've never tried Linux.

Did you download and install R? Okay, let's move on.

Step 1. Load the data

As with all visualization, you should start with the data. No data? No visualization for you.

For this tutorial, we'll use NBA basketball statistics from last season that I downloaded from databaseBasketball. I've made it available here as a CSV file. You don't have to download it though. R can do it for you.

I'm assuming you started R already. You should see a blank window.

Now we'll load the data using read.csv().

nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv", sep=",")

We've read a CSV file from a URL and specified the field separator as a comma. The data is stored in nba.

Type nba in the window, and you can see the data.
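If you'd rather not print the whole table, a few of R's built-in inspection commands (not part of the original tutorial, just standard helpers) give a quicker look:

head(nba)  # first six rows
str(nba)   # column names and types
dim(nba)   # number of rows and columns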

Step 2. Sort data

The data is sorted by points per game, greatest to least. Let's make it the other way around so that it's least to greatest.

nba <- nba[order(nba$PTS),]

We could just as easily have chosen to order by assists, blocks, etc.
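For example, sorting on assists instead would be a one-line change (this assumes the CSV header names that column AST):

nba <- nba[order(nba$AST),]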

Step 3. Prepare data

As is, the column names match the CSV file's header. That's what we want.

But we also want to name the rows by player name instead of row number, so type this in the window:

row.names(nba) <- nba$Name

Now the rows are named by player, and we don't need the first column anymore so we'll get rid of it:

nba <- nba[,2:20]

Step 4. Prepare data, again

Are you noticing something here? It's important to note that a lot of visualization involves gathering and preparing data. Rarely do you get data exactly how you need it, so you should expect to do some data munging before the visuals. Anyway, moving on.

The data was loaded into a data frame, but it has to be a data matrix to make your heatmap. The difference between a frame and a matrix is not important for this tutorial. You just need to know how to change it.

nba_matrix <- data.matrix(nba)
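If you want to confirm the conversion worked, class() reports each object's type; this is just a sanity check, not a step from the original tutorial:

class(nba)         # "data.frame"
class(nba_matrix)  # "matrix" (newer versions of R also report "array")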

Step 5. Make a heatmap

It's time for the finale. In just one line of code, build the heatmap:

nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = cm.colors(256), scale="column", margins=c(5,10))

You should get a heatmap that looks something like this:

Step 6. Color selection

Maybe you want a different color scheme. Just change the argument to col, which is cm.colors(256) in the line of code we just executed. Type ?cm.colors for help on what colors R offers. For example, you could use more heat-looking colors:

nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = heat.colors(256), scale="column", margins=c(5,10))

For the heatmap at the beginning of this post, I used the RColorBrewer library. Really, you can choose any color scheme you want. The col argument accepts any vector of hexadecimal-coded colors.
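As a rough sketch of that approach, assuming the RColorBrewer package is installed, you could hand heatmap() one of Brewer's palettes:

library(RColorBrewer)  # install.packages("RColorBrewer") if you don't have it
nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, "Blues"), scale="column", margins=c(5,10))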

Step 7. Clean it up – optional

If you're using the heatmap to simply see what your data looks like, you can probably stop. But if it's for a report or presentation, you'll probably want to clean it up. You can fuss around with the options in R or you can save the graphic as a PDF and then import it into your favorite illustration software.
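One way to do the PDF export from within R is the built-in pdf() device; the file name and page size below are placeholders, not anything the tutorial prescribes:

pdf("nba_heatmap.pdf", width=8.5, height=11)  # open a PDF graphics device
heatmap(nba_matrix, Rowv=NA, Colv=NA, col = cm.colors(256), scale="column", margins=c(5,10))
dev.off()  # close the device and write the file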

I personally use Adobe Illustrator, but you might prefer Inkscape, the open source (free) solution. Illustrator is kind of expensive, but you can probably find an old version on the cheap. I still use CS2. Adobe's up to CS4 already.

For the final basketball graphic, I used a blue color scheme from RColorBrewer and then lightened the blue shades, added a white border, changed the font, and organized the labels in Illustrator. Voila.

Rinse and repeat to use with your own data. Have fun heatmapping.

 

GPS, mapping and Economic Development in your town
Dec 17th, 2008 by analyticjournalism

 Colleague Owen Densmore points us to this page with these comments:

This use of GPS may play a role in understanding economic development in any city by watching the flows within the city:

http://digitalurban.blogspot.com/2008/12/gps-city-tracks-1-year-in-24-hours-via.html

This gets me to an aspect of ED I'm interested in: MicroED.  It comes from the observation that every city's ED is unique.  Think about every city you've lived in and you'll notice that each was unique.  For me, Rochester NY: Kodak/Xerox company towns; Silicon Valley: a network of startups and established companies with a highly mobile social/skill network.  Here in Santa Fe, we are similarly unique.

I think this is core: discover your unique environment and capitalize on improving it through managing it.  Data farming your city.  Graph its flows.

   — Owen

2008-12-17

GPS City Tracks: 1 Year in 24 Hours via Google Earth



GPS tracks can show the 'life' of a city: which parts of the city are working, areas that are no-go zones, and sections dedicated to shopping, work, entertainment, etc. The possibilities for using GPS data to examine our cities' 'health' are intriguing, which turns us to the work of Fabian over at http://urbantick.blogspot.com/

The movie below illustrates Fabian's paths around the city of Plymouth over 365 days, compressed and visualised in Google Earth:




plymouth365_24H_duration from urbanTick on Vimeo.

Google Earth is an excellent tool for displaying GPS data, especially over time. We are just starting to look into other options, perhaps After Effects. Any thoughts or ideas for visualising GPS tracks over time would be great…
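For what it's worth, one minimal way to get a track into Google Earth is to write the points out as a KML LineString. The sketch below is ours, not urbanTick's method, and the coordinates are made-up placeholders standing in for a real GPS logger export:

# Made-up sample track: longitude/latitude pairs (placeholder data)
lon <- c(-4.143, -4.140, -4.137)
lat <- c(50.376, 50.378, 50.381)
coords <- paste(lon, lat, "0", sep=",", collapse=" ")  # KML wants lon,lat,alt triples
kml <- paste0(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Placemark>',
  '<name>Sample GPS track</name><LineString><coordinates>',
  coords,
  '</coordinates></LineString></Placemark></Document></kml>')
writeLines(kml, "track.kml")  # open this file in Google Earth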

See http://urbantick.blogspot.com/ for more movies and examples on visualising GPS tracks in the city.


 

Flickr's Burning Man Map Uses Open Street Map
Aug 28th, 2008 by Tom Johnson

Brady Forrest, at O'Reilly's Radar, tips us to an interesting mash-up of Flickr, Open Street Map and the  Burning Man festival.  Why not use this idea for local festivals — fairs, classic car rallies, an introduction to a new shopping center?

Flickr's Burning Man Map Uses Open Street Map

Posted: 26 Aug 2008 07:38 PM CDT

[Image: Flickr's Open Street Map-based map of Black Rock City]

Flickr is best known for its photo-sharing, but increasingly its most innovative work is coming from its geo-developers (Radar post). Yesterday they announced the addition of a street-level map of Black Rock City so that we can view geotagged Burning Man photos. Flickr got the mapping data via Open Street Map's collaboration with Burning Man.

[Image: Yahoo! Maps view of the Black Rock City area]


Flickr uses Yahoo! Maps for most of their mapping (and fine maps they are). The underlying data for them is primarily provided by NAVTEQ. NAVTEQ's process can take months to update their customers' mapping data servers. For a city like Burning Man, which only exists for a week every year, that process won't work. However, an open data project like Open Street Map can map that type of city. To the right you can see what Yahoo's map currently looks like.


This isn't the first time Flickr has used OSM's data. They also used it to supplement their maps in time for the Beijing Olympics. I wonder if Yahoo! Maps will consider using OSM data so that their sister site doesn't continue to outshine them (view Beijing on Yahoo Maps vs. Flickr's Map to see what I mean). OSM's data is Creative Commons Attribution-ShareAlike 2.0.


In other geo-Flickr news, they have added KML and GeoRSS to their API. This means that you can subscribe to Flickr API calls in your feed reader or Google Earth. (Thanks for the tip on this, Niall.)


If you want to get more insight into Flickr's geo-thinking, watch their talk from the Where 2.0 2008 conference after the jump.



Putting Open Source tools to work for community reporting
Jun 13th, 2008 by Tom Johnson

The phrases “community journalism” and “convergence journalism” have been around for decades (in the case of the former) and at least 10 years in the case of the latter.  For a long time, “community journalism” referred to the publishing of “…a small daily, 20,000 or less, or maybe a larger weekly or twice- or thrice-weekly.” And “convergence” most often talked about using various print and Audio/Visual media to deliver the same old reportorial product of traditional newspapers and broadcast.

Finally, some are starting to see that the real and much-needed “convergence” has to be implemented on the front-end of the reportorial process.  Paul Niwa, at Emerson College, has done just that with some graduate students who created bostonchinatown.org.  And we are grateful to Niwa for writing a “how and why we did it” piece for the current issue of the Convergence Newsletter.

Here's Niwa's lede, but do check out the entire piece:

“Community Embraces a Converged Journalism-Sourcing Project

By Paul Niwa, Emerson College

Boston’s Chinatown is one of the largest and oldest Asian American neighborhoods in the country. Yet, this community of 40,000 does not even have a weekly newspaper. Coverage of the neighborhood in the city’s metropolitan dailies is also weak. In 2006, The Boston Globe and the Boston Herald mentioned Chinatown in 78 articles. Only 16 percent of the sources quoted in those articles were Asian American, indicating that newspapers relied on information from non-residents to cover the neighborhood. With all this in mind, I created the bostonchinatown.org project as an experiment to build a common sourcebook for newsrooms.” 


 

More on the SoCal fire coverage
Oct 25th, 2007 by Tom Johnson

This comes from the Poynter blog…

Posted by Amy Gahran 5:42:13 PM
CA Wildfire Coverage: Intriguing Online Approaches

KPBS San Diego is offering fire news updates via Twitter — possibly the best use of this service I've ever seen.

While much of Southern California burns, online news staffs and citizen journalists definitely aren't fiddling around. Here's a quick roundup of some of the more intriguing efforts:

What kinds of innovative online coverage of the fires are you seeing today? Please comment below.

(Thanks to the members of Poynter's Online News discussion group for tips to some of the items above.)


 

Impact of feedback in mass media messages
Jun 30th, 2007 by JTJ

A recent article worth a look from the journalism community. What we do DOES have impact.

Juan Carlos González-Avella, Mario G. Cosenza, Konstantin Klemm, Víctor M. Eguíluz and Maxi San Miguel (2007)


Information Feedback and Mass Media Effects in Cultural Dynamics

Journal of Artificial Societies and Social Simulation, vol. 10, no. 3, article 9

PDF at http://jasss.soc.surrey.ac.uk/10/3/9.html
Received: 11-Jan-2007 Accepted: 18-May-2007 Published: 30-Jun-2007

________________________________
Abstract
We study the effects of different forms of information feedback associated with mass media on an agent-based model of the dynamics of cultural dissemination. In addition to some processes previously considered, we also examine a model of local mass media influence in cultural dynamics. Two mechanisms of information feedback are investigated: (i) direct mass media influence, where local or global mass media act as an additional element in the network of interactions of each agent, and (ii) indirect mass media influence, where global media acts as a filter of the influence of the existing network of interactions of each agent. Our results generalize previous findings showing that cultural diversity builds up by increasing the strength of the mass media influence. We find that this occurs independently of the mechanisms of action (direct or indirect) of the mass media message. However, through an analysis of the full range of parameters measuring cultural diversity, we establish that the enhancement of cultural diversity produced by interaction with mass media only occurs for strong enough mass media messages. In comparison with previous studies, a main different result is that weak mass media messages, in combination with agent-agent interaction, are efficient in producing cultural homogeneity. Moreover, the homogenizing effect of weak mass media messages is more efficient for direct local mass media messages than for global mass media messages or indirect global mass media influences.

Keywords: Agent Based Model, Culture, Dissemination, Mass Media

Some imaginative election "gaming" from USC and the Annenberg Center
Jun 19th, 2007 by JTJ

From All Points Blog

Monday, June 18. 2007


The Redistricting Game

University of Southern California students developed the online game for the Annenberg Center for Communication to teach about the challenges (and partisanship) of redistricting. Along the way, players learn that to keep their candidates elected they may need to examine ethical issues. The game is Flash-based.

From the [original News 10] site: The Redistricting Game is designed to educate, engage, and empower citizens around the issue of political redistricting. Currently, the political system in most states allows the state legislators themselves to draw the lines. This system is subject to a wide range of abuses and manipulations that encourage incumbents to draw districts which protect their seats rather than risk an open contest.


 

A semi- "by the numbers" tutorial on data visualization
Feb 14th, 2007 by JTJ

Juan C. Dürsteler, in Barcelona, Spain, edits a fine online magazine devoted to information graphics.  The current issue describes “…the diagram for the process of Information Visualisation as seen by Yuri Engelhardt and the author after a series of discussions about its nature and the process that leads from Data to Understanding.”

And it is available in English and Spanish.  Check out
http://www.infovis.net/printMag.php?num=187&lang=2



Something less than half a measure
Oct 17th, 2006 by JTJ

A brief comment was passed along on the NICAR-L (National Institute for Computer-Assisted Reporting) listserv this morning by Daniel Lathrop, of the Seattle Post-Intelligencer.  Said he:

Really interesting story on lobbyists-related-to-lawmakers in USA Today. I think those of us who cover money-in-politics should all have a little story envy on this one.



http://www.usatoday.com/news/washington/2006-10-16-lobbyist-family-cover_x.htm


Daniel Lathrop
Seattle P-I


Well, yeah.  An interesting story, but also one demonstrating why newspapers as institutions simply do not grasp the shift in power inherent in the Digital Age, a shift away from institutions and to citizens. 

First, the story reports: “The family connections between lobbying and lawmaking are prompting complaints that Congress is not doing enough to police itself.”  Fair enough, but can't you SHOW us, in the online version, the evidence to support this sweeping generalization of “prompting complaints”?  Why should we take your word for it, guys, when the evidence must be at hand?

Second, “…USA TODAY reviewed thousands of pages of financial disclosures and lobbyist registrations, property records, marriage announcements and other public documents to identify which lawmakers and staffers had relatives in the lobbying business.”  WOW!  Would I like to see those pages, and even drill down into them to see if there's anything there related to my representative.  But nooooooooo.  The paper must have had some way to manage all this public-record data, some way to cross-reference it, to search it, to retrieve documents and content.  Why not put all that up on the web and let readers peruse their own subjects of interest?

Ironically, an example of the power shift mentioned above turns up, buried in a sidebar to the story, “Little Accountability in Earmarks.”  There we find reference to something called the Sunlight Foundation.  I had not heard of the Sunlight Foundation, but, hey, it's only been around since the first of the year.  It turns out this organization is doing just what newspapers should be doing: leveraging the power of the digital environment to connect people to the data and tools needed to analyze that data so they can make informed decisions.

Another opportunity missed by the industry, and tragically so.




Using GIS to increase tax revenues
Aug 21st, 2006 by JTJ

An interesting piece in the NYTimes on Sunday, “Finding Tax Revenue Through Aerial Imaging,” highlights yet another industry and example of how public administrators are using GIS, in this case to increase the revenue stream.  We think that if journalists are not hip to these tools, then they cannot ask the right questions of the public's administrators.

…Until recently, assessors had to accept homeowners’ claims or visit the properties themselves. But in 2003, the city hired the Pictometry International Corporation, a company in Rochester, N.Y., to provide images of every building in the city.

Once a year, Pictometry flies a Cessna 172 over Philadelphia, taking thousands of black-and-white photographs. The low-altitude shots, unlike satellite images, show buildings at about a 40-degree angle. Pictometry’s computers organize the photos so they can be searched by address. Nearly 200 employees in Mr. Mescolotto’s office have the software on their computers.

Pictometry isn’t the only company offering aerial photos to assessors, but it has won adherents in more than 200 cities and counties, according to Dante Pennacchia, Pictometry’s chief marketing officer. Its competitors include an Israeli company, Ofek International, working with Aerial Cartographics of America, based in Orlando, Fla….”
http://www.nytimes.com/2006/08/20/realestate/20nati.html


