The Evolution of Darwin's Ideas
Sep 7th, 2009 by analyticjournalism

FlowingData passes along the link to this fine piece of work by Ben Fry.  “Ben Fry Visualizes the Evolution of Darwin’s Ideas” Journos could be using a similar approach to analyze the evolution of the ideas of public officials.

Ben Fry Visualizes the Evolution of Darwin’s Ideas

Posted by Nathan / Sep 7, 2009 to Artistic Visualization / 2 comments

“Ben Fry, well-known for Processing and plenty of other data goodness, announced his most recent piece, On the Origin of Species: The Preservation of Favoured Traces, made possible by The Complete Work of Charles Darwin Online.

The visualization explores the evolution of Charles Darwin's theory of, uh, evolution. It began as a less-defined 150,000-word text in the first edition and grew and developed to a 190,000-word theory in the sixth edition.

Watch where the updates in the text occur over time. Chunks are removed, chunks are added, and words are changed. Blocks are color-coded by edition. Roll over blocks to see the text underneath.

As usual, excellent work, Mr. Fry.”
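For journos who want to try the same idea on a public official's successive speeches or policy drafts, here is a minimal sketch of the underlying comparison: a word-level diff between two versions of a text, using Python's standard difflib module. This is not Ben Fry's method, just one simple way to find what was added, removed, or changed. The function name and the two sample sentences are invented for illustration and are not quotations from any edition.

```python
import difflib


def word_changes(old_text, new_text):
    """Return (operation, old_words, new_words) for each changed span."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        # Skip unchanged runs; keep only inserts, deletes and replacements.
        if op != "equal":
            changes.append((op, old_words[i1:i2], new_words[j1:j2]))
    return changes


# Invented placeholder sentences, NOT text from Darwin.
first_edition = "species change slowly over many generations"
later_edition = "species change very slowly over countless generations"

for op, old, new in word_changes(first_edition, later_edition):
    print(op, old, new)
```

Run against each pair of consecutive versions, the list of changes gives you the raw material for color-coding by edition, which is the part that takes real design work.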



Mary Ellen Bates on "Google Squared"
Aug 25th, 2009 by analyticjournalism

Mary Ellen Bates offers up this good tip on “Google Squared” at

Bates Information Services

August 2009

Google Squared

Google Labs — the public playground where Google lets users try out new products or services that aren't yet ready for prime time — is my secret weapon for learning about cool new stuff. My favorite new discovery in Google Labs is Google Squared. It's a demonstration of a search engine trying to provide answers instead of just sites, and at a higher level than the simple “smart answers” you see when you search for “time in Rome” or “area code 909”. Rather, Google analyzes the retrieved pages, identifies common elements, and creates a table with the information it has compiled.

This is a fascinating tool that helps you compile facts into tables that Google builds on the fly. Hard to describe, easier to show. Go to Google Squared and type in a query that will retrieve a number of similar things: organic farms in Colorado, for example, or women CEOs… even superhero powers.

Google Squared generates a table of facts extracted from its index, with the items you are searching for as the left-most column, along with columns for whatever related characteristics are relevant for the topic. For organic farms in Colorado, for example, the table in the search results has columns for the name of the company, an image from the farm's web site, a snippet of description about the farm, and columns for telephone number, location and “season.” Note that some of these columns may have few entries in them, depending on what information Google analyzed. For women CEOs, the table includes the CEO's name, a photo, a snippet that indicates what her position is, her date of birth, and her nationality. For superhero powers, you will find the superhero's name, a photo, a far-too-brief description of said superhero, the hero's first appearance (in print, that is), publisher and even the hero's “abilities”.

Interestingly, you can insert your own items in a Google Squared table, and either let Google populate the rest of the row or type in whatever content you want in that row. I added Catwoman to my superheroes table and Google filled in the new row with her photo and description; I could provide the rest of the info. For some tables, Google even suggests additional columns. For my superheroes table, I could add columns for Aliases, Alter Ego, Profession (the Joker is a lawyer, of course), and so on. You can add your own columns, as well.

You can also delete a row or column that isn't relevant to your search. If you log in to your Google account, you can save your customized tables for later use. And you can export the table into Excel (the images are exported as URLs).

Google Squared is never going to compete with a real human's analysis of a collection of facts, but it can be a great way to start brainstorming, as a quick way to organize the results of your search, and as a starting point for a nicely-presented deliverable for your client.

“May I publish or reproduce this InfoTip?” Be my guest! Just make sure you credit the source, Bates Information Services, and include the URL,

Designing for Big Data
Apr 29th, 2009 by analyticjournalism

Much of this is well-known by those of us who have worked with dataviz for the past decade or two, but Jeffrey Veen's conclusions are solid and worth reviewing.

Key quote from Jeffrey Veen: “We need to create tools to help people manipulate THEIR data.”

 Good examples of how to use large data sets to find and tell stories and, if desired, to answer YOUR questions about the data.

Video: Designing for Big Data

This is a 20-minute talk I gave at the Web2.0 Expo in San Francisco a couple weeks ago. In it, I describe two trends: how we're shifting as a culture from consumers to participants, and how technology has enabled massive amounts of data to be recorded, stored, and analyzed. Putting those things together has resulted in some fascinating innovations that echo data visualization work that's been happening for centuries.

I've given this talk a few times now, but this particular delivery really went well. Only having 20 minutes forced me to really stay focused, and the large audience was very engaged. I'll be giving an extended version of this talk in June at the UX London conference, with a deeper look at how we integrated design and research while I was at Google.


How the right kind of data visualization could lead to new research questions or insights.
Dec 30th, 2008 by analyticjournalism

Nathan, over at FlowingData, posts this interesting data visualization from the Baylor College of Medicine. No, it probably doesn't give a science writer a story in itself, but the concept of taking a complex data set and illustrating that data with the right tool (in this case, Circos) could generate some interesting reporting vectors. For example, could Circos show us something about traffic patterns? Ambulance or fire department response times? We're not sure, but we hope someone could probe this a bit.

Researchers Map Chaos Inside Cancer Cell

Posted by Nathan / Dec 29, 2008 to Network Visualization / 2 comments

The thing about cancer cells is that they suck. Their DNA is all screwy. They've got chunks of DNA ripped out and reinserted into different places, which is just plain bad news for the cells in our body that play nice. You know, kind of like life. Researchers at the Baylor College of Medicine in Houston have compared the DNA of a certain type of breast cancer cell to a normal cell and mapped the differences (and similarities) with the above visualization.

The graphic summarizes their results. Round the outer ring are shown the 23 chromosomes of the human genome. The lines in blue, in the third ring, show internal rearrangements, in which a stretch of DNA has been moved from one site to another within the same chromosome. The red lines, in the bull's eye, designate switches of DNA from one chromosome to another.

Some design would benefit the graphic so that your eyes don't bounce around when you look at the technicolor genome but it's interesting nevertheless.

Check out the Flare Visualization Toolkit or Circos if you're interested in implementing a similar visualization with the above network technique.


GPS, mapping and Economic Development in your town
Dec 17th, 2008 by analyticjournalism

 Colleague Owen Densmore points us to this page with these comments:

This use of gps may play a role in understanding economic development in any city by watching the flows within the city:

This gets me to an aspect of ED I'm interested in: MicroED. It comes from the observation that all cities' ED is unique. Think about every city you've lived in and you'll notice that each was unique. For me, Rochester NY: Kodak/Xerox company towns; Silicon Valley: A network of startups and established companies with a highly mobile social/skill network. Here in Santa Fe, we are similarly unique.

I think this is core: discover your unique environment and capitalize on improving it through managing it.  Data farming your city.  Graph its flows.

   — Owen


GPS City Tracks: 1 Year in 24 Hours via Google Earth

GPS tracks can show the 'life' of a city, which parts of the city are working, areas that are no go zones and sections dedicated to shopping, work, entertainment etc. The possibilities for using GPS data to examine our cities' 'health' are intriguing, which turns us to the work by Fabian over at

The movie below illustrates Fabian's paths around the city of Plymouth over 365 days, compressed and visualised in Google Earth:

plymouth365_24H_duration from urbanTick on Vimeo.

Google Earth is an excellent tool for displaying GPS data, especially over time. We are just starting to look into other options, perhaps After Effects. Any thoughts or ideas for visualising GPS tracks over time would be great…

See for more movies and examples on visualising GPS tracks in the city.
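If you have your own GPS log and want to try the Google Earth approach, here is a minimal sketch of one way to do it: write each GPS fix as a KML Placemark with a TimeStamp, which Google Earth can play back with its time slider. This is not how Fabian built his movie; the function name and the sample coordinates and times are made up for illustration.

```python
from xml.sax.saxutils import escape


def track_to_kml(points, name="GPS track"):
    """Build a KML document with one time-stamped Placemark per GPS fix.

    points: iterable of (iso_timestamp, latitude, longitude) tuples.
    """
    placemarks = []
    for when, lat, lon in points:
        # Note: KML lists coordinates as longitude,latitude (in that order).
        placemarks.append(
            "    <Placemark>\n"
            f"      <TimeStamp><when>{escape(when)}</when></TimeStamp>\n"
            f"      <Point><coordinates>{lon},{lat}</coordinates></Point>\n"
            "    </Placemark>"
        )
    body = "\n".join(placemarks)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        f"    <name>{escape(name)}</name>\n"
        f"{body}\n"
        "  </Document>\n"
        "</kml>\n"
    )


# Hypothetical fixes; the coordinates and times are invented sample data.
sample_points = [
    ("2008-12-01T08:00:00Z", 50.37, -4.14),
    ("2008-12-01T08:05:00Z", 50.38, -4.13),
]

kml = track_to_kml(sample_points, name="Sample city walk")
print(kml)
```

Save the printed text to a file with a .kml extension and open it in Google Earth to scrub through the points in time order.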


Librarians and "IT Professionals" – Getting to the root of it all
Nov 14th, 2008 by analyticjournalism

Amy Disch, library director of The Columbus (Ohio) Dispatch, sends along these links via the News Librarians' listserv. This is a gentle reminder about how the foundations of good publications today rest, first, on the integration of library AND IT skills.

Watch them in the order listed:




Three Tuesdays workshop on data and the political campaigns at the Santa Fe Complex
Sep 27th, 2008 by Tom Johnson

Handicapping the Horserace

Published by Don Begley at 10:09 pm under Complex News, event

  • September 30, 2008 – 6:30-8 pm
  • October 7, 2008 – 6:30-8 pm
  • October 14, 2008 – 6:30-8 pm

It’s human nature: Elections and disinformation go hand-in-hand. We idealize the competition of ideas and the process of debate while we listen to the whisper campaigns telling us of the skeletons in the other candidate’s closet. Or, we can learn from serious journalism to tap into the growing number of digital tools at hand and see what is really going on in this fall’s campaigns. Join journalist Tom Johnson for a three-part workshop at Santa Fe Complex to learn how you can be your own investigative reporter and get ready for that special Tuesday in November.

Over the course of three Tuesdays, beginning September 30, Johnson will show workshop participants how to do the online research needed to understand what’s happening in the fall political campaign. There will be homework assignments and participants will contribute to the Three Tuesdays wiki so their discoveries will be available to the general public.

Everyone is welcome but space will be limited. A suggested donation of $45 covers all three events or $20 will help produce each session.

  • The Daily Tip Sheet (September 30, 6:30 pm)

    Newspapers are a ‘morning line’ tip sheet. There isn’t enough room for what you need to know.

    Newspapers can be a good jumping-off point for political knowledge, but they rarely have enough staff, staff time and space to really drill down into a topic. Ergo, it is increasingly up to citizens to do the research to preserve democracy and help inform voters. Tonight we will be introduced to some of the city, state and national web sites to help in our reporting and to a few digital tools to help you save and retrieve what you find.
  • Swimming Against the Flow (October 7, 6:30 pm)

    How to track data to their upstream sources.

    A web page and its data are not static events. (Well, usually they are not.) Web pages and digital data all carry “signs” of where they came from, who owns the site(s) and sometimes who links to the sites. We will discuss how investigators can use these attributes to our advantage, and also take a step back to consider the “architecture of sophisticated web searching.”
  • The Payoff (October 14, 6:30 pm)

    Yup, it IS about following the money. But then what?

    Every election season, new web sites come along that make it easier to follow the money — election money. This final workshop looks at some of those sites and focuses on how to get their data into a spreadsheet. Then what? A short intro to slicing-and-dicing the numbers. (Even if you are a spreadsheet maven, please come and act as a coach.)
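For those who would rather script the "then what?" step than click through a spreadsheet, here is a minimal sketch of slicing-and-dicing contribution data in Python. It assumes the data has already been exported as CSV; the column names (candidate, donor_city, amount) and every row below are hypothetical sample data, not figures from any real campaign site.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; a real file would be read with open("file.csv").
SAMPLE_CSV = """candidate,donor_city,amount
Candidate A,Santa Fe,250
Candidate B,Albuquerque,100
Candidate A,Taos,75
Candidate B,Santa Fe,500
"""


def totals_by(rows, key):
    """Sum the 'amount' column for each distinct value of the given column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row["amount"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_candidate = totals_by(rows, "candidate")
by_city = totals_by(rows, "donor_city")

# List candidates from largest to smallest total.
for name, total in sorted(by_candidate.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ${total:,.2f}")
```

The same totals_by call works for any column, so one export can be grouped by candidate, by city, or by whatever other field the site provides.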

This workshop is NOT a sit-and-take-it-in event. We’re looking for folks who want to do some beginning hands-on (“on-line hands-on”, that is) investigation of New Mexico politics. And that means homework assignments and contributing to our Three Tuesdays wiki. Participants are also encouraged to bring a laptop if they can.

Tom Johnson’s 30-year career path in journalism is one that regularly moved from the classroom to the newsroom and back. He worked for TIME magazine in El Salvador in the mid-80s, was the founding editor of MacWEEK, and a deputy editor of the St. Louis Post-Dispatch. His areas of interest are analytic journalism, dynamic simulation models of publishing systems, complexity theory, the application of Geographic Information Systems in journalism and the impact of the digital revolution on journalism and journalism education. He is the founder and co-director of the Institute for Analytic Journalism and a member of the Advisory Board of Santa Fe Complex.


Flickr's Burning Man Map Uses Open Street Map
Aug 28th, 2008 by Tom Johnson

Brady Forrest, at O'Reilly's Radar, tips us to an interesting mash-up of Flickr, Open Street Map and the  Burning Man festival.  Why not use this idea for local festivals — fairs, classic car rallies, an introduction to a new shopping center?

Flickr's Burning Man Map Uses Open Street Map

Posted: 26 Aug 2008 07:38 PM CDT

Flickr is best known for its photo-sharing, but increasingly its most innovative work is coming from its geo-developers (Radar post). Yesterday they announced the addition of a street-level map of Black Rock City so that we can view geotagged Burning Man photos. Flickr got the mapping data via Open Street Map's collaboration with Burning Man.

Flickr uses Yahoo! Maps for most of their mapping (and fine maps they are). The underlying data for them is primarily provided by NAVTEQ. NAVTEQ's process can take months to update their customers' mapping data servers. For a city like Burning Man that only exists for a week every year, that process won't work. However, an open data project like Open Street Map can map that type of city. To the right you can see what Yahoo's map currently looks like.

This isn't the first time Flickr has used OSM's data. They also used it to supplement their maps in time for the Beijing Olympics. I wonder if Yahoo! Maps will consider using OSM data so that their sister site doesn't continue to outshine them (view Beijing on Yahoo Maps vs. Flickr's Map to see what I mean). OSM's data is Creative Commons Attribution-ShareAlike 2.0.

In other geo-Flickr news, they have added KML and GeoRSS to their API. This means that you can subscribe to Flickr API calls in your feed reader or Google Earth. (Thanks for the tip on this, Niall.)

If you want to get more insight into Flickr's geo-thinking, watch their talk from the Where 2.0 2008 conference after the jump.

Putting Open Source tools to work for community reporting
Jun 13th, 2008 by Tom Johnson

The phrases “community journalism” and “convergence journalism” have been around for decades in the case of the former and for at least 10 years in the case of the latter. For a long time, “community journalism” referred to the publishing of “…a small daily, 20,000 or less, or maybe a larger weekly or twice- or thrice-weekly.” And “convergence” most often talked about using various print and Audio/Visual media to deliver the same old reportorial product of traditional newspapers and broadcast.

Finally, some are starting to see that the real and much-needed “convergence” has to be implemented on the front-end of the reportorial process.  Paul Niwa, at Emerson College, has done just that with some graduate students who created  And we are grateful to Niwa for writing a “how and why we did it” piece for the current issue of the Convergence Newsletter.

Here's Niwa's lede, but do check out the entire piece:

“Community Embraces a Converged Journalism-Sourcing Project

By Paul Niwa, Emerson College

Boston’s Chinatown is one of the largest and oldest Asian American neighborhoods in the country. Yet, this community of 40,000 does not even have a weekly newspaper. Coverage of the neighborhood in the city’s metropolitan dailies is also weak. In 2006, The Boston Globe and the Boston Herald mentioned Chinatown in 78 articles. Only 16 percent of the sources quoted in those articles were Asian American, indicating that newspapers relied on information from non-residents to cover the neighborhood. With all this in mind, I created the project as an experiment to build a common sourcebook for newsrooms.” 


UC Berkeley Library's Congressional Research Tutorials
Apr 4th, 2008 by Tom Johnson

We have long been fans — and users — of the research tutorials created by the good folks in the UC Berkeley library.  This item below from The Scout Report reminds me of that work and why I like it so much.  You, too, might find it a helpful link for your training efforts.

UC Berkeley Library's Congressional Research Tutorials [Macromedia Flash Player]

Making a clear and direct path through the vast amount of Congressional materials can be quite a chore, even for the most seasoned and experienced researcher. Fortunately, the University of California at Berkeley Library has created these fine Congressional tutorials. Designed to help users locate materials both online and in the library, these tutorials are in the form of short Flash-enabled videos. Most of the tutorials last about two minutes, and they include “Find a Bill”, “How Do I Contact My Representative?”, “Find Congressional Debate”, and “Find a Hearing”. After viewing one (or more) of these tutorials, users can also make their way to the “What's going on in Congress right now?” area to stay on top of the various activities of this important legislative body. [KMG]

