Interesting Tool of the Week: Quadrigam – Connecting two visualizers
Aug 18th, 2015 by Tom Johnson

Take me to Quadrigam

Connect two visualizers

Every chart has input data and output data. You can synchronize two charts simply by using the output data element of the first chart as the input data element of the second. You can even build formulas based on data outputs: for instance, the element selected in a control list can be part of a formula that filters a given dataset on one column, keeping the rows whose values match the selected element. Charts usually have two main data outputs:

  • On over: When the mouse hovers over an element, that element becomes the data output of the chart.
  • Selection: When you click on a given element of a chart (e.g., a series in a Line chart), that element becomes the output that can be synchronized with another chart. In Maps, we also have “zoom” and “center”, which make it super easy to synchronize, for instance, two maps (the zoom and center of the first map act as input data elements of the second map).
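
The output-feeds-input idea above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Quadrigam's actual API: the `Chart` class, its `connect`, `select` and `receive` methods, and the sample sales rows are all hypothetical names and data made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


# Hypothetical model (not Quadrigam's API): a chart exposes its current
# selection as output data, and other charts can take it as input data.
@dataclass
class Chart:
    name: str
    rows: List[dict]
    filter_column: Optional[str] = None   # column matched against the input
    selection: Optional[str] = None       # this chart's output data
    visible_rows: List[dict] = field(default_factory=list)
    _listeners: List[Callable[[Optional[str]], None]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # With no input yet, the chart shows every row.
        self.visible_rows = list(self.rows)

    def connect(self, other: "Chart") -> None:
        """Use this chart's selection as the input of `other`."""
        self._listeners.append(other.receive)

    def select(self, value: Optional[str]) -> None:
        """Simulate a click: store the output and push it downstream."""
        self.selection = value
        for listener in self._listeners:
            listener(value)

    def receive(self, value: Optional[str]) -> None:
        """Filter rows so only those matching the incoming value remain."""
        if value is None or self.filter_column is None:
            self.visible_rows = list(self.rows)
        else:
            self.visible_rows = [
                row for row in self.rows if row[self.filter_column] == value
            ]


# Made-up sample data for the illustration.
data = [
    {"region": "North", "sales": 10},
    {"region": "South", "sales": 7},
    {"region": "North", "sales": 4},
]
control = Chart("control list", data)
table = Chart("table", data, filter_column="region")
control.connect(table)

control.select("North")
print(table.visible_rows)
```

Selecting "North" in the control list leaves the table with only the two North rows; selecting nothing (`None`) restores all three.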

PDF Tables: Outstanding tool extracts tables to Excel
Jun 13th, 2015 by Tom Johnson

I just gave this a spin using the City of Santa Fe 2015 budget, a 150-pager.  The conversion seemed very fast and quite accurate.  Unless you need the text, it is even faster if you cut the text pages and run only those pages containing the desired tables.  Each PDF page becomes a separate Excel worksheet, which can then be sliced and diced as necessary.

Kudos to the ScraperWiki folks.

Accurately extract tables from PDFs
No more time-consuming and error-prone copying and pasting


Another fine tool for slicing and dicing data….
Nov 9th, 2010 by Tom Johnson

From …..

Find the names in your data with Mr. People

November 8, 2010 to Online Applications

Inspired by Shan Carter's simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.

Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I've finally put together a Web front end for it.

Getting data in the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can't do anything with it yet, because as often is the case, it's not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.
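
For a feel of what this kind of name parsing involves, here is a minimal Python sketch. It is not Mr. People's code (that is a Ruby port of a Perl module), and the `TITLES` and `SUFFIXES` lists and the `parse_name` function are hypothetical, deliberately tiny stand-ins. It assumes one person per string and will stumble on multi-word surnames such as "van der Berg".

```python
# Hypothetical, deliberately small lookup lists; a real parser needs far more.
TITLES = {"mr", "mrs", "ms", "dr", "prof", "sen", "rep"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "md", "phd"}


def _key(token: str) -> str:
    """Normalize a token for lookup: drop a trailing period, lowercase."""
    return token.rstrip(".").lower()


def parse_name(raw: str) -> dict:
    """Split one personal name into title, first, middle, last and suffix."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]

    suffix = ""
    # A trailing comma-separated suffix, as in "John Smith, Jr."
    if len(parts) > 1 and _key(parts[-1]) in SUFFIXES:
        suffix = parts.pop()

    # "Last, First Middle" is reordered to "First Middle Last".
    if len(parts) == 2:
        parts = [parts[1] + " " + parts[0]]

    tokens = " ".join(parts).split()

    title = ""
    if tokens and _key(tokens[0]) in TITLES:
        title = tokens.pop(0)

    # A suffix with no comma, as in "John Smith Jr."
    if not suffix and len(tokens) > 1 and _key(tokens[-1]) in SUFFIXES:
        suffix = tokens.pop()

    return {
        "title": title,
        "first": tokens[0] if tokens else "",
        "middle": " ".join(tokens[1:-1]),
        "last": tokens[-1] if len(tokens) > 1 else "",
        "suffix": suffix,
    }


for name in ["Dr. Martin Luther King, Jr.", "Smith, John Q"]:
    print(parse_name(name))
```

Even this toy version shows why a ready-made tool is welcome: every rule above is a guess about how people write names, and real donor lists break such guesses constantly.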

[Mr. People via @mericson]


Ramping up your statistical skills
Sep 3rd, 2010 by Tom Johnson

From FlowingData….

Statistical literacy guides for the basics

Sep 3, 2010 to Statistics

Guide to statistical charts - before and after

“You can get pretty far with data graphics with just limited statistical knowledge, but if you want to take your skills, resume, and portfolio to the next level, you should learn standard data practices. Of all places, UK Parliament has some short and free guides to help you with basic statistical concepts. They provide 13 notes, each only two or three pages long that can help you with stuff like how to adjust for inflation, confidence intervals and statistical significance, or basic graph suggestions [pdf]. I like.”



UPDATED: Inflation Conversion Factors for Dollars 1774 to Estimated 2019
Oct 1st, 2009 by analyticjournalism

Here at the IAJ, we have long been fans, and users, of Prof. Robert Sahr's “Inflation Conversion Factors” web site and tools.  We were snoozing at the switch a bit and didn't notice that Sahr updated the site in early June 2009.  Check it out: it's filled with both PDF and Excel data/tools to calculate the comparative costs of most anything from 1774 to 2019.  The site also includes some fun data:

The charts on the following topics are available either by scrolling down or by selecting the appropriate link:

          Price levels and the US economy

          Millionaires Then and Now

          Pay of Presidents and Members of Congress

          Selected Government-related Items (number of government employees, Social Security and AFDC/TANF; stamp prices, minimum wage, mean and median family income)

          Presidential Election Costs 1860 to 2000

          National Government Budget:  Outlays (Spending), Revenue, Deficits or Surpluses, and National Debt

          Selected Commodity Prices (gasoline and gold) [gasoline revised June 2009, using June 2009 price data and estimated 2009 dollar conversion factors]

          Budget Details
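
The arithmetic behind any inflation conversion is a ratio of price levels, and a short Python sketch makes it concrete. The index values below are made-up round numbers chosen so the math is easy to check; they are not Sahr's factors or real CPI figures, and `PRICE_INDEX`, `conversion_factor` and `adjust_for_inflation` are hypothetical names for this illustration. We assume a conversion factor is the price level of the original year divided by that of the target year, so dividing an amount by it restates the amount in target-year dollars.

```python
# Hypothetical price index: made-up round numbers, NOT real CPI data.
PRICE_INDEX = {1950: 25.0, 1980: 50.0, 2009: 100.0}


def conversion_factor(from_year: int, to_year: int, index=PRICE_INDEX) -> float:
    """Price level of `from_year` relative to `to_year` (assumed definition)."""
    return index[from_year] / index[to_year]


def adjust_for_inflation(amount: float, from_year: int, to_year: int,
                         index=PRICE_INDEX) -> float:
    """Restate `amount` from `from_year` dollars in `to_year` dollars."""
    return amount / conversion_factor(from_year, to_year, index)


# With these made-up values, prices doubled between 1980 and 2009,
# so $100 in 1980 dollars corresponds to $200 in 2009 dollars.
print(adjust_for_inflation(100.0, 1980, 2009))
```

To use real numbers, swap in index values or factors from a published table and keep the same two functions.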


More insights into how and why journos can't deal with data
Sep 30th, 2009 by analyticjournalism

This tip comes from our friend Stephan Russ-Mohl, of the European Journalism Observatory.

Darned Statistics
by Stephan Russ-Mohl

European Journalism Observatory, September 26, 2009

 Many journalists face difficulties in dealing with statistics, and frequently lack the competence to present quantitative information to their publics in easy-to-grasp language.

This is nothing new, as most journalism textbooks contain tips on how to deal adequately with numbers and percentages. Thus far, these remain rules of thumb. Three U.S. researchers – Coy Callison, Rhonda Gibson and Dolf Zillmann – recently tested these rules. Drawing from 240 students who participated in an experiment, their empirical analysis provides new, surprising insights.

The test subjects could deal as easily with percentages as with absolute numbers. Contrary to textbooks, they experienced more difficulties when percentages were characterized verbally. For example, “30 percent of the citizens of XY have AIDS” has a meaning different from “3000 of the citizens of XY have AIDS” – but the percentage is no more difficult to grasp and remember. If, instead, the percentage is transformed into the phrase “Three of ten of the citizens of XY have AIDS,” an additional barrier of comprehension is created. The researchers mention, however, that their experiments should be expanded, and that less educated groups still need to be included.

Coy Callison et al.: How to Report Quantitative Information in News Stories, Newspaper Research Journal, Vol 30/Nr. 2, Spring 2009, 43-55.


Wondering what the prez is doing???
Sep 19th, 2009 by analyticjournalism

 We don't know how long this one's been around, but kudos to the gang at the WashPost for taking a concept/tool somewhat on the margin and putting it to good use for both reporters and readers. See

Every day President Obama meets with key members of his administration, Congress, foreign dignitaries, interest groups and regular citizens. Use our interactive database to track how Obama is spending his time, what issues are getting the most attention and who is influencing the debate. 

CREDIT: Nathaniel Vaughn Kelso, Madonna Lebling, Karen Yourish, Ryan O'Neil, Wilson Andrews, Jacqueline Kazil, Todd Lindeman, Lucy Shackelford, Paul Volpe

© 2009 The Washington Post Company

Vintage Infographics From the 1930s
Sep 11th, 2009 by analyticjournalism

Nathan, over at FlowingData, has posted a fine example of infographics.  The work of Willard C. Brinton is a nice extension of what was being done by U.S. government agencies.  Turns out, Brinton's book can be found in used book sites, and at an affordable price.

Vintage Infographics From the 1930s

Posted by Nathan / Sep 11, 2009 to Infographics

Someone needs to get me a paper copy of Willard Cope Brinton's Graphic Presentation (1939), because it is awesome.

Brinton discusses various forms of graphic presentation in the 524-page book and what works and what doesn't. There's also some good stuff in there about how to make your graphs, charts, maps, etc (by hand).

Have we seen these?

The most interesting part is that many of the graphics – despite having no computers in 1939 – look a lot like what we have today. Albeit, they're a little rougher because they're made by hand, but that's just added flavor.

For example, you've got the Sankey diagram above, or a “cosmograph” as Brinton calls it. The instructions read:

One thousand strips of paper are set on edge to represent 100% and are separated into component parts of 100%.

What? You want me to arrange 1,000 strips of paper to make my diagram? Brilliant, I say.

Here are your choropleth maps…


network diagram…


and of course some of your usual suspects…


The entire book is freely available in PDF format, but it's low resolution and takes forever to browse. Michael Stoll has posted some higher quality shots on Flickr.

I still want more though.

Seriously, does anyone know where I can get a copy?



Mary Ellen Bates on "Google Squared"
Aug 25th, 2009 by analyticjournalism

Mary Ellen Bates offers up this good tip on “Google Squared” at

Bates Information Services

August 2009

Google Squared

Google Labs — the public playground where Google lets users try out new products or services that aren't yet ready for prime time — is my secret weapon for learning about cool new stuff. My favorite new discovery in Google Labs is Google Squared. It's a demonstration of a search engine trying to provide answers instead of just sites, and at a higher level than the simple “smart answers” you see when you search for “time in Rome” or “area code 909”. Rather, Google analyzes the retrieved pages, identifies common elements, and creates a table with the information it has compiled.

This is a fascinating tool that helps you compile facts into tables that Google builds on the fly. Hard to describe, easier to show. Go to and type in a query that will retrieve a number of similar things — organic farms in Colorado, for example, or women CEOs… even superhero powers.

Google Squared generates a table of facts extracted from its index, with the items you are searching for as the left-most column, along with columns for whatever related characteristics are relevant for the topic. For organic farms in Colorado, for example, the table in the search results has columns for the name of the company, an image from the farm's web site, a snippet of description about the farm, and columns for telephone number, location and “season.” Note that some of these columns may have few entries in them, depending on what information Google analyzed. For women CEOs, the table includes the CEO's name, a photo, a snippet that indicates what her position is, her date of birth, and her nationality. For superhero powers, you will find the superhero's name, a photo, a far-too-brief description of said superhero, the hero's first appearance (in print, that is), publisher and even the hero's “abilities”.

Interestingly, you can insert your own items in a Google Squared table, and either let Google populate the rest of the row or type in whatever content you want in that row. I added Catwoman to my superheroes table and Google filled in the new row with her photo and description; I could provide the rest of the info. For some tables, Google even suggests additional columns. For my superheroes table, I could add columns for Aliases, Alter Ego, Profession (the Joker is a lawyer, of course), and so on. You can add your own columns, as well.

You can also delete a row or column that isn't relevant to your search. If you log in to your Google account, you can save your customized tables for later use. And you can export the table into Excel (the images are exported as URLs).

Google Squared is never going to compete with a real human's analysis of a collection of facts, but it can be a great way to start brainstorming, as a quick way to organize the results of your search, and as a starting point for a nicely-presented deliverable for your client.

“May I publish or reproduce this InfoTip?” Be my guest! Just make sure you credit the source, Bates Information Services, and include the URL,

SNA in R Talk, Updated with [Better] Video
Aug 20th, 2009 by analyticjournalism

OK, OK.  Using R can be a steep hill to climb for some.  But here, thanks to O'Reilly Radar, is a pretty good video of a presentation on using R as a Social Network Analysis tool.

 “Social Network Analysis in R — video and slides for talk on doing social network analysis with R.”

SNA in R Talk, Updated with [Better] Video

Update II: It occurred to me that it would be much better for people to be able to view the entire talk in a single video, rather than having to switch between sections; therefore, I uploaded the whole thing to Vimeo.

On August 6th I gave a talk at the New York City R Meetup on how to perform social network analysis in R using the igraph package. Below are the slides I covered during the talk, and all of the code examples from the presentation are available in the ZIA Code Repository in the R folder.

Below is a video of this talk, with a link to the slides I review during the presentation. If you are interested, I suggest downloading the slides and following along with the video while having the slides open, as much of what is on the screen in the video is hard to read.


Social Network Analysis in R from Drew Conway on Vimeo.

Andrew Little’s presentation on econometrics in R using Zelig and MatchIt is also available on YouTube starting here. I hope you enjoy the presentation, and please let me know if you have any questions or comments.


