Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
The good folks at Directions Magazine today tipped us off that Geodata.gov is open for business. Geodata.gov was spawned by the “Geospatial One-stop” program.
Geodata.gov doesn't have everything about everywhere (yet), but it's a solid — and very rich — data resource that should be high on a reporter's list of “data sites to check early in the reporting process.”
We appreciate NYTimes reporter SABRINA TAVERNISE's hard work last week reporting on, and explaining what was behind, the numbers of Iraqi civilian deaths in “Data Shows Rising Toll of Iraqis From Insurgency.” There's always the fog of war and all that, but Tavernise surely spent a fair amount of time on the piece and, at the end of the day, does a good job of explaining how and why the numbers can vary so much from source to source and month to month.
Click here for the piece (unless the NYT has already archived it).
Check out “Mapping Hacks,” a new book on the O'Reilly list by Schuyler Erle, Rich Gibson and Jo Walsh. “Mapping Hacks is a collection of one hundred simple techniques available to developers and power users who want to draw digital maps. You'll learn where to find the best sources of geographic data and then how to integrate that data into your own creations. With so many industrial-strength tips and tools, Mapping Hacks effectively takes the sting out of digital mapmaking.”
One of the insights into the craft that business reporters learn early in the game is that the key to understanding annual reports is to read the footnotes and endnotes. That's where the juicy stuff is. So it is, it seems, for education reporters. A story in Sunday's St. Paul (Minn.) Pioneer Press by higher education writer Paul Tosto, “'Home alone' data debatable,” points out the importance of reading the footnotes.
Backstory: In June, a group called the Minnesota Commission on Out-of-School Time released a report claiming “Minnesota has the nation's highest percentage of teens home alone each afternoon. It has more young children taking care of themselves after school than any state in the country. Half its kids aren't part of any structured after-school activity.”
Tosto read the report, scratched his head and then looked at the footnotes. Ultimately, he found the data and sourcing for the Commission's report didn't hold up. Here's what Tosto had to say about how he picked up the scent of the story:
“My concerns about the Minnesota Commission on Out-of-School Time findings surfaced when the report came out June 2. The sweeping nature of one statement, “Minnesota is home to 950,000 young people and has the highest percentage in the country of children ages 12 and older alone at home every single afternoon” startled me. That was going to lead my story.
But when I tried to trace back the footnote, I found the Web link that was supposed to provide the source for the information didn't work. When I asked for clarity, I was sent information about 10- to 12-year-olds, not teenagers, and the data was from 1997 and involved only 13 states.
I became worried enough about it that day that I didn't write anything on the report or its release.
I spent the next few weeks on and off asking the commission's chief of staff for more information, trying to nail down three key pieces of information the group was using.
With the first finding, they eventually acknowledged to me that they did not have data showing Minnesota as the state with “the highest percentage in the country of children ages 12 and older alone at home every single afternoon.” Someone had apparently confused information from a couple of reports.
With the second finding — Minnesota has the country's highest percentage of 10- to 12-year-olds caring for themselves after school — I went back to the origins of that data, calculations by the Urban Institute of data from the 1997 Survey of America's Families.
Minnesota did have the highest percentage of children reported in self care and it was much higher than the national average the Urban Institute had calculated. But when I talked to one Urban Institute researcher who'd worked with the data, she said it was incorrect to say that Minnesota had the highest in the country since the data involved only 13 states. And surveys done by Minnesota's Wilder Foundation just a couple of years later showed percentages of children in self care that were much smaller than the Urban Institute report.
With the third finding — “about half” the state's children were not part of a structured after school activity — I had concerns about the methodology.
The commission's press release initially cited a report by one of its researchers a year earlier as the source. When I looked at that report, I found essentially unscientific discussion groups conducted by the researcher at nine sites across the state. Only 101 kids participated and the demographics did not reflect Minnesota's race and ethnicity. When I raised questions about it, the commission said (despite its press release) that it didn't base its conclusion on those site visits. But the commission did not provide any local, scientific data to back it up.”
Very nice work by a reporter who simply asked: “What do we [in this case, they] know and how do we know it?”
A recent profile of mathematician-turned-geneticist Philip Green is a good-read introduction to bio-informatics, and bio-informatics just might produce some methodologies journalists can use to validate public records databases.
The article, “Bioinformatics,” is in the quarterly published by the Howard Hughes Medical Institute. Some highlights:
* “Using a detailed computational model, [researchers] found that some kinds of [genetic] mutations occur at constant rates, like the ticking of a clock, which makes them useful for dating evolutionary events. Other kinds of mutations occur at varying rates depending on the generation times of the organism. This information in turn makes it much easier to identify parts of the genome that exhibit different patterns of change over time, indicating that the DNA in those regions is subject to selection and therefore playing a functional role. The idea, says Green, is to separate the noise of meaningless changes in DNA so that the signals of consequential changes emerge clearly from the background.” Journalists could look at which elements in a database are changed, and how often, as a clue to the importance of the database and the relative importance of its various elements.
* “The main issue [in biology and genomics] is how quantitative we’re going to be able to get,” [Green] says. “Most people will accept the idea that we will know qualitatively how things are interacting with each other. But what you really want is a quantitative result, so that you can change the levels of one component and predict how it will affect the system.”
* “Back then, [says a colleague of Green’s] we wondered if there was a need for mathematics in biology. In the mid-1980s, there weren’t a lot of data. Biology was about analyzing the notes in your lab book. In the last 20 years, biology has become dominated by huge data sets. Now it’s an exception rather than the rule to publish a paper that does not draw on large databases of biological information. Mathematical analysis has become a fundamental part of biological research. It has turned out to be of equal importance to experimentation.”
Take a look at the article. It suggests some parallels of investigation for analytic journalism.
We agree, there can be many reasons not to run a map in the IoP (Ink-on-Paper) version of a newspaper. And maps are sometimes run more as a graphic element in the page design than as a tool to tell a story in a better way. (Although this seems to happen less as “design and information consciousness” has percolated through journalism thanks to organizations like the Society for News Design.) Still, if a decision is made to use a map, then that graphic should add to the readers' understanding of what is usually complex data. Last week, the Palm Beach [Florida] Post carried a map showing the home county of U.S. troops killed in Iraq. The problem is, the KIA map shows the number killed without taking into account the size of the population from which those troops were recruited. Is there a better way? Of course, and the folks in the newsroom trenches had produced one: a map showing the KIAs relative to the population of the county where the soldiers were from. This one, of course, supplies some of the appropriate context. The problem was, the editors decided to publish the traditional-but-misleading map. Sigh.
Here is another on the same topic: http://www.obleek.com/iraq/index.html
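For readers who want to see why the raw-count map misleads, here is a minimal sketch of the per-capita arithmetic. The two counties and their figures are hypothetical, invented only to illustrate the calculation; they are not taken from the Palm Beach Post map, and the function names are our own.

```python
# Hypothetical county figures, invented purely for illustration.
# They do not come from the Palm Beach Post map or any real dataset.
counties = [
    {"name": "County A", "deaths": 12, "population": 1_200_000},
    {"name": "County B", "deaths": 3, "population": 60_000},
]


def deaths_per_100k(deaths, population):
    """Return deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


def rank_by_rate(rows):
    """Sort counties from highest to lowest per-capita rate."""
    return sorted(
        rows,
        key=lambda r: deaths_per_100k(r["deaths"], r["population"]),
        reverse=True,
    )


for row in rank_by_rate(counties):
    rate = deaths_per_100k(row["deaths"], row["population"])
    print(f'{row["name"]}: {row["deaths"]} deaths, {rate:.1f} per 100,000')
```

In this made-up case the larger county has four times the raw count, but the smaller county has the higher rate once population is taken into account. That is the context a raw-count map leaves out.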
We're pleased that the PBS program “Frontline” is keeping up the good fight to produce important journalism. And thanks to the Librarian's Index to the Internet for pointing us to: Private Warriors
http://www.pbs.org/wgbh/pages/frontline/shows/warriors/
Be sure to drill down to the section, “Does Privatization Save Money?” A nice example of a reporter asking the right questions.
One of the foundational cross-over disciplines we think is of value to journalists is forensic accounting; at least that's the term used when it is applied in business. (It's “performance measurement” when talking about government.) One of the basic measurements in forensic accounting is to compare the percent of dollar distribution by type or sector in one institution to the percent of dollar distribution in a comparable institution. So it is that we were pleased to see Glen Justice dipping into the forensic accountant's toolbox in Wednesday's NYTimes in his story “For a Lobbyist, Seat of Power Came With a Plate.” The story is about how lobbyist and Tom DeLay pal Jack Abramoff apparently used his own restaurant in Washington, Signatures, as a place to meet and greet legislators. He just forgot to give them a check. Justice wrote:
“…While Signatures was popular, it struggled to make money, according to employees and documents.
'Mr. Abramoff and his companies invested more than $3 million in Signatures from January 2002 to May 2003, records show. At the same time, he and his employees gave away tens of thousands of dollars in food, wine and liquor, the records show. That includes menu prices for Mr. Abramoff's own food and drink, as well as employee discounts and free meals given by restaurant managers and staff, according to the records. Nationwide, the median expense for marketing, including free meals and drinks, was about 3.5 percent of sales for expensive restaurants like Signatures that spend the most on such promotions, according to the National Restaurant Association. One national restaurant consultant, Clark Wolf, said the figure can go as high as 5 percent.
'At Signatures, free meals and drinks for managers and guests alone were about 7 percent of revenues for the restaurant's first 17 months, according to former employees and financial records. Mr. Blum, the spokesman for Mr. Abramoff, disputed that percentage.”
Seems like pretty basic reporting, but more reporters would do well to make that one more call if they want to establish context in their stories.
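The comparison Justice makes can be sketched in a few lines. The three percentages below are the ones quoted in the story (about 3.5 percent median, as high as 5 percent, about 7 percent at Signatures); the function names and the dollar amounts in the last example are hypothetical. Note also that the story's two figures are not defined identically, so this is a rough benchmark check rather than a formal audit.

```python
def share_of_sales(category_total, sales_total):
    """Return a spending category's share of sales, as a percentage."""
    if sales_total <= 0:
        raise ValueError("sales_total must be positive")
    return category_total / sales_total * 100


def times_benchmark(observed_pct, benchmark_pct):
    """Return how many times the benchmark the observed share is."""
    if benchmark_pct <= 0:
        raise ValueError("benchmark_pct must be positive")
    return observed_pct / benchmark_pct


# Percentages as quoted in the NYTimes story.
MEDIAN_PCT = 3.5      # national median for expensive restaurants
HIGH_END_PCT = 5.0    # upper figure cited by consultant Clark Wolf
SIGNATURES_PCT = 7.0  # free meals and drinks at Signatures, first 17 months

print(f"vs. median:   {times_benchmark(SIGNATURES_PCT, MEDIAN_PCT):.1f}x")
print(f"vs. high end: {times_benchmark(SIGNATURES_PCT, HIGH_END_PCT):.1f}x")

# Hypothetical dollar amounts, only to show how a share is derived
# when a reporter has raw ledger totals instead of percentages.
example_pct = share_of_sales(category_total=70_000, sales_total=1_000_000)
print(f"example share of sales: {example_pct:.1f} percent")
```

The point of that one extra call is the denominator: a dollar figure for giveaways means little until it is expressed as a share of sales and set beside what comparable restaurants spend.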
Using traffic flow data and models to demonstrate simulation modeling as a learning tool seems to be akin to the Bunsen burner, i.e. a fundamental implement everyone uses. The Wall Street Journal science section reports this:
How Brief Drop in Cars Can Trigger Tie-Ups, And Other Traffic Tales
July 1, 2005; Page B1
“If you plan to hit the roads like the zillions of other drivers this holiday weekend, Avi Polus has a word of advice: patience.
A transportation engineer at Technion-Israel Institute of Technology in Haifa, Prof. Polus's concern isn't drivers' collective blood pressure but traffic flow. Like the growing number of other engineers and physicists who are hubcap-deep in the science of traffic, he is determined to explain infuriating mysteries such as phantom traffic jams (There's no bottleneck or accident at the front of this jam, so why weren't we moving?) and why a brief drop in volume can, paradoxically, trigger a long-lasting traffic jam.”
Be sure to download and check out the models from Martin Treiber of Dresden University of Technology.
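For a feel of what such a model does under the hood, here is a minimal sketch of a car-following simulation on a ring road. It assumes the Intelligent Driver Model, which Treiber co-authored; it is not the code from his site, and the parameter values, the ring-road setup and the function names are our own illustrative choices.

```python
import math


def idm_acceleration(v, gap, dv, v0=33.0, T=1.5, a=1.0, b=2.0, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration in m/s^2.

    v   : this car's speed (m/s)
    gap : bumper-to-bumper distance to the car ahead (m)
    dv  : this car's speed minus the leader's speed (m/s)
    The keyword parameters are illustrative values, not calibrated ones.
    """
    # Desired gap grows with speed and with how fast we are closing in.
    s_star = s0 + max(0.0, v * T + v * dv / (2 * math.sqrt(a * b)))
    return a * (1 - (v / v0) ** delta - (s_star / gap) ** 2)


def simulate_ring(n_cars=20, road_length=1000.0, car_length=5.0,
                  steps=600, dt=0.1, brake_steps=30):
    """Run cars around a circular road; car 0 brakes briefly at the start.

    Returns the list of final speeds, one per car.
    """
    spacing = road_length / n_cars
    pos = [i * spacing for i in range(n_cars)]
    vel = [20.0] * n_cars
    for step in range(steps):
        acc = []
        for i in range(n_cars):
            leader = (i + 1) % n_cars
            gap = (pos[leader] - pos[i]) % road_length - car_length
            gap = max(gap, 0.1)  # crude guard against a zero or negative gap
            acc.append(idm_acceleration(vel[i], gap, vel[i] - vel[leader]))
        if step < brake_steps:
            acc[0] = -3.0  # the brief disturbance: one driver taps the brakes
        for i in range(n_cars):
            vel[i] = max(0.0, vel[i] + acc[i] * dt)
            pos[i] = (pos[i] + vel[i] * dt) % road_length
    return vel


final_speeds = simulate_ring()
print(f"slowest car: {min(final_speeds):.1f} m/s, "
      f"fastest car: {max(final_speeds):.1f} m/s")
```

Changing `n_cars` or `brake_steps` and watching how the spread between the slowest and fastest car evolves is a quick way to explore how one driver's brief slowdown is passed back through the cars behind.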
We don't read every newspaper in the U.S. or the world every day, so our survey of the news media's infographics is, shall we say, a bit flawed. That said, we continue to be impressed by the ability of the NY Times infographic team to consistently come up with ways of showing a variety of concepts. There's a 250-year tradition of illustrating quantitative data, but taking concepts and turning them into quantitative graphics is more recent. Yesterday, the NYT gang worked its magic on the issue of Sandra Day O'Connor and her votes as a justice. Check out: “Levels of Agreement” and “A Crucial Swing Vote.”