Tilling the soil makes for fertile crops, Congressionally speaking.
Dec 5th, 2005 by Tom Johnson

Kudos to Derek Willis and Adrian Holovaty of The Washington Post for the site “U.S. Congress Votes Database.”  One element we find of recent and special interest is the “late night votes” variables for both the House and Senate.  With a little more probing and data slicing and dicing, it would make an interesting bit of visual statistics/infographics to do a longitudinal comparison of the time of votes in various congresses.
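
As a rough illustration of that kind of longitudinal comparison, here is a minimal Python sketch. It assumes we already have a list of (congress number, vote timestamp) pairs; the 9 p.m. to 6 a.m. cutoff, the function names and the sample timestamps are all our own made-up assumptions, not the database's definitions or data.

```python
from collections import defaultdict
from datetime import datetime


def is_late_night(ts, start_hour=21, end_hour=6):
    """True if the vote fell in the assumed late-night window (9 p.m. to 6 a.m.)."""
    return ts.hour >= start_hour or ts.hour < end_hour


def late_night_share(votes):
    """Return {congress: fraction of its votes cast late at night}."""
    totals = defaultdict(int)
    late = defaultdict(int)
    for congress, ts in votes:
        totals[congress] += 1
        if is_late_night(ts):
            late[congress] += 1
    return {c: late[c] / totals[c] for c in sorted(totals)}


# Invented timestamps, for illustration only (not real roll-call times).
sample_votes = [
    (108, datetime(2003, 11, 22, 5, 53)),
    (108, datetime(2003, 6, 26, 14, 10)),
    (109, datetime(2005, 7, 28, 0, 3)),
    (109, datetime(2005, 11, 18, 13, 45)),
    (109, datetime(2005, 11, 18, 23, 30)),
]

shares = late_night_share(sample_votes)
for congress, share in shares.items():
    print(f"Congress {congress}: {share:.0%} of votes were late-night")
```

The per-congress shares could then feed a simple bar chart, one bar per congress.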

This site/searchable database is a fine example of how investing in some basic data preparation can create the potential for a ton of stories.  Why, for example, do Democrats have such a preponderance (18 out of 20) of Representatives on the “missed” list, but only 9 out of 20 on the similar list for the Senate?

This is also a fine example of how a newspaper can do good things for itself
while doing good things for the community and readers.  This
database gives the WP reporters and editors a quick look-up of
Congressional activity, the kind of fact and detail that can enrich a
story.  At the same time, citizens can turn to this value-added
form of the public record to answer their own questions.

Derek Willis wrote to the news librarians listserv:

“It's not part of a story or series, but the Post today launched a site that may prove useful to your newsrooms or even as an inspiration to learn Python: a congressional votes database that covers the 102nd-109th congresses (1991-present). Currently browsable, we're working on adding a search engine and other features to it. Adrian Holovaty, who works for, and I assembled the data and he built the web framework to display it. All of the data is gathered using Python, the database backend is PostgreSQL and the web framework is Django.”

What are the demographics of Elkhart Lake, Wisconsin, pop. 1,034?
Nov 21st, 2005 by Tom Johnson

Cartography blog tips us off to a valuable site when quick hits are needed on a community, a SMALL place, in the U.S. or Canada.  Check out ePodunk.

“ePodunk is a site that
focuses on place and provides information on 25,000 communities in the
U. S. The site also contains a number of interesting maps, including
maps of the Katrina diaspora, ethnic origin, fastest growing counties
and others. There is also a Canadian version of the site, focusing on
Canadian places, but it, sadly, does not seem to have any maps.”

Yes, Virginia, methodology DOES matter
Nov 10th, 2005 by JTJ

A piece on calling the elections in Detroit:

MAKING A FORECAST: A secret formula helps producer call the election right



November 10, 2005

What was a viewer to believe?

As polls closed Tuesday, WDIV-TV (Channel 4) declared Freman Hendrix winner of Detroit's mayoral race by 10 percentage points.

WXYZ-TV (Channel 7) showed Hendrix ahead by 4 percentage points, statistically too close to call.

But WJBK-TV (Channel 2) got it right, declaring just after 9 p.m. that
Mayor Kwame Kilpatrick was ahead, 52% to 48%, which turned out to be
almost exactly the final 53%-47% outcome declared many hours later.

And it was vote analyst Tim Kiska who nailed it for WJBK, and for WWJ-AM radio, using counts from 28 of 620 Detroit precincts.

Kiska did it with help from Detroit City Clerk Jackie Currie. She
allowed a crew that Kiska assembled to collect the precinct tallies
shortly after the polls closed at 8 p.m.

Using what he calls a secret formula, Kiska calculated how those 28 precincts would predict the result citywide.

His formula also assumed that absentee voters chose Hendrix over Kilpatrick by a 2-1 ratio.

That's different from the methods of pollsters who got it wrong
Tuesday, Steve Mitchell for WDIV and EPIC/MRA's Ed Sarpolus for WXYZ
and the Free Press. Both men used telephone polls, calling people at
home during the day and evening and asking how they voted.

It's a more standard method of election-day polling, but Tuesday proved treacherous.

Kiska, a former reporter for the Free Press and Detroit News, has done
such election-day predictions since 1974, but said he was nervous.

“Every time I go into one of these, my nightmare is I might get it
wrong,” said Kiska, a WWJ producer. “I had a bad feeling about this
going in. I thought there was going to be a Titanic hitting an iceberg
and hoping it wouldn't be me.”

Kiska said he especially felt sorry for his friend Mitchell.

Mitchell said he's been one of the state's most accurate political
pollsters over 20 years, but said his Tuesday survey of 800 voters
turned out to be a bad sample.

He said polling is inherently risky, and that even well-conducted polls
can be wrong one out of 20 times. “I hit number 20 this time.”

For Sarpolus, it's the second Detroit mayoral race that confounded his
polls. He was the only major pollster in 2001 who indicated Gil Hill
would defeat Kilpatrick.

Sarpolus said the pressure to get poll results on the air quickly made
it impossible to adjust his results as real vote totals were made
public during the late evening.

Of Kiska, Sarpolus said: “You have to give him credit. … But you have to assume all city clerks are willing to cooperate.”

Contact CHRIS CHRISTOFF at 517-372-8660.
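
Kiska's formula is secret, so what follows is not his method. It is only a minimal Python sketch of the two ideas the article describes: projecting a citywide share from a handful of precinct tallies plus an assumed absentee split, and the "one out of 20" risk that goes with a 95% confidence level in a sample survey. The precinct tallies and the absentee fraction below are invented for illustration, and the margin-of-error function assumes a simple random sample, which telephone polls only approximate.

```python
import math


def project_share(sample_tallies, absentee_fraction, absentee_share_a):
    """Project candidate A's citywide share.

    sample_tallies    -- list of (votes_for_a, votes_for_b) from sample precincts
    absentee_fraction -- assumed share of all ballots that are absentee
    absentee_share_a  -- assumed share of absentee ballots going to candidate A
    """
    votes_a = sum(a for a, _ in sample_tallies)
    votes_all = sum(a + b for a, b in sample_tallies)
    election_day_share = votes_a / votes_all
    # Weighted blend of election-day and absentee voters.
    return ((1 - absentee_fraction) * election_day_share
            + absentee_fraction * absentee_share_a)


def margin_of_error(sample_size, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / sample_size)


# Hypothetical tallies from three sample precincts (candidate A, candidate B).
tallies = [(300, 200), (250, 250), (350, 150)]

# Assumptions: a quarter of ballots are absentee, and absentee voters break
# 2-1 against candidate A, so A gets one third of them.
projected = project_share(tallies, absentee_fraction=0.25, absentee_share_a=1 / 3)
print(f"Projected share for A: {projected:.1%}")

# A poll of 800 voters, the sample size mentioned in the article.
moe = margin_of_error(800)
print(f"Margin of error for n=800: +/- {moe:.1%}")
```

With a margin of roughly three and a half points either way on each candidate's share, a 4-point lead in an 800-person poll really is "statistically too close to call."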

Digital detectives
Nov 3rd, 2005 by JTJ

For those interested in the forensic process (in this case, computer forensics), be sure to check out this fine, fine piece of digital detective work by Mark Russinovich, a computer security expert with Sysinternals.  He discovered evidence of a “rootkit” on his Windows PC.

We don't think journalists need to know how to DO this kind of deep-diving probing, but we should be aware that it is possible and, broadly speaking, know the methods, if only to know the appropriate search terms.

Through heroic forensic work,
he traced the code to First 4 Internet, a British provider of
copy-restriction technology that has a deal with Sony to put digital
rights management on its CDs. It turns out Russinovich was infected
with the software when he played the Sony BMG CD
Get Right With the Man by the Van Zant brothers.

Here's WIRED Magazine's take on the story, “The Cover-Up Is the Crime.”

And here's what Dan Gillmor had to say about it, with additional links.

Managing the news data flow
Nov 2nd, 2005 by Tom Johnson

We're all awash in data, so finding the significant bits and bytes that can lead to information is a maddening process.

Jon Burke, writing in the November 2, 2005 edition of MIT's Technology Review, presents some web-based technological options.  See “Finding Signals in the Noise.”

We were impressed by a new product/site called “Memeorandum,” but Burke points out a handful of alternatives.


“Few would dispute that we live in an age of
information overload. In the last few years alone, blogs have increased
the torrent of information each day to unmanageable levels.  This
would explain, then, why a corresponding torrent of startups has
surfaced recently to help us filter, manage, and control this flood of
information. Some rely on insightful algorithms that understand
popularity to filter the news, while others rely on the preferences of

For example, Digg
is a San Francisco startup that ranks news items by letting people
choose which stories they like. It just landed $2.8 million in venture
capital from Omidyar Network, former Netscape founder Marc Andreessen,
and Greylock Partners. We also understand that a comparable site — Memeorandum — may close a round of financing shortly.

The concept of making users prioritize or create hierarchies for news is not new — Slashdot
has been doing it since 1997. But the latest generation of sites like
Digg and Memeorandum are showing that user-prioritized news is, indeed,
a powerful and easy way to drive traffic — in some cases to a site
created by a single employee with a lone server.”

Simulated Journalism? Not exactly, but a topic of relevance
Nov 1st, 2005 by Tom Johnson

Simulation modeling is one of the four cornerstone areas of interest to the IAJ.  It's a relatively new, and largely unknown, field that can be of great advantage to journalists if we take the time to learn how it works and how we can apply it to our field.  The best resource to date for journalists is the J-Lab at the University of Maryland.

But today along comes this announcement of a rich issue of the Journal of Artificial Societies and Social Simulation.  It's filled with deep thinking and application.

The Journal of Artificial Societies and Social Simulation published issue 4 of Volume 8 on 31 October 2005.

JASSS is an electronic, refereed journal devoted to the exploration and understanding of social processes by means of computer simulation.  It is freely available.


This issue is our largest
ever, with 12 peer-reviewed articles, eight of them forming a special
section on Epistemological Perspectives, edited by Ulrich Frank and
Klaus Troitzsch.

If you would like to volunteer as a referee and have
published at least one refereed article in the academic literature, you
may do so by completing the form at


Peer-reviewed Articles

How Can Social Networks Ever Become Complex? Modelling the Emergence of Complex Networks from Local Social Exchanges
   by  Josep M. Pujol, Andreas Flache, Jordi Delgado and Ramon Sanguesa

Violence and Revenge in Egalitarian Societies
   by  Stephen Younger

Influence of Local Information on Social Simulations in Small-World Network Models
   by  Chung-Yuan Huang, Chuen-Tsai Sun and Hsun-Cheng Lin

It Pays to Be Popular: a Study of Civilian Assistance and Guerrilla Warfare
   by  Scott Wheeler

Special Section on Epistemological Perspectives on Simulation
   by  Ulrich Frank and Klaus G. Troitzsch

Towards Good Social Science
   by  Scott Moss and Bruce Edmonds

A Framework for Epistemological Perspectives on Simulation
   by  Joerg Becker, Bjoern Niehaves and Karsten Klose

What is the Truth of Simulation?
   by  Alex Schmid

Logic of the Method of Agent-Based Simulation in the Social Sciences:  Empirical and Intentional Adequacy of Computer
   by  Nuno David, Jaime Simao Sichman and Helder Coelho

Validation of Simulation: Patterns in the Social and Natural Sciences
   by  Guenter Kueppers and Johannes Lenhard

Stylised Facts and the Contribution of Simulation to the Economic Analysis of Budgeting
   by  Bernd-O. Heine, Matthias Meyer and Oliver Strangfeld

Does Empirical Embeddedness Matter? Methodological Issues on Agent-Based Models for Analytical Social Science
   by  Riccardo Boero and Flaminio Squazzoni

Caffe Nero: the Evaluation of Social Simulation
   by  Petra Ahrweiler and Nigel Gilbert


Book Reviews    (Review editor: Edmund Chattoe)

Edmund Chattoe reviews:
       Routines of Decision Making by Betsch, Tilmann and Haberstroh, Susanne (eds.)



The new issue can be accessed through the JASSS home page.

The next issue will be published at the end of January 2006.

Submissions are welcome.


Editor: Nigel Gilbert, University of Surrey, UK
Forum Editor: Klaus G. Troitzsch, Koblenz-Landau University, Germany
Review Editor: Edmund Chattoe, University of Oxford, UK


Searching podcasts? Yes, the tools are coming along.
Oct 12th, 2005 by Tom Johnson

Journalists often ignore audio (and video) content when researching a story.  Partly there is the “medium bias” at play (i.e. “Hey, I work in print, so that must be the most important source.”), but that bias also has something to do with the lack of search tools and the difficulty of getting those audio words into a transcript that can flow into text.  Still, there is gold in those sight-and-sound files for a reporter who can find them and take the time to extract the ore.

The always helpful blog
“PI News Link” run by Tamara Thompson posts the following:

“A new form of audio files called podcasts,
so named because they can be downloaded from the Internet to a portable
digital listening device (such as an iPod), are searchable through many
search engines.
Yahoo has just rolled out their podcast search. A keyword search of “legal” returned Involuntary Manslaughter: A Double Standard?, a broadcast with the editor of Massachusetts Lawyers Weekly. The Podcast Search Service catalogs a more extensive collection of websites with podcasts, searching terms within the site title or description. Pod Spider includes international audio files. Individual podcasts are beginning to be tagged, which will enable the searcher to uncover specific relevant audio files.”

Web scraping with Excel [Saturday highlights from the Global Investigating Journalism conference]
Oct 1st, 2005 by JTJ

Kaas, of the Danish International Center for Analytical Reporting, just presented a fascinating session on how to use Excel tools to “scrape” data off the web and import it into Excel, at least Excel XP.  This is typically helpful where one needs to extract data from standardized tables on dynamic web sites, for example those with demographic, economic or crime data.
He has posted some handouts at or

It's not yet clear to us if this is more efficient than writing Perl
or PHP scripts, but it's still an elegant hack.
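
For comparison with the script-based route, here is a minimal sketch of what table scraping might look like in Python, using only the standard library. It is not the Excel technique from the session. The HTML snippet, town names and figures are invented placeholders, and a real page would first have to be downloaded and would likely need more careful handling (nested tables, merged cells and so on).

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented placeholder table standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Town</th><th>Population</th></tr>
  <tr><td>Exampleville</td><td>1,200</td></tr>
  <tr><td>Sampleton</td><td>860</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
for row in rows:
    print("\t".join(row))
```

The tab-separated output can be pasted straight into a spreadsheet, which gets to roughly the same place as the Excel approach.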


Some great sessions at the Global Investigative Journalism Conference
Sep 30th, 2005 by JTJ

Friday's highlights from the conference in Amsterdam….

Henk van Ess gave two fine training sessions, one yesterday and one this morning.  The first:

Training 02: Forensic surfing (Thursday 14.00 – 15.15)
How can you figure out the reliability of a website –
even without opening the site? How do you find the owner of a web site? How can you see how old a page is,
even if it doesn't say 'Page last updated at..'? How do you find the author of a Word document?
Welcome to the world of forensic surfing. Extra: CD-ROM with the course 'Internet Detective' for all participants.

Watch the HTML version at

The second session:

Hacking with Google (Friday 9.30 – 10.45)

“People make mistakes. They put sensitive data
on servers. They forget to remove delicate material. They leave
directories open with hidden files. Learn how to use Google in a
different way. The best search techniques for finding secret documents
from governments, institutions and companies. Open them with the right
questions. Henk van Ess
(AD, Netherlands) teaches you what sort of words you have to type,
which special syntax you have to use and how you should interpret the
answers. Note: this training will teach you how to find material that
shouldn't be on the web. It doesn't teach you how to hack into systems.”
This presentation can be viewed at
There is a companion book, The Google Hacker’s Guide: Understanding and Defending Against the Google Hacker, by Johnny Long.  A partial section is available online.


More government employees may be removed from public records
Aug 18th, 2005 by JTJ

Tamara Thompson reports on her blog PI News Link….

~ more government employees may be removed from public records ~

By Tamara Thompson Investigations

SB 506 will add an additional group of public officials to the roster of those whose personal data is confidential. Keep this idea filed in the back of your hat. When subject to a potential threat, various government employees may apply to have their address and other identifiers removed from public records. In its current form, SB 506 makes the application for closure a public record. If the document exists, you'll know that the subject has convinced another public official that “a life threatening circumstance” exists that impels the request for confidentiality.

“This bill would require a local elections official to extend this
confidentiality of voter registration information to specified public
safety officials, upon application, as specified, for a period of no
more than two years, if the local elections official is authorized to
do so by his or her county board of supervisors. The application of a
public safety official would be a public record.”
