The DataWeb and the DataFerrett
Jan 3rd, 2008 by Tom Johnson

Marylaine Block's always informative “Neat New Stuff” [Neat New Stuff I Found This Week] tipped us to the DataWeb site and its interesting tool, the DataFerrett (or “dataferrett”).

“TheDataWeb is a network of online data libraries that the DataFerrett application accesses the data through. Data topics include, census data, economic data, health data, income and unemployment data, population data, labor data, cancer data, crime and transportation data, family dynamics, vital statistics data, . . . As a user, you have an easy access to all these kinds of data. As a participant in TheDataWeb, you can publish your data to TheDataWeb and, in turn, benefit as a provider to the consumer of data.”

What is the DataFerrett?
DataFerrett is a unique data mining and extraction tool. DataFerrett allows you to select a databasket full of variables and then recode those variables as you need. You can then develop and customize tables. Selecting your results in your table you can create a chart or graph for a visual presentation into an html page. Save your data in the databasket and save your table for continued reuse. DataFerrett helps you locate and retrieve the data you need across the Internet to your desktop or system, regardless of where the data resides. DataFerrett:
* lets you receive data in the form in which you need it (whether it be extracted to an ASCII, SAS, SPSS, or Excel/Access file);
* lets you move seamlessly between query, analysis, and visualization of data in one package; and
* lets data providers share their data more easily and manage their own online data.
DataFerrett runs from the application icon installed on your desktop.

Check it out at


Zotero: I think they've got it this time
Oct 5th, 2007 by Tom Johnson

Yes, call us fickle and lacking in loyalty when it comes to note-taking and research organization tools.  Does anyone else remember the 5×8 cards with holes punched on all four perimeters?  You entered “tags” or keywords by clipping out the outer edge of the hole, and when you needed to find a particular note card, a knitting needle-sized wire was inserted into the whole pack.  Shake the cards and the desired note fell out.  Sometimes.

Since going digital 25 years ago, we've tried dozens of tools to bring some order to what we've turned up online and need to save.  Most were fine innovations and advances at the time, but there was often something that didn't quite meet all of our needs or desires.  That still might be true, but a new entry in the research management derby (thanks to the cite from The Scout Report quoted below) delivers up an impressive new tool.

Zotero is a Firefox extension with rich, intuitive tools that are flexible enough to support the way YOU want/need to work.  This is only version 1.0, but I think I have a new best friend.


“It can be hard to keep Tom Wolfe and Thomas Wolfe straight at times, and if you are working on an academic paper that incorporates both of these august characters, you probably want to keep those research sources in good order. Thanks to Zotero, it is very easy to do just that. Zotero is a Firefox extension that helps users collect, manage, and cite their research sources. Zotero can automatically capture citation information from web pages, store PDF files, and also export these citations with relative ease. This very helpful extension is compatible with computers running Firefox 2.0.” [KMG]


The "Traditional Future" of library research
Sep 18th, 2007 by Tom Johnson

From O'Reilly Radar's Publishing blog comes this interesting item. See

The Traditional Future

“A prominent U.S. sociologist and student of professions, Andrew Abbott of the University of Chicago, has written a thought-provoking thesis on what he terms “library research” — that is, research as performed with library-held resources by historians, et. al, via the reading and browsing of texts — compared to social science research, which has a more linear, “Idea->Question->Data->Method->Result” type of methodology.

“The pre-print, “The Traditional Future: A Computational Theory of Library Research,” is full of insights about library centric research, including intriguing parallels between library research and neural net computing architectures; a comparison that made me think anew, and with more clarity, about how the science of history is conducted. Armed with a distinctive interpretation of library research, Abbott is able to draw some incisive conclusions about the ramifications of large repositories of digitized texts (such as Google Book Search) on the conduct of scholarship…”


Tracking the bucks all the way to court
Oct 2nd, 2006 by JTJ

Another unique investigation by The New York Times gets A1 play in this Sunday's edition (1 Oct. 2006) under the hed “Campaign Cash Mirrors a High Court's Rulings.”  Adam Liptak and Janet Roberts (who probably did the heavy lifting on the data analysis) took a long-term look at who contributed to the campaigns of Ohio's Supreme Court justices.  It ain't a pretty picture if one believes the justices should be above lining their own pockets, whether it's a campaign fund or otherwise.

In any event, there seems to be a clear correlation between contributions (and their sources) and the outcome of too many cases.  A sidebar, “Case Studies: West Virginia and Illinois,” would suggest there is much to be harvested by reporters in other states.

There is, thankfully, a fine description of how the data for the study was collected and analyzed.  See “How Information Was Collected.”

There are two accompanying infographics; one (“Ruling on Contributors' Cases”) is much more informative than the other (“While the Case Is Being Heard, Money Rolls In”), which is a good, but confusing, attempt to illustrate difficult concepts and relationships.

At the end of the day, though, we are grateful for the investigation, data crunching and stories.

Library on the moon
Sep 21st, 2006 by Tom Johnson

Friend Laura Soto-Bara posts the following to the NewsLib listserv:

Library on the moon
Wednesday, September 20, 2006

The moon might be a good place for a massive storehouse of digital
information, sort of a Lunar Library of Alexandria. That's the idea
proposed by NASA scientist David McKay, who ten years ago led the team
that announced that a Mars meteorite contained evidence of life.
According to the New Scientist blog, McKay says the lunar library could
be stored on computers buried in the ground, placed inside craters, or
located in hollow lava tubes….  From the post:

The benefits of lunar storage are that there is no oxygen to erode the
material, constant sub-freezing temperature and the Moon is currently
free of all of the havoc wreaked by humankind…

Families could even pay a fee to preserve photographs in the lunar
library for future civilizations. McKay calls it the “ultimate time

Brushing up on sophisticated searching techniques
Aug 22nd, 2006 by Tom Johnson

A helpful piece posted today on reminding us that just throwing what we 
think might be appropriate keywords into a search engine isn't the most
sensible research strategy. While you might find that the title of the article
is not exactly what it is about, the content is helpful. Here are the main points:

“Summary of Web Search Strategies

  • Determine appropriate search engines to recover information in both the Surface and the Deep Web.
  • Structure the search query with punctuation and groups for the maximum
  • Use date restrictions to narrow the results.
  • Consider narrowing searches by using intitle, domain or specific site-limited searches.
  • Use link checks to “Shepardize” the results.”
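
For readers who like to see such advice made concrete, here is a minimal sketch (ours, not the article's) of how a structured query might be assembled in Python. The function name build_query and its parameters are hypothetical. The quoted phrase, the parenthesized OR group, and the intitle: and site: operators follow syntax that Google and several other engines accept, but support varies by engine, so treat this as an illustration and not a guaranteed recipe.

```python
def build_query(all_terms, any_terms=(), phrase=None, intitle=None, site=None):
    """Assemble a search query string from structured parts.

    Hypothetical helper for illustration only; the operator syntax
    (quotes, OR, intitle:, site:) is assumed to match the target engine.
    """
    parts = list(all_terms)                      # plain keywords, all required
    if phrase:
        parts.append(f'"{phrase}"')              # exact-phrase match
    if any_terms:
        parts.append("(" + " OR ".join(any_terms) + ")")  # grouped alternatives
    if intitle:
        parts.append(f"intitle:{intitle}")       # word must appear in the page title
    if site:
        parts.append(f"site:{site}")             # limit to one site or domain
    return " ".join(parts)


query = build_query(
    ["campaign", "contributions"],
    any_terms=["judge", "justice"],
    phrase="supreme court",
    intitle="ohio",
    site="gov",
)
print(query)
```

The point of the structure is that each narrowing step from the summary (punctuation and grouping, intitle, site-limited searching) becomes an explicit, optional argument that can be switched on or off while you refine a search.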

“It's Not Rocket Science: Making Sense of Scientific Evidence,” by Paul

    Tracking people and public records
    Jul 21st, 2006 by JTJ

    Pete Weiss sends the following helpful tip to the CARR-L listserv:

    Abstracted from Genie Tyburski's TVC-Alert list:

    “(20 Jul) Ballard announces the completion of the Database
    of Sources on The Virtual Chase. Released in beta during April of this
    year, the database contains abstracts and links to Web-based sources of
    information for conducting research on companies or people and for
    finding legal or factual information. You may browse the database by
    subject or search it by keyword.


    At Virtual Chase

    Database of Sources

    Use the search box above to
    query our database of resources for finding legal or factual
    information or information about companies or people. Use the
    site search engine to expand your
    query to other resources available on The Virtual Chase.

    Information Guide
    – find annotated resources for
    conducting company research

    People Finder Guide
    – find annotated resources for conducting people

    Legal Research
    – find annotated resources for finding legal
    or factual information

    U.S. Terror Targets: Petting Zoo and Flea Market?
    Jul 13th, 2006 by JTJ

    Regular readers know that the IAJ has long been interested in the quality of the data in public records databases.  The NY Times of 12 July 2006 carries a front-page story by Eric Lipton on just how bad the data is in the “National Asset Database.”  As Lipton's story points out:

    “The National Asset Database, as it is
    known, is so flawed, the inspector general found, that as of January, Indiana,
    with 8,591 potential terrorist targets, had 50 percent more listed sites than
    New York (5,687) and more than twice as many as California (3,212), ranking the
    state the most target-rich place in the nation….

    “But the audit says that lower-level
    department officials agreed that some older information in the inventory “was
    of low quality and that they had little faith in it.”

    “The presence of large numbers of out-of-place
    assets taints the credibility of the data,” the report says.”

    Sigh.  This is not a new problem, or even one that we can hang on the Bush Administration.  It started with the Clinton Administration in 1998.  “In 1998, President Clinton issued Presidential Decision Directive No. 63
    (PDD-63), Critical Infrastructure Protection, which set forth principles for
    protecting the nation by minimizing the threat of smaller-scale terrorist attacks
    against information technology and geographically-distributed supply chains
    that could cascade and disrupt entire sectors of the economy.” [Source here.]

    Link to the PDF of the Inspector General's Report at

    Some well-deserved recognition for news researchers
    Jul 5th, 2006 by JTJ

    Many of us have long recognized that a top-flight team of news researchers is the marrow of any good news operation.  So it is that we point you to a recent column in The Washington Post.


    The Post's Unsung Sleuths

    By Deborah Howell
    Sunday, July 2, 2006; B06

    reporting that appears in The Post is supported by an infrastructure of
    research that readers do not see, except as credited in the occasional
    tag line at the end of a story.

    Those tag lines don't begin to
    acknowledge the work done for reporters and readers by the News
    Research Center. The musty newspaper morgue of lore, brimming with
    crumbling clippings in tidy little envelopes, is now full of computers
    and researchers that Post journalists can't live without. Yes, there's
    still paper — about 7,500 books, 30 periodicals a month and 15 daily

    Center director Bridget Roeber said the researchers
    are “news junkies, who see themselves not just as librarians but
    journalists finding and analyzing original documents, tracking people
    down, finding leads, using obscure databases.”

    Ver 1.0 — The beat goes on
    Apr 18th, 2006 by JTJ

    We're pulling together the final pieces following the Ver 1.0
    workshop in Santa Fe last week.  Twenty journalists, social
    scientists, computer scientists, educators, public administrators and
    GIS specialists met in Santa Fe April 9-12 to consider the question,
    “How can we verify data in public records databases?” 

    The papers,
    PowerPoint slides and some initial results of three breakout groups are
    now posted for the public on the Ver1point0 group site at Yahoo.  Check it out.
