Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
From the good ol' Librarians' Index to the Internet comes a useful site/toolbox for learning and teaching stats. “The Claremont Colleges' ‘Web Interface for Statistics Education’ (WISE) seeks to expand teaching resources offered through Introductory Statistics courses, especially in the social sciences. This project aims to develop an on-line teaching tool to take advantage of the unique hypertextual and presentational benefits of the World Wide Web (WWW). This teaching tool's primary application is as a supplement to traditional teaching materials, addressing specific topics that instructors have difficulty in presenting using traditional classroom technologies. The tool serves to promote self-paced learning and to provide a means for advanced students to review concepts.”
Once in a great while, a scholarly event occurs that, at least in hindsight, proves to be a milestone in the sociology of knowledge. In modern times we have seen the 1975 Asilomar Conference on safety and regulation of recombinant DNA technologies, for example.
This past March 15-16, 2005, a National Science Foundation-sponsored conference was held outside Washington, D.C. that, 20 years hence, might prove to be a similar milestone. While apparently no journalists participated, it was a meeting of great portent for us, especially those of us who consider our best work to be a solid social science endeavor.
The conference was called: “SBE/CISE Workshop on Cyberinfrastructure for the Social Sciences.” Glossary time: “SBE” means “Social, Behavioral, and Economics.” “CISE” means “Computer & Information Science & Engineering.” Cyberinfrastructure? Well, you can figure that one out.
The workshop concept:
“Cyberinfrastructure is the coordinated aggregate of software, hardware and other technologies, as well as human expertise, required to support current and future discoveries in science and engineering. The challenge of Cyberinfrastructure is to integrate relevant and often disparate resources to provide a useful, usable, and enabling framework for research and discovery characterized by broad access and “end-to-end” coordination.
“Today, most Cyberinfrastructure efforts are focused on the development and integration of Cyberinfrastructure technologies and resources. Fewer efforts have focused on the immense repercussions of the social dynamics and organizational, policy, management and administration decisions inherent in developing and deploying Cyberinfrastructure. Such choices, and the social, cultural, and behavioral impacts of how we develop, manage, and evolve Cyberinfrastructure will be critical to its success.
“Recommendations and Challenges
· Summary Recommendation 1: Develop and deploy enabling data-oriented Cyberinfrastructure targeted to the social and behavioral sciences.
· Summary Recommendation 2: Develop and deploy targeted toolkits, virtual, and computational environments for facilitating social and behavioral science research.
· Summary Recommendation 3: Instrument and design technologies to gather and provide key data for social scientists. Conversely, utilize human and computer interaction data to instrument and design Cyberinfrastructure technologies.
· Summary Recommendation 4: Ensure that confidentiality, privacy, and other social and policy considerations are included as part of the architecture of Cyberinfrastructure.
· Summary Recommendation 5: Involve social and behavioral scientists in the design of organizational frameworks, incentive structures, collaborative environments, decision-making protocols, and other social aspects of Cyberinfrastructure.
· Summary Recommendation 6: Develop adequate funding models for Cyberinfrastructure that will enable social and behavioral science research.
· Summary Recommendation 7: Develop explicit venues for funding inter-disciplinary SBE and CISE research on the social impacts of Cyberinfrastructure.
· Summary Recommendation 8: Develop the community for Cyberinfrastructure and Social Sciences through targeted funding programs, meetings, workshops, conferences, and other activities.”
Read through these recommendations, replacing phrases like “social and behavioral science” with “journalism,” and you have a good mission statement for what we must do in the next 20 years. I encourage you to read the well-written final report of the Cyberinfrastructure meeting. Yes, parts will seem esoteric to reporters being pushed to turn out four news stories and a Sunday feature every day. But we hope that at least some editors with the vision thing and some journalism educators will read it and try to climb aboard the Cyberinfrastructure train.
We've been using a variety of web-based bookmarking tools for the past four or five years, tools like the now-departed Blink and Backflip. They were all OK (so long as they remained financially viable), but never quite seemed to meet all our needs. Recently, though, we learned about Furl (www.furl.net) and we like what we see. Furl is in beta, so we don't know what the ultimate price will be, but journalists will like the ease with which they can pull URLs off a web page, mark up those saved pages with keywords, copy and paste webpage annotations, and then file the citation in a folder of their own making. Oh yeah, you can also save and e-mail the link(s) to anyone. In fact, we like Furl so much, we will be demo-ing it next week at the IRE conference in Denver.
As the Furl gang says: “Furl will archive any page, allowing you to recall, share, and discover useful information on the Web. Browse your personal archive of Web pages, and subscribe to other archives via RSS.”
Check it out.
It can.
The NYT this morning tells us that “Big News Media Join in Push to Limit Use of Unidentified Sources.” Readers are told:
“Concerned that they may have become too free in granting anonymity to sources, news organizations including USA Today, The Washington Post, The Los Angeles Times, NBC News and The New York Times are trying to throttle back their use.
“But some journalists worry that these efforts could hamper them from doing their jobs – coming in a hothouse atmosphere where mistrust of the news media is rampant, hordes of newly minted media critics attack every misstep on the Web, and legal cases jeopardize their ability to keep unnamed news sources confidential….
“Last year, The New York Times adopted a more stringent approach to its treatment of confidential sources, including a provision that the identity of every unidentified source must be known to at least one editor. A committee of the paper's journalists recently recommended that the top editors put in place new editing mechanisms to ensure that current policies are enforced more fully and energetically.”
We look forward to these “new editing mechanisms.”
Yes, policies on unnamed sources should be made, those policies should be clear, and everyone in the newsroom should know what they are. But more often (as in “every day”), editors must know who the sources — indeed, all the sources — are for a story, how to reach those sources, and how to verify what the reporter wrote, even if the reporter is out of pocket.
This is not difficult if journalists recognize that a PC-based word processing application already has the tools to assist in this “Who Are The Sources” mission. (If the publication is still using something like the old Coyote terminals, sorry, we probably can't help you.)
The tool is the “comment” function in the word processor. While the newsroom is making policies about sourcing, add this one: “Every paragraph of every story will end with an embedded comment. That comment will show editors exactly how the reporter knows what he or she just wrote.” The comment might include a source's name, phone number and the date, time and place of the interview. It might include a URL or a bibliographic citation. It might include a reference to the specific reporter's notebook. But in the end, the comments should be sufficient that an editor can “walk the cat backward” and reconstruct exactly where each statement came from. Doing so helps prevent unwarranted assumptions and errors of fact, if not interpretation.
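For newsrooms that want to check compliance with such a policy, here is a minimal sketch, assuming stories are filed as Word .docx documents. The file name "story.docx" and the script itself are hypothetical illustrations, not an existing newsroom tool; it simply reads the document's underlying XML and flags any paragraph that carries no embedded comment.

    # A hedged sketch: list paragraphs in a .docx that have no embedded comment.
    # Assumes Word's standard OOXML layout (word/document.xml inside the zip container).
    import zipfile
    import xml.etree.ElementTree as ET

    W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

    def paragraphs_without_comments(docx_path):
        with zipfile.ZipFile(docx_path) as z:
            body = ET.fromstring(z.read("word/document.xml"))
        flagged = []
        for number, para in enumerate(body.iter(W + "p"), start=1):
            text = "".join(t.text or "" for t in para.iter(W + "t"))
            has_comment = para.find(".//" + W + "commentReference") is not None
            if text.strip() and not has_comment:
                flagged.append((number, text[:60]))
        return flagged

    if __name__ == "__main__":
        for number, snippet in paragraphs_without_comments("story.docx"):
            print(f"Paragraph {number} lacks a sourcing comment: {snippet}")

A check like this could sit on the copy desk, so a story with uncommented paragraphs bounces back to the reporter before an editor ever reads it.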
There will be those of the Burn-Your-Notes School of libel defense who will contend this comment thing is suicidal. We would suggest, first, that very few stories ever become court cases. Second, remember that truth is the first defense in libel actions, and it is our responsibility to deliver that truth.
NYTimes science writer Gina Kolata publishes an interesting – and for her, atypical – story Sunday related to content analysis and the integration of statistical and graphic tools. (See “Enron Offers An Unlikely Boost To E-Mail Surveillance.”) The data under the digital microscope? One and a half million e-mails sent by the good folks at Enron that were posted to the Web in 2003 by the Federal Energy Regulatory Commission. She writes:
“Scientists had long theorized that tracking the e-mailing and word usage patterns within a group over time – without ever actually reading a single e-mail – could reveal a lot about what that group was up to. For example, would they be able to find the moment when someone's memos, which were routinely read by a long list of people who never responded, suddenly began generating private responses from some recipients? Could they spot when a new person entered a communications chain, or if old ones were suddenly shut out, and correlate it with something significant?
“There may be commercial uses for the same techniques. For example, they may enable advertisers to do word searches on individual e-mail accounts and direct pitches based on word frequency.”
Gee, scientists doing the theorizing? Advertisers doing word searches? Might not “tracking the e-mailing and word usage patterns” be a good tool for journalists to think about using? Are there any journalism departments out there teaching anything about applied content analysis? It appears so. At least Mark Miller, formerly of the University of Tennessee, was doing so a decade ago. And there are some other interesting attempts, here and here by the Project for Excellence in Journalism. But it appears nothing as methodologically sophisticated as that carried out by the computer scientists and political scientists is being done by journalists.
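To make the idea concrete, here is a minimal sketch of tracking word usage over time, assuming the messages have been exported to a CSV with hypothetical "date", "sender" and "body" columns. The file name and search term are made up for illustration; this is not the researchers' actual method, just the shape of the counting a reporter could do.

    # A hedged sketch: count how often a term appears, per sender per month,
    # across a CSV export of e-mail messages.
    import csv
    from collections import Counter, defaultdict
    from datetime import datetime

    def term_trend(csv_path, term):
        counts = defaultdict(Counter)  # (year, month) -> Counter keyed by sender
        term = term.lower()
        with open(csv_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                sent = datetime.fromisoformat(row["date"])
                hits = row["body"].lower().split().count(term)
                if hits:
                    counts[(sent.year, sent.month)][row["sender"]] += hits
        return counts

    if __name__ == "__main__":
        trend = term_trend("mail_sample.csv", "contract")
        for (year, month), senders in sorted(trend.items()):
            print(year, month, senders.most_common(3))

The same loop run over, say, city council e-mail released under a records request could show the month a topic suddenly moves from routine memos into private exchanges.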
Last week, NOAA predicted a serious hurricane season a'comin' in the Atlantic, which has implications for the entire U.S. East Coast. That's last week's news, but if one lives in California, Mexico, Central America or Japan, then today there's always the possibility of a major shaker. And those are just the risks imposed by nature. Modeling these and other hazards of life is the mission of RMS, a fascinating California company demonstrating innovative thinking and analytic tools.
“RMS brings together a unique, multidisciplinary team of experts to create solutions for its clients’ natural hazard and financial risk management challenges. We are the technical leader in our market, with over 100 engineers and scientists devoted to the development of risk models. Of this number, approximately fifty percent hold advanced degrees in their field of expertise.
Our specialists track research among leading experts and academic institutions worldwide, and supplement this knowledge with internal R&D to ensure that our models provide the most complete and accurate quantification of risk.”
Yup — our kind of guys. Examples of the output of these “risk models” can be found here. Of special interest to U.S. journalists are the Catastrophe Risk maps. (They are a bit too small to read in detail, but big enough to get the gist of some of the RMS product.)
We hope to report more next week about RMS, how it does what it does and how there might be some synergy there for analytic journalists.
Here at the IAJ we believe one of the reasons people come to newspapers or broadcast stations is to get the data which, upon analysis, they can turn into information that helps them make decisions. Ergo, the more meaningful data a journalistic institution can provide, the greater value that institution has for a community. A good example arrived today thanks to Tara Calishain, creator of ResearchBuzz. She writes:
“Getcher Cheap Gas Prices on Google Maps” <http://www.researchbuzz.org/getcher_cheap_gas_prices_on_google_maps.shtml>
“Remember when I was saying that I would love a Gasbuddy / Google Maps mashups that showed cheap gas prices along a trip route? Turns out somebody has already done it — well, sorta. You can specify a state, city (only selected cities are available) and whether you're looking for regular or diesel fuel. Check it out at http://www.ahding.com/cheapgas/”
The data driving the map are ginned up by GasBuddy.com. It's not clear how or why GasBuddy gets its data, but it offers some story potential for journalists and data for news researchers. It also has an interesting link to dynamic graphs of gas prices over time.
Surely the promotion department of some news organization could grab onto this tool, tweak it a bit, promote the hell out of it, and drive some traffic to and build loyalty for the organization's web site.
That's the obvious angle, but what if some enterprising journo started to ask some questions of the data underlying the map? What's the range in gas prices in our town/state? (In Albuquerque today, the range was from $2.04 to $2.28.) Are there any demographic or traffic flow match-ups to that price range? How 'bout the variance by brand?
Would readers appreciate this sort of data? We think so, especially if there was an online sign-up and the news provider would deliver the changing price info via e-mail or IM much like Travelocity tells us when airline ticket prices change by TK dollars.
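For the price-range and brand-variance questions above, the arithmetic is simple once the listings are scraped into a table. Here is a minimal sketch, assuming a hypothetical CSV with "station", "brand" and "price" columns; this is not GasBuddy's actual feed, which we haven't seen, just the kind of summary a reporter might run on it.

    # A hedged sketch: citywide price range, plus mean and spread by brand.
    import csv
    from collections import defaultdict
    from statistics import mean, pstdev

    def price_summary(csv_path):
        by_brand = defaultdict(list)
        with open(csv_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                by_brand[row["brand"]].append(float(row["price"]))
        every_price = [p for prices in by_brand.values() for p in prices]
        print(f"Citywide range: ${min(every_price):.2f} to ${max(every_price):.2f}")
        for brand, prices in sorted(by_brand.items()):
            print(f"{brand}: mean ${mean(prices):.2f}, spread {pstdev(prices):.3f}, stations={len(prices)}")

    if __name__ == "__main__":
        price_summary("gas_prices.csv")

Match that output against a neighborhood demographic table or a traffic count, and the "why" questions start to write themselves.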
Regional Economic Models Inc. cordially invites you to join us on June 7th for a teleconference regarding Base Realignment and Closure (BRAC). On Friday, May 13th, the Department of Defense released its Recommendations to the BRAC commission. We feel that a discussion of BRAC studies and analysis methods would be helpful to a number of communities. Topics to be discussed include:
· Demographic effects of active military, reservists, and dependents.
· Migration effects of re-alignment or closures.
· Dynamic effects of government spending over time.
· The impacts of lost or reduced civilian contracts.
· Previous BRAC studies using the REMI model.
· Other topics by REMI guest speakers.
A presentation will be sent out before the call in order to direct and facilitate discussion. There will be two teleconferences taking place on the 7th, one at 10 a.m. and one at 4 p.m. EST, hosted by Frederick Treyz and Jonathan Lee. There is no fee for participation, but space is limited. If you are planning on joining us or would like to participate in the discussion, please respond to this e-mail, register online at www.remi.com, or contact us by phone at (413) 549-1169. We look forward to speaking with you in June!
Yours truly,
Frederick Treyz, Ph.D., Chief Executive Officer
Regional Economic Models, Inc.
306 Lincoln Ave., Amherst, MA 01002
T. 413-549-1169 F. 413-549-1038
Fredtreyz@remi.com
www.remi.com
The power of good infographics is that they can greatly aid in the upstream aspects of journalism — providing insight for journalists to understand what's happening with a particular phenomenon — and then downstream, to help journalists tell the story and to help the audience understand it. The Digital Revolution has upped the ante far beyond what good ol' Leonardo was using and envisioning. One of the innovators in today's datasphere is Alexander Tsiaras. A recent story in Digital Journal has this to say about Tsiaras's company, Anatomical Travelogue:
“Digital Journal — At ideaCity04, one presenter was so overflowing with information that host Moses Znaimer had to enter stage right and patiently sit beside him, a silent reminder to wrap it up. But you couldn’t ask Alexander Tsiaras to gloss over the wonders of the human body, from blood flow to cell mutation.
During his presentation, he showed images from his visualization software company Anatomical Travelogue, whose clients include Nike, Pfizer and Time Inc. Tsiaras and his 25 employees take data from MRI scans, spiral CT scans and other medical imaging technologies, and use them to create scientifically accurate 3D pictures and animations.
In 2003, his book of images of fetal development, From Conception to Birth, sold 150,000 copies and his latest work is Part Two of this fantastic voyage, The Architecture and Design of Man and Woman. For a chapter on sex, Tsiaras even scanned an employee doing the deed with his girlfriend — all in the name of science.”
Jump into the study of epidemiology with Prof. David Kleinbaum and Prof. Nancy Barker in the online course “Fundamentals of Epidemiology” at statistics.com, June 10 – July 15. Using their electronic textbook “ActivEpi”, this introductory course emphasizes the underlying concepts and methods of epidemiology. Topics covered include study designs (clinical trials, cohort studies, case-control studies, and cross-sectional studies) and measures of disease frequency and effect.
Dr. Kleinbaum, professor at Emory University, is internationally known for his textbooks in statistical and epidemiologic methods and as an outstanding teacher. He is the author of “Epidemiologic Research: Principles and Quantitative Methods”, “Logistic Regression: A Self-Learning Text”, and “Survival Analysis: A Self-Learning Text”. Prof. Barker is a consulting biostatistician, a co-author of the “ActivEpi Companion Text”, and has over 10 years of experience teaching short courses in epidemiology and biostatistics at Emory and at the Centers for Disease Control and Prevention.
The course takes place online at statistics.com in a series of 5 weekly lessons and assignments. Course participants work directly with both instructors via a private discussion board. Participate in the course at your own convenience; there are no set times when you are required to be online.
For registration and information: http://www.statistics.com/content/courses/epi1/index.html
Peter Bruce
courses@statistics.com
P.S. Coming up June 3 at statistics.com: “Toxicological Risk Assessment” and “Using the Census's new 'American Community Survey'” and, on June 10, “Categorical Data Analysis.”