Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
Here at the IAJ, there is growing curiosity about vlogs, blog sites that carry video. And, of course, we're always interested in maps. We recently ran across “Vlogmap.org,” a cool mash-up that integrates vlog sites with Google's mapping tools. Worth a visit, we think, and some consideration about how journalism organizations might apply the technology. “What is Vlogmap.org? VlogMap.org is an online resource which shows where participating vloggers are located around the world, along with links to key information about their video blogs. Anyone can submit info to VlogMap.org to be listed on the map, as long as you run a video blog. Why Vlogmap.org? VlogMap.org is intended to be a fun and interesting way to learn about and explore the vlogging community and its online offerings. How does it work? VlogMap visitors can click on any red pin to get links to the web address, the RSS feed, and the contact information for that location. Additionally, a user of VlogMap is able to zero in, and examine areas of vlogger concentration, such as New York City, Los Angeles, and London. Anyone can submit info to VlogMap.org to be listed on the map.”
We've long appreciated Ford Fessenden's forceful analytic journalism at the NYTimes, but a piece he has in today's Week in Review section leaves us yearning for more. In “Where Home Prices Rise Steeply, Bankruptcies Fall,” Ford raises some interesting, and appropriately inconclusive, questions about the relationship between real estate prices and the number of bankruptcies. And we're given a nicely colored map of U.S. counties and their changes in bankruptcy rates, 2000 to 2005. The four-category scale is very coarse: zero to 35 percent and greater than 35 percent, both up and down. The problem is that there are no hard numbers to put the bankruptcies in the context of county population. One or two counties down in southeastern Arizona show a greater than 35 percent decline in bankruptcies, but we know they have very sparse populations. “OK,” you might say, “there's simply no room to put all those numbers in the newspaper.”
Right, but they surely could be put online in a variety of ways. If there were three bankruptcies in 2000 and two in 2005, that's a 33 percent decline, pretty close to the 35 percent cutoff, but hardly statistically significant. I'm sure this isn't Ford's fault; he has the data and is probably far more aware of its analytic pitfalls than we are. But editors (Editors!) have to begin thinking of stories as having many facets, and work to deliver the richest possible set of data related to the stories and their context.
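To make the small-numbers problem concrete, here is a minimal sketch in Python. The county names and filing counts are hypothetical, invented for illustration; they are not taken from the Times data. It computes the percent change a map like this would shade, next to the raw counts that give that change its context.

```python
def percent_change(before: int, after: int) -> float:
    """Percent change from `before` to `after`; negative means a decline."""
    if before == 0:
        raise ValueError("percent change is undefined when the starting count is zero")
    return (after - before) / before * 100.0


# Hypothetical counties: names and counts are made up for illustration only.
counties = {
    "Sparse County": (3, 2),
    "Populous County": (3000, 2000),
}

for name, (filings_2000, filings_2005) in counties.items():
    change = percent_change(filings_2000, filings_2005)
    print(f"{name}: {filings_2000} -> {filings_2005} filings ({change:.1f}%)")
```

Both made-up counties come out at roughly a 33 percent decline and would get the same color on the map, though one reflects a single filing and the other a thousand. Publishing the counts alongside the percentages lets readers see the difference.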
One of the real challenges for journalists wanting or needing to do GIS is getting the software and map files. Often the major roadblock is the newsroom budget. We recently learned of a project that uses BitTorrent, the peer-to-peer file-sharing program, to make maps available at our favorite price: free. Check out GeoTorrent.org.
“What types of data are shared? All different types of geospatial data is shared here. For example Air and satellite photo's, as well as attribute data. What formats is the data in? Imagery is in either ECW or JPEG 2000 format. Both formats allow high levels of compression. For example 1 terabyte (1,000 gigabytes) of raw data can be compressed to just 50 GB. JPEG 2000 also provides a lossless compression algorithm, allowing for pixel-for-pixel fidelity with the original dataset. Attribute (vector) data can be distributed in any common data format e.g. shape, tab files or native data formats.” The files are often large, like the 5.5 GB “North America Landsat Mosaic,” but there appears to be a growing amount of non-U.S. data.
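The compression figure in the quoted FAQ works out to a 20:1 ratio. A quick check in Python, using only the two numbers quoted above and taking 1 terabyte as 1,000 GB, as the FAQ does:

```python
raw_gb = 1_000       # 1 terabyte, taken as 1,000 GB as in the quoted FAQ
compressed_gb = 50   # compressed size quoted in the FAQ

ratio = raw_gb / compressed_gb
savings_pct = (1 - compressed_gb / raw_gb) * 100

print(f"compression ratio: {ratio:.0f}:1")
print(f"space saved: {savings_pct:.0f}%")
```

The FAQ does not say whether that figure applies to the lossy or the lossless mode, so treat it as an upper-end illustration and not a guarantee for any particular dataset.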
This looks to be a tool with potential. Click here for the opening press release.
Technology Review's senior editor Wade Roush delivers a fine overview of Google Earth in the magazine's October issue. The piece would be especially good as an introduction to the tool/concept for someone who is relatively new to online mapping. See “Killer Maps.”
He has posted some handouts at dicar.org/global2005 or
Friday's highlights from the conference in Amsterdam: Henk van Ess gave two fine training sessions, yesterday and this morning. The first: Training 02: Forensic surfing (Thursday 14.00 – 15.15). How can you figure out the reliability of a website – even without opening the site? How do you find the owner of a web site? How can you see how old a page is, even if it doesn't say 'Page last updated at..'? How do you find the author of a Word document? Welcome to the world of forensic surfing. Extra: CD-ROM with the course 'Internet Detective' for all participants. Watch the HTML version at www.searchbistro.com/forensic.htm. The second session: Hacking with Google (Friday 9.30 – 10.45).
“People make mistakes. They put sensitive data on servers. They forget to remove delicate material. They leave directories open with hidden files. Learn how to use Google in a different way. The best search techniques for finding secret documents from governments, institutions and companies. Open them with the right questions. Henk van Ess (AD, Netherlands) teaches you what sort of words you have to type, which special syntax you have to use and how you should interpret the answers. Note: this training will teach you how to find material that shouldn't be on the web. It doesn't teach you how to hack into systems.” This presentation can be viewed at www.searchbistro.com/hack.htm. There is a companion book, The Google Hacker’s Guide: Understanding and Defending Against the Google Hacker by Johnny Long (johnny@ihackstuff.com); a partial section is at www.searchbistro.com/googlehacks.pdf
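The session materials aren't reproduced here, so what follows is only a sketch of the kind of “special syntax” the description alludes to. It assumes Google's widely documented `site:` and `filetype:` search operators. The helper `build_query`, the search term, and the domain example.gov are hypothetical, chosen for illustration.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None):
    """Join plain search terms with optional site: and filetype: operators."""
    parts = list(terms)
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    return " ".join(parts)


# Hypothetical example: spreadsheets mentioning "budget" on a placeholder domain.
query = build_query(["budget"], site="example.gov", filetype="xls")
url = "https://www.google.com/search?" + urlencode({"q": query})

print(query)
print(url)
```

Narrowing a search to one domain and one document type like this is a common way to find spreadsheets and reports that an agency has published but not linked prominently.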
http://www.sfexaminer.com/articles/2005/09/26/opinion/20050926_op03_policies.txt Using psychological science to set policy.
Profs. David Kleinbaum and Nancy Barker will present their online short course “Analysis of Epidemiologic Data” Oct. 14 – Nov. 11 at statistics.com. Topics covered in the course include: simple analysis of 2×2 tables, control of extraneous variables (including an introduction to logistic regression), stratified analysis, and matching.
David Kleinbaum, a professor at Emory University's Rollins School of Public Health, is internationally known for his textbooks in statistical and epidemiologic methods and as an outstanding teacher. He is the author of “ActivEpi” and “Epidemiologic Research: Principles and Quantitative Methods” and has also taught over 150 short courses over the past 30 years throughout the world.
Nancy Barker is a consulting biostatistician and a co-author of the “ActivEpi Companion Text,” and has over 10 years of experience teaching short courses in epidemiology and biostatistics at Emory University and the Centers for Disease Control and Prevention.
As with all online courses at statistics.com, there are no set hours when you must be online, and you can interact with the instructor over a period of 4 weeks via a private discussion board. We estimate you will need about 10 hours per week.
Registration: $399 ($299 academic) http://www.statistics.com/content/courses/epi3/index.html
Peter Bruce
pbruce@statistics.com
P.S. Also coming up – “Clinical Trial Design” Oct. 21 – Nov. 18 with Dr. Vance Berger.
statistics.com
612 N. Jackson St.
Arlington, VA 22201
USA
Another piece in The Guardian this week (some of the Brit papers are a very good read) discusses how Tesco harvests, and then replants, customer data. This is of interest because Tesco, a British company, is hankering after the U.S. grocery chain Albertson's. See “Tesco stocks up on inside knowledge of shoppers' lives” below and the “Profile of an upmarket C10 deserter” sidebar.
· Crucible database is exhaustive – and secret
· Government bodies are tapped for information
Guardian
The company refuses to reveal the information it holds, yet Tesco is selling access to this database to other big consumer groups, such as Sky, Orange and Gillette. “It contains details of every consumer in the UK at their home address across a range of demographic, socio-economic and lifestyle characteristics,” says the marketing blurb of dunnhumby, the Tesco subsidiary in question. It has “added intelligent profiling and targeting” to its data through a software system called Zodiac. This profiling can rank your enthusiasm for promotions, your brand loyalty, whether you are a “creature of habit” and when you prefer to shop. As the blurb puts it: “The list is endless if you know what you are looking for.”
This publicity material was, until recently, available on the website of dunnhumby, but now appears less forthcoming. Attempts by a number of Guardian reporters to retrieve their own personal information under the Data Protection Act led to a four month battle; the request was ultimately denied so the Guardian has appealed to the Information Commissioner. Tesco has provided some personal data held by Clubcard, the loyalty scheme that monitors members' shopping and which has been credited with fuelling the supermarket group's astronomical growth in the past decade.
But as far as Crucible is concerned, the company admits it has “put great effort into designing our services” so information is classed in a way that circumvents disclosure provisions in the Data Protection Act. Clues about the content of dunnhumby's database have appeared in the company's marketing literature. Crucible, it says, is a “massive pool” of consumer data. “In the perfect world, we would know everything we need to know about consumers. We would have a complete picture: attitudes, behaviour, lifestyle. In reality, we never know as much as we would like.” But Crucible, it suggests, has got much further than rival systems by pooling data from several sources and then using the vast Clubcard data pool to profile customers.
Together, Crucible and Zodiac can generate a map of how an individual thinks, works and, more importantly, shops. The map classifies consumers across 10 categories: wealth, promotions, travel, charities, green, time poor, credit, living style, creature of habit and adventurous.
A “Mrs Pumpkin” is cited: she makes pennies work when she shops, mostly uses cash, has a steady repertoire of products but experiments with the new, shops at various times, spends a little more on eco-friendly items, is involved with charitable giving, is rarely away and likes promotions for things she buys.
How does Tesco get the information? Clubcard is used to target promotions at particular cardholders. But Crucible is separate, and Tesco insists that while loyalty scheme data is used by Crucible, it is used anonymously rather than on a house-by-house, name-by-name basis.
Dunnhumby's chairman, Clive Humby, offers a few more clues. Companies such as Experian, Claritas and Equifax have databases on individuals and Crucible collects from them all. Any questionnaire you may have completed, any reader offers you responded to, are bought to build up a picture of attitudes and habits. Crucible also trawls the electoral roll, collecting names, ages and housing information. It uses data from the Land Registry, Office for National Statistics and other bodies to generate a profile of the area you live in. Zodiac is employed to provide a more detailed profile. The combination is valuable to many consumer goods firms: dunnhumby generated profits of £4m on sales of £28m in the last year for which accounts are available. Some £12m of business was done directly with Tesco.
Mr Humby and Edwina Dunn founded dunnhumby. The two have a reputation as shrewd operators in the marketing industry and still own shares in the firm alongside Tesco's majority stake. How the supermarket group and other customers use the data is less clear. One former employee involved in the company's marketing told the Guardian that it can be used to decide how to target offers to individuals or where to open new stores.
A Tesco spokesman said last night: “All work carried out by dunnhumby is regulated by the Data Protection Act and the Direct Marketing Association Code of Practice.” But, as the supermarket unveils yet another set of sparkling half-year figures today, one thing is clear: while past success may have been built on the company knowing its customers, Tesco plans to secure its future by knowing everyone else's customers as well.
Profile of an upmarket C10 deserter
When it comes to my personal information, I'm a natural paranoid. So when signing up for a Tesco Clubcard to get those cashback vouchers and offers, I made a point of providing as little information as the application would allow.
No matter. According to Tesco's disclosures under the Data Protection Act (DPA), in the year my card was in use the supermarket managed to build a substantial – if rather wayward – portrait of this reluctant shopper's habits. A formal DPA request, followed by numerous letters to and fro, a terse telephone conversation and finally, a fax explaining that, yes, this information would be used in a journalistic exercise, finally produced two sides of information.
Apparently, I'm a gal who hankers after “finer foods” - indeed, a “natural chef”, though friends tell me this probably has more to do with my tendency to cook with natural ingredients than any signs of being a budding Nigella. I am, Tesco determines, “upmarket” – a reference, I suspect, to my habit of buying organic food (Green & Black's mint chocolate being a particular favourite).
The database defines me through the past four years, placing me in the mysterious “C10” category for 2003, having been an “H13” a year earlier – whatever that means. My “family type” is “other,” though alternative social options are not listed. Most importantly for the supermarket, I just don't spend as much as I could there. Under “share of spend” with Tesco I am deemed to have “potential”.
My household carries a “reference number”, the date of my last visit, with branches used in the past. It says whether I have used Clubcard vouchers and correctly states I do not want my personal information to be passed to other parts of the “Tesco Group”. There is no information as to whether I am diabetic, teetotal or have a special diet.
Five slots describe my “shopping habits”, each carrying the words “Not shopped in last eight weeks”. Clearly, I'm a Tesco deserter and a prime candidate for those £10-off vouchers that have been dropping through the letter box of late.
· To learn how to get your personal information under the Data Protection Act, see www.guardian.co.uk/foi
The UK paper The Guardian carries a couple of interesting pieces this week on the British company The Press Association, or, as it is known now, the PA Group. Essentially, they demonstrate that investment in creative people who can leverage digital technology can make money.
See “The new heart of British journalism” and “Service used by every paper makes only 1% of the money.”
The new heart of British journalism
A sleepy Yorkshire town has become the hub of an international publishing operation
Martin Wainwright
Tuesday September 20, 2005
Twice now, extraordinary things have happened to the sleepy market town of Howden – little more than a village on the rich, flat land where the river Humber is joined by the Yorkshire Ouse. The first time, in the 1920s when the local airfield became the centre of Britain's airship industry, ended abruptly with the loss of the R101 (and the then air minister) in a storm over northern France. The second time is now, and it shows no sign of collapsing at all.
Quietly over a decade, Howden has become one of the biggest centres of journalism in the country. More than 650 staff of the Press Association – well over double the organisation's workforce in London – occupy buildings scattered round the quaint streets, as if an Oxbridge college had dropped in. Editorial trainees are in the Bishop's Manor, a medieval roost with jumbo plasma TV screens in the fireplaces where the Bishops of Durham used to warm up after trekking down from the north-east. Guests from London stay in a redbrick Georgian manor house which looks like something out of Jane Austen.
The high command of PA Sport has the vast, curving top floor of a purpose-built office block which replaced the town's redundant police station and magistrates' court two years ago. From here, among scores of other sports information services, Premier League goals and match analysis are texted live to mobile phones all over the world.
Howden is the main laboratory for PA's expansion from a comprehensive and reliable news-wire into the structural support for newspapers, websites, television, radio and magazines. The guts of the service is produced elsewhere, by reporters at news events, parliament or sports fixtures, but the processing and ever more imaginative marketing go on in Yorkshire.
Tony Watson, PA's editorial director, a multiple award-winner and former editor of the Yorkshire Post, relishes the innovation. Outside his office on the ground floor, reporters' material is slimmed into Teletext bulletins (“An excellent subediting exercise,” he says. “The contents have to have exactly the right wordage to fill a line across the screen.”) On the next floor up, the same data is repackaged for listings and, with extra content, for breaking-news sections on websites, including the Guardian's. On the top floor it gets reprocessed again for sport.
Another section turns it into mini-bulletins for mobiles, text-only or with pictures. There are initiatives to expand it into digital TV, with a studio just opened and a specialist journalists' training course starting next month. Although PA has always been, and remains, modestly anonymous, its Howden super-office is starting to publish on a scale most editors must envy.
Touring the main building, Watson points out a wall pinned with national and international news pages from British local newspapers. Copy has always been provided for these by PA but now staff at Howden offer story choice and complete page layout too. A couple of those magazines dished out by rail companies are produced here with advertising and printing subcontracted to regional newspaper customers of PA. A canny use of partnerships has been part of the agency's growth. The editorial centre grew out of joint working with now vanished Westminster Press. PA Weather, which now sells its meteorology to road-gritting departments as well as the media, has just taken over the other, Dutch half of the joint operation.
Howden is now full up, says Watson, whose colleague Chris Buckley, managing director of PA Sport, takes over half the middle floor on Saturdays, when football needs 70 extra staff and the listings terminals are briefly unoccupied. There has been criticism about PA pay rates – this month the National Union of Journalists published a survey showing levels as low as £12,000 a year at Howden. But the size of the operation is buoying the flagging local economy, and vacancies are quickly filled.
And now there is India. By November, 50 staff will be backing up the Yorkshire operation in offices in Mangalore, on the south-west coast of India, which are also designed to be a jumping off point for further news and sport packaging overseas. “There's tremendous interest in British sport in Asia,” says Watson, describing automated systems in Howden which text or email results, as they happen, in Cantonese, Thai, Mandarin and many other languages. “But there's also a growing number of fixtures locally, which we can handle either for other markets or for the countries involved.”
Two recent deals see PA distributing German sports results in Germany and – from this autumn – selling South African premier league reports and results within South Africa. Mr Buckley says: “They're holding the World Cup there in five years' time and Fifa has recommended the data-processing system as a model for the rest of Africa.”
After the R101 tragedy in 1930, there was gloom in Howden when glamorous airship designers stopped coming from London. Today, the “Howden Flyer”, a direct, two-hour train service from London which stops at the town six times a day to drop off largely PA clients, is only going to get busier.