Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
Picking up some interesting Web 2.0 tools at the IRE's annual conference, this year in Phoenix.
ProgrammableWeb.com: www.programmableweb.com/ Good jumpstation for APIs, mashups, how-to info, etc.
CityCon: www.tetonpost.com/citycon/ “CityCon allows you to find detailed information about any member of the current 110th U.S. Congress. Use the Input field above to query the CityCon database and the Internet for a U.S. City, State, Senator or Representative.”
Maplight.org: www.maplight.org/ “MAPLight.org brings together campaign contributions and how legislators vote, providing an unprecedented window into the connections between money and politics. We currently cover the California Legislature and U.S.”
So the NYT did backtrack on the percent-of-change error described yesterday without assigning blame. That's fine. But the correction suggests another big story that we have only seen parts of. That is, of all the U.S. presence in Iraq — military and contractors — how many and what proportion are actually on the streets and how many and in what capacity are in support categories.
This weekend, friend-of-the-IAJ Joe Traub sent the following to the editor of the New York Times. Here's the story Joe is talking about: “White House….“
The headline error is bad enough (it's only in the hed, not in the story) and should be a huge embarrassment to the NYT. But the error gets compounded because while the Times no longer sets the agenda for the national discussion, it is still thought of (by most?) as the paper of record. Consequently, as other colleagues have pointed out, the reduction percentage gets picked up by other journalists who don't bother to do the math (or who cannot do the math). See, for example:
* CBS News, “Troop Retreat In '08?” (This video has a shot of the NYT story even though the percentage is not mentioned. Could it be that the TV folks don't think viewers can do the arithmetic?)
(NB: We could not yet find on the NPR site the transcript of the radio story that picked up the 50 percent error. But run a Google search with “cut in Troops by 50%” and note the huge number of bloggers who also went with the story without doing the math.)
Colleague Steve Doig has queried the reporter of the piece, David Sanger, asking if the mistake is that of the NYT or the White House. No answer yet received, but Doig later commented: “Sanger's story did talk about reducing brigades from 20 to 10. That's how they'll justify the “50% reduction” headline, I guess, despite the clear reference higher up to cutting 146,000 troops to 100,000.”
Either way, it is a serious blunder of a fundamental sort on an issue most grave. It should have been caught, but then most journalists are WORD people and only word people, we guess.
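For anyone who wants to check the arithmetic, here is a minimal sketch of the percent-of-change calculation, using only the figures quoted above (146,000 to 100,000 troops, 20 to 10 brigades). The function name is ours, chosen for illustration.

```python
def percent_change(old, new):
    """Return the percent change from old to new (negative means a reduction)."""
    if old == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return (new - old) / old * 100.0


# Figures quoted in the post: troop levels and brigade counts.
troops = percent_change(146_000, 100_000)
brigades = percent_change(20, 10)

print(f"Troops:   {troops:.1f}%")    # roughly a 31.5 percent cut
print(f"Brigades: {brigades:.1f}%")  # a 50 percent cut
```

The brigade count is halved, but the troop figure falls by a bit less than a third, which is why a “50 percent” headline does not match the numbers in the story.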
We would also point out the illogical construction that the NYT uses consistently in relaying statistical change over time. To wit: “… could lower troop levels by the midst of the 2008 presidential election to roughly 100,000, from about 146,000…” We wince.
English is read from left to right. Most English calendars and horizontal timelines are read from left to right. When writing about statistical change, the same convention should be followed: oldest dates and data precede newest or future dates and data. Therefore, this would best be written: “…could lower troop levels from about 146,000 to roughly 100,000 by the midst of the 2008 presidential election.”
Source: http://radar.oreilly.com/archives/2007/05/geocommons_shar.html
GeoCommons, Share Your GeoData
Posted: 23 May 2007 01:59 PM CDT
By Brady Forrest
GeoCommons is a new mapping site that allows members to use a variety of datasets to create their own maps. It provides the free geodata, a map builder tool, the ability to create heat maps, and a map hosting site. An API will be available shortly. GeoCommons comes from FortiusOne, a Washington, D.C. company. The public Beta is going to be released Monday, May 28th, at Where 2.0's launchpad.
When building a map you can use one of the 1500 data sets (with 2 billion data attributes) that they have made freely available. The data sets vary widely and include things like “Identity Theft 2006”, “Coral Reef Bleaching – Worldwide”, “Starbucks Locations – Worldwide”, and “HAZUS – Seattle, WA – Resident Demographics”. As you can see below, data can be viewed in a tabular format prior to loading it onto a map. Data sets can be combined so that you can see “The Prices of Living in NYC & SF” and “Barack vs. Clinton – Show Me the Money!”; it seems to me that Barack has more widespread support.
We are finding O'Reilly's Radar an increasingly valuable site/blog to keep up with interesting developments in Web 2.0, publishing and the general Digital Revolution. Brady Forrest's contribution below is an example.
See http://radar.oreilly.com/archives/2007/05/trends_of_onlin.html
Trends of Online Mapping Portals
Posted: 21 May 2007 04:34 PM CDT
Last week there were several announcements made that show the direction of the online mapping portals. Satellite images and slippy maps are no longer differentiators for attracting users; everyone has them, and as I noted last week, there are now companies that have cropped up to service companies that want their own maps. Some of the new differentiators are immersive experiences, owning the stack, and data!
Immersive experience within the browser – A couple of weeks ago Google Maps added building frames that are visible at street level in some cities. These 2.5D frames are very clean and useful when trying to place something on a street.
Now the Mercury News (warning: annoying reg required; found via TechCrunch) is reporting that these buildings will soon be fully fleshed out.
The Mercury News has learned that Google has quietly licensed the sensing technology developed by a team of Stanford University students that enabled Stanley, a Volkswagen Touareg R5, to win the 2005 DARPA Grand Challenge. In that race, the Stanford robotic car successfully drove more than 131 miles through the Mojave Desert in less than seven hours.
The technology will enable Google to map out photo-realistic 3-D versions of cities around the world, and possibly regain ground it has lost to Microsoft's 3-D mapping application known as Virtual Earth.
The license will be exclusive, but don't think Google will be the only one with 3-D in the browser. Microsoft has had 3-D for a while now (unfortunately, it requires the .NET framework; my assumption is that the team is busy converting it to Silverlight). 3-D is going to become a standard part of mapping applications. The trick will be making sure that the extra data doesn't get in the way of the user's quest to get information. Buildings are slow to render and can obscure directions.
This strategy is a nice complement to their current strategy of gathering and harnessing 3-D models from users. Currently these are only available in Google Earth. The primary location to get them is Google's 3D Warehouse. I suspect that we will start to see user-contributed models on Google Maps.
No word on how many cities Google will roll out their 3D models in or when the new data will be available via their API.
Data, Data, & More Data – Until recently, search engines did not provide neighborhoods as a way of searching cities. Neighborhoods are an incredibly useful, if hard to define, method of defining an area of a city.
Google has now added neighborhood data to their index, but they have not really done much with it. If you know the neighborhood name then you can use that to supplement searching a city. However, if you are uncertain or if you are unaware of the feature, then you are SOL. There is no indication that the feature exists, how widespread it is, or what the boundaries of the neighborhood are. I hope that they continue to expand on this feature.
Ask on the other hand has done a great job with this feature (see above). They surface nearby neighborhood names for easy follow-on searches (see below). They show you the bounds of the neighborhood quite clearly.
Ask is using data from SF startup Urban Mapping. Urban Mapping claims complete coverage of ~300 urban areas in the US and Canada (with Europe coming). This isn't an easy problem. Urban Mapping has been working at it for quite some time and is known for having a good data set. They have also been aggregating transit data. An interesting thing to note is that many of the same neighborhoods available on Ask are also available on Google Maps (examples: Tenderloin, SF: Google, Ask; Civic Center, SF: Google, Ask). No word yet if any of the other big engines are going to add neighborhood data, but my guess is that it will soon become a standard feature; it's too useful to not have.
Own the Stack – Until recently, Yahoo! used deCarta to handle creating directions (or routing). They have announced that they have taken ownership of this part of the stack and have built their own routing engine. Ask and Google still use deCarta. Microsoft has always had their own. Yahoo! is hoping to make their new engine a differentiator. In some ways this is analogous to Microsoft's purchase of Vexcel, a 3D imagery provider. Microsoft did not want the same 3D data as Google Earth or any other search engine for its 3D world.
I think that any vendor servicing Google, Microsoft, Ask, Yahoo or MapQuest will have to keep an eye on their next source of revenue. Those contracts aren't going to necessarily last too long. The geostack is too valuable to outsource.
There is only one part of the stack that I think *might* be too expensive for any one of the engines to buy or build outright. That's the street data, and it's a data source primarily supplied by two companies, NAVTEQ and Tele Atlas. NAVTEQ has a market cap of 3.5 billion dollars as of this writing; Tele Atlas has one of 1.4 billion pounds. These would be spendy purchases. Microsoft is currently working closely with Facet Technology Corporation to collect street data for cities to add a street-level 3D layer (see Facet's SightMap for a preview), but Facet is not collecting data to match the other players. It will be interesting to see if Yahoo! parlays its partnership with OpenStreetMap into a data play.
An interesting piece of analysis and visual infographics posted today on the O'Reilly Radar site. See http://radar.oreilly.com/archives/2007/05/baseball_team_overpaid.html
Assuming you have a baseball team, Ben Fry will let you answer that question. He has created a tool for visualizing the salary of Major League Baseball teams versus their performance in 2007. As he explains:
This sketch looks at all 30 Major League Baseball Teams and ranks them on the left according to their day-to-day standings. The lines connect each team to their 2007 salary, listed on the right.
Drag the date at the top to move through the season. The first ten days of the season are omitted because the rankings to (at least) that point are statistically silly. You can also use the arrow keys on the keyboard to move forward or backward one day.
A steep blue line means that the team is doing well for its money, which reflects well on the team's General Manager. A steep red line implies that the team is throwing away money. The thickness of the line is proportional to the team's salary relative to the others.
The images above are captures of the beginning of the season rankings (left) as compared to now (right). It looks like Boston is now at a break-even point whereas the Yankees are sinking and a bit over-paid. I wonder if any of the GM compensation decisions are made based on this tool.
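The blue/red rule Fry describes can be sketched in a few lines. This is only our reading of his description, not his code: we assume a line is blue when a team's rank in the standings is better than its rank by payroll, and red when it is worse. The team names and payroll figures below are invented placeholders, not real 2007 data.

```python
def line_colors(standings, salaries):
    """Compare each team's standings rank with its payroll rank.

    standings: list of team names, best record first.
    salaries:  dict mapping team name to payroll (any consistent unit).
    Returns a dict of team -> (standing_rank, salary_rank, color).
    """
    # Rank 1 is the largest payroll.
    by_salary = sorted(salaries, key=salaries.get, reverse=True)
    salary_rank = {team: i + 1 for i, team in enumerate(by_salary)}

    result = {}
    for i, team in enumerate(standings):
        standing_rank = i + 1
        pay_rank = salary_rank[team]
        if standing_rank < pay_rank:
            color = "blue"   # assumed: playing better than the payroll suggests
        elif standing_rank > pay_rank:
            color = "red"    # assumed: playing worse than the payroll suggests
        else:
            color = "flat"   # standings and payroll ranks match
        result[team] = (standing_rank, pay_rank, color)
    return result


# Hypothetical three-team league with made-up payrolls (in millions).
example = line_colors(
    standings=["Team C", "Team A", "Team B"],
    salaries={"Team A": 200, "Team B": 100, "Team C": 50},
)
for team, (s_rank, p_rank, color) in example.items():
    print(f"{team}: standings #{s_rank}, payroll #{p_rank}, {color}")
```

The size of the gap between the two ranks would correspond to how steep the line is drawn; the sketch reports both ranks so that gap can be read off directly.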
We're at the UCLA conference center attending the 4th Lake Arrowhead Conference on Human Complex Systems. First take:
Bill Lawless' interesting work finds that groups operating on a “consensus model” are less effective and efficient when compared to “majority model” decision-making groups.
Chasparis' work has implications for journalism institutions IF they understand that they can (should?) be the hub (or node) for facilitating transactions between users and those with the desired resources and/or between the journalistic institution and the community. The presentation is complicated and laden with equations — after all, the authors are in mechanical engineering — but study well their implications of how networks are created and emerge.
What this presentation suggests is that we could model circulation/promotion campaigns by “selling” one subscription to an individual household. Then, having planted that seed of recognition and brand AND assuming that there is neighbor-to-neighbor communication, we fertilize that seed by delivering for free our paper to the immediately adjacent neighbors. And, perhaps, we use stick-on/peel off labels to publicize something special for that node of concentration. Now we have created a potential point of commonality for the neighbors to talk about and, we hope, appreciate. The question then becomes “How can we create added value” for that cluster of subscribers.
Second point raised: Can we model what is the optimum time for subscription offers? Is 13 weeks the best, or five? Let's find out.
See Gessler's homepage (http://gessler.bol.ucla.edu/) for an excellent collection of visual and dynamic tools for modeling.
Presentation on residential segregation modeling. “Schelling suggests that segregation can emerge at the active level even if it is not sought by the residents.” Later findings (Bruch and Mare): Segregation increases with indifference to segregation. Why? Not really a lack of indifference. Also, equal granularity in the multicultural function. (See also: http://paa2006.princeton.edu/download.aspx?submissionId=60143)
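For readers who have not seen a Schelling-style model, here is a minimal sketch of the general idea. It is not the presenters' model, nor Bruch and Mare's: the grid size, the share of empty cells, the 30 percent tolerance threshold, and the rule that unhappy agents jump to a random empty cell are all our own assumptions for illustration.

```python
import random


def like_share(grid, r, c):
    """Share of occupied neighbors (8 surrounding cells, wrapping) of the same type."""
    n = len(grid)
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            other = grid[(r + dr) % n][(c + dc) % n]
            if other is not None:
                total += 1
                same += other == grid[r][c]
    return same / total if total else 1.0


def mean_like_share(grid):
    """Average like-neighbor share over all occupied cells."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(like_share(grid, r, c) for r, c in cells) / len(cells)


def run_schelling(size=20, empty_frac=0.1, threshold=0.3, steps=50, seed=0):
    """Run the sketch and return (grid, mean share before, mean share after)."""
    rng = random.Random(seed)
    n_empty = int(size * size * empty_frac)
    n_a = (size * size - n_empty) // 2
    values = ["A"] * n_a + ["B"] * (size * size - n_empty - n_a) + [None] * n_empty
    rng.shuffle(values)
    grid = [values[i * size:(i + 1) * size] for i in range(size)]
    before = mean_like_share(grid)

    for _ in range(steps):
        # An agent is unhappy if too few of its neighbors share its type.
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is not None and like_share(grid, r, c) < threshold]
        empties = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is None]
        if not unhappy or not empties:
            break
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Move the agent to a randomly chosen empty cell.
            i = rng.randrange(len(empties))
            er, ec = empties[i]
            grid[er][ec], grid[r][c] = grid[r][c], None
            empties[i] = (r, c)
    return grid, before, mean_like_share(grid)


grid, before, after = run_schelling()
print(f"Mean share of like neighbors: {before:.2f} before, {after:.2f} after")
```

Comparing the two printed shares for different thresholds and seeds is one way to explore the point quoted above, that clustering can appear even when no individual agent is asking for a segregated neighborhood.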
Conclusions:
Interesting discussion of what he terms “discourse communities.” i.e. “Dynamic interplay of cultural resources and situated identities.”
His approach is to apply a number of theoretical metrics (15 models) to building a “society” (based on good anthro data) and see which works best. An approach closely related to exploratory data analysis that analytic journalists often use.
Commonalities of models that worked well: 1) Agents were quasi-optimal (smart). 2) Agents were nonetheless diverse (heterogeneous, e.g., individual agents doing different things).
Interesting related link here for
Good presentation on simulation (computational modeling?) of the tuberculosis cycle in Tijuana plus looking at models of corruption. He points out that the Chinese population in Tijuana is growing very fast. Interesting, and valuable, application of Maslow's pyramid of needs concepts (i.e., starting with physical needs, then moving to social and moral needs).
Working on integrating Beer's Viable System Model with transactional analysis models.
Fifth Session
Objective: to make logistics systems work in/as complex adaptive models. [Essentially, this is about the best (most efficient) way to receive raw materials and deliver the finished product to customers of various types. Could have direct application for the publishing industry, if it only knew about such methods.]
They are researching how to build RFID chips into products like cars to imbue the product with enough intelligence to, for example, figure out the optimal way to get itself to a truck or ship.
PlaSMA: Multiagent-based simulation for logistics
This doesn't have anything to do with Analytic Journalism per se, but while flying from Cairo to Dubai recently I looked out the window at 39,000 feet somewhere over the sands of central Saudi Arabia. What to my wondering eyes did appear, but an expanse of pie charts.
Of course these are irrigated crops. A friend in Dubai, who grew up in Saudi Arabia, said the reason they are not all completely filled circles is because some growers don't have enough money (yet) to buy the equipment necessary to complete the 360-degree irrigation.
Our thanks to someone somewhere who pointed us to “Flashearth,” an interesting site under development that supplies links to multiple mapping programs that draw on global satellite imagery. They are: Google Maps; Microsoft VE (aerial); Microsoft VE (labels); Yahoo Maps; Ask.com (aerial); Ask.com (physical); OpenLayers; NASA Terra (daily). The sites vary in the degree of “zoomability,” but each offers slightly different capabilities and data. In any event, it is most likely worthy of a bookmark.
We realize there is a robust handful of very good infographic reporters and designers working out there for many different publications, but the gang at the NY Times just keeps on keepin' on with innovative — and 98 percent of the time — highly informative infographics and visual displays of data. Today's (25 Feb 2007) edition is a basket rich with fine examples:
* “Truck Sales Slip, Tripping Up Chrysler” (Business Section, p. 8). Offers up a complex (they often are) “treemap” of vehicle sales.
* “Who Do You Think We Are?” (Week in Review – Op-Art, p. 15). Ben Schott, author of “Schott’s Original Miscellany” and “Schott’s Almanac 2007,” a yearbook of American society, presents some basic line and bar charts, but on subjects of interest to AJ readers. Specifically, “Confidence in Institutions” (the “press” is the lowest, even below Congress) and “Newspaper Readership.” (And you already know what that graph looks like.)
* “How Two Rights Can Make a Wrong” (Week in Review – p. 5). Howard Markel, M.D. and Bill Marsh give us a fine graphic illustrating complex drug interactions.