The John Henry fight of man v. algorithm

I interviewed Josh Cohen, product manager for Google News, this week for the Guardian MediaTalkUSA podcast (out early next week) and asked him how many clicks to news sources Google News causes. The answer: a billion.

And then I saw this PaidContent report on URL-shortener Bit.ly thinking of offering a breaking news service. That doesn’t seem so crazy when you hear how many clicks it causes a month. The answer: a billion.

It so happens I just wrote this in my Media Guardian column, coming out Monday, about the Microsoft-Yahoo search lashup:

Oh, search still matters. But it is beginning to matter a little less. Venture capitalist Fred Wilson recently pointed out that 14% of traffic to his blog, avc.com, comes from Google, down from 29% the year before. Wilson argues that the difference is Twitter—that is, links from people over algorithms. (Note that Wilson is a Twitter investor.)

Now I’m hardly saying that Google is being overrun by the power of mankind. Nor will I argue that every link Bit.ly sends people to is news, though more of it is than news organizations admit. They would see that if they were wise enough to expand the definition of news to the hyperspecific, a word a commenter below suggested I start using instead of hyperlocal. Your friend’s concert photos are news to him and you. Note also that Bit.ly isn’t the only source of human-powered live links; there’s the rest of Twitter and its other clients, not to mention Facebook and fresh blogging.

But I do think it’s significant that, given a platform to collect them, links shared by people can quickly match the power of the algorithm. I also think there’s even more power in bringing the two together.

7 Comments

  1. Conrad says:

    Some musings: is that a billion clicks a month? Equally divided by the 4500 individual sources Google claims that News has, that’s only 222,000 pageviews/month per source.

    Now, let’s say that the top 25 sources of Google News get 50% of this traffic (possible, since 77% of sources in 2007 hadn’t had a single front page story). That means it’s sending an average of 20 million page views/month to each of the sources listed here: http://searchengineland.com/revealing-the-sources-of-google-news-11353

    A not insignificant amount of traffic (although considering that some blogs get hundreds of millions of page views/month from Google, it’s not exactly something you want to scream about from the rooftops either).

  2. Jay Levitt says:

    Has anyone actually brought the two together, though?

    Google won’t do *anything* unless there’s an algorithm; they have no support desk, no human editors, no content authors. (Street View is the one exception, and I’d love to know why.)

    Twitter, Wikipedia, Facebook, Digg, etc. are all about crowdsourcing with intentionally simple technologies. No algorithms.

    It’s hard to think of a company whose DNA involves complex algorithms with significant human input. Amazon, maybe, if you consider their merchants’ product listings to be the human input. But it’s not a common business pattern.

    Is that because it’s hard to be good at both? Or just because it’s a rare management team that wants to?

  3. Bob Wyman says:

    Jay Levitt wrote: “It’s hard to think of a company whose DNA involves algorithms with significant human input.”
    Well, that’s how Yahoo! got started. Originally, they had large numbers of “ontologists” classifying and categorizing sites for their directory of web sites. They then augmented the human “meatware” with software algorithms. Over time the meatware proved to be much too expensive for Yahoo! and the several other sites that attempted to copy their approach.

    Large scale meatware driven classification moved from commercial space to the non-profit Open Directory Project (http://www.dmoz.org/) which relies on a large (but declining) number of volunteer editors. The DMOZ data is consumed by most of the large search engine providers and, in many cases, is used to help train software machine learning systems that do further classifications. Given this use of DMOZ data, you should actually consider the combination of “human input” and complex algorithms to be fairly common. You benefit from it regularly.

    bob wyman

  4. Bob P. says:

    That’s an interesting list — or it’s interesting in that it’s not at all interesting. Yes, talk about the old guard. Of course, this is two years old. I’d like to see if there’s been much change.

    Well, I see right off there’s been at least one change. Coming in at No. 23 was a rather small — compared to most of the rest — but very good news organization: The Seattle P-I. Of course, the P-I was shut down earlier this year (I know it still exists online, and I have to say I haven’t kept up. Maybe someone could comment on the quality of its journalism since.)

    Herein lies the dilemma of our current situation, of course. The 23rd most important — if that’s the right term to use — source on the Google News homepage had to be shut down because it was bleeding money.

  7. […] in a post on buzzmachine.com, one can read that Google News registers roughly 1,000,000,000 (1 billion) clicks per month. On […]
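
The back-of-the-envelope numbers in the first comment are easy to check. Here is a rough sketch in Python, assuming the figures quoted in the thread (a billion clicks a month, 4,500 sources, half the traffic going to the top 25). Those inputs are the commenter's guesses, not measured data, and the function and variable names are made up for illustration.

```python
# Rough check of the per-source arithmetic from the first comment.
# Every input below is an assumption taken from the thread, not measured data.
MONTHLY_CLICKS = 1_000_000_000  # "a billion" clicks per month
TOTAL_SOURCES = 4_500           # sources Google News is said to index
TOP_SOURCES = 25                # size of the hypothetical top tier
TOP_SHARE = 0.5                 # assumed share of clicks going to that tier


def average_clicks(total_clicks, share, source_count):
    """Average monthly clicks per source for a group receiving `share` of the total."""
    return total_clicks * share / source_count


# Every source gets an equal slice of the billion.
even_split = average_clicks(MONTHLY_CLICKS, 1.0, TOTAL_SOURCES)

# The top 25 sources split half of the traffic.
top_average = average_clicks(MONTHLY_CLICKS, TOP_SHARE, TOP_SOURCES)

# Everyone else splits the other half.
rest_average = average_clicks(
    MONTHLY_CLICKS, 1 - TOP_SHARE, TOTAL_SOURCES - TOP_SOURCES
)

print(f"Even split:   {even_split:,.0f} clicks/month per source")
print(f"Top 25:       {top_average:,.0f} clicks/month per source")
print(f"Everyone else: {rest_average:,.0f} clicks/month per source")
```

Under those assumptions the even split comes to roughly 222,000 clicks per source, the top 25 average 20 million each, and the remaining sources average a little under 112,000. These are averages, not medians, so a handful of very large sources could skew them considerably.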
