At Open Tech, I was more than a little struck by the figures Jeremy Zawodny was reeling off, climaxing in the bald statement that Yahoo! has billions (2, 3, 4) of page views daily and 'billions and billions' of pages indexed: 'and no, I won't say how many billions we have' (about 15 minutes into the mp3 recording of his talk).
I put my surprise at the sheer scale of the current Yahoo! search operations we were being invited to imagine (even allowing for sales pitch) down to naivety, so I didn't ask him about these figures. But I see now that this posting on the Yahoo! search blog has led to some informed reaction:
As it turns out we have grown our index and just reached a significant milestone at Yahoo! Search – our index now provides access to over 20 billion items. While we typically don't disclose size (since we've always said that size is only one dimension of the quality of a search engine), for those who are curious this update includes just over 19.2 billion web documents, 1.6 billion images, and over 50 million audio and video files.
John Battelle has swung into action:
… there are some tantalizing examples (I will add some in the next post) that one might expect would yield significantly different results between Yahoo and Google, given Yahoo's massive new size, but don't. The math, in essence, seems not to be adding up. At least, that is what the Google scientists are saying. But then again, I am not a mathematician, and there are always at least two sides to the story. So stay tuned and we'll see how this one plays out ...

