Yesterday afternoon, I went to the OII to hear Ricardo Baeza-Yates, Director of Yahoo! Research Barcelona and Yahoo! Research Latin America in Santiago, Chile:
In this talk we explore the current impact of social media or social networks, commonly called Web 2.0, where content is generated by users in sites like Yahoo! Answers, Flickr, YouTube or Del.icio.us. This phenomenon puts forward new research challenges that involves not only computer science, but also economy and psychology, just to mention a couple of related fields. We call this emerging new science, community systems, and we mention some of the issues that we are studying, as well as further open problems.
The webcast will, as ever, appear here, but here are some figures, thoughts and ideas that stuck:
- an estimated 5 billion people will be connected to the web by 2015
- today, there are 1.8 billion mobile phones
- 500 million people are expected to have mobile broadband connectivity by 2010
- the volume of internet traffic has increased 20 times in the last 5 years
- there are more than 110 million web servers
Yahoo:
- handles >4 billion page views per day
- processes 12 terabytes of data per day
- handles 2 million mail+IM messages per day
Ricardo put up a slide of what I now think of as the Bradley Horowitz creators/synthesisers/consumers pyramid (see here), followed by another showing the three groups arranged in concentric circles. The history of the web has moved from 'public web' (first 10 years) to 'my web' to 'our web', and consuming has now become a form of content production.
And so to user-generated content (UGC). In South Korea, a leading early adopter, 43.2% of the population with internet access has published UGC and 76.2% has used it. Examples of 'our web': Yahoo! Answers (the idea originated in South Korea), LAUNCHcast (Last.fm might have worked better with his audience yesterday, but the point was taken) and Flickr, where fewer than 10 employees were "aided" by millions in the Flickr community. No surprise that Ricardo referred several times to James Surowiecki's The Wisdom of Crowds. (Pointers: espgame.org; peekaboom.org.)
Yahoo!'s vision: better search through people and our trillions of artifacts. There are many questions and challenges (e.g. how to deal with spam, how to establish and factor in a user's reputation, what role the community of users plays, what the incentive mechanisms are, and where else the power of the people can be leveraged), but the underlying drive is to put the wisdom of crowds to work, milking query actions (breakdown: 25% informational, 40% navigational and 35% transactional). (Pointer: Yahoo!'s Mindset research site.) Semi-gnomic conclusion: Yahoo! is not seeking to personalise the search query but to personalise the search task (= active information supply driven by user activity and context).
The webcast is worth watching when it's up. There's much more in Ricardo's talk than I've tried to catch here (search language, folksonomic tagging and the inter-relatedness of meanings that Yahoo! explores in search queries …), and I came away puzzled by a couple of things said or asked about blogs, but I liked the emphasis on the web as 'scientifically young'.

