This picture describes the Invisible Web, the deep layers of the web that search engines such as Google cannot reach. ©
Flickr produces much better image search results than Google. YouTube produces much better video search results than Google. Twitter produces much better search results for real-time conversations than Google. There are numerous special cases where Google just does not produce the best possible results. Why is this?
Historically, the Internet has been a collection of random web pages scattered across the world. Search engines such as Google have always excelled at organizing this random, unstructured information and making it meaningful for people doing searches. However, the unstructured nature of information across different web sites has meant there has always been a limit to how far the quality of results can be improved. How does Google determine the meaning of a picture on one site compared to the next? Should it use the words surrounding the picture? Should it give meaning to the filename of the picture? The Semantic Web is an initiative to provide structure to the information on the Internet. Though promising, after many years it has yet to gain much traction.
Flickr, YouTube and Twitter, within their local spaces, have delivered on the promise of what a Semantic Web could be. They have made it easy to search through difficult-to-find photos, videos and conversational threads. They have done this by owning all the data they search through. This is a radical departure from searching through random and unstructured web pages: because each of these three services owns the data it stores, it can collect and structure the information exactly as it needs to. For photos, Flickr enables users to tag parts of pictures with labels. YouTube asks people uploading videos to add details such as artist, title and other information. Twitter has immediate access to all Twitter messages within its own databases, which makes real-time search of Internet conversations possible; it does not need to crawl the Internet to update its search results. By owning all the data, each of Flickr, YouTube and Twitter has explicitly structured it to produce better results for users searching for photos, videos and conversations.
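To make the difference concrete, here is a small Python sketch of the idea. It is only an illustration under my own assumptions: the `Photo` record, the sample data and both search functions are hypothetical, and neither function reflects how Google or Flickr actually implement search. The first function guesses relevance from the filename and surrounding words, the way a crawler has to; the second simply looks up labels that the owner of the data collected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Photo:
    filename: str
    surrounding_text: str  # words a crawler would find near the image
    tags: set[str] = field(default_factory=set)  # labels supplied by users


# Hypothetical sample data, invented for this illustration.
PHOTOS = [
    Photo("IMG_0042.jpg", "Our holiday was great, see the pictures below.",
          {"beach", "sunset"}),
    Photo("sunset-recipes.jpg", "Ten cocktails to enjoy in the evening.",
          {"cocktail", "drink"}),
    Photo("DSC_1001.jpg", "Photo gallery, page 3.", {"sunset", "mountain"}),
]


def crawler_search(query: str) -> list[str]:
    """Unstructured approach: guess meaning from filename and nearby words."""
    q = query.lower()
    return [
        p.filename
        for p in PHOTOS
        if q in p.filename.lower() or q in p.surrounding_text.lower()
    ]


def tag_search(query: str) -> list[str]:
    """Structured approach: match against labels the service collected."""
    q = query.lower()
    return [p.filename for p in PHOTOS if q in p.tags]


if __name__ == "__main__":
    print("crawler:", crawler_search("sunset"))
    print("tags:   ", tag_search("sunset"))
```

In this toy data set the crawler-style search for "sunset" returns only the cocktail picture, because its filename happens to contain the word, while the tag search returns the two photos that actually show a sunset.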
With the emphasis of the Internet shifting toward "store everything on The Cloud", I expect the trend of mega-sites owning vast amounts of specialized, structured information to continue. You don't use Google to search for a house you want to buy; you use something like propertyfinder.com, which stores all the data on houses available on the housing market. If you want to purchase a book or toy, you go to Amazon or eBay and search there. If you want to find out what your friends are doing, you don't ask Google; you ask Facebook instead. Are there other pillars of the Internet where a simple Google search just does not produce good enough results? What is the next opportunity where simply owning all the data yourself will produce the best results for the user?
This post was inspired by an article by Battelle and a tweet by Fred Wilson.