Friday, August 10, 2007

Will Google Be Destroyed By Open Source Search Engines?

Posted by Stephen Wellman, Aug 10, 2007 07:07 PM

Could open source kill the goose that laid Google's golden egg? If Wikia has its way, it just might.

The Wikia project, started by Wikipedia co-founder Jimmy Wales, seeks to turn the process of building a search engine from a multimillion-dollar project into one that could cost just hundreds or thousands of bucks. That's a game changer.

Here is a look at what Wikia hopes to accomplish:

The project, which was started by Wikipedia co-founder Jimmy Wales, consists of four components, the indexing of the Web, developing a search engine application, an algorithm, and using people to help filter sites and rank results.

One of the most expensive components of a search engine is the effort needed to index the Web. Companies have to buy servers and software to crawl the Web looking at what's on every page, in order to create a comprehensive list of what's on the Web.

Well, how will Wikia be able to provide all this search engine technology and service -- especially crawling the Web -- for free? Open user participation, of course:

The cost of indexing the Web is one of the main hurdles to starting a search engine, and for-profit companies have raised the bar year after year by indexing the Web more and more often. It used to be catalogued once a week, or once a day. Now it's once an hour, or even more often. The high cost of running these crawls has become a competitive weapon.

Wikia believes its crawl of the Web will cost nearly nothing, because it's asking Internet users to help out by downloading Web crawling software from Grub, which will use their computers during idle time to crawl the Web, and send results back to Wikia for the index. So far a thousand people have downloaded the application, and Penchina is hoping for 100,000 or more. The goal is to post the entire index online, as well as regular updates, so anyone can use them.

If I have any skepticism about Wikia, it centers on this piece. I know that distributed computing and its white-hot offspring, grid computing, are big IT trends (and that they can work), but crawling the Web is the competitive advantage that Google, Yahoo, and Microsoft use to maintain their market share. Will a mishmash of random crawls from across the Web really be an adequate substitute for a centralized effort? We'll have to wait and see.
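To make the volunteer-crawl idea a bit more concrete, here is a toy sketch in Python of how a central coordinator might hand out URLs to volunteer machines and merge what they send back into a shared index. This is not Grub's or Wikia's actual software or protocol: the `Coordinator` class, the `volunteer_crawl` function, and the `FAKE_WEB` dictionary (which stands in for real page fetches) are all made up for illustration.

```python
from collections import deque

# Hypothetical stand-in for the real Web: URL -> (page text, outgoing links).
FAKE_WEB = {
    "a.example/": ("open source search", ["b.example/", "c.example/"]),
    "b.example/": ("distributed web crawl", ["c.example/"]),
    "c.example/": ("search index", ["a.example/"]),
}


class Coordinator:
    """Central server: hands out URLs and merges what volunteers send back."""

    def __init__(self, seeds):
        self.frontier = deque(seeds)  # URLs waiting to be crawled
        self.seen = set(seeds)        # URLs already queued or crawled
        self.index = {}               # word -> set of URLs containing it

    def next_batch(self, size):
        """Take up to `size` URLs off the frontier for one volunteer."""
        batch = []
        while self.frontier and len(batch) < size:
            batch.append(self.frontier.popleft())
        return batch

    def submit(self, results):
        """Merge one volunteer's results and queue newly discovered links."""
        for url, (text, links) in results.items():
            for word in text.split():
                self.index.setdefault(word, set()).add(url)
            for link in links:
                if link not in self.seen:
                    self.seen.add(link)
                    self.frontier.append(link)


def volunteer_crawl(batch):
    """What a volunteer's idle PC would do (here just a dictionary lookup)."""
    return {url: FAKE_WEB[url] for url in batch if url in FAKE_WEB}


def run(seeds, batch_size=1):
    """Keep handing out batches until the frontier is empty."""
    coordinator = Coordinator(seeds)
    while True:
        batch = coordinator.next_batch(batch_size)
        if not batch:
            break
        coordinator.submit(volunteer_crawl(batch))
    return coordinator.index
```

Calling `run(["a.example/"])` walks all three fake pages and returns a word-to-URL index. Even in this toy version the coordinator has to track every URL it has seen and merge every result, so the central side of a distributed crawl still carries real work; a production system would also need to cope with duplicate, stale, or malicious submissions.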

As for Wikia itself, I think that even if the distributed Web crawling doesn't work as well as Google's, just having that option available -- along with open and free search engine parts -- will be the catalyst that both vertical search and local search advocates have been looking for.

In May I predicted that Google would probably die not from competition with a new, direct rival but from the challenge posed by an army of thousands of tiny specialty search engines and Web apps.

We've seen Technorati and Blinkx beat Google at its newer search initiatives. How many more little search engines will we see once Wikia goes live?
