The first SxSW panel of the day for me is the Revolutionary Search Technologies panel, with representatives from Google, Feedster, MRL Ventures, and the University of Texas. The panel focused mainly on the future of search — where it’s going, what the challenges are, and how the technologies of today are trying to anticipate the challenges of tomorrow. I decided to take some notes, if anyone’s interested.
Plug policy cancelled! Thanks to Hugh Forrest.
Marissa Mayer — Google
Google’s primary aim is to organize the world’s information, to make it universally accessible and useful. The company states that the most important part of this aim is focusing on end-user needs, then predicting and reacting to those needs.
Google tries to anticipate needs (both current and future) by encouraging a policy of employees spending 20% of their time on innovation (hence the Google Labs site). Products and technologies that have come out of this are: Google Search Appliance, searching by location, AdSense, Google Toolbar and Deskbar, the calculator and glossary stuff, and stemmed queries (e.g., searching for “advisory” gets you “advisories”).
The future of search? One possibility: embedded computer screens with search applications running (e.g., on your car’s windshield, or in eyeglasses).
Scott Johnson — Feedster
While most people know what syndication is, the key to understanding what it does is to know that it’s a total reversal of the typical user-hunting-for-data web model. (What went unexplained is how syndication causes data to just randomly find the user; they still have to subscribe to syndication feeds, right? And if they don’t, they have to search for the feeds… which gets back to the user-hunting-for-data model.)
Feedster is a search engine for syndication data, and the big focus of Feedster is currency (as in timeliness, not as in money). The goal is to provide data that’s minutes to hours old, not days to weeks old; in a search for SxSW, Scott showed that the first search return was 1 hour and 22 minutes old. Another example that Scott gave was that Feedster had Janet Jackson’s nipple faster than anyone else, and he surmised that it’s providing information about the Madrid bombings in a much more timely fashion than the big search engines. Feedster indexes a variety of information feeds, not just the traditional XML syndication feeds (it’s even indexing BitTorrent and SourceForge!). They provide a slew of one-click subscription and data-munging options, and provide some access to the back-end metadata. Lastly, a few of the big apps (e.g., FeedDemon) integrate access to Feedster into their interfaces.
David Galbraith — MRL Ventures
(Interestingly, the title for David’s presentation was “(R)evolutionary search engines.”) David’s talk focused a lot on the challenges to search today, both from the perspective of the users and from the perspective of the engines.
The web has typically been a one-way thing — the search engines crawl and trawl, finding data to index. Advertising is starting to change that; searching for “holiday inn san jose” gets you the corporate site as the first return, which has been paid for by Holiday Inn.
Another challenge is going to be image search. Today, most images that Google indexes are indexed in the context of the text that’s posted along with them. As people start to post pictures from their cameraphones without any accompanying words for context, image search is going to get much tougher, and will be an opportunity for new technology to come in and solve a problem.
For search engines, syndication is tricky, since if search engines provided feeds, they wouldn’t drive any traffic back through themselves on a user’s quest from a search term to a search result. Comments are also tricky; they keep coming, and they have the possibility of providing skewed views of sites (e.g., Jason Kottke’s Matrix: Revolutions thread).
Spelling variants provide a challenge, and the way that most search engines try to deal with them — soundex — is based only on English phonemes, so it doesn’t even begin to address a large slice of the issue.
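To see why, here’s a minimal sketch of the classic Soundex algorithm (my own illustration, not any engine’s actual code): every letter gets bucketed by its English consonant sound, so names that sound alike *in English* collapse to the same code, while non-English phonetics simply fall through the mapping.

```python
def soundex(name: str) -> str:
    """Classic 4-character Soundex code for a name."""
    # Consonants grouped by English sound; vowels, y, h, w have no code
    codes = {**dict.fromkeys("bfpv", "1"),
             **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"),
             "l": "4",
             **dict.fromkeys("mn", "5"),
             "r": "6"}
    name = name.lower()
    result = name[0].upper()          # keep the first letter as-is
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        if ch in "hw":
            continue                  # h/w don't separate repeated codes
        code = codes.get(ch, "")      # anything unmapped (vowels, é, ñ...) is dropped
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]       # pad or truncate to 4 characters
```

“Smith” and “Smyth” both hash to S530, which is exactly what you want for English spelling variants — but accented or non-Latin characters are silently discarded, so names from other languages collide badly or not at all.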
Lastly, David threw out there a church vs. state analogy, with “church” representing algorithmic search returns, and “state” representing paid search returns. The biggest problem in the state camp is starting to be fraud. It also sets up an opportunity for the church camp — algorithmic search — to provide solutions by ferreting out the abusers and downrating them.
Then came audience questions. (The first was about RSS, and was inaudible.)
The second was about the perception of Google as a de facto monopoly; David felt that there are two big competitors out there right now (Microsoft and Yahoo), and that we’re not there yet.
The third was a request to talk about new Google algorithms; Marissa said that the algorithms are constantly being optimized, and that the goal is not to drop results out, but to fine-tune results to provide the main goal of relevancy. Scott Johnson then mentioned that Feedster doesn’t care at all if people come to the Feedster site, but rather, that they subscribe to feeds and find the data they’re looking for. (My parenthetical aside: what’s the business model? Dunno.)
RSS as democratization — it allows more unknown people to become read, as the result of searches done against syndication databases.