August 9, 2007
Information Week on The Ultimate Search Engine
This week's Information Week article, The Ultimate Search Engine, by Nick Hoover, gives an overview of many approaches to next-generation search and the companies pursuing them.
The article begins with a statistic about search frustration that I have heard several times, though I have not been able to find the underlying data:
People search for 11 minutes on average before finding what they're looking for, and half abandon searches without getting that far, according to Microsoft. By Gartner's estimate, half of potential Web sales are lost because visitors simply can't find what they want.
The article covers the following ideas being explored for next-generation search:
- Natural language and semantic search: Search that benefits from the meaning of the query and/or the content being searched.
- Queryless search: Delivering relevant results without a query box, based on saved searches or the context of a current search.
- Personalization: Search results that benefit from knowledge of an individual.
- Social search: Ways to improve search using humans. This includes tags, social filtering, and human-powered answers, among others.
- Results oriented: Improvements to the results of search. This includes faceted refinement options, clustering, and direct answers to questions.
- Multimedia: This includes voice, image, video, and universal search.
The discussion on natural language and semantic search features Powerset and Hakia.
Here are a few relevant excerpts:
With emerging tools, people will no longer have to dumb down their queries with the pidgin language understood by first-generation search engines. They'll be able to ask questions in English and other languages--or pose no question at all and automatically receive results based on their earlier queries or the applications they're using.
Most of today's search engines require a shorthand language some describe as keywordese. "It's kind of like talking to a 2-year-old," says Barney Pell, CEO of Powerset, a startup applying natural language processing to search. Over the next decade, Pell says, search engines will become more sophisticated in their ability to "understand meaning."
Powerset, Hakia, and other companies are developing search engines that apply linguistics--the science of language--to interpret questions, analyze Web content, and, as necessary, refine results through interaction with users. Hakia CEO Riza Berkan envisions search engines becoming "knowledgeable creatures in the future if we teach them how to talk and how to understand."
Semantic search engines parse language much like an English student does, using dictionaries and thesauri to interpret the meaning of words and link them using common rules of syntax and sentence structure. The sentence "IBM bought Tivoli for $743 million in 1996" includes concepts such as buying, buyer, subject of buy, year of buy, and purchase price.
For now, the process is aided by human beings who apply language rules and define categories to narrow searches, though Hakia's search engine can use language cues to find rough meaning in concepts it doesn't yet understand. "If it was fully automated, we would claim we have invented a human being," Berkan says. Web search engines like Google and Yahoo employ linguists, too, though they're not as far along with semantic search as Hakia or Powerset. Google's search engine can spell check and returns synonyms and variations of words, but it doesn't always answer questions accurately.
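To make the IBM/Tivoli sentence from the excerpt concrete, here is a minimal Python sketch of turning one sentence into labeled concepts (buyer, thing bought, price, year). It is a toy built on a single hand-written pattern, not a description of how Powerset, Hakia, or any real engine works; the names `ACQUISITION` and `extract_purchase` and the field names are hypothetical, chosen only for illustration.

```python
import re
from typing import Optional

# Hypothetical toy pattern for sentences shaped like
# "<buyer> bought <acquired> for <price> in <year>".
# A real semantic engine would rely on a parser and lexicon, not one regex.
ACQUISITION = re.compile(
    r"^(?P<buyer>.+?) bought (?P<acquired>.+?) "
    r"for (?P<price>\$[\d.,]+(?: (?:million|billion))?) "
    r"in (?P<year>\d{4})\.?$"
)

def extract_purchase(sentence: str) -> Optional[dict]:
    """Return the concepts in a purchase sentence, or None if it does not fit."""
    match = ACQUISITION.match(sentence.strip())
    if match is None:
        return None
    frame = match.groupdict()
    frame["event"] = "buy"             # the action the sentence describes
    frame["year"] = int(frame["year"])  # store the year as a number
    return frame

frame = extract_purchase("IBM bought Tivoli for $743 million in 1996")
print(frame)
```

Once a sentence is stored this way, a question such as "Who bought Tivoli?" can in principle be answered by looking up the `buyer` field rather than by matching keywords, which is the gap between keywordese and meaning that the article describes.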
Posted by barney at August 9, 2007 7:26 PM
This entry was posted in the following categories: Powerset, Search