October 11, 2006
The Powerset Blogstorm: 1 week later
I wrote a week ago about how
Powerset (http://www.barneypell.com/archives/2006/10/powerset_and_na.html)
had become the subject of a blog storm, and shared my vision of natural
language search. Little did I realize that the storm had barely started. One
week later, there are now about 400
blog articles (http://www.technorati.com/search/powerset?) about
Powerset, according to Technorati (over 100 with some authority). We got
covered by many of the leading writers on search and internet
technology. Below are a few comments on some of the articles by
high-authority bloggers.
- Michael Arrington on TechCrunch
  (http://www.techcrunch.com/2006/10/05/will-powerset-pull-a-google)
  presented the story to a broad audience. He stated he has become so
  familiar with keywordese that he even uses it now sometimes in IM and
  email discussions, but that he is open to the possibility of improved
  communication of meaning and intent with natural language search. The 60
  comments on his article address the issues of natural language and search
  from many useful perspectives.
- Danny Sullivan’s Search Engine Watch gave a great critique of past
  attempts at natural language search and wondered why Powerset would be any
  different. He argues that natural language requires users to change
  behavior, and is thus unlikely to succeed. By contrast, he is a big fan of
  query refinement. For the record, in addition to natural language, I like
  query refinement too, and I’ll throw in suggestions, guided navigation and
  faceted refinement to round out the picture.
- Erick Schonfeld at Business2.0 picked up on Danny’s criticism that “the
  most ‘natural’ thing for people is to be lazy”. He then talks about other
  approaches to improving search: personalization, social search, and query
  refinement.
- Matt Hurst’s Data Mining
  (http://datamining.typepad.com/data_mining/2006/10/powerset_update.html)
  picked up on my “grunting pidgin language” characterization of keywordese.
  While I used an analogy of getting by speaking first-year French but
  wanting more expressiveness, he gives a great analogy of talking to a
  reference librarian in keywordese vs. English. This really points out how
  much potential there is to go beyond what search offers users today.
- ValleyWag
  (http://valleywag.com/tech/powerset/why-powerset-unlike-snap-kosmix-clusty-and-eurekster-will-beat-google-205484.php)
  says “If the company can pull this off, it has a shot at rescuing the world
  from speaking Search Grunt.”
- Om Malik has a poll on whether Powerset can really beat Google. As of
  this writing, 20% of the votes cast agreed that “Powerset will reset
  Google”. I think that’s a lot of confidence for a product most people have
  never seen…
In response to all this buzz, we had many unsolicited offers of investment,
and our Powerset inbox was flooded with people wanting to join or help the
company. The letters came from Silicon Valley, of course, but also from
Bangalore, Brazil, and Buenos Aires!
(To those of you who wrote asking if we will have a beta site for people to try
things out: Yes, and you should soon be able to sign up for our mailing list
on Powerset’s website to get notified
when that comes out.)
Anyway, we truly were not expecting all this attention yet, as we are not
releasing a product in the immediate future. It is a little daunting to have
so much attention but not be able to show our product yet. Nobody can tell if
we are hype or substance (unless they know us). However, from my
perspective, one great thing about this blogstorm coming early is that it
has kicked off a vibrant discussion about the present and future of search,
and what it would mean to be able to express intention to a search engine in
a new way. That discussion goes beyond any one company and itself can lead
to the kind of transformation every startup hopes to achieve.
My Powerset co-founder, Lorenzo
Thione, is in the process of writing a great article responding to the
critics of natural language search. Stay tuned.
Posted by barney at October 11, 2006 1:09 am
This entry was posted in Human Language Technology, Search, Weblogs
Comments
posted by JoeDuck at October 12th, 2006 3:04 pm
I think that some important points have been missed by all parties in this discussion. First, keywords (more generally, keyword logics) are the way we have now of doing “trick semantics”. In order to understand this, one has to understand that search is a two-sided coin; there is the query, and there is the database. Let’s take it for the moment that the database is The Web. The oft-overlooked point is that NEITHER the query NOR the database is expressed in pure semantics. Put in plainer terms: You don’t write what you think, nor does the person who writes the web page you’re looking for; rather, you both use a surface code (NL) that bears a complex relationship to what one “really means”. Keywords are a way that we have learned to see through this problem, and expert searchers know how to do it.
Suppose, for example, that you’re interested in when marsupials first diverged from mammals. I have several options: I could write: “When did marsupials diverge from mammals?” If these were just interpreted as keywords, then if there happens to be a paper that contains the same words, I might win. But let’s assume that that’s not the case. Instead, there might be a paper that discusses (perhaps over a paragraph): “The evolution of kangaroos, and when they split off from the bears, their nearest mammalian counterparts.” (I’m making this up, I doubt that kangaroos and bears are closely related!) Here’s where the two-headed coin comes in: You need to not only understand my query in some semantic terms, but you also need to understand the database (that is, the sentences in the web pages) in similar semantic terms.
Now, expertise in keyword search usage is, I claim, NOT merely a matter of knowledge of natural language, but a specialized skill that involves NL, but also involves understanding how to flip that double-headed coin: that is, how to take a query, in whatever terms, and create a new query that is likely to be able to find the appropriate things on the other side of the coin. This is not, I claim, the same as, nor even closely related to, natural language processing, and indeed, I hypothesize that skill in NL is not highly correlated with it. (Everyone in my lab can speak several languages, but everyone in my lab comes to me because I can find things they can’t on Google!) My hypothesis is that the step from NL queries to semantics is actually NOT the critical step in this process. Rather it is the step from Query-semantics (now approximated by keywords) to the other side of the coin: DB-semantics (now approximated by index terms), and figuring out how to do THIS is going to be the critical step. I have a theory of how to do this… which I’ll tell you if you hire me!
Cheers,
‘Jeff
posted by jshrager at October 13th, 2006 12:29 am
I’m bullish about your prospects Bernie but don’t think that 20% poll means much. More people than that simply *want* Google to falter. But best of luck.
Also: We need MUCH better search paradigms, so hurry up dude!