<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Barney Pell&#039;s Weblog &#187; Powerset</title>
	<atom:link href="http://www.barneypell.com/archives/powerset/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.barneypell.com</link>
	<description></description>
	<lastBuildDate>Thu, 17 Dec 2009 09:20:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Wolfram Alpha: A New Kind of Question-Answering System</title>
		<link>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/</link>
		<comments>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 22:03:15 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Web/Tech]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=124</guid>
		<description><![CDATA[There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what [...]]]></description>
			<content:encoded><![CDATA[<p>There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what I could from reading articles by Nova Spivack (<a href="http://www.techcrunch.com/2009/03/08/wolfram-alpha-computes-answers-to-factual-questions-this-is-going-to-be-big/">“Wolfram Alpha computes answers to factual questions. This is going to be big”</a>) and Doug Lenat (<a href="http://www.semanticuniverse.com/blogs-i-was-positively-impressed-wolfram-alpha.html">“I was positively impressed with Wolfram Alpha”</a>). And this weekend I spoke with William Tunstall-Pedoe, CEO of <a href="http://www.trueknowledge.com/">True Knowledge</a>, who also got a demo.  Many of my examples and conclusions come from my conversation with William (thanks!).  Since life is short and so is the attention of web readers, I&#8217;ll give the rest of my thoughts in bullet form.</p>
<p><strong>What it is: A new kind of question-answering system. </strong></p>
<p><strong>Examples</strong></p>
<ul>
<li> Math: &#8220;2+2&#8221; and then a few simple math questions: &#8220;integrate xsin^4xdx&#8221;, &#8220;what is the square root of 18&#8221;, etc.</li>
<li> Business: “gdp france” showed amount and graph of how it changed over time. “gdp france/germany” showed graph with both amounts and the ratio</li>
<li> “internet users in Europe”: Showed total, and a chart of usage by country in Europe, at the current time, specifically highlighting the biggest and smallest</li>
<li> “ISS”: generates a graphic rendition of the international space station orbiting earth and updating in real-time</li>
<li> “tides in san Francisco”: showed a graph of tides over time, where the times were listed in the local time regime current in the late 19th century for those data points. “tide NYC 11/12/1922” gave a single answer.</li>
<li> “weather”: showed graph of average temperature in Cambridge, MA (where Stephen was when doing the demo). Based on reverse IP lookup.</li>
<li> Computational fluid dynamics: typing in the name of a specific aerofoil produced a picture of that aerofoil along with its differential equations.</li>
<li> stock prices:  “MSFT CSCO” showed comparison chart</li>
<li> chemicals: For substances at a given temperature or pressure, it calculated physical properties. “H2SO4” showed a diagram and chemical properties. &#8220;5 molar h2s04&#8221; did something cool, I don’t know what.</li>
<li> genome sequences: “AGTAG” shows sequences from the human genome that match that pattern</li>
<li> data about people: “How old is Barack Obama” gives his age now. “When was Alan Turing born” gives the answer. “How old is Alan Turing” (a trick question) gives an error message with no human-readable explanation (True Knowledge, by contrast, tells you exactly why this is a trick question).</li>
</ul>
<p><strong>Coverage of data: It answers questions over the following types of structured data:</strong></p>
<ul>
<li> static tables and databases (e.g. a database of internet usage by country by year)</li>
<li> dynamic data feeds (e.g. historical stock market data, position of space shuttle, weather)</li>
<li> numerical inference (e.g. math questions)</li>
<li> numerical computations and simulations (e.g. tides, astronomy, chemistry)</li>
</ul>
<p><span id="more-124"></span></p>
<div id="a000132more">
<div id="more">
<p><strong>Form of queries</strong></p>
<ul>
<li>The queries are expressed in template-based natural language or corresponding abbreviated forms.</li>
<li>NL syntax: “what is the gdp of france”</li>
<li>Compressed template: {attribute} of {object} {time} (“gdp france 2008”)</li>
<li>Mathematical expressions, or NL versions of these (as one might do in an entry-level LISP class)</li>
<li>I can imagine the query language supports (or could support) restrictions on presentation (plot, chart) and other constraints one might express in SQL (order by, etc.), though I haven’t seen any examples showing this exists at present.</li>
</ul>
<p><strong>Presentation and Answers</strong></p>
<ul>
<li>Answers can be a single fact, a table, or a graphical display of a live simulation. Usually it’s a combination of these.</li>
<li>For ambiguous queries, it always picks one interpretation, and you can switch to another (via a drop-down menu of alternatives) if that one is wrong.</li>
</ul>
<p><strong>Domains and Generality</strong></p>
<ul>
<li>Wolfram Alpha is described as an open domain question answering system on structured data. But how exactly is this open domain? I distinguish three levels of domain generality:
<ul>
<li>Closed domain: A single specified domain</li>
<li>Multi domain: Multiple domains are covered and more may be added, but each one is still treated as closed. Note: this can be accomplished through a unified or disjoint treatment.</li>
<li>Open domain: Any domain is within scope</li>
</ul>
</li>
<li>For Wolfram Alpha they have taken a domain-by-domain approach. For each domain, they determined what type of questions to support, and which data, feeds, or simulations to incorporate, and did hand curation to enable these.</li>
<li>The domains are typically fact and data oriented, especially where simulations are available.</li>
</ul>
<p><strong>Architecture</strong></p>
<ul>
<li>The system is coded in Mathematica, about 4.5M lines of code, developed by a large team (100 people at present).</li>
<li>From this <a href="http://www.wolfram.com/products/mathematica/quickoverview/">presentation on Mathematica</a> it is quite easy to extrapolate what Wolfram Alpha is like &#8211; essentially Mathematica + a vast library of mathematical models and data attached + some error-tolerant processing of the user&#8217;s input (thanks Peter Clark for pointing this out).</li>
<li>Piecing together the Mathematica approach and generalizing from the examples and my own knowledge, I believe they have a basic level of representational tools that gets shared across multiple domains. Here&#8217;s how I would think about this:
<ul>
<li>Define the objects in the domain</li>
<li>Make a table of function names and attributes in the domain, and for each function or attribute list the restrictions on the type of objects that this can apply to.</li>
<li>Standardize representations of time and place and charting elements associated with these.</li>
<li>Import and normalize data</li>
<li>Associate data fields to objects and attributes in the domain</li>
</ul>
</li>
</ul>
<p><strong>Infrastructure</strong></p>
<ul>
<li>The system runs on thousands of expensive servers (running Mathematica in real time).</li>
<li>Apparently 10 machines per query give 1 query per second (qps), so they can do 100 qps on 1,000 machines.</li>
</ul>
<p><strong>What is innovative about this</strong></p>
<ul>
<li>Rich mathematical computational infrastructure (Mathematica) to support mathematical aspects of natural language queries</li>
<li>Integration of mathematical inference and simulations along with structured data in a single question-answering system</li>
<li>Unprecedented level of structured data aggregation and curation</li>
<li>Rich presentation including static and dynamic elements and multiple modalities</li>
<li>(Potentially) Deployment of NL-to-SQL query translation in a multi-domain system. The technology to do this has existed for several years, but I don’t know if anyone has deployed it yet. I’m not sure whether Wolfram has, and I haven’t seen enough examples to tell.</li>
</ul>
<p><strong>What it doesn’t do</strong></p>
<ul>
<li>Queries or presentation against unstructured data (neither keyword nor NL queries against unstructured data, which is a strength of <a href="http://www.powerset.com/">Powerset</a>)</li>
<li>Queries requiring ontological or commonsense inference (whether structured or unstructured, which is a strength of True Knowledge and <a href="http://www.cyc.com/">Cyc</a>)</li>
<li>Answers in support of transactions (e.g. price feeds from many merchants or airlines), which is shown in various stages in many major search engines</li>
<li>Queries that span multiple domains (e.g. “what was the weather in San Francisco when Yahoo was founded”, which is a strength of True Knowledge)</li>
</ul>
<p><strong>Implications for the field</strong></p>
<ul>
<li>Question answering has been an important part of search results all along, but it has often been a second-class citizen and hardly promoted</li>
<li>By increasing the level of comprehensiveness of structured questions (in terms of data and domains), this can increase awareness and usage of question answering systems</li>
<li>This should move question answering to be more of a competitive feature across search engines</li>
<li>Users will want to ask questions over both structured and unstructured data, not just structured data, which will increase perceived differentiation for technology like Powerset</li>
<li>If the use of structured data and simulations proves valuable to a large number of users and search engines, then this will increase the need to transform and route queries to vertical experts, potentially developed by ecosystem partners</li>
<li>This will increase the need and value for ecosystem players to add semantic markup to their structured data and simulations, hence making it easier to offer more semantic question answering and integration with other services, and expanding the value of the services by search engines in a virtuous cycle</li>
</ul>
<p><strong>Conclusion</strong></p>
<p>In conclusion, Wolfram Alpha is not going to be a new search engine or a universal answer engine. It is not going to put the existing major players or semantic search startups out of business. But there appears to be real innovation here, leading to at least a <span style="text-decoration: underline;">new kind of system</span> that we have not seen before.  I am eagerly looking forward to my turn to try it out.</p>
</div>
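<p>To make the compressed template form ({attribute} of {object} {time}) and the table of attribute restrictions described above more concrete, here is a minimal Python sketch. It is purely illustrative and is not how Wolfram Alpha is implemented (that system is written in Mathematica and its internals are not public). The tables, the filler-word list, and the function name parse_query are all hypothetical, made up for this example.</p>

```python
import re

# Hypothetical domain table: attribute name -> kinds of object it applies to.
ATTRIBUTE_TYPES = {
    "gdp": {"country"},
    "population": {"country", "city"},
    "tides": {"city"},
}

# Hypothetical object table: object name -> kind of object.
OBJECT_KINDS = {
    "france": "country",
    "germany": "country",
    "san francisco": "city",
}

# Words dropped so the NL form collapses to the compressed template form.
FILLER = {"what", "is", "the", "of", "in"}


def parse_query(text):
    """Map a query onto (attribute, object, year), or return None."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in FILLER]
    if not words or words[0] not in ATTRIBUTE_TYPES:
        return None
    attribute, rest = words[0], words[1:]
    year = None
    # An optional trailing four-digit token is read as the {time} slot.
    if rest and re.fullmatch(r"\d{4}", rest[-1]):
        year = int(rest.pop())
    obj = " ".join(rest)
    kind = OBJECT_KINDS.get(obj)
    # Enforce the type restriction: the attribute must apply to this kind.
    if kind is None or kind not in ATTRIBUTE_TYPES[attribute]:
        return None
    return attribute, obj, year
```

<p>With these toy tables, “what is the gdp of france” and “gdp france” reduce to the same triple, while “gdp san francisco” is rejected because the hypothetical table only allows gdp for countries. A real system would also need the error-tolerant input handling, data binding, and charting discussed above, none of which is attempted here.</p>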
</div>
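<p>The capacity figures quoted under Infrastructure can be checked with a few lines of Python. This is only back-of-the-envelope arithmetic on the reported numbers (10 machines sustaining roughly 1 query per second); the function names are hypothetical, and the assumption that throughput scales linearly with machine count is mine, not something the demos confirmed.</p>

```python
import math


def sustainable_qps(total_machines, machines_per_group=10, qps_per_group=1.0):
    """Queries per second if every full group of machines sustains qps_per_group."""
    groups = total_machines // machines_per_group
    return groups * qps_per_group


def machines_for(target_qps, machines_per_group=10, qps_per_group=1.0):
    """Machines needed to reach target_qps under the same linear assumption."""
    groups = math.ceil(target_qps / qps_per_group)
    return groups * machines_per_group
```

<p>Under that linear assumption, sustainable_qps(1000) gives 100.0 and machines_for(100) gives 1000, matching the figures above.</p>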
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>First Round Capital Holiday video</title>
		<link>http://www.barneypell.com/2008/12/first-round-capital-holiday-video/</link>
		<comments>http://www.barneypell.com/2008/12/first-round-capital-holiday-video/#comments</comments>
		<pubDate>Wed, 17 Dec 2008 19:52:13 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Fun]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Venture Capital]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=123</guid>
		<description><![CDATA[First Round Capital, one of Powerset&#8217;s investors, had a really innovative idea for a holiday card. The partners in the fund visited all their portfolio companies around the world and danced with them. In this video, you can see Barney Pell and Lorenzo Thione dancing with Josh Kopelman in the lobby at Powerset. I found [...]]]></description>
			<content:encoded><![CDATA[<p>First Round Capital, one of Powerset&#8217;s investors, had a really <a href="http://www.youtube.com/watch?v=EU_5P3GLWv4&#038;eurl=http://www.kopelman.com/holiday/holiday/ie.html&#038;feature=player_embedded">innovative idea for a holiday card</a>.<br />
The partners in the fund visited all their portfolio companies around the world and danced with them.  In this video, you can see Barney Pell and Lorenzo Thione dancing with Josh Kopelman in the lobby at Powerset.<br />
I found it really touching to watch the joyful spirit in all these startup companies even in the tough economy that is making life difficult for startups.<br />
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/EU_5P3GLWv4&#038;color1=0xb1b1b1&#038;color2=0xcfcfcf&#038;hl=en&#038;feature=player_embedded&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><embed src="http://www.youtube.com/v/EU_5P3GLWv4&#038;color1=0xb1b1b1&#038;color2=0xcfcfcf&#038;hl=en&#038;feature=player_embedded&#038;fs=1" type="application/x-shockwave-flash" allowfullscreen="true" width="425" height="344"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/12/first-round-capital-holiday-video/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Microsoft to acquire Powerset</title>
		<link>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/</link>
		<comments>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 15:50:32 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=118</guid>
		<description><![CDATA[On Monday, Microsoft and Powerset announced that Powerset is being acquired by Microsoft. In terms of timing, the companies announced that the deal was signed. There is still the customary period before the deal is officially closed (at which point, I expect we&#8217;re going to have a great party). I&#8217;m including, below, the text of [...]]]></description>
			<content:encoded><![CDATA[<p>On Monday, Microsoft and Powerset announced that Powerset is being acquired by Microsoft.</p>
<p>In terms of timing, the companies announced that the deal was signed. There is still the customary period before the deal is officially closed (at which point, I expect we&#8217;re going to have a great party).</p>
<p>I&#8217;m including, below, the text of the announcements from the blogs of Powerset and Microsoft.<br />
I think these sum up pretty well the logic behind the acquisition on both sides.</p>
<p>It took a lot of work by many people to make this happen. Most significant, of course, was the entire team at Powerset, who executed so well to build and launch a wonderful product that showed the world what is now possible.</p>
<p>Immediately following the announcement, we had a day of calls with members of the press, which resulted in a lot of coverage. I&#8217;ll try to post a collection of links next week.</p>
<p>One press meeting that I really enjoyed was a <a href="http://www.techcrunch.com/2008/07/02/interview-with-barney-pell-and-ramez-naam-about-microsoft%e2%80%99s-powerset-acquisition-integration-to-begin-this-year/">podcast with me, Ramez Naam (Group Program Manager for Microsoft Live Search), and Mike Arrington for TechCrunch</a>.  That link provides an article, transcript, and the full audio of the interview.</p>
<p>There is a lot more to say about Powerset, Microsoft, the acquisition, and what it means for the future of search, linguistic technology, semantic web, etc. I am excited to be staying on with Microsoft in a strategy and evangelist role and I am looking forward to the chance to talk and write a lot more about this, and from a whole new perspective, soon.</p>
<p>Here is the text of <a href="http://www.powerset.com/blog/articles/2008/07/01/microsoft-to-acquire-powerset">Powerset&#8217;s blog announcement</a>:</p>
<blockquote><p>We’re excited to announce officially that Microsoft has signed an agreement to acquire Powerset.</p>
<p>Powerset has always been a small company with big dreams, with the ultimate goal of changing the way humans interact with computers through language. We set out to improve search by indexing Web pages based on the meaning expressed in them rather than just the literal words. Powerset licensed breakthrough technology from PARC, hired world-renowned computational linguists and search engineers, and recently released a search and discovery experience for Wikipedia articles. Our technology helps to improve search results and also makes new features possible, such as Factz, which aggregates information from many articles to summarize a topic.</p>
<p>With any startup, the challenge is to take the seeds of an idea and grow it into a viable company. At Powerset, we transformed our idea into a world-class semantic search platform, demonstrating the future of search with our Wikipedia search experience. But building a large-scale semantic search engine is expensive, requiring an engineering effort and computing resources beyond what most start-ups could ever imagine. Because our goals around improving search align so well, Powerset has decided to team up with Microsoft. We believe that this is the fastest way to bring our technology to market at a large scale.</p>
<p>Microsoft shares our goal to improve search through deeper analysis of queries and documents, and understands that our technology and expertise will play a key role in the evolution of search. With an existing search infrastructure, incredible capital resources, unlimited data, a leading search team, and clear mission to revolutionize the search landscape, Microsoft can rapidly accelerate our progress in building semantic search technology and bringing it to full Web scale. When we launched our first product, we heard: this is great, but when and how will we get Powerset to go beyond Wikipedia? Microsoft accelerates our ability to move Powerset to the entire Web faster than anyone could have imagined.</p>
<p>Powerset will continue to operate much as we currently do, working in the same building, with the same organizational structure, and with the same uniquely talented and growing team (apply on our jobs page). We’ll continue to tackle the hardest problems in parsing, semantics, ranking, indexing, scalable computing, user experience and all of our other specialties. But now we’ll do it with the support of Microsoft and the vast resources of the entire Live Search team.</p>
<p>Over the past couple of years Powerset has made amazing progress. Starting with just a big idea, we licensed the best linguistic technology, recruited a top-notch team, built out our datacenter, engineered a world-class semantic search platform, tackled deep natural language issues, improved relevance, innovated an interface and launched a great product. So few start-ups ever tackle such deep, scientific problems successfully and create the kind of value we’ve delivered in such short order.</p>
<p>For now, Powerset.com will continue to host our Wikipedia Search &amp; Discovery and we’ll be continuing to experiment with our product, based on user feedback. But, expect many announcements from us in the coming months about how we’re integrating our technology and features into Live Search.</p></blockquote>
<p>And here&#8217;s the text of <a href="http://blogs.msdn.com/livesearch/archive/2008/07/01/powerset-joins-live-search.aspx">Microsoft&#8217;s blog announcement</a>:</p>
<blockquote><p>Powerset joins Live Search</p>
<p>We&#8217;re excited to announce that we&#8217;ve reached an agreement to acquire Powerset, a San Francisco-based search and natural language company.</p>
<p>Powerset will join our core Search Relevance team, remaining intact in San Francisco. Powerset brings with it natural language technology that nicely complements other natural language processing technologies we have in Microsoft Research.</p>
<p>More importantly, Powerset brings to Live Search a set of talented engineers and computational linguists in downtown San Francisco. This is a great team with a wide range of experience from other search engines and research organizations like PARC (formerly Xerox PARC).</p>
<p>We&#8217;re buying Powerset first and foremost because we&#8217;re impressed with the people there. Powerset CTO and cofounder Barney Pell is a visionary and incredible evangelist. When he introduced our senior engineers to some of the most senior people at Powerset — Search engineers and computational linguists like Tim Converse, Chad Walters, Scott Prevost, Lorenzo Thione, and Ron Kaplan — we came away impressed by their smarts, their experience, their passion for search, and a shared vision.</p>
<p>That shared vision is to take Search to the next level by adding understanding of the intent and meaning behind the words in searches and webpages.</p>
<p>We know today that roughly a third of searches don&#8217;t get answered on the first search and first click. Usually searchers find the information they want eventually, but that often requires multiple searches or clicks on multiple search results. Two specific problems are the most common reasons for this:</p>
<ul>
<li>Differences in phrasing or context between a user&#8217;s search and the way the same information is expressed on webpages. Search engines don&#8217;t understand today that &#8220;shrub&#8221; and &#8220;tree&#8221; are similar concepts. We don&#8217;t understand that &#8220;cancer&#8221; sometimes refers to a disease and sometimes refers to a horoscope and when a query or a webpage refers to which.</li>
<li>Lack of clarity in the descriptions for each webpage in the search results. Sometimes a result looks relevant from its short description on the results page but turns out to be not so relevant when you visit the actual page. As a result, searchers frequently click results and then rapidly click back when they realize they aren&#8217;t what they&#8217;re looking for.</li>
</ul>
<p>These problems exist because search engines today primarily match words in a search to words on a webpage. We can solve these problems by working to understand the intent behind each search and the concepts and meaning embedded in a webpage. Doing so, we can innovate in the quality of the search results, in the flexibility with which searchers can phrase their queries, and in the search user experience. We will use knowledge extracted from webpages to improve the result descriptions and provide new tools to help customers search better.</p>
<p>Working with our existing Search team and other Microsoft teams that focus on natural language, Powerset will help us address all of those problems and opportunities.</p>
<p>We&#8217;re looking to add even more talented engineers to the San Francisco team to accelerate our shared progress. If you&#8217;re interested in joining the team, drop us a line.</p>
<p>We&#8217;ll have more to say about the things we&#8217;re doing in understanding searches and webpages through natural language technology in the coming months. In the meantime, please join me in welcoming Powerset to Microsoft!</p>
<p>Satya Nadella, Senior Vice President, Search, Portal, and Advertising</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerset launched today!</title>
		<link>http://www.barneypell.com/2008/05/powerset-launched-today/</link>
		<comments>http://www.barneypell.com/2008/05/powerset-launched-today/#comments</comments>
		<pubDate>Sun, 11 May 2008 11:59:42 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Powerset]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=116</guid>
		<description><![CDATA[Sunday 5/11/2008: Powerset has launched our first open product to the world! Our initial product offers users a whole new way to experience Wikipedia and Freebase content, based on our unique natural language understanding technology. A write-up about Powerset&#8217;s Wikipedia product is available on the Powerset blog. I will write more over the next few [...]]]></description>
			<content:encoded><![CDATA[<p>Sunday 5/11/2008: <a href="http://www.powerset.com">Powerset</a> has launched our first open product to the world!<br />
Our initial product offers users a whole new way to experience Wikipedia and Freebase content, based on our unique natural language understanding technology.<br />
<a href="http://www.flickr.com/photos/powerset/2483981870/" title="Powerset Homepage"><img src="http://farm4.static.flickr.com/3096/2483981870_3fc5198aea.jpg" width="500" height="360" alt="Powerset Homepage" /></a></p>
<p>A <a href="http://blog.powerset.com/2008/5/12/ready-powerset-go">write-up about Powerset&#8217;s Wikipedia product</a> is available on the <a href="http://blog.powerset.com">Powerset blog</a>.<br />
I will write more over the next few days about the product and its role in the ecosystem of search, content, linguistics, and semantic technology, but for now I&#8217;m just incredibly excited. I&#8217;ll just note a couple of highlights from the evening.<br />
We were planning to launch at 9pm PST.  But in an unusual twist for a software company, one of our eager engineers actually flipped the switch to make everything live 15 minutes ahead of schedule.  Since everything was working, we just decided to go with it!<br />
Within the next couple of hours, the first press articles came out. Pretty much across the board, the journalists and bloggers captured the essence of our initial product.  They got what was special about it, and also recognized it for the initial step that this represents (finally freeing us of the Google Killer hype that is impossible for a small startup to live up to).<br />
Within 1 hour of launch, we received a note from a VC asking about possible investment in the company.<br />
And 2 hours after we were live, we had our first denial of service attack. An automated script sent a never-ending sequence of bizarre queries at our system.  Fortunately, our own engineers had been preparing for this kind of thing already and we managed to stay up and weather the storm.<br />
The whole company was gathered in the office. We spent time alternating between: making speeches and toasts, reading press articles, looking at the traffic and load, and watching the initial queries float by. The last part was the most exciting: real users and real queries!<br />
Since we launched on Sunday night on Mother&#8217;s day (thanks, Mom!), we had it pretty easy with relatively light traffic.  I think Monday is going to be an exciting day.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/05/powerset-launched-today/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>In 5 years we will search more with voice than typing</title>
		<link>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/</link>
		<comments>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/#comments</comments>
		<pubDate>Tue, 26 Feb 2008 20:30:25 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=111</guid>
		<description><![CDATA[David Vogelpohl wrote an article, Will Microsoft Resurrect Natural Language Search, citing a recent AP article about Bill Gates and voice-based search. Here are some quotes from the AP article: People will increasingly interact with computers using speech or touch screens rather than keyboards, Microsoft Corp. Chairman Bill Gates said. “It’s one of the big bets [...]]]></description>
			<content:encoded><![CDATA[<p>David Vogelpohl wrote an article, <a href="http://www.marketingpilgrim.com/2008/02/will-microsoft-resurrect-natural-language-search.html">Will Microsoft Resurrect Natural Language Search</a>, citing a recent <a href="http://biz.yahoo.com/ap/080222/gates_goodbye_keyboards.html?.v=2">AP article</a> about Bill Gates and voice-based search.  Here are some quotes from the AP article:</p>
<blockquote><p>People will increasingly interact with computers using speech or touch screens rather than keyboards, Microsoft Corp. Chairman Bill Gates said.</p>
<p>“It’s one of the big bets we’re making,” he said during the final stop of a farewell tour before he withdraws from the company’s daily operations in July.</p>
<p>In five years, Microsoft expects more Internet searches to be done through speech than through typing on a keyboard, Gates told about 1,200 students and faculty members Thursday at Carnegie Mellon University.</p></blockquote>
<p>David conjectures, as do I, that when people speak their searches they are more likely to use natural language than to use keywordese, and that this could change the game in search.</p>
<blockquote><p>I personally can envision Microsoft trying to integrate speech based data entry as closely as possible with our normal style of speaking. Perhaps the phrase “Where can I buy a hd tv?” would be more natural for searchers when you take away the limitations of the keyboard.</p>
<p>Widespread speech based data entry will almost certainly impact the way Microsoft and subsequently all other search engines deal with search queries.</p></blockquote>
<p>It&#8217;s interesting to see Bill Gates predicting this to happen within 5 years. In the blink of an eye, an entire industry is going to change dramatically.</p>
<p>While on the topic of predictions about voice and language, here&#8217;s one of my predictions that I have been meaning to write up:</p>
<blockquote><p>Within 8 years from now (2016), every category of consumer electronics will have some linguistic interface as a standard feature.</p></blockquote>
<p>By &#8220;linguistic interface&#8221;, I mean voice interaction or text-based interaction that is language-based. Not that these devices won&#8217;t still have nonlinguistic interfaces too (e.g. there will still be buttons, most likely). And by &#8220;every category&#8221;, I mean you will not find a category of consumer electronics that does not have some product in that category with that feature.</p>
<p>For example, users will expect to be able to talk to cameras, tvs, stereos, ipods, phones, watches, microwave ovens, refrigerators, cars, etc. There will still be some cameras that aren&#8217;t language-enabled, but every category will have some products that are.</p>
<p>As my friends Cliff Nass and Scott Brave write in their book, <a href="http://www.amazon.co.uk/Voice-Activated-Psychology-Interfaces-Wirelesses/dp/1575863324">Voice Activated</a>, when people interact with devices using voice, it also invokes the rest of their social apparatus. You can&#8217;t hear a voice without ascribing some kind of personality, gender, race, social status, etc. to the source of the voice. So in addition to expecting linguistic capability, we&#8217;re also going to start expecting personality within the next decade.</p>
<p>I&#8217;ll stop here before I get carried away to the singularity&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LA Times on Founders Brunch and the PowerStache</title>
		<link>http://www.barneypell.com/2008/02/la-times-on-founders-brunch-and-the-powerstache/</link>
		<comments>http://www.barneypell.com/2008/02/la-times-on-founders-brunch-and-the-powerstache/#comments</comments>
		<pubDate>Mon, 25 Feb 2008 19:52:05 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Fun]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Social Networking]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=110</guid>
		<description><![CDATA[My friend Jessica Guynn just wrote an article that appeared online in the LA Times today entitled: Brainstorming over bagels: Silicon Valley entrepreneurs seek camaraderie and capital at brunch. The article will appear in the LA Times print edition tomorrow morning. The article covers the Founders Brunch, a networking event for founders of companies that [...]]]></description>
			<content:encoded><![CDATA[<p>My friend Jessica Guynn just wrote an article that appeared online in the LA Times today entitled: <a href="http://www.latimes.com/business/la-fi-founders26feb26,0,6428275.story">Brainstorming over bagels: Silicon Valley entrepreneurs seek camaraderie and capital at brunch</a>.<br />
The article will appear in the LA Times print edition tomorrow morning.<br />
The article covers the Founders Brunch, a networking event for founders of companies that I attend regularly.<br />
Many of my friends are quoted in the article, and there are photos of Auren Hoffman and Keith Rabois (our host this time).  Peter Thiel expressed the networking aspect of this kind of event well:</p>
<blockquote><p>&#8220;Founders Brunch is important for the same reason Silicon Valley is important: There are all of these subtle network effects,&#8221; said Peter Thiel, a 40-year-old former PayPal executive now bankrolling some of the hottest Internet companies. &#8220;Otherwise why wouldn&#8217;t you start a tech company in Fresno where everything is cheaper? The advantage to being in Silicon Valley and the San Francisco area is that so many other people are doing the same thing.&#8221;</p></blockquote>
<p>Jessica noted that I had a new beard, and I explained my recent decision on growing it:</p>
<blockquote><p>Barney Pell, the 39-year-old co-founder of Powerset, a natural-language search engine trying to challenge Google, sported a new beard he vowed not to shave until his San Francisco start-up launched its new product.</p></blockquote>
<p>To be more accurate, I vowed not to shave off my beard until the launch, but I didn&#8217;t vow that I wouldn&#8217;t shave at all.  I made that mistake during graduate school.  I thought I was ready to submit my PhD thesis in about 3 months, and vowed not to shave or cut my hair until it was done. This was partly a way to motivate myself to finish, and partly a way to let my friends stop asking about my progress, as they would clearly know when I was done.  As it turned out, my thesis advisor thought I had more work to do, and I wound up taking a full year before finishing.  So by the time I was actually ready to submit my thesis, I had really long hair and a very full beard indeed.  I&#8217;m not going to risk that again&#8230;<br />
Anyway, you might think I&#8217;m a maverick, but it turns out that most of Powerset is in on the gig. Almost all our employees are growing moustaches and/or beards in preparation for our upcoming launch.  Even women who can&#8217;t grow nearly as nice moustaches as the men have painted them on from time to time. And our folks even registered a domain name and created a website, <a href="http://www.powerstache.com">PowerStache.com</a>, featuring photos taken over time as people grow their beards and moustaches.<br />
It&#8217;s pretty silly and really wasn&#8217;t initially a coordinated effort, but it&#8217;s fun and reflects the excitement inside the company as we are nearing the time when the early version of our product will be available to the general public.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/02/la-times-on-founders-brunch-and-the-powerstache/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerset in Forbes article on the Language of Search</title>
		<link>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/</link>
		<comments>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/#comments</comments>
		<pubDate>Mon, 25 Feb 2008 00:16:54 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=109</guid>
		<description><![CDATA[Forbes.com has a special issue on language, including interesting articles and interviews by some of my favorite writers on language. I&#8217;m happy that natural language and semantic search were included in the special issue. Andy Greenberg from Forbes.com published his piece on language and search engines devoting a good portion of the article to Powerset [...]]]></description>
			<content:encoded><![CDATA[<p>Forbes.com has a special issue on language, including interesting articles and interviews by some of my favorite writers on language.</p>
<p>I&#8217;m happy that natural language and semantic search were included in the special issue. Andy Greenberg from Forbes.com published his piece on language and search engines, devoting a good portion of the article to <a href="http://www.powerset.com/">Powerset</a> and <a href="http://www.hakia.com/">Hakia</a>, featuring interviews with me and with Hakia&#8217;s founder Riza Berkan. The article, entitled <a href="http://www.forbes.com/business/2008/02/21/search-engine-semantic-tech-cx_ag_language_sp08_0221hakia.html">&#8220;Language Web-lish&#8221;</a>, starts off with Andy using Powerset&#8217;s metaphor comparing people&#8217;s current use of search engines to communicating like cavemen:</p>
<blockquote><p>A question in English, like &#8220;What year was Hillary Clinton born?&#8221; becomes what he calls a primitive &#8220;keywordese&#8221;: &#8220;Hillary Clinton born year.&#8221; &#8220;We have this great gift of human intelligence based around language,&#8221; says Pell, &#8220;and now we have to translate it into a grunting pidgin language to interact with machines.&#8221;</p></blockquote>
<p>Andy described an example I showed him from Powerset:</p>
<blockquote><p>When a user enters the question, &#8220;In what year was Hillary Clinton born?,&#8221; Powerset&#8217;s algorithm doesn&#8217;t simply scour the Web for this collection of words in close proximity. Instead, it looks at pages with an eye for their meaning. Reading the sentence &#8220;Born to Dorothy and Hugh Rodham in 1947, Hillary Clinton is a New York senator,&#8221; Powerset will disassemble the sentence&#8217;s grammar and extract the fact of Hillary Clinton&#8217;s birth date. That fact is then connected with the user&#8217;s question, even if the word order of the result and the query didn&#8217;t originally match.</p></blockquote>
<p>Andy also went through an example from Hakia:</p>
<blockquote><p>Taking the question &#8220;What drug is best for treating a urinary tract infection?&#8221; Riza Berkan points to the word &#8220;drug.&#8221; Hakia&#8217;s algorithm, he says, understands that the word contains a massive subset of concepts including synonyms and specific names of medicines. When it spots a term that falls into that subset, like &#8220;Amoxicillin,&#8221; Hakia can substitute the medicine&#8217;s name for the word &#8220;drug&#8221; in the result. &#8220;You don&#8217;t want the word &#8216;drug,&#8217; you want the name of the drug,&#8221; says Berkan. &#8220;That&#8217;s a hidden failure in search engines, and people don&#8217;t even know what they&#8217;re missing.&#8221;</p></blockquote>
<p>Other natural language and semantic search companies mentioned included <a href="http://www.cognitionsearch.com/">Cognition Search</a> and <a href="http://www.lexxe.com/">Lexxe</a>.</p>
<p>As is typical, my friend Peter Norvig at Google gets the last word in the article:</p>
<blockquote><p>Google&#8217;s Peter Norvig, the search giant&#8217;s director of research, knows just how complex semantic algorithms can be: His Berkeley Ph.D. thesis tried to develop one in 1978. Every sentence of text, he says, took weeks to analyze. &#8220;The result was kind of like a dancing bear,&#8221; he says. &#8220;It was amazing that it could dance at all, but we didn&#8217;t expect it to star in the Moscow Ballet.&#8221; But that doesn&#8217;t mean Google&#8217;s engineers are idly watching semantic search from a distance, says Norvig. The company&#8217;s thousands of engineers are looking at how to incorporate semantic analysis into a search algorithm. But semantic analysis is just one of many directions that Google&#8217;s teams are exploring&#8230; &#8220;Basically, we just do whatever works,&#8221; says Norvig. &#8220;Instead of trying to understand everything, we&#8217;re trying to understand something about billions of pages a week.&#8221;</p>
<p>But does that pragmatic approach leave Google vulnerable to an innovative start-up willing to risk its fate on building meaning-based search from scratch?</p>
<p>&#8220;It&#8217;s unlikely,&#8221; says Norvig. &#8220;But even car companies have to worry about anti-gravity machines.&#8221;</p></blockquote>
<p>I think that analogy is quite a stretch. It&#8217;s more like big car companies having to worry about smaller companies focused on electric cars. They don&#8217;t have to worry about this immediately but, at some point, this is going to be the future of their industry.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Natural Language and the Semantic Web: ISWC Keynote talk</title>
		<link>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/</link>
		<comments>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/#comments</comments>
		<pubDate>Mon, 19 Nov 2007 20:29:54 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[ISWC07]]></category>
		<category><![CDATA[Korea]]></category>
		<category><![CDATA[natural language]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=102</guid>
		<description><![CDATA[I gave an invited keynote talk last week at The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007. The abstract for the talk is below. The image below links to the original video and presentation slides. The live presentation (and video) contains technical demos that aren&#8217;t in the slides. Some [...]]]></description>
			<content:encoded><![CDATA[<p>I gave an invited keynote talk last week at <a href='http://videolectures.net/iswc07_busan/'>The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007</a>.  The abstract for the talk is below.  The image below links to the original video and presentation slides.</p>
<p>The live presentation (and video) contains technical demos that aren&#8217;t in the slides.  Some of the demos are already available inside <a href="http://labs.powerset.com">Powerlabs</a> (e.g. Powermouse, which lets you browse and query our semantic database of facts extracted from Wikipedia), while some of these are still internal (e.g. an open search box, and output of our natural language system on full sentences).  I also gave a detailed walk-through showing how Powerset takes advantage of external semantic resources like <a href="http://wordnet.princeton.edu/">WordNet</a> and <a href="http://www.Freebase.com">Freebase</a>.</p>
<p>For me, the most fun part of the talk was toward the end, where I got to speculate on how ecosystem effects can make natural language search and the semantic web become deeper and more powerful more quickly than people might expect. For example, advertisers, publishers, and vertical search sites will be able to contribute ontologies that enable them to get more users, better internal search, and more revenue, while having as a side effect that the broad search engines get more knowledgeable about different domains.<br />
The questions afterward were also challenging and interesting.<br />
<a href='http://videolectures.net/iswc07_pell_nlpsw/'><br />
<img src='http://videolectures.net/iswc07_pell_nlpsw/thumb.jpg' border=0/><br />
<br/>POWERSET &#8211; Natural Language and the Semantic Web</a><br/></p>
<p><span id="more-102"></span><br />
The Semantic Web promises to revolutionize access to information by adding machine-readable semantic information to content which is normally interpretable only by people. In addition, it will also revolutionize access to services by adding semantic information to create machine-readable service descriptions. This ambitious vision has been slow to take off because of a chicken-and-egg problem. Markup is required before people will build applications, and applications are required before it is worth the hard work of doing markup. Natural language processing (NLP) has advanced to the point where it can break the impasse and open up the possibilities of the Semantic Web. First, NLP systems can now automatically create annotations from unstructured text. This provides the data that semantic web applications require. Second, NLP systems are themselves consumers of semantic web information and thus provide economic motivation for people to create and maintain such information. For example, a new generation of natural language search systems, as illustrated by Powerset, can take advantage of semantic web markup and ontologies to augment their interpretation of underlying textual content. They can also expose semantic web services directly in response to natural language queries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Management changes at Powerset</title>
		<link>http://www.barneypell.com/2007/11/management-changes-at-powerset/</link>
		<comments>http://www.barneypell.com/2007/11/management-changes-at-powerset/#comments</comments>
		<pubDate>Thu, 01 Nov 2007 22:51:42 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Powerset]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=101</guid>
		<description><![CDATA[In this posting, I want to talk about some significant management changes at Powerset. The main changes are: Powerset is looking for a CEO to take the company to the next level of growth. I am transitioning my role from CEO to CTO. Ron Kaplan, who was our CTO and Chief Science Officer, is now [...]]]></description>
			<content:encoded><![CDATA[<p>In this posting, I want to talk about some significant management changes at Powerset.  The main changes are:</p>
<ul>
<li>Powerset is looking for a CEO to take the company to the next level of growth.
<li>I am transitioning my role from CEO to CTO.
<li>Ron Kaplan, who was our CTO and Chief Science Officer, is now our Chief Science Officer.
<li>Steve Newcomb has left his position as COO and moved on from the company.
</ul>
<p>Let me give you some historical context.  When I first had the idea for Powerset and was looking to build my initial management team, I knew that as a first-time CEO my strengths were around the technology and vision and not necessarily around management of a large organization.  I sought out a strong operating partner to share many of the C-level responsibilities while letting me do the things I was great at.  And that&#8217;s how I found Steve Newcomb, who became my cofounder and our COO.  This partnership worked well during the formation and rapid growth of the company. While I focused on the strategy around Powerset’s vision and technology and connecting Powerset with the outside world, Steve led the company internally and brought strengths in execution on several other fronts.  We can all be proud of what we accomplished during Powerset’s early days, but with the company’s very rapid growth and the team’s great progress, we knew it was time to re-evaluate.</p>
<p>After extensive thought and reflection, the Board and management team<br />
decided that the time was right for us to bring in a new CEO to take the<br />
company to the next level and for me to transition into the role of CTO.<br />
The Board evaluated what this change would mean for Steve, and concluded<br />
that bringing in a world-class CEO who is a strong operational manager would<br />
make the COO role redundant. By helping us get to this point, Steve did the<br />
job he signed up for and he has now left the company.  As many of you know,<br />
Steve has been a strong force for the company and a key part of what made<br />
Powerset an early success. He has also been a champion and protector of our<br />
corporate culture. These influences are now part of our DNA and we will<br />
continue to invest in and protect the inspirational culture that Steve<br />
helped to build. Steve will remain a friend of the company and a major<br />
shareholder and maintains the best wishes for the success of Powerset and<br />
the team. He has personal passions in some new directions which he will no<br />
doubt be writing about on his blog.</p>
<p>I consider this kind of deliberate reflection in order to make the best<br />
choices for the company a strong testament to Powerset’s management team.<br />
The result is truly what we all think is the best path for the company going<br />
forward. Bringing in a new world-class CEO will help the company grow and<br />
take advantage of the great opportunity ahead of us. I am proud of what we<br />
accomplished to get the company this far, and I really look forward to<br />
working with and learning from a great CEO during our next stage of growth.  And,<br />
as a major shareholder in the company, I see this transition as something<br />
that will result in great long term value for the company.</p>
<p>While I enjoyed being CEO during the initial growth phase of the company,<br />
pulling together the early team and investors, and defining the vision and<br />
core strategy for the company, I believe the CTO role at this point plays to<br />
my best strengths and my passion. It also makes it easier for me to<br />
contribute ideas and technical solutions without people taking them as<br />
directives from the CEO.  This was not obvious when I decided to be CEO<br />
during the earlier growth of the company and in this sense it makes it<br />
easier and more appropriate for me to be part of the creative team.  As<br />
Founder and CTO, I will also continue as the technology visionary and<br />
evangelist for Powerset to the outside world.  Ron Kaplan, who has been our<br />
CTO and Chief Science Officer, will transition fully to the CSO title. This<br />
also gives Ron more time to guide the core science at the heart of<br />
Powerset&#8217;s differentiation.</p>
<p>We have recently kicked off a search to find the right CEO.  We have already talked with some excellent candidates and are confident that we will bring in someone up to Powerset’s level of quality. If you are or know someone who could be a great CEO for a company with Powerset&#8217;s vision and visibility, I would love to talk with you.</p>
<p>So that&#8217;s the background for the current changes.  With that, I want to give some perspective on the development of startup companies, which may be useful for other early management teams facing similar stages of growth. The talents, roles, and personalities that work best for running a company are often different at different stages of the company&#8217;s growth. Each stage brings with it a challenging transition. Powerset is unusual only in the reflection and cooperation that the management team has demonstrated in making the right changes to propel the company through the next stages of growth.</p>
<p>These changes make this an interesting point to reflect on where Powerset is now and where we are going. It has been a little over two years since we incorporated the company and just one year since we raised Series A funding. What was largely potential back then has become much more of a reality now. One year ago we had only a prototype, didn&#8217;t have a license or source code to our core technology, and had a small team in general and no search team at all. People were asking why natural language might matter for search, whether it was impossible, and whether it had already failed. Today, all of that has changed in ways that are beyond what anyone might have expected.</p>
<p>
The changes we are making now position us for a next phase that promises to be really exciting. We will bring our technology out in real products that users will enjoy and that will trigger changes across the entire ecosystem of search. I think the next year is going to be an amazing time for Powerset and I am as passionate as ever about Powerset, our technology, our team and our future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/11/management-changes-at-powerset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tim Converse on Proximity is a Hack</title>
		<link>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/</link>
		<comments>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/#comments</comments>
		<pubDate>Wed, 12 Sep 2007 21:03:28 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[term proximity]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=100</guid>
		<description><![CDATA[Powerset&#8217;s Tim Converse wrote a great article entitled: Proximity is a Hack. In the article, Tim says that the two biggest improvements in web search were the use of links (including anchor text) and term proximity. The article explores the benefits of term proximity and argues that it works to the extent that it approximates linguistic [...]]]></description>
			<content:encoded><![CDATA[<p>Powerset&#8217;s <a href="http://timconverse.wordpress.com">Tim Converse</a> wrote a great article entitled: <a href="http://timconverse.wordpress.com/2007/06/25/proximity-is-a-hack/">Proximity is a Hack</a>.<br />
In the article, Tim says that the two biggest improvements in web search were the use of links (including anchor text) and term proximity. The article explores the benefits of term proximity and argues that it works to the extent that it approximates linguistic relationships in the text.<br />
He concludes that natural language processing of the documents should have the ability to more accurately capture linguistic relationships even if the query itself is in keywordese (as opposed to a natural language query with internal linguistic structure).</p>
<blockquote><p>
To recap: proximity is both a wonderfully powerful relevance feature, and a total hack. It helps enormously, but it’s not what you really want, it’s just sorta somewhat correlated with what you really want. What you need for what you really want is the underlying structure of all that web content: the real syntactic structure of the sentences, how the sentences connect to each other, how the facts relate, and (maybe) how the discourse flows and the topics connect. We’ve squeezed all the juice we can out of webpages considered as word-vectors; now it’s time to parse this stuff and get at the real structure.<br />
Can that be done? A couple of years ago I would have said no, but I hadn’t seen the PARC natural language technology then, and didn’t know that an effort this concerted and well-funded was on the way. Now, do I think that Powerset will do it? I still don’t know, frankly &#8211; there’s so much more to do to make it real and debugged and scaled the way it needs to be. But it’s clear to me that the next big thing in web search is either this or something a whole lot like this, and I think we have the best shot of anyone. And that’s why I’m at Powerset.   </p></blockquote>
<p>The article is definitely good reading for people interested in search and the potential benefits of NLP.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
