<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Barney Pell&#039;s Weblog &#187; Search</title>
	<atom:link href="http://www.barneypell.com/archives/search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.barneypell.com</link>
	<description></description>
	<lastBuildDate>Thu, 17 Dec 2009 09:20:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Wolfram Alpha: A New Kind of Question-Answering System</title>
		<link>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/</link>
		<comments>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 22:03:15 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Web/Tech]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=124</guid>
		<description><![CDATA[There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what [...]]]></description>
			<content:encoded><![CDATA[<p>There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what I could from reading articles by Nova Spivack (<a href="http://www.techcrunch.com/2009/03/08/wolfram-alpha-computes-answers-to-factual-questions-this-is-going-to-be-big/">“Wolfram Alpha computes answers to factual questions. This is going to be big”</a>) and Doug Lenat (<a href="http://www.semanticuniverse.com/blogs-i-was-positively-impressed-wolfram-alpha.html">“I was positively impressed with Wolfram Alpha”</a>). And this weekend I spoke with William Tunstall-Pedoe, CEO of <a href="http://www.trueknowledge.com/">True Knowledge</a>, who also got a demo.  Many of my examples and conclusions come from my conversation with William (thanks!).  Since life is short and so is the attention of web readers, I&#8217;ll give the rest of my thoughts in bullet form.</p>
<p><strong>What it is: A new kind of question-answering system. </strong></p>
<p><strong>Examples</strong></p>
<ul>
<li> Math: &#8220;2+2&#8221; and then a few simple math questions: &#8220;integrate xsin^4xdx&#8221;, &#8220;what is the square root of 18&#8221; etc.</li>
<li> Business: “gdp france” showed amount and graph of how it changed over time. “gdp france/germany” showed graph with both amounts and the ratio</li>
<li> “internet users in Europe”: Showed total, and a chart of usage by country in Europe, at the current time, specifically highlighting the biggest and smallest</li>
<li> “ISS”: generates a graphic rendition of the international space station orbiting earth and updating in real-time</li>
<li> “tides in san Francisco”: showed a graph of tides over time, where the times were listed in the local time regime current in the late 19th century for those data points. “tide NYC 11/12/1922” gave a single answer.</li>
<li> “weather”: showed graph of average temperature in Cambridge, MA (where Stephen was when doing the demo). Based on reverse IP lookup.</li>
<li> Computational fluid dynamics: typing in the name of a specific aerofoil produced a picture of that aerofoil along with its differential equations.</li>
<li> stock prices:  “MSFT CSCO” showed comparison chart</li>
<li> chemicals: Substances at temperature or pressure, got physical properties calculated. “H2SO4” showed a diagram and chemical properties. &#8220;5 molar h2s04&#8221; did something cool, I don’t know what.</li>
<li> genome sequences: “AGTAG” shows sequences from the human genome that match that pattern</li>
<li> data about people: “How old is Barack Obama” gives his age now. “When was Alan Turing born” gives the answer. “How old is Alan Turing” (a trick question) gives an error message with no human-readable explanation (True Knowledge, by contrast, tells you exactly why this is a trick question).</li>
</ul>
<p><strong>Coverage of data: It answers questions over the following types of structured data:</strong></p>
<ul>
<li> static tables and databases (e.g. a database of internet usage by country by year)</li>
<li> dynamic data feeds (e.g. historical stock market data, position of space shuttle, weather)</li>
<li> numerical inference (e.g. math questions)</li>
<li> numerical computations and simulations (e.g. tides, astronomy, chemistry)</li>
</ul>
<p><span id="more-124"></span></p>
<div id="a000132more">
<div id="more">
<p><strong>Form of queries</strong></p>
<ul>
<li> The queries are expressed in template-based natural language or corresponding abbreviated forms</li>
<li> NL syntax: “what is the gdp of france”</li>
<li> Template compressed: {attribute} of {object} {time}  (“gdp france 2008”)</li>
<li> Mathematical expressions, or NL versions of these (as one might do in an entry-level LISP class)</li>
<li> I can imagine the query language supports (or could support) restrictions on presentation (plot, chart) and other constraints one might express in SQL (order by, etc), though I haven’t seen any examples showing this exists at present.</li>
</ul>
<p><strong>Presentation and Answers</strong></p>
<ul>
<li> Answers can be a single fact, a table, or a graphical display of a live simulation.  Usually it’s a combination of these.</li>
<li> For ambiguous queries, it always picks one interpretation. If that one is wrong, you can switch to another from a drop-down menu of alternatives.</li>
</ul>
<p><strong>Domains and Generality</strong></p>
<ul>
<li> Wolfram Alpha is described as an open domain question answering system on structured data. But how exactly is this open domain? I distinguish three levels of domain generality:
<ul>
<li> Closed domain: A specified domain</li>
<li> Multi domain: Multiple domains are covered and more are added over time, but each one is still treated as closed. Note: this can be accomplished through a unified or disjoint treatment.</li>
<li> Open domain: Any domain is within scope</li>
</ul>
</li>
<li>For Wolfram Alpha they have taken a domain-by-domain approach. For each domain, they determined what type of questions to support, and which data, feeds, or simulations to incorporate, and did hand curation to enable these.</li>
<li> The domains are typically fact and data oriented, especially where simulations are available</li>
</ul>
<p><strong>Architecture</strong></p>
<ul>
<li> The system is coded in Mathematica, about 4.5M lines of code, developed by a large team (100 people at present).</li>
<li> From this <a href="http://www.wolfram.com/products/mathematica/quickoverview/">presentation on Mathematica</a> it is quite easy to extrapolate what Wolfram Alpha is like &#8211; essentially Mathematica + a vast library of mathematical models and data attached + some error-tolerant processing of the user&#8217;s input (thanks Peter Clark for pointing this out).</li>
<li> Piecing together the Mathematica approach and generalizing from the examples and my own knowledge, I believe they have a basic level of representational tools that gets shared for multiple domains. Here&#8217;s how I would think about this:
<ul>
<li> Define the objects in the domain</li>
<li> Make a table of function names and attributes in the domain, and for each function or attribute list the restrictions on the type of objects that this can apply to.</li>
<li> Standardize representations of time and place and charting elements associated with these.</li>
<li> Import and normalize data</li>
<li> Associate data fields to objects and attributes in the domain</li>
</ul>
</li>
</ul>
<p><strong>Infrastructure</strong></p>
<ul>
<li> The system runs on thousands of expensive servers (running Mathematica in real-time).</li>
<li> Apparently it takes 10 machines to serve 1 query per second (qps), so they can do 100 qps on 1,000 machines.</li>
</ul>
<p><strong>What is innovative about this</strong></p>
<ul>
<li> Rich mathematical computational infrastructure (Mathematica) to support mathematical aspects of natural language queries</li>
<li> Integration of mathematical inference and simulations along with structured data in a single question-answering system</li>
<li> Unprecedented level of structured data aggregation and curation</li>
<li> Rich presentation including static and dynamic elements and multiple modalities</li>
<li> (Potentially) Deployment of NL-to-SQL query translation in a multi-domain system. The technology to do this has existed for several years, but I don’t know if anyone has deployed it yet. I’m not sure if Wolfram has deployed this and haven’t seen enough examples to indicate if they have.</li>
</ul>
<p><strong>What it doesn’t do</strong></p>
<ul>
<li> Queries or presentation against unstructured data (neither keyword nor NL queries against unstructured data, which is a strength of <a href="http://www.powerset.com/">Powerset</a>)</li>
<li> Queries requiring ontological or commonsense inference (whether structured or unstructured, which is a strength of True Knowledge and <a href="http://www.cyc.com/">Cyc</a>)</li>
<li> Answers in support of transactions (e.g. price feeds from many merchants or airlines), which is shown in various stages in many major search engines</li>
<li> Queries that span multiple domains (e.g. “what was the weather in San Francisco when Yahoo was founded”, which is a strength of True Knowledge)</li>
</ul>
<p><strong>Implications for the field</strong></p>
<ul>
<li> Question answering has been an important part of search results the whole time, but it has often been a second class citizen and hardly promoted</li>
<li> By increasing the level of comprehensiveness of structured questions (in terms of data and domains), this can increase awareness and usage of question answering systems</li>
<li> This should move question answering to be more of a competitive feature across search engines</li>
<li> Users will want to ask questions for structured and unstructured queries, not just structured queries, which will increase perceived differentiation for technology like Powerset</li>
<li> If the use of structured data and simulations proves valuable to a large number of users and search engines, then this will increase the need to transform and route queries to vertical experts, potentially developed by ecosystem partners</li>
<li> This will increase the need and value for ecosystem players to add semantic markup to their structured data and simulations, hence making it easier to offer more semantic question answering and integration with other services, and expanding the value of the services by search engines in a virtuous cycle</li>
</ul>
<p><strong>Conclusion</strong></p>
<p>In conclusion, Wolfram Alpha is not going to be a new search engine or a universal answer engine. It is not going to put the existing major players or semantic search startups out of business. But there appears to be real innovation here, leading to at least a <span style="text-decoration: underline;">new kind of system</span> that we have not seen before.  I am eagerly looking forward to my turn to try it out.</p>
</div>
</div>
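<p>To make the compressed query template and the per-domain attribute table from the Architecture notes a little more concrete, here is a minimal sketch in Python. It is only my own illustration and says nothing about how Wolfram Alpha is actually built: the tables, the filler-word list, and the function name <code>parse_query</code> are all hypothetical, and the data is limited to a few entries taken from the examples above.</p>

```python
import re

# Hypothetical domain table: each attribute lists the object type it applies to.
ATTRIBUTE_TYPES = {"gdp": "country", "population": "country"}

# Hypothetical object table: each known object has a type.
OBJECT_TYPES = {"france": "country", "germany": "country", "msft": "company"}

# Words dropped so that "what is the gdp of france" reduces to "gdp france".
FILLER = {"what", "is", "the", "of", "in"}


def parse_query(text):
    """Map a query onto the {attribute} {object} {time} template, or return None."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in FILLER]
    attribute = next((t for t in tokens if t in ATTRIBUTE_TYPES), None)
    obj = next((t for t in tokens if t in OBJECT_TYPES), None)
    # Treat a four-digit number from 1000 to 2099 as the optional time slot.
    year = next(
        (int(t) for t in tokens if re.fullmatch(r"(1[0-9]|20)[0-9]{2}", t)), None
    )
    if attribute is None or obj is None:
        return None
    # Enforce the type restriction from the attribute table.
    if ATTRIBUTE_TYPES[attribute] != OBJECT_TYPES[obj]:
        return None
    return {"attribute": attribute, "object": obj, "time": year}


print(parse_query("what is the gdp of france"))
print(parse_query("gdp france 2008"))
print(parse_query("gdp msft"))
```

<p>In this sketch the natural-language form and the compressed form resolve to the same attribute and object, and a query that pairs an attribute with the wrong type of object is rejected instead of being answered.</p>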
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How Live Search Cashback got me the best online deals!</title>
		<link>http://www.barneypell.com/2008/12/how-live-search-cashback-got-me-the-best-online-deals/</link>
		<comments>http://www.barneypell.com/2008/12/how-live-search-cashback-got-me-the-best-online-deals/#comments</comments>
		<pubDate>Wed, 03 Dec 2008 19:17:03 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Ecommerce]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=122</guid>
		<description><![CDATA[Now that I am part of Microsoft Live Search following the acquisition of Powerset, I am making a point of trying out all the latest offerings from Live Search (including those under development) and many other Microsoft products in general. While I can&#8217;t talk about the products under development, it is fun when I can [...]]]></description>
			<content:encoded><![CDATA[<p>Now that I am part of Microsoft Live Search following the acquisition of Powerset, I am making a point of trying out all the latest offerings from Live Search (including those under development) and many other Microsoft products in general.</p>
<p>While I can&#8217;t talk about the products under development, it is fun when I can talk about things that  everyone can use today to do something better than what you can do elsewhere.</p>
<p>With that in mind, I want to talk about <a href="http://search.live.com/cashback">Live Search Cashback</a>. I have been interested in the concepts behind Cashback for a long time, since I was an early advocate for Cost Per Acquisition (CPA) as a future disruptive trend in ecommerce.</p>
<p><a href="http://www.barneypell.com/archives/2005/07/snapcom_raises.html">I wrote about this in connection with SNAP</a>, which I helped Mayfield invest in back when I was an entrepreneur in residence there.</p>
<p>In CPA, the merchants pay based on completed actions by customers (e.g. purchases), rather than just paying to display ads or for clicks by users who may not actually purchase anything.</p>
<p>CPA has the potential to add value for all parts of the ecommerce ecosystem:</p>
<ul>
<li>It is great for merchants because they only pay when they get real value, so they don&#8217;t have to worry about losing money on advertising.</li>
<li>It&#8217;s great for search engines (once there is enough scale) because they know just how much money they can make from advertisers if they can send the right traffic there, without worrying about limited advertising budgets, and they also get data about transactions so they can improve their service.</li>
<li>It&#8217;s great for searchers, because search engines can rank ads (or even organic search results) based on the transactional information from other users.  You would rather click on an ad or a search result that has led to happy purchases by other shoppers than click on one that attracted user interest but no sale.  This is a good way to filter out spam of many kinds.</li>
</ul>
<p>Microsoft&#8217;s Live Search Cashback has all these potential benefits, but with an added twist: Live Search actually gives some of the advertising revenue back to the users who make the purchases. Here is how it works:</p>
<ul>
<li>Search for a product on Live Search. There are now two different ways to buy using cashback:
<ul>
<li>You see a &#8220;gleam&#8221; on some of the ads that alerts you about the cashback opportunity. The ads list the cashback percentage.</li>
<li>You see a link to a Live Search product detail and &#8220;compare prices&#8221; page.   On that page, you see the typical list of prices for your product across a wide variety of merchants. However while typical engines sort by total price including shipping and taxes, Live Search Cashback also shows you the total prices after the cashback discount.</li>
</ul>
</li>
<li>When you click over to one of those merchants from either the search result ad or the cashback price comparison page, Microsoft asks for your email so it can remember you when it needs to give you cashback if you make a purchase.</li>
<li>Then you shop on the merchant page (any of the merchants you clicked on).</li>
<li>Any product you buy from a selected cashback merchant qualifies you for the stated cashback percentage.  You get an email from the Microsoft cashback program telling you how much cash you will get back.  You have to create a Windows Live account once in order to set up the program, but that&#8217;s easy and also gets you access to all the other Microsoft online services (including search personalization, Hotmail, MSN Messenger, HealthVault, etc).</li>
<li>Then after a waiting period (60 days at present in most cases, which is long enough to prove you aren&#8217;t going to return the product and cancel your purchase), the cash shows up in your PayPal account or other account you have specified.</li>
</ul>
<p>That sounds great in principle, but like everyone, I wondered whether this actually works. In particular, I wondered: <strong>Could I use Live Search Cashback to get the absolute lowest price for products I am shopping for?</strong></p>
<p>Here are some reasons why, in theory, the cashback system could fail to deliver the best deals:</p>
<ul>
<li>It is possible that the lowest-price merchants haven&#8217;t signed up for the cashback program yet</li>
<li>Online sellers offering cashback might raise their prices knowing that buyers would be getting cashback.  In that case, sellers would benefit from the program, but buyers would not, and so it would be better to shop elsewhere for lower prices.</li>
</ul>
<p>I have heard a lot of excitement about cashback inside Microsoft, and rumors of people getting some good deals, but so far I had not seen any proof that Live Search Cashback could get the best deal if you really tried all possibilities.  So that&#8217;s what I set out to do.<br />
First, a little about my shopping style.  For high value items, I am really a price-conscious shopper.  I first do research to determine the product that I want to buy.  I use a lot of tools for this, but that will be the subject of another post.  For now, it&#8217;s all about price.  So once I have chosen the product I really want, I shop using all the tools and techniques available to get the lowest possible price.  It&#8217;s perhaps a bit silly in my case since I do value my time more than the actual dollar savings I achieve, but I also value the feeling that I used smarts and tools to get the &#8220;best bargain&#8221;.<br />
So with that in mind, here is my experience buying some pricey consumer electronics products on Cyber Monday this week. You can follow along at home and if you do it now you should get the same results.<br />
I wanted to buy two products:</p>
<ul>
<li><a href="http://search.live.com/results.aspx?q=Panasonic+TH-50PZ80U&amp;go=&amp;form=QBLH">Panasonic TH 50PZ80U &#8211; 50&#8243; plasma TV</a></li>
<li><a href="http://search.live.com/products/?q=Panasonic+DMP-BD35K&amp;go=&amp;form=QBCA">Panasonic DMP-BD35K Blu-ray DVD player</a></li>
</ul>
<p>The links above point to the Live Search product and results pages for these products, so you can try this at home.<br />
For the plasma TV, it&#8217;s an expensive item (list price is $2,000 plus shipping, as I write this), so I really wanted to shop around and get a great price. I searched for this product on Amazon, Shopping.com, PriceGrabber, Nextag, CNET, Google, Yahoo, eBay, and Live Search. And I clicked on many of the ads that came up on these sites offering special deals, coupons, other comparison shopping engines, etc. Here is what I found using each of these (I list total price including shipping and taxes):</p>
<ul>
<li>Shopping.com: They have a nice feature that lists the lowest price from a trusted merchant. In this case, it is Adorama Camera, for $1,269.00. (There was a lower price quoted from another less trusted merchant, but the price on the site when you click was actually higher than the price listed on Shopping.com.  Annoying bait and switch tactic, perhaps?).</li>
<li> PriceGrabber and Nextag: Both listed Butterfly Photo at $1263, but it is not a highly trusted seller, and then Adorama Camera for $1269.</li>
<li>Amazon:  In the Amazon Marketplace, the lowest price merchant is again Adorama Camera, but through Amazon the price is $1296.00.  It looks like you pay $27 to buy through Amazon instead of going direct.</li>
<li> CNET: There was one merchant, 6TH Ave, that listed at $1244 but they had a low trust rating, so I discounted them. The lowest price trusted merchant on CNET, B&amp;H Photo, offered it at $1279.</li>
<li> Google: The lowest price merchant was Universal LCD, for $1265. (Note Google listed B&amp;H Photo with a $1,235.50 total price, but that was for a demo item with a scratch. The ones without a scratch at B&amp;H Photo are $1279, as listed on CNET.)</li>
<li> Yahoo: The lowest price merchant was Universal LCD, for $1265, same as on Google.</li>
<li> eBay: The lowest price seller offering buy-it-now was for $1469.  (It&#8217;s possible I could have got it for less waiting a few days for the auction, but so far I&#8217;ve hardly ever seen a hot item go on auction for less than the cheapest prices elsewhere, and I&#8217;m really only comparing things I can order right now that are in-stock.)</li>
</ul>
<p>Having tried all the other services, I then tried Live Search Cashback.  I tried both the &#8220;cashback ad&#8221; path and the &#8220;cashback price comparison&#8221; path (as described earlier).</p>
<ul>
<li> eBay via Live Search Cashback: The best ad was the eBay ad, offering cashback of 25%.  All you do is click on that eBay ad, which takes you to eBay. Then anything you buy on eBay using PayPal and buy-it-now gives you that 25% (be sure you go to eBay after clicking on the ad, not directly, in order to get the cashback). I found that same trusted seller offering it for $1469, but once I walked through the buying path it showed me that I would get $200 cashback (the cashback maximum is $200). So that nets out to $1269, which matches the lowest price I could find from any other service.</li>
<li> Best merchant from the Live Search comparison page: Adorama. They offered 3% cashback on their base price of $1269 (the same price I found using Shopping.com) for a total net price of $1230.93. This is the one I wound up buying.</li>
</ul>
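<p>The net-price arithmetic behind the two cashback offers above can be sketched in a few lines of Python. This is only an illustration using the figures quoted in this post. The function name <code>net_price</code> is hypothetical, and rounding the cashback to the nearest cent is my assumption, not a documented rule of the program.</p>

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def net_price(total, cashback_rate, cashback_cap=None):
    """Return (cashback, net price) for a purchase, with cashback rounded to cents."""
    total = Decimal(total)
    cashback = (total * Decimal(cashback_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    if cashback_cap is not None:
        # The cashback cannot exceed the stated maximum.
        cashback = min(cashback, Decimal(cashback_cap))
    return cashback, total - cashback


# Figures from the plasma TV comparison above.
offers = {
    "eBay (25% cashback, $200 maximum)": net_price("1469.00", "0.25", "200.00"),
    "Adorama (3% cashback)": net_price("1269.00", "0.03"),
}

# List the offers from lowest to highest net price.
for name, (cashback, net) in sorted(offers.items(), key=lambda item: item[1][1]):
    print(f"{name}: cashback ${cashback}, net ${net}")
```

<p>With these inputs the eBay cashback hits the $200 maximum, giving a net price of $1269.00, while Adorama&#8217;s 3% comes to $38.07, giving $1230.93.</p>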
<p>So Live Search Cashback found me the best deal on the internet for my 50&#8243; Panasonic plasma TV!  It was interesting that despite the 25% cashback deal with eBay, it was still better to use another cashback merchant.  I wondered if that was true for all products, or if sometimes it might be a better value to buy on eBay with those big cashback deals.<br />
The eBay route did turn out to be the better value for the Panasonic Blu-ray DVD player. I bought it on eBay for $259 and got $52 cashback, for a net price of $207.<br />
By comparison: Shopping.com, Nextag, PriceGrabber, and Yahoo all had B&amp;H Photo at $239.95 (with occasionally a lower-trusted merchant coming in at $224). Amazon Marketplace had it at $229.72 from Electronics Express, thus beating all the other deals except Live Search Cashback on this product.<br />
So the bottom line:<br />
<span style="text-decoration: underline;">I shopped for two high-quality, high value consumer electronics products that I wanted to buy.  I tried all online services I could think of, and for each of these products I got the best deal on the internet using Live Search Cashback!<br />
</span><br />
This is exciting for me personally and I am happy to share the exciting news with my friends and family, especially in the current economy where everyone is watching expenses.<br />
On a broader note, I must admit I was skeptical about whether cashback would really work and whether it would draw in new users to Live Search.  Now that I have done my own research and found that it really can get the best deals, I believe a lot more users will check with Live Search to see if there are cashback deals whenever they are shopping for something pricey at least.  Whether or not this impacts overall search market share in the near-term is still an open question, but at least it will serve to expose more users to some of the search innovation that is brewing inside Microsoft. I will write about some of these innovations in the coming weeks.<br />
<em>A note if you are following in my footsteps here: The Live Search price comparison page was tricky (that is, buggy!).  Unlike many of the more mature shopping engines, the Live Search comparison shopping page did not show whether the products were in stock and did not include shipping costs. And some of the prices listed on this page were not the same as you see when clicking over to the merchant. Data feed consistency is a problem for many of these comparison shopping engines, so you have to go and check each result to make sure it is as offered &#8212; if it looks too good to be true, it probably is.  I expect that these kinds of issues will be improved soon. </em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/12/how-live-search-cashback-got-me-the-best-online-deals/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Marissa Mayer on the Future of Search</title>
		<link>http://www.barneypell.com/2008/09/marissa-mayer-on-the-future-of-search/</link>
		<comments>http://www.barneypell.com/2008/09/marissa-mayer-on-the-future-of-search/#comments</comments>
		<pubDate>Thu, 11 Sep 2008 17:57:47 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=120</guid>
		<description><![CDATA[The Official Google Blog is running a series in which search experts pontificate on the future of search. In the first installment, Google&#8217;s VP of Products, Marissa Mayer, writes about the future of search. I really liked the article. Here are a few thoughts I had while reading the article. This past Saturday, I kept [...]]]></description>
			<content:encoded><![CDATA[<p>The Official Google Blog is running a series in which search experts pontificate on the future of search.  In the first installment, Google&#8217;s VP of Products, <a href="http://googleblog.blogspot.com/2008/09/future-of-search.html">Marissa Mayer, writes about the future of search</a>.<br />
I really liked the article.  Here are a few thoughts I had while reading it.</p>
<blockquote><p>This past Saturday, I kept track of the things that came up in conversation that I wanted to search for right then but couldn’t:<br />
Are &#8220;fab,&#8221; &#8220;goy&#8221; and &#8220;eely&#8221; words? (There was a Scrabble game going on.) What time does J.C. Penney open on Saturday? Which school has a team called the Banana Slugs? What is the team mascot for San Jose State? How much power does that hydroelectric dam generate? What do you call a group of turkeys? What time does Tropic Thunder show? What’s the name of that great Irish flute player, first name James? What’s the name of the largest city in Russia after Moscow and St. Petersburg? Which is older, a redwood or a cypress? What’s the oldest living thing and how old is it? Who sings “Queen of Hearts”? What kind of bird is that flying over there? Is the &#8220;LF&#8221; in San Francisco on Union Square or Union Street? What are the dance steps to the Charleston? What day of the week was The Lawrence Welk Show on? What are the lyrics to “In the Mood”? How does Coumadin differ from aspirin in its blood thinning effects? What was the story behind the naming of the number &#8220;googol&#8221;?<br />
Looking at this list, two things are very clear: (1) I could do a lot more searches and (2) search still has a lot of opportunity for innovation, change, and progress. There are lots of ways that search will need to evolve in order to easily meet user needs. Let’s look at some of my unanswered questions from Saturday and consider how search might change over the next 10 years. </p></blockquote>
<p>Thinking about the questions one might have asked, but didn&#8217;t, is a nice way to recognize some gaps in search.  Mayer concludes that she could answer all of her queries with search today, using the right keyword query, but that there must have been easier ways of getting there. It would be really interesting to see how much work it would take ordinary searchers to get these same answers using keyword search, or even how many tries it took Marissa to get the intended results.<br />
The points Marissa makes as follow-ons (including the value of natural language, voice, context, disambiguation, multimedia, and mobility) are all big and important problems.<br />
I also liked her summary of the ideal search engine:</p>
<blockquote><p>Your best friend with instant access to all the world’s facts and a photographic memory of everything you’ve seen and know. That search engine could tailor answers to you based on your preferences, your existing knowledge and the best available information; it could ask for clarification and present the answers in whatever setting or media worked best.
</p></blockquote>
<p>One interesting aspect of this definition is that it envisions that search engines will still exist as a category in the ideal future. I think there will always be value in having an automated, intelligent conversational partner, and I am a strong proponent of such a future vision.  But I also think that increased search intelligence will find its way into the flow of our daily lives and tasks. While there is value in answering factual questions in just the right way, there might be at least as much value in helping us with the task-oriented context in which the questions arise (why do we want movie times, why are we asking about pain killer ingredients) and in helping us to read the content once it is finally returned.<br />
Anyway, I look forward to the rest of the series on the future of search. It&#8217;s a good time for this dialogue as I start out in my new role as search strategist and evangelist at Microsoft.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/09/marissa-mayer-on-the-future-of-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Microsoft to acquire Powerset</title>
		<link>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/</link>
		<comments>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 15:50:32 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=118</guid>
		<description><![CDATA[On Monday, Microsoft and Powerset announced that Powerset is being acquired by Microsoft. In terms of timing, the companies announced that the deal was signed. There is still the customary period before the deal is officially closed (at which point, I expect we&#8217;re going to have a great party). I&#8217;m including, below, the text of [...]]]></description>
			<content:encoded><![CDATA[<p>On Monday, Microsoft and Powerset announced that Powerset is being acquired by Microsoft.</p>
<p>In terms of timing, the companies announced that the deal was signed. There is still the customary period before the deal is officially closed (at which point, I expect we&#8217;re going to have a great party).</p>
<p>I&#8217;m including, below, the text of the announcements from the blogs of Powerset and Microsoft.<br />
I think these sum up pretty well the logic behind the acquisition on both sides.</p>
<p>It took a lot of work by many people to make this happen. Most significant, of course, was the entire team at Powerset, who executed so well to build and launch a wonderful product that showed the world what is now possible.</p>
<p>Immediately following the announcement, we had a day of calls with members of the press, which resulted in a lot of coverage. I&#8217;ll try to post a collection of links next week.</p>
<p>One press meeting that I really enjoyed was a <a href="http://www.techcrunch.com/2008/07/02/interview-with-barney-pell-and-ramez-naam-about-microsoft%e2%80%99s-powerset-acquisition-integration-to-begin-this-year/">podcast with me, Ramez Naam (Group Program Manager for Microsoft Live Search), and Mike Arrington for TechCrunch</a>.  That link provides an article, transcript, and the full audio of the interview.</p>
<p>There is a lot more to say about Powerset, Microsoft, the acquisition, and what it means for the future of search, linguistic technology, semantic web, etc. I am excited to be staying on with Microsoft in a strategy and evangelist role and I am looking forward to the chance to talk and write a lot more about this, and from a whole new perspective, soon.</p>
<p>Here is the text of <a href="http://www.powerset.com/blog/articles/2008/07/01/microsoft-to-acquire-powerset">Powerset&#8217;s blog announcement</a>:</p>
<blockquote><p>We’re excited to announce officially that Microsoft has signed an agreement to acquire Powerset. Powerset has always been a small company with big dreams, with the ultimate goal of changing the way humans interact with computers through language. We set out to improve search by indexing Web pages based on the meaning expressed in them rather than just the literal words. Powerset licensed breakthrough technology from PARC, hired world-renowned computational linguists and search engineers, and recently released a search and discovery experience for Wikipedia articles. Our technology helps to improve search results and also makes new features possible, such as Factz, which aggregates information from many articles to summarize a topic.</p>
<p>With any startup, the challenge is to take the seeds of an idea and grow it into a viable company. At Powerset, we transformed our idea into a world-class semantic search platform, demonstrating the future of search with our Wikipedia search experience. But building a large-scale semantic search engine is expensive, requiring an engineering effort and computing resources beyond what most start-ups could ever imagine. Because our goals around improving search align so well, Powerset has decided to team up with Microsoft. We believe that this is the fastest way to bring our technology to market at a large scale.</p>
<p>Microsoft shares our goal to improve search through deeper analysis of queries and documents, and understands that our technology and expertise will play a key role in the evolution of search. With an existing search infrastructure, incredible capital resources, unlimited data, a leading search team, and clear mission to revolutionize the search landscape, Microsoft can rapidly accelerate our progress in building semantic search technology and bringing it to full Web scale. When we launched our first product, we heard: this is great, but when and how will we get Powerset to go beyond Wikipedia? Microsoft accelerates our ability to move Powerset to the entire Web faster than anyone could have imagined.</p>
<p>Powerset will continue to operate much as we currently do, working in the same building, with the same organizational structure, and with the same uniquely talented and growing team (apply on our jobs page). We’ll continue to tackle the hardest problems in parsing, semantics, ranking, indexing, scalable computing, user experience and all of our other specialties. But now we’ll do it with the support of Microsoft and the vast resources of the entire Live Search team.</p>
<p>Over the past couple of years Powerset has made amazing progress. Starting with just a big idea, we licensed the best linguistic technology, recruited a top-notch team, built out our datacenter, engineered a world-class semantic search platform, tackled deep natural language issues, improved relevance, innovated an interface and launched a great product. So few start-ups ever tackle such deep, scientific problems successfully and create the kind of value we’ve delivered in such short order.</p>
<p>For now, Powerset.com will continue to host our Wikipedia Search &amp; Discovery and we’ll be continuing to experiment with our product, based on user feedback. But, expect many announcements from us in the coming months about how we’re integrating our technology and features into Live Search.</p></blockquote>
<p>And here&#8217;s the text of <a href="http://blogs.msdn.com/livesearch/archive/2008/07/01/powerset-joins-live-search.aspx">Microsoft&#8217;s blog announcement</a>:</p>
<blockquote><p>Powerset joins Live Search</p>
<p>We&#8217;re excited to announce that we&#8217;ve reached an agreement to acquire Powerset, a San Francisco-based search and natural language company.</p>
<p>Powerset will join our core Search Relevance team, remaining intact in San Francisco. Powerset brings with it natural language technology that nicely complements other natural language processing technologies we have in Microsoft Research.</p>
<p>More importantly, Powerset brings to Live Search a set of talented engineers and computational linguists in downtown San Francisco. This is a great team with a wide range of experience from other search engines and research organizations like PARC (formerly Xerox PARC).</p>
<p>We&#8217;re buying Powerset first and foremost because we&#8217;re impressed with the people there. Powerset CTO and cofounder Barney Pell is a visionary and incredible evangelist. When he introduced our senior engineers to some of the most senior people at Powerset — Search engineers and computational linguists like Tim Converse, Chad Walters, Scott Prevost, Lorenzo Thione, and Ron Kaplan — we came away impressed by their smarts, their experience, their passion for search, and a shared vision.</p>
<p>That shared vision is to take Search to the next level by adding understanding of the intent and meaning behind the words in searches and webpages.</p>
<p>We know today that roughly a third of searches don&#8217;t get answered on the first search and first click. Usually searchers find the information they want eventually, but that often requires multiple searches or clicks on multiple search results. Two specific problems are the most common reasons for this:</p>
<p>* Differences in phrasing or context between a user&#8217;s search and the way the same information is expressed on webpages. Search engines don&#8217;t understand today that &#8220;shrub&#8221; and &#8220;tree&#8221; are similar concepts. We don&#8217;t understand that &#8220;cancer&#8221; sometimes refers to a disease and sometimes refers to a horoscope and when a query or a webpage refers to which.<br />
* Lack of clarity in the descriptions for each webpage in the search results. Sometimes a result looks relevant from its short description on the results page but turns out to be not so relevant when you visit the actual page. As a result, searchers frequently click results and then rapidly click back when they realize they aren&#8217;t what they&#8217;re looking for.</p>
<p>These problems exist because search engines today primarily match words in a search to words on a webpage. We can solve these problems by working to understand the intent behind each search and the concepts and meaning embedded in a webpage. Doing so, we can innovate in the quality of the search results, in the flexibility with which searchers can phrase their queries, and in the search user experience. We will use knowledge extracted from webpages to improve the result descriptions and provide new tools to help customers search better.</p>
<p>Working with our existing Search team and other Microsoft teams that focus on natural language, Powerset will help us address all of those problems and opportunities.</p>
<p>We&#8217;re looking to add even more talented engineers to the San Francisco team to accelerate our shared progress. If you&#8217;re interested in joining the team, drop us a line.</p>
<p>We&#8217;ll have more to say about the things we&#8217;re doing in understanding searches and webpages through natural language technology in the coming months. In the meantime, please join me in welcoming Powerset to Microsoft!</p>
<p>Satya Nadella, Senior Vice President, Search, Portal, and Advertising</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/07/microsoft-to-acquire-powerset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title></title>
		<link>http://www.barneypell.com/2008/03/114/</link>
		<comments>http://www.barneypell.com/2008/03/114/#comments</comments>
		<pubDate>Tue, 25 Mar 2008 21:04:14 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Web/Tech]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=114</guid>
		<description><![CDATA[Semantic Web Patterns: A Guide to Semantic Technologies &#8211; ReadWriteWeb Alex Iskold wrote a nice article that provides an overview and categorization of semantic web approaches, technologies, and companies. Here are a few key points from the article, interspersed with some of my own perspectives. The Semantic Web is now capturing broad attention, and has [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Semantic Web Patterns: A Guide to Semantic Technologies - ReadWriteWeb" href="http://www.readwriteweb.com/archives/semantic_web_patterns.php">Semantic Web Patterns: A Guide to Semantic Technologies &#8211; ReadWriteWeb</a><br />
Alex Iskold wrote a nice article that provides an overview and categorization of semantic web approaches, technologies, and companies.<br />
Here are a few key points from the article, interspersed with some of my own perspectives.</p>
<ul>
<li>
The Semantic Web is now capturing broad attention, and has been called the number one trend in 2008 (by Richard MacManus, founder of ReadWriteWeb).
<li>Yahoo! recently announced that their search engine is going to support RDF and microformats. This will provide incentive for publishers to use semantic markup in their content. This echoes a point I made in my semantic web keynote talk last year (see below), that search engines would create incentives to drive the semantic web faster than people may have expected.
<li>Several companies are now offering web services to support or automate semantic markup.  These include the Semantify web service from Dapper, the Open Calais web service from Reuters/ClearForest, and the Semantic Hacker API from TextWise.
<li>There are top-down and bottom-up approaches to the Semantic Web.  Bottom-up approaches require people to enter semantic markup.  This can be in strong semantic web formats using standards like RDF, or in lightweight markup formats, like Microformats.
<li>Search is potentially a killer app of semantic technologies. The author argues that semantic technologies alone are not enough to deliver better search, but when used in combination with the other search techniques they might be better.  I agree that the combination is best. But I disagree with the statement that<br />
<blockquote><p>Google&#8217;s algorithm, which is based on statistical analysis, deals just fine with semantic entities like people, cities, and companies.</p></blockquote>
<p>I think there is a significant gap today between what we are used to with search engines and what is possible with stronger semantic approaches, and this will become clearer over the next year.</p>
<li>Contextual technologies use semantic markup within the page and combine that with external content and services.  Thus a user does not have to search in order to benefit from the semantics. Examples include Snap, Yahoo Shortcuts, and SmartLinks.  Such technologies are making their way into the browser, where they will have wider appeal and accelerate the trend toward the semantic web.
<li>Semantic databases focus on building and utilizing structured semantic information (as opposed to marking up unstructured content).  Twine, by Radar Networks, and Freebase, by Metaweb, are two examples.  (I am personally familiar with Freebase as we are integrating this within our offerings at Powerset.) Over time, we will see increasing synergies between the semantic technologies based on structured and unstructured data.
</ul>
<p>I highly recommend this article to people interested in semantic technologies and search.  For my own perspective on the relationship between natural language, search, and the semantic web, you can see the video and presentation of my Keynote Talk at the 2007 International Semantic Web Conference, entitled <a href="http://www.barneypell.com/archives/2007/11/natural_language_and_the_semantic_web_iswc_keynote_talk.html">Natural Language and the Semantic Web</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/03/114/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>In 5 years we will search more with voice than typing</title>
		<link>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/</link>
		<comments>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/#comments</comments>
		<pubDate>Tue, 26 Feb 2008 20:30:25 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=111</guid>
		<description><![CDATA[David Vogelpohl wrote an article, Will Microsoft Resurrect Natural Language Search, citing a recent AP article about Bill Gates and voice-based search. Here are some quotes from the AP article: People will increasingly interact with computers using speech or touch screens rather than keyboards, Microsoft Corp. Chairman Bill Gates said.“It’s one of the big bets [...]]]></description>
			<content:encoded><![CDATA[<p>David Vogelpohl wrote an article, <a href="http://www.marketingpilgrim.com/2008/02/will-microsoft-resurrect-natural-language-search.html">Will Microsoft Resurrect Natural Language Search</a>, citing a recent <a href="http://biz.yahoo.com/ap/080222/gates_goodbye_keyboards.html?.v=2">AP article</a> about Bill Gates and voice-based search.  Here are some quotes from the AP article:</p>
<blockquote><p>People will increasingly interact with computers using speech or touch screens rather than keyboards, Microsoft Corp. Chairman Bill Gates said. “It’s one of the big bets we’re making,” he said during the final stop of a farewell tour before he withdraws from the company’s daily operations in July.</p>
<p>In five years, Microsoft expects more Internet searches to be done through speech than through typing on a keyboard, Gates told about 1,200 students and faculty members Thursday at Carnegie Mellon University.</p></blockquote>
<p>David conjectures, as do I, that when people speak their searches they are more likely to use natural language than to use keywordese, and that this could change the game in search.</p>
<blockquote><p>I personally can envision Microsoft trying to integrate speech based data entry as closely as possible with our normal style of speaking. Perhaps the phrase “Where can I buy a hd tv?” would be more natural for searchers when you take away the limitations of the keyboard. Wide spread speech based data entry will almost certainly impact the way Microsoft and subsequently all other search engines deal with search queries.</p></blockquote>
<p>It&#8217;s interesting to see Bill Gates predicting this to happen within 5 years. In the blink of an eye, an entire industry is going to change dramatically.</p>
<p>While on the topic of predictions about voice and language, here&#8217;s one of my predictions that I have been meaning to write up:</p>
<blockquote><p>Within 8 years from now (2016), every category of consumer electronics will have some linguistic interface as a standard feature.</p></blockquote>
<p>By &#8220;linguistic interface&#8221;, I mean voice interactions or text-based interaction that is language-based. Not that these devices won&#8217;t still have nonlinguistic interfaces too (e.g. there will still be buttons, most likely). And by &#8220;every category&#8221;, I mean you will not find a category of consumer electronics that does not have some product in that category with that feature.</p>
<p>For example, users will expect to be able to talk to cameras, tvs, stereos, ipods, phones, watches, microwave ovens, refrigerators, cars, etc. There will still be some cameras that aren&#8217;t language-enabled, but every category will have some products that are.</p>
<p>As my friends Cliff Nass and Scott Brave write in their book, <a href="http://www.amazon.co.uk/Voice-Activated-Psychology-Interfaces-Wirelesses/dp/1575863324">Voice Activated</a>, when people interact with devices using voice, it also invokes the rest of their social apparatus. You can&#8217;t hear a voice without ascribing some kind of personality, gender, race, social status, etc. to the source of the voice. So in addition to expecting linguistic capability, we&#8217;re also going to start expecting personality within the next decade.</p>
<p>I&#8217;ll stop here before I get carried away to the singularity&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/02/in-5-years-we-will-search-more-with-voice-than-typing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerset in Forbes article on the Language of Search</title>
		<link>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/</link>
		<comments>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/#comments</comments>
		<pubDate>Mon, 25 Feb 2008 00:16:54 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=109</guid>
		<description><![CDATA[Forbes.com has a special issue on language, including interesting articles and interviews by some of my favorite writers on Language. I&#8217;m happy that natural language and semantic search was included in the special issue. Andy Greenberg from Forbes.com published his piece on language and search engines devoting a good portion of the article to Powerset [...]]]></description>
			<content:encoded><![CDATA[<p>Forbes.com has a special issue on language, including interesting articles and interviews by some of my favorite writers on Language.</p>
<p>I&#8217;m happy that natural language and semantic search were included in the special issue. Andy Greenberg from Forbes.com published his piece on language and search engines, devoting a good portion of the article to <a href="http://www.powerset.com/">Powerset</a> and <a href="http://www.hakia.com/">Hakia</a>, featuring interviews with me and with Hakia&#8217;s founder Riza Berkan. The article, entitled <a href="http://www.forbes.com/business/2008/02/21/search-engine-semantic-tech-cx_ag_language_sp08_0221hakia.html">&#8220;Language Web-lish&#8221;</a>, starts off with Andy using Powerset&#8217;s metaphor comparing people&#8217;s current use of search engines to communicating like cavemen:</p>
<blockquote><p>A question in English, like &#8220;What year was Hillary Clinton born?&#8221; becomes what he calls a primitive &#8220;keywordese&#8221;: &#8220;Hillary Clinton born year.&#8221; &#8220;We have this great gift of human intelligence based around language,&#8221; says Pell, &#8220;and now we have to translate it into a grunting pidgin language to interact with machines.&#8221;</p></blockquote>
<p>Andy described an example I showed him from Powerset:</p>
<blockquote><p>When a user enters the question, &#8220;In what year was Hillary Clinton born?,&#8221; Powerset&#8217;s algorithm doesn&#8217;t simply scour the Web for this collection of words in close proximity. Instead, it looks at pages with an eye for their meaning. Reading the sentence &#8220;Born to Dorothy and Hugh Rodham in 1947, Hillary Clinton is a New York senator,&#8221; Powerset will disassemble the sentence&#8217;s grammar and extract the fact of Hillary Clinton&#8217;s birth date. That fact is then connected with the user&#8217;s question, even if the word order of the result and the query didn&#8217;t originally match.</p></blockquote>
<p>Andy also went through an example from Hakia:</p>
<blockquote><p>Taking the question &#8220;What drug is best for treating a urinary tract infection?&#8221; Riza Berkan points to the word &#8220;drug.&#8221; Hakia&#8217;s algorithm, he says, understands that the word contains a massive subset of concepts including synonyms and specific names of medicines. When it spots a term that falls into that subset, like &#8220;Amoxicillin,&#8221; Hakia can substitute the medicine&#8217;s name for the word &#8220;drug&#8221; in the result. &#8220;You don&#8217;t want the word &#8216;drug,&#8217; you want the name of the drug,&#8221; says Berkan. &#8220;That&#8217;s a hidden failure in search engines, and people don&#8217;t even know what they&#8217;re missing.&#8221;</p></blockquote>
<p>Other natural language and semantic search companies mentioned included <a href="http://www.cognitionsearch.com/">Cognition Search</a> and <a href="http://www.lexxe.com/">Lexxe</a>.</p>
<p>As is typical, my friend Peter Norvig at Google gets the last word in the article:</p>
<blockquote><p>Google&#8217;s Peter Norvig, the search giant&#8217;s director of research, knows just how complex semantic algorithms can be: His Berkeley Ph.D. thesis tried to develop one in 1978. Every sentence of text, he says, took weeks to analyze. &#8220;The result was kind of like a dancing bear,&#8221; he says. &#8220;It was amazing that it could dance at all, but we didn&#8217;t expect it to star in the Moscow Ballet.&#8221; But that doesn&#8217;t mean Google&#8217;s engineers are idly watching semantic search from a distance, says Norvig. The company&#8217;s thousands of engineers are looking at how to incorporate semantic analysis into a search algorithm. But semantic analysis is just one of many directions that Google&#8217;s teams are exploring&#8230; &#8220;Basically, we just do whatever works,&#8221; says Norvig. &#8220;Instead of trying to understand everything, we&#8217;re trying to understand something about billions of pages a week.&#8221;</p>
<p>But does that pragmatic approach leave Google vulnerable to an innovative start-up willing to risk its fate on building meaning-based search from scratch?</p>
<p>&#8220;It&#8217;s unlikely,&#8221; says Norvig. &#8220;But even car companies have to worry about anti-gravity machines.&#8221;</p></blockquote>
<p>I think that analogy is quite a stretch. It&#8217;s more like big car companies having to worry about smaller companies focused on electric cars. They don&#8217;t have to worry about this immediately but, at some point, this is going to be the future of their industry.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2008/02/powerset-in-forbes-article-on-the-language-of-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Natural Language and the Semantic Web: ISWC Keynote talk</title>
		<link>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/</link>
		<comments>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/#comments</comments>
		<pubDate>Mon, 19 Nov 2007 20:29:54 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[ISWC07]]></category>
		<category><![CDATA[Korea]]></category>
		<category><![CDATA[natural language]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=102</guid>
		<description><![CDATA[I gave an invited keynote talk last week at The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007. The abstract for the talk is below. The image below links to the original video and presentation slides. The live presentation (and video) contains technical demos that aren&#8217;t in the slides. Some [...]]]></description>
			<content:encoded><![CDATA[<p>I gave an invited keynote talk last week at <a href='http://videolectures.net/iswc07_busan/'>The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007</a>.  The abstract for the talk is below.  The image below links to the original video and presentation slides.</p>
<p>The live presentation (and video) contains technical demos that aren&#8217;t in the slides.  Some of the demos are already available inside <a href="http://labs.powerset.com">Powerlabs</a> (e.g. Powermouse, which lets you browse and query our semantic database of facts extracted from Wikipedia), while some of these are still internal (e.g. an open search box, and output of our natural language system on full sentences).  I also gave some detailed walk-throughs showing how Powerset takes advantage of external semantic resources like <a href="http://wordnet.princeton.edu/">Wordnet</a> and <a href="http://www.Freebase.com">Freebase</a>.</p>
<p>For me, the most fun part of the talk was toward the end, where I got to speculate on how ecosystem effects can make natural language search and the semantic web become deeper and more powerful more quickly than people might expect. For example, advertisers, publishers, and vertical search sites will be able to contribute ontologies that enable them to get more users, better internal search, and more revenue, while having as a side effect that the broad search engines get more knowledgeable about different domains.<br />
The questions afterward were also challenging and interesting.<br />
<a href='http://videolectures.net/iswc07_pell_nlpsw/'><br />
<img src='http://videolectures.net/iswc07_pell_nlpsw/thumb.jpg' border=0/><br />
<br/>POWERSET &#8211; Natural Language and the Semantic Web</a><br/></p>
<p><span id="more-102"></span><br />
The Semantic Web promises to revolutionize access to information by adding machine-readable semantic information to content which is normally interpretable only by people. In addition, it will also revolutionize access to services by adding semantic information to create machine-readable service descriptions. This ambitious vision has been slow to take off because of a chicken-and-egg problem. Markup is required before people will build applications, and applications are required before it is worth the hard work of doing markup. Natural language processing (NLP) has advanced to the point where it can break the impasse and open up the possibilities of the Semantic Web. First, NLP systems can now automatically create annotations from unstructured text. This provides the data that semantic web applications require. Second, NLP systems are themselves consumers of semantic web information and thus provide economic motivation for people to create and maintain such information. For example, a new generation of natural language search systems, as illustrated by Powerset, can take advantage of semantic web markup and ontologies to augment their interpretation of underlying textual content. They can also expose semantic web services directly in response to natural language queries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tim Converse on Proximity is a Hack</title>
		<link>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/</link>
		<comments>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/#comments</comments>
		<pubDate>Wed, 12 Sep 2007 21:03:28 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[term proximity]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=100</guid>
		<description><![CDATA[Powerset&#8217;s Tim Converse wrote a great article entitled: Proximity is a Hack. In the article, Tim says that the two biggest improvements in web search were the use of links (including anchor text) and term proximity. The article explores the benefits of term proximity and argues that it works to the extent that it approximates linguistic [...]]]></description>
			<content:encoded><![CDATA[<p>Powerset&#8217;s <a href="http://timconverse.wordpress.com">Tim Converse </a>wrote a great article entitled: <a href="http://timconverse.wordpress.com/2007/06/25/proximity-is-a-hack/">Proximity is a Hack</a>.<br />
In the article, Tim says that the two biggest improvements in web search were the use of links (including anchor text) and term proximity. The article explores the benefits of term proximity and argues that it works to the extent that it approximates linguistic relationships in the text.<br />
He concludes that natural language processing of the documents should have the ability to more accurately capture linguistic relationships even if the query itself is in keywordese (as opposed to a natural language query with internal linguistic structure).</p>
<blockquote><p>
To recap: proximity is both a wonderfully powerful relevance feature, and a total hack. It helps enormously, but it’s not what you really want, it’s just sorta somewhat correlated with what you really want. What you need for what you really want is the underlying structure of all that web content: the real syntactic structure of the sentences, how the sentences connect to each other, how the facts relate, and (maybe) how the discourse flows and the topics connect. We’ve squeezed all the juice we can out of webpages considered as word-vectors; now it’s time to parse this stuff and get at the real structure.<br />
Can that be done? A couple of years ago I would have said no, but I hadn’t seen the PARC natural language technology then, and didn’t know that an effort this concerted and well-funded was on the way. Now, do I think that Powerset will do it? I still don’t know, frankly &#8211; there’s so much more to do to make it real and debugged and scaled the way it needs to be. But it’s clear to me that the next big thing in web search is either this or something a whole lot like this, and I think we have the best shot of anyone. And that’s why I’m at Powerset.   </p></blockquote>
<p>The article is definitely good reading for people interested in search and the potential benefits of NLP.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/09/tim-converse-on-proximity-is-a-hack/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Technology Review on Building a Better Search Engine</title>
		<link>http://www.barneypell.com/2007/08/technology-review-on-building-a-better-search-engine/</link>
		<comments>http://www.barneypell.com/2007/08/technology-review-on-building-a-better-search-engine/#comments</comments>
		<pubDate>Thu, 09 Aug 2007 20:21:43 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=95</guid>
		<description><![CDATA[Technology Review recently had an article featuring Powerset: Building a Better Search Engine, by Michael Reisman. In addition to Powerset, the article also mentions Hakia and Cognition Search, and closes with a discussion of a semantic search project inside IBM. I am an avid reader of Technology Review and am really excited to have an [...]]]></description>
			<content:encoded><![CDATA[<p>Technology Review recently had an article featuring Powerset: <a href="http://www.technologyreview.com/read_article.aspx?id=19109&#038;a=f">Building a Better Search Engine</a>, by Michael Reisman.<br />
In addition to Powerset, the article also mentions Hakia and Cognition Search, and closes with a discussion of a semantic search project inside IBM.</p>
<p>I am an avid reader of Technology Review and am really excited to have an article about us in this great publication.</p>
<p>The full article is worth reading.</p>
<p>Here are just a few excerpts about natural language search and Powerset&#8217;s technology:</p>
<blockquote><p>The company claims that the engine finds the best answer by considering the meaning and context of the question and related Web pages.<br />
&#8220;Powerset extracts deep concepts and relationships from the texts, and the users query and match them efficiently to deliver a better search,&#8221; Powerset CEO Barney Pell says.</p>
<p>Powerset chief technology officer Ron Kaplan has led PARC&#8217;s XLE team since the 1970s and is the author of much of the technology behind XLE that has been licensed to the company. Kaplan says that he and Pell began to collaborate on the idea about two years ago.<br />
Current methods of searching used by more traditional engines focus on isolated keywords and broad but shallow content coverage. This leaves a lot of room for improvement, Kaplan says.</p>
<p>&#8220;They are really not getting at relationships,&#8221; he notes. &#8220;The best that they do to approximate relationships are words that are close to other words.&#8221; He adds that a much deeper level of analysis is required.</p></blockquote>
<p>The article appeared just in time to announce the upcoming launch of <a href="http://labs.powerset.com">Powerlabs</a>, our early user community:</p>
<blockquote><p>
The company plans to release demo versions of the search engine on its Powerlabs website, where consumers can test-drive the product beginning in September. User feedback will be taken into consideration as Powerset makes the final product, which is slated for release next year.</p>
<p>&#8220;The key challenge is to get the system to the point where people can understand how to use it and get real value out of these systems even though they are not perfect,&#8221; Pell says. &#8220;We are finally at the point where we are going to cross that threshold.&#8221;
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/08/technology-review-on-building-a-better-search-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
