<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Barney Pell&#039;s Weblog &#187; Collective Intelligence</title>
	<atom:link href="http://www.barneypell.com/archives/collective-intelligence/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.barneypell.com</link>
	<description></description>
	<lastBuildDate>Thu, 17 Dec 2009 09:20:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Wolfram Alpha: A New Kind of Question-Answering System</title>
		<link>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/</link>
		<comments>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 22:03:15 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Information retrieval]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Web/Tech]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=124</guid>
		<description><![CDATA[There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what [...]]]></description>
			<content:encoded><![CDATA[<p>There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what I could from reading articles by Nova Spivack (<a href="http://www.techcrunch.com/2009/03/08/wolfram-alpha-computes-answers-to-factual-questions-this-is-going-to-be-big/">“Wolfram Alpha computes answers to factual questions. This is going to be big”</a>) and Doug Lenat (<a href="http://www.semanticuniverse.com/blogs-i-was-positively-impressed-wolfram-alpha.html">“I was positively impressed with Wolfram Alpha”</a>). This weekend I also spoke with William Tunstall-Pedoe, CEO of <a href="http://www.trueknowledge.com/">True Knowledge</a>, who got a demo as well.  Many of my examples and conclusions come from my conversation with William (thanks!).  Since life is short and so is the attention of web readers, I&#8217;ll give the rest of my thoughts in bullet form.</p>
<p><strong>What it is: A new kind of question-answering system. </strong></p>
<p><strong>Examples</strong></p>
<ul>
<li> Math: &#8220;2+2&#8221; and then a few simple math questions: &#8220;integrate xsin^4xdx&#8221;, &#8220;what is the square root of 18&#8221;, etc.</li>
<li> Business: “gdp france” showed amount and graph of how it changed over time. “gdp france/germany” showed graph with both amounts and the ratio</li>
<li> “internet users in Europe”: Showed total, and a chart of usage by country in Europe, at the current time, specifically highlighting the biggest and smallest</li>
<li> “ISS”: generates a graphic rendition of the international space station orbiting earth and updating in real-time</li>
<li> “tides in san Francisco”: showed a graph of tides over time, where the times were listed in the local time regime current in the late 19th century for those data points. “tide NYC 11/12/1922” gave a single answer.</li>
<li> “weather”: showed a graph of average temperature in Cambridge, MA (where Stephen was when doing the demo), with the location inferred from a reverse IP lookup.</li>
<li> Computational fluid dynamics: typing in the name of a specific aerofoil produced a picture of that aerofoil along with its differential equations.</li>
<li> stock prices:  “MSFT CSCO” showed comparison chart</li>
<li> chemicals: for a substance at a given temperature or pressure, it calculated physical properties. “H2SO4” showed a diagram and chemical properties. &#8220;5 molar h2so4&#8221; did something cool, I don’t know what.</li>
<li> genome sequences: “AGTAG” shows sequences from the human genome that match that pattern</li>
<li> data about people: “How old is Barack Obama” gives his age now. “When was Alan Turing born” gives the answer. “How old is Alan Turing” (a trick question) gives an error message with no human-readable explanation (True Knowledge, by contrast, tells you exactly why this is a trick question).</li>
</ul>
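<p>To make the trick question concrete, here is a minimal Python sketch of an age lookup that explains itself when the person is no longer living. The <code>Person</code> record and the <code>how_old</code> function are invented for illustration; this is an assumption about how such a check could look, not a description of how Wolfram Alpha or True Knowledge works.</p>

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    # Hypothetical record type, used only for this sketch.
    name: str
    born: date
    died: Optional[date] = None  # None means the person is still alive


def years_between(start: date, end: date) -> int:
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet occurred in the end year.
    if (start.month, start.day) > (end.month, end.day):
        years -= 1
    return years


def how_old(person: Person, today: date) -> str:
    """Answer 'how old is X', explaining why there is no answer if X has died."""
    if person.died is not None:
        age_at_death = years_between(person.born, person.died)
        return (f"{person.name} died on {person.died.isoformat()} "
                f"at age {age_at_death}, so has no current age.")
    return f"{person.name} is {years_between(person.born, today)} years old."


turing = Person("Alan Turing", date(1912, 6, 23), date(1954, 6, 7))
obama = Person("Barack Obama", date(1961, 8, 4))
print(how_old(obama, date(2009, 3, 23)))
print(how_old(turing, date(2009, 3, 23)))
```

<p>The point of the sketch is only that the "no answer" case is represented explicitly, so the system can return a readable explanation instead of a bare error.</p>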
<p><strong>Coverage of data: It answers questions over the following types of structured data:</strong></p>
<ul>
<li> static tables and databases (e.g. a database of internet usage by country by year)</li>
<li> dynamic data feeds (e.g. historical stock market data, position of space shuttle, weather)</li>
<li> numerical inference (e.g. math questions)</li>
<li> numerical computations and simulations (e.g. tides, astronomy, chemistry)</li>
</ul>
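<p>As a rough illustration of the simplest case, a static table, the Python sketch below computes the kind of summary shown for “internet users in Europe”: a total plus the largest and smallest entries. The table contents are placeholder values with made-up country labels, not real statistics, and the <code>summarize</code> function is a hypothetical name chosen for this example.</p>

```python
# Placeholder figures for illustration only; these are NOT real statistics.
USERS_BY_COUNTRY = {
    "Country A": 40,
    "Country B": 25,
    "Country C": 5,
}


def summarize(table: dict) -> dict:
    """Return the total plus the names of the largest and smallest entries."""
    return {
        "total": sum(table.values()),
        "largest": max(table, key=table.get),
        "smallest": min(table, key=table.get),
    }


print(summarize(USERS_BY_COUNTRY))
```

<p>Dynamic feeds and simulations would replace the fixed dictionary with values fetched or computed at query time, but the summary step could stay the same.</p>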
<p><span id="more-124"></span></p>
<div id="a000132more">
<div id="more">
<p><strong> Form of queries</strong></p>
<li> The queries are expressed in template-based natural language or corresponding abbreviated forms</li>
<li> NL syntax: “what is the gdp of france”</li>
<li> Template compressed: {attribute} of {object} {time}  (“gdp france 2008”)</li>
<li> Mathematical expressions, or NL versions of these (as one might do in an entry-level LISP class)</li>
<li> I can imagine the query language supports (or could support) restrictions on presentation (plot, chart) and other constraints one might express in SQL (order by, etc.), though I haven’t seen any examples showing this exists at present.
<p><strong>Presentation and Answers</strong></p>
<ul>
<li> Answers can be a single fact, a table, or a graphical display of a live simulation.  Usually it’s a combination of these.</li>
<li> For ambiguous queries, it always picks one interpretation, and a drop-down menu of alternatives lets you switch if that’s wrong.</li>
</ul>
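<p>The compressed template form and the pick-one-interpretation behaviour described above can be sketched together in a few lines of Python. Everything named here (the attribute vocabulary, the filler-word list, <code>parse_query</code>, <code>interpret</code>, and the candidate senses) is an assumption made for illustration; I have no information on how Wolfram Alpha actually parses its input.</p>

```python
import re
from typing import NamedTuple, Optional

# Hypothetical attribute vocabulary; a real system would curate this per domain.
KNOWN_ATTRIBUTES = {"gdp", "population"}

# Filler words dropped so "what is the gdp of france" reduces to "gdp france".
FILLER = {"what", "is", "the", "of", "in"}

# Hypothetical senses for an ambiguous object name, most likely first.
SENSES = {"georgia": ["Georgia (country)", "Georgia (US state)"]}


class Query(NamedTuple):
    attribute: str
    obj: str
    year: Optional[int]


def parse_query(text: str) -> Optional[Query]:
    """Parse '{attribute} of {object} {time}' in full or compressed form."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [w for w in words if w not in FILLER]
    if not tokens or tokens[0] not in KNOWN_ATTRIBUTES:
        return None
    attribute, rest = tokens[0], tokens[1:]
    year = None
    # A trailing four-digit token is treated as the time slot.
    if rest and re.fullmatch(r"\d{4}", rest[-1]):
        year = int(rest.pop())
    if not rest:
        return None
    return Query(attribute, " ".join(rest), year)


def interpret(obj: str) -> tuple:
    """Pick one interpretation and return the rest as alternatives."""
    candidates = SENSES.get(obj, [obj])
    return candidates[0], candidates[1:]


print(parse_query("what is the gdp of france"))
print(parse_query("gdp france 2008"))
print(interpret("georgia"))
```

<p>Both phrasings of the GDP question reduce to the same attribute and object, and the alternatives returned by <code>interpret</code> are what a drop-down menu would list.</p>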
<p><strong>Domains and Generality</strong></p></li>
<li> Wolfram Alpha is described as an open domain question answering system on structured data. But how exactly is this open domain? I distinguish three levels of domain generality:
<ul>
<li> Closed domain: A specified domain</li>
<li> Multi domain: Multiple domains are covered and more can be added, but each one is still treated as closed. Note: this can be accomplished through a unified or disjoint treatment.</li>
<li> Open domain: Any domain is within scope</li>
</ul>
</li>
<li>For Wolfram Alpha they have taken a domain-by-domain approach. For each domain, they determined what type of questions to support, and which data, feeds, or simulations to incorporate, and did hand curation to enable these.</li>
<li> The domains are typically fact and data oriented, especially where simulations are available.
<p><strong>Architecture</strong></p></li>
<li> The system is coded in Mathematica, about 4.5M lines of code, developed by a large team (100 people at present).</li>
<li> From this <a href="http://www.wolfram.com/products/mathematica/quickoverview/">presentation on Mathematica</a> it is quite easy to extrapolate what Wolfram Alpha is like &#8211; essentially Mathematica + a vast library of mathematical models and data attached + some error-tolerant processing of the user&#8217;s input (thanks Peter Clark for pointing this out).</li>
<li> Piecing together the Mathematica approach and generalizing from the examples and my own knowledge, I believe they have a basic set of representational tools that is shared across domains. Here&#8217;s how I would think about this:
<ul>
<li> Define the objects in the domain</li>
<li> Make a table of function names and attributes in the domain, and for each function or attribute list the restrictions on the type of objects that this can apply to.</li>
<li> Standardize representations of time and place and charting elements associated with these.</li>
<li> Import and normalize data</li>
<li> Associate data fields to objects and attributes in the domain</li>
</ul>
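<p>One way to picture these steps is the minimal Python sketch below. It is only a guess at the general shape: the <code>Domain</code> class, its method names, and the sample row are hypothetical, the value is a placeholder rather than a real statistic, and the real system is written in Mathematica, not Python.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A toy domain: typed objects, attributes with type restrictions, facts."""
    name: str
    objects: dict = field(default_factory=dict)     # object name: object type
    attributes: dict = field(default_factory=dict)  # attribute: allowed type
    facts: dict = field(default_factory=dict)       # (attribute, object, year): value

    def define_object(self, obj: str, obj_type: str) -> None:
        """Step 1: define the objects in the domain."""
        self.objects[obj] = obj_type

    def define_attribute(self, attribute: str, applies_to: str) -> None:
        """Step 2: record which type of object an attribute applies to."""
        self.attributes[attribute] = applies_to

    def import_row(self, attribute: str, obj: str, year, value) -> None:
        """Steps 3 to 5: normalize one raw row and attach it to the domain."""
        attribute = attribute.strip().lower()
        obj = obj.strip().lower()
        obj_type = self.objects.get(obj)
        # Enforce the type restriction from the attribute table.
        if obj_type is None or self.attributes.get(attribute) != obj_type:
            raise ValueError(f"{attribute!r} does not apply to {obj!r}")
        # Time is standardized to an integer year in this sketch.
        self.facts[(attribute, obj, int(year))] = float(value)

    def lookup(self, attribute: str, obj: str, year: int) -> float:
        return self.facts[(attribute, obj, year)]


economy = Domain("economy")
economy.define_object("france", "country")
economy.define_object("msft", "company")
economy.define_attribute("gdp", "country")
# Placeholder value, not a real statistic.
economy.import_row(" GDP ", "France", "2008", 1.0)
print(economy.lookup("gdp", "france", 2008))
```

<p>With the restriction table in place, a nonsensical row such as a GDP for a company is rejected at import time instead of producing a meaningless answer later.</p>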
<p><strong>Infrastructure</strong></p></li>
<li> The system runs on thousands of expensive servers (running Mathematica in real-time).</li>
<li> Apparently 10 machines per query give 1 query per second (qps), or 0.1 qps per machine, so they can do 100 qps on 1,000 machines.
<p><strong>What is innovative about this</strong></p></li>
<li> Rich mathematical computational infrastructure (Mathematica) to support mathematical aspects of natural language queries</li>
<li> Integration of mathematical inference and simulations along with structured data in a single question-answering system</li>
<li> Unprecedented level of structured data aggregation and curation</li>
<li> Rich presentation including static and dynamic elements and multiple modalities</li>
<li> (Potentially) Deployment of NL-to-SQL query translation in a multi-domain system. The technology to do this has existed for several years, but I don’t know if anyone has deployed it yet. I’m not sure whether Wolfram has, and I haven’t seen enough examples to tell.
<p><strong>What it doesn’t do</strong></p></li>
<li> Queries or presentation against unstructured data (neither keyword nor NL queries against unstructured data, which is a strength of <a href="http://www.powerset.com/">Powerset</a>)</li>
<li> Queries requiring ontological or commonsense inference (whether structured or unstructured, which is a strength of True Knowledge and <a href="http://www.cyc.com/">Cyc</a>)</li>
<li> Answers in support of transactions (e.g. price feeds from many merchants or airlines), which is shown in various stages in many major search engines</li>
<li> Cross-domain queries that span multiple domains (e.g. “what was the weather in San Francisco when Yahoo was founded”, which is a strength of True Knowledge)
<p><strong>Implications for the field</strong></p>
<ul>
<li> Question answering has been an important part of search results the whole time, but it has often been a second-class citizen and hardly promoted</li>
<li> Increasing the comprehensiveness of structured question answering (in terms of data and domains) can increase awareness and usage of question answering systems</li>
<li> This should move question answering to be more of a competitive feature across search engines</li>
<li> Users will want to ask questions over both structured and unstructured data, not just structured data, which will increase perceived differentiation for technology like Powerset</li>
<li> If the use of structured data and simulations proves valuable to a large number of users and search engines, then this will increase the need to transform and route queries to vertical experts, potentially developed by ecosystem partners</li>
<li> This will increase the need and value for ecosystem players to add semantic markup to their structured data and simulations, hence making it easier to offer more semantic question answering and integration with other services, and expanding the value of the services by search engines in a virtuous cycle</li>
</ul>
<p><strong>Conclusion</strong></p>
<p>In conclusion, Wolfram Alpha is not going to be a new search engine or a universal answer engine. It is not going to put the existing major players or semantic search startups out of business. But there appears to be real innovation here, leading to at least a <span style="text-decoration: underline;">new kind of system</span> that we have not seen before.  I am eagerly looking forward to my turn to try it out.</li>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2009/03/wolfram-alpha-a-new-kind-of-question-answering-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Natural Language and the Semantic Web: ISWC Keynote talk</title>
		<link>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/</link>
		<comments>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/#comments</comments>
		<pubDate>Mon, 19 Nov 2007 20:29:54 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Human Language Technology]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[ISWC07]]></category>
		<category><![CDATA[Korea]]></category>
		<category><![CDATA[natural language]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=102</guid>
		<description><![CDATA[I gave an invited keynote talk last week at The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007. The abstract for the talk is below. The image below links to the original video and presentation slides. The live presentation (and video) contains technical demos that aren&#8217;t in the slides. Some [...]]]></description>
			<content:encoded><![CDATA[<p>I gave an invited keynote talk last week at <a href='http://videolectures.net/iswc07_busan/'>The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, 2007</a>.  The abstract for the talk is below.  The image below links to the original video and presentation slides.</p>
<p>The live presentation (and video) contains technical demos that aren&#8217;t in the slides.  Some of the demos are already available inside <a href="http://labs.powerset.com">Powerlabs</a> (e.g. Powermouse, which lets you browse and query our semantic database of facts extracted from Wikipedia), while others are still internal (e.g. an open search box, and output of our natural language system on full sentences).  I also gave some detailed walk-throughs showing how Powerset takes advantage of external semantic resources like <a href="http://wordnet.princeton.edu/">WordNet</a> and <a href="http://www.Freebase.com">Freebase</a>.</p>
<p>For me, the most fun part of the talk was toward the end, where I got to speculate on how ecosystem effects can make natural language search and the semantic web become deeper and more powerful more quickly than people might expect. For example, advertisers, publishers, and vertical search sites will be able to contribute ontologies that enable them to get more users, better internal search, and more revenue, while having as a side effect that the broad search engines get more knowledgeable about different domains.<br />
The questions afterward were also challenging and interesting.<br />
<a href='http://videolectures.net/iswc07_pell_nlpsw/'><br />
<img src='http://videolectures.net/iswc07_pell_nlpsw/thumb.jpg' border=0/><br />
<br/>POWERSET &#8211; Natural Language and the Semantic Web</a><br/></p>
<p><span id="more-102"></span><br />
The Semantic Web promises to revolutionize access to information by adding machine-readable semantic information to content which is normally interpretable only by people. In addition, it will also revolutionize access to services by adding semantic information to create machine-readable service descriptions. This ambitious vision has been slow to take off because of a chicken-and-egg problem. Markup is required before people will build applications, and applications are required before it is worth the hard work of doing markup. Natural language processing (NLP) has advanced to the point where it can break the impasse and open up the possibilities of the Semantic Web. First, NLP systems can now automatically create annotations from unstructured text. This provides the data that semantic web applications require. Second, NLP systems are themselves consumers of semantic web information and thus provide economic motivation for people to create and maintain such information. For example, a new generation of natural language search systems, as illustrated by Powerset, can take advantage of semantic web markup and ontologies to augment their interpretation of underlying textual content. They can also expose semantic web services directly in response to natural language queries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/11/natural-language-and-the-semantic-web-iswc-keynote-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerlabs internal launch</title>
		<link>http://www.barneypell.com/2007/08/powerlabs-internal-launch/</link>
		<comments>http://www.barneypell.com/2007/08/powerlabs-internal-launch/#comments</comments>
		<pubDate>Thu, 09 Aug 2007 20:49:52 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Social Networking]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=96</guid>
		<description><![CDATA[Today was an exciting milestone for Powerset. We released the first version of the Powerlabs platform for our employees to try out. The Powerlabs platform is a framework for innovation in which a community of users can generate and refine ideas as they interact with products and concepts. It combines elements of social networking, crowd-sourcing, [...]]]></description>
			<content:encoded><![CDATA[<p>Today was an exciting milestone for Powerset.  We released the first version of the Powerlabs platform for our employees to try out.  The Powerlabs platform is a framework for innovation in which a community of users can generate and refine ideas as they interact with products and concepts. It combines elements of social networking, crowd-sourcing, and social search (among other buzzwords that, in our case, really make a difference).<br />
It turns out that the product team have been using Powerlabs to improve Powerlabs itself, so there was already a large set of ideas and evaluations by the time the rest of the employees got to try the system out.  And we are already finding the system to be addictive: within a couple of hours of internal release (the time it took Product Manager Mark Johnson and me to play a few matches of Dance Dance Revolution), over 50 ideas had already been generated and evaluated!<br />
With this much interest from our own small number of employees, it is amazing to think about the kind of ideas, creativity, and feedback we are going to get from the 16,000 people already signed up for the Powerlabs launch in September! (You can sign up at the <a href="http://labs.powerset.com">Powerlabs Website</a>).<br />
Powerlabs is so cool, in fact, that we have already started talking about potentially offering this as a service to other companies who want community innovation around their products (both internal employees and outside users).  So the race is now on to see which takes off faster: a radical new way to search using natural language, or a radical new way to create products!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2007/08/powerlabs-internal-launch/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Prediction Markets</title>
		<link>http://www.barneypell.com/2005/09/google-prediction-markets/</link>
		<comments>http://www.barneypell.com/2005/09/google-prediction-markets/#comments</comments>
		<pubDate>Fri, 30 Sep 2005 23:26:18 +0000</pubDate>
		<dc:creator>Barney</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Ecommerce]]></category>

		<guid isPermaLink="false">http://174.120.172.92/~barneype/?p=56</guid>
		<description><![CDATA[Patri Friedman, a Google engineer who works on evaluating search quality, posted about the surprising accuracy of Google&#8217;s Internal Prediction Market. I&#8217;ve written a post about the previous prediction markets workshop at SuperNova2005, which gave some background on the topic from pioneers and leaders in the field. I am excited to see Google developing and [...]]]></description>
			<content:encoded><![CDATA[<p>Patri Friedman, a Google engineer who works on evaluating search quality, posted about the surprising <a href="http://catallarchy.net/blog/archives/2005/09/22/more-on-google-prediction-markets/">accuracy of Google&#8217;s Internal Prediction Market</a>.<br />
I&#8217;ve written a post about the <a href="http://www.barneypell.com/archives/2005/06/prediction_mark.html">previous prediction markets workshop at SuperNova2005</a>, which gave some background on the topic from pioneers and leaders in the field.<br />
I am excited to see Google developing and using prediction markets internally. It just points to what, in my mind, is one of the best things about Google: they really think about <u>collective intelligence</u> (CI), in the sense envisioned by Doug Engelbart &#8212; how to ensure that the many within an organization or community can process information and make decisions that benefit from scale, rather than get hurt by it.</p>
<p><span id="more-56"></span><br />
From Bill Gates&#8217; book &#8220;Business at the Speed of Thought&#8221;, it seemed that Microsoft truly embraced the concept, but now Google is likely to have taken it to a new level.<br />
I wish that Google would take its internal collective intelligence tool suite (wikis, bug tracking, resistance to PowerPoint, and now prediction markets) and make them open source or otherwise available so that other organizations could adopt them.  Of course that requires cultural inclinations that are not common outside Google.<br />
In any case, when people talk about Google&#8217;s greatest competitive assets, I think their &#8220;CI culture&#8221; should rank high up there, although I don&#8217;t recall seeing it discussed by any of the analysts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneypell.com/2005/09/google-prediction-markets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

