March 23, 2009
Wolfram Alpha: A New Kind of Question-Answering System
There has been much excitement recently over the upcoming launch of Wolfram Alpha. This is a new question-answering system developed by Stephen Wolfram, inventor of Mathematica, and it is scheduled for a beta launch in May. Wolfram has been providing demos to industry insiders. I haven’t had a demo yet, but I have learned what I could from reading articles by Nova Spivack (“Wolfram Alpha computes answers to factual questions. This is going to be big”) and Doug Lenat (“I was positively impressed with Wolfram Alpha”). And this weekend I spoke with William Tunstall-Pedoe, CEO of True Knowledge, who also got a demo. Many of my examples and conclusions come from my conversation with William (thanks!). Since life is short and so is the attention of web readers, I’ll give the rest of my thoughts in bullet form.
What it is: A new kind of question-answering system.
Examples
- Math: “2+2” and then a few simple math questions: “integrate xsin^4xdx”, “what is the square root of 18”, etc.
- Business: “gdp france” showed amount and graph of how it changed over time. “gdp france/germany” showed graph with both amounts and the ratio
- “internet users in Europe”: Showed total, and a chart of usage by country in Europe, at the current time, specifically highlighting the biggest and smallest
- “ISS”: generates a graphic rendition of the International Space Station orbiting Earth, updating in real time
- “tides in san Francisco”: showed a graph of tides over time, with the times given in the local time convention that was in force in the late 19th century for those data points. “tide NYC 11/12/1922” gave a single answer.
- “weather”: showed a graph of average temperature in Cambridge, MA (where Stephen was when doing the demo). The location was inferred from a reverse IP lookup.
- Computational fluid dynamics: typing in the name of a specific aerofoil produced a picture of that aerofoil along with its differential equations.
- stock prices: “MSFT CSCO” showed comparison chart
- chemicals: For substances at a given temperature or pressure, it calculated physical properties. “H2SO4” showed a diagram and chemical properties. “5 molar h2s04” did something cool, I don’t know what.
- genome sequences: “AGTAG” shows sequences from the human genome that match that pattern
- data about people: “How old is Barack Obama” gives his age now. “When was Alan Turing born” gives the answer. “How old is Alan Turing” (a trick question) gives an error message with no human-readable explanation (True Knowledge, by contrast, tells you exactly why this is a trick question).
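The Alan Turing case shows why an age question needs more than a birth date. Here is a minimal sketch of how a system might detect the trick question and explain it. The `Person` record and the `age_answer` function are hypothetical names made up for illustration; this is not how Wolfram Alpha or True Knowledge is implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    born: date
    died: Optional[date] = None  # None means the person is still living


def years_between(start: date, end: date) -> int:
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet occurred in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def age_answer(person: Person, today: date) -> str:
    """Answer 'How old is X?', explaining the problem if X has died."""
    if person.died is not None and person.died <= today:
        age_at_death = years_between(person.born, person.died)
        return (f"{person.name} died on {person.died.isoformat()} "
                f"at age {age_at_death}, so has no current age.")
    return f"{person.name} is {years_between(person.born, today)} years old."


turing = Person("Alan Turing", date(1912, 6, 23), died=date(1954, 6, 7))
obama = Person("Barack Obama", date(1961, 8, 4))
```

Calling `age_answer(turing, date(2009, 3, 23))` returns an explanation rather than a bare error, which is the behavior the True Knowledge comparison points to.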
Coverage of data: It answers questions over the following types of structured data:
- static tables and databases (e.g. a database of internet usage by country by year)
- dynamic data feeds (e.g. historical stock market data, position of space shuttle, weather)
- numerical inference (e.g. math questions)
- numerical computations and simulations (e.g. tides, astronomy, chemistry)
Form of queries
- Answers can be a single fact, a table, or a graphical display of a live simulation. Usually it’s a combination of these.
- For ambiguous queries, it always picks one interpretation. If that one is wrong, you can switch to another from a drop-down menu of alternatives.
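The handling of ambiguous queries described above (pick one reading, keep the rest for a drop-down) can be sketched roughly as follows. Everything here is an assumption for illustration: the `Interpretation` type, the `CANDIDATES` table, and the scores are invented, and nothing is known from the demos about how Wolfram Alpha actually ranks readings.

```python
from typing import Dict, List, NamedTuple, Tuple


class Interpretation(NamedTuple):
    domain: str
    meaning: str
    score: float  # higher means more plausible; these values are made up


# Hypothetical candidate readings for one ambiguous query string.
CANDIDATES: Dict[str, List[Interpretation]] = {
    "mercury": [
        Interpretation("astronomy", "the planet Mercury", 0.6),
        Interpretation("chemistry", "the element mercury (Hg)", 0.8),
        Interpretation("mythology", "the Roman god Mercury", 0.2),
    ],
}


def interpret(query: str) -> Tuple[Interpretation, List[Interpretation]]:
    """Return the default reading plus the alternatives for a drop-down."""
    ranked = sorted(CANDIDATES.get(query.lower(), []),
                    key=lambda c: c.score, reverse=True)
    if not ranked:
        raise KeyError(f"no interpretation for {query!r}")
    return ranked[0], ranked[1:]


default, alternatives = interpret("Mercury")
```

The design point is that the system commits to the top-ranked reading instead of asking the user first, while keeping the others one click away.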
Domains and Generality
- Closed domain: A specified domain
- Multi domain: Multiple domains are covered, and more are added over time, but each one is still treated as closed. Note: this can be accomplished through a unified or a disjoint treatment.
- Open domain: Any domain is within scope
- Define the objects in the domain
- Make a table of function names and attributes in the domain, and for each function or attribute list the restrictions on the type of objects that this can apply to.
- Standardize representations of time and place and charting elements associated with these.
- Import and normalize data
- Associate data fields to objects and attributes in the domain
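The last five bullets above read as steps for setting up a single domain. A minimal sketch of those steps, under stated assumptions: the names `Entity`, `Domain`, and `normalize_row` are hypothetical, time is standardized simply as an integer year, and the numeric value is a placeholder rather than a real statistic.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Entity:
    # Define the objects in the domain: each has a name and a kind.
    name: str
    kind: str  # e.g. "country"


def normalize_row(raw_name: str, raw_year: str,
                  raw_value: str) -> Tuple[str, int, float]:
    """Import and normalize: trim and title-case names, parse the year
    as an int, and strip thousands separators from the value."""
    return (raw_name.strip().title(), int(raw_year),
            float(raw_value.replace(",", "")))


@dataclass
class Domain:
    # Attribute table: attribute name -> kind of object it may apply to.
    attribute_kinds: Dict[str, str] = field(default_factory=dict)
    # Data fields associated with (object name, attribute, year).
    facts: Dict[Tuple[str, str, int], float] = field(default_factory=dict)

    def add_fact(self, entity: Entity, attribute: str,
                 year: int, value: float) -> None:
        allowed = self.attribute_kinds.get(attribute)
        if allowed is None:
            raise KeyError(f"unknown attribute {attribute!r}")
        if entity.kind != allowed:
            # Enforce the type restriction from the attribute table.
            raise TypeError(f"{attribute!r} applies to {allowed}, "
                            f"not {entity.kind}")
        self.facts[(entity.name, attribute, year)] = value


domain = Domain(attribute_kinds={"gdp": "country"})
name, year, value = normalize_row(" france ", "2007", "1,234.5")  # placeholder
domain.add_fact(Entity(name, "country"), "gdp", year, value)
```

The type restriction is what keeps each domain closed: asking for the "gdp" of a person is rejected up front instead of producing a meaningless lookup.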
Infrastructure
- Question answering has been part of search results all along, but it has often been a second-class citizen and hardly promoted
- By making structured question answering more comprehensive (in terms of data and domains), a system like this can increase awareness and usage of question-answering systems
- This should move question answering to be more of a competitive feature across search engines
- Users will want to ask questions for structured and unstructured queries, not just structured queries, which will increase perceived differentiation for technology like Powerset
- If the use of structured data and simulations proves valuable to a large number of users and search engines, then this will increase the need to transform and route queries to vertical experts, potentially developed by ecosystem partners
- This will increase the need and value for ecosystem players to add semantic markup to their structured data and simulations, hence making it easier to offer more semantic question answering and integration with other services, and expanding the value of the services by search engines in a virtuous cycle
Conclusion
In conclusion, Wolfram Alpha is not going to be a new search engine or a universal answer engine. It is not going to put the existing major players or semantic search startups out of business. But there appears to be real innovation here, leading to at least a new kind of system that we have not seen before. I am eagerly looking forward to my turn to try it out.
Posted by barney at March 23, 2009 10:03 pm
This entry was posted in Collective Intelligence, Human Language Technology, Information retrieval, Powerset, Science, Search, Software, Web/Tech