March 12, 2006
In-car conversational voice interfaces: Speak With Me and VoiceBox
I’ve been meaning to write about this for a while. Here are two comments about innovations in speech processing for embedded (e.g. in-car) applications.
- My friend Ajay Juneja is founder of Speak With Me, a startup providing voice-based interaction with in-car information services (stereo, iPod, GPS). I got a demo of his system last year, when he drove me around in his car and we navigated his iPod by voice. There were a few bugs back then (which Ajay tells me are now fixed), but in general the system worked pretty well. Speak With Me got a good writeup on TechCrunch a couple of months ago, and Ajay says the good news keeps on coming. I’m trying to convince Ajay to give me a beta version of his iPod voice navigator for my birthday.
- See the article in PC Magazine entitled “IBM Strives For Superhuman Speech Tech”. To quote the article:
IBM unveiled new speech recognition technology on Tuesday that can comprehend the nuances of spoken English, translate it on the fly, and even create on-the-fly subtitles for foreign-language television programs.
VoiceBox Technologies has already implemented the new ViaVoice for in-car navigation, controlling XM Satellite Radio via conversational speech.
March 10, 2006
Visual Image Search
I just came across Tiltomo, a new visual search engine (thanks to a posting on Silicon Beat). Most visual search methods are based on color histograms, so they are good at finding images whose color scheme resembles a reference image (e.g. roses, sunsets). This one also lets you constrain the search by the “theme” of a reference image (without having to specify the theme yourself). The homepage has a nice demo using images from Flickr. I did a search for the tag “spider” and got lots of great spider images. Then I selected one of them, a close-up of a spider in a web, and got many more like it. For other searches, I found it easy to get collections of photos of people’s eyes, stuffed animals wearing stripey costumes, and of course, swimming suits.
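To make the color-histogram idea concrete, here is a minimal sketch (my own toy code, not Tiltomo’s actual implementation): quantize each pixel’s RGB channels into coarse bins, normalize the counts, and score similarity by histogram intersection.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` buckets and count occurrences,
    returning a normalized histogram (values sum to 1)."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 means identical color distributions."""
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0)) for k in set(h1) | set(h2))

# Two tiny synthetic "images": one 90% red / 10% blue, one 70% red / 30% blue
img_a = [(250, 10, 10)] * 90 + [(10, 10, 250)] * 10
img_b = [(240, 20, 20)] * 70 + [(20, 20, 240)] * 30

sim = histogram_intersection(color_histogram(img_a), color_histogram(img_b))
```

Note that this captures only the distribution of colors, not where they appear in the image, which is exactly why pure histogram methods excel at color schemes (roses, sunsets) but need extra signals, like Tiltomo’s “theme” constraint, to distinguish subjects.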
I did find myself wondering about how much human effort was involved in preparing the data for use with the service, and how that would scale up. The system seemed almost too good at knowing which photos were of people vs. animals, men vs. women, and using this information to find related photos.
While I’m writing about visual search, check out AHOP2, which lets you find websites based on aesthetic style. It’s a good way to look for furniture that might go well in a modern house, for example.
Down the line, I know of several companies that will be bringing visual object recognition into search. That’s a topic for a later post.
March 10, 2006
Calendars and Natural Language
- Visiting Granny at 12 on March 1st: no problem, off to a good start.
- Flying to Cambridge on the 2nd then back on the 4th: sweet – it got the two dates. They were placed in February, so no context (my last entry was March), but that is probably the right behaviour.
- Flying to Boston tomorrow: this got entered in today’s field; it could be a time zone problem. It is 2am on the 31st where I am, so the entry should have been on the 1st of February.
- Flying to Boston in a week: nope, turned up in yesterday’s list. Could be related to the issue above in that it really meant to put it in today’s list – either way it’s wrong.
- Flying to Boston on Thursday: no problem.
- Flying to Boston a week on Thursday: nope – just Thursday.
- Flying to Boston on the 30th of Feb: oops – turns up on the 2nd of March. An understandable error, but certainly a corner case that needs to be addressed.
I tried out the site. My initial impression of the look and feel was good, including the appearance of the calendar itself and the bubble tips for new users. About 75% of my NL inputs were handled correctly by the system, but as Matt says, users will likely learn which cases work well and then get high performance by sticking to those patterns.
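For a sense of what these parsers are doing under the hood, here is a toy sketch of relative-date resolution using only Python’s standard library. The phrase set and the rules (e.g. “on Thursday” meaning the next such weekday) are my own simplifications, not any product’s actual grammar:

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def parse_when(phrase, today):
    """Resolve a small set of relative-date phrases against `today`.
    Returns a date, or None if the phrase is unrecognized."""
    phrase = phrase.lower().strip()
    if phrase == "today":
        return today
    if phrase == "tomorrow":
        return today + timedelta(days=1)
    if phrase == "in a week":
        return today + timedelta(weeks=1)
    for i, name in enumerate(WEEKDAYS):
        if phrase == f"on {name}":
            # next occurrence of that weekday, at least one day ahead
            ahead = (i - today.weekday() - 1) % 7 + 1
            return today + timedelta(days=ahead)
        if phrase == f"a week on {name}":
            ahead = (i - today.weekday() - 1) % 7 + 1
            return today + timedelta(days=ahead + 7)
    return None

today = date(2006, 3, 10)  # a Friday
```

Even this tiny sketch shows why “a week on Thursday” trips parsers up: the handler for the longer phrase must be checked separately, or a naive keyword match falls through to plain “on Thursday” – which appears to be exactly the failure mode above.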
It’s perhaps the best-kept secret at Microsoft, but did you know that Microsoft Outlook already supports some natural-language entry of calendar events as well? Open up an appointment (new or existing) and, in the date field for the time, enter “fourth monday of April”. Outlook converts that into the correct date. I use this feature all the time.
TechCrunch has a writeup about SpongeCell and many other players in the Calendar2.0 space. That page features 73 comments, many by other calendar companies, so to some extent it captures the current state of play on this topic. One related company missing from the list is TimeBridge (product coming soon), a Mayfield portfolio company for which I’m an advisor, along with my friend Mark Drummond (founder of one of the Calendar1.0 companies, called TimeDance, for those who remember).
After writing the first draft of this article, I saw another post on TechCrunch about Google’s upcoming calendar. The information was leaked by a beta tester, and includes detailed screenshots. Key elements from that posting:
- Natural-language event input: Just type in your event details and the service will parse them to fill out the form automatically: “dinner with Michael 7pm tomorrow” (just like SpongeCell).
- Event pages: Create an event and automatically have a web page you can share with friends or the world at large.
There is also discussion about Google Calendar scraping events from around the net, just like Zvents does.
All of this was to be expected. It’s not necessarily bad news for all the Calendar2.0 startups. I think they’re unlikely to succeed standalone, and Google is unlikely to acquire them, but the enhanced feature set will likely become important for all the major portals, which should lead to acquisitions of the top 3-5 new entrants in the near future.
Update: I got pointed to another cool Calendar2.0 entrant: 30 Boxes. It also supports natural-language entry, and it looks like it makes it really easy to share events with friends and family.