April 12, 2005
Recent Innovations in Search: Notes from BayCHI panel
The San Francisco Bay Area ACM Special Interest Group on Computer-Human Interaction (SIGCHI) hosted a panel discussion on “Recent Innovations in Search, and Other Ways of Finding Information”.
Abstract:
Search has been an exciting area, with a spate of recent innovations and acquisitions. From Visual Yellow Pages to sophisticated client side interactivity, from RSS feeds to folksonomies, developments have come at a fast and furious pace.
– Where is all this leading?
– Is search evolving beyond the simple search box?
– What else can we look forward to in the near future?
– What type of user experience challenges do these innovations bring?
Panelists:
– PETER NORVIG, Google
– MARK FLETCHER, Ask Jeeves, Inc.
– UDI MANBER, A9.com
– KEN NORTON, Yahoo!
– JAKOB NIELSEN, Nielsen Norman Group
BayCHI Program Co-Chair RASHMI SINHA will moderate.
Complete abstract and bios: http://www.baychi.org/program/
There was an overflow crowd, with people sitting in the aisles.
The rest of this blog entry contains my notes.
I’m pretty good at capturing sessions in real time, so this almost looks like a transcript, but I don’t claim the notes are accurate: I did not capture everything, and I abbreviated or interpreted as I absorbed the content. However, we were told the full audio will be archived, so I encourage you to listen to the whole talk if you want to quote the speakers.
The session began with a 5-10 minute talk by each panelist, followed by questions from the moderator and then questions from audience members.
Peter Norvig, Google
1. just launched question and answer. “what is population of japan” “when did elvis presley die” “how many moons does jupiter have”
2. search by SMS, cellphone. now get maps on your cellphone
3. maps, now with integrated satellite photos
4. google suggest: the technique is now called “AJAX” (asynchronous JavaScript and XMLHttpRequest)
5. desktop search: your search results first, and/or results from the web
Ken Norton, Sr Director, Product Management, Yahoo! Search
It’s hard to believe, but Yahoo! Search was launched only a year ago
1. yahoo search vision: enable people to find, use, share and expand all human knowledge.
2. much discussion in the past was only about the find part. how to expand to the other elements is now the topic.
3. My Yahoo search (beta): save and share content from search. like bookmarking, but with more control and portability
4. video search
- more demand, content driving
- big interface support for media
- rss to submit content into the index
5. desktop search: various file formats, mail, address book
6. y!q: related search results on-the-fly
- bring search to point of inspiration
- learn more while you’re surfing and without leaving the page
- eg. related news and web search results shown within the page
7. mobile search
8. creative commons search
- search for content based on Creative Commons licenses
- find content I can use for commercial purposes
- find content I can modify, adapt or build upon
9. flickr acquisition
- photo sharing service + community
- bringing that dna into yahoo
- thinking how users share content and tag it
10. y! developer network
next:
- personal search
- contextual search
- multimedia search: done video, great image search. using media to enhance general web search experience
- desktop
- local
- travel
- social media
showcase site: http://next.yahoo.com
Mark Fletcher, Ask Jeeves. Founder and GM of Bloglines.
- founded june 03, acq feb 05, building and expanding service
- search, subscribe, share, publish
- news, rss, blog feeds. blog publishing capability as well
- started the company because he had a long list of bookmarked sites he visited every day
- blog content is fastest growing segment of content on internet. pull in 1.6M articles every day
- “universal inbox” metaphor
- “search into the future”: saving the search creates a subscription in your account. within an hour, searches will pull in new stuff. eg. standing search for articles that mention bloglines
- searching like this will be a key factor in dealing with info overload
- intro’d package tracking service: status of package becomes a subscription
- future: search over friends, filter articles
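A minimal sketch of the “search into the future” idea from Mark’s talk: a saved query behaves like a subscription, and each poll delivers only newly arrived articles that match. This is an illustration under assumptions, not how Bloglines actually works; the names `Article`, `StandingSearch` and `poll`, and the simple substring match, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    url: str
    text: str


@dataclass
class StandingSearch:
    """A saved query that acts like a subscription (hypothetical model)."""
    query: str
    delivered_urls: set = field(default_factory=set)

    def poll(self, incoming):
        """Return matching articles that were not delivered by an earlier poll."""
        needle = self.query.lower()
        fresh = [
            article for article in incoming
            if needle in article.text.lower()
            and article.url not in self.delivered_urls
        ]
        # Remember what was delivered so the next poll skips it.
        self.delivered_urls.update(article.url for article in fresh)
        return fresh


if __name__ == "__main__":
    search = StandingSearch("bloglines")
    batch = [
        Article("http://example.com/a", "Trying out Bloglines today"),
        Article("http://example.com/b", "Unrelated post"),
    ]
    for article in search.poll(batch):
        print(article.url)
```

Polling the same batch a second time returns nothing, which is the property that makes a saved search feel like a feed rather than a repeated query.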
Udi Manber, CEO, A9
1. yellow pages
- took >28M images country-wide
- walk up and down the street
- allows search not by keywords. eg. coffee shop next to pier 1 imports on castro st
- covered most of NYC
- invented a “neutered mouse” that doesn’t click while you drive
- video about security problems imaging outside the white house
(why so much emphasis on the yellow page images? to get acquired by Google?)
Jakob Nielsen, Nielsen Norman Group
length of query strings: in only 10 years we have had a 100% change in user behavior, going from typing 1 word to 2 words.
mean length: 1.3 in 94, 1.9 in 97, 2.2 in 2004
we might get up to 3 words avg in a few more years. don’t think we’ll get to 10 words any time soon, or boolean queries ever.
search success:
- found what user was looking for
  all searches: 42%
  users who scrolled first page: 48%
- by user’s level of internet experience
  low: 32%
  high: 50%
  (big issue: we’re leaving behind the people who aren’t highly literate)
- where was search done
  external search engines: 56%
  internal search on website: 33%
  big issue: search on websites is a miserable failure. disgrace for the field that we’ve allowed search interfaces to be so horrible on websites. intranets are even worse. search quality is the biggest issue with intranets.
eg. inexperienced user doing a search for sinus headache
- types query in the login field, but clicks the “search here” button
- clicks search a few more times with no results
- clicks on various tabs, gets same empty box
- goes to help but finds no help for when you don’t type anything. as a low-literacy user, the help is unusable
- goes to url bar, types: www.head ache.com
- gets a MSFT error, which tells her to type the word in the search box. however, it requires her to click ok before doing that
- then types “head ache” into the search box
- gets a spelling correction suggestion, but wants to accept her own (mis)spelling
- tries clicking on things that aren’t clickable
- randomly mouses over a link for headaches, gets to webmd
- selects drugs and herbs from the left nav panel, but that’s generic, not in the context of headache
- across 25% websites, 11% of the usability problem severity was due to search (the biggest category)
Q&A
1. comments on claim that search in websites is miserable failure
udi: search is hard, even when you have large teams. for a small site, it’s really hard. you can’t just take it off the shelf and have it work.
mark: issue wasn’t just for search. how browser was laid out, text fields, etc. computers in general are way too hard to use, have a lot of work to do.
peter: personal example, was looking for thinkpad on ibm website in 97. failed. then went to google, typed it in and got the first click and found it. that was a powerful moment, where accessing a general collection of info was easier than ibm, even with their formidable resources
ken: glad didn’t show the yahoo intranet. seeing more sites using major search engines to search their own sites (site: restriction). before you undertake this problem, decide whether it’s the right approach
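A small sketch of the site: restriction Ken mentions, where a website hands its own search off to a general engine by adding a site: operator to the query. The function names, the `q` parameter name and the base URL are assumptions for illustration; check the chosen engine’s documentation for its real query format.

```python
from urllib.parse import urlencode


def site_restricted_query(terms: str, domain: str) -> str:
    """Append a site: operator so results come only from one domain."""
    return f"{terms} site:{domain}"


def search_url(base_url: str, terms: str, domain: str) -> str:
    """Build a search URL; the 'q' parameter name is an assumption."""
    query = site_restricted_query(terms, domain)
    return base_url + "?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical engine endpoint and site.
    print(search_url("https://search.example/search", "thinkpad", "example.com"))
```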
2. how have searchers evolved in terms of search queries?
ken: demand for better searching tools in particular vertical areas. a few years ago people would just type a shopping or local query into engine. they now want the context appropriate to the vertical. “pizza palo alto” had better be relevant to the local query. looking at categories of searches and how to improve within general web search. as users uncover those features, they get more adept at using them, continue to use them. challenge is how to let the user know these things are there. it’s hard to message from the search results page. what are the categories? what are users expecting?
peter: agrees, vertical and local search are important international is a big growth for google. deal with the world’s languages.
mark: jeeves team had 3 things. a. query length on jeeves is getting shorter. people realizing it’s not just NL queries that Jeeves does anymore b. content: tail has grown substantially. most queries are unique over the day. c. expectations have grown. searching more as searches getting better. leads to more searches.
udi: size of search is proportional to size of the searchbox. UI matters. agree with vertical search. took to a new level with opensearch, letting people syndicate their search. do for search what RSS does to content. already have 100 sources. eg creative commons was there on the first day…
jakob: avg search box is 18 chars wide. 30 characters are needed to accommodate queries people enter on sites. people don’t want to suffer with the interfaces that websites give them. if websites would use more conventional, standardized navigation, eg. pull out site map and use on your cpu. standardized navigation, not individual tricks. site maps: the hyperbolic ones fail as they are one-off. need to be standardized.
3. how to give access to all these vertical search engines without becoming a directory?
ken: recognize that web search box is the interface the user wants and will use. rather than sending them somewhere else when they do a local search, give results in the context of that interface. directories tended to force the user to some other place. rather than ask users to find the right vertical, ask search engine to be smarter at figuring out the query intent. eg. is pizza palo alto a local query? then direct them through the web results. don’t send them somewhere else but give results in a familiar ui.
also, users have gotten competent in web surfing experience. 10 years ago they didn’t know what to type. now they expect good results. up to us to deliver.
udi: people put in “population of india” and got good results. then say “what’s best tv to buy”, but best is not a query word and doesn’t work well. goal is to make it better and easier. issue to discuss: how many powerful tools to give users? how to move from a simple: query, results, repeat, to more tools that make it more difficult. fine balance between ease of use and powerful results
peter: we’ve failed in not allowing user to have a dialog. search box is a command line vs a menu interface, but we don’t tell you what the command language is. people don’t use the advanced syntax, just type in words. when it doesn’t work, we should have a dialog, give advice and tell users where they should go. have had experiments, but haven’t yet had the kind of dialog you get when you ask a librarian for help (> 2.2 words, doesn’t end with just a query returning a bunch of books)
survey: how many of us have read the advanced search help on any engine? (everyone in audience)
4. tags and folksonomies: will these lead to new paradigms?
udi: anything you can do I can do meta. metadata is another layer on data. we need to analyze all kinds of data and do the best, so it’s not any different. the hope is people will provide more data, that remains to be seen.
ken: web is almost one big tagging system. publish and link pages, put text in the links. the link text is key for relevance. but so far users haven’t been entering that kind of metadata. tagging lets you collect the metadata from more users the more metadata the better. if tagging is a mechanism to get metadata from the actual content consumers, that could be interesting for web relevance.
jakob: it’s still work to write a tag. to get lots of user data, need passive data (eyeball time, etc).
ken: people will do it if it’s of value to them. if there’s a task the user is undertaking, like a bookmarking service, that’s where it gets interesting.
jakob: collaborative spam (eg brightmail) is another example of that.
udi: mapr search takes flickr, finds addresses in metadata and puts it on a map. but people tag most images with “me”, which is abbrev for Maine.
5. (someone from howard dean platform building communities) at CHI last week, theme of social context of content:
Outlook giving info about size of email, should have been giving “this email is relevant to an upcoming meeting”. msft said power users were most interested in who the authors were. photos: categorization of photos was most important with the social context. (eg. a photo on a road trip, where the child died next week, has more powerful context) how will search engines address the social context?
ken: users make relevance decisions based on many factors. human element is very powerful. yahoo 360 is a framework for consumers to interact. help you maintain the relationships you already have. share photos, blog postings, with those people. leverage social interactions that are already taking place. provide more ease of use to add up social context.
mark: lots of info already in blogposts, blogrolls, that isn’t being mined well yet. look at more of that stuff. it’s implicit vs explicit data.
peter: people looking for info they trust. we trust our friends. also want to discover new things beyond our circle of friends.
jakob: the most social part is email. but also workgroup support and intranet apps could make this more important too. that will help for finding a restaurant, but help even more for work-related problems. groupware has been underemphasized, need a shift back to collaborative computing
udi: almost all searches are stateless, start from the beginning. add more tools to allow you to get more context. we keep history of searches, so we can tell you on a repeat search which results are new, which ones you clicked on
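Udi’s point about keeping search history can be sketched as follows: on a repeat search, each result is labeled as new, previously shown, or previously clicked. This is only an illustration under assumptions, not A9’s implementation; `annotate_repeat_search` and the three status labels are hypothetical.

```python
def annotate_repeat_search(results, previously_shown, previously_clicked):
    """Label each result URL for a repeated query.

    results: ordered list of result URLs for the current search
    previously_shown: set of URLs returned by earlier runs of this query
    previously_clicked: set of URLs the user clicked in earlier runs
    """
    annotated = []
    for url in results:
        if url in previously_clicked:
            status = "clicked"
        elif url in previously_shown:
            status = "seen"
        else:
            status = "new"
        annotated.append((url, status))
    return annotated


if __name__ == "__main__":
    shown = {"http://example.com/a", "http://example.com/b"}
    clicked = {"http://example.com/a"}
    current = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
    for url, status in annotate_repeat_search(current, shown, clicked):
        print(status, url)
```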
6. paul sass: philip greenspun had email from peter norvig asking specifics of a camera. but peter works for google, why is he emailing me. so: what don’t you use search for?
peter: found a post by a friend of his, so just asked him about it. you don’t use search for info that’s not available. or things that are too complicated. search is good for bringing back a webpage. but not for synthesizing results, aggregating results from multiple pages and summarizing.
ken: I don’t use search to ask my friend about a cool band to see in austin next week, or what did you learn at grad school about presenting papers at conferences.
7. mike galpern, UI design media companies became successful by leveraging the info on the web today. for groupware: we have tools for creating email. is there significant investment in allowing publishing community to expand?
mark: that’s the whole blogging phenomenon… of course it could be made easier. how many essays have you written in a browser?
udi: the web is successful as it allows people to author. 10 years ago anybody could author on the web with html. 10s of millions of people can author. if you expand that to connect them better, it’s great.
peter: lots of tools will make things easier. but I only want to read stuff from people that had to go to some effort to say something. not every random thought.
udi: I’d like to build tools to make it easier for you to do that filtering.
jakob: search engines can suck the ROI out of a website. if you can double your conversion rate, you’ll have to bid twice as much on a search engine.
people don’t search when you have a favorite destination, eg. amazon. tension for how to let the people who do the work to benefit. how to help site owners build an audience directly.
8. mike higging: search engine abuse?
peter: google started as observers. now we’re co-evolving in environment with other players. we set the rules for what it means to get at top of a search result. some people try to create pages to put them up there. we have guidelines (can’t show different results to google bot and to users). have other rules that we’ll discount multiple sites linking to each other. it’s a game we can stay on top of: much rather be in our shoes than trying to combat email spam. for email spam, there’s every mail reader in the world, have to win each of these battles individually. also it’s almost free to send an email, whereas setting up a ring of websites costs >100K. if we can reset them after all that work, they’ll get fed up with it.
ken: all great systems have parasites. your solution is never perfect, and problem is always evolving. users blame us if they get irrelevant results, so it’s essential that we evolve to keep good results. yahoo tries to be open: publish our content policies.
9. deanne harp, independent contractor with IBM
has there been progress in spoken language interface to search engines, and ui to create a dialog? good for low literacy users, accessibility…
peter: had experiment with spoken interface. saw some use. not a lot of demand for it. can you make an actual dialog out of it: that’s an independent question, can do with a spoken or typed interface.
ken: we’re only as accessible as what we link to. so what do we do when you get to the result site? as a search engine, your job is to take the user anywhere on the web, so you’re beholden to the web.
jakob: doesn’t have to be a dialog. remember anti-mac UI? guiding principle was the power of language (vs mac is like cave man language, 1 club). also had visual richness, ability to utilize powerful computer screens. that’s not happening much here, partly because web pages are primitive. eventually will need a combo of language and visual interface. spoken dialog is cumbersome alone. librarian example: needs natural intelligence, and patience and time.
udi: first will be triggered by cellphones. when they have enough processing power to do voice recognition, that will make all the difference. will come pretty soon. won’t have a dialog, but pretty rich interaction.
10. is user success at searching increasing as rapidly as search providers think they are progressing? what do users perceive about their own search success?
jakob: people are getting more search reliant. feel like they can do it, more confident.
peter: cross industry consumer satisfaction index shows search engines did better than car industry etc. don’t know how far back they go.
ken: hard to take any one milestone with search. building a search engine is more like gardening than engineering. every day we make 1% of the queries 10% better. but can’t point at a single day anymore than a single plant that makes the garden glorious. we measure progress qual and quantitative over time. all this says that we’re heading in the right direction. a lot of room ahead of us.
11. bonnie brown, keynote found a branding halo effect. perhaps the battle of search engines is settled in the marketing world. how are these factored into plans?
udi: in a fast-moving area, technology and innovations matter a lot more.
ken: at various points people said tech is done now it’s packaging, but then another innovation blew the previous ones out of the water. there will continue to be breakthrough innovations in search. search is a mature industry with brand recognition.
jakob: that’s the same state from 5 years ago. users change behavior relatively fast on the web, but not instantaneously. a radically better site will grow a little each day. people send recommendations by email, shift gradually but do shift.
mark: switching cost for search is zero. easy for people to switch, market is immature still and wide open for innovation.
peter: we have 12 people in marketing and >1000 in search engineering, so you know where our bias is!
12. jim muller, stanford. collecting metadata from the masses: use the query streams (esp those that result in clickthrough) as tag data?
ken: less concerned with tagging results from a query than I am with results that didn’t come up as we didn’t know they were relevant. tagging is interesting to bring forward the results we left on the floor.
peter: that’s what jakob was getting at with implicit data. less organized than a tagging system. on flickr people add to other people’s tags. difference between individual user actions and community of interest.
udi: also runs search for amazon. when people search, we know the end result (buy or not), so feedback is better than most. we can tell how happy they were and of course we use it.
jakob: direct hit looked at what people were clicking, but that only rewarded people who were good at writing page titles that were interesting.
13. Matt J, developer on EBAY search team. computers don’t think like a human, humans are good at pattern recognition. how to get computers to think more like people? where is software going for search intelligence?
peter: computers can look across a larger data set to find the patterns. people are good at discovering semantic patterns, but not structural patterns, link-based patterns. there is plenty left to mine without getting to entire level of understanding.
ken: there’s a human on both ends of the search. that creates advantages. eg. misspellings occur in queries and in content, so this can compensate. being in the middle of humans gives some economies of scale.
mark: skeptical of AI in general. but so much we can do in search with normal algorithms. incorporating implicit metadata will help.
udi: hard to measure how successful we are in search. unlike a network or file system. that’s because results are whether people are happy. we don’t have a good model of that and it’s inherently hard. we have to simulate, approximate it.
jakob: it’s not going to happen. we worked on it for 50 years already without much progress, so it will be another 50 years before we get ai. it’s the wrong problem anyway, as we already have humans. the goal is to make computer system work with the ways people want to work. more of a smooth tool. e.g. you don’t want a hammer that is like your finger.
14. fred jacobsen: saving queries so they can be recognized later. is adapting to individual user a way to improve search?
udi: have to be careful. leery of black boxes that understand you. give you the tools to interact. people aren’t one dimensional: just because someone is a doctor doesn’t mean all their searches are medical. personalizing is very tricky. start by giving you more ways to tell us what you want rather than inferring what you want.
ken: agrees 100%. My Yahoo is good example of personalization on the web. people asked for palo alto weather. give them explicit inputs and ability to change it whenever they want.
15. josh ullman, linkedin. entity search: people, jobs, products, separately from starting with command search interface. how to group and direct people? many vertical search folks say they are better than general search.
peter: several vertical areas we’re all looking at: local, yellow pages, shopping search. most of what we see is in the long tail. people have such varied interests that specialized searches are important, but most is not served by these very specific vertical sites, even if they’re popular. we’ll do both, but put emphasis on the long tail.
ken: categorizing queries as shopping, local, but not saying what is the intent might be wrong. eg. sony laptop might be a query for the manual, not to purchase. so how to figure out the intent and give the user ability to refine it. that’s why skeptical about doing it standalone on a vertical. so many people start a web search and refine it.
mark: people expect a text box to read their mind. don’t want to have to go to 20 different engines to find what they’re looking for. (udi: that’s exactly what I was thinking …)
16. thinking outside the textbox: faceted navigation or parametric refinement. will this be standardized, how useful is it, etc? (most of that action is in corporate centered search providers).
ken: yahoo found it really useful in local and product search. tested and learned a bunch of lessons around how you do that. at some point, want to apply it to the web search. but harder to know the categories about refinement options. eg. yahoo product search: have 50M products, 5M shoes sorted by what we think is most relevant. faceted nav allows you to have a dialog. check out local and product search to see how we apply.
jakob: would be most powerful if it were more standardized. winnowing tools definitely do help people in various environments. but each time you go to another place it works differently. controls, results, etc are different. makes it harder for users, more complicated if even used it. for intranets: doesn’t have to be standardized, as company can decide how we’re doing it. but for web as a whole, it’s in everybody’s interest to do it but don’t have a real good web consortium for user experience.
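For readers unfamiliar with faceted navigation (parametric refinement), here is a minimal sketch of the idea discussed in question 16: show counts per facet value for the current result set, then let the user narrow the set by picking values. It assumes items are plain dictionaries; the function names `facet_counts` and `refine` and the sample shoe data are hypothetical, not any panelist’s product.

```python
from collections import Counter


def facet_counts(items, facet):
    """Count how many items carry each value of the given facet."""
    return Counter(item[facet] for item in items if facet in item)


def refine(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(name) == value for name, value in selected.items())
    ]


if __name__ == "__main__":
    shoes = [
        {"brand": "acme", "color": "red", "size": 9},
        {"brand": "acme", "color": "blue", "size": 10},
        {"brand": "zeta", "color": "red", "size": 9},
    ]
    print(facet_counts(shoes, "color"))   # counts to show next to each option
    narrowed = refine(shoes, color="red")
    print(facet_counts(narrowed, "brand"))  # counts update after each choice
```

Recomputing the counts after each refinement is what gives the back-and-forth “dialog” feel Ken describes, since every choice shows what options remain.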
17. adam piagenete, allegis corp (startup, govt search tools) what about helping users search when they have to search a restricted universe of websites (eg. government-related sites)?
ken: product search is an example of such a search. (bp: see also firstgov.com)
18. Luke R, webapp UI designer: what about visualizing search results?
peter: text result list gives high information density. 2d or 3d visualization can waste pixels. If we could cluster really well, the visualization might (perhaps) be a better option, but we can’t cluster well enough yet.
ken: lists are fastest. visualization always makes you wait.
mark: clustering blog search is a hard problem.
udi: visualization might be useful for users with specialized needs (eg by vertical, by user group) who are willing to invest to get results. More generally, he worries that we aren’t trying out more approaches and are converging too early on result interfaces that might not be the best. It’s too late for the metric system in the US. Will our grandchildren suffer because we didn’t experiment more with search interfaces today?
jakob: agrees that for specialized cases, we might have more visualization. another reason lists are good is that it is clear what is highest ranked. early on we numbered and gave % relevance figures to each result, but people just start from the top anyway so this data doesn’t matter anymore.
19. Why not provide back links in search results, as that would be useful?
peter: advanced search page has an operator to show back links (same for the other engines). Too much information on every search result overwhelms users. So we don’t show features that are only relevant to a small number of users. “Similar search results” feature doesn’t get much usage, and might move off the main results page too for that reason.
Ken: similar comment about users overwhelmed by complexity.
Posted by barney at April 12, 2005 11:46 pm
This entry was posted in Search, Web/Tech
Comments
posted by Kieran Lal at April 14th, 2005 11:35 am
Hi, I asked the question about the social context of content. The project is CivicSpace.
Cheers,
Kieran