04.25.08
Keyword Search Death Knell Sounds
Is Keyword Search About To Hit Its Breaking Point? Yes. I made a post a few days earlier about how the problem with keyword search is that people aren’t all of the same background.
The example I use is: what happens if Oprah Winfrey gets pushed in front of the L in Chicago?
What would you search for? Oprah Pushed By Fan? Winfrey killed by Train? Oprah murdered on the L? They are all valid searches that should return the results, but they won’t all work if you type them into Dmoz, Mahalo.com, Wikia, Wikipedia, or Ask.com, and certainly not in the first hour of the event.
A search engine needs to know "what" a thing is. And it needs to adapt to what you are trying to search for, or at least give you the tools to find it.
I think the intelligent web is closer than the 2020 that is quoted in the TechCrunch Article. The processing power required isn’t as insane as many believe because 80% of searches are for 20% of topics. But it is not the searching that is hard… It is the finding. How will we know the "Right" answer to a question?
Is it true that Brandon Wirtz is the Greatest Living American? Or is Stephen Colbert? Which truth has more truthiness?
When a story breaks, is the New York Times a better result, or the Chicago Tribune?
Once you get over the technological hurdle of dissecting a query to work out what the user wants, you need to get over the hurdle of determining which result has the most authority on the subject.
Imagine the following events happen on the same day, each one prompting its own queries.
In the morning: a pre-recorded showing of Oprah’s Favorite Things airs, and she gives everyone in the audience an iPhone 2.0.
Mid-day: a sex tape of Stedman and Lindsay Lohan is released with photos on TMZ.
Afternoon: Oprah remarks to her new assistant that she wishes someone would just push her in front of the L, and the assistant does just that.
So the queries come in…
Oprah iPhone
Oprah boyfriend sex tape
Oprah train
The Oprah iPhone query should get a result from Wired about the iPhone 2.0, or the Apple iPhone page, and the Oprah website about today’s show, but it shouldn’t return the 3-month-old result about the Oprah website now being iPhone "Safari compatible", which was the top hit the day before.
While Wired is the place for all things "Tech", TMZ should have authority for Sex Tape… But an interview in the Chicago Tribune says that Oprah’s assistant helped with her suicide because of the Stedman sex tape.
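To make the authority-plus-freshness idea concrete, here is a minimal Python sketch. It is only an illustration, not how any real search engine ranks results: the AUTHORITY table, its weights, the 24-hour half-life, and the function names are all made up for the example.

```python
# Hypothetical per-topic authority weights between 0 and 1 (invented numbers).
AUTHORITY = {
    "tech": {"Wired": 0.9, "TMZ": 0.2},
    "celebrity": {"Wired": 0.2, "TMZ": 0.9},
}

def score(source, topic, age_hours, half_life_hours=24.0):
    """Combine topic authority with an exponential freshness decay."""
    # Sources we know nothing about get a small default weight (assumption).
    authority = AUTHORITY.get(topic, {}).get(source, 0.1)
    # The score halves every half_life_hours as the result gets older.
    freshness = 0.5 ** (age_hours / half_life_hours)
    return authority * freshness

def rank(results, topic):
    """Sort result dicts (with 'source' and 'age_hours') best-first."""
    return sorted(
        results,
        key=lambda r: score(r["source"], topic, r["age_hours"]),
        reverse=True,
    )

results = [
    {"source": "Oprah.com", "age_hours": 24 * 90, "title": "Site now Safari compatible"},
    {"source": "Wired", "age_hours": 3, "title": "iPhone 2.0"},
]
best = rank(results, "tech")[0]
```

With these invented weights the three-hour-old Wired story outranks the three-month-old compatibility page, which is the behavior the Oprah iPhone example asks for.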
And the search for Oprah Train? Well, the Tribune didn’t SEO their article because it was written for print, so it refers to the "L" and the Chicago Transit Authority but never uses the word train. The right result would be the aforementioned Tribune article, but because the NYT headlined "Oprah killed in Train Accident", search results would normally go to that article instead. An intelligent search would know that the "L" is a train and would respond accordingly.
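A rough Python sketch of what "knowing that the L is a train" could look like: every surface form maps to a shared concept, and a query matches a document when they share a concept even if they share no keyword. The CONCEPTS table, the "rail_transit" label, and the function names are assumptions for illustration only.

```python
# Hypothetical concept table: each surface form maps to a shared concept label.
CONCEPTS = {
    "train": "rail_transit",
    "l": "rail_transit",
    "chicago transit authority": "rail_transit",
    "subway": "rail_transit",
}

def normalize(text):
    """Lowercase, turn punctuation into spaces, and pad with spaces."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return " " + " ".join(cleaned.split()) + " "

def concepts_in(text):
    """Return the concept labels whose surface forms appear as whole phrases."""
    padded = normalize(text)
    return {label for phrase, label in CONCEPTS.items() if f" {phrase} " in padded}

def concept_match(query, document):
    """True when the query and the document mention at least one shared concept."""
    return bool(concepts_in(query) & concepts_in(document))

tribune = 'Oprah Winfrey was pushed in front of the "L", the Chicago Transit Authority said.'
found = concept_match("Oprah train", tribune)
```

Here the made-up Tribune sentence never contains the word train, yet the query still matches it through the shared concept. A real system would also need to deal with ambiguity, since a bare "l" is a very weak signal on its own.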
Where do I come in? HUGE lists. If you can build a classification of everything into lots of categories, you can start to build the relationships that we humans take for granted. I don’t have to write complex rules; I can use lots of simple rules to determine that things are related. A spike in Oprah traffic means that something newsworthy happened there.
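The spike rule is a good example of how simple such a rule can be. A minimal Python sketch, assuming hourly query counts per topic are available; the function name and the factor and min_count thresholds are invented for the example and would need tuning.

```python
from statistics import mean

def is_spike(hourly_counts, factor=3.0, min_count=50):
    """Flag the latest hour when it is far above the trailing average.

    hourly_counts: query counts for one topic, oldest first, latest last.
    factor and min_count are assumed thresholds, not measured values.
    """
    if len(hourly_counts) < 2:
        return False  # no history to compare against
    *history, latest = hourly_counts
    baseline = mean(history)
    # Require a minimum volume so tiny topics do not trigger on noise.
    return latest >= min_count and latest > factor * baseline

oprah_traffic = [100, 120, 90, 110, 900]
newsworthy = is_spike(oprah_traffic)
```

The rule says nothing about what happened, only that something did, which is the cue to trust fresh results over yesterday's top hit.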
Some of the rules I’m building are obvious; some are only obvious after you point them out. Like knowing the TV schedule, so that you know that a search for Grey’s Anatomy Contest is not looking for an illustration in a book, but comes from people looking for information about how Meredith won the Glitter Pager.
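The TV schedule rule can be sketched the same way in Python. The TV_SCHEDULE table, the air date in it, the two-day window, and the function name are all hypothetical placeholders; a real version would load an actual listings feed.

```python
from datetime import date, timedelta

# Hypothetical schedule: show title -> dates an episode aired (placeholder data).
TV_SCHEDULE = {"grey's anatomy": [date(2008, 4, 24)]}

def likely_about_tv_show(query, today, window_days=2):
    """True when the query names a show that aired within the last few days."""
    # Normalize the curly apostrophe so "Grey’s" and "Grey's" compare equal.
    q = query.lower().replace("\u2019", "'")
    window = timedelta(days=window_days)
    for title, air_dates in TV_SCHEDULE.items():
        recently_aired = any(
            timedelta(0) <= today - aired <= window for aired in air_dates
        )
        if title in q and recently_aired:
            return True
    return False

about_show = likely_about_tv_show("Grey\u2019s Anatomy Contest", date(2008, 4, 25))
```

The same query weeks after the episode would fall outside the window, and the book illustration reading becomes plausible again.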
Edit:
Hacking Cough is right about you having to know what needle you are looking for.
Words are there for themselves
Also, thanks Jordan for pointing out the misspelling of Knell… This is what happens when you blog while coding.
Jordan said,
April 25, 2008 at 2:31 pm
A knoll is a hill. A knell is the sound a bell makes.