I like what I have seen of Powerset, but… and this is a big but… they have spent a lot of time learning how to search two sites: two sites that are well edited and are supposed to be reasonably encyclopedic.
That makes it cool, but a long way from done.
Consider the following passages, which mean exactly the same thing but are written in the style of two very different bloggers.
Powerset leverages the power of Natural Language Search to discern what you are searching for. This allows it to determine the difference between positive and negative positions, so if you are searching for "Plants that are not Vegetables" Powerset will return articles about Fruits and Legumes.
The above is "Encyclopedic", but just as useful, and more typical of web pages, is the following…
The dudes at Powerset have a Search Engine that pulls results using normal English. So rather than answers that are bogus because they have all of the words "Plants that are not Vegetables" it knows you are looking for fruits, or legumes.
The first is a LOT easier for a machine to parse. And any Yoda-style post that would be fine for a human is going to wreak havoc with a "natural language" search.
Powerset searches with the Language of Nature they do. Results from the meaning of your words they find. Seeking "Plants that are not Vegetables" yields Fruits and Legumes, not mis-placed pages with those words upon them. MMMM
Teaching a computer to understand context in sentences with regular word order is not particularly difficult. Working with a sample that is one millionth the size of the Internet is not easy, but it is a small feat compared to what is needed to index everything on the planet.
And to be entirely honest, searching Wikipedia is easy because pretty much everything in it is a noun. What do I mean by this?
Wikipedia is no help if you are looking for "Setting up Exchange Server". You don’t need natural language to parse this question, but finding the answer is hard, because in the real world you will encounter all sorts of things that look like they are the answer. "I need help setting up an Exchange server" is going to appear, and there will be very technical-looking things surrounding the statement, but it won’t be the answer to the question.
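To make the Exchange problem concrete, here is a minimal sketch of naive keyword matching. The function name `keyword_score` and both sample pages are made up for illustration; this is not how Powerset or any real engine scores pages, just the simplest thing that shows the trap.

```python
import re

def keyword_score(query, page):
    """Fraction of query words that appear anywhere in the page (order-blind)."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    page_words = set(re.findall(r"[a-z]+", page.lower()))
    return len(query_words & page_words) / len(query_words)

query = "Setting up Exchange Server"

# Two invented pages: a plea for help and an actual guide.
plea = "I need help setting up an Exchange server"
guide = "Setting up Exchange Server: install the prerequisites, then run setup."

plea_score = keyword_score(query, plea)
guide_score = keyword_score(query, guide)
print(plea_score, guide_score)
```

Both pages contain every word in the query, so a word-matching scorer cannot tell the person asking from the person answering.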
Conversely, finding the answer to "who was the first president of the USA?" can be broken down.
Who = Person search
Was = In the past not the present
The = the question is singular
First = question implies there was more than one
President = a noun, likely what we are looking for
Of = modifier of President, so a subset of the answers
the USA = Specific Modifier
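The breakdown above can be sketched as a simple lookup. This is only my guess at the idea, not Powerset's method: the `ROLES` table and the `annotate` function are hypothetical, and a real parser would need grammar, not a dictionary.

```python
# Hypothetical role table copied from the breakdown above.
ROLES = {
    "who": "person search",
    "was": "in the past, not the present",
    "the": "the question is singular",
    "first": "implies there was more than one",
    "president": "noun, likely what we are looking for",
    "of": "modifier of the preceding noun",
}

def annotate(question):
    """Label each word with a role, treating 'the USA' as one phrase."""
    words = question.rstrip("?").split()
    labels = []
    i = 0
    while i < len(words):
        word = words[i]
        # "the USA" is handled as a single specific modifier, as in the list.
        if word.lower() == "the" and i + 1 < len(words) and words[i + 1] == "USA":
            labels.append(("the USA", "specific modifier"))
            i += 2
            continue
        labels.append((word, ROLES.get(word.lower(), "unknown")))
        i += 1
    return labels

labels = annotate("who was the first president of the USA?")
for word, role in labels:
    print(f"{word} = {role}")
```

Any word missing from the table comes back as "unknown", which is exactly where the less tidy phrasings of the same question would start to fall apart.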
Run a search; if no results are found, you run parts through a synonym engine.
the USA = United States
Poof, you now have an easy task: search for "a person" with "first president" and "United States", hopefully in the same sentence.
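Here is a rough sketch of that synonym step followed by a same-sentence search. The `SYNONYMS` table, the `find_sentences` function, and the three-sentence corpus are all invented for illustration, and the "a person" check is left out because it would need real entity recognition.

```python
import re

# Hypothetical synonym table; a real engine would need a far larger one.
SYNONYMS = {"the USA": ["the USA", "United States"]}

def find_sentences(text, required, expandable):
    """Return sentences with every required phrase plus any synonym of `expandable`."""
    variants = SYNONYMS.get(expandable, [expandable])
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_required = all(phrase.lower() in lowered for phrase in required)
        has_variant = any(variant.lower() in lowered for variant in variants)
        if has_required and has_variant:
            hits.append(sentence)
    return hits

# Toy corpus, made up for this example.
corpus = (
    "George Washington was the first president of the United States. "
    "I need help with my homework about the USA. "
    "The first president of the club resigned."
)

hits = find_sentences(corpus, ["first president"], "the USA")
print(hits)
```

Only the first sentence survives: the second has the country but not the phrase, and the third has the phrase but not the country.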
I haven’t gotten to play with the tool, but would it find the answer to the equally "natural language" questions "Who was Voted the First American President?" or "Who First Filled the Role of US President?"
Don’t get me wrong, I think Powerset has a future. But I think in the near term that future is in answering questions about "small" sets of data, not the web in general. eHow would benefit, as would Microsoft Encarta and Project Gutenberg. As more and more of these data sources are indexed, Powerset can get better and can be ready to deal with news sites. From there it might be able to break into blogging, but it will take a very long time to make it work for the Internet in general.
Or perhaps making people write in a style that is easy for computers to parse will be a good thing for SEO in the future.
This is a response to:
Michael Arrington: Powerset’s Dilemma: Go For It, Or Sell