04.29.08
Posted in Technology News at 12:19 pm by Brandon Wirtz
Several people canceled appointments with me today. While it is possible they all came down with rare 24 hour bugs, that struck at 10am… I think it more likely that these people scored a copy of GTA4 at the last minute and are racing home to earn some achievements.
A quick visit to the Xbox Dashboard confirms this for the ones on my friends list… and 360voice.com provides incriminating evidence for the others… If you are going to use the same screen name on Yahoo and Xbox, you might think to go offline when you are "Sick".
If you can’t beat them, join them. Since it is hard to do real work with everyone "Out of Office", I’m off to conquer Liberty City. Grand Theft Auto 4: stealing productivity across the nation.
Permalink
04.25.08
Posted in Search Engines at 1:01 pm by Brandon Wirtz
Is Keyword Search About To Hit Its Breaking Point? Yes. A few days ago I made a post about the problem with keyword search: people aren’t all of the same background, so they don’t all describe the same thing with the same words.
The example I use is: what happens if Oprah Winfrey gets pushed in front of the L in Chicago?
What would you search for? Oprah Pushed By Fan? Winfrey killed by Train? Oprah murdered on the L? They are all valid searches that should return the results, but they won’t all work if you type them into Dmoz, Mahalo.com, Wikia, Wikipedia, or Ask.com, and certainly not in the first hour of the event.
A Search engine needs to know "what" a thing is. And it needs to adapt to what you are trying to search or at least give you the tools to find what you are searching for.
I think the intelligent web is closer than the 2020 that is quoted in the TechCrunch Article. The processing power required isn’t as insane as many believe because 80% of searches are for 20% of topics. But it is not the searching that is hard… It is the finding. How will we know the "Right" answer to a question?
Is it true that Brandon Wirtz is the Greatest Living American? Or is Stephen Colbert? Which truth has more truthiness?
When a story breaks, is the New York Times a better result, or the Chicago Tribune?
Once you get over the technological hurdle of working out from a query what the user wants, you need to get over the hurdle of determining which result has the most authority on the subject.
Imagine the following queries happen on the same day and the reasons for them being asked.
In the morning: a pre-recorded showing of Oprah’s Favorite things airs, and she gives everyone in the Audience iPhone 2.0.
Mid-day: a sex tape of Stedman and Lindsay Lohan is released with photos on TMZ.
Afternoon: Oprah remarks to her new assistant that she wishes someone would just push her in front of the L, and the assistant does just that.
So the queries come in…
Oprah iPhone
Oprah boyfriend sex tape
Oprah train
The Oprah iPhone query should get a result from Wired about the iPhone 2.0, or the Apple iPhone Page, and the Oprah website about today’s show, but shouldn’t return the 3 month old result about the Oprah Website now being iPhone "Safari compatible" which was the top hit the day before.
While Wired is the place for all things "Tech" TMZ should have authority for Sex Tape… But an interview in the Chicago Tribune says that Oprah’s assistant helped with her suicide because of the Stedman Sex tape.
And the search for Oprah Train? Well, the Tribune didn’t SEO their article because it was written for print, so it refers to the "L" and the Chicago Transit Authority, but never uses the word train… So the right result would be the aforementioned Tribune article. Because NYT headlined "Oprah killed in Train Accident", search results would normally go to that article instead, but an intelligent search would know that the "L" is a train and would respond accordingly.
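Here is a rough sketch of what "knowing that the L is a train" could look like, assuming a small hand-built alias table. The `ALIASES` entries, the function names, and the sample headlines are all made up for illustration; this is not the code behind any real engine.

```python
import re

# Hypothetical alias table: each surface word maps to the broader
# concepts it implies. A real table would be far larger.
ALIASES = {
    "l": {"train", "chicago transit authority"},
    "el": {"train", "chicago transit authority"},
}

def concepts(text, expand=True):
    """Return the words in text, plus any concepts they imply."""
    found = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        found.add(word)
        if expand:
            found |= ALIASES.get(word, set())
    return found

def score(query, document, expand=True):
    """Count the query concepts that also appear in the document."""
    return len(concepts(query, expand) & concepts(document, expand))

# Invented headlines standing in for the two articles.
tribune = 'Oprah pushed in front of the "L", Chicago Transit Authority says'
nyt = "Oprah killed in Train Accident"
```

With plain keyword matching (`expand=False`) the Tribune headline only matches "oprah" for the query "Oprah train". With the alias table it matches "train" as well and ties the NYT headline, at which point an authority rule could break the tie.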
Where do I come in? HUGE lists. If you can build a classification of everything into lots of categories you can start to build the relationships that we humans take for granted. I don’t have to write complex rules, I can use lots of simple rules to determine that things are related. A spike in Oprah traffic means that something newsworthy happened there.
Some of the rules I’m building are obvious, some of the rules are only obvious after you point them out. Like knowing the TV schedule, so that you know that people searching for Grey’s Anatomy Contest are not looking for the illustration in a book, but for information about how Meredith won the Glitter Pager.
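One of those simple rules, the traffic spike, can be sketched in a few lines. This is only an illustration: the threshold `factor`, the `min_history` cutoff, and the hourly counts are assumptions, not numbers from the site.

```python
from statistics import mean

def is_spike(hourly_counts, factor=3.0, min_history=3):
    """Flag the latest hour if it is more than `factor` times the
    average of the hours before it. Thresholds are illustrative."""
    if len(hourly_counts) <= min_history:
        return False  # not enough history to call anything a spike
    *history, latest = hourly_counts
    baseline = max(mean(history), 1.0)  # avoid a zero baseline
    return latest > factor * baseline

# Invented search counts for "Oprah", one number per hour.
quiet_day = [100, 120, 90, 110, 130]
news_day = [100, 120, 90, 110, 900]
```

Here `is_spike(news_day)` is true and `is_spike(quiet_day)` is false, so the rule only says "go look for news" on the second day.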
Edit:
Hacking Cough is right about you having to know what needle you are looking for.
Words are there for themselves
Also thanks Jordan for pointing out the misspelling of Knell… This is what happens when you blog while Coding.
Permalink
Posted in Site News at 11:53 am by Brandon Wirtz
I like Lists, Archives, Collections… It is great when things sort nicely into piles. www.iSayHello.com just had data from the CIA Fact Book added. If you search for an abbreviation which is used in the CIA Fact Book, we now return what that abbreviation stands for. Which can be interesting, because I don’t think of AU as African Union; I think Astronomical Unit, or the periodic table symbol for Gold (the periodic table will be added shortly as well).
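A minimal sketch of an abbreviation lookup that keeps every meaning instead of picking one, assuming a simple in-memory index. The class name, the source labels, and the sample entries are hypothetical.

```python
from collections import defaultdict

class AbbreviationIndex:
    """Maps an abbreviation to every meaning loaded for it."""

    def __init__(self):
        self._meanings = defaultdict(list)

    def add(self, abbreviation, meaning, source):
        # Store under an upper-cased key so "au", "Au" and "AU" collide.
        self._meanings[abbreviation.upper()].append((meaning, source))

    def lookup(self, abbreviation):
        """Return (meaning, source) pairs in the order they were added."""
        return list(self._meanings.get(abbreviation.upper(), []))

index = AbbreviationIndex()
index.add("AU", "African Union", "CIA Fact Book")
index.add("AU", "Astronomical Unit", "units list")    # hypothetical source
index.add("Au", "Gold", "periodic table")             # hypothetical source
```

`index.lookup("au")` returns all three meanings, which leaves it to the results page to decide which one to show first.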
I also added 12k prescription medicines and their generics, along with a link to more information on the drug. Here is one Example…
This adds nearly 30k entries to the search results.
Today or early tomorrow, 15k game cheats are being added. Mostly PC first, then Console games in the coming days, 35k in all across 16 platforms.
Recipes are in the works, and I’m working to find a provider of Videos to go with the recipes, but at least very nice pictures.
What is the point you ask? To create pages that present you with the information you are most likely looking for and more choices if it wasn’t.
I also updated the site’s logo some. Not great, but I only spent 20 minutes on it.
Lastly, I fixed the bug where a result with no video or images would still try to return blank results.
Permalink
Posted in Money, Responses at 11:13 am by Brandon Wirtz
37signals has an article Why I love working with family people, which talks about why you want more family people in your startup. I don’t entirely disagree, just mostly.
Family people have a family… The right team becomes family. I’m not saying you shouldn’t hire the married guy with 6 kids who goes to church every Sunday, but you need balance in a group. You need people who can work on Thanksgiving, and others who can lead, advise, and lend a paternal role to your group.
I am a 20 something and I can put 100 hours in during a week, hit a deadline, then take a day to recover and come back and not miss a beat. You can’t do that to a "family person". I have no problem with working with family people and I appreciate having them on a team, but don’t compensate them the same way. I really like hourly jobs, or performance pay, because when I work a 100 hour week I get compensated. If I work 100 hours and someone else takes time to pick up their kids from soccer, goes to Church and takes 3 days to go see Grandma at Thanksgiving, don’t pay them the same.
David says he can get things done when the objectives are clear and the work has meaning. Well David, I think it is more important to be able to define objectives and find meaning in the work. Often there are things that need to get done which don’t have clear objectives, and are menial. But they still need to get done.
David is writing a Bitch-meme, likely because some whipper-snapper like me beat him out for something recently. David, when the company becomes your family, the company succeeds. When your Team becomes your Family, the Team succeeds.
Managers that understand building a balanced team, build teams that not only succeed but grow, and stay together even as the members go to other jobs.
Having team members who work twice as hard and accomplish twice as much, but only make the same amount or a little less because they had fewer years of experience when they signed on, only serves to create a chasm on your team.
Permalink
04.24.08
Posted in Yahoo at 2:36 pm by Brandon Wirtz
Yahoo Pipes announced today that they are offering PHP Serialized Output. This means rather than parsing an RSS feed to get results you can get the results back as an array. This makes programming with Pipes faster, and easier.
I’m still uncertain how Yahoo is going to monetize this, but it is great for anyone who wants to display results a very specific way.
What I think would be even cooler is a Pipes to PHP solution that lets you use Pipes as a GUI for creating PHP that has the same functionality… I know I build things in Pipes and then, after I get them tweaked, spend a good deal of time recreating the same solution on my own server.
Permalink
04.22.08
Posted in Site News, Technology News at 11:45 pm by Brandon Wirtz
I’m obviously biased, so when I win don’t be surprised…but I’m going to try and skew the results in favor of the other guys because well, if my product sucks I want to fix it.
The Test:
I’m going to search for Jason Calacanis on each of the services. I haven’t hand modified my results for Jason Calacanis on my engine so it should be a reasonable test.
Speed:
Wikia: And I thought the 800ms ISayHello.com’s search results took was too long. You can go get a Starbucks in the time it takes for Wikia to return results. I clocked 14 seconds for most results; a few came in at 8, some at 20, but 14 seemed to be both the Mean and the Mode.
Mahalo: Blazing fast. It ain’t Google fast, but it is fast.
ISayHello.com: It feels slow especially compared to Google, but on par with Live.com maybe a tiny bit slower than Yahoo. I think I take a hit from Youtube and Metacafe loading.
Stumpedia: Reasonably fast, doesn’t take long to load 3 results.
Quality of Results:
Wikia: Not too bad, but not great. Calacanis.com, Searchenginewatch, ReadWriteTalk, Mahalo, Twitter, WebProNews, TechCrunch, WebanalyticsBook, WebProNews, Beet.tv… No photo, even though that is supported.
Mahalo.com: Calacanis.com, Twitter, Wikipedia, Weblogs Inc, Forbes, Jensense, TechCrunch. A picture gallery of Jason from Flickr, links to a Google Video and Jason’s Ustream.
ISayHello.com: Calacanis.com, Wikipedia, TechCrunch, Twitter, Valleywag. Youtube of Jason Calacanis at his home, MetaCafe of someone talking about Jason’s Keynote at Affiliate Summit, 3 good Flickr Photos of Jason, and one odd one.
Stumpedia: Tinpig, Calacanis.com, Wikipedia. No pics, no video, nothing of note.
Summing the Jason Search Up:
None of these results sucked except for Stumpedia. Mahalo Kicked ass with its hand edited results for its CEO. And it should. Wikia was slow but didn’t suck, Stumpedia just plain fails. 3 Results? I mean Jason is not Britney Spears but he isn’t a nobody, and you would expect that anyone in the search engine business should have something for him. (Matt Cutts only got 2 results)
But let’s say you dislike Jason. I don’t dislike Jason, I think he is a smart guy, and he left a comment on my blog (or someone using his name did), so hey, I’d buy him a drink, or a dinner. But let’s say you didn’t like him. His start up crushed yours or something like that…
So you search for Jason Calacanis Sucks, because let’s face it, sometimes you aren’t looking for "Happy" search results.
Stumpedia: Zero Results
Mahalo: You get Google results, and they all say Calacanis and Sucks, but none of them are about how Jason Sucks.
iSayHello: Digg-Why it Sucks to be Friends with Jason Calacanis. That counts as a "hit"
Wikia: Jason Calacanis Sucks by MorningCoffeeNotes.com that Counts as a "Hit"
So what does that say….
Wikia doesn’t suck. It is slow but there is promise, I don’t think it offers any improvements over Google, but maybe someday.
Stumpedia, yeah it sucks. Too few entries: 4,500 or so.
Mahalo, Good but only if you search the way it expects. You can’t get hits on a lot of phrases. I actually got better results doing a site:mahalo.com search in Google than I did using their search bar.
ISayHello: You knew I was going to declare victory, it’s my site. Mahalo is faster. And if you are the CEO you definitely get better results than my Generic Template can give. But I also had results for Jason and his level of Suck which I count as a victory because you aren’t always looking for "encyclopedic" results.
This is a Response to:
Venturebeat - Search Wikia takes a step closer to the promise of ’search meets Wikipedia’
Mathew Ingram - Wikia Search: Edit anything and everything
CenterNetworks - Wikia Search Launches Major Enhancements to Search Alpha
Permalink
Posted in Money, Responses, Technology News at 3:15 pm by Brandon Wirtz
The problem with entirely human powered results is the amount of time it takes to build a library of results. At www.ISayHello.com we are focusing on finding the best places for categories of results, and then working to categorize every search term so that you get good results. We are also creating content for top results. This allows us to be relevant for everything, and great on the most popular results.
Granted, with only 48 hours of being live we don’t have a huge assortment of customized results, but we are able to move much faster than most. We aren’t focusing on re-writing 300 words from Wikipedia for every result; we are instead focusing on finding the best results on the web for large categories of data. So tomorrow, when we add 24k results for prescription medications, those 24k results will be much better than they were today. And unlike Mahalo or Stumpedia, the improvements to those 24k results didn’t cost us even a dollar an entry.
The model is in a lot of ways like Google’s where you "tune" the algorithm, except that we will also be tuning the layout of results, what items are on the page, and how we present data.
It is our goal that you could use ISayHello.com as your primary search engine. You can’t do that with ChaCha or Stumpedia, or Wikipedia.
This is a response to:
TechCrunch: Miss Tormenting ChaCha Operators? Let Me Introduce You To Stumpedia
Edit: Stumpedia has 3,981 links, 803 members, and over 4,500 search terms… We launched with 108k tuned results, 1 million links, and I have no idea how many search terms… and I expect to add 10k a day.
Permalink
04.21.08
Posted in Site News, Technology News at 10:35 am by Brandon Wirtz
www.isayhello.com went live last night. The front page is ugly, I don’t have as many human edited pages as I would like, but it is day one, and this is the first step. In the coming days you will start to see better and better results pages, and more "Category" templates.
Not all results will be hand edited, but more and more topics will get classified into a category which will give results tuned to that category type.
What is ISayHello.com?
It is a hybrid of traditional computer generated results, human classifications of results, and human generated content. Keep in mind that ISayHello.com was built in about 150 hours by one guy. (Brandon Wirtz).
What isn’t ISayHello.com?
A replacement for Google, Yahoo, or Live.com Search results. Because we don’t index specific pages you will never be able to find a specific phrase on a site, but you will find specific phrases in titles, topics, and categories.
A Mahalo.com clone. The biggest difference between our results and Mahalo.com results is that we have results for pretty much everything on the planet. Mahalo.com only has results for topics that people have written results for. Much like Mahalo, iSayHello.com creates relevant links to sites that we think would best help you find answers to your searches, but our results change dynamically based on trends for the topic. For example, if your favorite actor gets hit by a bus, we will switch from being 100% links to that actor’s works and biographies to a blend of links about that actor’s works and news results about the recent incident.
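The actor example can be sketched as a simple blend. This is an illustration only: `news_share` (the fraction of the page handed to news) and the sample links are assumptions, and how the share gets picked from trend data is left out.

```python
def blend_results(evergreen, news, news_share, limit=10):
    """Mix evergreen links with news links.

    news_share is the 0..1 fraction of the page given to news;
    0.0 means the page is 100% evergreen links.
    """
    if not 0.0 <= news_share <= 1.0:
        raise ValueError("news_share must be between 0 and 1")
    news_slots = min(len(news), round(limit * news_share))
    evergreen_slots = min(len(evergreen), limit - news_slots)
    # News goes first so the recent incident leads the page.
    return news[:news_slots] + evergreen[:evergreen_slots]

# Invented links for a hypothetical actor.
evergreen = ["biography", "filmography", "fan site", "interviews"]
news = ["bus accident report", "hospital update"]
```

On a normal day `blend_results(evergreen, news, 0.0, limit=4)` is just the four evergreen links; with `news_share=0.5` the same call returns the two news links followed by the biography and filmography.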
What will ISayHello.com become?
ISayHello is working hard to add features almost hourly. Things like Game Walkthrough / Cheat Databases, and Biographies of everyone on the planet. Some of these will just be willed into existence and improve all of the results they are tied to, and others will be a process. (We can’t have a biography of everyone until you submit yours, now can we?)
ISayHello is also working on relationships with other "engines" so that we can bring you results for the best price on products related to your searches when appropriate.
How does ISayHello.com work?
ISayHello.com works like many search engines, but is a "Just in Time" search engine. We Scour a huge number of places based on the type of search we think your topic is, and return results on that topic. We cache those results for a time, but if you come back in a few days you may get different, likely better results.
Our results take a little longer to "come back" than other search engines because of this, but we think we get better results because of this extra time. If you are doing a search that someone else has done recently you will be served a cached result and should get a fast response. As this is the beta very few speed optimizations have been made, so expect that results will get faster over time.
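The fetch-then-cache behavior described above can be sketched like this, assuming a single in-memory dictionary and a fixed time-to-live. The class name, the `fetch` callback, and the TTL value are hypothetical stand-ins for whatever the real site does.

```python
import time

class JustInTimeCache:
    """Fetch results on first request, then serve them from cache
    until they are older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # slow function that scours the sources
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so it can be tested
        self._store = {}           # term -> (time fetched, results)

    def search(self, term):
        key = term.strip().lower()
        now = self._clock()
        cached = self._store.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]       # fast path: someone searched this recently
        results = self._fetch(key) # slow path: first search, or cache expired
        self._store[key] = (now, results)
        return results
```

The first person to search a term pays for the slow fetch; everyone after that gets the cached copy until it expires, and the next fetch can then pick up different, hopefully better, results.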
How can I help with ISayHello.com?
We will be posting various ways you can contribute. So check back every so often. We will also be running contests, and posting full time positions.
Permalink
04.18.08
Posted in Site News, Technology News at 2:57 pm by Brandon Wirtz
For a long time I was competing in my own way with Mahalo.com, Jason Calacanis’s human powered search site. The formula was simple: take the Google.com/Trends data, sort through the most "ownable" terms (basically defined as anything with fewer than 10k competing pages), and have a girl in India create a blog post about them. This is not quite Jason’s model; he was creating pages for every term. But as he has Sequoia’s VC money behind him and I only have my own money, I had to generate revenue to pay for the content creation.
The model worked. I could recoup the cost of about 10% of the pages the first day, and 40% the first month, and the rest I’d lose money on, but taking a 6 month average all pages would break even and begin earning money. The problem is I don’t really have the bankroll to sponsor 6 months’ worth of posts to make $5 on each of them. The model doesn’t work that well.
If I write the posts most of the time I can make $5 the very first day, and for several days after that, but my time is worth more than $15 an hour.
So I needed a solution that would create pages at least equal to the quality of a Mahalo.com post, created at zero cost, using nothing more than my server and content that is available through various web APIs.
The results are a bit slow if you are the first person to search a term, but caching makes the results fast for the next person.
If you’d like to be in the beta, contact me (Brandon at XYHD.tv) I’ll point you to the site.
I’m still in the process of picking a domain, all the good ones are taken, but likely I will have several, each tuned for different types of searches.
I was blown away that there is not a good "classification" tool on the web, basically just to sort, this X term is of type Y. Like Britney Spears is a Person, Paris France is a Place.
My results could be a lot better if I had this, because then I’d know where to look for types of queries. No worries, I can build that into the logic later, or make that part of the human part of the equation.
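A minimal sketch of the "X term is of type Y" idea and how it could decide where to look, assuming a hand-built lookup table. The table contents, the source names, and the function names are all invented for illustration.

```python
# Hypothetical classification table: normalized term -> type.
TERM_TYPES = {
    "britney spears": "Person",
    "paris france": "Place",
}

# Hypothetical routing table: type -> kinds of sources worth querying.
SOURCES_BY_TYPE = {
    "Person": ["biographies", "news", "photos"],
    "Place": ["maps", "travel guides", "photos"],
}
DEFAULT_SOURCES = ["general web search"]

def classify(term):
    """Return the type of a term, or None if it is unclassified."""
    key = " ".join(term.lower().replace(",", " ").split())
    return TERM_TYPES.get(key)

def sources_for(term):
    """Pick where to look based on the term's type."""
    return SOURCES_BY_TYPE.get(classify(term), DEFAULT_SOURCES)
```

`sources_for("Paris, France")` goes to maps and travel guides, while an unclassified term falls back to the general search. The human part of the equation would be filling in `TERM_TYPES`.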
Permalink
04.13.08
Posted in Responses at 10:02 pm by Brandon Wirtz
Philip Parker is almost my idol. There are a lot of black hats that use computer generated content to make money. Most of them don’t use it to game Amazon.
The problem is, as is often the case, that they suck. You see, most legit uses of computer generated content create tools which the user knows are generated. Google is computer generated content. Wikipedia is not. I think when you boil it down, this is the difference between computer and human generated content. Google points you in the right direction, and Wikipedia, for all of its inaccuracies about hard topics, is at least generally complete for the menial stuff.
Parker’s works which cover medical topics, on the other hand, strike me as borderline dangerous. Some of my computer generated services return fun information, like singles who have expressed interest in the topics of a given webpage, which I bill as "your readers might look like this…". That has little practical use, and no one is likely to be injured by it. Parker’s books, on the other hand, are simply compilations of other reports and compendiums of non-copyrighted materials, but are also not fact checked.
Parker has 200k books; I know black hats with 200k websites, each with 200k pages. Parker is just more willing to take individuals’ money, whereas black hats prefer to take that of online advertisers. A quick Turing Engine, some starting keywords, and you have as many pages on whatever topic you want. Throw in a Thesaurus and you have the makings of a $100 a day website.
I find both practices distasteful, but at least the black hats are trying to fool the computer, not unwitting shoppers.
All of that said…. It does lend to the Public Data conversations of earlier this week.
Permalink