Monday, December 20, 2004

Google Suggest Dissected Follow-up...

Thanks for all of the feedback, good and bad - if I couldn't take criticism, I wouldn't be posting things publicly, right? My New Year's resolution is to think about what I am writing, and improve my writing ability.

I just wanted to make sure that everyone understands that what I was writing about in my last post was an attempt to learn, and in turn teach people about, the client-side technologies used within Google Suggest. The true technology, as many people pointed out, is the incredible back-end search / server technology, which isn't about to be duplicated by anyone who wasn't already reverse engineering everything that Google is doing. I'm hoping that we can all start building better web interfaces and that they become commonplace a year or two from now. I'm not trying to undermine / undo someone's hard work at Google. For those who weren't technical enough to understand what I had written, I haven't hacked or cracked anything :)

I have some good ideas for some other cool interfaces and client/server things I'd like to look at and investigate over the next couple of weeks - stay tuned.

Saturday, December 18, 2004

Sweet... slashdotted...

Sweet...

Tuesday, December 14, 2004

Google Suggest Dissected...

People have been contributing their two cents to how this works, but I have un-compressed (i.e. re-written) Google's compressed javascript, so that the average web developer should be able to get a detailed understanding of how this works... My final rewrite is available from my website here.

I saw the coolest thing I've seen since realizing that Mozilla was embedding a wsdl-enabled SOAP client into its browser... Google Suggest returns suggested results as you type... This is technically amazing on at least two different levels:


  1. How fast this is... I type pretty fast, and it updates with every single keypress...
  2. The cool web interface... I used to be pro-server side web updates, and avoiding javascript, but I'm really turning around on this with the impressive interfaces I've seen with gmail, and now google suggest (among others...)


So everyone is impressed by this... My shock and awe goes further in terms of how nicely this interface works:


  1. That the suggestion list lines up perfectly with the query input field...
  2. The high-lighting of the additionally suggested text (I type "fa", it suggests "fast bugtrack" and highlights the "st bugtrack" so that the next character I type wipes out its suggestion... beautiful...)
  3. The great handling of keypresses (cursors up and down...)
    And after going through google's code:
  4. How the javascript caches the dynamic results so that if you backspace, it doesn't have to go back to google...
  5. How the code dynamically adjusts its main (time/alarm) driven loop based on how quickly you're getting results back from google...


So I wanted to understand the web interface and its dynamic behaviour... Just a note that the good and brilliant folks at google wrote all of the code we'll be looking at here this evening... I didn't write any of it, but I will be stepping through it with you, and hopefully helping to improve everyone's understanding of this great dynamic web interface...

A couple of tips for how I went about reversing the logic here:

  1. I saved the html and javascript locally... I managed to get a local copy running, and placed some alerts into the code to observe behaviour as well as using the javascript console to catch places I made mistakes renaming variables and functions...
  2. The google code uses an XMLHttp object to make calls back to google, and executes the results... to fully understand the code, I needed to see what google is sending back... BUT when I tried the url directly, I didn't get anything but a 404 back from google (it turns out I had mis-typed the generated url...)... I tried to have my browser go through a local proxy server, but it appears that the XMLHttp object doesn't use the browser's proxy when communicating (which means that this might not work if you're behind a proxy server... Can people confirm this??)... I would have fallen back on a packet sniffer to capture the data, but caught my mistake in the URL before reaching this point...


Looking at the main page source, just go to google and view source... At the bottom of this file, we can see a reference to the javascript which drives the dynamic interface (available directly from google here...)

The good folks at google compress their code as they should, so in order to understand it, I first re-indented it as can be seen here... Then I began the fun process of figuring out what the global variables are for, and what the various functions do, and renaming them to meaningful names... I made it pretty far as can be seen in my final re-write of Google's suggest javascript code here...

Things I didn't know before this exercise that I learned going through this...
1) You can turn the browser's autocomplete off by adding the autocomplete="off" attribute to an input field... How did I not know this before...
2) You can use the XMLHTTP / XMLHttpRequest object to communicate back with a server and get new info / instructions without refreshing the page... the new black of web development... go read everything you can about this...
3) How powerful the keypress handling can be with javascript... (capturing keyup/keydown events and changing state for cursor key events, etc...)
4) You can highlight text in an input field using javascript...
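
To make point 3 a little more concrete, here is a minimal sketch of how cursor keys could move a highlight through a suggestion list... This is not google's code: the function name, the constants and the "-1 means nothing highlighted" convention are all my own assumptions (38 and 40 are the usual keyCode values for the up and down arrows)...

```javascript
// Usual keyCode values for the cursor keys.
const KEY_UP = 38;
const KEY_DOWN = 40;

// Hypothetical helper: given the currently highlighted row (-1 = none),
// the key that was pressed and the number of rows, return the new row.
function moveHighlight(currentIndex, keyCode, rowCount) {
  if (rowCount === 0) return -1;              // nothing to highlight
  if (keyCode === KEY_DOWN) {
    return Math.min(currentIndex + 1, rowCount - 1);
  }
  if (keyCode === KEY_UP) {
    return Math.max(currentIndex - 1, -1);    // moving above row 0 clears it
  }
  return currentIndex;                        // any other key: no change
}
```

A keydown handler would call something like this with the event's key code and then restyle the matching row...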


Stepping through it:

The html page calls InstallAC()...
This sets up the system... An interesting line:
var Jb="zh-CN|zh-TW|ja|ko|vi|";
So while they say they support English only, there is definitely code that looks for Chinese, Japanese, Korean, and Vietnamese locales and handles requests appropriately...

The InstallAC function calls another function (I called it installACPart2)... This function checks that our browser supports XMLHttp, and creates what I call the "_completeDiv"... the DIV in which google suggestions will be populated when we get data back from google... It uses absolute positioning to line it up with the input text field, and is initially hidden...
The installACPart2 function also sets up some keydown and resize event handlers... It also begins building the url that we will use for our dynamic requests to google...
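
The "lining up" part can be sketched roughly like this... This is my own illustration, not the code from ac.js: pageOffset and alignBelow are hypothetical names, and hiding the DIV with the visibility style is an assumption... It only relies on the standard offsetLeft / offsetTop / offsetParent / offsetWidth / offsetHeight properties that elements expose...

```javascript
// Walk up the offsetParent chain, adding up offsets, to get the
// element's position relative to the whole page.
function pageOffset(element) {
  let left = 0;
  let top = 0;
  for (let node = element; node; node = node.offsetParent) {
    left += node.offsetLeft;
    top += node.offsetTop;
  }
  return { left: left, top: top };
}

// Hypothetical helper: place an absolutely positioned DIV directly
// under the input field, with the same width, and keep it hidden
// until there are results to show.
function alignBelow(inputField, completeDiv) {
  const pos = pageOffset(inputField);
  completeDiv.style.position = "absolute";
  completeDiv.style.left = pos.left + "px";
  completeDiv.style.top = (pos.top + inputField.offsetHeight) + "px";
  completeDiv.style.width = inputField.offsetWidth + "px";
  completeDiv.style.visibility = "hidden"; // assumption: shown later
}
```

Calling the same helper again from a resize handler would keep the list lined up when the window changes size...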

The function I called mainLoop sets itself up to be called repeatedly using the javascript setTimeout function... It's interesting to note that the designers decided to use this timeout-based mechanism rather than sending a request on every keydown... This handles fast typists on slow connections (so if I typed 3 characters between timeouts, a single request would go out to google...) The mainLoop checks if the state of the input field has changed and if so, takes action - looking first in the result cache, then making a call out to google... The google suggestion code also handles older browsers that don't have an XMLHttp object by using cookies and frame reloading (I haven't tried this yet...)
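
Here is a stripped-down sketch of that polling idea, to show the shape of it... Everything here is hypothetical (createSuggestLoop, tick, start and the three callbacks are names I made up), and the empty-field and stale-reply checks are my own assumptions rather than something I traced in google's code...

```javascript
// readInput():            returns the current text of the input field
// fetchSuggestions(q, cb): asks the server, calls cb(results) later
// showSuggestions(list):   displays a list of suggestions
function createSuggestLoop(readInput, fetchSuggestions, showSuggestions) {
  const cache = {};     // query text -> results already received
  let lastQuery = "";   // what the field held on the previous pass

  // One pass of the loop: only act if the field changed.
  function tick() {
    const query = readInput();
    if (query === lastQuery) return "unchanged";
    lastQuery = query;
    if (query === "") {                 // assumption: empty field, no request
      showSuggestions([]);
      return "cleared";
    }
    if (Object.prototype.hasOwnProperty.call(cache, query)) {
      showSuggestions(cache[query]);    // e.g. after a backspace
      return "cache";
    }
    fetchSuggestions(query, function (results) {
      cache[query] = results;
      // Ignore replies for text the user has already typed past.
      if (query === lastQuery) showSuggestions(results);
    });
    return "request";
  }

  // Re-arm the timer after every pass, like a setTimeout-driven loop.
  function start(delayMs) {
    (function loop() {
      tick();
      setTimeout(loop, delayMs);
    })();
  }

  return { tick: tick, start: start };
}
```

Because the field is only sampled once per pass, several keypresses between two passes turn into a single request, and backspacing to text that was already looked up is answered from the cache...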

The callGoogle routine is fairly straightforward... It makes calls of the format (if I am in an English locale, and have typed "fast bug"):
http://www.google.com/complete/search?hl=en&js=true&qu=fast%20bug
It sets up a callback function on the _xmlHttp.onreadystatechange event that simply evaluates whatever gets returned from google (which ends up being a javascript function call)...
What gets sent back looks like this:

sendRPCDone(frameElement, "fast bug", new Array("fast bug track", "fast bugs", "fast bug", "fast bugtrack"), new Array("793,000 results", "2,040,000 results", "6,000,000 results", "7,910 results"), new Array(""));

The sendRPCDone function is defined in the ac.js file... It adjusts timing in the mainLoop, caches the results received, sets up the _completeDiv DIV with the result arrays, and ultimately ends up displaying this DIV...
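
To tie the request and the response together, here is a small sketch of building that url and of caching the two parallel arrays that come back... buildSuggestUrl, cacheSuggestResponse and resultCache are hypothetical names of mine, not the ones in ac.js, and I've left out the timing adjustment since I'd only be guessing at the exact rule...

```javascript
// Build a request url in the format shown above.
// encodeURIComponent turns the space in "fast bug" into %20.
function buildSuggestUrl(language, typedText) {
  return "http://www.google.com/complete/search" +
    "?hl=" + encodeURIComponent(language) +
    "&js=true" +
    "&qu=" + encodeURIComponent(typedText);
}

// Hypothetical cache: typed text -> rows of { text, count }.
const resultCache = {};

// Pair up the suggestion array and the result-count array that the
// response passes as separate arguments, and remember them.
function cacheSuggestResponse(typedText, suggestions, resultCounts) {
  const rows = suggestions.map(function (text, i) {
    return { text: text, count: resultCounts[i] };
  });
  resultCache[typedText] = rows;
  return rows;
}
```

With the sample response above, cacheSuggestResponse("fast bug", ...) would store four rows under the key "fast bug", ready to be displayed again without another round trip...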

The function displaySuggestedList takes the results and dynamically creates a series of DIV and SPAN elements (using the DOM) that ultimately form the suggestion list that gets displayed... For each element in our list, the structure looks something like this (where the name in parentheses is the variable used in the code):

<DIV (u) - mousedown/mouseover/mouseout class="aAutoComplete">
<SPAN (ka) class="lAutoComplete">
<SPAN (ua) class="cAutoComplete">
bug tracking
</SPAN (ua)>
<SPAN (ea) class="dAutoComplete">
500,000 results
</SPAN (ea)>
</SPAN (ka)>
</DIV (u)>
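
Building one of those rows with the DOM could look roughly like this... buildSuggestionRow is a name I made up, the document is passed in as a parameter only so the sketch is easy to try outside a browser, and the mouse handlers are left out... The class names are the ones from the structure above...

```javascript
// Create DIV > SPAN > (SPAN text, SPAN count) for one suggestion.
function buildSuggestionRow(doc, suggestion, resultCount) {
  const row = doc.createElement("div");        // (u)
  row.className = "aAutoComplete";

  const line = doc.createElement("span");      // (ka)
  line.className = "lAutoComplete";

  const text = doc.createElement("span");      // (ua): suggested text
  text.className = "cAutoComplete";
  text.appendChild(doc.createTextNode(suggestion));

  const count = doc.createElement("span");     // (ea): number of results
  count.className = "dAutoComplete";
  count.appendChild(doc.createTextNode(resultCount));

  line.appendChild(text);
  line.appendChild(count);
  row.appendChild(line);
  return row;
}
```

In a page you would call buildSuggestionRow(document, "bug tracking", "500,000 results") once per result and append each row to the suggestion DIV...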


The Pa() function [I never came up with a satisfactory name] gets called when results are received and whenever a key is pressed (and perhaps on some of the mouse events as well?)... It does the high-lighting of text that we didn't type...
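
The high-lighting trick itself is simple enough to sketch... This is my own simplified version, not a translation of Pa(): highlightCompletion is a hypothetical name, and it uses setSelectionRange on the input field, which may not be what google's code does in every browser...

```javascript
// Fill the field with the suggestion and select only the part the
// user did not type, so the next keypress replaces it.
function highlightCompletion(inputField, typedText, suggestion) {
  const typedLower = typedText.toLowerCase();
  // Only complete when the suggestion really extends the typed text.
  if (!suggestion.toLowerCase().startsWith(typedLower)) return false;
  if (suggestion.length === typedText.length) return false;

  // Keep the user's own characters, append the rest of the suggestion.
  inputField.value = typedText + suggestion.slice(typedText.length);
  inputField.setSelectionRange(typedText.length, suggestion.length);
  return true;
}
```

So with "fa" typed and "fast bugtrack" suggested, the field ends up holding "fast bugtrack" with characters 2 through 13 (the "st bugtrack") selected...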

You'll want to look over and step through the code yourself to truly understand it... Let me know if you have any questions or comments... There is a good chance that I made a typo or two as I renamed things...