Keith Devens .com
Saturday, March 20, 2010

Lincoln wrote:
Keith (http://keithdevens.com/) wrote:
Google build the "image" on the fly?
Doesn't seem so. They seem to be direct scans. They have the original pagination and formatting and even any original art that's in the book. I'm really at a loss as to how they did it.
Elling wrote:
It's OCR, man....
They scan the pages, apply OCR, and remember the position of the different words.
Probably all there is to it, I think. 
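
As a rough illustration of the idea above (scan, OCR, remember where each word sits), here is a minimal Python sketch. It is not a description of how Google actually does it; the WordBox type, the helper names, and the sample coordinates are all made up for the example. It assumes the OCR step has already produced one bounding box per word.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WordBox:
    """One OCR'd word and where it sits on the scanned page (hypothetical type)."""
    text: str
    page: int
    x: int
    y: int
    width: int
    height: int


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so 'analogical,' matches 'analogical'."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def build_index(boxes):
    """Map each normalized word to every box where it appears."""
    index = defaultdict(list)
    for box in boxes:
        key = normalize(box.text)
        if key:
            index[key].append(box)
    return index


def highlight_regions(index, query):
    """Return the boxes to draw a highlight over for a single-word query."""
    return index.get(normalize(query), [])


# Made-up OCR output for one line of a scanned page.
page_words = [
    WordBox("Reasoning", 1, 40, 100, 90, 14),
    WordBox("is", 1, 135, 100, 14, 14),
    WordBox("analogical,", 1, 154, 100, 88, 14),
]
index = build_index(page_words)
hits = highlight_regions(index, "analogical")
```

The page image itself stays untouched in this sketch; the highlight would just be a rectangle drawn over each returned box.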
Keith (http://keithdevens.com/) wrote:
You're probably right. It just seems so much like magic to me.
I guess they could remember the coordinates of every single word. They even manage to highlight phrases across punctuation and such. One thing I did notice -- in that link, the word "analogical" is split across two lines as "ana-", "logical". The search picks it up as "analogical", but the highlighter catches "ana-" on one line, but not "logical" on the next.
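
For what it's worth, the hyphenation case could in principle be handled by joining a line-ending "ana-" with the first word of the next line before matching, and remembering both pieces so both get highlighted. The Python sketch below is only a guess at one way to do that, not the actual implementation; the function names and the (text, line number) input format are assumptions made for the example.

```python
def normalize(text):
    """Lowercase and drop punctuation, including hyphens."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def logical_tokens(words):
    """words: list of (text, line_number) in reading order.

    Returns (normalized_text, [indices into words]) pairs, where a word
    ending in '-' at the end of a line is joined with the first word of
    the next line.
    """
    tokens = []
    i = 0
    while i < len(words):
        text, line = words[i]
        indices = [i]
        # Join only when the hyphenated piece is the last word on its line.
        while (text.endswith("-") and i + 1 < len(words)
               and words[i + 1][1] == line + 1):
            i += 1
            text = text[:-1] + words[i][0]
            line = words[i][1]
            indices.append(i)
        key = normalize(text)
        if key:
            tokens.append((key, indices))
        i += 1
    return tokens


def find_phrase(words, phrase):
    """Return, for each match, the indices of every word piece to highlight."""
    tokens = logical_tokens(words)
    target = [t for t in (normalize(p) for p in phrase.split()) if t]
    matches = []
    if not target:
        return matches
    for start in range(len(tokens) - len(target) + 1):
        window = tokens[start:start + len(target)]
        if [key for key, _ in window] == target:
            matches.append([i for _, idxs in window for i in idxs])
    return matches


# Made-up OCR output: "analogical" is split across lines 1 and 2.
words = [("purely", 1), ("ana-", 1), ("logical,", 2), ("reasoning", 2)]
matches = find_phrase(words, "analogical reasoning")
```

Here the match covers words 1, 2 and 3, so both "ana-" and "logical," would be highlighted, and the trailing comma does not get in the way of the phrase.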
DJ Hannibal wrote:
It just seems so much like magic to me
"Any sufficiently advanced technology is indistinguishable from magic."
---Arthur C. Clarke, "Profiles of The Future", 1961
Word highlighting is easy. Highlighting a phrase is not so easy unless you have the exact phrase. Google build the "image" on the fly?