Saturday, December 8, 2007

Google's handiwork

Book scans reveal Google's handiwork


An example of some of the hand scans caught on Google Books.

Asher Moses
December 6, 2007 - 4:06PM

In its rush to digitise the corpus of human knowledge, Google inadvertently created a new type of "digital" database.

The search giant's army of human book scanners has reportedly scanned more than one million volumes at a rate of 3000 books a day, uploading the full text of each to http://www.books.google.com for anyone to search and, in the case of out-of-copyright books in the public domain, read.

And you can tell they're in a hurry, as some overzealous scanners left their fingerprints on the pages of history - literally.

Digital bookworms reading titles like the 1855 issue of The Gentleman's Magazine and Plato's The Trial and Death of Socrates have been surprised to find large chunks of some pages blocked by manicured paws clad in pink finger condoms. Others have had their reading interrupted by folded and poorly scanned pages.

Google Australia spokesman Rob Shilkin said Google was now using improved scanning technology to ensure wayward fingers no longer get in the way.

He said Google Book Search, launched as "Google Print" in October 2004, wasn't designed foremost as a place where people could come to read. Most books could not be shown in their entirety due to copyright issues, so the site was best used for searching. Unlike the computers in the local library, Book Search allows people to find books by searching their full text.

"In our lifetimes, it should be possible for a schoolkid in a remote part of Australia to get a list of every book ever written about Albert Einstein or Don Bradman, just by typing a few words into a computer," Shilkin said.

"In the time since we initially began our scanning, we've vastly improved our scanning technology so that a random finger is automatically brought to our attention long before we return the book back to the shelf."

He said Google would fix errors in pages scanned earlier in the project as they were brought to its attention.

Over 10,000 publishing partners are providing books to be scanned for the project and 27 universities are participating, including the Harvard, Stanford, Oxford and Princeton libraries.

However, while all of the partners allow Google to scan books in their entirety, many do not let users actually read all pages online. Some show only sample pages or snippets of basic information, alongside links to where the book can be bought or found in a library.

But out-of-copyright books like old editions of Macbeth can be downloaded as a complete PDF for printing or reading on the computer.

In September, Google added a feature called "My Library", allowing people to add books to their own personal online library and label, rate, review or search them.
