Maxakova 1


Vera Maxakova

ATEC 6V81

David Parry

December 13, 2008



                Google’s Print Digitization Efforts: Benefits and Obstacles

       In its constant quest to “organize the world's information and make it universally

accessible and useful,” Google has moved beyond indexing digital content and plunged into

the previously uncharted territory of digitizing and indexing books, a practice that,

unsurprisingly, raised the dreaded issue of copyright along with some other, more

surprising concerns. The Google Book Search project digitizes and indexes books

obtained through its Library Project and the Partner Programme, allowing users to see

not only website results but also snippets of text or full books that match their query.

This was never possible before the internet, and with this project Google is enabling

millions of people all over the world to search and access books that they might never

have been able to discover, much less read. In this project, Google realizes the dream

of every library before it, none of which had the means to come close to providing

free and easy access to the public. But not everyone appreciates the benefit that this

project may bring to the public, particularly in light of current copyright laws, and

the lawsuits were quick to follow as Google started adding books that are not in the

public domain to its new “library.”

      One of the concerns was brought up by the president of the Bibliothèque

nationale de France, Jean-Noël Jeanneney. In Google and the Myth of Universal

Knowledge, Jeanneney’s main argument is that Google, being an American company,

will tend to give preference to English-language books, so that “the dominance of

work from the United States may become even greater than it is today” (Jeanneney 6).

In May of 2005 his fears were confirmed as Google released the first version of what

was then known as Google Print, in which “the inevitable self-centering of the selections

was immediately apparent” (Jeanneney 11).

      Jeanneney’s concern is understandable. Google strives to be the archive of all

knowledge, and under the old model of the archive, a title’s inclusion meant that it

was an important piece of work that needed to be preserved. Because of the constraints

of physical space, archives and libraries had to leave out some works in favor of

others deemed more significant. Jeanneney could therefore have interpreted a preference

for U.S. works over European, non-English ones as a judgment that the latter were

somehow unworthy of inclusion, but that would be a gross misinterpretation, considering

that the Google Books project is still in the very early stages of development. Google

only recently reached a settlement with the Association of American Publishers, which

filed a copyright infringement lawsuit against the company back in 2005, and it would be

nothing short of suicidal for Google to try to digitize books in foreign countries when

it is having so much trouble with just its domestic copyright law. In June 2006 alone,

two foreign lawsuits were filed against Google’s new project. La Martinière, a French

publisher, accused

Google of “counterfeiting and breach of intellectual property rights” when Google

indexed and published excerpts of about 100 of the publisher’s titles (“French book

publisher sues Google”). The second lawsuit, filed by the German publisher WBG and

backed by the German Publishers Association, was dropped at the end of the month, as

“the [German] court ruled that there was no copyright violation resulting from the

development of Google’s project” (“Google's victory in court against German

publisher”). These cases are a clear indication of what is to come if Google attempts to

expand into other countries, especially in this early stage of the project’s development

and while the copyright situation in the U.S. is so unfavorable to the project.

       According to Google’s Chairman and CEO Eric Schmidt, however, Google’s practices

fall within the confines of copyright law’s “fair use” doctrine, “balancing the rights

of copyright-holders with the public benefits of free expression and innovation [that]

allows a wide range of activity … without copyright-holder permission” (Mathes). On

these grounds, the University of Michigan permitted Google to digitize the university

library. Its head librarian, Paul Courant, agrees that Google is not breaking any

copyright laws in scanning books and providing free access to them, and states that

“the University of Michigan (and the other partner libraries) and Google are changing

the world for the better” (Anderson).

       Unlike works in the old model of the archive, books digitized by Google are

preserved not only on Google’s servers but also, in many cases, on the servers of the

institutions that provided them. The aforementioned University of Michigan, for

example, not only keeps the books scanned by Google but also receives digital copies of

the scanned works to use for its own purposes. Besides the obvious aim of allowing users to see book

texts in their search results, another great benefit of this mass digitization is the

protection of these books from the damage and loss that often occur in today’s

libraries. This is especially important for rare and out-of-print works. “Checking out”

books through Google’s new system would achieve the same dispersal with no damage to

the books and no risk of losing them, while also providing multiple backups of each

book. The notion of the book is fully realized on the internet through projects like

Wikipedia and, now, Google Books, where knowledge is collected in one place, can be

easily accessed (dispersed) by anyone with an internet connection, and is not

threatened by the effects of said dispersal. This gathering and dispersal is the main

idea of the book and the archive (Paper Machine 15).

       The transition out of the constraints of physical space and into a new, virtual

space allows not only for a vastly larger collection of books but also for a new and

more efficient way of searching and sorting them. Google takes the age-old concept of

the card catalog, frees it from the space limits of a note card, and includes the

whole body of the publication in its index. However, some people are still struggling

to make the mental switch from the old, physically limited model, among them Anne

Bergman-Tahon, the head of the Federation of European Publishers (FEP).

       Bergman-Tahon believes that “virtual borrowing” will threaten the book, as well as

the libraries and bookstores that lack the physical space to store the volume of books

that can be stored on the internet. As a remedy, she plans to “limit the number of

copies available to web users. When there are no copies left on the virtual

bookshelves, they will have to either reserve a copy and wait, or go to the bookshop

and buy an e-book” (Mompel). Ironically, this practice defeats the whole purpose of

having a “virtual bookshelf” with digital copies that are not restricted by the confines of

the physical world and can be distributed to an unlimited number of potential readers in

any part of the world, at any time. The idea of digitizing books is to distribute knowledge

to as many people as possible with few or no barriers to entry, something that

brick-and-mortar bookstores and libraries could never accomplish.

       It is important to note that the library as we know it today did not always operate

this way. In the seventeenth century, Oxford’s Bodleian Library, in its attempt to

safeguard the books, refused all requests to check them out and take them home. The

policy was so strict that even King Charles I himself was denied this luxury that we

now take for granted. “The library was a temple of learning, where scholars might come

to read and learn. The books stayed put” (Macintyre). This is no longer the case:

today, anyone can come to the library, take a book home, and study it at leisure.

Google Books and similar book digitization projects simply take this concept a step

further by bringing the library online, where readers are not constrained by the

library’s hours of operation or physical location. “Technology has made achievable what

the librarians of Alexandria could only dream of: one vast, searchable, all-encompassing

book, the complete history of the race” (Macintyre). The seventeenth-century Bodleian

Library model evolved, and it may be time for the twentieth-century library to follow

its example now that technology has changed, so that the library can become what it

was always meant to be: a repository of all the world’s knowledge at the readers’ fingertips.

       Bergman-Tahon also fears that paperbacks will disappear and that libraries and

bookstores will be forced out of business. The argument is as old as the printed word

itself. When the printing press gained popularity, there was a similar

concern that scribes would be put out of work, and as history showed, those who adapted

to the new technology survived. Elizabeth Eisenstein documents one such case in The

Printing Revolution in Early Modern Europe: in the late fifteenth century, “the most

celebrated Florentine book merchant,” Vespasiano da Bisticci, was forced out of

business due to “dealing exclusively in manuscripts,” while the business of his rival

Zanobi di Mariano flourished because, unlike Vespasiano, he began selling printed books

(Eisenstein 18). The bookstores as we know them today may in fact be forced out of

business or face significant difficulty in trying to stay in business using the old model,

but inevitably, new bookstores will emerge and will thrive as they embrace new

technology.

      Google’s idea is not only to store knowledge but to make it easily accessible and

usable, since resources that cannot be found by the user may as well not exist.

Indexing the whole text of a publication increases its chances of being found when an

appropriate search query is entered. Jeanneney argues that a project of this magnitude

and significance should not be left to a private company but needs to be managed by

a more stable agency, such as the government, which sits uneasily with an earlier

statement that government-run libraries and archives are “chronically underfunded”

(viii). More financial support from the government would certainly help such projects,

but as history has shown, governments have funded them poorly, so why would this

change now? Leaving this job to the government, with its poor history of funding such

projects, would mean that the digital library project would either never have been

started or would not be as rich and successful as it may become in Google’s hands. And

when a private company

with enough means and ambition wants to pursue this endeavor, it should only be

encouraged.

       As stated earlier, it is important that resources be findable, and who better to

provide that “findability” than the search engine with the best search algorithms?

“Libraries die when people forget what is in them: they thrive when we are reminded of

their riches” (Macintyre). The inability to find a publication threatens the dispersal

of knowledge: resources that cannot be found cannot reach their users and are thus

rendered useless. How are we to trust the government with the colossal task of

collecting, digitizing, and making easily available more books than it managed in the

old-style libraries and archives, which it was already failing by neglecting to fund

them? Even if the government were to accomplish the task of collecting and storing this

vast body of work, how would it go about providing for easy access and use by the

people? Democratic institutions can be measured by how much access their people have to

the archive (Archive Fever 4), and considering the way most government-run websites are

built, and how often their search functions malfunction or prove ineffective, it is

hard to imagine this project reaching its full potential under the administration of

the government as it operates today.

       One of Jeanneney’s biggest concerns seems to be the criteria by which Google will

choose the books to include in the Google Books database, and the fact that it is up

to Google alone to decide on those criteria (5). He seems to be very uncomfortable

with the idea that a private company will have the power to make this important

decision, which until so recently was left to government-operated and -subsidized

libraries and archives. His fears may be well justified in this case (although Google promises to

not be evil). As Jacques Derrida stresses in Archive Fever, “[t]here is no political power

without control of the archive,” which would mean that the entity that controls the

archive – in this case, the largest archive ever assembled – would hold unprecedented

power, a monopoly on knowledge (Archive Fever 4). Thus, it is understandable why

Jeanneney may be disturbed by the idea of one private company controlling the largest

knowledge bank in the world and why he suggests that for this reason a government

agency is a better fit for the job.

       So far the only obstacle preventing Google from indexing every book in the world

is copyright law. Unlike other companies that have attempted to digitize books, Google

first digitizes the books and then offers publishers the opportunity to opt out of

being “published” in Google’s library. Other services, such as MSN, Yahoo! and even

Amazon with its new Search Inside!™ feature, obtain permission from the publisher

before posting a title’s full text or even a limited preview online. According to

Google, seeking permission first would slow down the digitization efforts (Eun) and,

most likely, significantly increase the cost, since Google would have to contact every

author and try to obtain authorization for use of their content. Digitizing first is

what Google calls the Opt-Out Approach, and it is the reason Google has been more

successful at digitizing a larger number of books than the competing services offered

by MSN, Yahoo! and Amazon. The approach is also more inclusive: some authors would

willingly publish their works online through Google but may not be aware of its

digitization efforts, and under a permission-first model, unless Google contacted

them, their work would not be published and users would not be able to find it.

       What most people don’t realize in the midst of these lawsuits and criticisms of the

project is that publishers will in fact benefit (and some already do) from this new

exposure of their works on the web. The Google Book Search information site’s

“Thought & Opinions” section quotes publishers and authors who understand the

marketing potential of the project and the benefits they receive from it, and who

praise Google for undertaking such a substantial effort. One documented case is C.S.

Lewis’s Mere Christianity, which in 16 months acquired only 351 page views and 14

clicks on the site of its publisher, HarperCollins, while the same book on Google Book

Search had over fifteen thousand views and almost three hundred click-throughs (“LBF

Daily”). Google’s project helped this publisher raise awareness of backlist books that

might not have been discovered or bought otherwise. In light of these facts, it is

ironic that organizations like the AAP (Association of American Publishers) would seek

damages for copyright infringement from Google when their members could have been

benefiting from its services all this time.

       Another interesting lawsuit, filed by a few large publishing companies that

included Simon & Schuster, the Penguin Group, and McGraw-Hill, attempted to require

Google to “destroy all unauthorized copies made by Google through the Google Library

Project – [now Google Book Search] – of any copyrighted works” (Toobin). Ridiculous as

this request may be, it raises a very interesting question: how does one destroy a

literary work on the web? In the physical world this could be accomplished with book

burnings, back when books were still rare and there was a chance of eliminating a work

from existence. This concept today seems ludicrous, and even

more so in the near future when full works will be available on the web and possibly

even downloaded, where permitted by the publisher. But what is most interesting in this

case goes back to Jeanneney’s fear of Google’s monopolization of digitized books. If

Google Book Search ever becomes the main and sole source of digital works and is

somehow forced to destroy a book and complies with the request, would that be the

equivalent of a modern book burning?

       Copyright law was created to protect the creative work of authors and publishers,

but it was never meant to limit the public’s access to that work. Google is trying to

build on this second principle and provide virtually limitless access to works that

were previously unsearchable and thus, in some cases, undiscoverable on the web. A book

that cannot be found by a potential reader loses the very reason for its existence.

Google is not only trying to create an archive of all published works but, most

importantly, trying to make that archive easily accessible and searchable in context,

returning results relevant to the query entered by the user, so that he or she can

discover new books and articles that might otherwise have remained out of reach.

                                     Works Cited

Anderson, Nate. "University of Michigan librarian defends Google scanning deal." Ars

      Technica. 18 Nov. 2008. <http://arstechnica.com/news.ars/post/20071126-

      university-of-michigan-librarian-defends-google-scanning-deal.html>.

Derrida, Jacques. Archive Fever: A Freudian Impression. Trans. Eric Prenowitz.

      Chicago: University Of Chicago Press, 1998.

Derrida, Jacques. Paper Machine: Cultural Memory In The Present. Trans. Rachel

      Bowlby. Stanford: Stanford University Press, 2005.

Eisenstein, Elizabeth. The Printing Revolution In Early Modern Europe. New York:

      Cambridge University Press, 1984.

Eun, David. “Our approach to content.” The official Google Blog. 26 Sep. 2006.

      <http://googleblog.blogspot.com/2006/09/our-approach-to-content.html>.

"French book publisher sues Google," BBC News 7 June 2006. 9 Dec 2008.

      <http://news.bbc.co.uk/1/hi/entertainment/5052912.stm>.

“Google's victory in court against German publisher,” EDRI-gram 5 July 2006. 9 Dec

      2008. <http://www.edri.org/edrigram/number4.13/googlegermany>.

Jeanneney, Jean-Noël. Google and the Myth of Universal Knowledge: A View from

      Europe. Trans. Teresa Lavender Fagan. Chicago: University of Chicago Press,

      2007.

Kissell, Joe. "The Bodleian Library: Oxford's famous book sanctorium." Interesting Thing

      of the Day 18 Oct 2004 7 Dec 2008 <http://itotd.com/articles/341/the-bodleian-

      library/>.

“LBF Daily: Google boosts backlist sales, say publishers.” All Business: A D&B

      Company. 7 Mar 2006. <http://www.allbusiness.com/retail-trade/miscellaneous-

      retail-miscellaneous/4647513-1.html>.

Macintyre, Ben. "The biggest library ever built." Times Online 16 Nov 2007 7 Dec 2008

      <http://www.timesonline.co.uk/tol/comment/columnists/ben_macintyre/article2879

      538.ece>.

Mathes, Adam. “The point of Google Print.” The Official Google Blog. 19 Nov. 2008.

      <http://googleblog.blogspot.com/2005/10/point-of-google-print.html>.

Mompel, Mariona Vivar. "Google Print Outshines The European Digital Library." Trans.

      Luke Croll. CafeBabel.com. 10 Apr. 2006. 18 Nov. 2008.

      <http://www.cafebabel.com/eng/article/18276/google-print-outshines-the-

      european-digital-librar.html>.

Toobin, Jeffrey. “Google’s Moon Shot.” The New Yorker. 5 Feb. 2007.

      <http://www.newyorker.com/reporting/2007/02/05/070205fa_fact_toobin?current

      Page=all>.

More Related Content

What's hot

Theory of imitation
Theory of imitationTheory of imitation
Theory of imitationApoorv Joshi
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the userlisld
 
Stream of consciousness
Stream of consciousnessStream of consciousness
Stream of consciousnessDayamani Surya
 
Morphology-Syntax Interface
Morphology-Syntax InterfaceMorphology-Syntax Interface
Morphology-Syntax InterfaceDr. Mohsin Khan
 
An internship report on library operations and services of Dhaka University
An internship report on library operations and services of Dhaka UniversityAn internship report on library operations and services of Dhaka University
An internship report on library operations and services of Dhaka UniversityK M Mehedi Hasan
 
Expanding the Concept of Library
Expanding the Concept of LibraryExpanding the Concept of Library
Expanding the Concept of LibraryKathleen Johnson
 
Organisation of Libraries
Organisation of LibrariesOrganisation of Libraries
Organisation of Librariesdeewil
 
The Work of Art In The Age of ... (?)
The Work of Art In The Age of ... (?)The Work of Art In The Age of ... (?)
The Work of Art In The Age of ... (?)_
 
Commonwealth literature an outline
Commonwealth literature an outlineCommonwealth literature an outline
Commonwealth literature an outlineMohan Raj Raj
 
Role of Government in Tourism
Role of Government in TourismRole of Government in Tourism
Role of Government in Tourismnareshtanwar5
 
Post colonial 1
Post colonial 1Post colonial 1
Post colonial 1jakajmmk
 
difference between tragedy and epic.
difference between tragedy and epic.difference between tragedy and epic.
difference between tragedy and epic.NikunjBhatti
 
Markedness: Marked and Unmarked forms
Markedness: Marked and Unmarked formsMarkedness: Marked and Unmarked forms
Markedness: Marked and Unmarked formsIbrahim Muneer
 
RDA Presentation
RDA PresentationRDA Presentation
RDA Presentationjendibbern
 

What's hot (20)

Theory of imitation
Theory of imitationTheory of imitation
Theory of imitation
 
The library in the life of the user
The library in the life of the userThe library in the life of the user
The library in the life of the user
 
Dublin Core Intro
Dublin Core IntroDublin Core Intro
Dublin Core Intro
 
Traditional grammar
Traditional grammarTraditional grammar
Traditional grammar
 
Stream of consciousness
Stream of consciousnessStream of consciousness
Stream of consciousness
 
Morphology-Syntax Interface
Morphology-Syntax InterfaceMorphology-Syntax Interface
Morphology-Syntax Interface
 
An internship report on library operations and services of Dhaka University
An internship report on library operations and services of Dhaka UniversityAn internship report on library operations and services of Dhaka University
An internship report on library operations and services of Dhaka University
 
Expanding the Concept of Library
Expanding the Concept of LibraryExpanding the Concept of Library
Expanding the Concept of Library
 
Organisation of Libraries
Organisation of LibrariesOrganisation of Libraries
Organisation of Libraries
 
Roland Barthes
Roland BarthesRoland Barthes
Roland Barthes
 
Preface to Lyrical Ballads
Preface to Lyrical BalladsPreface to Lyrical Ballads
Preface to Lyrical Ballads
 
The Work of Art In The Age of ... (?)
The Work of Art In The Age of ... (?)The Work of Art In The Age of ... (?)
The Work of Art In The Age of ... (?)
 
Slideshare
SlideshareSlideshare
Slideshare
 
Commonwealth literature an outline
Commonwealth literature an outlineCommonwealth literature an outline
Commonwealth literature an outline
 
Role of Government in Tourism
Role of Government in TourismRole of Government in Tourism
Role of Government in Tourism
 
Post colonial 1
Post colonial 1Post colonial 1
Post colonial 1
 
difference between tragedy and epic.
difference between tragedy and epic.difference between tragedy and epic.
difference between tragedy and epic.
 
Review of The Mirror and the Lamp
Review of The Mirror and the LampReview of The Mirror and the Lamp
Review of The Mirror and the Lamp
 
Markedness: Marked and Unmarked forms
Markedness: Marked and Unmarked formsMarkedness: Marked and Unmarked forms
Markedness: Marked and Unmarked forms
 
RDA Presentation
RDA PresentationRDA Presentation
RDA Presentation
 

Viewers also liked

Viewers also liked (20)

Pop indie
Pop indiePop indie
Pop indie
 
Presentation1
Presentation1Presentation1
Presentation1
 
Presentation1
Presentation1Presentation1
Presentation1
 
Presentation1
Presentation1Presentation1
Presentation1
 
Presentation1
Presentation1Presentation1
Presentation1
 
Changed
ChangedChanged
Changed
 
Kate Nash
Kate NashKate Nash
Kate Nash
 
Links
LinksLinks
Links
 
Presentation3
Presentation3Presentation3
Presentation3
 
Presentation1
Presentation1Presentation1
Presentation1
 
Presentation3
Presentation3Presentation3
Presentation3
 
Lucy !
Lucy !Lucy !
Lucy !
 
Presentation8
Presentation8Presentation8
Presentation8
 
Presentation7
Presentation7Presentation7
Presentation7
 
Observation
Observation Observation
Observation
 
Observation
Observation Observation
Observation
 
Diane arbus presentation (2)
Diane arbus presentation (2)Diane arbus presentation (2)
Diane arbus presentation (2)
 
Baz luhrmann presentation
Baz luhrmann presentationBaz luhrmann presentation
Baz luhrmann presentation
 
Diane Arbus Presentation
Diane Arbus PresentationDiane Arbus Presentation
Diane Arbus Presentation
 
Diane Arbus
Diane  ArbusDiane  Arbus
Diane Arbus
 

Similar to Google Books: Benefits And Obstacles

Cummings LIBR 202 Term Paper
Cummings LIBR 202 Term PaperCummings LIBR 202 Term Paper
Cummings LIBR 202 Term PaperDarcy Cummings
 
Teaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningTeaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningNathan Rinne
 
Google Book Search Presentation
Google Book Search PresentationGoogle Book Search Presentation
Google Book Search Presentationbryboyd
 
Kindle garten, jeff bezos
Kindle garten, jeff bezosKindle garten, jeff bezos
Kindle garten, jeff bezosmpt001
 
Bookless Libraries
Bookless LibrariesBookless Libraries
Bookless LibrariesDheeraj Negi
 
Tonta World Is Flat Yet Not Open Oslo Workshop 10 May 2006 Final Revised
Tonta World Is Flat Yet Not Open Oslo Workshop 10 May 2006 Final RevisedTonta World Is Flat Yet Not Open Oslo Workshop 10 May 2006 Final Revised
Tonta World Is Flat Yet Not Open Oslo Workshop 10 May 2006 Final RevisedYasar Tonta
 
HMID6303 Assignment 1 - Yeap
HMID6303 Assignment 1 - YeapHMID6303 Assignment 1 - Yeap
HMID6303 Assignment 1 - YeapYeap Aun
 
A database of riches michael cairns
A database of riches michael cairnsA database of riches michael cairns
A database of riches michael cairnsMichael Cairns
 
Discovering Library2.0 Libraryservices For The Google Generation Sconul June ...
Discovering Library2.0 Libraryservices For The Google Generation Sconul June ...Discovering Library2.0 Libraryservices For The Google Generation Sconul June ...
Discovering Library2.0 Libraryservices For The Google Generation Sconul June ...Ken Chad Consulting Ltd
 
"eBooks and eReaders - tipping points, is 26 the magic number and predicting ...
"eBooks and eReaders - tipping points, is 26 the magic number and predicting ..."eBooks and eReaders - tipping points, is 26 the magic number and predicting ...
"eBooks and eReaders - tipping points, is 26 the magic number and predicting ...Terry O'Brien
 
Library labs as experimental incubators for digital humanities research
Library labs as experimental incubators for digital humanities researchLibrary labs as experimental incubators for digital humanities research
Library labs as experimental incubators for digital humanities researchSally Chambers
 
4 technology trends every librarian needs to know
4 technology trends every librarian needs to know4 technology trends every librarian needs to know
4 technology trends every librarian needs to knowFacet Publishing
 
LIBER position statement on the Google Book Settlement
LIBER position statement on the Google Book SettlementLIBER position statement on the Google Book Settlement
LIBER position statement on the Google Book SettlementWouter Schallier
 
Sr briefing paper_anderson
Sr briefing paper_andersonSr briefing paper_anderson
Sr briefing paper_andersonbriquetdelemos
 

Similar to Google Books: Benefits And Obstacles (20)

Cummings LIBR 202 Term Paper
Cummings LIBR 202 Term PaperCummings LIBR 202 Term Paper
Cummings LIBR 202 Term Paper
 
Google Books Lecture
Google Books LectureGoogle Books Lecture
Google Books Lecture
 
Teaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningTeaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data mining
 
Google Book Search Presentation
Google Book Search PresentationGoogle Book Search Presentation
Google Book Search Presentation
 
Kindle garten, jeff bezos
Kindle garten, jeff bezosKindle garten, jeff bezos
Kindle garten, jeff bezos
 
Ronald Milne - Fair Use
Ronald Milne - Fair UseRonald Milne - Fair Use
Ronald Milne - Fair Use
 

Google Books: Benefits And Obstacles

  • 1. Maxakova 1 Vera Maxakova ATEC 6V81 David Parry December 13, 2008 Google’s Print Digitization Efforts: Benefits and Obstacles In its constant quest to “organize the world's information and make it universally accessible and useful” by indexing vast amounts of digital content, Google plunged into the previously uncharted territory of digitizing and then indexing books, a practice that, unsurprisingly, raised the dreaded issue of copyright along with some other, somewhat surprising concerns. The Google Book Search project digitizes and indexes books obtained through its Library Project and Partner Programme, which allows users to see not only website results but also snippets of text or full books that match their query. This was never possible before the internet, and with this project Google is enabling millions of people all over the world to search and access books that they might never have been able to discover, much less access, otherwise. In this project, Google realizes the dream of every library before it, none of which had the means to come close to achieving the goal of providing free and easy access to the public. But not everyone seems to appreciate the potential or the benefit that this project may bring to the public in relation to current copyright laws, and the lawsuits were quick to follow as Google started adding non-public domain books to its new “library.”
  • 2. One of the concerns was brought up by the president of the Bibliothèque nationale de France, Jean-Noël Jeanneney. In Google and the Myth of Universal Knowledge, Jeanneney’s main argument is that Google, being an American company, will tend to give preference to English-language books and “the dominance of work from the United States may become even greater than it is today” (Jeanneney 6). In May of 2005 his fears were confirmed as Google released the first version of what was then known as Google Print, in which “the inevitable self-centering of the selections was immediately apparent” (Jeanneney 11). Jeanneney’s concern can be justified. Google strives to be the archive of all knowledge, and according to the old model of the archive, if a title made it into the archive, it was an important piece of work that needed to be preserved. Due to the constraints of physical space, archives and libraries had to leave out some works in favor of other, more significant ones, and the works left out were therefore deemed less important. Jeanneney could have interpreted the possibility that Google would give preference to U.S. works over European, non-English ones as Google considering those works somehow unworthy of inclusion in its project, but that would be a gross misinterpretation, considering that the Google Books project is still in the very early stages of development. Google only recently reached a settlement with the Association of American Publishers, which filed a copyright infringement lawsuit against it back in 2005, and it would be nothing short of suicidal for Google to try to digitize books in foreign countries when it is having so much trouble with just the local copyright laws. In June 2006 alone, two foreign lawsuits were filed against Google’s new project. La Martiniere, a French publisher, accused
  • 3. Google of “counterfeiting and breach of intellectual property rights” when Google indexed and published excerpts of about 100 of the publisher’s titles (“French book publisher sues Google”). The second lawsuit, filed by the German publisher WBG and backed by the German Publishers Association, was dropped at the end of the month as “the [German] court ruled that there was no copyright violation resulting from the development of Google’s project” (“Google's victory in court against German publisher”). These cases are a clear indication of what is to come if Google attempts to expand into other countries, especially at this early stage of the project’s development and while the copyright situation in the U.S. is so unfavorable to the project. According to Google’s Chairman and CEO Eric Schmidt, however, Google’s practices are within the confines of the copyright law’s “fair use” doctrine, “balancing the rights of copyright-holders with the public benefits of free expression and innovation [that] allows a wide range of activity … without copyright-holder permission” (Mathes). On these grounds, the University of Michigan permitted Google to digitize the university library. Its head librarian, Paul Courant, agrees that Google is not breaking any copyright laws in scanning books and providing free access to them, and states that in allowing Google to do so, “the University of Michigan (and the other partner libraries) and Google are changing the world for the better” (Anderson). Unlike in the old notion of the archive, books digitized by Google are preserved not only on Google’s servers but also, in many cases, on the servers of the providers of those works. The aforementioned University of Michigan, for example, not only keeps the books scanned by Google but also gets digital copies of the scanned works to use for its own purposes. Besides the obvious intention of allowing users to see book
  • 4. texts in their search results, another great benefit of this mass digitization is the preservation of these books from damage and loss, which often occur in today’s libraries. This is especially important for rare and out-of-print works. “Checking out” books through Google’s new system would accomplish the same principle of dispersal with no damage to the books and no risk of losing them, while also providing multiple backups of each book. The notion of the book is fully realized on the internet through projects like Wikipedia and now Google Books, where knowledge is collected in one place, can be easily accessed (dispersed) by anyone with an internet connection, and is not threatened by the effects of said dispersal, which is the main idea of the book and the archive (Paper Machine 15). Transitioning into a new, virtual space and out of the constraints of physical space allows not only for a vastly larger collection of books, but also for a new and more efficient way of searching and sorting them. Google takes the age-old concept of the card catalog, frees it from the limited space of a note card, and includes the whole body of the publication in its index. However, some people are still struggling to make the mental switch from the old, physically limited model, among them Anne Bergman-Tahon, the head of the Federation of European Publishers (FEP). Bergman-Tahon believes that “virtual borrowing” will threaten the book, as well as the libraries and bookstores that do not have the physical space to store the volume of books that can be stored on the internet. To remedy this, she plans to “limit the number of copies available to web users. When there are no copies left on the virtual bookshelves, they will have to either reserve a copy and wait, or go to the bookshop and buy an e-book” (Mompel). Ironically, this practice defeats the whole purpose of
  • 5. having a “virtual bookshelf” with digital copies that are not restricted by the confines of the physical world and can be distributed to an unlimited number of potential readers in any part of the world, at any time. The idea of digitizing books is to distribute knowledge to as many people as possible with few or no barriers to entry, which could not be accomplished previously with brick-and-mortar bookstores and libraries. It is important to note that the library as we know it today did not always operate this way. In the seventeenth century, Oxford’s Bodleian Library, in its attempt to safeguard the books, refused all requests to check them out and take them home. The policy was so strict that even King Charles I himself was denied this luxury that we now take for granted. “The library was a temple of learning, where scholars might come to read and learn. The books stayed put” (Macintyre). But this is not the case today. Today, anyone can come to the library, take a book home and study it at their leisure. Google Books and similar book digitizing projects are simply taking this concept a step further by bringing the library online, where readers are not constrained by the library’s hours of operation or physical location. “Technology has made achievable what the librarians of Alexandria could only dream of: one vast, searchable, all-encompassing book, the complete history of the race” (Macintyre). The seventeenth-century Bodleian Library model evolved, and it may be time for the 20th-century library to follow its example, now that technology has changed and the library can become what it was always meant to be – a repository of all the world’s knowledge at the readers’ fingertips. Furthermore, Bergman-Tahon also fears that paperbacks will disappear and libraries and bookstores will be forced out of business. The argument is as old as the printed word itself. When the printing press gained popularity, there was a similar
  • 6. concern that scribes would be out of work, and as history showed, they adapted to the new technology. One such case was documented by Elizabeth Eisenstein in The Printing Revolution in Early Modern Europe, in which “the most celebrated Florentine book merchant” in the late fifteenth century, Vespasiano da Bisticci, was forced out of business due to “dealing exclusively in manuscripts,” while his rival Zanobi di Mariano’s business flourished since, unlike Vespasiano, he began selling printed books (Eisenstein 18). The bookstores as we know them today may in fact be forced out of business or face significant difficulty in trying to stay in business using the old model, but inevitably, new bookstores will emerge and will thrive as they embrace new technology. Google’s idea is not only to store the knowledge, but to make it easily accessible and usable, as resources that cannot be found by the user may as well not exist. Indexing the whole text of a publication increases its chances of being found when an appropriate search query is entered. Jeanneney argues that a project of this magnitude and significance should not be left up to a private company but needs to be managed by a more stable agency, such as the government, contradicting his earlier statement that government-run libraries and archives are “chronically underfunded” (viii). More financial support from the government would definitely help such projects, but as history has shown, the government fails at this miserably, so why would this change now? Leaving this job up to the government, with its poor history of funding such projects, would mean that the digital library project would either never have been started or would not be as rich and successful as it will be in Google’s hands. And when a private company
  • 7. with enough means and ambition wants to pursue this endeavor, it should only be encouraged onward. As stated earlier, it is important that resources are findable, and who better to provide that “findability” than the search engine with the best search algorithms? “Libraries die when people forget what is in them: they thrive when we are reminded of their riches” (Macintyre). The inability to find a publication threatens the dispersal of knowledge: resources are useless if they cannot be found and thus dispersed to the users. How are we to trust the government with this colossal task of collecting, digitizing and making easily available more books than it was in charge of managing in the old-style libraries and archives, at which it was obviously failing by neglecting to provide financial support? Even if the government were to accomplish the task of collecting and storing this vast body of work, how would it go about providing for easy access and use by the people? A democracy can be measured by how much access its people have to the archive (Archive Fever 4), and considering the way most government-run websites are built and the inexplicable malfunctioning and ineffectiveness of their search functions, it is hard to imagine this project reaching its full potential while being under the administration of the government (as it is today). One of Jeanneney’s biggest concerns seems to be the criteria by which Google would choose which books should be included in the Google Books database, and the fact that it is up to Google to decide on those criteria (5). He seems to be very uncomfortable with the idea that a private company will have the power to make this important decision, which was until so recently left to government-operated and -subsidized libraries and archives. His fears may be well justified in this case (although Google promises to
  • 8. not be evil). As Jacques Derrida stresses in Archive Fever, “[t]here is no political power without control of the archive,” which would mean that the entity that controls the archive – in this case, the largest archive ever assembled – would hold unprecedented power, a monopoly on knowledge (Archive Fever 4). Thus, it is understandable why Jeanneney may be disturbed by the idea of one private company controlling the largest knowledge bank in the world and why he suggests that for this reason a government agency is a better fit for the job. So far the only obstacle preventing Google from indexing every book in the world is copyright law. Unlike other companies that have attempted to digitize books, Google first digitizes the books and then presents the publishers with the opportunity to opt out of being “published” in Google’s library. Other services, such as MSN, Yahoo! and even Amazon with its new Search Inside!™ feature, first obtain permission from the publisher before posting a title’s full text or even a limited preview online. According to Google, this practice would slow down the digitization efforts (Eun) and would most likely significantly increase the cost if Google were to contact every author and try to obtain authorization for use of their content. This is what Google calls the Opt-Out Approach, and it is the reason Google has been more successful at digitizing a larger number of books than the competing services offered by MSN, Yahoo! and Amazon. This approach is more efficient because some authors may not even be aware of Google’s efforts to digitize books, even if they are willing to publish their works online through Google; if permission were required and Google had not contacted them to obtain it, their work would not be published and users would not be able to find it.
  • 9. What most people don’t realize in the midst of these lawsuits and criticisms of the project is that publishers will (and some already do) in fact benefit from this new exposure of their works on the web. The Google Book Search information site’s “Thought & Opinions” section provides some quotes from publishers and authors who understand the marketing potential of the project and the benefits they receive from it, and who praise Google for undertaking such a substantial project. One such documented case is C.S. Lewis’s Mere Christianity, which in 16 months had acquired 351 page views and only 14 clicks on the publisher HarperCollins’ site; meanwhile, the same book on Google Book Search had over fifteen thousand views and almost three hundred click-throughs (“LBF Daily”). Google’s project helped this publisher raise awareness of its backlist books, which might not have been discovered or bought otherwise. In light of these facts, it is ironic that organizations like the AAP (Association of American Publishers) would seek damages for copyright infringement from Google when they could have been benefiting from its services all this time. Another interesting lawsuit, filed by a few large publishing companies including Simon & Schuster, the Penguin Group, and McGraw-Hill, attempted to require Google to “destroy all unauthorized copies made by Google through the Google Library Project – [now Google Book Search] – of any copyrighted works” (Toobin). As ridiculous as this request may be, it brings up a very interesting question: how does one destroy a literary work on the web? In the physical world this could be accomplished with book burnings, when books were still rare and there was a chance of exterminating works out of existence. This concept today seems ludicrous, and even
  • 10. more so in the near future, when full works will be available on the web and possibly even downloadable, where permitted by the publisher. But what is most interesting in this case goes back to Jeanneney’s fear of Google’s monopolization of digitized books. If Google Book Search ever becomes the sole source of digital works and is somehow forced to destroy a book and complies with the request, would that be the equivalent of a modern book burning? Copyright law was created to protect the creative work of authors and publishers, but never was it meant to limit the public’s access to said work. Google is trying to exploit the latter and provide virtually limitless access to works previously unsearchable and thus, in some cases, undiscoverable on the web. If a book cannot be found by a potential reader, the whole idea of the book and the reason for its existence are undermined. Google is not only trying to create an archive of all published works, but, most importantly, trying to make it easily accessible and searchable within context, with results relevant to the particular search query entered by the user, to enable him or her to discover new books and articles which he or she might not have been able to access otherwise.
  • 11. Works Cited Anderson, Nate. "University of Michigan librarian defends Google scanning deal." Ars Technica. 18 Nov. 2008. <http://arstechnica.com/news.ars/post/20071126-university-of-michigan-librarian-defends-google-scanning-deal.html>. Derrida, Jacques. Archive Fever: A Freudian Impression. Trans. Eric Prenowitz. Chicago: University of Chicago Press, 1998. Derrida, Jacques. Paper Machine: Cultural Memory in the Present. Trans. Rachel Bowlby. Stanford: Stanford University Press, 2005. Eisenstein, Elizabeth. The Printing Revolution in Early Modern Europe. New York: Cambridge University Press, 1984. Eun, David. “Our approach to content.” The Official Google Blog. 26 Sep. 2006. <http://googleblog.blogspot.com/2006/09/our-approach-to-content.html>. "French book publisher sues Google." BBC News. 7 June 2006. 9 Dec. 2008. <http://news.bbc.co.uk/1/hi/entertainment/5052912.stm>. “Google's victory in court against German publisher.” EDRI-gram. 5 July 2006. 9 Dec. 2008. <http://www.edri.org/edrigram/number4.13/googlegermany>. Jeanneney, Jean-Noël. Google and the Myth of Universal Knowledge: A View from Europe. Trans. Teresa Lavender Fagan. Chicago: University of Chicago Press, 2007. Kissell, Joe. "The Bodleian Library: Oxford's famous book sanctorium." Interesting Thing of the Day. 18 Oct. 2004. 7 Dec. 2008. <http://itotd.com/articles/341/the-bodleian-library/>.
  • 12. “LBF Daily: Google boosts backlist sales, say publishers.” All Business: A D&B Company. 7 Mar. 2006. <http://www.allbusiness.com/retail-trade/miscellaneous-retail-miscellaneous/4647513-1.html>. Macintyre, Ben. "The biggest library ever built." Times Online. 16 Nov. 2007. 7 Dec. 2008. <http://www.timesonline.co.uk/tol/comment/columnists/ben_macintyre/article2879538.ece>. Mathes, Adam. “The point of Google Print.” The Official Google Blog. 19 Nov. 2008. <http://googleblog.blogspot.com/2005/10/point-of-google-print.html>. Mompel, Mariona Vivar. "Google Print Outshines The European Digital Library." Trans. Luke Croll. CafeBabel.com. 10 Apr. 2006. 18 Nov. 2008. <http://www.cafebabel.com/eng/article/18276/google-print-outshines-the-european-digital-librar.html>. Toobin, Jeffrey. “Google’s Moon Shot.” The New Yorker. 5 Feb. 2007. <http://www.newyorker.com/reporting/2007/02/05/070205fa_fact_toobin?currentPage=all>.