1. Maxakova 1
Vera Maxakova
ATEC 6V81
David Parry
December 13, 2008
Google’s Print Digitization Efforts: Benefits and Obstacles
In its constant quest to “organize the world's information and make it universally
accessible and useful,” Google has plunged into the previously uncharted territory of
digitizing and indexing books, a practice that, unsurprisingly, raised the dreaded issue
of copyright along with some other, more surprising concerns. The Google Book Search
project digitizes and indexes books obtained through its Library Project and its Partner
Programme, so that users see not only website results but also snippets of text or full
books that match their query. Nothing like this was possible before the internet, and
with this project Google is enabling millions of people all over the world to search and
access books that they might never have discovered, much less had access to,
otherwise. In this project, Google realizes the dream of every library before it, none of
which had the means to come close to achieving the goal of providing free and easy
access to the public. But not everyone appreciates the benefit that this project may
bring to the public under current copyright law, and the lawsuits were quick to follow
once Google started adding non-public-domain books to its new “library.”
One of the concerns was brought up by the president of the Bibliothèque
nationale de France, Jean-Noël Jeanneney. In Google and the Myth of Universal
Knowledge, Jeanneney’s main argument is that Google, being an American
company, will tend to give preference to English-language books and that “the dominance of
work from the United States may become even greater than it is today” (Jeanneney 6).
In May of 2005 his fears were confirmed as Google released the first version of what
was then known as Google Print, in which “the inevitable self-centering of the selections
was immediately apparent” (Jeanneney 11).
Jeanneney’s concern is understandable. Google strives to be the archive of all
knowledge, and under the old model of the archive, a title’s inclusion meant that it was
an important work that needed to be preserved. Because of the constraints of physical
space, archives and libraries had to leave out some works in favor of others deemed
more significant. Jeanneney could have interpreted the possibility that Google would
give preference to U.S. works over European, non-English ones as Google considering
those works somehow unworthy of inclusion in its project, but that would be a gross
misinterpretation, considering that the Google Books project is still in the very early
stages of development. Google only recently reached a settlement with the Association
of American Publishers, which filed a copyright infringement lawsuit against the
company back in 2005, and it would be nothing short of suicidal for Google to try to
digitize books in foreign countries when it is having so much trouble with domestic
copyright law. In June 2006 alone, two foreign lawsuits were filed against Google’s
new project. La Martinière, a French publisher, accused Google of “counterfeiting and
breach of intellectual property rights” when Google indexed and published excerpts of
about 100 of the publisher’s titles (French book publisher sues Google). The second
lawsuit, filed by the German publisher WBG and backed by the German Publishers
Association, was dropped at the end of the month as “the [German] court ruled that
there was no copyright violation resulting from the development of Google’s project”
(Google's victory in court against German publisher). These cases are a clear
indication of what is to come if Google attempts to
expand into other countries, especially in this early stage of the project’s development
and while the copyright situation in the U.S. is so unfavorable to the project.
According to Google’s Chairman and CEO Eric Schmidt, however,
Google’s practices are within the confines of the copyright law’s “fair use” doctrine,
“balancing the rights of copyright-holders with the public benefits of free expression and
innovation [that] allows a wide range of activity … without copyright-holder permission”
(Mathes). On these grounds, the University of Michigan permitted Google to
digitize the university library. Its head librarian, Paul Courant, agrees that
Google is not breaking any copyright laws in scanning books and providing free access
to them, and states that in allowing Google to do so, “the University of Michigan (and
the other partner libraries) and Google are changing the world for the better” (Anderson).
Unlike works in the old notion of the archive, books that are digitized by Google are
preserved not only on Google’s servers but also, in many cases, on the servers of the
institutions that provided them. The aforementioned University of Michigan, for example,
not only keeps the books scanned by Google, but also receives digital copies of the
scanned works to use for its own purposes. Besides the obvious aim of letting users see
book texts in their search results, another great benefit of this mass digitization is the
preservation of these books from damage and loss, which often occur in today’s
libraries. This is especially important for rare and out-of-print works. “Checking out”
books through Google’s new system would accomplish the same principle of dispersal
with no damage to the books and no risk of losing them, and with multiple backups of
each book. The notion of the book is fully realized on the internet through projects like
Wikipedia and now Google Books, where knowledge is collected in one place, can be
easily accessed (dispersed) by anyone with an internet connection, and is not
threatened by the effects of said dispersal, which is the main idea of the book and the
archive (Paper Machine 15).
Transitioning into a new, virtual space and out of the constraints of physical
space allows not only for a vastly larger collection of books, but also for a new and
more efficient way of searching and sorting them. Google takes the age-old concept of
the card catalog, frees it from the limited space of a note card, and includes the
whole body of the publication in its index. However, some people are still struggling to
make the mental switch from the old, physically limited model, among them Anne
Bergman-Tahon, the head of the Federation of European Publishers (FEP).
Bergman-Tahon believes that “virtual borrowing” will threaten the book, as well as the
libraries and bookstores that do not have the physical space to hold the volume of
books that can be stored on the internet. To remedy this, she plans to “limit the
number of copies available to web users. When there are no copies left on the virtual
bookshelves, they will have to either reserve a copy and wait, or go to the bookshop
and buy an e-book” (Mompel). Ironically, this practice defeats the whole purpose of
having a “virtual bookshelf” with digital copies that are not restricted by the confines of
the physical world and can be distributed to an unlimited number of potential readers in
any part of the world, at any time. The idea of digitizing books is to distribute knowledge
to as many people as possible with few or no barriers to entry, which could not be
accomplished previously with brick-and-mortar bookstores and libraries.
It is important to note that the library as we know it today did not always operate
this way. In the seventeenth century, Oxford’s Bodleian Library, in its attempt to
safeguard its books, refused all requests to check them out and take them home. The
policy was so strict that even King Charles I himself was denied this luxury that we
now take for granted. “The library was a temple of learning, where scholars might come
to read and learn. The books stayed put” (Macintyre). But this is not the case today.
Today, anyone can come to the library, take the book home and study it at their leisure.
Google Books and similar book digitizing projects are simply taking this concept a step
further by bringing the library online where the readers are not constrained by the
library’s operation hours or physical location. “Technology has made achievable what
the librarians of Alexandria could only dream of: one vast, searchable, all-encompassing
book, the complete history of the race” (Macintyre). The seventeenth-century Bodleian
Library model evolved, and it may be time for the twentieth-century library to follow its
example, now that technology has changed and the library can become what it was always
meant to be – a repository of all the world’s knowledge at the readers’ fingertips.
Bergman-Tahon also fears that paperbacks will disappear and that
libraries and bookstores will be forced out of business. The argument is as old as the
printed word itself. When the printing press gained popularity, there was a similar
concern that scribes would be put out of work, and, as history showed, they adapted to
the new technology. One such case was documented by Elizabeth Eisenstein in The
Printing Revolution in Early Modern Europe, in which “the most celebrated Florentine
book merchant” in the late fifteenth century, Vespasiano da Bisticci, was forced out of
business due to “dealing exclusively in manuscripts,” while his rival Zanobi di Mariano’s
business flourished since, unlike Vespasiano, he began selling printed books
(Eisenstein 18). The bookstores as we know them today may in fact be forced out of
business or face significant difficulty in trying to stay in business using the old model,
but inevitably, new bookstores will emerge and will thrive as they embrace new
technology.
Google’s idea is not only to store knowledge, but to make it easily accessible
and usable, since resources that cannot be found by the user may as well not exist.
Indexing the whole text of a publication increases its chances of being found when an
appropriate search query is entered. Jeanneney argues that a project of this magnitude
and significance should not be left to a private company but needs to be managed by
a more stable agency, such as the government, contradicting his own earlier statement
that government-run libraries and archives are “chronically underfunded” (Jeanneney
viii). More financial support from the government would definitely help such projects,
but as history has shown, the government fails miserably at providing it, so why would
this change now? Leaving this job to the government, with its poor history of funding
such projects, would mean that the digital library project would either never have been
started or would not be as rich and successful as it will be in Google’s hands. When a
private company with enough means and ambition wants to pursue this endeavor, it
should only be encouraged.
As stated earlier, it is important that resources are findable, and who better to
provide that “findability” than the search engine with the best search algorithms?
“Libraries die when people forget what is in them: they thrive when we are reminded of
their riches” (Macintyre). The inability to find a publication threatens the dispersal of
knowledge: resources are useless if they cannot be found and thus
dispersed to users. How are we to trust the government with the colossal task of
collecting, digitizing, and making easily available more books than it was in charge of
managing in the old-style libraries and archives, which it was already failing by
neglecting to provide financial support? Even if the government were to accomplish the
task of collecting and storing this vast body of work, how would it go about providing for
easy access and use by the people? A democracy can be measured by how
much access its people have to the archive (Archive Fever 4), and considering the way
most government-run websites are built, with search functions that inexplicably
malfunction or prove ineffective, it is hard to imagine this project reaching its full
potential under the administration of the government as it is today.
One of Jeanneney’s biggest concerns seems to be the criteria by which
Google would choose which books to include in the Google Books database, and the fact
that it is up to Google to decide on those criteria (5). He seems to be very uncomfortable
with the idea that a private company will have the power to make this important
decision, which was so recently left to government-operated and -subsidized libraries
and archives. His fears may be well justified in this case (although Google promises to
not be evil). As Jacques Derrida stresses in Archive Fever, “[t]here is no political power
without control of the archive,” which would mean that the entity that controls the
archive – in this case, the largest archive ever assembled – would hold unprecedented
power, a monopoly on knowledge (Archive Fever 4). Thus, it is understandable why
Jeanneney may be disturbed by the idea of one private company controlling the largest
knowledge bank in the world and why he suggests that for this reason a government
agency is a better fit for the job.
So far the only obstacle preventing Google from indexing every book in the world
is copyright law. Unlike other companies that have attempted to digitize books, Google
first digitizes the books and then offers publishers the opportunity to opt out of
being “published” in Google’s library. Other services, such as MSN, Yahoo! and even
Amazon with its new Search Inside!™ feature, first obtain permission from the
publisher before posting a title’s full text or even a limited preview online. According to
Google, this practice would slow down its digitization efforts (Eun) and, most likely,
significantly increase the cost, since Google would have to contact every author and try
to obtain authorization for use of their content. Google calls its method the Opt-Out
Approach, and it is the reason Google has been more successful at digitizing a larger
number of books than the competing services offered by MSN, Yahoo! and Amazon. The
approach is also more efficient because some authors who would be willing to publish
their works online through Google may not even be aware of its efforts to digitize
books; under a permission-first model, if Google never contacted them, their work
would not be published and users would not be able to find it.
What most people don’t realize in the midst of these lawsuits and criticisms of the
project is that publishers will (and some already do) in fact benefit from this new
exposure of their works on the web. The Google Book Search information site’s
“Thought & Opinions” section provides quotes from publishers and authors who
understand the marketing potential of the project and the benefits they
receive from it, and who praise Google for undertaking such a substantial project. One
documented case is C.S. Lewis’s Mere Christianity, which in 16 months had acquired
351 page views and only 14 clicks on the site of its publisher, HarperCollins, while the
same book on Google Book Search had over fifteen thousand views and almost three
hundred click-throughs (“LBF Daily”). Google’s project helped this publisher raise
awareness of backlist books that might not have been discovered or bought
otherwise. In light of these facts, it is ironic that organizations like the AAP (Association
of American Publishers) would seek damages for copyright
infringement from Google, when they could have been benefiting from its services all
this time.
Another interesting lawsuit, filed by a few large publishing companies including
Simon & Schuster, the Penguin Group, and McGraw-Hill, attempted to require
Google to “destroy all unauthorized copies made by Google through the Google Library
Project – [now Google Book Search] – of any copyrighted works” (Toobin). As
ridiculous as this request may be, it brings up a very interesting question: how does
one destroy a literary work on the web? In the physical world this could be
accomplished with book burnings, back when books were still rare and there was a
chance of wiping a work out of existence. The idea seems ludicrous today, and will seem
even more so in the near future, when full works will be available on the web and possibly
even downloadable, where permitted by the publisher. But what is most interesting in this
case goes back to Jeanneney’s fear of Google’s monopolization of digitized books. If
Google Book Search ever becomes the main and sole source of digital works and is
somehow forced to destroy a book and complies with the request, would that be the
equivalent of a modern book burning?
Copyright law was created to protect the creative work of authors and publishers,
but never was it meant to limit the public’s access to said work. Google is trying to
build on the latter principle and provide virtually limitless access to works that were
previously unsearchable and thus, in some cases, undiscoverable on the web. If a book
cannot be found by a potential reader, the whole reason for its existence is undermined.
Google is not only trying to create an archive of all published works but, most
importantly, trying to make that archive easily accessible and searchable, so that
results appear in context and are relevant to the particular query a user enters, enabling
him or her to discover new books
and articles which he or she may not have been able to access otherwise.
Works Cited
Anderson, Nate. "University of Michigan librarian defends Google scanning deal." Ars
Technica. 18 Nov. 2008. <http://arstechnica.com/news.ars/post/20071126-
university-of-michigan-librarian-defends-google-scanning-deal.html>.
Derrida, Jacques. Archive Fever: A Freudian Impression. Trans. Eric Prenowitz.
Chicago: University Of Chicago Press, 1998.
Derrida, Jacques. Paper Machine: Cultural Memory In The Present. Trans. Rachel
Bowlby. Stanford: Stanford University Press, 2005.
Eisenstein, Elizabeth. The Printing Revolution In Early Modern Europe. New York:
Cambridge University Press, 1984.
Eun, David. “Our approach to content.” The official Google Blog. 26 Sep. 2006.
<http://googleblog.blogspot.com/2006/09/our-approach-to-content.html>.
"French book publisher sues Google," BBC News 7 June 2006. 9 Dec 2008.
<http://news.bbc.co.uk/1/hi/entertainment/5052912.stm>.
“Google's victory in court against German publisher,” EDRI-gram 5 July 2006. 9 Dec
2008. <http://www.edri.org/edrigram/number4.13/googlegermany>.
Jeanneney, Jean-Noël. Google and the Myth of Universal Knowledge: A View from
Europe. Trans. Teresa Lavender Fagan. Chicago: University of Chicago Press,
2007.
Kissell, Joe. "The Bodleian Library: Oxford's famous book sanctorium." Interesting Thing
of the Day 18 Oct 2004 7 Dec 2008 <http://itotd.com/articles/341/the-bodleian-
library/>.
“LBF Daily: Google boosts backlist sales, say publishers.” All Business: A D&B
Company. 7 Mar 2006. <http://www.allbusiness.com/retail-trade/miscellaneous-
retail-miscellaneous/4647513-1.html>.
Macintyre, Ben. "The biggest library ever built." Times Online 16 Nov 2007 7 Dec 2008
<http://www.timesonline.co.uk/tol/comment/columnists/ben_macintyre/article2879538.ece>.
Mathes, Adam. “The point of Google Print.” The Official Google Blog. 19 Nov. 2008.
<http://googleblog.blogspot.com/2005/10/point-of-google-print.html>.
Mompel, Mariona Vivar. "Google Print Outshines The European Digital Library." Trans.
Luke Croll. CafeBabel.com. 10 Apr. 2006. 18 Nov. 2008.
<http://www.cafebabel.com/eng/article/18276/google-print-outshines-the-
european-digital-librar.html>.
Toobin, Jeffrey. “Google’s Moon Shot.” The New Yorker. 5 Feb. 2007.
<http://www.newyorker.com/reporting/2007/02/05/070205fa_fact_toobin?currentPage=all>.