Digitizing Initiatives



Methods and Costs

From 2005:

As of 2005 there were three mass-production options for scanning books, according to Dustin Goot in a Wired article titled "3 Ways to Scan a Library." The first method is to remove the spines and use "machines [that] cost $25,000 and churn through 90 black-and-white pages per minute, front and back." Second, libraries can have "workers in India, China, and the Philippines earn about 40 cents an hour to manually turn pages that are zapped by $15,000 overhead scanners... Carnegie Mellon's Million Book Project [see above] alone employs more than 100 Indians for this activity." Third, libraries or publishers can use automated systems such as the one from Kirtas Technologies, which scans 1,200 pages per hour from bound books.
When the end purpose of digitization is the publishing of converted material onto the Internet, art books and journal articles present a special challenge for conversion of the analog material to digital files. Since images of art objects are frequently embedded in pages containing text, .pdf digital output for Internet publication is usually not feasible due to the time and expense needed for copyright clearance of the images of art objects.
In addition to the mass scanning methods, organizations can manually scan text in bound books page by page. Resource Library recently estimated that manually scanning, deleting artwork images, proofreading and converting to HTML a 400-word page of text in a bound book takes an average of 6 minutes per page while maintaining 99.995% accuracy. The combined direct labor cost is estimated at $2.50 per page, or $25 per hour for 10 pages. A 5,000-word essay (12.5 pages) would therefore cost about $31 in direct labor. Capital equipment and overhead costs need to be added to direct labor costs to arrive at total cost. (See the Content presentation guidelines from Resource Library for further information on its text presentation conventions.)
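The direct labor arithmetic above can be checked with a short sketch. The constants are the Resource Library figures quoted in the paragraph; the function name is hypothetical, and capital equipment and overhead are deliberately left out, as in the estimate itself.

```python
# Figures taken from the Resource Library estimate quoted above.
MINUTES_PER_PAGE = 6
WORDS_PER_PAGE = 400
LABOR_COST_PER_PAGE = 2.50  # dollars of direct labor per page


def direct_labor_cost(word_count):
    """Return (pages, hours, dollars) of direct labor for a text of word_count words.

    Hypothetical helper: overhead and capital equipment are not included.
    """
    pages = word_count / WORDS_PER_PAGE
    hours = pages * MINUTES_PER_PAGE / 60
    dollars = pages * LABOR_COST_PER_PAGE
    return pages, hours, dollars


pages, hours, dollars = direct_labor_cost(5000)
print(f"{pages:.1f} pages, {hours:.2f} hours, ${dollars:.2f}")
# prints: 12.5 pages, 1.25 hours, $31.25
```

The unrounded result for a 5,000 word essay is $31.25, in line with the rounded figure given above.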
In 2006, TFAO conducted research to assess the feasibility of outsourcing its text conversion process to service bureaus. TFAO would provide final processing into .htm files for online publication. Assumptions for quotes:
For a discussion of the costs of reading "open access publishing" vs. subscription-based articles see "The Cost per Article Reading of Open Access Articles" by Jonas Holmström, Research Assistant, Swedish School of Economics and Business Administration.
For a comparison of the costs involved in operating a paper vs. virtual library see "Comparing Library Resource Allocations for the Paper and the Digital Library" by Lynn Silipigni Connaway, Research Scientist, Office of Research, OCLC Online Computer Library Center, Inc. and Stephen R. Lawrence, Associate Professor of Operations Management, Leeds School of Business, University of Colorado. Also see "The Return on Investment of Electronic Journals - It Is a Matter of Time" by Jonas Holmström, Swedish School of Economics and Business Administration, Helsinki, Finland.
At an image resolution of 300 to 500 dpi, Kirtas estimated in 2005 that its automated method costs "as low as $.03" per page ($36 per hour at 1,200 pages per hour), while manual scanning, at a rate of 100 to 150 pages per hour, costs "$.35 to $1.50" per page. (This cost quote is probably not applicable to Resource Library's text conversion and text presentation conventions requiring 99.995% accuracy.)
A November 9, 2005 Wall Street Journal article by David Kesmodel and Vauhini Vara discussed costs connected with the book digitizing program of Internet Archive, a San Francisco nonprofit group that is spearheading the Open Content Alliance, a consortium of business and educational groups. Employees manually scan out-of-copyright books in five-hour shifts, four times a week. Pay is just over $10 per hour. The article says that the Archive has digitized around 2,800 books at a cost of about $108,000, which is roughly $38.57 per book. It costs "about 10 cents a page to get a book online, taking into account equipment, labor and the cost of hosting the pages on the Internet Archive's Web servers." Each special scanning machine costs $20,000 to $40,000. It takes around one hour to scan 500 pages, or about 8 1/3 pages per minute. (This cost quote is probably not applicable to Resource Library's text conversion and text presentation conventions requiring 99.995% accuracy.)
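The Internet Archive figures can be cross-checked with a few divisions. In this minimal sketch the inputs are the numbers reported in the article; the implied average book length is a derived value that does not appear in the article and is shown only as a consistency check.

```python
# Figures reported in the Wall Street Journal article cited above.
BOOKS_DIGITIZED = 2800
TOTAL_COST = 108_000        # dollars
PAGES_PER_HOUR = 500
COST_PER_PAGE = 0.10        # dollars, "about 10 cents a page"

cost_per_book = TOTAL_COST / BOOKS_DIGITIZED
pages_per_minute = PAGES_PER_HOUR / 60

# Derived, not stated in the article: the average book length that would
# reconcile the per-book and per-page figures.
implied_pages_per_book = cost_per_book / COST_PER_PAGE

print(f"cost per book: ${cost_per_book:.2f}")
print(f"pages per minute: {pages_per_minute:.2f}")
print(f"implied pages per book: {implied_pages_per_book:.0f}")
```

The per-book figure works out to a little under $39, and the per-page and per-book figures agree if the average book runs to somewhat under 400 pages.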
A December 12, 2005 article in the Wall Street Journal by Jeffrey A. Trachtenberg and Kevin J. Delaney said that a major publisher was recently told that "it costs as much as 10 cents per page to scan, digitize and tag a book, which means a 300-page novel would cost $30." (This cost quote is probably not applicable to Resource Library's text conversion and text presentation conventions requiring 99.995% accuracy.)
A December 14, 2004 announcement by Google that the firm will collaborate with institutional libraries to digitize large quantities of books spawned numerous articles in the media. Digitizing expenses were quoted from $10 to $20 per book. For instance, a December 14, 2004 Reuters article by Lisa Baertlein titled "Google Bets Big on Bringing Libraries to Web" said "Librarians and non profits already involved in scanning books for other projects say it costs around $20 to do a 300-page book, but that the cost should soon fall to around $10 per book." At $20 that is 7 cents per page and at $10 it's 3 cents. (This cost quote is probably not applicable to Resource Library's text conversion and text presentation conventions requiring 99.995% accuracy.)
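The per-book and per-page quotes in the last two paragraphs convert into each other with simple arithmetic. A minimal sketch follows, assuming the 300-page book used in both articles; the function names are hypothetical.

```python
def cents_per_page(dollars_per_book, pages_per_book=300):
    """Convert a per-book price in dollars to a per-page price in cents."""
    return dollars_per_book * 100 / pages_per_book


def dollars_per_book(cents_per_page_rate, pages_per_book=300):
    """Convert a per-page price in cents to a per-book price in dollars."""
    return cents_per_page_rate * pages_per_book / 100


for price in (20, 10):
    print(f"${price} per book = {cents_per_page(price):.1f} cents per page")

print(f"10 cents per page = ${dollars_per_book(10):.2f} per book")
```

The $20 and $10 per-book quotes come to about 6.7 and 3.3 cents per page, which the text rounds to 7 and 3 cents.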

From 2006:

During 2006 TFAO received quotes from firms to provide text conversion services.
One firm's subcontractor offered 99.995% accuracy, with pricing for .doc output files as follows:
-- Bound bitone scanning up to 8.5" x 11" = $0.72/each
-- Bound bitone scanning up to 11" x 17" = $1.02/each
-- OCR bitone images = $0.18/each
-- Proofing and formatting = $1.17 per 1,000 characters (later reduced to 80 cents in a 2007 requote)
-- CD-R masters = $10.00/each (optional)
-- Shipping = at cost
Assuming a 10,000 word essay with 5.3 characters per word, there would be 53,000 characters in the document. At the 2007 rate of 80 cents per 1,000 characters, the proofreading cost = $42.40. If there are 600 words per page, the essay runs to about 17 pages and the scanning = $12. Adding a CD-R master brings the total cost to $64.40.
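The worked total above can be reproduced from the price list. This is a sketch under the stated assumptions (5.3 characters per word, 600 words per page, the 2007 proofing rate); the function and its flags are hypothetical. Note that the total in the text does not include the OCR line item, so the sketch leaves it off by default.

```python
# Rates from the subcontractor's price list above.
SCAN_PER_PAGE = 0.72          # bound bitone scanning, up to 8.5" x 11"
OCR_PER_PAGE = 0.18           # listed, but not part of the total in the text
PROOF_PER_1000_CHARS = 0.80   # 2007 requote
CD_MASTER = 10.00             # optional

# Assumptions stated in the text.
CHARS_PER_WORD = 5.3
WORDS_PER_PAGE = 600


def quote_total(words, include_cd=True, include_ocr=False):
    """Estimate the conversion cost in dollars for a document of `words` words."""
    pages = words / WORDS_PER_PAGE
    chars = words * CHARS_PER_WORD
    total = pages * SCAN_PER_PAGE + chars / 1000 * PROOF_PER_1000_CHARS
    if include_ocr:
        total += pages * OCR_PER_PAGE
    if include_cd:
        total += CD_MASTER
    return total


print(f"${quote_total(10_000):.2f}")                    # prints: $64.40
print(f"${quote_total(10_000, include_ocr=True):.2f}")  # prints: $67.40
```

Including the OCR charge would add about $3 for a document of this length.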
For proofreading and formatting service only, firms quoted $42, $156 and $200 for an equivalent document.
For proofreading there are a number of specialty service bureaus. For example, Canyouproofthis.com charges a minimum of $50 as of November 2006 and provides an online rate calculator. Wordsru.com provided an "instant estimate" of $78 for a 5,000 word document.


From 2007:

During 2007 TFAO received a quote for scanning, formatting, proofreading and emailing of a resultant .doc file at 80 cents per 1,000 characters. The source has a $100 minimum, so for maximum efficiency TFAO would send the contractor 125,000 characters, equivalent to roughly 23,500 words of text at 5.3 characters per word.
A sample AAR article converted in 2007 has 1,840 words in five pages, or 368 words per page. At that rate, making full use of the $100 minimum with AAR articles would take 23,500 words divided by 368 words per page, or about 64 pages.
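The batch size needed to make full use of the $100 minimum follows from the rate. In this sketch the 5.3 characters per word figure is carried over from the 2006 example as an assumption, and the variable names are illustrative.

```python
MINIMUM_ORDER = 100.00        # dollars
RATE_PER_1000_CHARS = 0.80    # dollars
CHARS_PER_WORD = 5.3          # assumption carried over from the 2006 example

# Sample AAR article: 1,840 words in five pages.
WORDS_PER_PAGE = 1840 / 5

chars = MINIMUM_ORDER * 1000 / RATE_PER_1000_CHARS
words = chars / CHARS_PER_WORD
pages = words / WORDS_PER_PAGE

print(f"{chars:,.0f} characters, {words:,.0f} words, {pages:.0f} pages")
```

Without intermediate rounding the word count comes out slightly above 23,500, and the page count still rounds to 64.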
These quotes are based on adherence to TFAO's text presentation conventions.
rev. 6/4/07



Individual pages in this study will be amended as TFAO adds content, corrects errors and reorganizes sections for improved readability. Refreshing or reloading pages enables readers to view the latest updates. Links to sources of information outside of our web site are provided only as referrals for your further consideration. Please use due diligence in judging the quality of information contained in these and all other Web sites and in employing referenced consultants or vendors. Information from linked sources may be inaccurate or out of date. Traditional Fine Arts Organization, Inc. neither recommends nor endorses these referenced organizations. Although Traditional Fine Arts Organization, Inc. includes links to other web sites, it takes no responsibility for the content or information contained on those other sites, nor does it exert any editorial or other control over those other sites. For more information on evaluating web pages see Traditional Fine Arts Organization, Inc.'s General Resources section in Online Resources for Collectors and Students of Art History.


Copyright 2012 Traditional Fine Arts Organization, Inc., an Arizona nonprofit corporation. All rights reserved.