By Adam King
Astronomical. That’s what it would cost Ohio State’s University Libraries to scan its collection of out-of-copyright books on its own.
Nothing. That’s what Ohio State is paying to make that happen by partnering with the Google Books Library Project. OSU indicated its interest in joining the project in 2009 as part of the CIC Libraries’ agreement. But other universities and colleges were ahead of it in line for scanning, and Thompson Library was going through its renovation, so Ohio State did not start sending books to Google until this month.
There is no timeline for the process to end. Ohio State is methodically going through the list Google put together of the works it wants to scan from the university’s collections. That list, of course, will be shorter than, say, the University of Michigan’s because of duplicate works in both libraries and Michigan’s status as an early partner.
The number of works already scanned, and the number still to come, is not public; Google asks its library partners not to share that information. But based on Google’s drive to create a virtual card catalog of the world’s books and the years it has put in, it is estimated that millions of books have been digitized.
As a partner, Ohio State receives access to a digital copy of every book Google scans from OSU’s collections, and that alone is worth investing time in the project, said Carol Diedrichs, director of University Libraries.
“If you’re doing deep research, especially in the Humanities, that 1923 date (prior to which anything published is considered out of copyright) seems like a long time ago, but not so much,” Diedrichs said. “Our faculty use our collection heavily. By being able to view that content online in your office or at home, you might get everything you need or you might determine that you want to get the physical book; it saves time being able to see it beforehand.”
Another advantage is that the book’s text is searchable.
“Right now you go to look for a book in our physical collection and you might be able to get the table of contents and a bit of a description, but you can’t get at what’s inside. So faculty spend time taking things off the shelf and making sure it’s useful or not. When you digitally scan all the words in a book, you can begin to find things much more clearly and might say, ‘Oh I had no idea there was that particular chapter in that book because the title would have never led me to believe that.’”
All the digitized works are being kept within the HathiTrust, a collective of 60 international libraries that ensures such works will be preserved for all time. Ohio State is a founding member through the Committee on Institutional Cooperation (which includes all the Big Ten schools and the University of Chicago); the University of California system and the University of Virginia are also founding members. Diedrichs currently serves on the governing board. Anyone in the OSU community can access the works at hathitrust.org, which has its infrastructure at the universities of Michigan, Indiana and Illinois.
Diedrichs said when Google first began its library project, it scanned everything, even books that retained copyright protection. The publishers sued Google, and as a result of the settlement Google has refocused on digitizing out-of-copyright material. But all of the books that Google scanned will remain in the HathiTrust as archived materials that can’t be accessed until their copyright expires, with one exception: the HathiTrust is developing software that will give people with print disabilities unrestricted access to digitized copyrighted materials. Converting books for that effort is a key component of the trust.
That exception aside, 3.3 million of the 10.6 million digitized volumes currently housed at HathiTrust are in the public domain.
“A book written in 2010 is absolutely in copyright, but there is a gap in the middle from 1923 to now where things could be out of copyright, and there is no easy way to determine that,” Diedrichs said. “The HathiTrust looks at each book systematically and it’s freeing material every day.”
One discussion within the HathiTrust is how digitizing collections might help libraries pare down the original materials they need on hand. If it’s decided that only a certain number of copies of a book need to be kept, then select members of the trust will agree to be the keepers of those copies while other institutions could remove theirs. Diedrichs, though, said Ohio State is likely to play the keeper role as one of the flagship university libraries in the world.
In a similar effort, OSU is a partner in a CIC initiative to retain a single copy of certain print journals that are accessible and preserved in Ohio in electronic form. That shared print copy will be housed at Indiana University’s new repository.
“If a faculty member did need the print volume, we can have it retrieved and brought over and then it would go back to IU,” Diedrichs said.
The Ohio State books going to Google now are the ones that are easy to transport, since the scanning takes place outside the state. Whether to send books that need more delicate handling, or ones housed in the rare and special collections, will be decided case by case after discussions with those collections’ curators. Books that are not sent would be scanned in-house by Ohio State.
None of this means print is going away anytime soon. University Libraries buys upward of 80,000 books a year, and very few libraries are thinking of going bookless.
“In today’s world, the digital isn’t necessarily cheaper than print,” Diedrichs said. “We’re buying lots of electronic books but they also require staff to process and make available. We have people who do nothing but troubleshoot why someone couldn’t get into a digital resource. But you can be anywhere in the world and access the digital library, and that makes it more efficient.”
What books are considered out of copyright?
Originally, the United States allowed a copyright to last 14 years, after which it could be renewed for another 14 years. But legislation over the years has extended such protections even further.
Generally, anything published prior to 1923 is considered in the public domain, but it’s more nuanced than that. Some works on either side of the date were put directly into the public domain without a copyright. Other works published before 1989 without proper copyright notice or before 1964 whose copyrights were not renewed might be in the public domain, although works published in other countries under those guidelines might retain their copyright.
In 1998, the US enacted a 20-year copyright extension, which means works published during 1923 or later with their copyrights still intact won’t be in the public domain until 2018 or later.
For works created in 1978 or later, the US upholds copyright protection for the life of the author plus 70 years.
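For readers who think in code, the rules of thumb above can be sketched as a small decision function. This is only an illustration of the guidelines as summarized here, not legal advice and not how the HathiTrust actually reviews books. The `Work` record, its field names and the status labels are all hypothetical, and the 1923 cutoff is the one cited in this article.

```python
from dataclasses import dataclass

# Cutoff year cited in the article; works published earlier are
# generally treated as public domain.
PUBLIC_DOMAIN_CUTOFF = 1923


@dataclass
class Work:
    """Hypothetical record holding only the facts the rules of thumb need."""
    year_published: int
    published_in_us: bool = True
    had_copyright_notice: bool = True
    copyright_renewed: bool = True


def copyright_status(work: Work) -> str:
    """Return a rough status label based on the article's guidelines."""
    if work.year_published < PUBLIC_DOMAIN_CUTOFF:
        return "public domain"

    # Published before 1989 without a proper notice, or before 1964
    # without a renewal: the copyright may have lapsed.
    lapsed = (
        (work.year_published < 1989 and not work.had_copyright_notice)
        or (work.year_published < 1964 and not work.copyright_renewed)
    )
    if lapsed:
        # Works published in other countries might retain their copyright.
        if work.published_in_us:
            return "possibly public domain"
        return "needs review"

    return "presumed in copyright"


if __name__ == "__main__":
    print(copyright_status(Work(year_published=1910)))
    print(copyright_status(Work(year_published=1950, copyright_renewed=False)))
    print(copyright_status(Work(year_published=2010)))
```

The function deliberately answers "possibly" and "needs review" in the gray areas, since the article notes there is no easy way to determine the status of many books published from 1923 onward.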
For more detailed information about what constitutes public domain works, visit onlinebooks.library.upenn.edu/okbooks.html. The Copyright Resources Center at OSU Libraries and the Health Sciences Copyright Management Office support faculty, staff and students with education and guidance on applying copyright law in education, research and patient care; see library.osu.edu/projects-initiatives/copyright-resources-center.
