On the importance of open-source literature

[by: Will Cunningham]
Matthew Jockers, Matthew Sag, and Jason Schultz’s article
“Don’t let copyright block data mining,” which appeared in Nature in 2012, brings to light perhaps one of the biggest
roadblocks in the field of Digital Humanities: copyright law. To put the issue
simply, unless a novel was published before 1923, the copyright has expired, or
the copyright owner agrees (on an individual basis) to make the content available,
then some of the most important tools that DH has to offer are off the table.
Jockers, Sag, and Schultz offer a brief summary of the class
action lawsuit brought by the Authors Guild against Google over its massive collection
of scanned novels as a case study in the complex issues surrounding copyright
and DH. This lawsuit, while intended to protect both the integrity and
economic viability of literary production, has also closed the door on
“non-expressive” uses of literature in the field of DH: data and text mining,
geo-surveys, deep word counts, etc. As Jockers, et al point out, the goal of DH
is not about republishing work or even quoting from texts; rather, DHG scholars
“simply want to extract information from and about them to sift out trends and patterns.”
This obstacle is especially pressing for African American
literary studies (and for the study of the African American novel in particular). While HBW
boasts a near-exhaustive collection of black novels published before the
1923 cut-off, an extensive DH project that omits the vast majority of novels
produced by black authors post-1923 is hardly extensive or even useful, one
might argue. In fact, this is a roadblock we are dealing with now: how do we
build a usable database for non-expressive purposes while excluding much of the
Harlem Renaissance? The Black Power Movement? The rise of Black Feminist
Literature? How does one build a database without Toni Morrison?
These questions are central to the development of both the
field of DH and African American Literature. And as Jockers et al. note, an author’s rights do deserve protection, but digitizing books for non-expressive
uses is a separate issue. The slow, trudging, ponderous weight of copyright law
must at some point catch up to the fast-accelerating pace of academic studies
in general, and the field of Digital Humanities in particular.