In a significant milestone for public access to the law, the Library Innovation Lab at Harvard Law School on Monday published more than 40 million pages of U.S. court decisions online.
The publication, which represents nearly 6.5 million state and federal court cases, is the culmination of a five-year project that saw Harvard slice the spines off a vast collection of legal reporter books in order to digitize them.
The final product is a service called the Caselaw Access Project, which lets anyone access general information about all the cases or download large batches of entire cases on a state-by-state basis. The service includes an API, which will let coders build search tools and other products on top of the vast corpus of law.
The Harvard initiative is significant because, although court decisions are in the public domain, many of them are only accessible by paying a commercial service like LexisNexis. These services charge steep fees that are typically beyond what small law offices or non-profit organizations are able to pay.
While countries like Canada and Australia have long offered free online access to the law, the United States has been a laggard, partly because of a haphazard digitization process by the federal and state governments.
The Caselaw Access Project, which digitized every reported case in the country from the 1600s through the summer of this year, thus represents a dramatic improvement in access to law.
According to Adam Ziegler, Director of the Library Innovation Lab, the library is also working with state governments to help them ensure all future decisions are published online in a machine-readable fashion with a neutral citation system.
Ziegler also noted the Caselaw Access Project will be a treasure trove for legal scholars, especially those who employ big data techniques to parse the corpus.
“It’s an opportunity to reconstruct the law as a data source, and write computer programs to peruse millions of cases,” he said.
While the legal corpus is available to anyone, it is presented in a format friendly to programmers rather than ordinary users. Ziegler says the library is leaving the work of developing search tools and consumer-friendly interfaces to others, including startups and non-profit groups.
The library carried out the project with the financial help of Ravel Law, a startup founded by former Stanford University students. The company’s tagline is “Law is America’s Operating System. We’re Giving it an Update.”
Ravel was acquired by LexisNexis in 2017, but the legal giant agreed to uphold Ravel’s agreement with Harvard to make the caselaw and the metadata public. Under the terms of the agreement, the entire legal corpus will be available to scholars and other non-profit users at no charge and, in 2024, it will be available to other commercial services.
While Monday’s news represents a big leap forward for public access to law, other structural barriers remain. These include the federal court system’s antiquated service known as PACER, which charges users 10 cents a page to obtain PDFs of court documents. Despite bipartisan efforts to reform PACER, led by the likes of Rep. Darrell Issa (R-Calif.) and Sen. Elizabeth Warren (D-Mass.), Congress has yet to pass a law to replace the unpopular system.