Scholars Recruit Public for Project
Published: December 27, 2010
Since University College London began transcribing the papers of the Enlightenment philosopher Jeremy Bentham more than 50 years ago, it has published 27 volumes of his writings — less than half of the 70 or so ultimately expected.
Bentham Project
Tony Slade of the Bentham Project in London edits images of manuscripts to post online for volunteers to transcribe.
Articles in this series examine how digital tools are changing scholarship in history, literature and the arts.
Jeremy Bentham’s preserved, clothed corpse, with a waxworks replica of his actual head at bottom, has greeted visitors to the college since 1850.
A handwritten document by Jeremy Bentham in University College London’s collection.
The painstaking job of transcribing often hard-to-decipher handwritten documents from history’s lead players, not to mention a lack of money, has meant that most originals are seen by just a handful of scholars and kept out of the public’s reach altogether. After more than five decades, only slightly more than half of James Madison’s papers have been transcribed and published, while work on Thomas Jefferson’s papers, begun in 1943, probably won’t be finished until around 2025.
Now the scholars behind the Bentham Project think they may have come up with a better way: crowd-sourcing.
Starting this fall, the editors have leveraged, if not the wisdom of the crowd, then at least its fingers, inviting anyone — yes, that means you — to help transcribe some of the 40,000 unpublished manuscripts from University College’s collection that have been scanned and put online. In the roughly four months since this Wikipedia-style experiment began, 350 registered users have produced 435 transcripts.
These transcripts, which are reviewed and corrected by editors, will eventually be used for printed editions of the collected works of Bentham, whose preserved corpse, clothed and seated, has greeted visitors to the college since 1850.
Other initiatives have recruited volunteers online, but the Bentham Project is one of the first to try crowd-sourced transcription and to open up a traditionally rarefied scholarly endeavor to the general public, generating both excitement and questions.
It is an undertaking that Bentham, who died in 1832 at 84 and is best known for his utilitarian philosophy that sought “the greatest happiness for the greatest number,” would no doubt appreciate.
This experiment, part of the way technology is revolutionizing the study of the humanities, has the potential to cut years, even decades, from the transcription process while making available to the public and the general pool of scholars miles of documents that are now off limits, difficult to read or unsearchable.
“It’s fairly astonishing,” Sharon Leon, a historian at George Mason University, said of crowd-sourcing’s potential. Ms. Leon recently received a grant from the National Endowment for the Humanities to design a free digital tool — a plug-in — that any archive or library could use to open transcription to the public.
Ms. Leon and her collaborators are working with 55,000 unpublished documents from the United States’ early War Department that have been collected, copied and reconstructed in the last dozen years to replace those largely destroyed in 1800 when a fire swept through the department’s offices.
These manuscripts, written from 1784 to 1800, are the equivalent of a national archive, since the early War Department handled not only military matters but also diplomacy, internal security, Indian affairs, the country’s only social welfare program — for veterans — and accounted for 7 of every 10 dollars spent by the federal government.
Ms. Leon said she assumed the project, scheduled to begin in January, would primarily attract scholars, students, history enthusiasts and genealogists, some of whom might have already transcribed portions of these documents in the course of their research.
Karen Mason, a librarian at Medgar Evers College in Brooklyn who has transcribed some of Bentham’s papers, described her participation in the project as “a service to the scholarly community.”
“I’m no Bentham scholar, but I am interested in history, so it’s interesting to look at the addenda and deletions in the manuscripts and generally follow the thought processes of a man living in 18th-century England,” she wrote in an e-mail. “I usually take about 15 to 20 minutes of my lunch hour if the weather is bad or if I don’t feel like going out to do it.”
To advocates of crowd-sourcing like Ms. Leon, making the country’s documentary heritage available to the public is one of the most valuable aspects of that approach. Yet it also underscores how the digital humanities have become a new source of tension between experts and amateurs.
Max J. Evans, the former executive director of the National Historical Publications and Records Commission, said he had long campaigned for using digital technology, such as putting scanned originals online, as a way of widening access. This way, at least, the papers of the founding fathers and others, despite being tough to read and unsearchable, would not be “held up in these scholarly editing offices for years and years, and only available to a select group of scholars,” he said. (The Library of Congress has done this with some of the founders’ papers.)
But generally, document editors tend to be very resistant, he added.
“ ‘It’ll be great when it’s done,’ they say, ‘it’s the Sistine Chapel, it’s a magnificent masterpiece,’ ” Mr. Evans said. “I get all that. But in the interim, I don’t know why we can’t use the images we have in an easy-to-use form without all the scholarly apparatus.”
Crowd-sourced transcription would naturally flow from such a set-up, he said.
Another obstacle is practical, said Daniel Stowell, the director and editor of the Papers of Abraham Lincoln in Springfield, Ill. His office experimented with the hiring of nonacademic transcribers, he said, but they produced so many errors and gaps in the papers that “we were spending more time and money correcting them as creating them from scratch.”
When tens of thousands of unpublished and rarely seen documents written by or to Lincoln were digitally scanned for the project for his bicentennial celebration in 2009, the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign, created a prototype for crowd-sourced transcription. It was ultimately abandoned.
Mr. Stowell said he expected the rest of Lincoln’s papers to be transcribed and posted in 10 to 15 years. As to whether newer endeavors like the Bentham Project or the War Department papers will end up saving time, he said, “I’m skeptical.”
Certainly, deciphering Bentham’s sloping antique script can be difficult for the pajama-wearing amateur. Bentham Project participants are given a long list of guidelines instructing them on how to enter codes for deletions, additions, marginal notes, headings and other textual quirks, which makes the work slow going in the beginning. Users can choose a manuscript to work on by subject (“Box 73 and Box 96 contain interesting manuscripts on drunkenness, swearing, adultery and much more ...”), or by how easy a document is to read. Bentham’s handwriting deteriorated in his later years, and he occasionally wrote in between the lines of a previous letter.
Will there be mistakes? Of course.
“We’re not looking for perfect,” Ms. Leon of George Mason said of crowd-sourced transcription. “We’re looking for progressive improvement, which is a completely different goal from someone who is creating a letter-press edition.”
Mr. Evans quoted from a 1791 letter by Jefferson. “Let us save what remains,” Jefferson wrote of national documents not destroyed during the war, “not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident.”
“I think there’s a good deal of ‘fencing’ now,” Mr. Evans said, “not by keeping them under lock but by keeping these documents in the hands of only a small group of documentary editors, and those scholars who can make their way to the archives, until formal publication.”