Behind the Scenes: Digital Archiving
By Stan Prager
“My dear Mother:
Six months today my name was enrolled upon U.S. paper. Never shall I forget my feelings that night as I lay upon my comfortable bed thinking that tomorrow I should bid adieu to my dearest friends on earth to go forth to fight the battles of my country. This day has not passed unmindful of my birthday and I feel grateful to God for having spared my life thus far. It is a painful task for me to inform you of the death of young Ruggles at Ship Island. He died in the afternoon of April 16th. Yet the steamer was out of hailing distance which bore his comrades in Arms to this City, Alfred was much respected and universally beloved by every one of his company and his death is deeply felt by all of us.”
So wrote twenty-six-year-old Frank S. Knight of Hardwick, Massachusetts, in a poignant letter to his mother on May 9th, 1862, from New Orleans, where he was stationed as a Private in Company D of the 31st Massachusetts Volunteer Infantry, as part of the Union Army occupation force led by General Benjamin Butler in the early years of the American Civil War.
I was honored to be one of the first to read this letter again more than a hundred and fifty years after it was written – and more than a century after its typed version was sealed in a storage box that was later discovered in the archives at the Wood History Museum in Springfield, Massachusetts – as part of the team resurrecting the long-forgotten narrative of the Massachusetts 31st. It is difficult to properly articulate the emotional reaction I experienced reading this letter, as well as the dozens of other diaries, narratives, memoirs, correspondence and miscellany that sat essentially overlooked until this digital archiving project uncovered them, translated them into searchable text and uploaded them to the web for public access.
Young men who are often little more than boys go off to war, and their experiences are typically indelibly etched into their memories, for better and for worse; the adventure of a lifetime that can never be left behind, but ever after carried in little bits and pieces that are now inextricably bound up in the character and identity of the survivor. If it was a conflict of some real significance – the Civil War and World War II in American history, for instance – it is cut into even deeper grooves because it resonates as part of a shared national consciousness that by nature magnifies the role that one soldier may have played in that great drama. Civil War veterans were especially conscious of this, for even when it was part of the recent past it was evident that the result of its four terribly bloody years was to not only save the nation but to define it, to not only end slavery but to abolish the notion of it, to turn the plural united states into the singular United States, to vehemently underscore the meaning of what had then been our de facto national motto, E Pluribus Unum.
While the survivors in the southern states of the former Confederacy often wallowed in bitterness and self-pity and consoled themselves with an alternate fictional history in the “myth of the Lost Cause,” Union veterans ever celebrated what they had done to “save the union.” Veteran organizations sprang up all over the north, G.A.R. – Grand Army of the Republic – groups that reveled in the shared experiences of those who contributed to the victory, and strove to keep alive the spirit of camaraderie that acted as a kind of glue when they were together in the field. Like the Athenians who turned back the Persians at Marathon and the G.I.s who cut bloody footprints in the sands at Normandy and Iwo Jima, men who stood together in a noble calling at Shiloh, Antietam, Gettysburg – and places few have heard of like the “Battle of Sabine Crossroads” in Louisiana in 1864 – were determined that what they once did for their country would never be forgotten.
As these men grew older, and their ranks grew thinner, many thought it imperative that what they once did together should be preserved for their children and their children’s children. They wrote memoirs. They even wrote regimental histories that recounted the deeds that they did not want to be forgotten by the generations that followed. One man, Lewis Frederick Rice, historian of the 31st Massachusetts Volunteer Infantry, made it his mission to gather together correspondence, diaries and memoirs of his fellow veterans in the late nineteenth and early twentieth centuries, with the intention of publishing a regimental history. Some of these men did not survive the war: Frank Knight tragically died in New Orleans in 1863. Others, such as Adelbert Bailey of Conway, just nineteen when he mustered in to Company C, lived to a ripe old age, still active in regimental reunions as late as 1931. Rice reached out to survivors and their families and amassed a collection that ran to more than sixteen hundred pages; Rice himself contributed a vast amount of his own material. Better than ninety percent of what he sourced was then typed up on the old-style manual typewriters of the day. Sadly, in 1909 Rice passed away. No one took up the mantle of his mission, and the manuscripts were buried in the archives, first at the Connecticut Valley Historical Society, later at the Wood History Museum, seldom seen and certainly never widely shared. Until now.
With a grant obtained through the Massachusetts Sesquicentennial Commission, Archivist Cliff McCarthy spearheaded this effort to revivify Rice’s mission. Cliff not only headed up the project but designed and produced the website, wearing several hats simultaneously with great aplomb. Noted historian Larry Lowenthal was brought in to write the bigger-picture historical analysis of the times and events of the 31st.
Seeking to fulfill my final program requirement for my Master’s Degree in Public History from American Public University — while running my own local computer repair business and darting around the country in my spare time visiting Civil War battlefields to celebrate the sesquicentennial — I serendipitously walked into the opportunity of a lifetime as this project was just poised to get off the ground. During my course of studies, digital archiving rapidly became my key area of interest. As a computer guy with a passion for history, digitization stood out as the perfect marriage of technology and history. But when I first sat down with Cliff McCarthy and Margaret Humberston, Head of Library and Archives, to inquire about a potential internship, I never could have predicted that something so perfectly anchored to my focus and interests could have presented itself, or that I might be selected to play a critical role!
My part of the project has been primarily centered on the technical application of digital archiving. As such, I have scanned the original materials to create digital Adobe PDF files, then used Optical Character Recognition (OCR) technology to “translate” these into text documents in Microsoft Word that Cliff can then process and upload to the website, with an end result of searchable text. In the early days of digitization, countless documents and books were scanned and uploaded to the web. But these were image files – basically “pictures” of pages. They were readable, but they could be neither indexed nor searched for data. In our project, all of the text on the webpages is searchable, so if you were – for example – looking for information relating to New Orleans, Benjamin Butler, Sabine Crossroads, Frank Knight, etc., you could search and locate your data in an instant. That is the beauty of the technology!
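To make the contrast between a page image and searchable text concrete, here is a minimal sketch in Python. It is not the project's actual website code, only an illustration of the general idea under stated assumptions: once a page exists as text, every word can be indexed and found again. The page identifiers, the second sample sentence, and the function names `build_index` and `search` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample pages. The first sentence comes from Frank Knight's
# letter quoted above; the second is a made-up line for illustration only.
pages = {
    "knight_letter": "It is a painful task for me to inform you of the "
                     "death of young Ruggles at Ship Island.",
    "made_up_note": "The regiment was stationed in New Orleans under General Butler.",
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(pages):
    """Map every word to the set of page ids on which it appears."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(pages)
print(search(index, "New Orleans"))   # pages mentioning both words
print(search(index, "Ship Island"))
```

A scanned image offers nothing for `tokenize` to work on, which is why image-only pages could be read but never searched.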
After some research, I selected an Epson GT-1500 document scanner to fulfill my hardware requirements. It came equipped with a sheet feeder for scanning a stack of pages in one pass, but the age and fragility of the documents at hand precluded that, so the single-page scanner bed method had to be employed, a more laborious but safer process for archival materials. For OCR purposes, I selected the PRO version of ABBYY FineReader software, which translates PDFs and digitally scanned documents into text. On the one hand, the accuracy is quite impressive; on the other hand, it is far from perfect. Old typewriters with faint impressions and worn keys often produce images that require much work to finesse. The bottom line is that even with the finest equipment and the best software, OCR remains highly labor intensive, as the initial “translation” must be proofread line-by-line against the original before the final Word document can be generated, which Cliff then further polishes for upload to the web. It is a lot of work! But the flip side is that the OCR guy is compelled to carefully read every letter, diary, memoir, etc. to ensure accuracy. If you are passionately interested in the narratives, as I am, that is not a bad thing!
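For readers curious what line-by-line proofreading catches, the following Python sketch compares a raw OCR line with its corrected version and lists the words that changed. It is an illustration only, not part of the workflow described above, which relied on reading each page against the original by eye. The misread words in the sample (“rnonths”, “rny”) are invented examples of the kind of error faint type can cause, and the function name `word_differences` is hypothetical.

```python
import difflib

def word_differences(ocr_line, corrected_line):
    """Return (ocr_text, corrected_text) pairs for every stretch of words
    that differs between the OCR output and the proofread line."""
    ocr_words = ocr_line.split()
    good_words = corrected_line.split()
    matcher = difflib.SequenceMatcher(a=ocr_words, b=good_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(ocr_words[i1:i2]),
                            " ".join(good_words[j1:j2])))
    return changes

# Invented OCR misreadings of the opening line of Frank Knight's letter.
ocr_line = "Six rnonths today rny name was enrolled upon U.S. paper."
corrected = "Six months today my name was enrolled upon U.S. paper."

for wrong, right in word_differences(ocr_line, corrected):
    print(f"{wrong!r} -> {right!r}")
```

A tally like this can show which characters a particular typewriter tends to confuse, though it does not replace a human reading each page.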
Utilizing the cloud, I began to work more and more remotely from home with scans created on site, so I could be – and often was! – doing detailed OCR work at 3 AM in my home office, long after the archive was closed. A few weeks into this, I was even more fortunate to become the rare intern to get his own intern, as Jose Hernandez – a very bright Springfield high school junior with a passion for history – joined the project. Going forward, Jose would volunteer his time scanning hundreds of pages and uploading these to a cloud-based application so I would always have ready material to OCR from home. Jose deserves much credit for the contribution he made to the final product!
Many of the soldiers’ letters and diaries are hardly enthralling. Many of the shared narratives center upon the dullness of drill, fantasies of edible food, and dreams of going home in one piece. In the labor-intensive world of OCR, this is often not heady stuff and can provoke a series of yawns that do not make for compelling “behind the scenes” tales. And yet … and yet even the most monotonous account of camp life cannot help but offer the reader some perspective. These guys are long dead but they once walked as we do; they once ate and slept and imagined what their futures might have held in store for them if they were fortunate enough to survive this dangerous phase of their lives. I may wince a little when I read even the most mundane of Frank Knight’s letters to his mother because I know with the benefit of hindsight that he would not live to see her again. Others I might silently chastise for their whining tones because I know they did go home and that in the greater scheme their complaints are insubstantial. But they could not have known that. Like millions of other twenty-somethings caught up in wars not of their own making on several continents over thousands of years, they were just trying to hold on and do their best so as not to let others down, and so as not to fall short of their own expectations of themselves.
All of them have been dead for many years. Yet as I read their words and run them through OCR, all of them are, if only for brief moments, brought back to life for me, and I cannot help but vicariously feel what they felt, agonize with their agony, hope with their hopes. At the very least, I feel as if through this project we have collectively repaid a debt to them, and to Lewis Frederick Rice, who took on the mission and through no fault of his own could not complete it. We carry on Rice’s work, and as such we give voice to those men of the 31st who lost their own voices so very long ago.
East Longmeadow MA
July 11, 2014