Old Files, New Formats: The Challenges of Transferring Audio Data
We live in an age of digital cacophony. Some form of recorded audio, be it podcast, Muzak, or car alarm, is constantly streaming from somewhere. Sometimes this immersion in sound is voluntary, via expensive noise-canceling headphones that turn a ride on public transportation into an out-of-body experience. Other times the audio intrusion is unwelcome, like one of Spotify’s ubiquitous-and-unnerving subscription requests.
Like digital information in general, millions of petabytes of audio files exist in a vast digital ether, preserved on huge servers and tiny thumb drives. Though the majority of these databases are unexplored and underutilized, the information that’s out there can potentially save your life, if you can render useful information from the endless expanse of noise.
Case in point: primate researcher Klaus Zuberbühler has recorded hundreds of hours of jungle noise in Africa. After distilling the ambient noise, he was able to identify the distinct calls Diana monkeys emit when predators are nearby. On a later, solitary trip into the bush, Zuberbühler heard and recognized these warning calls and retreated to safety before a bloodthirsty leopard could permanently put an end to his research.
Although most organizations don’t have to contend with leopards, they do face the challenge of capturing and distilling audio recordings for future use in a way that saves valuable time, energy and money. So, how does one transmute older formats into more modern and streamlined sources rich with useful metadata? Label and efficiently file new recordings? Store and maintain these files for future use?
Chris Lacinak, founder and president of archiving consulting firm AudioVisual Preservation Solutions, says any organization can effectively manage its audio resources if it integrates three key components: technology, people, and policies.
According to Lacinak, 99 percent of his clients are unaware of what’s contained within their audio files because of improper cataloguing and miscommunication between departments. “IT is not the solution to archiving,” he says. “The first step is to figure out what you have.”
This is where policy comes into play. It is imperative to create clear and precise policies that dictate the ways in which audio is recorded and stored, and to ingrain those policies in your organization’s institutional knowledge, Lacinak says.
“Actually do planning,” Lacinak says. “If you are doing audio, how many events are you doing a year? What type of recordings? [Will you be] editing the audio? That will all lead to what type of storage capacity you need.”
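The planning questions Lacinak lists translate into simple arithmetic. Below is a rough sketch, not a sizing tool: it assumes uncompressed PCM audio (the kind stored in WAV files), and the defaults of 48 kHz, 24-bit stereo and two stored copies are illustrative assumptions rather than figures from Lacinak. The function names are hypothetical.

```python
def pcm_bytes(seconds, sample_rate=48_000, bit_depth=24, channels=2):
    """Size in bytes of uncompressed PCM audio (file header overhead ignored)."""
    return seconds * sample_rate * (bit_depth // 8) * channels


def yearly_storage_gb(events_per_year, hours_per_event, copies=2, **fmt):
    """Estimated gigabytes (10**9 bytes) needed per year across all copies."""
    seconds = events_per_year * hours_per_event * 3600
    return pcm_bytes(seconds, **fmt) * copies / 1e9


if __name__ == "__main__":
    # Hypothetical organization: 50 events a year, 2 hours each, 2 copies kept
    print(f"{yearly_storage_gb(50, 2):.1f} GB per year")  # prints "207.4 GB per year"
```

Edited versions, compressed access copies, and additional backups would all change the total, so treat the result as a starting point for the conversation rather than a final number.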
Regardless of the size and needs of an organization, audio should be stored in multiple locations that are separated geographically. For larger storage needs, outsourcing storage to massive databases with quality control measures has become more prevalent in the past year. Once an organization cements its strategy for effectively interweaving its technology, people and policies, the next hurdle is quick access to audio data.
When you search a workstation for text files, your query scans both the title and the content of those files. That is not the case with audio files, which often include nothing more than the author of the file and other basic information. Audio still largely lacks what Lacinak calls “time-based metadata,” essentially a searchable account of what is recorded on a given audio file. In controlled environments with high-end audio equipment, it is possible to convert crystal-clear audio into readily accessible text with speech-to-text software.
But, according to Lacinak, doing so at conferences and meetings is hugely difficult with all the ambient noise present. Thankfully, enriching audio files with useful metadata is possible with BWF MetaEdit, an open-source tool developed by AudioVisual Preservation Solutions and the Federal Agencies Digitization Guidelines Initiative.
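To make "embedded metadata" a little more concrete: a Broadcast Wave Format (BWF) file is an ordinary WAV file with an extra "bext" chunk, and the first 256 bytes of that chunk hold a free-text description. The sketch below is not how the tool mentioned above works internally. It is only a minimal, read-only check, using nothing but the Python standard library, of whether a file carries a description at all. The function names are hypothetical, and very large RF64 files are not handled.

```python
import struct


def riff_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from the bytes of a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, data[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length


def bext_description(data: bytes):
    """Return the BWF Description field, or None if there is no bext chunk."""
    for chunk_id, payload in riff_chunks(data):
        if chunk_id == b"bext":
            # Description is the first 256 bytes of the chunk, NUL-padded ASCII
            return payload[:256].split(b"\x00", 1)[0].decode("ascii", "replace")
    return None
```

Given the bytes of a file, for example `bext_description(open("interview.wav", "rb").read())` with a hypothetical file name, a return value of `None` or an empty string suggests the recording has no embedded description yet.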
Identifying and sorting existing audio files in a disorganized database can be a frustrating process. But the findings of the Audio Visual Working Group can be a useful place to start. For general inspiration and useful soundbites, Archive.org and the Western Soundscape Archive are fantastic.
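A first pass at "figuring out what you have" can be as modest as a script that walks a folder and lists the basic properties of each file. The following is a minimal sketch under several assumptions: it only looks at files ending in `.wav`, it relies on Python's standard `wave` module (which reads uncompressed PCM files), and the `inventory` function name and the shape of its output are made up for illustration.

```python
import wave
from pathlib import Path


def inventory(root):
    """Return one dict per .wav file under root, sorted by path."""
    rows = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as w:
                rows.append({
                    "path": str(path),
                    "seconds": w.getnframes() / w.getframerate(),
                    "channels": w.getnchannels(),
                    "sample_rate": w.getframerate(),
                    "bit_depth": w.getsampwidth() * 8,
                })
        except (wave.Error, EOFError) as exc:
            # Unreadable or non-PCM files are listed with the error, not skipped
            rows.append({"path": str(path), "error": str(exc)})
    return rows
```

Files that cannot be read are kept in the list with an error message, since a damaged or oddly encoded recording is exactly the kind of thing an inventory should surface. On case-sensitive file systems the pattern will miss names ending in `.WAV`, so the match may need widening.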
The best model for any organization is the world’s mecca of sound storage in Culpeper, Virginia.
Situated on a 45-acre Cold War-era bunker complex that once stored hard currency for use in a post-apocalyptic world, the National Audio-Visual Conservation Center (NAVCC) now serves a much more useful (and logical) purpose as the world’s largest and most state-of-the-art repository of audiovisual material. With over 90 miles of storage space, it holds more than three million audio recordings ranging from Edison wax cylinders to more modern magnetic tape spools, and processes them into digital formats. NAVCC is part of the Library of Congress and is endowed with a small army of archivists toiling away on specialized machines underwritten by a $150 million donation from David Packard.
Most organizations can’t afford that sort of logistical support. But adopting some basic filing and sorting policies can make archived audio exponentially more accessible and useful.
“People think of the archive as where things go to die,” says Lacinak, “but that’s not the case at all. It’s required for a healthy organization to survive.”
Klaus Zuberbühler certainly provided a literal example of that observation in the jungles of Africa. Do you have any equally harrowing testimonies to the value of proper data storage? Any tips for identifying and organizing audio files in general? Let us know in the comments section.