Everything dies, including information | MIT Technology Review

A little, according to the experts. For one thing, what we think is permanent is not. Digital storage systems can become unreadable in as little as three to five years. Librarians and archivists are rushing to copy things into newer formats. But entropy is still there, waiting in the wings. “Our professions and people often try to extend normal lifespans as much as possible through a variety of techniques, but that always bucks the tide,” says Joseph Janes, associate professor at the University of Washington Information School.

To complicate matters, archivists are now grappling with an unprecedented deluge of information. In the past, materials were scarce and storage space limited. “Now we have the reverse problem,” says Janes. “Everything is permanently recorded.”

In principle, this could right a historic wrong. For centuries, countless people did not have the right culture, gender, or socioeconomic class for their knowledge or work to be discovered, valued, or preserved. But the massive scale of the digital world now presents a unique challenge. According to an estimate last year from market research firm IDC, the amount of data that businesses, governments, and individuals will create over the next few years will be double all the digital data generated since the inception of the computer age.

Entire schools within some universities are scrambling to find better approaches to safeguarding the data under their aegis. The Data and Service Center for the Humanities at the University of Basel, for example, has developed a software platform called Knora not only to archive the many types of data from humanities work, but to ensure that people in the future will be able to read and use them. And yet, the process is cumbersome.

“We can’t save everything…but that’s no reason not to do what we can.”

Andrea Ogier

“You make educated guesses and hope for the best, but there are datasets that get lost because no one knew they would be useful,” says Andrea Ogier, associate dean and director of data services at Virginia Tech's University Libraries.

There are never enough people or money to do all the necessary work, and the formats change and multiply all the time. “How do we best allocate resources to preserve things? Because budgets are only so large,” says Janes. “In some cases, this means that items are saved or stored, but sit there, uncataloged and unprocessed, and therefore almost impossible to find or access.” In some cases, archivists end up refusing new collections.

The formats used to store data are themselves impermanent. NASA stored about 170 tapes of lunar dust data, collected during the Apollo era. When researchers first set out to use the tapes in the mid-2000s, they couldn’t find anyone with the 1960s IBM 729 Mark 5 machine needed to read them. With some help, the team eventually found one in poor condition in the Australian Computer Museum’s warehouse. Volunteers helped restore the machine.

Software also has a lifespan. Ogier remembers trying to examine an old Quattro Pro spreadsheet file only to find there was no readily available software capable of reading it.
