You Can’t Trust the Internet to Continue Existing

August 24, 2016

9:45 am

“Our Incredible Journey” puts the ack! in acquihire. So says the tagline to this underground Tumblr blog devoted to documenting the various purchase announcements of startups in the tech industry. And the various statements about “sunsetting” a product, why this means “great things,” and decisions to end a company statement with the shout of “Onward!” are certainly entertaining. The blog will, no doubt, soon host Instapaper’s announcement that Pinterest bought it yesterday.

However, the Tumblr’s about page reveals a deeper mission:

“In part this site is simply a snarky and angry response to companies and people who might profit from an acquisition while showing little regard for the efforts of the thousands of people who spent time on their service.

But this site also asks broader questions: Is this the best way to structure and grow businesses? Is this the best long-term model for keeping people interested in making and doing amazing things on the internet? Why does almost no website or online service (my own included) have a plan for what happens to their users’ content over the long term?”

Users’ content and purchasing decisions are even more fragile in the internet age. Nest can discontinue its smart home hub device, rendering it useless. (This is still a problem even if Nest later discontinued the CEO who made the decision.) Websites can shut down with no warning, dumping any information that didn’t happen to be documented by the Internet Archive. But is the problem that bad?

The Problem Is Bad

To me, and many tech-savvy people, the internet can seem like a safe, reliable way to back up anything. Unlike the real world, files will never age on the internet. But that doesn’t mean they’ll be there forever, or even three years later. Researchers Hany M. SalahEldeen and Michael L. Nelson, of Old Dominion University, Virginia, conducted a 2012 study into the amount of internet resources lost to time, and how quickly they were lost.

They collected data on six events between June 2009 and March 2012: “the H1N1 virus outbreak, Michael Jackson’s death, the Iranian elections and protests, Barack Obama’s Nobel Peace Prize, the Egyptian revolution, and the Syrian uprising.” The results showed that a large chunk of the internet (containing facts about meaningful historical events) vanished soon after it was posted:

“[W]e found about 11% lost and 20% archived after just a year and an average of 27% lost and 41% archived after two and a half years. Furthermore, we found a nearly linear relationship between time of sharing of the resource and the percentage lost, with a slightly less linear relationship between time of sharing and archiving coverage of the resource. From this model we conclude that after the first year of publishing, nearly 11% of shared resources will be lost and after that we will continue to lose 0.02% per day.”

Imagine if your public library had an annual purge in which they took one tenth of all their books and threw them in a bonfire.
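
To put numbers on that, here is a minimal sketch of the linear model from the quote above, in Python. It takes the two quoted figures at face value: roughly 11% of shared resources lost after the first year, then another 0.02% per day. The function name, the 365-day year, and the straight-line ramp during the first year are assumptions made for illustration; the study's authors did not publish this code.

```python
# Rough sketch of the linear loss model quoted from the 2012 study.
FIRST_YEAR_LOSS_PCT = 11.0  # quoted: "nearly 11%" lost after the first year
DAILY_LOSS_PCT = 0.02       # quoted: "0.02% per day" after the first year
DAYS_PER_YEAR = 365         # assumption: leap years are ignored


def estimated_loss_pct(days_since_shared: int) -> float:
    """Estimate the percentage of shared resources lost after a number of days."""
    if days_since_shared < 0:
        raise ValueError("days_since_shared must be non-negative")
    if days_since_shared <= DAYS_PER_YEAR:
        # Assumption: losses ramp up in a straight line during the first year.
        return FIRST_YEAR_LOSS_PCT * days_since_shared / DAYS_PER_YEAR
    extra_days = days_since_shared - DAYS_PER_YEAR
    # A percentage cannot exceed 100, so cap the estimate there.
    return min(100.0, FIRST_YEAR_LOSS_PCT + DAILY_LOSS_PCT * extra_days)


if __name__ == "__main__":
    for years in (1, 2, 3):
        loss = estimated_loss_pct(years * DAYS_PER_YEAR)
        print(f"After {years} year(s): about {loss:.1f}% lost")
```

Under these assumptions, the estimate comes to about 18.3% lost after two years and about 25.6% after three. In library terms, the bonfire takes a tenth of the shelves in year one and roughly 7% more every year after that.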

At Least We Have Some Sites to Deal With It

The ultimate problem is inescapable: The internet will never last as long as we want it to. One 2015 Atlantic article addressed this problem, centering on the story of a Pulitzer-finalist, 34-part series of investigative journalism that vanished from the internet after its paper folded. The article found no solution either. It quoted Jason Scott, archivist and historian for the Internet Archive, who has had plenty of time to reflect on the situation:

“News organizations kill old articles, YouTube’s old videos go away. And while the Archive and other entities are saving—quote-unquote saving—these sites, even those will go to new URLs. They won’t be in the same place. You’ll have to search for them… There are success stories. But meanwhile, silently, thousands of useful things are disappearing. As time goes on, I have even less and less hope for how long it will last.”

The Internet Archive is fighting the good fight. So are the Lost Media Wiki and /r/ObscureMedia/, collections of long-lost TV shows and media, most of which are horrifyingly bad. But even these won’t last forever. And hard drives can be corrupted. The last time period to come up with a truly durable media format may have been the Stone Age. At least Stonehenge still exists.

H/T Y Combinator


Adam is a writer with an interest in a variety of mediums, from podcasts to comic books to video essays to novels to blogging — too many, basically. He's based out of Seattle, and remains a staunch defender of his state's slogan: "sayWA." In his spare time, he recommends articles about science fiction on Twitter, @AdamRRowe
