ILLUSTRATION: ANDY POTTS


digital


lost if the software that publishers use becomes outdated. Servers that host content can deteriorate with age. Rights for images and videos can expire, making it expedient to delete the articles they are attached to. Often the finances don’t stack up. “Businesses are unable to justify the costs of maintaining old content which gets a small amount of views,” says the manager.

The best stories are often the most vulnerable, according to Ben Welsh, data editor at the LA Times. “Newsrooms have content management systems (CMS) that operate like an assembly line and produce stories in the same template,” he says. “Those tend to have a better chance of surviving because they live inside a database that gets moved from place to place.

“But special projects that are often custom-designed, coded and published outside of the CMS are vulnerable to just disappearing from the web… They are exceptionally fragile and they are often the journalism that we care most about and invest the most time in.”

Much of the Guardian’s groundbreaking, crowdsourced investigation into MPs’ expenses from 2009 can no longer be found on its website. A USA Today interactive piece on the border wall with Mexico now returns an error message. As new templates and graphic innovations are harnessed to convey the enormity of Covid-19 or the war in Ukraine, industry specialists fear that uniqueness may not be built to last.

Web articles often suffer degradation as well as disappearance. Hyperlinks are one of the great advantages of online publishing as they allow journalists to cite sources and offer additional context. However, an investigation by Harvard’s Library Innovation Lab (LIL) into ‘link rot’ found that a quarter of external links in New York Times articles were no longer accessible, with that proportion rising with the age of the page. This also revealed the wider implications of web decay beyond journalism. Hyperlinks to government websites and legal judgements – often digital only and containing information of vital public interest – suffered a high rate of attrition.

Researchers have also highlighted an issue termed ‘content drift’ where hyperlinks are redirected to new pages. The primary cause was identified by a 2019 Buzzfeed investigation that discovered a cottage industry of dubious companies offering hyperlink hijacking services on news websites, resulting in links to gambling sites and bankruptcy lawyers on BBC articles.

News outlets often lack the means and motivation to preserve their digital output. A Columbia Journalism Review study found 19 of 21 companies were taking ‘no active steps’ to do so. The internet is ever evolving – unsteady terrain for outlets that are constantly adapting to its innovations and challenges. One of my former editors likened web publishing to fixing a car while you are driving it. There is also the imperative to look ahead and focus on what’s next rather than devoting resources to safeguarding the past.

Independent archiving initiatives are stepping into the breach. Harvard’s LIL has launched Perma.CC, which fights link rot by providing secure permalinks to citations. Ben Welsh has launched several projects, including Save My News, a clipping service that saves articles to multiple archives, and The News Homepage Archive, which safeguards digital front pages. The British Library has expanded an ad-hoc archiving service for UK news websites into a more comprehensive effort since 2013.

The largest initiative is the California-based Internet Archive, which began in 1996 and now hosts more than 600 billion pages. The group’s software spiders crawl the web and capture content in a similar fashion to Google’s indexing technology. The company has expanded its catalogue to include books, TV, radio and music.

The Archive has demonstrated value to journalists through initiatives such as its Threatened Outlets page, established after billionaire Peter Thiel sued US site Gawker out of existence. The company also offers new investigative tools, as shown when it exposed Dominic Cummings’ lie that he predicted the pandemic by revealing edits on his blog through versions with different time stamps.

But even a vast operation such as the Archive is dwarfed by the task of preserving the internet. “We get better every day but the need for our work outstrips our efforts,” says director Mark Graham. The spiders run into paywalls and sites that are incompatible with their software. How far archivists should aim to preserve a page with all its layers of multimedia, links and ads involves curatorial decisions. Apps present a whole new category of challenges. This is all vastly removed from traditional newspaper archiving.

Harvard’s LIL researchers believe the solution lies in the creation of a vast library by archivists, journalists, and technologists that would allow newsrooms to deposit pages through a common system without being slowed down. “We shouldn’t expect journalists to be librarians, but we could build tools that make it easy for them to hand things off to librarians,” says the LIL’s Clare Stanton. If we want our


theJournalist | 23

