FEATURE Data management


The graph represents a network of 947 Twitter users whose recent tweets contained “#dataviz”, taken from a dataset limited to a maximum of 1,500 users


time, Twitter was processing more than 50 million tweets a day and the historical archive consisted of some 170 billion tweets. Given the historical and cultural significance of this data, the Library of Congress’s archive held enormous promise for researchers worldwide. But following practical challenges, the project stalled.


Key problems around how to ingest, organise and store the vast quantities of data stymied progress. Meanwhile, issues over finding useful retrieval methods, creating the appropriate access controls to the archive and, as always, privacy policies, are still being addressed (see ‘Where next for the Library of Congress?’). But failings aside, Twitter has been the only social media platform that has publicly acknowledged the value of its data to researchers. And while the agreement between Twitter and the Library of Congress has not yet fulfilled expectations, it is a key milestone for social media research, and other projects are now following.


8 Research Information APRIL/MAY 2016


‘Researchers can share tweet identification numbers but sharing larger datasets is prohibited’


In October 2015, MIT Media Lab launched the ‘Laboratory for Social Machines’ funded by a five-year, $10 million commitment from Twitter. The social platform is providing access to all public streaming tweets as well as historical tweets, so researchers can analyse how information spreads on Twitter and other social media platforms.


Meanwhile, outside of the US, other key projects include the Social Data Science Lab and COSMOS platform at Wales-based Cardiff University, the Social Repository of Ireland and the National Archives’ UK Government Social Media Archive. Sara Day-Thomson believes these projects are very useful. However, she is adamant greater collaboration among institutions, such as universities and national heritage libraries, will be crucial to solving current social media research issues, as well as ensuring access into the future. To this end, she believes the coming petabytes of social media should be managed and preserved by several large centralised providers to ease data analytics and also reduce data costs and create benchmarks for data quality. ‘We know that many institutions in the UK have relationships with Twitter, but don’t discuss this openly,’ she says. ‘Yet taking a lesson from the Library of Congress, it doesn’t necessarily work to have a large platform in which to deposit all the data.’ ‘A more successful model would be to develop a larger, and potentially national, interface that could liaise with different social media platforms to smooth over the rights issues, and terms and conditions,’ she adds. ‘Things like this don’t come cheap, but the benefits would far outweigh the cost.’


@researchinfo www.researchinformation.info


Marc Smith

