
Data management FEATURE

of data could provide rich insight into politics, society and more. According to Weller, Twitter provides an API – application programming interface – from which academics can collect data for free. ‘You can collect certain parts of data from this API and then buy the remaining data if you need to access the entire dataset,’ she says. ‘But what many people have been doing is collecting data from the free public API, which offers a sample of all tweets currently being published or tweets from specific topics.’

Indeed, as well as studying the German Bundestag elections of 2013, Weller and colleagues have been collecting data from the API to explore how soccer fans communicate with their favourite clubs via Twitter. Meanwhile, researchers elsewhere have been querying the API to better understand reactions to landmark events such as the Arab Spring, Hurricane Sandy and more. Alongside research, and driven by Twitter’s more open and accessible API, software developers have devised a host of tools and methods to capture and analyse data from the social media platform. Crucially, many have been developed for users who do not have any programming knowledge. Indeed, in a recent blog post, Wasim Ahmed, a PhD student at the University of Sheffield and social sciences blogger for the London School of Economics, listed ‘Webometric Analyst’, ‘NodeXL’, ‘Visibrain Focus’ and more as key tools for social scientists. Currently looking at the use of Twitter data to give insights into health conditions and health-related events, Ahmed has also used sentiment, time series and network analysis, as well as machine learning methods, to analyse Twitter data.

But researchers’ love of the Twitter platform is not all about easy access to the API and readily available analysis tools. For example, 140-character tweets lend themselves to relatively simple data searches and retrieval. Likewise, hashtag norms ease data gathering and sorting. For Weller, the academic world’s predisposition towards Twitter is clear, and she has spent a lot of time exploring how researchers from different disciplines use this and other social media platforms. According to the researcher, spontaneous, ad hoc communication is rife around conferences: ‘Researchers [from all disciplines] at a conference will use Twitter to connect. Some use it as an alternative to business cards, adding colleagues to their Twitter list, while others use it as a news source or to follow

www.researchinformation.info @researchinfo

Facebook was dubbed ‘The Egyptian Social Network’ during the 2011 Arab Spring protests

recommendations on what to read from trusted fellow scholars.’ However, more detailed research has uncovered a clear discipline bias. ‘Bibliometric analysis has shown me that researchers from many different disciplines use social media data, but computer scientists take by far the biggest share of this,’ she says.

‘Social scientists come next and then you see a long tail of linguists, studying, for example, how language changes in social media,’ she adds. ‘Then we see doctors and

‘This goldmine of data could provide rich insight into politics, society and more’

health specialists looking into topics such as well-being, and also economists, looking into, for example, the use of social media to predict stock market exchange rates.’

Despite the proliferation of online methods and tools to help the less computer-savvy researcher analyse data, computer scientists are king in this academic sphere, and researchers from other disciplines need their expertise. ‘Myself and colleagues have observed that researchers from non-technical backgrounds still struggle to gather specific types of data and rely on collaborations with computer scientists and even physicists, at least for big data research topics,’ Weller says. ‘Indeed, in these social media research environments we see that researchers depend heavily on collaborative

efforts to study the data. So we see a trend towards interdisciplinary research teams, in particular in the field of social media data.’

Research challenges

But in the race to use social media platforms as a data source, academics are hitting hurdle after hurdle. For Twitter in particular, ethical and legal issues need to be addressed. When retrieving large swathes of Twitter data through the API for analysis, it is not always possible to gain direct consent from participants, a point that Sara Day-Thomson, project officer from the Scotland-based Digital Preservation Coalition, is very concerned about. ‘A handful of highly competent social science researchers are trained to handle big research data, having worked with, for example, census data, and use strict and mature ethics processes to account for bias,’ she says. ‘But some computer scientists and physicists run analyses and produce results without relating these to a context.’

‘So what concerns me is the vast majority of social media users are private citizens that don’t necessarily understand the implications of their data being made accessible for either researcher or consumer analysis research,’ she adds. Ethics was one of several issues considered in Day-Thomson’s recent DPC report, ‘Preserving Social Media’, written to provide guidance to academics accessing social media for research purposes, and to organisations looking to preserve social media data. As part of her research, Day-Thomson also considered how social media data, when combined with administrative data, could unintentionally

APRIL/MAY 2016 Research Information 5
