Ethnography & Big Data: Future Roles?
Last month, Tricia Wang (giving a keynote at EPIC this year) wrote an ethnographer’s elevator pitch for data scientists. I’m sold, and as the co-founder of the first UK think-tank Centre dedicated to researching social media to inform public policy (and someone who works with those algorithms and ‘big N’ datasets every day) I want to continue the same theme: the enduring and vital relevance of ethnography to a particularly important form of big data – social media analysis.
I don’t doubt the potentially transformative power of social media research. We are living through the ‘datafication’ of the world – an increasing human ability and propensity to make more and more measurements of more and more things, from manhole covers to airplane ticket prices. I view the explosive transfer of our social lives onto social-digital spaces – Facebook, Twitter and hundreds more – as the datafication of social life. For the first time, we have a measurable, analyzable visage of a society-in-motion: arguing, condemning, joking and applauding, being influenced and influencing in turn. Harnessing these vast digital repositories as behavioral evidence is incredibly exciting. Never before have those with a stake in understanding people and society been able to get their hands on data so copious, constantly refreshing, and unmediated. Big data social media research promises to teach us a lot, from how we are influenced by networks, to what finally tips us over the edge into a particular action.
However, my point is simple. Fundamentally and endemically, social media research is not yet generating the kinds of insights that it could.
To be truly useful and insightful, and to confer what is often called in Government ‘decision advantage’, it needs to undergo a step change. It needs to move from Generation I: analytics, to Generation II: academic discipline. Let me explain why ethnography is vital to making this leap.
First, method. The story of big data analytics so far has been the growth of very powerful quantitative methods, driven by the metrics traditions of the marketing and advertising industries and by the computer science departments of universities that feed them with techniques and staff. This has led to a state of the art that is, on the whole, good at counting things on social media, and can indeed do so at truly incredible, unthinkably large scales. In general, however, this kind of raw enumerative capability leaves a large number of challenges to solid, valid insight unanswered. Across the research cycle, methodological frailties in how social media research is conducted mean that what it produces cannot conform to the standards of evidence required to influence important decisions. The way that data is collected on social media, however large the dataset, is often arbitrary or incidental – even in best-practice work – and cannot meet the standard sociological requirements needed to construct representative datasets that allow findings to be generalized to wider populations. Analytical techniques for big data – often necessarily computationally intensive – frequently ignore context, culture and nuance, and present raw results as if they carried some kind of ‘obvious’ message for the reader. In general, the semi-straw man I’ve constructed is one that is unable to critically and reflexively understand and interpret human behavior in all of its textured complexity and challenge. This is something my colleagues and I have written about at length elsewhere – especially in a paper we wrote last year with the former Director of GCHQ, Sir David Omand: #intelligence.
Specifically, I view ethnography as vital to understanding the meaning and significance of these new forms of digital-social interaction in the eyes, and words, of those who conduct them. The explosion of social media has been so rapid that our ability to count things on social media has outpaced our understanding of what these things mean as social and cultural practices – as symbols, as language-games, as rituals, as products of digital worlds ruled by new norms and subjective truths.
Without knowing these things, we don’t know what we’re counting. So, if we’re applying network analytics to social media, we’re measuring ‘edges’ of networks at great scale, but we don’t know what the social significance of these edges is. We don’t know what it is to be a Facebook friend, a Twitter follower, to retweet or to G+1, or how this varies from person to person and community to community. If we’re crafting algorithms to understand broad sentiments expressed across platforms like Twitter, we need to understand how language use is changing – how new vernaculars are springing up in 140-character formats, and how old words take on new meanings. If we’re trying to understand how a social phenomenon – like hate speech – maps onto social media, we also need to understand subcultures that’ll troll for the ‘lulz’. Overall, to be useful, so many ‘big data’ findings on social media need to be wrapped around a digital ethnography and digital sociology that give their numbers meaning, and at the moment they often are not. This is an enduring failing of Generation I analytics.
The second major contribution of ethnography is ethics. Generation II social media research will, like other mature academic disciplines that handle personal information, take responsibility as a discipline for creating frameworks that ensure research practice manages and mitigates harms. Yet currently we have, in neither the private, public nor academic sectors, clear, explicit and consensual guidance on how to do this. Academics are beginning to work on it, and the public sector is beginning to issue clarification, but the private sector is hoovering up data at a huge rate and scale. This, as I’ve said before, is a new key battleground for consumer rights and now, with the recent PRISM revelations, a scandal unfolding before our eyes. It could lead to a significant withdrawal of support for the use and analysis of people’s data, and to people withdrawing from social media platforms.
We cannot ever construct ethical frameworks for social media research without the vital input of ethnography. The key social good that is under threat from social media research is privacy. There is no specific definition of privacy enshrined within UK statutory law; it is a complex and multifaceted attitudinal construct, and is changing, partly as a result of Internet and social media use.
Ethnography must help us keep a handle on this important but elusive social value. We need to understand what information is now considered private and what is not, and how these expectations differ by platform and context. Deep, textured understandings must form part of the way that we ensure that social media research conforms, and continues to conform, to these expectations.
‘Inter-disciplinary’ is currently a modish watchword – especially for funding bodies and university assessment forms. But to make social media research useful and ethical, an inter-disciplinary approach is vital (and at my Centre, it is assumed). At the moment disciplinary boundaries are being challenged across academia, and I predict we are going to see more and more trans-disciplinary hybrids that mix qualitative and quantitative methods together. In the case of social media, it is now clear that qualitative methods are needed to make social media research a true (Generation II) social science, rather than just analytics, and in doing so to make humans increasingly understandable by big data through making big data more ethnographic, indeed human-sized.
Carl Miller is co-founder and Research Director at the Centre for Analysis of Social Media at Demos.