it was used in diary data collection. Finally, Murphy discusses the potential of mobile devices for SMS-based survey delivery, noting its efficacy for administering surveys at predetermined times, in the context of specific events or – when used in conjunction with GPS – in specific places.
Murphy’s conclusion is that survey methods continue to play a major role in social research, and that pessimism about their survival is misplaced. This role, however, is increasingly being shaped by people’s use of communication technologies. Given the rapid pace of innovation in these technologies, the future of survey methods remains hard to predict.
Chapter 5: Advances in Data Management for Social Survey Research
As argued in the previous chapter, despite the availability of new sources of social data, making optimal use of more conventional data sources such as surveys remains of critical importance to social research. However, using survey research data can present major challenges for data management. For example, pursuing a particular research question may require linking different datasets, extracting variables, combining them and recoding their values before statistical analysis can start. In this chapter, Lambert argues that data management practices have failed to keep pace with these challenges and explains how e-Research can advance the state of the art, drawing on examples of working with quantitative datasets generated through social surveys taken from the DAMES (Data Management through e-Social Science) project.2 He argues that enhanced facilities for file storage and linkage, for using metadata to describe data, and for the capture of data preparation routines (‘workflows’) can raise standards in data management and help researchers share their experience and expertise with one another. (Exercises illustrating each of these facilities can be found at the book’s website.)
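The preparation steps listed above (linking datasets, extracting variables, combining them and recoding their values) can be made concrete with a minimal sketch. It is not taken from the DAMES project: the records, the identifier `pid`, the variable names and both recoding schemes are hypothetical, chosen only to show how a data preparation routine might be captured as a reusable, documented script.

```python
# Hypothetical survey extracts that share a respondent identifier ("pid").
respondents = [
    {"pid": 1, "age": 34, "region": "North"},
    {"pid": 2, "age": 61, "region": "South"},
    {"pid": 3, "age": 47, "region": "North"},
]
employment = [
    {"pid": 1, "job_code": 2},
    {"pid": 2, "job_code": 9},
    {"pid": 3, "job_code": 5},
]


def link(left, right, key):
    """Link two lists of records on a shared key, keeping only matched cases."""
    right_by_key = {record[key]: record for record in right}
    return [
        {**record, **right_by_key[record[key]]}
        for record in left
        if record[key] in right_by_key
    ]


def recode_age(age):
    """Recode age in years into bands (illustrative cut points)."""
    if age < 45:
        return "under 45"
    if age < 60:
        return "45-59"
    return "60+"


def recode_job(job_code):
    """Recode a job code into a broad class (illustrative scheme only)."""
    if job_code <= 3:
        return "professional"
    if job_code <= 6:
        return "intermediate"
    return "routine"


def prepare(respondents, employment):
    """Link the files, extract the variables of interest and recode them."""
    linked = link(respondents, employment, key="pid")
    return [
        {
            "pid": record["pid"],
            "age_band": recode_age(record["age"]),
            "job_class": recode_job(record["job_code"]),
        }
        for record in linked
    ]


analysis_file = prepare(respondents, employment)
```

Keeping each step in a named function means the whole routine can be rerun, inspected and shared alongside the resulting analysis file, which is the kind of transparency that workflow capture is meant to support.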
Lambert concludes by examining the prospects for the adoption of more advanced data management tools and practices. Using an example where ‘bottom-up’ and ‘top-down’ innovation processes might successfully complement one another, he notes how the push from journals and funding agencies for researchers to publish metadata about their data management is likely to have a decisive influence.
Chapter 6: Modelling and Simulation
Quantitative simulation and modelling are perhaps the most obvious examples of the potential for e-Research methods and tools to revolutionize the study of complex socio-economic problems, and their applications are becoming increasingly widespread. New sources of data and more powerful computational resources have made possible the development of more complex and sophisticated techniques and, of course, larger-scale models. As Birkin and Malleson point out in this chapter, while modelling and simulation in the social sciences have been around for fifty years, prompted by an earlier wave of innovations in computation, recent advances in both data and computation are now having a profound effect.
This chapter provides an introduction to the state of the art in four model classes that are of particular interest to social scientists – systems dynamic models, statistical and behavioural models, microsimulation models and agent-based models. An example of each class is presented: a retail or residential location model (spatial interaction model or mathematical/systems dynamic model); a traffic behaviour model (discrete choice or statistical model); a demographic model (microsimulation model); and a crime model (agent-based model). Birkin and Malleson observe that while building ever more sophisticated models of social systems has never been easier, the task of demonstrating that such models faithfully represent an underlying social reality remains the key challenge. They then relate some experiences and lessons from building a prototype social simulation infrastructure capable of providing support for the whole research lifecycle, and they stress, in particular, the importance of model reproducibility, reusability and generalizability. They conclude with a summary of some of the – as yet – unexploited opportunities for social simulation presented by new sources of data (e.g., using mobile phone data to update models of population movements in real time) and the challenges (e.g., data ownership and ethics) that will have to be met if these are to be realized.
Chapter 7: Contemporary Developments in Statistical Software for Social Scientists
In this chapter, Lambert, Browne and Michaelides examine the prospects of the quantitative social sciences being in a position to exploit the power of new social data, computational resources and tools to achieve advances in statistical analysis. They review the range of statistical software packages currently available to social researchers and the factors influencing their patterns of adoption. They illustrate their review with examples of the application of statistical methods in domains such as education, health inequalities and epidemiology. They argue that the profusion of statistical tools, while having the benefit of offering choice to researchers, nevertheless raises significant barriers, both social and technical (and, indeed, socio-technical), that need to be addressed if the power of the tools is to be fully exploited by the social science research community.
Regarding social barriers, the authors note that in the UK there is a lack of capacity in statistical skills within the social research community. Regarding technical barriers, they observe that the proliferation of statistical tools has been at the cost of inter-operability and has created a situation that they describe as ‘balkanization’. This can deter researchers from using the tool most appropriate for a particular analysis – rather than the one they are most familiar with – and may also inhibit experimenting with new tools. Echoing the concerns raised by Purdam and Elliot, they also point to problems with transparency, replicability and robustness of statistical analyses using computer packages whose algorithms are not accessible to the user. Drawing on the principles of e-Research for their inspiration, Lambert et al. conclude by presenting some ways of overcoming the social and technical barriers, which they exemplify through their efforts to develop Stat-JR and eBooks, new tools for statistical analysis that promote inter-operability between analysis packages and sharing through better documentation of analysis routines.
Chapter 8: Text Mining and Social Media: When Quantitative Meets Qualitative and Software Meets People
Text mining has developed dramatically in recent years in its power to analyse and extract information from very large bodies of unstructured text. Its applications are motivated by a growing awareness that researchers need more powerful tools in order to benefit from the rapidly increasing amounts of textual data being generated through the proliferation and unprecedented levels of take-up of Web 2.0 technologies. Chief among these are blogs and social media (‘micro-blogs’), the latter exemplified by the rise of platforms such as Facebook and Twitter.
In this chapter, Ampofo, Collister, O’Loughlin and Chadwick explore how text mining using natural language processing (NLP) techniques can provide qualitative social researchers with powerful analytical tools for extracting information from this unstructured data, including harvesting data and analysing it in real time. They survey the range of research tools for text mining, broadly defined, available in both the academic and commercial spheres. People’s use of social media is seen by many researchers as providing an ideal source of data through which to monitor rapidly changing situations; hence, it has come to particular prominence during civil unrest (e.g., the so-called ‘Arab Spring’) and natural disasters (e.g., Hurricane Sandy). Beyond these inherently unpredictable phenomena, one of the most popular emerging applications of social media analysis lies in the tracking of public opinion through the application of NLP-based techniques such as sentiment analysis. These techniques have the capacity to generate results in real time, which offers intriguing possibilities for both commercial and academic research.
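To give a sense of what sentiment analysis involves at its very simplest, the sketch below scores short messages against a small word list and tallies the results. It is an illustration only, not the method used by the authors or by any particular tool: the word lists, function names and example messages are all hypothetical, and real NLP-based systems handle negation, sarcasm, context and much larger vocabularies.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "great", "strong", "win"}
NEGATIVE = {"bad", "weak", "poor", "lose"}


def score(message):
    """Count positive words minus negative words in a message."""
    words = re.findall(r"[a-z']+", message.lower())
    return sum((word in POSITIVE) - (word in NEGATIVE) for word in words)


def label(message):
    """Map a numeric score onto a coarse sentiment label."""
    value = score(message)
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def tally(messages):
    """Summarize a batch of messages as counts per sentiment label."""
    return Counter(label(message) for message in messages)


# Invented example messages, standing in for a stream of posts.
messages = [
    "A strong, great performance",
    "Weak answer and a poor close",
    "The debate starts at eight",
]
summary = tally(messages)
```

Because each message is scored independently, a tally of this kind can be updated as new posts arrive, which is what makes real-time tracking of opinion feasible in principle.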
To illustrate the potential and challenges of using text mining techniques in social research, Ampofo, Collister, O’Loughlin and Chadwick present overviews of two projects. The first is a study of social media during the televised debates between political party leaders in the 2010 UK general election campaign. The second is also drawn from this election campaign and focuses on the reporting of accusations of bullying against then-Prime Minister Gordon Brown in the British media. The application of NLP-based text analysis tools to social data is still, in many respects,