toolkit of ethology mobilized by Rahwan and colleagues (2019) in a recent Nature article is probably not up to this aim, for a quite simple reason: machine learning tools are eminently social animals. They learn from the social – datafied, quantified and transformed into computationally processable information – and then they manipulate it, by drawing probabilistic relations among people, objects and information. While Rahwan et al. are right in putting forward the ‘scientific study of intelligent machines, not as engineering artefacts, but as a class of actors with particular behavioural patterns and ecology’ (2019: 477), their analytical framework focuses on ‘evolutionary’ and ‘environmental’ dimensions only, downplaying the cornerstone of anthropological and sociological explanations, that is, culture. Here I argue that, in order to understand the causes and implications of algorithmic behaviour, it is necessary to first comprehend how culture enters the code of algorithmic systems, and how it is shaped by algorithms in turn.
Two major technological and social transformations that have taken place over the past decade make the need for a sociology of algorithms particularly pressing. A first, quantitative shift has resulted from the unprecedented penetration of digital technologies into the lives and routines of people and organizations. The rapid diffusion of smartphones since the beginning of the 2010s has literally put powerful computers in the hands of billions of individuals throughout the world, including in its poorest and most isolated regions (IWS 2020). Today’s global economic system relies on algorithms, data and networked infrastructures to the point that fibre Internet connections are no longer fast enough for automated financial transactions, leading to faster microwave or laser-based communication systems being installed on rooftops near New York’s trading centres in order to speed up algorithmic exchanges (D. MacKenzie 2018). Following the physical distancing norms imposed worldwide during the Covid-19 pandemic, the human reliance on digital technologies for work, leisure and interpersonal communication appears to have increased even further. Most of the world’s population now participates in what can be alternatively labelled ‘platform society’ (van Dijck, Poell and de Waal 2018), ‘metadata society’ (Pasquinelli 2018) or ‘surveillance capitalism’ (Zuboff 2019), that is, socio-economic systems heavily dependent on the massive extraction and predictive analysis of data. There have never been so many machines so deeply embedded in the heterogeneous bundle of culture, relations, institutions and practices that sociologists call ‘society’.
A second, qualitative shift concerns the types of machines and AI technologies embedded in our digital society. The development and industrial implementation of machine learning algorithms that ‘enable computers to learn from experience’ have marked an important turning point. ‘Experience’, in this context, is essentially ‘a dataset of historic events’, and ‘learning’ means ‘identifying and extracting useful patterns from a dataset’ (Kelleher 2019: 253).
In 1989, Lenat noted in the pages of the journal Machine Learning that ‘human-scale learning demands a human-scale amount of knowledge’ (1989: 255), which was not yet available to AI researchers at the time. An impressive advancement of machine learning methods occurred two decades later, thanks to a ‘fundamental socio-technological transformation of the relationship between humans and machines’, consisting in the capturing of human cognitive abilities through the digital accumulation of data (Mühlhoff 2020: 1868). This paradigmatic change has made the ubiquitous automation of social and cultural tasks suddenly possible on an unprecedented scale. What matters here sociologically is ‘not what happens in the machine’s artificial brain, but what the machine tells its users and the consequences of this’ (Esposito 2017: 250). According to Esposito, thanks to the novel cultural and communicative capabilities developed by ‘parasitically’ taking advantage of human-generated online data, algorithms have substantially turned into ‘social agents’.
Recent accomplishments in AI research – such as AlphaGo, the deep learning system that achieved a historic win against the world champion of the board game Go in 2016 (Chen 2016; Broussard 2018), or GPT-3, a powerful algorithmic model released in 2020, capable of autonomously writing poems, computer code and even philosophical texts (Weinberg 2020; Askell 2020) – indicate that the ongoing shift toward the increasingly active and autonomous participation of algorithmic systems in the social world is likely to continue into the near future. But let’s have a look at the past first.
Algorithms and their applications, from Euclid to AlphaGo
The term ‘algorithm’ is believed to derive from the French bastardization of the name of the ninth-century Persian mathematician al-Khwārizmī, the author of the oldest known work of algebra. Originally employed in medieval Western Europe to indicate the novel calculation methods that offered an alternative to those based on Roman numerals, the term has more recently come to mean ‘any process of systematic calculation […] that could be carried out automatically’ (Chabert 1999: 2). As Chabert remarks in his book A History of the Algorithm: ‘algorithms have been around since the beginning of time and existed well before a special word had been coined to describe them’ (1999: 1). Euclid’s algorithm for determining the greatest common divisor of two integers, known since the fourth century BCE, is one of the earliest examples.
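To make the idea of a procedure ‘that could be carried out automatically’ concrete, Euclid’s algorithm can be sketched in a few lines of Python. This is only an illustrative rendering in its common modern form, which relies on remainders rather than the repeated subtraction usually attributed to Euclid; the function name `gcd` and the sample numbers are assumptions chosen for the example.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two integers, in the remainder form
    of Euclid's algorithm (an illustrative sketch, not a historical
    reconstruction)."""
    a, b = abs(a), abs(b)
    # Replace the pair (a, b) with (b, a mod b) until the remainder
    # vanishes; the last non-zero value is the greatest common divisor.
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(1071, 462))  # 1071 -> 462 -> 147 -> 21 -> 0, so this prints 21
```

Every step is fully determined by the previous one, which is why the same procedure can be executed by a person with pen and paper or by a machine.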
More generally, algorithms can be understood as computational recipes, that is, step-by-step instructions for transforming input data into a desired output (Gillespie 2014). According to Gillespie (2016: 19), algorithms are essentially operationalized procedures that must be distinguished from both their underlying ‘model’ – the ‘formalization of a problem and its goal, articulated in computational terms’ – and their final context of application, such as the technical infrastructure of a social media platform like Facebook, where sets of algorithms are used to allocate personalized content and ads in users’ feeds. Using a gastronomic metaphor, the step-by-step procedure for cooking an apple pie is the algorithm, the cookbook recipe works as the model, and the kitchen represents the application context. However, in current public and academic discourse, these different components and meanings tend to be conflated, and the term algorithm is broadly employed as a synecdoche for a ‘complex socio-technical assemblage’ (Gillespie 2016: 22).
‘Algorithm’ is thus a slippery umbrella term, which may refer to different things (Seaver 2017). There are many kinds of computational recipes, which vary based on their realms of application as well as on the specific ‘algorithmic techniques’ employed to order information and process data (Rieder 2020). A single task, such as classifying texts by topic, may concern domains as diverse as email ‘spam’ filtering, online content moderation, product recommendation, behavioural targeting, credit scoring, financial trading and more – all of which involve a plethora of possible input data and outputs. Furthermore, text classification tasks can be executed in several – yet all ‘algorithmic’ – ways: by hand, with pen and paper only; through rule-following software applying models predefined by human programmers (e.g. counting topic-related word occurrences within texts); or via ‘intelligent’ machine learning systems that are not explicitly programmed a priori. These latter can be either supervised – i.e. requiring a preliminary training process based on data examples, as in the case of naive Bayes text classifiers (Rieder 2017) – or unsupervised, that is, machine learning techniques working without pre-assigned outputs, like Latent Dirichlet Allocation in the field of topic modeling (Bechmann and Bowker 2019).
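The difference between rule-following software and a supervised machine learning system can be illustrated with a minimal Python sketch of the same text classification task. Everything in it is hypothetical: the keyword lists, the four training sentences, the labels and the function names are invented for the example, and the naive Bayes variant shown (multinomial, with add-one smoothing) is one common textbook formulation rather than the implementation discussed in the works cited above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


# Way 1: rule-following. A human programmer fixes the model in advance
# as hand-picked keyword lists (hypothetical), and the software simply
# counts topic-related word occurrences.
KEYWORDS = {
    "spam": {"free", "prize", "winner", "click"},
    "ham": {"meeting", "report", "project", "lunch"},
}


def classify_by_rules(text):
    tokens = tokenize(text)
    scores = {label: sum(token in words for token in tokens)
              for label, words in KEYWORDS.items()}
    return max(scores, key=scores.get)


# Way 2: supervised learning. Nothing about the topics is programmed;
# word statistics are extracted from labelled examples instead.
def train_naive_bayes(examples):
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def classify_naive_bayes(model, text):
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data standing in for a 'dataset of historic events'.
TRAINING = [
    ("win a free prize now", "spam"),
    ("click here free winner", "spam"),
    ("project meeting at noon", "ham"),
    ("lunch report for the project", "ham"),
]

model = train_naive_bayes(TRAINING)
print(classify_by_rules("free prize inside"))
print(classify_naive_bayes(model, "project report"))
```

In the first function the categories are defined entirely by the programmer's keyword lists, whereas in the second they are derived from whatever the labelled examples happen to contain, so a different training set would yield a different classifier without any change to the code.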
This book does not aim to offer heavily technical definitions, nor an introduction to algorithm design and AI technologies; the reader can easily find such notions elsewhere.2 Throughout the text, I will frequently make use of the generic terms ‘algorithm’ and ‘machine’ to broadly indicate automated systems producing outputs based on the computational elaboration of input data. However, in order to highlight the sociological relevance of the quali-quantitative transition from Euclid’s calculations to today’s seemingly ‘intelligent’ artificial agents like GPT-3 and AlphaGo, some preliminary conceptual distinctions are needed. It is apparent, in fact, that the everyday socio-cultural implications of algebraic formulas solved for centuries by hand or via mechanical calculators are not even close in magnitude to those of the algorithms currently governing information networks.
Below I briefly outline