UTOPIAN AND DYSTOPIAN SOCIOTECHNICAL IMAGINARIES OF BIG DATA

Data feminism, a way of thinking about and “doing” data utilizing feminist tools and perspectives, has emerged in recent years as a part of a critical discourse surrounding datafication. The aim of this study is to analyze and identify shared perceptions of big data as expressed in a corpus of scholarly writings published in the domain of data studies and data feminism. We analyzed a set of 44 scholarly texts engaging in feminism concerned with the concept of big data. For the purpose of this article, we refer to this set of texts as data feminism and examine how authors frame and describe big data. We compare future visions in data feminist material with policies by the European Commission and explore what tensions arise among them. Furthermore, we explore and delineate social and political alternatives that emerge from data feminist texts. Both corpora describe futures inclusive of big data and imagine possible positive outcomes from different perspectives and with different ideas of the current role of big data. We found that sociotechnical imaginaries of big data within the data feminist corpus are considerably richer and more nuanced than those of the European Commission. In the data feminist corpus, big data is described as a multiplicity of things and often implicated in perpetuating power imbalances and large societal issues. The European Commission corpus employs the perspective of “data as a resource” to be exploited.


INTRODUCTION
The rise of the Internet and social media, as well as the accumulation of vast amounts of data, have created a setting for practices and structures associated with what has come to be called big data. It is a curious concept that has served as the foundation for both optimistic and pessimistic visions of the future. Big data is a contested concept and there is no consensus on its definition. In the absence of an agreed-upon definition, the "three V's" attributed to Gartner (Ward & Barker, 2013, p. 1) are often used to explain big data. The "three V's" are Volume, Velocity, and Variety, and new words have been introduced since (such as Value and Veracity). Simultaneously, critical discussions are continuing to shed light on the array of systems, decisions, and processes that influence what we perceive as big data. Together with predominant narratives about big data, they help shape how we envision and prepare for the future with big data.
According to the theoretical approach of sociotechnical imaginaries (Jasanoff, 2015; Jasanoff et al., 2007), visions of the future are formed among groups in society and frequent conflicts can occur between the imaginaries of different groups. Sociotechnical imaginaries are self-fulfilling and agenda-setting: today's dominating imaginaries set the boundaries for how the future will unfold. This has proven particularly relevant in the current decade, as long-term planning and investments by governments and corporations in the West are focused on images of smart cities, the Internet of Things, 5G telecommunications, augmented reality, virtual reality, and much more. The creation and application of big data involve practices that bind together many of these imaginaries.
The notion of big data as fostering both utopian and dystopian discussions goes back at least a decade, when boyd and Crawford (2012) stated that the application of big data at a scale triggers both utopian and dystopian rhetoric. According to them, utopian rhetoric describes big data as a helpful tool for simplifying and streamlining complex systems. In contrast, the dystopian rhetoric predominantly regards big data as capable of enabling privacy invasions and decreased civil freedoms. In this text, we use the notions of dystopian and utopian sociotechnical imaginaries to signal the utilization of the dystopian and utopian rhetoric described by boyd and Crawford. This is not to imply or feed into narratives describing feminists as being against technology or progress. In fact, women have long been at the forefront of technological development, even though their contributions were often unrecognized and erased.
With this in mind, utopian sociotechnical imaginaries encompass those imaginaries that depict big data as capable of improving social systems and economies. These sociotechnical imaginaries are promoted by two very different but highly influential groups. The first comprises IT companies, particularly those that have their headquarters in Silicon Valley. These companies often appear unified in marketing positive perspectives about technologies in the making (Lindh & Nolin, 2017). The second group entails policymakers who aspire to boost their economies by implementing data-driven innovation. These two groups frame big data as a resource and a tool for empowerment and social change (Levina and Hasinoff, 2016). Big data is described as beneficial across domains, from commerce to health and government (Intel, 2021; Chen et al., 2012) and helpful for solving local and social problems (e.g., Guha, 2021). Zuboff (2019) argues that positive discourses are dominated by Big Tech through the power of declarations. They function by "impos[ing] new facts on the social world while their declarers devise ways to get others to agree to those facts" (p. 177). According to Zuboff, Big Tech companies move into uncharted territories to claim them, subsequently tailoring the direction of their development.
In dystopian rhetoric, big data is seen as problematic and capable of inflicting damage (Gregory & Halff, 2020; O'Neil, 2017). Considering this, discourses developed by scholars from various fields who choose a critical perspective toward big data fall under the scope of dystopian sociotechnical imaginaries. Critical data studies, surveillance studies, and feminist studies are examples of the critical lenses used to problematize and scrutinize various aspects of big data. Another way of describing the dystopian rhetoric would be to call it anti-utopian, as such accounts can provide constructive and positive ideas. In this article, we are particularly concerned with understanding the dystopian in the sociotechnical imaginaries within data feminism.
The term data feminism was popularized in recent years with the publication of the book "Data Feminism" in which it was defined as "a way of thinking about data, both their uses and their limits, that is informed by direct experience, by commitment to action, and by intersectional feminist thought" (D'Ignazio & Klein, 2020, p. 3). Feminist scholars have a tradition of developing tools and theories for studying structures of power and how they are subverted by and intertwined with different phenomena in society. Although data feminism is a relatively new concept associated with a specific program, for simplicity, we will use it broadly to cover an array of feminist critical approaches. We will contrast this by comparing data feminist sociotechnical imaginaries with those of the European Commission, as identified by Rieder (2018). We also want to better understand the relationship between the utopian and the dystopian within the sociotechnical imaginaries of big data. The questions that we explore in this article are therefore:
• What characterizes the dystopian sociotechnical imaginaries of big data within data feminism?
• In which ways do these dystopian sociotechnical imaginaries differ from the utopian sociotechnical imaginaries within European Commission policies?
• What are the particular contributions and added value to critical studies that data feminism provides in this area?
In order to perform a meta-analysis of the contributions of data feminism to the topics of this study, we approach the field as researchers who are not active within it. We do this by conducting a text analysis of the data feminist corpus and comparing it with official European Commission documents as analyzed by Rieder (2018).

SOCIOTECHNICAL IMAGINARIES
One of the most prolifically employed frameworks for studying the confluence of society and technology is the Actor Network Theory (ANT). However, the ANT has been criticized for flattening the thickness of social relationships, hierarchies, and power distributions (Jasanoff, 2015). The concept of sociotechnical imaginaries was developed as a response to this flattening and is used to describe the role of collective imagination in society. Social imaginaries upon which the concept is built have been predominantly defined as:
[T]he ways in which people imagine their social existence, how they fit together with others, how things go on between them and their fellows, the expectations that are normally met, and the deeper normative notions and images that underlie these expectations (Taylor, 2003, p. 106).
According to this theory, the role of collective imagination is to enable the development and legitimization of practices and routines through the creation of common understandings of what is or is not acceptable, desirable, or even thinkable. However, Taylor's theory does not account for the increasingly important role of technology in the collective imagination. Since technology mediates so many aspects and routines of the everyday and mundane, accounting for it in the collective imagination has been grounds for developing the concept of sociotechnical imaginaries.
In this paper, we employ the theoretical perspective of sociotechnical imaginaries as developed by Sheila Jasanoff. According to Jasanoff, sociotechnical imaginaries are "collectively held and performed visions of desirable futures" (2015, p. 28). In other words, the way we imagine the future impacts our practices through the ways we prepare for it. Moreover, visions of undesirable futures can signify the emergence of competing sociotechnical imaginaries between different groups. Jasanoff explains that sociotechnical imaginaries are "animated by forms of social life and social order attainable through, and supportive of, advances in science and technology" (2015, p. 28). According to this view, dramatic technological changes, such as those in the 20th and 21st centuries, have an impact on social life and social order. This is not a technological determinist perspective; rather, it is situated in the tradition of Marshall McLuhan as articulated by Culkin (1967, p. 70): "we shape our tools, and thereafter our tools shape us". In addition to being mutually influenced and mediated by technology, sociotechnical imaginaries are important for how practices and technologies take shape.

IMAGINING THE FUTURE OF BIG DATA
When attempting to define big data beyond the "three V's", different authors mention scalability, frequent or continuous generation of data as well as methods used to process and analyze it (boyd & Crawford, 2012; Kitchin, 2014; Secundo et al., 2017). One way to think about the phenomenon is suggested by Mayer-Schönberger and Cukier (2013) who refer to big data as:
[T]hings one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more (p. 6).
The changes in relationships they mention indicate how deeply big data is embedded in societal and cultural contexts. Women have played a central role in developing a critical understanding of big data, Women of Color and nonbinary/trans women in particular. Women have described the social consequences of big data, their racial implications, and how they reinforce existing inequalities. They pioneered research on the use of big data for algorithmic decision making that is often impossible to audit and correct when blunt mistakes are made and inequalities are perpetuated. Some examples of their contributions concern big data reinforcing social inequalities in education and the workforce (O'Neil, 2017) and over-policing (Benjamin, 2020). Furthermore, women of color have described how training datasets tend to overwhelmingly represent lighter-skinned subjects (Buolamwini and Gebru, 2018) and how digital spaces can be sites of racialization (Nakamura, 2013). These and other contributions enriched the understanding of big data as a technical as well as a social phenomenon.
Since the start of the 2010s, big data as an imaginary has been at the center of numerous discussions of the future. Considering the lack of consensus regarding how the concept should be defined, the way big data is imagined plays a significant role in how it figures in society and what kinds of futures are implied. Utopian visions of big data emerged quickly in the early 2010s. Big data seemed to offer a promise of new forms of rational planning that could also provide solutions to societal challenges such as climate change. Resources could be used with much more efficiency with the continued development of social media as well as smart homes and smart cities. All such developments depended on a combination of big data and artificial intelligence.
Within critical scholarship, discussions aligning with the dystopian rhetoric emerged within numerous disciplines and interdisciplinary fields. The commercialization of online space during the early 2000s resulted in a handful of companies seizing control of massive amounts of data that could be relatively freely combined into valuable prediction products (see for example McNamee, 2019; Zuboff, 2019; Cheney-Lippold, 2019). The sheer volume of available data signals the shaky promise of objectivity, truth and accuracy (boyd & Crawford, 2012). Nonetheless, authors such as McNamee (2019) and Zuboff (2019) explore how Big Tech and Silicon Valley shape narratives and thus the impact that they can have on how futures are shaped.
A key concept used within the discussions of big data is datafication. It is a concept that explains how social action is translated into data that, in turn, makes accessing, understanding, and monitoring human behavior possible (van Dijck, 2014; Mayer-Schönberger & Cukier, 2013). The concept of datafication became another buzzword within the IT industry, carrying positive connotations. The mathematical understanding of the concept was launched by Mayer-Schönberger and Cukier (2013), who emphasized the aspect of quantification of human behavior and contextual information. They were among the first within the big data context to suggest that human behavior can be treated mathematically and thus open the possibility of collection and analyses of various forms of data. Scholars have also critically discussed the potential of datafied mechanisms and automation to perpetuate inequality and discrimination (e.g., Benjamin, 2020; Eubanks, 2019; Noble, 2018; O'Neil, 2017). In this article, we position discussions about datafication as part of big data imaginaries.
In a sense, big data appears as the ultimate form of the sociotechnical imaginary. It conjures up a future in which planning of automated and human practices can be organized for optimal efficiency in basically any realm of society. Indeed, the sociotechnical imaginaries approach has previously been used in critical studies of unfolding technologies related to big data. Reviewing such studies, a remarkable pattern can be discerned in which the imaginaries of Big Tech have become so powerful and uncontested that they are appropriated by the very people that are, seemingly, exploited.
Such research shows how users have come to accept profiling as a natural part of the online experience. Lupton (2020) explores how people imagine and talk about personal data profiling on social media. She defines the profiles created by collecting data generated through user participation in online life as "data personas".
Lupton uses this term to provoke a conversation about which data are collected, to whose benefit, as well as the perceived effect they have on her participants.
Corporations such as Google and Meta (formerly Facebook) have an advertisement-driven business model based on big data. The viability of this business model is dependent on minimizing the risks associated with public backlash and government regulation. To that end, it becomes necessary for these companies to position data profiling as a public good, perhaps even as a public service. One study found that even individuals with a sophisticated understanding of datafication and surveillance described the trade-off of control over personal data in positive terms (Sörum & Fuentes, 2022). The prioritization of receiving information in a seamless manner is closely connected with accepting datafication and framing it as positive. Furthermore, in a study where participants received guidance about developing critical consciousness toward digital futures, Markham (2020) found that it was difficult for the participants to envision alternatives. These results demonstrate how deeply ingrained such imaginaries can be, especially when established by big corporations that invest heavily in creating and establishing standards for the online practices of the future. These imaginaries can be called utopian as they identify diverse problems of contemporary life and present technological workarounds. In this way, Big Tech companies take on roles traditionally reserved for public institutions and governing bodies, increasingly dominating the imaginative power for various groups in society (Mager & Katzenbach, 2020).
In many other ways, the role of the policies that regulate Big Tech is heavily influenced by the utopian visions produced within Big Tech. Trying to understand how technology can be reimagined has prompted studies of smart cities (Deitz et al., 2021; Sadowski & Bendor, 2018; Sandeep, 2017); studies of agency in social media (Saariketo, 2020; 2014); search engines (Mager, 2016); and algorithms (Kazansky & Milan, 2021; Bucher, 2017). In the following section, we look at how big data can be and has been reimagined within critical feminist perspectives.

Feminist critical discussions
Feminist thinkers and scholars have developed perspectives that place structures and dynamics of power at the center of their attention. A tradition of studying intricate and ubiquitous oppression equipped feminist scholars with tools and theories that inspect and illuminate power dynamics, thus making it possible to criticize systems that support discriminatory practices and inequality in power distribution.
The introduction of intersectionality to mainstream feminisms reflects aspirations to advocate for the importance of localities and minorities (Daniels, 2015; Carbin & Edenheim, 2013; Davis, 2008). Intersectionality was introduced by legal scholar and civil rights advocate Kimberlé Crenshaw. She described the specific situation of Black women facing discrimination both by race and gender (Crenshaw, 1991). These compounded experiences of discrimination were situated at the intersection of racial and gendered identities, and as such are particularly difficult to advocate for within a system that favors whiteness and maleness. Appropriation of the term intersectionality has been controversial, the main criticism being that the use of intersectionality among feminist scholars has become increasingly vague and wide in scope, reducing its potential as a critical tool (Carbin & Edenheim, 2013). Even so, intersectionality remains a significant focus among feminist scholars, as many established norms tie into perspectives that were previously overlooked in (white) feminisms (Lykke, 2020; hooks, 2000). In digital landscapes, intersectionality implies opening to, and actively drawing from, indigenous and postcolonial research (Odumosu, 2020), LGBTQ+ research (Baucom, 2018; Drabinski, 2013; Huffer, 2013), and critical race studies (Knight Steele, 2021; Love, 2018), among others.
Feminisms have long operated between utopian and dystopian visions of the meaning that emerging technologies can have for women (Wajcman, 2004). On the one hand, digital technologies are seen to enable spaces of freedom from gender norms and old social relations. On the other, the dramatic gender differences in access to and control over new technologies reiterate old power imbalances at scale. Women as innovators of technology have historically been overlooked and had their contributions erased (D'Ignazio & Klein, 2020a; Perez, 2019; Onuoha, 2016). Feminist discourses on these topics address the issues of seeking equal representation in data as well as concerns about excessive surveillance and threats against privacy caused by such representation (Dubrofsky & Magnet, 2015). Simultaneously, feminist interest in topics of visibility and representation within datafied systems, in a world where maleness is the default, predates the current upscaling of datafication.
In the context of contributing to critical perspectives on data, feminist approaches such as standpoint epistemology (Harding, 1986; 1991) and situated knowledge (Haraway, 1998) emerge as useful tools for scrutinizing the notions of data as inherently objective, reliable, and trustworthy. Haraway and Harding are influential in the feminist refutation of absolute objectivity, i.e., "a view from nowhere" that is often perpetuated by removing the contexts in which data are created and exist. This critical tradition frames data as situated and embodied, not isolated from the circumstances in which they were produced. Feminist critical perspectives also highlight the processes that data underwent since their creation and the circumstances under which they are employed.
The application of feminist discourse to a data-driven society can help interrogate and challenge power relations (D'Ignazio & Klein, 2020a) and broaden the understanding of the challenges they engender. According to boyd and Crawford (2012), the mythology of big data is a significant part of this sociotechnical phenomenon, resulting in "the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy" (p. 663). Indeed, the intentional mystification of big data can hinder critical inquiry by obfuscating the understanding that data are "not an autonomous force or a unidimensional technical fix"; rather that they are shaped by "social and material factors, including social institutions and technologies" (Beaulieu & Leonelli, 2022, p. XV).

METHOD
In this article, we explore a collective set of dystopian imaginaries by comparing a corpus of articles that are here labeled data feminist with discourses developed in the European Union. In order to identify and highlight conceptions of big data and visions of the future in our corpora, we utilize the concept of sociotechnical imaginaries, or collectively held ideas about how things work, what is conceptualized as good or bad, and what constitutes desirable or undesirable futures.
We identified scholarly texts concerned with datafication that employ feminist tools and perspectives through searches in academic databases, discovery engines, and citation chaining. The initial searches were conducted in the winter of 2020 and the spring of 2021. At this stage, a set of 103 texts was collected. We utilized VOSviewer, software for constructing and visualizing bibliometric networks, to identify clusters of related terms. The largest cluster formed around the term big data, drawing attention to the term as a central concept warranting further exploration. We then reviewed the 103 identified texts to assess their engagement with the concept of big data. By early 2022, when final texts were added, we narrowed down the corpus to 44 articles, commentaries and essays. A description of our search strategy is provided in Table 1.
We conducted a close reading of the texts in the resulting corpus to determine how big data figures in these documents, identify the concerns raised about it, and analyze the implications of these concerns for understanding and integration of big data into future visions. Our analysis encompassed searching for contexts in which the term appeared in the corpus and examining the words and metaphors used to discuss and describe it. For organizing our notes and creating mind maps of instances of the term "big data" in the texts, we utilized Miro software. Much like the concept of datafication has been widely adopted for critical discussions (see above), data feminist authors have also adopted big data. We refer to this corpus as data feminism, with the understanding that data feminism is a larger discourse and an umbrella term that surpasses the materials identified in this study. Following the construction of our data feminist corpus, we compared the results of our analysis of the corpus to the results of a 2018 study of sociotechnical imaginaries of big data in European Commission policies (Rieder, 2018). We created a table of Rieder's findings and complemented those with close readings of three additional texts published since the 2018 study. In the final stage, we surveyed and analyzed policy documents of the European Commission for mentions and considerations of minorities, gender, and other traces of intersectional and (data) feminist issues. Jasanoff and Kim (2009) have positioned textual criticism as a crucial method for the construction of imaginaries, and comparing these corpora aided us in teasing out sociotechnical imaginaries of big data. Overall, the study used a combination of bibliometrics, data visualization, and close reading techniques to analyze and understand the literature related to feminism and data studies.

Comparison as a method
Comparison is a common research method employed across disciplines, such as history, anthropology, and literary studies. In its most elementary form, comparison refers to a systematic process of contrasting two or more cases to each other, enabling the exploration of parallels and differences among them (Azarian, 2011).
Comparison has the potential to elucidate nuances that are difficult to identify in conventional single case studies and to explore assumed knowledge and perspectives within the different cases. As a prevalent method in studies investigating sociotechnical imaginaries, comparison helps to identify the content and contours of those imaginaries by highlighting and contrasting the epistemic and ethical assumptions that are situated and particular (Jasanoff, 2009). Studies that implement comparison as a method to investigate sociotechnical imaginaries are numerous. One study conducted a comparison of cross-national science and technology policies (Jasanoff et al., 2007). Another (Levenda et al., 2019) focused on sociotechnical imaginaries of governance in energy innovations, comparing the cases of Portland, OR and Phoenix, AZ in the United States. Most recently, a comparative analysis of US and EU data governance and their respective sociotechnical imaginaries of personal digital data was published (Guay and Birch, 2022). Sociotechnical imaginaries are embedded with normative implications. Jasanoff and Kim (2009) state that descriptions of potentially attainable futures tend to become prescriptive and positioned as futures that ought to be attained.
According to Jasanoff (2019), comparison helps tease out sociotechnical imaginaries and can reveal underlying assumptions and normative commitments by identifying commonalities and differences between them.
Comparison helps sketch out different sociotechnical imaginaries or visions of the future that depend on understanding existing technologies as well as future technological innovation. Comparing two corpora of texts can imply variations in terms of affiliations, types of authorship, and ways of achieving legitimacy. In our case, data feminist texts are usually affiliated with one or more universities and attributed directly to the individuals who wrote them, often a single author. By contrast, the Commission's policies represent the work and reflect the values of the European Commission. Keeping in mind instances of authorship transparency issues (Nelhans & Nolin, 2022), these policies are published and legitimized under the name of the European Commission. We chose these two corpora in order to contrast the assumptions in the data feminist and European Commission corpora. This juxtaposition highlights the sociotechnical imaginaries within data feminism, and those that arguably reflect a significant influence on the formation of big data imaginaries in Europe, particularly stemming from the European Commission.
Comparing one corpus of texts to another, using analysis performed by another author, means that this article relies heavily on the analysis conducted by Rieder in 2018. Perhaps the most consequential difference between the two corpora is that of genre. Moreover, when comparing corpora, disparities in underlying assumptions and taken-for-granted aspects become apparent. For example, having been written for different audiences with different perspectives, the studied texts assume different levels of a priori knowledge of certain concepts. The intended audiences further influence the actors that the texts address and that are deemed central.

RESULTS
We found that data feminist texts and European Commission policies diverge in the ways in which they frame, understand, and discuss big data. European Commission documents identify and explain interpretations of socioeconomic and political realities to provide guidelines for policy. Such documents are designed to create broad-scale changes across the European Union. These are, therefore, powerful utopian imaginaries, situated at the core of a centralized decision-making system. Being a policy-making body and attempting to work in the interest of citizens of the European Union, the perspective of the European Commission is decidedly top-down.
Contrarily, authors who write with a data feminist lens take on bottom-up perspectives, with a focus on local communities and knowledge, as well as the protection of vulnerable members. The academic genre of data feminism presupposes a critical and investigative approach. Differences in these two contexts have implications for their perspectives and the way language is employed. Feminist writing is not intended to be prescriptive, unlike the policy documents of the European Commission. Furthermore, the utilization of feminist tools contributes to the critique and destabilization of dominant narratives, particularly where they reproduce problematic hierarchies and enable or recreate discriminatory practices.
Table 2 describes the result of the encoding of the differences between points of the dystopian and utopian sociotechnical imaginaries. It contains the results of close reading data feminist and European Commission texts, as well as our analysis of Rieder's (2018) findings. The results of the encoding will be discussed further below. We first present the motifs and phrases that occur within each of the corpora, followed by each of the three prominent themes that occur across the data feminist texts. The first, titled Visibility and representation, focuses on discussions about big data and minorities and the relationship between the private and the public. The second, Power and domination, discusses both in relation to datafication. The final theme, Exploitation and vulnerability, focuses on the differences between those corpora employing the rhetoric of big data as a resource and those that challenge it.

Motifs and phrases
In data feminist texts, big data is described in various ways, with the consensus being that it is an inevitable part of the future. As long as big data is tightly wrapped into dystopian imaginaries, a problematic development seems to be inevitable. That said, there are also data feminists who are attached to utopian imaginaries. Proximity to natural sciences and technology is correlated to positive visions of big data in some data feminist materials (e.g., Diaz Martinez et al., 2020; Vaitla et al., 2020; Larrondo et al., 2019). Some authors in our data feminist corpus assert that big data projects can be transformed and therefore serve to create attractive future outcomes. Vaitla et al. (2020, p. 18) argue that big data "can have a profound influence on improving the lives of all", adding that it can only happen if "it is intentionally managed as a vehicle for equity and empowerment". Hong (2016) claims that big data "offer a chance to open up new opportunities to produce a more egalitarian society". Other authors refrain from making such claims and contributing to similar discourses. They do not describe a future rid of, or with less, data extraction either. Instead, some authors explore the representational limits of data in the context of a societal system in which dominating societal norms already exist. They suggest Queering data (Gieseking, 2018) and Black data (Rueberg & Ruelos, 2020) as two respective methods of centering perspectives of minorities in datafied systems and data narratives. Foucault Welles (2014, p. 1) suggests searching for outliers and minorities within big datasets and putting them in focus but points out that "a large dataset quickly becomes small when you focus on a minority population".
Themes of marginalization and bias occur throughout the data feminist corpus that we surveyed. This is reflected in the phrases used in relation to big data, including bias (Diaz Martinez et al., 2020), violence (Hong, 2016), and oppression (Gieseking, 2018). Big data is described in the corpus as a part of a positivist, colonialist, imperialist, and capitalist paradigm (e.g., in Hoffmann, 2020; Corple & Linabary, 2019; Leurs, 2017).
In his 2018 study of the sociotechnical imaginaries of the European Commission, Rieder finds the utilization of stable phrases and a limited set of discursive elements recurring across the European Commission's policy documents.An additional reading of the policies published since 2018 that we conducted for this study corroborates that Rieder's findings still hold true.The variety of phrases represents different ways of conceptualizing big data.The phrases that occurred in the European Commission documents included new oil, gold mine, game changer, and magic material (Rieder, 2018).These are all highly positive metaphors that link big data with economic wealth.
Additionally, this way of framing big data feeds into the discourses of data as raw and isolated technical artifacts, as opposed to the data feminist understanding of data as a process (Cruz, 2020) or as a set of ongoing negotiations. Framing data as a resource, such as oil or gold, makes data disembodied, removed from the subjects who produce data or are targeted for data collection. On the other hand, data as a process implies that humans, together with the algorithms they design, make a multitude of decisions in shaping what is characterized as "raw data". The implication is that data-driven decision-making, whether done by humans or machines, does not build upon neutral data. Rather, it builds upon a sequence of (potentially biased) decisions, governed by various perspectives and ideologies, that create the appearance of neutral data.
Other important concepts used by the European Commission are key asset, "motor and foundation of the future economy", and the lifeblood of digital markets (Rieder, 2018). Similar rhetoric persists in the policies of the Commission: data is labeled as the lifeblood of economic development (EC, 2020) and as an essential resource (EC, 2022a). These phrases point in different directions, with data being portrayed either as a stepping stone toward prosperity or as an entity that enables and fuels successful systems and practices. However, all the phrases employed by the European Commission have a strong connection to capitalist vocabulary. Imaginaries of big data in the European Commission hinge upon three storylines: big data as the cornerstone of a thriving data economy; big data as a way to transform and improve public services; and big data as a tool for evidence-based decision-making (Rieder, 2018). Prominent themes in data feminism are visibility and representation, power and domination, and vulnerability and exploitation.

Visibility and representation
Topics of under- and misrepresentation, as well as data bias, have been extensively covered in the data feminist corpus. The topics of silencing voices of minorities based on gender, sexual orientation, and race are characteristic of intersectional data feminism and occur across the material (e.g., Cruz et al., 2020; Ruberg & Ruelos, 2020; Gieseking, 2018). Ruberg and Ruelos (2020) state that data play a role in valuing or devaluing the identities and experiences of marginalized groups. This paradox of exposure (D'Ignazio & Klein, 2020b) is evidenced in simultaneous advocacy for better representation and arguing against collecting data on already vulnerable and marginalized populations. The argument is that classification systems precede data collection and processing. Therefore, critical scrutiny of the ideologies underpinning the classification system itself is needed, particularly as they concern marginalized groups. Ideally, these groups should be considered during the construction of classification systems to avoid structural bias within the big data that is created. However, visibility increases exposure to over-policing, surveillance, and discrimination (Favaretto et al., 2019; Corple & Linabary, 2019). As Snapp et al. (2016, p. 135) noted, "[t]he right for participation, and thus representation in data, science, and policy, is often understood as conflicting with the right for protection, that is, safety from the disclosure of a marginalized orientation or identity". Moreover, O'Neil (2017) stresses that certain big data systems can be specifically designed to exploit vulnerable people, by steering them toward questionable universities and other financial scams. According to dystopian imaginaries, the complex problems of the paradox of exposure are likely to increase in the future, as today's big data sets are unlikely to be redesigned and will only grow in size and scope.
Bogers et al. (2020) discuss online representations of pregnancy as featuring predominantly white, able-bodied, cis women, while overlooking non-white, disabled, and LGBTQ+ people. This is an example of a taken-for-granted classification system, implicitly constructed to ignore certain groups. For Hong, a woman who is "allowed" visibility must be "white, pleasant, and subversive only to the extent that she does not disturb the expectations of normal feminine behavior" (2016, p. 3). Extending this line of thought, classification systems can be built upon conservative notions of core concepts such as gender and will therefore be unable to change over time by, for instance, considering non-binary gender identities. Within dystopian imaginaries, big data resources change in size only, growing ever larger, while the core ideas of classification remain stagnant.
Overrepresentation is also discussed by Bogers et al. (2020) in what they call power shadows, or the hypervisibility of a selected identity or group that casts a shadow rendering other identities or groups less visible.That numerous classification systems of big data do not typically highlight the experiences of minoritized groups is an issue with no simple solution.Big data does not account for all people, and as currently often utilized, is not good at representing the experiences of minorities (Gieseking, 2018).Categorization and datafication of fluid identities and minorities is therefore an issue highlighted by data feminism in relation to big data.As Gieseking argued, datasets representing queer people may never be big enough in scale (2018, p. 150).Moreover, analytical tools are not built to see data concerning queer lives, which means that existing data are not immediately legible when they are available (Ruberg and Ruelos, 2020).These dystopian imaginaries signal a future in which the traits of individuals and minorities are overpowered and disappear within power shadows.
Regarding the ways machines can both see and overlook people, Agostinho (2019) notes that optical metaphors enhance the role of the senses and heighten the conviction of knowability. According to Agostinho, sensory imaginaries of big data as microscopes or magnifiers perpetuate the misconceptions of inherent objectivity and reliability. This perspective can be applied to both the European Commission policy documents and our data feminist corpus. Similarly, Gieseking (2018) points out the inherent masculine undertones of equating the bigness of data with legitimacy. Data feminist texts adopted these concepts to enter the discussion, but few have explicitly characterized the implications of using them. The concepts utilized by the European Commission, as analyzed by Rieder (2018), imply that the adoption of masculine terminology also draws from capitalist and economic narratives. Indeed, before programming became a male-dominated profession, the title computer had little pedigree, and was predominantly used to describe women who performed computations on early computers. In this instantiation, engineers were considered to do the real work. This changed as programming became more prestigious, and alongside the change in terminology, the number of women in computing dropped as its perceived importance increased. The concerted effort to align technology with masculinity ultimately resulted in the underrepresentation of women in computing (Van Oost, 2000; Terras & Nyhan, 2016). Several authors explore this phenomenon, both in literature on the history of women in computing and in analyses of the perceived pedigree of professions in relation to gender distribution within them.
In the majority of the European Commission policies and the data feminist corpus we examined, the notion of privacy and what constitutes personal data is implied, but not explicitly discussed.In her study of privacy in the context of the digital economy, Weinberg (2017, p.16) writes: "[w]ithout critically engaging the underlying assumptions of the categories of public and private, the forms of political resistance against data exploitation remain tethered to presuppositions about the liberal democratic subject".Weinberg (2017, p. 11) also argues that corporations and governments have found ways to aggregate data by using the "fiction of the sovereign subject to resist surveillance" in ways that are technically legal, and by fragmenting the information about individuals and combining them into mass collections of data.This imaginary warns of a future where the boundary between the private and the public has been renegotiated, blurred, and provided with false legitimation.

Power and domination
Critical data perspectives discuss data as a form of power. The data feminist materials in our corpus actively contribute to that discussion. Hoffmann (2020) investigates instrumental, structural, and symbolic power and how big data can underwrite and manipulate the informational bases of other forms of power. Access to data is one form of power, alongside the power to decide what is or is not included in datasets, and the extent to which certain topics are represented. Bogers et al. (2020) and Cooky et al. (2020) are among the authors who identify the absence of women in data streams as an issue of power. Leurs (2017) equates power differences grounded in gender and race with the imperialistic and sexist design of technological systems. Thompson (2020) in particular reflects on the kinds of power given to "factual" numerical data that is intended to represent all people, despite its underrepresentation of women, and how this affects governmental, educational, and legal systems. Foucault (1980) famously discussed the inextricable connection between knowledge and structures of power. Many feminists have been inspired by Foucault and go further than associating power with knowledge. Rather, data feminists conceptualize data, the bedrock of knowledge, as associated with power structures.
Situating data used across various systems as representations of real-world conditions of power means that data bias has material consequences within and outside the digital. Cooky et al. (2020) pinpoint access to data that privileges powerful corporate bodies as an example of inequality in power dynamics. Moreover, Suarez and Gonzalo (2019) note that companies and corporations have control over data flows and benefit from them, while citizens whose data are collected know little of what is collected and to what end. These imaginaries warn of a future (and a present) where big data is used to shift power from citizens to corporations. Classification systems utilizing big data are developed so quickly and with such complex consequences that governments simply cannot design appropriate legislation fast enough. As such, the imaginaries of Big Tech drive the transformation of society. Domination of and through data is a recurring motif in the data feminist corpus (e.g., McQuillan, 2016; Luka & Millette, 2018). Suarez and Gonzalo (2019) call it data domination, conceptualizing data as a tool for those already in power to increase or solidify their dominance over vulnerable groups. It is also implied that big data serves to reinforce existing structures of dominance and plays a role in building new structures of oppression. In our corpus, data feminists are concerned with power and disruption of solidified power structures, especially where they (re)inforce discriminatory practices. Indeed, big data is repeatedly related to exploitation, violence, and power that are exerted over groups of people (e.g., Weinberg, 2017; Gieseking, 2018; Leurs, 2017). Some data feminist authors also claim that big data is capable of inflicting and facilitating harm and violence (e.g., Hoffmann, 2020; Cooky et al., 2018; Luka & Millette, 2018). According to Hoffmann (2020, p. 4), these harms reproduce "racist, sexist, and other norms and stereotypes that position some people as subordinate, inferior, or irredeemably 'other'".
Some of the explicit forms in which harm is inflicted through big data include surveillance, harassment, commodification, privacy issues, and abuse, most notably of minorities (e.g., in Cooky et al., 2020; Corple & Linabary, 2020; Hoffmann, 2020). For Hoffmann (2020), data violence occurs when people are labeled and classified, and can be material, symbolic, and representational.
Embedded in the topics of power, violence, and domination through technology are discussions about capitalism and colonialism in relation to datafication. Whereas data feminist texts in our corpus direct attention to complex social issues that would require significant time for key actors to negotiate and address, European Commission policy texts convey a sense of urgency to capitalize on big data (Rieder, 2018). These policy texts argue that the EU needs to act collectively as an early adopter of innovations developed through big data practices. Otherwise, the next wave of dominating digital corporations will be US-based, just like Apple, Google, Amazon, Microsoft, and Meta. Framing big data as a resource reiterates the capitalist agenda and a narrative of data as a raw material that needs to be exploited. It should be noted that the European Union has made significant attempts at regulating security, privacy, and surveillance online. Such inroads are made through legislation, primarily the General Data Protection Regulation (GDPR) from 2018, and the Digital Markets Act (DMA) and the Digital Services Act (DSA) from 2022.
Our data feminist corpus also draws parallels between the corporate capture and selling of user data and the capture of territory to extract value and exert control (e.g., in Corple & Linabary, 2020;McQuillan, 2016).Gieseking (2018) notes that big data cannot be disconnected from historical examples of domination and exploitation, without replicating them in the digital context.For the same author, the perceived objectivity and authority of big data is derived from "masculinist, racist, colonialist, ableist, and heteronormative structural oppressions" (2018, p. 150).Denying the context in which data are collected and organized and the oppressions that they reflect allows for the perpetuation of the same destructive practices.

Exploitation and vulnerability
Both our and Rieder's analyses suggest that within the European Commission imaginaries, big data is described as fostering a large economic potential that needs to be "tapped into" (EC, 2020, p. 1) and "exploited" (Rieder, 2018, p. 5; EC, 2022b, p. 1; p. 47). Rieder states that the broader European imaginary of a technological race with the United States has a large influence on the formation of big data imaginaries of the European Commission. This broad imaginary further influenced the perspective of the Commission and the actors that figure in policies. There is a sense of urgency, a race to exploit data before someone else does it. For the Commission, balancing power on an international scale is of great importance, and seizing opportunities as well as protecting European citizens is a priority. A sense of missing out on the first wave of digital innovation dominated by Big Tech companies based in Silicon Valley fuels the urge to jump on the figurative big data train (Rieder, 2018; Burgelman et al., 2010). Because of the urge to tap into the resource that is big data, European Commission policies situate companies in Silicon Valley and national governments as key players. This is in stark contrast to data feminist texts, which place minority groups and local communities at the center of discussion about big data. Rieder found that in the policies of the European Commission, the benefits are "believed to outweigh any potential harm", and detrimental effects are considered exceptions to that rule (2018, p. 6).
In the sociotechnical imaginaries of the European Commission, high velocity of data collection is equated to increased technological advances, economic prosperity, as well as the identification and mitigation of societal issues. Data feminist authors in our corpus are particularly concerned with how big data mining renders people vulnerable (Favaretto et al., 2019; Leurs, 2017) and further oppresses the already marginalized (Gieseking, 2018). The bottom-up approach of data feminism highlights detaching data from bodies as a structural feature. Envisioning data as embodied and situated challenges the discourses about the need for exploitation that are identified as part of a "gold rush" mentality (Luka & Millette, 2018). Centering the bodies and locations from which knowledge arises (Corple & Linabary, 2020) helps to counter big data neo-positivism. Across the corpus, data are envisioned as introducing or reinforcing various types of harm to vulnerable groups. As such, there is a notion of resistance against (D'Ignazio & Klein, 2020b; Gieseking, 2018) and refusal of (Linabary et al., 2020) big data oppression.
The differences in perspective between the European Commission and data feminist corpora are visible in the actors that are considered, the language that is employed, and how power is perceived and seen to be enacted. Within data feminism, exploitation through big data is repeatedly connected to colonialism (e.g., Cooky et al., 2020; Linabary et al., 2020; Dionne, 2019; McQuillan, 2017). Likewise, exploitation often denotes economic exploitation through datafication (e.g., Weinberg, 2017) but in several instances refers to the exploitation of free labor on social media (e.g., Corple & Linabary, 2020; Cooky et al., 2018). The focus on workplace optimization through big data has been shown to have negative consequences on the workloads for health workers (Dionne, 2019), women (Michailidou, 2018), and people of color (Cooky et al., 2018; Corple & Linabary, 2020; Michailidou, 2018). Some authors explore how the use of big data for research, particularly when utilizing social media, can make researchers complicit in inflicting harm (Corple & Linabary, 2020; Luka & Millette, 2018).
The struggle for the power to capture and shape political and social imaginaries influences the directions of technology development and common understandings of what constitutes attainable futures. Subsequently, the tension between dominant and alternative imaginaries is ultimately the struggle to sketch out boundaries of current understandings of the possible and future (technological) developments. In our data feminist corpus, big data is understood in a variety of messy and fuzzy ways. It is described as susceptible to change and imagined as a process. Big data is not depicted as a fixed technical artifact; rather it is understood as determined by social contexts and decisions made during stages of collection and processing. The data feminist corpus highlights big data as created by someone, for someone, and for a specific purpose, indicating an awareness of big data as a sociotechnical phenomenon. In these texts, the conceptualization of big data entails multiplicity, and as such, allows for different perspectives and interpretations. This multiplicity introduces nuance into sociotechnical imaginaries and decenters the Big Tech narratives. Conversely, the European Commission corpus engages with big data as a thing that simply is and tends to overlook the implications of big data for society beyond its alleged potential to improve the economy. Describing data as a raw material overlooks the choices during data collection, the decision-making processes during integration into various systems, and the social consequences after integration; in other words, it overlooks the social aspects of big data.
The data feminist corpus we analyzed is often aligned with what boyd and Crawford (2012) have called the dystopian rhetoric, problematizing power and highlighting the potential for bias and perpetuating discrimination.Dystopian rhetoric is therefore often collapsed with critical perspectives stemming from the feminist tradition.While data feminist texts do offer visions of optimistic futurities with big data, the focus is on critical inspection and questioning of big data-related systems and practices.The European Commission corpus is more consistent with the utopian rhetoric, visible in the use of metaphors like oil and raw material.Characterizing sociotechnical imaginaries of big data as dystopian and utopian sociotechnical imaginaries based on the type of rhetoric they predominantly employ helps contrast and compare the differences in corpora.However, it also collapses critical engagement with big data behind the dystopian rhetoric.In fact, critical perspectives are often lacking in the European Commission corpus, while sociotechnical imaginaries in the data feminist corpus are richer and more nuanced.Four themes separate these particular sets of sociotechnical imaginaries, addressed in the following text.

The concept of data
Discussions of data in the data feminist corpus are very rich. Numerous sub-themes can be noted regarding how data is collected, processed, and owned. In this aspect, the themes align with those most prominently described in critical data studies. Critical data studies cover topics of how datafied systems serve and privilege certain groups (O'Neil, 2017; Zuboff, 2019) while overlooking or discriminating against others (e.g., Benjamin, 2020; Eubanks, 2019; Noble, 2018). Data feminist texts add to the critical discourse by conceptualizing data as a process, embedded in politics, enabling discussions of datasets as created within a context and never fully representative of reality. Moreover, various strategies for teaching computers how to collect and categorize data are scrutinized within a firmly established feminist tradition.
The policy documents of the European Commission portray a far less nuanced notion of data.As argued by Rieder (2018), the concept is described as a resource that needs to be mined and thereafter can produce value as the new oil.
Our own additional reading reaffirms this portrayal in more recent policies.Through this rhetoric, the European Commission upholds and perpetuates decidedly positive imaginaries of big data (Rieder, 2018).The language used in the policies is aimed at creating excitement by using phrases like game-changer and magic material.This mythology of big data (boyd & Crawford, 2012) creates a sense of urgency to exploit and capitalize on big data.Sociotechnical imaginaries of the European Commission rest upon the notion of a utopian future that will unfold with the mining and exploitation of big data.
Our study highlights the differences between the conceptualization of the relationship between personal data and big data in the corpora.An understanding of knowledge as situated gives feminist conceptualizations of big data a multiplicity in a way that is not present in the policy documents of the European Commission.However, it should be noted that in both data feminist and in European Commission documents, big data is often used as a shorthand to denote personal data.It is important to note that not all big data refers to personal information but avoiding explicit discussions of personal data altogether obscures the instances when it does have certain implications.While the problematic distinction between public and private spheres has been a longstanding discussion in feminist and data feminist discourse, these complexities are collapsed and simplified in the sociotechnical imaginaries of the European Commission.In the case of the European Commission and its characterization of big data as a resource, understanding personal information collected through datafication as a commodity would require consideration of compensation for citizens.In the imaginaries identified in the European Commission's texts, citizens are technology-oriented and flexible, separate from and wholly unaffected by the presence, ordering, and classification of data.Moreover, in the Commission's view, big data is a positive and stable fixture of European society.Thus, whatever data are available today, they will remain so tomorrow, making them ideal for exploitation and economic gains.

Micro-, meso-, and macro-perspectives
The first theme concerns how the perspective levels differ in the data feminism and European Commission corpora. We have demonstrated that the top-down perspective of the Commission differs in implications from the bottom-up perspective of data feminists. Building on this, we introduce different ways that macro-, meso-, and micro-perspectives are present within these texts.
Firstly, our analysis shows that the European Commission works with the top-down macro perspective, while data feminism engages simultaneously with macro-, meso-, and micro-perspectives. This is particularly visible in the way that data feminists position access to and utilization of data as a power negotiation, and how they build upon the strong tradition of discussing patriarchal macrostructures.
Secondly, in keeping with a feminist tradition, there is a strong trend of investigating local experiences, which aligns with the micro perspective. Data feminists are interested in how classification biases privilege the powerful, while underserving marginalized groups such as women, people of color, and LGBTQ+ communities. Even though texts in the data feminist corpus define individuals and groups as resilient and holding agency, they perceive big data as capable of inflicting harm and rendering some people vulnerable to exploitation and oppression. However, optimistic visions of big data futures occur in data feminist texts as well. While they argue that big data is inherently tied up with masculine rhetoric and needs to be reframed and redefined, data feminists also see big data as a persistent element in the imaginaries of the future.
Finally, meso perspectives are visible through an emphasis on how institutions engage with and provide support for vulnerable people. In feminist imaginaries of data futures, big data often figures as a facilitator and a vehicle for capitalist narratives and aspirations. The feminist focus on localities and identifying vulnerable groups within established structures gives rise to imaginaries of domination, violence, and exploitation through data. Dystopian data feminist imaginaries build on different categories of data, allowing for rich, critical discussions of future outcomes. By comparison, the top-down perspectives of the European Commission are tried and found wanting.

Addressing majorities and minorities
Dystopian imaginaries build upon the ways in which minorities and vulnerable groups in society may fare as big data technologies are increasingly incorporated into everyday experiences and interactions. Because, as we argue, the European Commission builds policy with dominant communities in mind, citizens are positioned as robust and flexible. However, this focus on an imaginary majority arguably creates vulnerable groups in society who have little power or voice within institutions and are likely to suffer the consequences of being underserved. The commodification of data takes on different shapes for dominant groups compared to minority groups. However, much of data feminism builds upon intersectionality. This allows for critical engagement with imaginaries of exploitation that raise questions of what big data does to people, as well as who is prioritized as a citizen and how that prioritization perpetuates the privilege and discrimination described by critical race and intersectional feminist theory. The topic of hypervisibility and subsequent over-policing of marginalized groups, which is reiterated and amplified in datafied systems, is additionally complicated in the context of feminist traditions that seek to enhance the visibility and representation of marginalized identities. Such engagement with dystopian imaginaries also provides rich perspectives that are missing in utopian imaginaries.

Economy and human values
Following an intersectional feminist tradition, data feminisms center people as the focus of big data narratives.This is demonstrated by the language used in our data feminist corpus, which highlights that women, nonbinary people, people of color, and LGBTQ+ people are particularly exposed to discrimination, exploitation, and violence through data and classification.From this perspective, inequalities, bias, and discrimination come to the forefront.
Conversely, the European Commission is primarily concerned with continuously molding the European Union into an IT superpower that can rival the many successes of the United States since the 1990s. Betting on the development of big data within the EU is therefore deemed desirable and even necessary. In the policy documents, this is not contextualized as a political choice fraught with underlying discourses of power and domination; it is instead framed as a logical imperative to stimulate European economic growth.
Macroeconomic aspects are not the focus of the data feminist corpus to the same extent that they are for the European Commission.Instead, there is an emphasis on the private sphere of experience which privileges the importance of basic human values.Several texts in the corpus equate mining data as a resource for profit with colonial tendencies.By contrast, the language used in the European Commission documents implies much less critical consideration of how, exactly, infiltrating data markets will result in universal benefits and prosperity.Data feminism entails a bottom-up perspective with a focus on local communities and experiences.As a result, it positions vulnerable groups and minorities as key actors in discussions of big data.In their position as policymakers, members of the European Commission employ a top-down perspective of power that prioritizes the protection of European Union citizens.For the Commission, Big Tech companies in Silicon Valley, national groups, and governments are central agents of interest, while data feminist texts often do not afford them the same degree of importance.
In data feminism, big data is a multiplicity of things, always fluid and under development, and understanding big data requires an understanding of the diverse underlying perspectives through which data are generated.For the European Commission, big data is something that can be used and reused, which implies a fixed entity and focus on openness and exploitation over variability and understanding data contexts.
Imaginaries of both data feminism and the European Commission feature futures inclusive of big data and the systems built upon it. However, while imaginaries of the European Commission reflect economic narratives and a top-down perspective, data feminist bottom-up perspectives highlight new forms of power and structural imbalance that signify potential for harm to distinct groups of people. For the Commission and its economic imperative, the focus is on sharing, generating, and exploiting data. That perspective reflects the position of power to govern and aims to increase the financial gain and leadership role of the European Union with the help of big data. Situating big data in this way supports the idea that citizens of the EU will only benefit from its increased presence in society. However, in our data feminist corpus, questioning who benefits from datafication and whose livelihoods are jeopardized figures at the forefront.
Imaginaries within data feminist texts are considerably richer than those identified within the European Commission policy documents. Imaginaries present in data feminism build upon a broad and inclusive view of different intersectional social perspectives (macro, meso, micro) that allow for recognition of a multitude of vulnerable groups, a highly sophisticated understanding of data, as well as a focus on human values over economic growth. Without conscientious and intentional integration of (data) feminist and other perspectives, dominant Big Tech imaginaries will prevail in the narratives of European policymakers. In this study, we focused on the sociotechnical imaginaries present in data feminism research. Future studies could identify existing big data practices and projects that inhabit and act within data feminist perspectives, and how they work with, use, and resist datafication when necessary.

FUNDING STATEMENT AND ACKNOWLEDGMENTS
We want to thank Merisa Martinez for helpful comments and for bringing to our attention Melissa Terras and Julianne Nyhan's work on Father Busa's female punch card operatives. A big thank you to Maja Krtalić, Pieta Eklund, and Marija Marčetić for taking the time to read this text, offer helpful advice, and take part in a brainstorming session.

Table 1. Collection of the data feminist corpus

Rieder's (2018) groups and localities are central within data feminist texts, whereas legal bodies, businesses, and governments are the actors in focus for the European Commission. If the utopian imaginaries are, therefore, all-inclusive and symmetrical, all citizens across the 27 member states are to be treated in the same way by emerging algorithms. Our own, as well as Rieder's (2018), readings of European Commission documents indicate that citizens are imagined as strong and healthy and flexible enough to adapt to change. Interestingly, class, race, age, and ability are barely mentioned in relation to big data. Data feminism focuses on the marginalized and the vulnerable in society, those who are most likely to suffer when citizens are classified according to various big data plans.

Table 2. Encoding of main differences in data feminist texts and European Commission policy texts