Research

Research with Twi-XL

On this page, you can find information about research that has been, or is still being, conducted with Twi-XL.

  • Published work
  • Data collection notice for Twi-XL research

Published work

Twi-XL PhD candidate Iris Baas has raised the important question of “who is sharing the news”. Iris uses Twi-XL to approach this question by analysing link-sharing practices in the Dutch Twittersphere. Together with co-authors Marcel Broersma, Tommaso Caselli, and Marc Esteve Del Valle, she has published a paper on the “identity construction of news sharers on Dutch language Twitter”. The abstract is reproduced below, followed by the full reference.

Abstract:

This paper examines how self-ascribed identity in Twitter biographies relates to news-sharing behavior on the platform. Drawing on a dataset of Dutch Twitter users who shared links to news articles in 2021, this study applies contextual topic modeling and clustering techniques to analyze identity patterns at scale. While prior research on online identity has often relied on manual annotation or focused on elite actors, this work leverages computational methods to systematically identify thematic identity groupings and their relation to news dissemination. The findings reveal that political identity — particularly among right-leaning users — is often foregrounded in biographies, frequently articulated through oppositional or “anti”-statements. However, news sharing across identity groups remains largely centered on mainstream Dutch news outlets, indicating that ideological critique does not necessarily translate into alternative media consumption. This research also highlights the complexity of online identity construction, as users commonly blend social and professional affiliations in self-presentation. These results underscore the importance of identity cues in understanding how news circulates on social media and suggest that self-presentation plays a critical role in shaping users’ perceived credibility and influence. By offering a scalable approach to studying identity in relation to news sharing, this research contributes to broader debates on digital journalism, political communication, and platform-based public discourse.

Baas, I. M., Broersma, M., Caselli, T., & Del Valle, M. E. (2025). Who is sharing the news? Identity construction of news sharers on Dutch language Twitter. First Monday.

Data collection notice for Twi-XL research

This page contains the data collection notice for one of the PhD research projects using Twi-XL, conducted by Sarah Burkhardt. Her project undertakes a historical inquiry into the role of Dutch media in feminist activism and in mediations of sexual misconduct.

The research data in use is collected in the public interest as part of the PDI-SSH funded research project Twi-XL: An infrastructure for cross-media research on public debates.


Purpose of the research project

Overall, this PhD research seeks to understand the role and responsibilities of different media and their technical affordances in shaping not only the public discourse but also the power of (feminist) digital activist movements, in particular personal testimony campaigns.

More concretely, it engages critically with the Dutch media discourse around sexual misconduct and how that discourse is shaped by online movements like #MeToo (2017). Recent social media movements are thereby situated within a longer history of attempts to render visible “the struggle to end sexist oppression” (hooks, 1984) across different media. To better understand the power and role of media in shaping this discourse, the project maps the actors involved and investigates the role of spaces such as educational or governmental institutions, social media platforms, and legacy news outlets.

As part of this, the project creates access to, and historical knowledge about, the Dutch discourse on sexual misconduct by leveraging, building, and critically reflecting on Twi-XL as a cross-media research infrastructure. By designing Python-based Jupyter Notebooks for cross-media research within a humanities research agenda, the project aims more broadly to facilitate access to existing Dutch data collections and archives and to make them interoperable in critically reflected ways.


What data is collected?

This research leverages data from different archives hosted by collaborating Dutch institutions: the Dutch Twitter archive TwiNL, the Dutch public broadcasting collections hosted by NISV, and the Dutch web archive hosted by the KB. Twi-XL has collected approximately 50% of all Dutch language tweets from 2011 until the shutdown of the official Twitter API at the beginning of 2023.

The research investigates particular feminist hashtags such as #MeToo, which comprises approximately 200,000 Dutch tweets, and broader terms relating to sexual misconduct (rape, sexual violence, sexual intimidation, transgressive behaviour, sexism). Within Twi-XL, access to these tweets was provided only in pseudonymised form. There are no available links or IDs that allow tweets to be traced back, and the researcher strictly refrains from manually searching for tweets on the Internet on the basis of their text. Any citations will be carefully translated into English by hand, detaching them from their original Dutch wording and decoupling them from any personally identifiable information.

To study debates on Dutch public broadcasting, the Dutch NISV television archive is queried for keywords related to the issue of sexual misconduct (e.g., verkrachting, seksueel geweld, seksuele intimidatie). The dataset consists of approximately 8,500 broadcasts between 1960 and 2022. It contains broadcast metadata such as the date of screening, program and season title, program category, and short summaries, as well as automated speech recognition (ASR) transcripts that are analyzed quantitatively for shifts in language and discourse. It also contains links to individual broadcasts for closer inspection under authorized access to the CLARIAH Media Suite.
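The kind of keyword-based selection and quantitative overview described above can be illustrated with a minimal Python sketch. This is not the project's actual query or notebook code: the record layout (`year`, `transcript`), the function names, and the toy records are assumptions made purely for illustration, and only the three example keywords are taken from the text.

```python
from collections import Counter

# Keywords taken from the notice; everything else here is hypothetical.
KEYWORDS = ("verkrachting", "seksueel geweld", "seksuele intimidatie")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def broadcasts_per_decade(records, keywords=KEYWORDS):
    """Count keyword-matching broadcasts per decade of their screening year."""
    counts = Counter()
    for record in records:
        if matches_keywords(record["transcript"], keywords):
            decade = (record["year"] // 10) * 10
            counts[decade] += 1
    return dict(sorted(counts.items()))


# Invented toy records for illustration only (not archive data).
sample = [
    {"year": 1985, "transcript": "Een debat over seksueel geweld."},
    {"year": 1987, "transcript": "Het weerbericht voor morgen."},
    {"year": 2017, "transcript": "Reacties op #MeToo en seksuele intimidatie."},
    {"year": 2018, "transcript": "Rechtszaak over verkrachting."},
]

print(broadcasts_per_decade(sample))
```

Plain substring matching is used here only to keep the sketch short; a real analysis would likely need to handle spelling variants, inflected forms, and ASR transcription errors.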


Discomfort, Risks & Insurance

Given the sensitivity of the research subject (public discourse on sexual misconduct), the research findings might cause certain individuals, groups, or institutions to experience a sense of discomfort. Feminist theory in particular acknowledges such discomfort as a natural, if not necessary, part of critically investigating such topics. Nevertheless, anyone experiencing discomfort or holding concerns regarding their potential participation in or association with a particular discourse is strongly encouraged to reach out to Sarah personally via email.

Throughout the research, the security and protection of vulnerable individuals and groups are of utmost importance and guide the research agenda. The aim of this research is to inform future steps towards shaping a healthy and productive media discourse around sexual misconduct, both through the concrete involvement of institutional or political actors and through the technical design of media outlets.


Confidential treatment of personal data

The information gathered over the course of this research will be used for the purpose of this research project. Personal details will not be used in publications.

Twi-XL only provides pseudonymised tweet content and metadata with pseudonymised user IDs. The research will not disclose or share any personally identifiable information and uses special encryption measures to keep the data safe on a nationally hosted server. The data is only accessible to a small number of authorized research staff members.
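This notice does not describe how Twi-XL pseudonymises user IDs. As a general illustration of the concept only, the following Python sketch shows one common technique, keyed hashing with HMAC-SHA256, in which the same ID always maps to the same pseudonym but cannot be recomputed without the secret key. The function name, the key, and the example IDs are hypothetical and do not reflect the actual Twi-XL implementation.

```python
import hashlib
import hmac


def pseudonymise_user_id(user_id, secret_key):
    """Map a user ID to a stable pseudonym using keyed HMAC-SHA256.

    Assumption for illustration: the pseudonym is the first 16 hex
    characters of the digest. This is not Twi-XL's documented method.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical key and IDs; a real key would be stored securely and never
# hard-coded in a notebook or script.
key = b"example-secret-key"
first = pseudonymise_user_id("12345", key)
again = pseudonymise_user_id("12345", key)
other = pseudonymise_user_id("67890", key)

# The same ID yields the same pseudonym; different IDs yield different ones.
print(first == again, first != other)
```

A stable mapping of this kind lets researchers follow an account across tweets within the dataset without seeing the original identifier.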

Data will be stored for a period of 10 years (01.01.2023-01.01.2033). The personal data will only be stored as long as is necessary for the research and will be deleted as soon as possible.


Further information

For further information on the research project, please contact Sarah Burkhardt via the email address listed on her academic profile.

If you have any complaints or concerns regarding this research project, you can contact the secretary of the Ethics Committee of the Faculty of Humanities of the University of Amsterdam (Binnengasthuisstraat 9, 1012 ZA Amsterdam, The Netherlands).