
IMPACTeen: Intentions, Manipulation, Persuasion, Annotations, and Consequences in Teen Communication Dataset

Aleksander Szczęsny, Wiktoria Mieleszczenko-Kowszewicz, Maciej Markiewicz, Beata Bajcar, Tomasz Adamczyk, Jolanta Babiak

Published
Jun 15, 2026 — 16:16 UTC

Problem — The paper addresses the lack of datasets on social influence in adolescent communication across interpersonal, media-based, and digital interactions. Existing datasets rarely capture the perspectives of the different stakeholders involved in adolescent communication, such as parents, psychologists, and educators. The work offers a structured resource for researchers working on social influence detection and related fields. It is a preprint and has not undergone peer review.

Method — The authors developed the IMPACTeen dataset, which consists of 1,021 textual scenarios that exemplify social influence techniques in adolescent contexts. The texts were generated with constrained large language model (LLM) techniques and then passed through a two-step human editing and validation process intended to keep them realistic and relevant to youth contexts. Each text is annotated from five perspectives: teenagers, parents, psychologists, communication experts, and teachers. The annotations cover the presence of influence, the techniques employed, the intentions behind the influence, potential consequences, resistance strategies, reactions, and the annotators' confidence. The dataset is available in both Polish and English, which supports cross-lingual research.
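To make the annotation structure described above concrete, here is a minimal Python sketch of how one scenario and its per-perspective annotations might be represented. The class names, field names, perspective labels, and types are all assumptions for illustration; the summary does not describe the dataset's actual file format or column names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The five annotator perspectives named in the summary.
# The string labels themselves are hypothetical.
PERSPECTIVES = (
    "teenager",
    "parent",
    "psychologist",
    "communication_expert",
    "teacher",
)


@dataclass
class AnnotationRecord:
    """One perspective's reading of one scenario (hypothetical schema)."""
    text_id: str
    perspective: str
    influence_present: bool
    techniques: List[str] = field(default_factory=list)
    intention: Optional[str] = None
    consequence: Optional[str] = None
    resistance_strategy: Optional[str] = None
    reaction: Optional[str] = None
    confidence: Optional[int] = None  # the rating scale is an assumption

    def __post_init__(self):
        # Reject labels outside the five perspectives listed above.
        if self.perspective not in PERSPECTIVES:
            raise ValueError(f"unknown perspective: {self.perspective!r}")


@dataclass
class Scenario:
    """One textual scenario in both languages, with its annotations."""
    text_id: str
    text_pl: str
    text_en: str
    annotations: List[AnnotationRecord] = field(default_factory=list)

    def missing_perspectives(self) -> List[str]:
        """Return the perspectives that have not annotated this scenario yet."""
        seen = {a.perspective for a in self.annotations}
        return [p for p in PERSPECTIVES if p not in seen]
```

Keeping one record per perspective, rather than one merged label per text, preserves the differences between annotator groups that the dataset is designed to expose.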

Results — The authors report that the dataset contains 5,100 individual annotation records, which can be used to train and evaluate models for social influence detection. No performance metrics on benchmark tasks are disclosed; the multi-dimensional annotations are presented as a resource for studying social influence dynamics among adolescents. The authors suggest that the dataset can be used to explore annotator disagreement and to improve cross-lingual modeling.
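As an illustration of the annotator-disagreement analysis mentioned above, the following Python sketch computes two simple measures on a binary "influence present" label: the share of annotators who agree with the majority on a single text, and the raw agreement rate for each pair of perspectives. The input layout (tuples of text id, perspective, label) and the function names are assumptions, and these measures are not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def majority_share(labels):
    """Fraction of labels that match the most common label for one text."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def pairwise_agreement(records):
    """Raw agreement rate per pair of perspectives.

    records: iterable of (text_id, perspective, label) tuples
    (a hypothetical layout, not the dataset's actual format).
    Returns a dict mapping (perspective_a, perspective_b) to the
    fraction of shared texts on which the two gave the same label.
    """
    # Group labels by text so perspectives can be compared per text.
    by_text = defaultdict(dict)
    for text_id, perspective, label in records:
        by_text[text_id][perspective] = label

    agree = Counter()
    total = Counter()
    for labels in by_text.values():
        # Sort perspective names so each pair has one canonical key.
        for a, b in combinations(sorted(labels), 2):
            total[(a, b)] += 1
            if labels[a] == labels[b]:
                agree[(a, b)] += 1
    return {pair: agree[pair] / total[pair] for pair in total}
```

Raw agreement does not correct for chance; a chance-corrected statistic would be a natural next step, but which statistic the authors use, if any, is not stated in the summary.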

Limitations — The authors acknowledge that the dataset is limited to Polish and English versions, which may restrict its applicability in non-European contexts. Additionally, while the dataset aims to capture a wide range of social influence scenarios, it may not encompass all possible contexts or techniques relevant to adolescent communication. The reliance on LLMs for initial text generation may introduce biases inherent to the models used, which could affect the realism of the scenarios. Furthermore, the dataset’s validation process, while thorough, may still be subject to subjective interpretations of social influence.

Why it matters — The IMPACTeen dataset gives researchers a structured resource for studying social influence in adolescent communication. Its multi-perspective annotations make it possible to compare how different stakeholders perceive and react to the same instance of social influence, which is relevant to designing interventions and educational programs. Potential uses span psychology, communication studies, and machine learning applications focused on social influence detection. The preprint is available on arXiv under cs.AI.

Turing Wire

By Turing Wire editorial staff · Jun 15, 2026

Source: arXiv cs.AI