16 Epistemic Standards and the Next Generation of Scholars
Andreea Musulan, Université de Montréal, andreea.musulan@gmail.com
Jean-François Godbout, Université de Montréal, jean-francois.godbout@umontreal.ca
Abstract: TBD
AI usage statement: TBD
16.1 Introduction
Artificial Intelligence (AI) is fundamentally changing the practice of quantitative social science. While the discipline has long been defined by the technical demands of data collection, processing, and analysis, the tools now available to researchers and students have simultaneously raised expectations for knowledge production and lowered the barriers to producing it. Ultimately, AI is not just increasing productivity; it is decoupling the ability to produce analysis from the ability to understand and evaluate it, allowing for research contributions that are not directly built on a foundation of human ingenuity.1
16.2 Background
With the behavioural revolution in the social sciences (Dahl 1961), many departments required integrated methodological training, including statistics and programming. This produced a relatively small number of experts, along with specialized journals and conferences. AI appears to be following a similar pattern, with methods-oriented scholars most likely responsible for teaching and developing these tools. However, because AI is especially useful for working with language data and programming, the years of training that previously constrained the pool of quantitative scholars no longer limit the size of the community. AI also opens up the discipline to other subfields, particularly through text-as-data, the automation of research assistant tasks, and accessible coding and statistical guidance.
With respect to knowledge output, AI has expanded our ability to collect and process data, perform analyses, and report results (Karjus 2025; Filimonovic, Rutzer, and Wunsch 2025).2 In principle, this creates opportunities to contribute more efficiently to the advancement of the different social science disciplines. At the same time, it introduces new pressures on scholars to produce work that is meaningful and methodologically sound, while also increasing the number of scholars able to do this work. These pressures are related to the rate and quantity of research production, as well as technical sophistication and the amount of data being analyzed. While these pressures may be manageable for experienced researchers, for students they are reshaping the relationship between learning and skill acquisition.
16.3 Barriers
At present, the barrier to conducting quantitative social science research outside the domain of specialists is relatively low. Even undergraduate students can use Large Language Models (LLMs) to generate code, explain complex concepts, and produce analytical outputs.3 Tasks that previously required sustained effort to understand and internalize — such as programming and statistical reasoning — can now be completed with limited direct engagement. Rather than developing these foundational skills over time, students are increasingly able to compress the learning process through reliance on LLMs. This compression operates through reduced engagement with failure at all stages of the research process, including debugging and testing assumptions. Emerging evidence suggests that this reliance may be associated with weaker performance in tasks that require sustained critical engagement (Jošt, Taneski, and Karakatič 2024). The ability to recognize when an analysis is inappropriate, when assumptions are violated, or when results are misleading depends on an internalized understanding of the underlying processes. Without this, reliance on AI-generated outputs risks shifting the role of the researcher from active analyst to passive evaluator, with limited capacity to assess the quality of what is produced.
Historically, methodological capability in quantitative social science has been closely tied to statistical and programming expertise. These skills were not only instrumental for processing observational data and operationalizing concepts, but also for understanding the assumptions and limitations underlying empirical analysis that enable interpretation and validation. For example, preprocessing text data required familiarity with both the structure of the source data and the implications of different formatting and cleaning decisions. Today, AI tools can facilitate, and in some cases automate, many of these steps with minimal programming experience. While these tools remain embedded within broader systems of analytical dependencies that require informed oversight, they nevertheless expand what can be accomplished with relatively little technical training.
16.4 Developing expertise
This shift has important implications for how expertise is developed and evaluated. AI systems can assist with implementing analytical procedures and generating outputs, but they do not substitute for the ability to formulate strong research questions or to critically assess the appropriateness of an analysis and the validity of its results. Consequently, it becomes possible to produce work that appears methodologically sophisticated without a corresponding depth of understanding. This can lead to the creation of research outputs that involve misinterpretation of results, incorrect application of assumptions, or inappropriate tool selection. Although experienced researchers may be well positioned to preemptively identify and address these potential issues, this is less likely to be the case for those still in the process of acquiring foundational skills.
These dynamics will transform the scale of social science research. AI will increase both the number of scholars and their capacity to produce outputs, likely leading to a substantial expansion in research volume. While this may compromise quality, that outcome is not inevitable. However, excessive cognitive offloading introduces the risk that research capacity will expand more quickly than underlying competence. Earlier forms of automation, including statistical analysis and natural language processing, enabled scaling but remained limited to structured, quantifiable data. In contrast, AI now functions as a highly capable research assistant, able to produce and analyze massive amounts of text, video, audio, and image data at a scale unprecedented in social science.
As AI continues to diffuse across the discipline, its integration will extend into qualitative research, making it increasingly difficult to distinguish from quantitative approaches. Text-as-data will play a central role in this convergence, as both traditions rely heavily on textual analysis. As with the behavioural revolution of the 1960s, disciplines and departments will need to redesign training and teaching approaches. New methods will emerge alongside or in place of traditional tools such as surveys and regression. This will also prompt foundational ontological and epistemological debates, shifting focus toward the role of AI itself rather than the traditional quantitative–qualitative divide.
16.5 Next steps
Quantitative social science is thus well positioned to help define epistemic standards in this context and to coordinate such efforts across disciplines. This involves outlining expectations around the use of AI tools in particular, such as prompt documentation, model selection, and reproducibility (Munafò et al. 2017; Gebru et al. 2021). Scholars can develop frameworks for assessing methodological quality in an environment where tools are widely accessible but unevenly understood. Without this role, research quality risks declining, as few other areas are equipped to evaluate AI-driven methods with the same level of rigor.
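One way such expectations could be operationalized is a structured AI-usage record deposited alongside replication materials. The sketch below is illustrative only: the schema, field names, and model identifier are assumptions, not an established standard.

```python
import json

# Hypothetical schema for documenting AI use in a research workflow.
# Every field name here is illustrative, not a community standard.
ai_usage_record = {
    "model": "example-llm-v1",  # model identifier and version used
    "task": "classification of open-ended survey responses",
    "prompt": "Classify each response as supportive, opposed, or neutral.",
    "temperature": 0.0,  # decoding settings affect reproducibility
    "human_oversight": "all labels spot-checked on a 10% random sample",
    "date": "2025-01-15",
}

# Write the record so it can ship with the replication archive.
with open("ai_usage_record.json", "w") as f:
    json.dump(ai_usage_record, f, indent=2)
```

Recording decoding settings and the exact prompt matters because, unlike a fixed statistical routine, the same model can yield different outputs under different configurations.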
Although considerable attention has been given to how AI can be incorporated into research workflows with appropriate human oversight, less attention has been paid to how these tools are shaping the development of future scholars. This raises important questions for educational institutions. In particular, there is a need to consider how foundational skills can continue to be cultivated in an environment where many of the tasks associated with learning can be readily automated. Encouraging sustained engagement with core concepts, and ensuring that students develop the capacity to critically evaluate analytical outputs, will be essential for maintaining the integrity of the different social science disciplines. This has implications not only for training, but for the reliability and interpretability of research outputs at scale.
As a cross-cutting field, computational social science should play a central role in defining and teaching foundational skills in programming, model understanding, and advanced statistical analysis, much like the continued teaching of mathematics after the introduction of calculators. While AI automates increasingly complex tasks, this makes such training more important, not less. These skills are necessary to ensure that scholars retain the ability to interpret, evaluate, and control automated research outputs. Research integrity in the current technological context will therefore depend not only on individual expertise but also on the discipline’s ability to outline and apply methodological standards for the use of AI tools at scale.
1. For the central argument of this paper, it is prudent to distinguish between the augmentation of work using AI and its automation. While reliance on AI introduces “adverse consequences” (for example, through a lack of “domain expertise”) (Lei and Kim 2024, 251), incorporating AI as an augmentation tool has been shown to at least match the level of human performance in quantitative social science research (Brodeur et al. 2025, 3–4). Research incorporating AI has also been shown to be “more likely to be cited both within” and across scientific disciplines, although there is a “misalignment between AI use and AI education” (i.e., there is insufficient training relative to the extent of AI application in research) (Gao and Wang 2024, 2286–87).
2. The application of AI in quantitative social science has helped the field confront one of its most widespread challenges, overfitting, through “cross validation and regularization” (Zhang and Feng 2021, 281).
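The two safeguards named in this footnote can be sketched together: a minimal, NumPy-only illustration of ridge (L2-regularized) regression evaluated by k-fold cross-validation on synthetic data. The data-generating process and the penalty value are arbitrary choices for demonstration, not recommendations.

```python
import numpy as np

# Synthetic data: 20 predictors, but only the first carries signal,
# so an unpenalized fit is prone to fitting noise in the other 19.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)

def ridge_fit(X, y, alpha):
    # Closed-form ridge solution: (X'X + alpha * I)^-1 X'y.
    # alpha = 0 reduces to ordinary least squares.
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cv_mse(X, y, alpha, k=5):
    # k-fold cross-validation: fit on k-1 folds, score on the held-out
    # fold, and average the test mean squared error across folds.
    folds = np.array_split(np.arange(len(y)), k)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        beta = ridge_fit(X[train_idx], y[train_idx], alpha)
        residuals = y[test_idx] - X[test_idx] @ beta
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))

# Compare estimated out-of-sample error without and with a penalty.
print(cv_mse(X, y, alpha=0.0), cv_mse(X, y, alpha=10.0))
```

Cross-validation estimates how the model performs on data it was not fitted to, while the penalty term shrinks coefficients toward zero; together they are the standard guards against mistaking noise for signal.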
3. Recent research on the use of generative AI by doctoral students, particularly in more quantitative fields such as economics, suggests that its incorporation results in higher quality and quantity of outputs (Xu and Shen 2026). However, the fundamental concepts and skills are developed prior to doctoral research.