Here is a critique of the ‘substitution framing’ common in media coverage of AI’s impact on scientific careers — a framing that misses the more consequential story of what rigorous, AI-augmented data analysis could achieve. Want to keep your job? Send this to your colleagues.

The Wrong Conversation Is Happening

A recent article in Nature posed the question now circulating in every department meeting, grant committee, and HR budget conversation across the research world: which science jobs will AI take? The article interviewed researchers. It identified coding positions as already obsolete. It offered modest reassurance to bench scientists and senior investigators. And it framed the entire issue as a matter of competition — human analysts versus AI systems, fighting over the same territory.

This framing is not just wrong. It is actively harmful to the profession.

When organizations internalize the substitution narrative — when department heads and hiring managers conclude that AI can ‘do data analysis’ and therefore fewer analysts are needed — they make a category error with serious consequences. They confuse the automation of low-level analytical tasks with the elimination of the need for sophisticated analytical thinking. And in doing so, they set themselves up to produce research that is faster, cheaper, and substantially worse.

This article is a direct message to data analysts: you are not becoming obsolete. You are becoming the most important person in the room. But only if you seize the moment, expand your conception of what your job actually is, and refuse to let your employers treat AI as a cheaper version of you.

The real threat to data analysts is not that AI will replace you. It’s that your organization will use AI as an excuse to stop investing in rigorous methodology. They will suffer badly if they do - and it’s your job to protect them.

What AI Actually Does — and What It Doesn’t

To understand the opportunity, it helps to be precise about what AI tools currently do well and where they fall flat. This is not a binary. The capacity of these systems is not evenly distributed across the analytical workflow, and the distribution matters enormously.

AI coding assistants (tools like ChatGPT, Grok, Claude, GitHub Copilot, and similar systems) are genuinely impressive at generating syntactically correct, reasonably idiomatic code for standard analytical tasks. Given a description of a dataset and an objective, they can produce workable R or Python for descriptive statistics, common regression models, data wrangling operations, and basic visualization. For the kind of code that a first-year graduate student or a research programmer might produce after their third similar project, AI is a legitimate substitute. This is the part the Nature article got right. But the article overemphasizes the importance of being able to generate (really, re-generate) source code for tasks that thousands of libraries already handle, and have handled for decades. What AI brings is ease of access. But, as in all things, you get what you pay for.

It does no good to own a complex machine if you do not have the instruction manual. It does no good to have an instruction manual if you cannot read it.

Here are some examples of what AI does not do, at least not without a skilled analyst directing it:

It does not recognize when a study design is fatally flawed unless asked - and often even then it will miss details. AI will fit a regression model on data with classic confounding structure and hand you coefficients without flagging that the estimates are uninterpretable. It will run a difference-in-differences analysis on data that violates parallel trends without telling you the identifying assumption is likely broken. It will produce a beautifully formatted results table from a convenience sample and note, at most, a boilerplate caveat about generalizability.

It will not know which measures are independent, how to update degrees of freedom, or that it should learn, not assume, the Type I/Type II error trade-off… the list of what it will not bring to the table in a novice’s hands goes on and on.

It does not make appropriate decisions about model specification. AI will include or exclude variables as instructed, but it lacks the domain knowledge and causal reasoning to distinguish a confounder from a cofactor or mediator, to recognize when a control variable opens a collider path, or to understand why adjusting for post-treatment variables biases estimates. These are not edge cases in research. They are central to nearly every analysis that matters.
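
The difference between a confounder and a collider is easy to state and easy to get wrong, so here is a minimal simulation sketch of it. Everything in it is an assumption made for illustration: the data are synthetic, the variable names are hypothetical, and the relationships are linear with standard normal noise. It adjusts for a third variable by residualizing (the Frisch-Waugh-Lovell approach) so that it needs nothing beyond the Python standard library.

```python
import random

def slope(x, y):
    """OLS slope of y on x (one predictor plus intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(target, control):
    """Remove the linear effect of `control` from `target`."""
    b = slope(control, target)
    mc, mt = sum(control) / len(control), sum(target) / len(target)
    return [t - mt - b * (c - mc) for c, t in zip(control, target)]

def adjusted_slope(x, y, control):
    """Slope of y on x after adjusting both for `control` (Frisch-Waugh-Lovell)."""
    return slope(residualize(x, control), residualize(y, control))

rng = random.Random(42)
n = 5000

# Scenario A: z is a confounder (z -> x, z -> y); the true effect of x on y is 0.
z = [rng.gauss(0, 1) for _ in range(n)]
x_a = [zi + rng.gauss(0, 1) for zi in z]
y_a = [2 * zi + rng.gauss(0, 1) for zi in z]
confounded_naive = slope(x_a, y_a)                 # biased away from 0
confounded_adjusted = adjusted_slope(x_a, y_a, z)  # close to the true 0

# Scenario B: c is a collider (x -> c <- y); x and y are truly independent.
x_b = [rng.gauss(0, 1) for _ in range(n)]
y_b = [rng.gauss(0, 1) for _ in range(n)]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x_b, y_b)]
collider_naive = slope(x_b, y_b)                   # close to the true 0
collider_adjusted = adjusted_slope(x_b, y_b, c)    # spurious negative slope

print(f"confounder: naive={confounded_naive:.2f}, adjusted={confounded_adjusted:.2f}")
print(f"collider:   naive={collider_naive:.2f}, adjusted={collider_adjusted:.2f}")
```

Under these assumptions the naive slope in the confounded scenario lands near 1 even though the true effect is 0, while "controlling for" the collider manufactures a slope near -0.5 between two independent variables. The same line of code, adding a covariate, fixes one analysis and breaks the other. Nothing in the data tells you which case you are in; that comes from knowing the causal structure.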

It does not tell itself to use objective model selection criteria, or to use machine learning to prevent model overfit.

It does not evaluate the quality of evidence it synthesizes. When asked to summarize a literature, AI can produce fluent, well-organized prose that treats a severely underpowered study the same as a large pre-registered trial. It has no native capacity to weight evidence by methodological quality, to identify when apparent consensus reflects shared methodological limitations across studies, or to flag when heterogeneity in results is almost certainly driven by design differences rather than true moderators.

It does not know the difference between a study that has been distorted by p-hacking or biased by inclusion/exclusion criteria and one that has not been compromised in these ways.

It does not protect against researcher degrees of freedom. Perhaps most critically, AI has no stake in the integrity of a finding. Left to run unsupervised, it will cheerfully execute whatever analysis produces a result, without any internal pressure toward the kind of pre-commitment that gives research its evidentiary value.

This means that it will treat all comparisons as equally important, and in the case of multiple hypothesis testing, it will not be selective enough to know which results may have particular implications for downstream clinical translational efforts.
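
To show how much the choice of correction matters, here is a small sketch of two standard procedures: Holm's step-down method, which controls the family-wise error rate, and the Benjamini-Hochberg step-up method, which controls the false discovery rate. The p-values are invented for illustration, and the function names are mine rather than any particular library's.

```python
def holm(pvals, alpha=0.05):
    """Holm step-down procedure: controls the family-wise error rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: controls the false discovery rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    largest_passing_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            largest_passing_rank = rank
    reject = [False] * m
    for i in order[:largest_passing_rank]:
        reject[i] = True
    return reject

# Hypothetical p-values from ten comparisons in one study
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_uncorrected = sum(p < 0.05 for p in pvals)
n_holm = sum(holm(pvals))
n_bh = sum(benjamini_hochberg(pvals))
print(n_uncorrected, n_holm, n_bh)
```

With these made-up numbers, five comparisons look "significant" uncorrected, one survives Holm, and two survive Benjamini-Hochberg. Which of those three answers is appropriate depends on what a false positive costs downstream, and that is a judgment the analyst has to make and defend.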

These gaps are not temporary inconveniences that will close in the next model release. They are, in a meaningful sense, structural. The things AI cannot do are precisely the things that require judgment, domain knowledge, causal reasoning, and an understanding of what makes evidence convincing — capacities that accumulate through years of practice and that currently reside in people, not models.

The 20th Century Problem

Here is the uncomfortable truth that data analysts need to confront before they can make the argument for their own value: many analytical practices still dominating research — in academia, in industry, in government — are genuinely outdated, severely limited, and misleading. They predate the computational and statistical tools that could replace them. They persist not because they are the best available methods, but because they are what supervisors learned, what reviewers expect, and what organizations have calcified around.

Consider what is still treated as standard practice in large portions of empirical research. P-value thresholds as binary classifiers of truth. Analysis plans decided, tacitly or explicitly, after looking at data. Sensitivity analyses conducted selectively and reported when favorable. Effect size heterogeneity dismissed rather than investigated. Measurement models assumed rather than tested. Standard errors that treat clustered or autocorrelated observations as independent. Literature reviews that count studies rather than weighting them. These are not obscure methodological complaints. They are widely recognized problems — the replication crisis has made them famous — and yet they persist because reforming them requires both the methodological knowledge and the institutional will to push back on comfortable habits.
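
To put a number on just one item in that list, consider clustered observations treated as independent. The sketch below uses the standard design-effect approximation, DEFF = 1 + (m - 1) × ICC, which assumes equal cluster sizes. The sample size, cluster size, and intraclass correlation are hypothetical values chosen for illustration.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc

# Hypothetical study: 2,000 participants in clusters of 50, modest ICC of 0.05
n_total, cluster_size, icc, sd = 2000, 50, 0.05, 1.0

deff = design_effect(cluster_size, icc)
naive_se = sd / math.sqrt(n_total)          # pretends all observations are independent
clustered_se = naive_se * math.sqrt(deff)   # approximate SE once clustering is respected
effective_n = n_total / deff                # independent observations you effectively have

print(f"DEFF={deff:.2f}, naive SE={naive_se:.4f}, "
      f"clustered SE={clustered_se:.4f}, effective n={effective_n:.0f}")
```

Even with an intraclass correlation that looks negligible, the 2,000 observations in this example carry the information of roughly 580 independent ones. An analysis that ignores this reports standard errors that are too small by a factor of almost two.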

AI, properly used, is a powerful tool for attacking every one of them - and for generating improvements in data analysis procedures. This can only occur if there is a methodologically curious analyst in the room who knows what the problems are and has the standing to insist they be addressed. The risk is not that AI will replace good analytical practice. The risk is that organizations will use AI to enforce the execution of bad analytical practice faster, at lower cost, with more output — and then wonder why their research doesn’t replicate, their models don’t generalize, and their decisions keep failing.

AI can automate your workflow. Only you can determine whether that workflow is worth automating.

Pre-Registration and Analysis Plans: AI as a Partner in Rigor

One of the most concrete and immediately actionable ways AI augments the data analyst’s role is in the development of pre-registered analysis plans and data analysis plans (DAPs). This is an area where the tool is genuinely transformative, and where analysts who embrace it will produce work that is categorically more credible than what was achievable even five years ago.

Pre-registration solves a specific problem: when researchers make analytical decisions after seeing data, they introduce flexibility that inflates apparent evidence. The treatment of outliers, the choice of covariates, the operationalization of outcomes, the decision to use parametric versus non-parametric tests — each of these choices, made post-hoc, represents a researcher degree of freedom. Individually, each seems innocuous. Collectively, they create the conditions for false positives to accumulate and for findings to not replicate.
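
A rough sense of how quickly those innocuous choices add up can be had from a few lines of code. This sketch makes a deliberately simplified assumption: each analytic choice behaves like an independent test of a true null hypothesis, and the researcher reports a finding if any one of them crosses p < 0.05. Real forking paths are correlated, so the inflation in practice is usually smaller than this, but the direction is the same.

```python
import random
from statistics import NormalDist

def any_hit_rate(k, alpha=0.05):
    """Chance that at least one of k independent null tests is 'significant'."""
    return 1 - (1 - alpha) ** k

def simulated_any_hit_rate(k, n_sims=20000, alpha=0.05, seed=0):
    """Monte Carlo check: draw k null z-statistics per study, report if any passes."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        if any(abs(rng.gauss(0, 1)) > z_crit for _ in range(k)):
            hits += 1
    return hits / n_sims

for k in (1, 3, 5, 10):
    print(f"{k:2d} analytic paths: analytic={any_hit_rate(k):.3f}, "
          f"simulated={simulated_any_hit_rate(k):.3f}")
```

Under this assumption, five undisclosed choices are enough to push the false positive rate from the nominal 5% to more than 20%. A pre-registered plan removes the choices, not the analyst.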

Want a humbling wake-up call? Have Grok, ChatGPT and Claude all critique the methodology of all of your past results in papers you have co-authored. Ask them:

“Look for incompleteness in design, sources of unstudied bias, threats to generalization, and any other problems in the design of analysis or study design considering the specific aims and hypotheses this paper has proposed to have tested.”

Then ask them:

“How could these problems be avoided? Consider specific formal steps in study design, design of analysis and other important roles of data analysts in science. Bring all of 20th- and 21st-century knowledge of proper and rigorous data analysis to bear. Provide full citations and describe the expected improvement.”

No matter how good of a data analyst you think you are, you will learn to be a curious methodologist.

The historical barrier to thorough pre-registration was not unwillingness. It was effort. Writing a genuinely comprehensive analysis plan — one that specifies not just the primary analysis but the full decision tree for handling missing data, outliers, non-normality, model non-convergence, unexpected subgroup distributions, and competing model specifications — is a substantial undertaking. Analysts knew they should do it. They often didn’t have the time to do it well.

AI eliminates this barrier almost entirely. An analyst who understands the study design, the measurement instruments, the likely data quality issues, and the inferential goals can work with an AI system to develop analysis plans of a comprehensiveness that was previously impractical. They can stress-test the plan by asking the AI to identify scenarios they haven’t anticipated. They can check internal consistency: do the specified primary and secondary analyses measure what they intend to measure? They can ask “How many valid DAPs could be designed here, and which ones might be preferred, and why? Enumerate the strengths and weaknesses of each. Then I’ll tell you my preferred priorities of rigor, and you can make a recommendation on the best options.” The final decision, of course, is the analyst’s.

Important questions can be pondered when you team up with AI. Do the exclusion criteria create potential selection artifacts? Are the power calculations consistent with the stated minimum detectable effects? What does a power curve here look like?
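
For readers who have not drawn one, a power curve is simply power plotted against sample size for an assumed effect. Here is a minimal sketch using a normal approximation for a two-sided, two-group comparison of means with equal group sizes. The standardized effect size of 0.3 is an assumption chosen for illustration; for small samples a t-based calculation would be more accurate.

```python
from statistics import NormalDist

def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

sample_sizes = [20, 50, 100, 200, 400]
power_curve = [(n, round(power_two_sample(0.3, n), 3)) for n in sample_sizes]
for n, p in power_curve:
    print(f"n per group = {n:4d}  power = {p:.3f}")
```

If the stated minimum detectable effect and the planned sample size land on the flat, low part of this curve, the power section of the protocol and the recruitment plan contradict each other. That is exactly the kind of inconsistency worth catching before the first participant is enrolled.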

Crucially, this process forces the analyst into explicit engagement with assumptions they might otherwise leave implicit, or leave to habit or, worse, familiarity. The act of explaining a planned analysis to an AI system in enough detail for it to help develop the plan is itself a discipline. It surfaces ambiguity. It requires precision. It makes gaps visible before they become problems.

Organizations that have not yet adopted pre-registration as a standard practice are, in 2026, behind the curve. An analyst who can lead that transition — who can use AI to make rigorous pre-registration practical and not merely aspirational — is not just doing their job well. They are performing a function that directly protects the credibility of every piece of research the organization produces.

Study Design: Moving the Conversation Upstream

The highest-leverage moment in any research project is before data collection begins. Decisions made at the design stage — about sampling frames, randomization procedures, measurement instruments, treatment protocols, follow-up periods, and outcome hierarchies — shape everything that follows. They determine what causal claims are defensible, what heterogeneity can be explored, what sample sizes are needed, and what will happen when the inevitable deviations from protocol occur.

Historically, data analysts were often brought in late: after design decisions had already been made, sometimes after data had already been collected. This is the analytical equivalent of being handed a finished building and asked to evaluate whether the foundation is sound. It is not a position from which much can be done. That needs to end.

AI gives analysts a powerful argument and a powerful tool for moving upstream. The argument is methodological: the analytical implications of design choices are complex enough that someone who understands analysis needs to be at the table when design decisions are made. The tool is practical: analysts can now use AI to rapidly model the consequences of design alternatives in ways that make the argument concrete.

Investigators can also now ask the same questions of a study design or DAP they have been handed by data analysts… but they do not fully understand the context, the jargon, and how the entire machine fits together. Bonferroni? Maybe. Family-wise instead? Likely. How about comparing or combining the Akaike Information Criterion with Mallows’s Cp, after testing for multicollinearity, of course! Run an independent test set of people in this subgroup to test for abductive accuracy, sensitivity, specificity and the ROC AUC?

To those in science who think their analysts are superfluous, AI will not offer these up. And your peers who keep their analysts will notice your limitations.

What does that look like in practice? It might mean running simulation studies to show how different sample size allocations affect power across a range of plausible effect sizes and variance structures. It might mean modeling the bias implications of different attrition patterns on a proposed longitudinal design. How about simulating the clinical trial to set a minimum size below which emergent post-hoc subgroups will not be analyzed because power is too low? It might mean stress-testing a proposed matching strategy by generating synthetic data with known confounding structure and demonstrating what the estimator recovers under different assumptions. It might mean rapidly reviewing the measurement properties of proposed instruments and flagging whether their reliability is sufficient for the planned between-group comparisons.
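
As one example of what such a simulation can look like, here is a minimal sketch of the attrition case. It simulates a two-arm trial in which the treatment has no effect at all, then compares completers only. The dropout model is an assumption made for illustration: 20% of the control arm drops out at random, while in the treated arm the chance of dropping out rises as a participant's outcome gets worse.

```python
import random
from statistics import mean

def completer_difference(rng, n_per_arm=500, dropout_slope=0.0):
    """One simulated trial with NO true effect.

    Returns the treated-minus-control mean among participants who completed.
    """
    control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    # Control arm: 20% dropout, unrelated to the outcome.
    control_done = [y for y in control if rng.random() >= 0.2]
    # Treated arm: dropout probability rises as the outcome gets worse (lower).
    treated_done = []
    for y in treated:
        p_drop = min(max(0.2 - dropout_slope * y, 0.0), 1.0)
        if rng.random() >= p_drop:
            treated_done.append(y)
    return mean(treated_done) - mean(control_done)

def average_bias(dropout_slope, n_trials=200, seed=7):
    """Average completer-only estimate across many simulated null trials."""
    rng = random.Random(seed)
    return mean(completer_difference(rng, dropout_slope=dropout_slope)
                for _ in range(n_trials))

bias_random_dropout = average_bias(0.0)
bias_outcome_dropout = average_bias(0.15)
print(f"random dropout:          {bias_random_dropout:+.3f}")
print(f"outcome-related dropout: {bias_outcome_dropout:+.3f}")
```

When dropout is unrelated to outcome, the completer analysis is centered on the true effect of zero. When the struggling participants in the treated arm quietly leave, the same analysis shows a benefit that does not exist. Bringing a table like this to a design meeting turns "we should plan for attrition" from an opinion into a demonstration.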

I should mention that random allocation with matching becomes much easier, and learned empirical weighting of particular attributes of study participants during matching will likely emerge as a new area of methodological research.

Gibberish to you? If you don’t know what a Type III error is in statistics without googling it, don’t fire your analysts.

These are all things that skilled analysts knew they should do. AI makes them fast enough to do routinely. And the analyst who shows up to a design meeting with this kind of analysis — who can quantitatively demonstrate that Proposed Design A will produce uninterpretable results while Design B is tractable — has a kind of influence that no amount of methodological authority alone could purchase.

The analyst who moves upstream into design is not exceeding their role. They are finally occupying it fully.

Machine Learning: Expanding the Inferential Toolkit

A significant portion of analysts working in applied research settings — in public health, social science, economics, policy evaluation, clinical research — were trained primarily in regression-based methods. This is not a criticism. For many inferential tasks, regression-based estimators are appropriate, well-understood, and defensible. But the machine learning toolkit has expanded the range of questions that can be answered well, and AI substantially lowers the barrier to competent application of these methods.

Consider the problem of heterogeneous treatment effects. Standard regression approaches estimate average effects. They can incorporate interactions, but interaction testing is underpowered, and the choice of which interactions to test is itself a researcher degree of freedom. Methods like causal forests and other double machine learning estimators are specifically designed to recover effect heterogeneity from observational or experimental data — to identify subgroups for whom an intervention works well or poorly — without requiring the analyst to specify in advance which moderators matter. AI tools can help analysts implement these methods correctly, understand their assumptions, interpret their output, and explain their logic to non-technical stakeholders.

Choosing which type of prediction algorithm is best suited to a given problem no longer has to be a matter of familiarity (limited by know-how) or personal preference. AI can help.

Similarly, modern methods for handling missing data — multiple imputation by chained equations, full information maximum likelihood estimation — are well-established but still routinely replaced in practice by listwise deletion or simple mean imputation, approaches that produce biased estimates whenever data are not missing completely at random. AI lowers the implementation cost of doing this correctly.
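
The bias from listwise deletion is easy to see in a small simulation. In this sketch the outcome y is missing more often when an observed covariate x is high, so the data are missing at random given x but not missing completely at random. To stay dependency-free it uses single regression imputation as a simplified stand-in for multiple imputation or FIML. That is enough to show the bias in the point estimate, but it understates uncertainty, which is the reason proper multiple imputation exists. All data and missingness rates are invented for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]       # fully observed covariate
y_full = [xi + rng.gauss(0, 1) for xi in x]   # outcome; true mean is 0

# y is missing 70% of the time when x > 0 and 10% of the time otherwise
observed = [rng.random() >= (0.7 if xi > 0 else 0.1) for xi in x]

xs = [xi for xi, seen in zip(x, observed) if seen]
ys = [yi for yi, seen in zip(y_full, observed) if seen]

# Listwise deletion (simple mean imputation gives the same estimate of the mean)
complete_case_mean = mean(ys)

# Regression imputation: fit y ~ x on complete cases, predict the missing y values
mx, my = mean(xs), mean(ys)
b = sum((a - mx) * (c - my) for a, c in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
a0 = my - b * mx
filled = [yi if seen else a0 + b * xi for xi, yi, seen in zip(x, y_full, observed)]
imputed_mean = mean(filled)

print(f"complete-case mean: {complete_case_mean:+.3f}")
print(f"imputed mean:       {imputed_mean:+.3f}")
```

The complete-case mean is pulled well below the true value of zero because the high-x, high-y participants are the ones most likely to be missing. Using the information in x recovers an estimate close to the truth. The analyst's job is to know which missingness mechanism is plausible for the study at hand, because no method rescues data that are missing for reasons nobody measured.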

Regularization approaches for high-dimensional covariate adjustment — LASSO, ridge, elastic net — allow analysts to control for large sets of potential confounders without the overfitting and specification uncertainty that plague traditional stepwise selection. Doubly robust estimation methods protect against misspecification of either the outcome model or the propensity score model. Synthetic control methods extend the logic of difference-in-differences to settings with a single treated unit and long pre-treatment time series.

What about models that consider variable A OR variable B OR variable C instead of A+B+C?

None of these methods are new. Many have been available for years. What has changed is the cost of applying them well. AI assistants reduce implementation friction, help analysts understand when methods’ assumptions are plausible in their specific context, and help translate technical output into plain language for reports and presentations. The analyst who uses AI to bring these methods into their standard toolkit is not doing something exotic. They are doing what methodological progress requires: applying the best available tools to the questions at hand.

What about the question: “Here is my study design and DAP. What alternatives exist that are more rigorous? How can I improve these without breaking the study’s budget? Will this design let us answer the question(s) we are asking?”

Evidence Synthesis: Finally Doing It Right

Literature review and evidence synthesis are areas where current practice is, in many fields, remarkably primitive relative to what the methodology supports. The standard approach — search a database, read abstracts, read selected papers, write a narrative summary — is susceptible to well-documented biases: publication bias, confirmation bias, anchoring to prominent studies, failure to account for methodological quality, and inability to integrate quantitative estimates across studies with precision.

Systematic review and meta-analysis address these problems, but conducting a rigorous systematic review is enormously time-intensive. For many research questions, the effort required to do it properly has meant it simply wasn’t done. Instead, researchers read selectively, remembered what confirmed their priors, and wrote narrative reviews that reflected the literature they happened to encounter.

AI changes this calculation in several ways. First, the screening phase of a systematic review — reviewing titles and abstracts for inclusion criteria — can be substantially accelerated, making systematic approaches tractable for a much wider range of questions. Second, and more interestingly, AI can assist in extracting standardized data from included studies, including sample characteristics, design features, outcome measures, effect sizes, and methodological quality indicators. (Do not trust AI to perfectly grab these important details, however. Run other AIs on the results to error-check). Third, AI can help analysts think through heterogeneity — not just estimating average effects across studies, but reasoning about what study characteristics predict effect size variation and what that implies about where an effect is likely to generalize.

But the most important contribution AI can make to evidence synthesis may be in helping analysts reason about discordant evidence — cases where studies reach different conclusions. When studies disagree, the naive approach is to count votes: more studies support X, therefore X. This is almost always wrong. Studies disagree for reasons, and the reasons matter. Disagreements often trace to differences in sample characteristics, operationalizations of the independent or dependent variable, analytic approaches, or methodological quality. Understanding the source of disagreement is usually more informative than averaging across it.

(Watch for a new method on calculating evidentiary weights of null results in the discordant result setting. It’s very cool.)

AI, directed by an analyst who understands what they’re looking for, can help systematically characterize study-level moderators, evaluate whether effect size variation is consistent with sampling error alone (the question that heterogeneity statistics like I² partially address, though often misinterpreted), and generate hypotheses about what design features or population characteristics explain divergent results. This moves evidence synthesis from the realm of narrative rhetoric — ‘most studies show...’ — toward something that more closely resembles scientific inference.
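
The gap between vote counting and weighting can be shown with a handful of numbers. This sketch computes a fixed-effect, inverse-variance pooled estimate along with Cochran's Q and I². The five effect sizes and standard errors are hypothetical, and the fixed-effect model is itself a simplifying assumption: with heterogeneity like this, a random-effects model or a meta-regression on study characteristics would usually be the next step.

```python
def fixed_effect_summary(effects, std_errors):
    """Inverse-variance pooled estimate with Cochran's Q and I-squared."""
    weights = [1 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    pooled_se = (1 / total_w) ** 0.5
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared

# Hypothetical study-level estimates: two large precise studies, three small ones
effects = [0.10, 0.45, 0.50, 0.05, 0.40]
std_errors = [0.05, 0.20, 0.25, 0.06, 0.22]

# Vote counting: how many studies are individually "significant" at the 5% level?
votes = sum(abs(e / se) > 1.96 for e, se in zip(effects, std_errors))

pooled, pooled_se, q, i_squared = fixed_effect_summary(effects, std_errors)
print(f"significant studies: {votes} of {len(effects)}")
print(f"pooled effect: {pooled:.3f} (SE {pooled_se:.3f}), Q={q:.2f}, I2={i_squared:.0%}")
```

In this invented example three of five studies are individually significant, which a vote count would read as support for a sizeable effect. The pooled estimate is about 0.11, close to the two precise studies, and I² is around 50%. The more useful question is why the small studies report effects four to five times larger than the large ones, and that is a question about design and publication bias, not arithmetic.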

The Advocacy Problem: Making the Case in Your Organization

Everything described above — richer pre-registration, upstream design involvement, expanded ML methods, rigorous evidence synthesis — requires something beyond technical skill. It requires organizational buy-in. And organizational buy-in, in the current climate, is not guaranteed.

The substitution narrative has real power with budget-conscious decision-makers. ‘AI can do data analysis’ is a sentence that is approximately true in the narrow sense that AI can execute analytical tasks, and it is deeply misleading in the sense that matters, which is whether the resulting analyses are methodologically sound and interpretively valid. The people who run organizations are often not in a position to evaluate this distinction themselves. They are making decisions based on cost and output, and AI looks like it delivers both.

This is where analysts need to be advocates, not just technicians. And advocacy requires translating methodological arguments into consequences that non-methodologists care about.

The replication argument: Research that uses best-practice methodology costs more to produce but is far more likely to replicate, generalize, and survive scrutiny. Research produced by cutting analytical corners — even efficiently, with AI assistance — is research that will embarrass the organization when it fails, as it increasingly will in environments where replication attempts are more common and pre-registration is more expected. You can debate with AI which particular Phase 2B studies are important and which are less likely to have impact.

The decision quality argument: For organizations that use data analysis to inform consequential decisions (clinical protocols, policy interventions, product development, resource allocation), the cost of an analysis that reaches the wrong conclusion is not the cost of the analysis. It is the cost of the decision made on the basis of that analysis. A faster, cheaper analysis that leads to a wrong decision is not a bargain. It is an expensive failure with a delayed invoice.

The credibility argument: Regulatory bodies, journals, funders, and sophisticated clients increasingly demand methodological transparency and rigor. An organization whose analytical practices are visibly behind the current standard will face growing barriers: rejected grants, desk rejections from journals, requests for re-analysis from regulators, skepticism from sophisticated partners. The analyst who upgrades the organization’s methodology is not adding overhead. They are protecting the organization’s ability to do business.

The AI-augmentation argument: This is the argument that directly counters the substitution narrative. AI does not replace the analyst who understands what good analysis looks like and uses AI to achieve it faster and more comprehensively. AI replaces the analyst who performs rote tasks without that understanding. The question for any organization is not ‘do we still need analysts now that we have AI?’ The question is ‘are our analysts using AI to do more rigorous, more comprehensive work, or are they being displaced by AI tools that will produce the same marginal analysis they were producing before, just more cheaply?’ The former is a capability upgrade. The latter is a quality downgrade disguised as efficiency.

What This Means for Your Career Right Now

Concrete steps matter more than abstract positioning. Here is what embracing this opportunity actually looks like in practice.

Learn to use AI tools fluently and critically. This means not just using them to generate code, but using them as analytical collaborators — asking them to check your reasoning, identify assumptions you haven’t stated, generate alternative specifications you should consider, and stress-test your conclusions. The analyst who uses AI only as a code generator is using a fraction of its value. The analyst who uses it as an active partner in methodological reasoning is operating at a fundamentally different level.

Develop expertise in methods that are underused in your field. AI lowers implementation costs across the board, but it lowers them most for methods that were previously costly to implement. If your field still relies heavily on null hypothesis significance testing without attention to effect sizes and uncertainty quantification, learn Bayesian approaches and the arguments for them. If heterogeneous treatment effects are routinely ignored, learn causal forests. If evidence synthesis is done narratively, learn systematic review methods. If you’ve been doing regression and ANOVA or ANCOVA for years, by god, in the name of all things objective, learn machine learning. Position yourself as the person who knows not just how to execute analyses, but which analyses to execute and why.

Insert yourself into the design phase. Every project you work on that involves data collection is a project where you should be involved before data collection begins. Insist on it. Make this a deal-breaker condition, and a standing argument. Frame it not as methodological preference but as risk management: the earlier analytical expertise is applied to a project, the less expensive it is to correct avoidable problems.

Lead pre-registration efforts. Use AI to make comprehensive pre-registration practical. Develop templates for your organization’s common study types. Become the person who knows what a good analysis plan looks like and can help teams produce one efficiently. This is a concrete contribution to research integrity that is also a concrete demonstration of your value.

Document what good methodology prevents. Keep track of cases where methodological scrutiny — yours or AI-assisted — caught problems that would have produced wrong conclusions. Build an internal record of prevented failures. Organizations respond to evidence, and the evidence that good analysis is worth paying for is most compelling when it is specific, local, and concrete. Include these in your faculty or staff evaluation reports.

The Actual Stakes

The Nature article was not wrong that some data-adjacent jobs will be displaced. They will be. The research programmer who writes standard packages, the analyst who runs the same regressions on every dataset without thinking carefully about what questions they can and cannot answer — these roles are, genuinely, under pressure. This is not comfortable, but it is real.

What the article missed is the countervailing force: the enormous unmet demand for exactly the kind of methodological sophistication that AI cannot replace. Science has a reproducibility problem. Applied research has a rigor problem. Evidence synthesis has a quality problem. Organizations making decisions on data have a validity problem. None of these problems are solved by AI that can generate code. They are solved by analysts who understand what rigorous analysis requires and use every available tool — including AI — to deliver it.

AI will not just allow more data analysis to be done faster. It will allow the same amount of data analysis to be done better.

The 20th century left us with methodological habits that made sense given the computational and statistical tools of their era. Many of those habits are no longer necessary, but they persist because change is hard and inertia is powerful. AI has changed the cost structure in ways that make better methods practical. But methods don’t change themselves. Organizations don’t reform their own practices without someone pushing.

Data analysts are in the best possible position to push. They understand the problems. They have access to the tools. They are present at the moment when analytical decisions are made. The question is whether they will define their role narrowly — as executors of whatever analysis they are asked to run — or broadly, as the people responsible for ensuring that the research their organizations produce is worth producing.

AI is not your replacement. It is the best argument you have ever had for doing the job you were always supposed to be doing.

The 20th century is over. Act like it.

Thanks for reading Popular Rationalism!