The proven thing — what we already know works in classrooms, and why the question isn't settled yet.
Education research has produced, over the past two decades, something rare: a genuine consensus. A specific intervention, under specific conditions, produces learning gains at a scale and consistency that almost nothing else in the field can match. We know what it looks like. We know what it costs. We know which design details, if skipped, collapse the effect to zero. The name for it is high-impact tutoring, and it is not a new idea. It is a proven one.
That word — proven — is worth sitting with.
What the evidence actually says
The case for high-impact tutoring is built from a set of evidence syntheses that are unusually convergent. The J-PAL Evidence-Effect summary, the IES analysis of tutoring as a learning-loss recovery strategy, and the Annenberg Institute's design-principles brief all point at the same finding: tutoring, under the right conditions, produces learning gains roughly 15 to 20 times larger than standard tutoring, meaning the kind of generic, drop-in, once-a-week arrangement most people imagine when they hear the word. (Sources: J-PAL Evidence-Effect; IES blog.)
The gains translate to between three and fifteen additional months of learning per intervention. The Chicago Public Schools study, which ran AmeriCorps fellows as daily 2:1 tutors with low-performing students, found GPA gains of 0.58 points (roughly the distance from a C− to a C+ on a 4.0 scale) alongside attendance improvements of up to seven percent. These are not marginal numbers.
The design conditions that produce those numbers are specific. The Annenberg brief distills them into five: in-school-day delivery, at least three sessions per week, no more than four students per tutor, a consistent tutor relationship (the same person, not a rotating pool), and a paid, trained tutor rather than a volunteer. (Source: Annenberg / EdResearch for Action.)
Every one of those conditions is a design choice. And that specificity is what makes the proven thing interesting.
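The five conditions read naturally as a checklist, and a checklist can be written down. What follows is a minimal sketch in Python, not anything taken from the Annenberg brief: the `TutoringProgram` class, its field names, and the `unmet_conditions` function are hypothetical names chosen for illustration, and the example program is made up. Only the five conditions and their thresholds come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TutoringProgram:
    """Hypothetical description of one program's design (illustrative only)."""
    during_school_day: bool
    sessions_per_week: int
    students_per_tutor: int
    same_tutor_each_session: bool
    tutor_paid_and_trained: bool


def unmet_conditions(program: TutoringProgram) -> list[str]:
    """Return the design conditions, as listed in the text, that a program misses."""
    checks = [
        (program.during_school_day, "in-school-day delivery"),
        (program.sessions_per_week >= 3, "at least three sessions per week"),
        (program.students_per_tutor <= 4, "no more than four students per tutor"),
        (program.same_tutor_each_session, "consistent tutor relationship"),
        (program.tutor_paid_and_trained, "paid, trained tutor"),
    ]
    # Keep only the labels whose check failed, in the order listed above.
    return [label for ok, label in checks if not ok]


# Made-up example: after school, twice a week, rotating tutors.
example = TutoringProgram(
    during_school_day=False,
    sessions_per_week=2,
    students_per_tutor=3,
    same_tutor_each_session=False,
    tutor_paid_and_trained=True,
)
print(unmet_conditions(example))
```

The sketch treats every condition as pass or fail and reports which ones are missing rather than producing a score. That is a simplifying assumption: the evidence summarized here does not say how much each individual condition contributes to the effect.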
The 2022 finding nobody is citing
Here is the detail that makes the evidence base sharper than it first appears. In 2022, researchers studied virtual tutoring programs across eight US school districts. The programs used the same word, tutoring, and were funded, in several cases, explicitly on the strength of the high-impact tutoring evidence base. The result was no detectable effect. (Source: NCTQ research brief.)
Same label. No effect. Because the design conditions broke.
The sessions happened outside the school day rather than inside it. The tutors were online rather than present and consistent. The frequency dropped below three sessions per week. Each of those deviations individually reduced the probability of a strong effect. All of them together eliminated it entirely.
This finding is under-cited in the broader conversation about what works in education, which is itself a curious fact. It is the cleanest argument education research has for the proposition that design discipline matters more than the label. Calling something "tutoring" and funding it on the basis of what the evidence says about tutoring — but then implementing it without the conditions the evidence demands — produces a study that looks like evidence against tutoring, when it is actually evidence for design precision.
That distinction is not a technicality. It is the substance of the finding.
The spending picture
Against this evidence base, the 2026 spending picture is instructive. Global investment in AI in education has reached approximately $10.6 billion per year, with $800 million or more in federal and philanthropic grants flowing into AI-in-education programs in 2026 alone. (Source: ai-in-education-evidence-base-2026.)
The investment in scaling high-impact tutoring, at the same moment in time, is an order of magnitude smaller.
This is not a verdict on AI in education. The AI evidence base is being built in real time: Stanford's 2026 review of 800+ academic papers found twenty with strong causal evidence, which is not nothing, and is also not ten billion dollars' worth of certainty. The more interesting observation is about the ratio. Education's best-evidenced intervention, with fifteen to twenty times the effect size of standard tutoring and a clear design specification available in a publicly funded brief, is not receiving fifteen to twenty times the investment. It is receiving a fraction of it.
Why is that?
The design-discipline problem
The honest answer is that high-impact tutoring is hard to package.
A software subscription is deployable. An account administrator can roll it out in an afternoon. Usage is trackable on a dashboard. The marginal cost of adding one more student is low.
High-impact tutoring requires a trained adult in a room with one to four students, at least three times a week, during the school day, consistently, over time. The logistics are real. The cost per student is real. The scheduling conflicts with existing structures are real. These are not objections to the evidence — they are the honest description of what the evidence is asking for.
This is where the conversation becomes interesting rather than discouraging, because the question it opens is not "should we fund high-impact tutoring instead of AI in education?" That framing is a false binary, and not a very curious one.
The question it opens is: what if the proven thing is the floor, and AI's job is to raise it?
A different frame
Some of the more interesting 2026 work in education AI is attempting precisely this: delivering the design conditions of high-impact tutoring at a scale and cost that the human-only version can't. Consistent presence. Adaptive pacing. Persistent relationship. More than three sessions a week. Those are not easy problems for AI to solve: the OECD's 2026 Digital Education Outlook found that heavy reliance on generative AI tends to reduce metacognitive engagement, which is the opposite of what high-impact tutoring is designed to build. (Source: ai-in-education-evidence-base-2026.)
But the aspiration is clearly there, and the frame it points at is celebratory rather than skeptical. The proven thing has already told us what conditions matter. The research that established those conditions is publicly available, funded, synthesized, and specific. Any new intervention in this space — AI or otherwise — has a ready-made evaluation framework: does it replicate these conditions, and does it produce comparable learning gains?
That is not a low bar. It is the right bar.
What we're watching for
We are a venture studio with a long attention span. We will return to this question as the evidence base fills in — as AI tutoring programs mature, as effect sizes are measured against the high-impact tutoring benchmark, as the 2022 virtual-tutoring lesson either gets applied or gets forgotten again.
The pattern we notice, and want to keep noticing: education research has done the hard work of identifying what works. The proven thing exists. The design specification is public. The open question is not whether tutoring works — it is whether the conditions can travel, scale, and survive implementation without the design discipline collapsing.
That is a question about architecture more than about evidence. (Sources: high-impact-tutoring; aid.)
We are watching that question closely. Replies, when they come, come by letter.
Sources
Wiki pages drawn from
- concepts/high-impact-tutoring: the evidence base, design conditions, Chicago Public Schools study, 2022 virtual-tutoring finding, and open questions.
- topics/ai-in-education-evidence-base-2026: the spending picture, Stanford 800-papers / 20-strong-causal split, OECD metacognitive-engagement finding.
- concepts/aid: the Thewhat R&D path; high-impact tutoring as the human-centered baseline against which Aid-track work is measured.
External sources
- "High-quality tutoring — an evidence-based strategy to tackle learning loss" — Institute of Education Sciences (IES). https://ies.ed.gov/learn/blog/high-quality-tutoring-evidence-based-strategy-tackle-learning-loss
- "Tutoring programs lead to learning improvements" — J-PAL Evidence-Effect. https://www.povertyactionlab.org/evidence-effect/tutoring-programs-lead-to-learning-improvements
- "Design principles for effective tutoring" — Annenberg Institute at Brown University / EdResearch for Action, June 2024. https://annenberg.brown.edu/sites/default/files/EdResearch_for_Recovery_Design_Principles_1.pdf
- "High-impact tutoring — five ways to increase effectiveness with students" — National Council on Teacher Quality (NCTQ). https://www.nctq.org/research-insights/high-impact-tutoring-five-ways-to-increase-effectiveness-with-students/