Assisted-by:  |  Linux mainline · since 2026-01-01

How much of the Linux kernel is written by AI?

In context

click any square to zoom in

Models in the kernel

By vendor

who trained the model

By tool

how the model was invoked

Per-vendor model breakdown

merged tags only

Notable

two patches worth pointing at

Activity

Top authors

who wrote the patch

Top committers

who landed the patch

Recent commits

How this was made

methodology, sources, disclosure

Merged side

Shallow clone of torvalds/linux with --shallow-since="2026-01-01", then git log --all --grep="Assisted-by:" -i. Each merged commit and each Assisted-by: line is counted directly. The denominator is git log --since=2026-01-01 --oneline | wc -l on the same clone.
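The counting step can be sketched in a few lines of Python (parse_commits.py's internals are not shown here; function and variable names below are illustrative, not the script's actual API):

```python
import re

# Case-insensitive match for a trailer line such as
# "Assisted-by: Claude Opus 4.5 <noreply@anthropic.com>"
ASSISTED_RE = re.compile(r"^\s*assisted-by:", re.IGNORECASE)

def count_assisted(commit_messages):
    """Given an iterable of full commit message bodies, return
    (commits with at least one Assisted-by: line, total tag lines).
    A single commit may carry several tags, so the two counts differ."""
    commits = 0
    tag_lines = 0
    for body in commit_messages:
        hits = [ln for ln in body.splitlines() if ASSISTED_RE.match(ln)]
        if hits:
            commits += 1
            tag_lines += len(hits)
    return commits, tag_lines
```

Feeding it messages from git log --format='%B' reproduces the two numerators; the denominator stays the plain --oneline count.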

Submitted side

Submissions to lkml and subsystem lists are pulled with lei from lore.kernel.org/all: lei q -d mid -f mboxrd 'b:"Assisted-by:" AND d:20260101..'. Replies, cover letters (0/N), and bot accounts (Patchwork, kernel test robot, syzbot, 0day) are dropped. Patch series respins (v1, v2, v3) collapse to one entry per (canonical subject, sender). The Assisted-by: line must appear in non-quoted body text.
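The respin collapsing can be sketched as follows (a minimal version of what parse_lei.py does; names and the exact prefix-stripping regex are assumptions, not the script's code):

```python
import re

# Strip bracketed series prefixes, e.g.
# "[PATCH v3 2/7] mm: do the thing" -> "mm: do the thing",
# then lowercase so cosmetic subject edits between respins still match.
SUBJECT_PREFIX = re.compile(r"^\s*(\[[^\]]*\]\s*)+")

def canonical_subject(subject):
    return SUBJECT_PREFIX.sub("", subject).strip().lower()

def collapse_respins(messages):
    """messages: iterable of (subject, sender) pairs.
    Returns one entry per (canonical subject, sender), so v1/v2/v3
    respins of the same patch count once, while the same subject
    from a different sender stays a separate entry."""
    seen = {}
    for subject, sender in messages:
        seen.setdefault((canonical_subject(subject), sender), (subject, sender))
    return list(seen.values())
```

Keying on the sender as well as the subject is deliberate: two people independently posting "mm: frob the thing" should not be merged into one submission.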

Vendor / model / tool buckets

Each tag string is parsed into {vendor, model, tool}. Claude variants (Opus 4.5, 4.6, 4.7; Sonnet 4.5, 4.6) collapse cleanly. For wrappers (Claude Code, Cursor, GitHub Copilot, OpenCode, Kiro, Cody), the underlying model is attributed to the lab that trained it; the wrapper itself appears under "By tool" instead. Tags that do not parse (free-text, joke names, FOO:BAR.baz) are bucketed as Unknown.
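A minimal sketch of the bucketing logic, under stated assumptions: the lookup tables below are illustrative stand-ins, not the real parser's data, and the parser source on GitHub remains the authoritative answer for any bucketing question:

```python
# Illustrative tables only. Wrappers are tools, not vendors; the model
# inside a wrapper is credited to the lab that trained it.
WRAPPERS = {"claude code", "cursor", "github copilot", "opencode", "kiro", "cody"}
MODEL_VENDORS = {
    "claude": "Anthropic",
    "gpt": "OpenAI",
    "gemini": "Google",
}

def parse_tag(tag):
    """Parse an Assisted-by: tag string into {vendor, model, tool}.
    Anything matching no known model family falls into Unknown."""
    text = tag.strip().lower()
    tool = next((w for w in WRAPPERS if w in text), None)
    for family, vendor in MODEL_VENDORS.items():
        if family in text:
            return {"vendor": vendor, "model": family, "tool": tool}
    return {"vendor": "Unknown", "model": None, "tool": tool}
```

So "Claude Opus 4.5 (via Claude Code)" buckets the model under Anthropic and the tool under Claude Code, while "FOO:BAR.baz" lands in Unknown.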

What the page does not show

Merge rate (different humans use different tools for different patch types; the ratio is not a model quality signal). Submitted-versus-merged percentages (same reason). Authorial intent or motivation behind any tag string. Patches that landed without disclosure (this page measures policy compliance, not actual AI usage). Share of kernel-wide lines: GitHub's stats API refuses repos with over 10k commits, and computing diffstats from the partial clone would take hours; the AI-side line counts stand on their own.

Reproducibility

Source on GitHub: snek-git/assisted-by. Three small Python scripts: parse_commits.py (merged side), parse_lei.py (submitted side), build_data.py (assembles the JSON payload). refresh.sh runs the full pipeline end to end.

About this page

Built with assistance from Anthropic's Claude (Opus 4.7, via Claude Code) by a human collaborator. Tag string normalisation choices are judgement calls; the parser source on GitHub is the authoritative answer for any "why was X bucketed as Y?" question. Where contributors have publicly explained their setup (e.g., Greg KH's local LLM fuzzer rig), the page links the source; where they have not, it does not guess at intent.