Junchen Liu*, Sven Elflein, Or Litany, Zan Gojcic, Ruilong Li*
*Equal contribution.
Test-time training (TTT) with KV binding as a sequence-modeling layer is commonly interpreted as a form of online meta-learning that memorizes a key–value mapping at test time. However, our analysis reveals multiple phenomena that contradict this memorization-based interpretation. Motivated by these findings, we revisit the formulation of TTT and show that a broad class of TTT architectures can be expressed as a learned linear attention operator. Beyond explaining previously puzzling model behaviors, this perspective yields several practical benefits: it enables principled architectural simplifications, admits fully parallel formulations that preserve performance while improving efficiency, and provides a systematic reduction of diverse TTT variants to a standard linear attention form. Overall, our results reframe TTT not as test-time memorization, but as learned linear attention with enhanced representational capacity.
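To make this connection concrete, the sketch below (a hypothetical minimal form, not the paper's exact layer) treats one TTT step with KV binding as a single SGD step on the binding loss 0.5·‖W k − v‖²: the resulting rank-1 update W ← W − lr·(W k − v) kᵀ is the delta rule, and dropping the W k kᵀ correction term leaves the Hebbian recurrence W ← W + lr·v kᵀ, i.e. the state of unnormalized linear attention. Function names and the plain-SGD setup are illustrative assumptions.

```python
import torch

def ttt_kv_binding_step(W, k, v, q, lr=1.0):
    """One TTT step: SGD on the KV-binding loss 0.5 * ||W k - v||^2, then a query read-out.
    (Illustrative minimal form; the learned components of the actual layer are omitted.)"""
    err = W @ k - v                    # dL/dW = (W k - v) k^T
    W = W - lr * torch.outer(err, k)   # rank-1 fast-weight update (delta rule)
    return W, W @ q                    # read-out is linear in the query

def linear_attention(ks, vs, qs):
    """Unnormalized linear attention: o_t = (sum_{i<=t} v_i k_i^T) q_t."""
    S = torch.zeros(vs.shape[-1], ks.shape[-1])
    outs = []
    for k, v, q in zip(ks, vs, qs):
        S = S + torch.outer(v, k)      # Hebbian accumulation of KV outer products
        outs.append(S @ q)
    return torch.stack(outs)
```

With `W` initialized to zero, the TTT recurrence differs from the linear-attention state only by the `W k k^T` correction term; for near-orthogonal keys (or a small learning rate) the two coincide, which is the sense in which this kind of TTT layer behaves as a (learned) linear attention operator.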
- 2026-04-30: Paper accepted to ICML 2026!
- 2026-04-24: Code released!
Experiment code lives in two repositories:
- LaCT (LLM and NVS experiments): https://github.com/JunchenLiu77/LaCT/tree/tttla
- ViTTT (Image classification experiment): https://github.com/JunchenLiu77/ViTTT/tree/tttla
If you find this work useful, please consider citing:
@misc{liu2026testtimetrainingkvbinding,
      title={Test-Time Training with KV Binding Is Secretly Linear Attention},
      author={Junchen Liu and Sven Elflein and Or Litany and Zan Gojcic and Ruilong Li},
      year={2026},
      eprint={2602.21204},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2602.21204},
}