Evaluating coding agents on React retrieval tasks in complex, real-world codebases.
331 test cases across 14 pattern categories (HOC stacking, compound components, barrel re-exports, dynamic imports, render props, name collisions, and more).
The test cases are inspired by production codebases such as Cal.com, Excalidraw, LobeChat, and Plane. Given a natural-language description of a UI element, each resolver (the agent or tool under test) must identify the correct source file.
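As a rough illustration of the task shape, the sketch below models a test case and a per-category accuracy score. Everything in it is an assumption made for illustration: the `TestCase` fields, the `Resolver` signature, and the exact-path-match scoring rule are hypothetical and are not taken from the benchmark's actual harness.

```typescript
// Hypothetical shapes: the benchmark's real schema is not shown in this document.
interface TestCase {
  id: string;
  category: string;     // one of the pattern categories, e.g. "barrel re-exports"
  description: string;  // natural-language description of a UI element
  expectedFile: string; // path of the source file the resolver should return
}

// A resolver maps a description to a file path, or null if it gives up.
type Resolver = (description: string) => string | null;

// Normalize separators and a leading "./" so equivalent paths compare equal.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Assumed scoring rule: a case counts as correct only on an exact path match.
function accuracyByCategory(
  cases: TestCase[],
  resolve: Resolver
): Record<string, number> {
  const totals: Record<string, { hit: number; n: number }> = {};
  for (const c of cases) {
    const t = totals[c.category] ?? { hit: 0, n: 0 };
    const answer = resolve(c.description);
    if (answer !== null && normalizePath(answer) === normalizePath(c.expectedFile)) {
      t.hit += 1;
    }
    t.n += 1;
    totals[c.category] = t;
  }
  const accuracy: Record<string, number> = {};
  for (const category of Object.keys(totals)) {
    const t = totals[category];
    if (t) accuracy[category] = t.hit / t.n;
  }
  return accuracy;
}
```

Reporting accuracy per category rather than as one aggregate number makes it visible when a resolver handles, say, render props well but fails on barrel re-exports.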
Last benchmarked: 2026-03-15. Source & methodology.