Code Indexing & Navigation

Code indexing pre-computes a structured index of the symbols, references, and relationships in a codebase, so that navigation queries (go-to-definition, find references, call hierarchy) can be answered quickly without re-analyzing the code on every request.
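To make the idea concrete, here is a minimal in-memory sketch of such an index. The names `Location`, `SymbolIndex`, and the sample file paths are hypothetical and chosen only for illustration; they do not come from SCIP, LSIF, or any other format mentioned on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A position in the codebase (hypothetical shape: file path plus line)."""
    path: str
    line: int


class SymbolIndex:
    """Toy index mapping a symbol name to its definition and reference sites."""

    def __init__(self):
        self.definitions = {}                # symbol -> Location
        self.references = defaultdict(list)  # symbol -> [Location, ...]

    def add_definition(self, symbol, loc):
        self.definitions[symbol] = loc

    def add_reference(self, symbol, loc):
        self.references[symbol].append(loc)

    def go_to_definition(self, symbol):
        # A dictionary lookup; no source code is re-analyzed at query time.
        return self.definitions.get(symbol)

    def find_references(self, symbol):
        return list(self.references.get(symbol, []))


# Populate the index once, ahead of time (sample data is made up).
index = SymbolIndex()
index.add_definition("parse", Location("parser.py", 10))
index.add_reference("parse", Location("main.py", 3))
index.add_reference("parse", Location("cli.py", 27))
```

Both queries reduce to dictionary lookups, which is the point of paying the analysis cost up front. A real index would also need scoped or fully qualified symbol names, since a bare name such as `parse` is rarely unique.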

The main design tension is between index freshness and cost. A full re-index on every change is expensive, but a stale index gives wrong answers. Formats like SCIP and LSIF try to standardize what gets stored, while systems like Glean and Kythe focus on how to store and query it at scale.

Loupe’s index format has three requirements. It must be language-agnostic (no assumptions about a specific language), incremental (cheap partial updates when files change), and fast (navigation queries answered with low latency even on large codebases).
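One way these three requirements could fit together is sketched below. This is not Loupe's actual format; `IncrementalIndex`, `toy_extract`, and the per-file content hash are assumptions made for illustration. The language-specific part is isolated in a pluggable extractor, a file is re-indexed only when its content hash changes, and lookups go through a pre-built reverse map.

```python
import hashlib
from collections import defaultdict


class IncrementalIndex:
    """Toy per-file index; only files whose content changed are re-indexed."""

    def __init__(self, extract):
        # extract(path, text) yields (symbol, line) pairs. It is the only
        # language-specific piece; the index itself stores plain facts.
        self.extract = extract
        self.file_hash = {}                # path -> content digest
        self.file_symbols = {}             # path -> [(symbol, line), ...]
        self.by_symbol = defaultdict(set)  # symbol -> {(path, line), ...}

    def update(self, path, text):
        """Re-index one file. Returns False if the content is unchanged."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self.file_hash.get(path) == digest:
            return False
        self._drop(path)
        facts = list(self.extract(path, text))
        for symbol, line in facts:
            self.by_symbol[symbol].add((path, line))
        self.file_symbols[path] = facts
        self.file_hash[path] = digest
        return True

    def _drop(self, path):
        # Remove only the facts that came from this file.
        for symbol, line in self.file_symbols.pop(path, []):
            self.by_symbol[symbol].discard((path, line))
            if not self.by_symbol[symbol]:
                del self.by_symbol[symbol]

    def lookup(self, symbol):
        return sorted(self.by_symbol.get(symbol, ()))


def toy_extract(path, text):
    # Stand-in extractor: treats lines starting with "def " as definitions.
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith("def "):
            yield line[4:].split("(")[0].strip(), lineno
```

Keeping a per-file list of facts is what makes the partial update cheap: dropping a changed file touches only its own entries. The sketch ignores cross-file effects, such as a rename in one file invalidating references in another, which a real incremental index would have to handle.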

Subpages

  • SCIP - A Better Code Indexing Format than LSIF: Sourcegraph’s comparison of SCIP vs LSIF, arguing for a simpler, more debuggable indexing format.
  • Code Navigation for AI SWEs: a practical look at how AI coding agents can leverage code navigation infrastructure.
  • SCIP: a deep dive into the SCIP indexing format itself.
  • Glean: Meta’s system for collecting, deriving, and querying facts about code at scale.
  • Google Kythe: Google’s ecosystem for building tools that work with code, centered on a language-agnostic graph schema.
  • OpenGrok: Oracle’s fast source code search and cross-reference engine.
  • livegrep: a regex-based code search tool optimized for speed over large repositories.
  • Mycroft: a code indexing and navigation tool.