A Categorized Bibliography on Incremental Computation

Ramalingam & Reps, 1993. ACM Digital Library.

See research material.

Important keywords:

  • Incremental computation.
  • Batch computation.
  • Dynamization of static problems.
  • Selective recomputation.
  • Finite differencing.

Abstract

Small changes in the input to a computation often cause only small changes in the output. Rather than recomputing from scratch, we should be able to propagate only the delta. But until recently, I haven't known of a technique to do this efficently? Specifically, how do we transform a delta on the input to a delta on the output? How do we measure performance supposing we're taking this approach? The area of incremental computation was pretty much new to me, so I decided to read this survey to first have an overview of the field first.

This survey formalizes the problem raised above as follows: given \(f\) and input \(x\), keep \(f(x)\) up to date as \(x\) changes. An incremental algorithm takes batch input \(x\), batch output \(f(x)\), a change \(\Delta x\), and auxiliary information maintained across updates, and produces \(f(x + \Delta x)\).

The idea turns out to be everywhere. Similar notions of incremental computation arose in many fields but before this survey, little effort has been spent to unify these notions. This survey was intended to not only focus on the techniques within the programming language community, but also take a look in other areas as well.

The PL community cares about it for:

  • Reactive language primitives (observables, signals, etc)
  • Compiler optimization, especially in loops, in which changes between iterations can be small.
  • Responsive dev tools, such as IDE-centric program analyzers/compilers should not re-do it from scratch every time the user edits a file.

However, there is a less well-known body of related work in algorithms:

  • Dynamic graph problems
  • Dynamization of static structures
  • Databases (view maintenance)
  • AI (truth maintenance)
  • VLSI (incremental design-rule checking)

All working on variants of the same underlying problem.

Introduction

Small updates in input usually cause small changes in output. Therefore, the intuition is that it's more efficient to compute the changes it takes to go from the previous output to the new output, rather than recompute the output from scratch.

The authors survey this idea across multiple fields, deliberately bridging between programming languages research and work in algorithms, AI, databases, and VLSI.

"Because small changes in the input to a computation often cause only small changes in the output, the challenge is to compute the new output incrementally by updating parts of the old output, rather than by recomputing the entire output from scratch."

"The goal is to make use of the solution to one problem instance to find the solution to a 'nearby' problem instance."

"In so doing, the goal was to expose ties between the ideas on incremental computation that developed out of research on programming languages and programming tools, and ideas that have been developed by researchers in other areas."

The authors identify four areas where PL people care about incrementality:

  1. Reactive language primitives: baking reactivity into languages (observables, signals, etc).
  2. Compiler optimization: using incrementality as a compiler trick, e.g. loop optimization where changes between iterations are small.
  3. Responsive dev tools: incremental parsing, type-checking, etc. This is what modern LSPs and IDEs do. Don't re-analyze the whole file on every keystroke.
  4. Compilers and tooling: the practical realization of the first three.

But the survey also points to a less well-known body of work outside PL that I think is just as important:

  1. Comparison criteria: how do you even measure whether an incremental algorithm is "good"? Standard batch complexity doesn't cut it.
  2. General principles: things like dynamization of static problems. I think of this as foundational formulation/modeling.
  3. Problem-specific results: results on individual incremental problems that might reveal new general principles.

Assessment of Incremental Algorithms

Computational Complexity

This is the "how do you measure it" section. Standard asymptotic analysis (big-O on total input size) doesn't work well for incremental problems. Under those criteria, an incremental algorithm and a full recomputation can look the same, or worse, the batch start-over can be deemed "optimal" for some incremental problems. That feels wrong.

The survey lays out several alternative criteria that make more sense:

  • Direct comparison: just compare against the batch start-over algorithm directly.
  • Amortized-cost analysis: average the cost over a worst-case sequence of operations. This is closer to how I'd intuitively think about incremental performance.
  • Competitiveness: how much worse does an online algorithm (no future knowledge) do vs an optimal offline algorithm (perfect future knowledge), assuming an adversary picks the worst-case sequence? The competitive ratio is max cost(online) / cost(offline); lower is better, 1 is optimal. Interestingly, randomized online algorithms can beat the best deterministic ones, because the randomness limits what the adversary can exploit.
  • Probabilistic analysis: average-case performance under random inputs, as opposed to worst-case.
  • Incremental relative lower bounds: proving that for certain problems, no incremental algorithm can do better than batch start-over by more than some factor.
  • Reductions between problems: showing one incremental problem is at least as hard as another, transferring bounds across problems.
  • Boundedness: this one is interesting. Instead of cost as a function of total input size, measure cost as a function of \(|\Delta x| + |\Delta f(x)|\). An incremental algorithm is "bounded" if its update time depends only on the size of the change, never on the size of the full input. I think this is the most natural way to think about incrementality: ideally, if the change is small, the work should be small, regardless of how big the whole thing is.

Measurements of Actual Performance

Few experimental evaluations of incremental algorithms exist. But the ones that do suggest something surprising: theoretically "bad" incremental algorithms can still perform well in practice. Real workloads may not hit worst cases. So the theoretical criteria above don't tell the whole story.

Data-Structure Update Problems

Incremental Annotations on Graphs/Trees

This section covers a specific class of problem that I think is very relevant to what I'm building. The setup is:

  1. You have a "substrate" data structure \(x\), a tree, graph, matrix, whatever. This is the thing that changes.
  2. \(f(x)\) maps the substrate's state to some "annotation" on it. Think of annotating an AST with type information or symbol resolution.
  3. The problem: keep those annotations up to date as the substrate changes.

Annotations can depend on other annotations too, so changes can cascade.

Visualization of annotations & substrate

The survey identifies three approaches:

  • Selective recomputation: figure out which annotations are affected by a substrate change, then recompute only those.
    • A recomputed annotation if not changes may not need to trigger recomputation of further dependent annotations.
    • This is the approach behind:
      • Incremental updating of attributed derivation trees/augmented ASTs.
      • Incremental circuit-annotation problem.
      • Incremental data-flow analysis.
      • Path problems (e.g. shortest paths) in graphs.
    • At Holistics, the AML compiler mirrors salsa, presumably using this pattern also.
  • Differential updating: instead of recomputing affected annotations from scratch, compute the difference to apply. Related to database view maintenance and the INC language.
  • Other incremental expression-evaluation algorithms:
    • Function caching.
    • Incremental reduction in the lambda calculus.
    • Dynamic expression trees.

Other Dynamic Graph Problems

Maintaining:

  • Graph properties (connectivity, transitive closure
  • Minimum spanning trees.
  • Planarity.

under edge insertions and deletions.

I'm not entirely sure whether "edge" here always means the lines between nodes or sometimes refers to leaf nodes?

The work covers a wide range, from Even & Shiloach's early edge-deletion problem (1981) through to Eppstein et al's sparsification technique (1992).

Dynamization of Static Data Structures

This is a general methodology for transforming a static data structure into a dynamic one.

What does "static" mean here exactly? I think it means a structure that's built from scratch given the inputs, no support for incremental updates baked in.

Finite Differencing

Finite differencing is about automatically deriving incremental maintenance logic from batch definitions. You write the batch computation, and the technique derives how to update the output when the input changes. This is exactly what splicer.js is looking for.

The foundational paper is Paige & Koenig (1982) on finite differencing of computable expressions. Earlier work by Earley on high-level iterators and by Fong & Ullman on induction variables laid the groundwork.

Incremental Formal Systems

This section covers incrementalizing formal operations: reduction, parsing, deduction, constraint solving. I think of these as generic-but-also-specific techniques, each one applies a general incremental idea to a particular formal system.

  • Incremental reduction: goes back to Lombardi & Raphael's early work on Lisp as a language for incremental computation (1964). More recent work includes partial evaluation (Sundaresh & Hudak) and incremental reduction in the lambda calculus (Field & Teitelbaum).
  • Incremental parsing: how do you re-parse a file when only a small part changed? Work here includes augmenting parsers for incrementality (Ghezzi & Mandrioli), building "friendly" parsers (Jalili & Gallier), and even incremental parsing without a parser (Kaiser & Kant).
  • Truth maintenance: from AI, starting with Doyle's foundational truth maintenance system (1979). The problem: when your assumptions change, how do you maintain consistency of all derived beliefs? de Kleer extended this with assumption-based TMS.
  • Incremental deduction: rule support in Prolog, unification in many-sorted algebras, deriving incremental implementations from algebraic specifications.
  • Incremental constraint solving: constraint satisfaction for graphical interfaces (Vander Zanden), incremental constraint solvers (Freeman-Benson et al).

The remaining sections cover more applied work. These feel less central to what I'm after, but worth noting:

  • Special-purpose algorithms: domain-specific applications, including incremental compilation (DICE, Magpie), smart recompilation (Tichy), interactive document composition (JANUS), VLSI layout (Magic). These adapt incremental ideas to fairly niche fields.

  • Implementation frameworks: actual systems that enable incremental computation. VisiCalc (the original spreadsheet!) is listed here, along with the Synthesizer Generator, CENTAUR, PSG, Pan, and Alphonse. Also includes Cai & Paige's SETL-family work on binding performance at language design time.

  • Related problems: problems with similar structure, such as sensitivity analysis in linear network optimization, parametric maximum flow, continuous execution (Visiprog).

Verdict

This survey draws bridges between PL and other fields that all use incremental ideas without necessarily talking to each other. It's from 1993, so it misses the modern reactive/FRP wave, but the foundations are all here. The section on finite differencing is the most relevant to splicer.js.

References

Assessment of Incremental Algorithms

AuthorsYearTitleVenue
Yellin, Strom1991INC: A language for incremental computationsACM TOPLAS
Sleator, Tarjan1983A data structure for dynamic treesJ Computer and System Sciences
Tarjan1985Amortized computational complexitySIAM J Algebraic Discrete Methods
Sleator, Tarjan1985Amortized efficiency of list update and paging rulesCACM
McGeoch, Sleator (eds)1992On-Line AlgorithmsAMS
Karp1992On-line algorithms versus off-line algorithms: How much is it worth to know the future?IFIP
Lomchard et al1992Dynamic algorithms in D E Knuth's model: A probabilistic analysisTCS
Berman, Paull, Ryder1990Proving relative lower bounds for incremental algorithmsActa Informatica
Reif1987A topological approach to dynamic graph connectivityIPL
Reps, Teitelbaum, Demers1983Incremental context-dependent analysis for language-based editorsACM TOPLAS
Alpern et al1990Incremental evaluation of computational circuitsACM-SIAM SODA
Ramalingam, Reps1991-92On the computational complexity of incremental algorithmsUW-Madison TR
Dionne1978Etude et extension d'un algorithme de MarcheINFOR
Taylor, Ousterhout1984Magic's incremental design-rule checkerDAC
Scott, Ousterhout1984Plowing: Interactive stretching and compaction in MagicDAC
Ryder, Landi, Pande1990Profiling an incremental data flow analysis algorithmIEEE Trans. Software Eng.
Harrison, Munson1991Numbering document componentsElectronic Publishing

Selective Recomputation

AuthorsYearTitleVenue
Demers, Reps, Teitelbaum1981Incremental evaluation for attribute grammars with application to syntax-directed editorsPOPL
Reps1982Optimal-time incremental semantic analysis for syntax-directed editorsPOPL
Reps, Teitelbaum, Demers1983Incremental context-dependent analysis for language-based editorsACM TOPLAS
Reps1984Generating Language-Based EnvironmentsMIT Press
Yeh1983On incremental evaluation of ordered attributed grammarsBIT
Jones, Simon1986Hierarchical VLSI design systems based on attribute grammarsPOPL
Reps, Marceau, Teitelbaum1986Remote attribute updating for language-based editorsPOPL
Hoover, Teitelbaum1986Efficient incremental evaluation of aggregate values in attribute grammarsSIGPLAN Compiler Construction
Hoover1987Incremental graph evaluationCornell PhD
Kaiser1989Incremental dynamic semantics for language-based programming environmentsACM TOPLAS
Teitelbaum, Chapman1990Higher-order attribute grammars and editing environmentsPLDI
Zaring1990Parallel evaluation in attribute grammar based systemsCornell PhD
Alpern et al1990Incremental evaluation of computational circuitsACM-SIAM SODA
Ramalingam, Reps1991-92On the computational complexity of incremental algorithmsUW-Madison TR
Rosen1981Linear cost is sometimes quadraticPOPL
Ryder1982Incremental data flow analysis based on a unified model of elimination algorithmsRutgers PhD
Zadeck1983-84Incremental data flow analysis in a structured program editorRice PhD / SIGPLAN Compiler Construction
Cooper, Kennedy1986The impact of interprocedural analysis and optimization in the Rn programming environmentACM TOPLAS
Burke; Burke, Ryder1987Interval-based and iterative incremental data-flow analysisIBM TR
Carroll, Ryder1988Incremental data flow update via attribute and dominator updatesPOPL
Ryder, Paull1988Incremental data flow analysis algorithmsACM TOPLAS
Marlowe; Marlowe, Ryder1989-90Efficient hybrid algorithm for incremental data flow analysisRutgers PhD / POPL
Murchland 1967 ... Ramalingam, Reps 19921967-92Maintaining shortest distances in graphsVarious

Differential Updating

AuthorsYearTitleVenue
Koenig, Paige1981A transformational framework for the automatic control of derived dataVLDB
Shmueli, Itai1984Maintenance of viewsACM SIGMOD
Horwitz; Horwitz, Teitelbaum1985-86Generating editing environments based on relations and attributesCornell PhD / ACM TOPLAS
Yellin, Strom1991INC: A language for incremental computationsACM TOPLAS

Other Incremental Expression-Evaluation

AuthorsYearTitleVenue
Pugh; Pugh, Teitelbaum1988-89Incremental computation via function cachingCornell PhD / POPL
Field, Teitelbaum; Field1990-91Incremental reduction in the lambda calculusACM Lisp/FP / Cornell PhD
Cohen, Tamassia1991Dynamic expression trees and their applicationsACM-SIAM SODA

Dynamic Graph Problems

AuthorsYearTitleVenue
Cheston1976Incremental algorithms in graph theoryU Toronto PhD
Even, Shiloach1981An on-line edge-deletion problemJ ACM
Frederickson1985Data structures for on-line updating of minimum spanning treesSIAM J Computing
Italiano1986Amortized efficiency of a path retrieval data structureTCS
Reif1987A topological approach to dynamic graph connectivityIPL
La Poutre, van Leeuwen1988Maintenance of transitive closures and transitive reductions of graphsWG
Italiano1988Finding paths and deleting edges in directed acyclic graphsIPL
Yellin1988A dynamic transitive closure algorithmIBM TR
Di Battista, Tamassia1989Incremental planarity testingFOCS
Buchsbaum, Kanellakis, Vitter1990A data structure for arc insertion and regular path findingACM-SIAM SODA
Yannakakis1990Graph-theoretic methods in database theoryPODS
Frederickson1991Ambivalent data structures for dynamic 2-edge-connectivity and k smallest spanning treesFOCS
Galil, Italiano1991Fully dynamic algorithms for edge-connectivity problemsSTOC
Kanevsky et al1991On-line maintenance of the four-connected components of a graphFOCS
La Poutre1992Maintenance of triconnected components of graphsICALP
Eppstein, Galil, Italiano, Nissenzweig1992Sparsification: A technique for speeding up dynamic graph algorithmsFOCS

Dynamization of Static Data Structures

AuthorsYearTitleVenue
Bentley1979Decomposable searching problemsIPL
Bentley, Saxe1980Decomposable searching problems I: Static-to-dynamic transformationsJ Algorithms
Overmars1983The Design of Dynamic Data StructuresSpringer LNCS
Mehlhorn1984Data Structures and Algorithms 3: Multi-Dimensional Searching and Computational GeometrySpringer
Eppstein, Galil, Italiano, Nissenzweig1992Sparsification: A technique for speeding up dynamic graph algorithmsFOCS

Finite Differencing

AuthorsYearTitleVenue
Earley1974-76High-level operations in automatic programming / High-level iteratorsSIGPLAN VHL / J Computer Languages
Fong, Ullman; Fong1976-79Induction variables / Common subexpressions / Inductively computable constructs in VHL languagesPOPL
Paige, Koenig1982Finite differencing of computable expressionsACM TOPLAS
Goldberg, Paige1984Stream processingACM Lisp/FP
Paige1986Programming with invariantsIEEE Software

Incremental Formal Systems

AuthorsYearTitleVenue
Lombardi, Raphael; Lombardi1964-67Lisp as the language for an incremental computer / Incremental computationMIT Press / Advances in Computers
Sundaresh, Hudak; Sundaresh1991Incremental computation via partial evaluationPOPL / Yale PhD
Field, Teitelbaum; Field1990-91Incremental reduction in the lambda calculusACM Lisp/FP / Cornell PhD
MacLennan1986-87A calculus of functional differences and integralsNaval Postgraduate School TR
Ghezzi, Mandrioli1979-80Incremental parsing / Augmenting parsers to support incrementalityACM TOPLAS / J ACM
Wegman1980Parsing for structural editorsFOCS
Jalili, Gallier1982Building friendly parsersPOPL
Kaiser, Kant1985Incremental parsing without a parserJ Systems and Software
Doyle1979A truth maintenance systemArtificial Intelligence
de Kleer1986An assumption-based TMS / Extending the ATMS / Problem solving with the ATMSArtificial Intelligence
McAllister1990Truth maintenanceAAAI
Shmueli, Tsur, Zfira1984Rule support in Prolog
Mannila, Ukkonen1988Time parameter and arbitrary deletions in the set union problemSWAT
Snelting, Henhapl1986Unification in many-sorted algebras for incremental semantic analysisPOPL
Ballance; Ballance, Graham1989-91Syntactic/semantic checking and incremental consistency maintenanceUC Berkeley PhD / ICLP
van der Meulen1990Deriving incremental implementations from algebraic specificationsCWI TR
Vander Zanden1988Incremental constraint satisfaction and its application to graphical interfacesCornell PhD
Freeman-Benson, Maloney, Borning1990An incremental constraint solverCACM

Special-Purpose Algorithms

AuthorsYearTitleVenue
Fritzson1984Preliminary experience from the DICE systemACM SIGPLAN/SIGSOFT
Schwartz, Delisle, Begwani1984Incremental compilation in MagpieSIGPLAN Compiler Construction
Tichy1986Smart recompilationACM TOPLAS
Schwanke, Kaiser1988Smarter recompilationACM TOPLAS
Cooper, Kennedy1986The impact of interprocedural analysis and optimization in the Rn programming environmentACM TOPLAS
Burke, TorczonInterprocedural optimization: Eliminating unnecessary recompilationACM TOPLAS
Chamberlain et al1981-87JANUS: An interactive system for document composition / Document convergenceACM SIGPLAN / IBM Systems J
Chen, Harrison; Harrison, Munson1988-91Multiple representation document development / Integrated bibliography / Numbering document componentsIEEE Computer / Electronic Publishing
Brooks1988A two-view document editor with user-definable document structureDEC Systems Research Center TR
Ousterhout et al1984Magic: A VLSI layout systemDAC
Taylor, Ousterhout1984Magic's incremental design-rule checkerDAC
Scott, Ousterhout1984Plowing: Interactive stretching and compaction in MagicDAC

Implementation Frameworks

AuthorsYearTitleVenue
Bricklin, Frankston1979VisiCalcPersonal Software Inc
Alpern et al1989Graph attribution as a specification paradigmACM SIGPLAN/SIGSOFT
Borning1979ThingLab: A constraint-oriented simulation laboratoryStanford PhD / Xerox PARC
Konopasek, Jayaraman1984The TK!Solver BookOsborne/McGraw-Hill
Reps, Teitelbaum1988The Synthesizer GeneratorSpringer
Borras et al1988CENTAUR: The systemACM SIGPLAN/SIGSOFT
Bahlke, Snelting1986The PSG system: From formal language definitions to interactive programming environmentsACM TOPLAS
Ballance, Graham, Van De Vanter1992The Pan language-based editing systemACM TOSEM
Paige, Koenig1982Finite differencing of computable expressionsACM TOPLAS
Cai, Paige1987-91Binding performance at language design time / Program derivation by fixed point computation / Languages polynomial in input+outputPOPL / Science of Computer Programming / AMAST
Hoover1992Alphonse: Incremental computation as a programming abstractionPLDI
AuthorsYearTitleVenue
Bertsekas1991Linear Network Optimization: Algorithms and CodesMIT Press
Gallo, Grigoriadis, Tarjan1989A fast parametric maximum flow algorithm and applicationsSIAM J Computing
Henderson, Weiser1985Continuous execution: The Visiprog environmentIEEE ICSE