The Lean Language Reference

20.1. Reference Counting🔗

Lean uses reference counting for memory management. Each allocated object maintains a count of how many other objects refer to it. When a new reference is added, the count is incremented, and when a reference is dropped, the count is decremented. When a reference count reaches zero, the object is no longer reachable and can play no part in the further execution of the program. It is deallocated and all of its references to other objects are dropped, which may trigger further deallocations.

Reference counting provides a number of benefits:

Re-Use of Memory

If an object's reference count drops to zero just as another of the same size is to be allocated, then the original object's memory can be safely re-used for the new object. As a result, many common data-structure traversals (such as List.map) do not need to allocate memory when there is exactly one reference to the data structure to be traversed.
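For example (a sketch to illustrate the point, not an example from the reference), when the list passed to List.map is uniquely referenced, its cons cells can be reused for the result:

def addOne (xs : List Nat) : List Nat :=
  -- If `xs` is the only reference to the list, the runtime can reuse its
  -- cons cells for the result rather than allocating fresh ones.
  xs.map (· + 1)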

Opportunistic In-Place Updates

Primitive types, such as strings and arrays, may provide operations that copy shared data but modify unshared data in place. As long as a value has only one reference, many operations on these primitive types modify it rather than copy it, which can lead to substantial performance benefits. Carefully written Array code avoids the performance overhead of immutable data structures while maintaining the ease of reasoning provided by pure functions.
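As an illustrative sketch (the function below is not from the reference), a loop that repeatedly updates a locally-held array benefits from this: as long as no other reference to the array's data exists, each update mutates it in place.

def incrementAll (arr : Array Nat) : Array Nat := Id.run do
  let mut result := arr
  for i in [0:result.size] do
    -- When `result` holds the only reference to the array's data, `modify`
    -- updates the element in place; if the array is shared, the first
    -- update copies it and later updates mutate the copy.
    result := result.modify i (· + 1)
  return result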

Predictability

Reference counts are decremented at predictable times. As a result, reference-counted objects can be used to manage other resources, such as file handles. In Lean, a Handle does not need to be explicitly closed because it is closed immediately when it is no longer accessible.
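For instance (a sketch, not an example from the reference), a file handle created in a local scope is closed as soon as the last reference to it is dropped:

def firstLine (path : System.FilePath) : IO String := do
  let h ← IO.FS.Handle.mk path .read
  -- `h` is never explicitly closed: once its reference count reaches zero
  -- after this function returns, the underlying file is closed.
  h.getLine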

Simpler FFI

Objects managed with reference counting don't need to be relocated as part of reclaiming unused memory. This greatly simplifies interaction with code written in other languages, such as C.
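As a hypothetical sketch (the C symbol my_c_hash is not a real function), an @[extern] binding hands a Lean object directly to C code. Because objects are never relocated, the pointer the C function receives stays valid, and interaction with the runtime reduces to reference-count bookkeeping through its C API:

-- The `@&` marks the string parameter as borrowed, so the C implementation
-- receives the object without taking ownership of a reference.
@[extern "my_c_hash"]
opaque cHash (s : @& String) : UInt64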

The traditional drawbacks of reference counting include the performance overhead of updating reference counts and the inability to recognize and deallocate cyclic data. The former drawback is minimized by an analysis based on borrowing that allows many reference count updates to be elided. Multi-threaded code additionally requires that reference count updates be synchronized between threads, which imposes a substantial overhead of its own. To reduce this overhead, Lean values are partitioned into those that are reachable from multiple threads and those that are not. Single-threaded reference counts can be updated much faster than multi-threaded reference counts, and many values are accessed on only a single thread. Together, these techniques greatly reduce the performance overhead of reference counting. Because Lean cannot create cyclic data, no technique is needed to detect it. Ullrich and de Moura (2019) provide more details on the implementation of reference counting in Lean (Sebastian Ullrich and Leonardo de Moura, 2019. “Counting Immutable Beans: Reference Counting Optimized for Purely Functional Programming”. In Proceedings of the 31st Symposium on Implementation and Application of Functional Languages (IFL 2019)).

20.1.1. Observing Uniqueness

Ensuring that arrays and strings are uniquely referenced is key to writing fast code in Lean. The primitive dbgTraceIfShared can be used to check whether a data structure is aliased. When called, it returns its argument unchanged, printing the provided trace message if the argument's reference count is greater than one.

🔗def
dbgTraceIfShared.{u} {α : Type u} (s : String)
  (a : α) : α

Display the given message if a is shared, that is, RC(a) > 1

Due to the specifics of how #eval is implemented, using dbgTraceIfShared with #eval can be misleading. Instead, it should be used in code that's explicitly compiled and run.

Observing Uniqueness

This program reads a line of input from the user, printing it after replacing its first character with a space. Replacing a character in a string uses an in-place update if the string is not shared and both the old and new characters are in the 7-bit ASCII subset of Unicode. The dbgTraceIfShared call prints nothing, indicating that the string is indeed updated in place rather than copied.

def process (str : String) : IO Unit := do
  IO.println ((dbgTraceIfShared "String update" str).set 0 ' ')

def main : IO Unit := do
  let line := (← (← IO.getStdin).getLine).trim
  process line

When run with this input:

stdin
Here is input.

the program emits:

stdout
 ere is input.

with an empty standard error output:

stderr
<empty>

This version of the program retains a reference to the original string, which necessitates copying the string in the call to String.set. This fact is visible in its standard error output.

def process (str : String) : IO Unit := do
  IO.println ((dbgTraceIfShared "String update" str).set 0 ' ')

def main : IO Unit := do
  let line := (← (← IO.getStdin).getLine).trim
  process line
  IO.println "Original input:"
  IO.println line

When run with this input:

stdin
Here is input.

the program emits:

stdout
 ere is input.
Original input:
Here is input.

In its standard error, the message passed to dbgTraceIfShared is visible.

stderr
shared RC String update

20.1.2. Compiler IR

The compiler option trace.compiler.ir.result can be used to inspect the compiler's intermediate representation (IR) for a function. In this intermediate representation, reference counting, allocation, and reuse are explicit:

  • The isShared operator checks whether a reference count is greater than 1.

  • ctor_n allocates the nth constructor of a type.

  • proj_n retrieves the nth field from a constructor value.

  • set x[n] mutates the nth field of the constructor in x.

  • ret x returns the value in x.

The specifics of reference count manipulations can depend on the results of optimization passes such as inlining. While the vast majority of Lean code doesn't require this kind of attention to achieve good performance, knowing how to diagnose unique reference issues can be very important when writing performance-critical code.

🔗option
trace.compiler.ir.result

Default value: false

(trace) enable/disable tracing for the given module and submodules
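The option can also be scoped to a single declaration with set_option ... in, which limits the trace output to the definition of interest (the definition below is only illustrative):

set_option trace.compiler.ir.result true in
def double (x : Nat) : Nat :=
  x + x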

Reference Counts in IR

Compiler IR can be used to observe when reference counts are incremented, which can help diagnose situations when a value is expected to have a unique incoming reference, but is in fact shared. Here, process and process' each take a string as a parameter and modify it with String.set, returning a pair of strings. While process returns a constant string as the second element of the pair, process' returns the original string.

set_option trace.compiler.ir.result true

def process (str : String) : String × String :=
  (str.set 0 ' ', "")

def process' (str : String) : String × String :=
  (str.set 0 ' ', str)

The IR for process includes no inc or dec instructions. If the incoming string x_1 is a unique reference, then it is still a unique reference when passed to String.set, which can then use in-place modification:

[result]
def process._closed_1 : obj :=
  let x_1 : obj := "";
  ret x_1
def process (x_1 : obj) : obj :=
  let x_2 : u32 := 32;
  let x_3 : obj := 0;
  let x_4 : obj := String.set x_1 x_3 x_2;
  let x_5 : obj := process._closed_1;
  let x_6 : obj := ctor_0[Prod.mk] x_4 x_5;
  ret x_6

The IR for process', on the other hand, increments the reference count of the string just before calling String.set. Thus, the modified string x_4 is a copy, regardless of whether the reference to x_1 is unique:

[result]
def process' (x_1 : obj) : obj :=
  let x_2 : u32 := 32;
  let x_3 : obj := 0;
  inc x_1;
  let x_4 : obj := String.set x_1 x_3 x_2;
  let x_5 : obj := ctor_0[Prod.mk] x_4 x_1;
  ret x_5

Memory Re-Use in IR

The function discardElems is a simplified version of List.map that replaces every element in a list with (). Inspecting its intermediate representation demonstrates that it will re-use the list's memory when its reference is unique.

set_option trace.compiler.ir.result true

def discardElems : List α → List Unit
  | [] => []
  | x :: xs => () :: discardElems xs

This emits the following IR:

[result]
def discardElems._rarg (x_1 : obj) : obj :=
  case x_1 : obj of
  List.nil →
    let x_2 : obj := ctor_0[List.nil];
    ret x_2
  List.cons →
    let x_3 : u8 := isShared x_1;
    case x_3 : u8 of
    Bool.false →
      let x_4 : obj := proj[1] x_1;
      let x_5 : obj := proj[0] x_1;
      dec x_5;
      let x_6 : obj := discardElems._rarg x_4;
      let x_7 : obj := ctor_0[PUnit.unit];
      set x_1[1] := x_6;
      set x_1[0] := x_7;
      ret x_1
    Bool.true →
      let x_8 : obj := proj[1] x_1;
      inc x_8;
      dec x_1;
      let x_9 : obj := discardElems._rarg x_8;
      let x_10 : obj := ctor_0[PUnit.unit];
      let x_11 : obj := ctor_1[List.cons] x_10 x_9;
      ret x_11
def discardElems (x_1 : ◾) : obj :=
  let x_2 : obj := pap discardElems._rarg;
  ret x_2

In the IR, the List.cons case explicitly checks whether the argument value is shared (i.e. whether its reference count is greater than one). If the reference is unique, the reference count of the discarded list element x_5 is decremented and the constructor value is reused. If it is shared, a new List.cons is allocated in x_11 for the result.
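As a sketch (not part of the reference), retaining a second reference to the input forces the shared branch: because xs is also returned in the pair below, its reference count is greater than one when discardElems inspects it, so fresh List.cons cells are allocated.

def discardButKeep (xs : List Nat) : List Unit × List Nat :=
  -- `xs` is used twice, so `discardElems` sees a shared list and cannot
  -- reuse its cells.
  (discardElems xs, xs)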