19.8. Strings

Strings represent Unicode text. Strings are specially supported by Lean:

They have a logical model, expressed in terms of lists of characters, that specifies the meaning of each operation on strings.
They have an optimized run-time representation in compiled code, as packed arrays of bytes that encode the string as UTF-8, and the Lean runtime specially optimizes string operations.
There is string literal syntax for writing strings.

The fact that strings are internally represented as UTF-8-encoded byte arrays is visible in the API:

There is no operation to project a particular character out of the string, as this would be a performance trap. Use a String.Iterator in a loop instead of a Nat.
Strings are indexed by String.Pos, which internally records a byte count rather than a character count, so indexing takes constant time. Aside from 0, positions should not be constructed directly, but rather updated using String.next and String.prev.
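
The position-based style of traversal can be sketched as follows. This is only an illustration under the API described in this section (other Lean releases may name these functions differently); countWhere is a hypothetical helper, not a library function.

```lean
/-- Counts the characters of `s` that satisfy `p`. The loop advances a byte
position with `String.next` instead of indexing by character number.
`countWhere` is a hypothetical helper, not part of the library. -/
partial def countWhere (s : String) (p : Char → Bool) : Nat :=
  go 0 0
where
  go (pos : String.Pos) (acc : Nat) : Nat :=
    if s.atEnd pos then acc
    else go (s.next pos) (if p (s.get pos) then acc + 1 else acc)

-- `Char.isAlpha` only accepts ASCII letters, so this counts 'L' and 'N'
#eval countWhere "L∃∀N" Char.isAlpha
```

Each step costs constant time because the position already records the byte offset, whereas looking up the n-th character would require rescanning the string from the start.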

19.8.1. Logical Model

structure

String : Type

A string is a sequence of Unicode code points.

At runtime, strings are represented by dynamic arrays of bytes using the UTF-8 encoding. Both the size in bytes (String.utf8ByteSize) and in characters (String.length) are cached and take constant time. Many operations on strings perform in-place modifications when the reference to the string is unique.

Constructor

String.mk

Pack a List Char into a String. This function is overridden by the compiler and is O(n) in the length of the list.

Fields

data : List Char

Unpack String into a List Char. This function is overridden by the compiler and is O(n) in the length of the list.

The logical model of strings in Lean is a structure that contains a single field, which is a list of characters. This is convenient when specifying and proving properties of string-processing functions at a low level.

19.8.2. Run-Time Representation

Strings are represented as dynamic arrays of bytes, encoded in UTF-8. After the object header, a string contains:

byte count: The number of bytes that currently contain valid string data
capacity: The number of bytes presently allocated for the string
length: The length of the encoded string, which may be shorter than the byte count due to UTF-8 multi-byte characters
data: The actual character data in the string, null-terminated

Many string functions in the Lean runtime check whether they have exclusive access to their argument by consulting the reference count in the object header. If they do, and the string's capacity is sufficient, then the existing string can be mutated rather than allocating fresh memory. Otherwise, a new string must be allocated.

19.8.2.1. Performance Notes

Despite the fact that they appear to be an ordinary constructor and projection, String.mk and String.data take time linear in the length of the string. This is because they must implement the conversions between lists of characters and packed arrays of bytes, which must necessarily visit each character.

19.8.3. Syntax

Lean has three kinds of string literals: ordinary string literals, interpolated string literals, and raw string literals.

19.8.3.1. String Literals

String literals begin and end with a double-quote character ". Between these characters, they may contain any other character, including newlines, which are included literally (with the caveat that all newlines in a Lean source file are interpreted as '\n', regardless of file encoding and platform). Special characters that cannot otherwise be written in string literals may be escaped with a backslash, so "\"Quotes\"" is a string literal that begins and ends with double quotes. The following forms of escape sequences are accepted:

\r, \n, \t, \\, \", \': These escape sequences have the usual meaning, mapping to CR, LF, tab, backslash, double quote, and single quote, respectively.
\xNN: When NN is a sequence of two hexadecimal digits, this escape denotes the character whose Unicode code point is indicated by the two-digit hexadecimal code.
\uNNNN: When NNNN is a sequence of four hexadecimal digits, this escape denotes the character whose Unicode code point is indicated by the four-digit hexadecimal code.
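
As a quick check of the hexadecimal escapes, the following sketch relies only on the facts that 'A' is U+0041 and '∀' is U+2200:

```lean
-- 'A' is U+0041, written with a two-digit \x escape
example : "\x41" = "A" := rfl

-- '∀' is U+2200, written with a four-digit \u escape
example : "\u2200" = "∀" := rfl
```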

String literals may contain gaps. A gap is indicated by an escaped newline, with no intervening characters between the escaping backslash and the newline. In this case, the string denoted by the literal is missing the newline and all leading whitespace from the next line. String gaps may not precede lines that contain only whitespace.

Here, str1 and str2 are the same string:

def str1 := "String with \
             a gap"
def str2 := "String with a gap"

example : str1 = str2 := rfl

If the line following the gap is empty, the string is rejected:

def str3 := "String with \

             a gap"

The parser error is:

<example>:2:0-3:0: unexpected additional newline in string gap

19.8.3.2. Interpolated Strings

Preceding a string literal with s! causes it to be processed as an interpolated string, in which regions of the string surrounded by { and } characters are parsed and interpreted as Lean expressions. Interpolated strings are interpreted by appending the string that precedes the interpolation, the expression (with an added call to toString surrounding it), and the string that follows the interpolation.

For example:

example :
    s!"1 + 1 = {1 + 1}\n" =
    "1 + 1 = " ++ toString (1 + 1) ++ "\n" :=
  rfl

Preceding a literal with m! causes the interpolation to result in an instance of MessageData, the compiler's internal data structure for messages to be shown to users.

19.8.3.3. Raw String Literals

In raw string literals, there are no escape sequences or gaps, and each character denotes itself exactly. Raw string literals are preceded by r, followed by zero or more hash characters (#) and a double quote ". The string literal is completed at a double quote that is followed by the same number of hash characters. For example, they can be used to avoid the need to double-escape certain characters:

example : r"\t" = "\\t" := rfl
#eval r"Write backslash in a string using '\\\\'"

The #eval yields:

"Write backslash in a string using '\\\\\\\\'"

Including hash marks allows the strings to contain unescaped quotes:

example :
    r#"This is "literally" quoted"# =
    "This is \"literally\" quoted" :=
  rfl

Adding sufficiently many hash marks allows any raw literal to be written literally:

example :
    r##"This is r#"literally"# quoted"## =
    "This is r#\"literally\"# quoted" :=
  rfl

19.8.4. API Reference

19.8.4.1. Constructing

def

String.singleton (c : Char) : String

Returns a new string that contains only the character c.

Because strings are encoded in UTF-8, the resulting string may take multiple bytes.

Examples:

String.singleton 'L' = "L"
String.singleton ' ' = " "
String.singleton '"' = "\""
String.singleton '𝒫' = "𝒫"

def

String.append : String → String → String

Appends two strings. Usually accessed via the ++ operator.

The internal implementation will perform destructive updates if the string is not shared.

Examples:

"abc".append "def" = "abcdef"
"abc" ++ "def" = "abcdef"
"" ++ "" = ""

def

String.join (l : List String) : String

Appends all the strings in a list of strings, in order.

Use String.intercalate to place a separator string between the strings in a list.

Examples:

String.join ["gr", "ee", "n"] = "green"
String.join ["b", "", "l", "", "ue"] = "blue"
String.join [] = ""

def

String.intercalate (s : String) : List String → String

Appends the strings in a list of strings, placing the separator s between each pair.

Examples:

", ".intercalate ["red", "green", "blue"] = "red, green, blue"
" and ".intercalate ["tea", "coffee"] = "tea and coffee"
" | ".intercalate ["M", "", "N"] = "M | | N"

19.8.4.2. Conversions

def

String.toList (s : String) : List Char

Converts a string to a list of characters.

Even though the logical model of strings is as a structure that wraps a list of characters, this operation takes time and space linear in the length of the string. At runtime, strings are represented as dynamic arrays of bytes.

Examples:

"abc".toList = ['a', 'b', 'c']
"".toList = []
"\n".toList = ['\n']

def

String.isNat (s : String) : Bool

Checks whether the string can be interpreted as the decimal representation of a natural number.

A string can be interpreted as a decimal natural number if it is not empty and all the characters in it are digits.

Use String.toNat? or String.toNat! to convert such a string to a natural number.

Examples:

"".isNat = false
"0".isNat = true
"5".isNat = true
"05".isNat = true
"587".isNat = true
"-587".isNat = false
" 5".isNat = false
"2+3".isNat = false
"0xff".isNat = false

def

String.toNat? (s : String) : Option Nat

Interprets a string as the decimal representation of a natural number, returning it. Returns none if the string does not contain a decimal natural number.

A string can be interpreted as a decimal natural number if it is not empty and all the characters in it are digits.

Use String.isNat to check whether String.toNat? would return some. String.toNat! is an alternative that panics instead of returning none when the string is not a natural number.

Examples:

"".toNat? = none
"0".toNat? = some 0
"5".toNat? = some 5
"587".toNat? = some 587
"-587".toNat? = none
" 5".toNat? = none
"2+3".toNat? = none
"0xff".toNat? = none

def

String.toNat! (s : String) : Nat

Interprets a string as the decimal representation of a natural number, returning it. Panics if the string does not contain a decimal natural number.

A string can be interpreted as a decimal natural number if it is not empty and all the characters in it are digits.

Use String.isNat to check whether String.toNat! would return a value. String.toNat? is a safer alternative that returns none instead of panicking when the string is not a natural number.
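
A common way to avoid the panic is to combine String.toNat? with a default value. A minimal sketch, in which parseNatOr is a hypothetical helper and Option.getD is assumed to be available from the standard library:

```lean
/-- Parses a decimal natural number, ignoring surrounding whitespace, and
falls back to `fallback` when the text is not a natural number.
`parseNatOr` is a hypothetical helper, not part of the library. -/
def parseNatOr (fallback : Nat) (s : String) : Nat :=
  (s.trim.toNat?).getD fallback

#eval parseNatOr 0 " 587 "  -- 587
#eval parseNatOr 0 "-587"   -- 0
#eval parseNatOr 0 "0xff"   -- 0
```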

Examples:

"0".toNat! = 0
"5".toNat! = 5
"587".toNat! = 587

def

String.isInt (s : String) : Bool

Checks whether the string can be interpreted as the decimal representation of an integer.

A string can be interpreted as a decimal integer if it consists of an optional leading - followed by at least one decimal digit. Leading + characters are not allowed.

Use String.toInt? or String.toInt! to convert such a string to an integer.

Examples:

"".isInt = false
"-".isInt = false
"0".isInt = true
"-0".isInt = true
"5".isInt = true
"587".isInt = true
"-587".isInt = true
"+587".isInt = false
" 5".isInt = false
"2-3".isInt = false
"0xff".isInt = false

def

String.toInt? (s : String) : Option Int

Interprets a string as the decimal representation of an integer, returning it. Returns none if the string does not contain a decimal integer.

A string can be interpreted as a decimal integer if it consists of an optional leading - followed by at least one decimal digit. Leading + characters are not allowed.

Use String.isInt to check whether String.toInt? would return some. String.toInt! is an alternative that panics instead of returning none when the string is not an integer.

Examples:

"".toInt? = none
"-".toInt? = none
"0".toInt? = some 0
"5".toInt? = some 5
"-5".toInt? = some (-5)
"587".toInt? = some 587
"-587".toInt? = some (-587)
" 5".toInt? = none
"2-3".toInt? = none
"0xff".toInt? = none

def

String.toInt! (s : String) : Int

Interprets a string as the decimal representation of an integer, returning it. Panics if the string does not contain a decimal integer.

A string can be interpreted as a decimal integer if it consists of an optional leading - followed by at least one decimal digit. Leading + characters are not allowed.

Use String.isInt to check whether String.toInt! would return a value. String.toInt? is a safer alternative that returns none instead of panicking when the string is not an integer.

Examples:

"0".toInt! = 0
"5".toInt! = 5
"587".toInt! = 587
"-587".toInt! = -587

def

String.toFormat (s : String) : Std.Format

Converts a string to a pretty-printer document, replacing newlines in the string with Std.Format.line.

19.8.4.3. Properties

def

String.isEmpty (s : String) : Bool

Checks whether a string is empty.

Empty strings are equal to "" and have length and end position 0.

Examples:

"".isEmpty = true
"empty".isEmpty = false
" ".isEmpty = false

def

String.length : String → Nat

Returns the length of a string in Unicode code points.

Examples:

"".length = 0
"abc".length = 3
"L∃∀N".length = 4

19.8.4.4. Positions

structure

String.Pos : Type

A byte position in a String, according to its UTF-8 encoding.

Character positions (counting the Unicode code points rather than bytes) are represented by plain Nats. Indexing a String by a String.Pos takes constant time, while character positions need to be translated internally to byte positions, which takes linear time.

A byte position p is valid for a string s if 0 ≤ p ≤ s.endPos and p lies on a UTF-8 character boundary.
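
The difference between character counts and byte positions can be checked directly. This is a small sketch that uses only functions documented in this section; the values in the comments follow from the fact that '∃' and '∀' each occupy three bytes in UTF-8.

```lean
#eval "L∃∀N".length                  -- 4 (code points)
#eval "L∃∀N".endPos.byteIdx          -- 8 (bytes)
#eval String.Pos.isValid "L∃∀N" ⟨1⟩  -- true: byte 1 is where '∃' starts
#eval String.Pos.isValid "L∃∀N" ⟨2⟩  -- false: byte 2 is inside '∃'
```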

Constructor

String.Pos.mk

Fields

byteIdx : Nat

Get the underlying byte index of a String.Pos

def

String.Pos.isValid (s : String) (p : String.Pos) : Bool

Returns true if p is a valid UTF-8 position in the string s.

This means that p ≤ s.endPos and p lies on a UTF-8 character boundary. At runtime, this operation takes constant time.

Examples:

String.Pos.isValid "abc" ⟨0⟩ = true
String.Pos.isValid "abc" ⟨1⟩ = true
String.Pos.isValid "abc" ⟨3⟩ = true
String.Pos.isValid "abc" ⟨4⟩ = false
String.Pos.isValid "𝒫(A)" ⟨0⟩ = true
String.Pos.isValid "𝒫(A)" ⟨1⟩ = false
String.Pos.isValid "𝒫(A)" ⟨2⟩ = false
String.Pos.isValid "𝒫(A)" ⟨3⟩ = false
String.Pos.isValid "𝒫(A)" ⟨4⟩ = true

def

String.atEnd : String → String.Pos → Bool

Returns true if a specified byte position is greater than or equal to the position which points to the end of a string. Otherwise, returns false.

Examples:

(0 |> "abc".next |> "abc".next |> "abc".atEnd) = false
(0 |> "abc".next |> "abc".next |> "abc".next |> "abc".next |> "abc".atEnd) = true
(0 |> "L∃∀N".next |> "L∃∀N".next |> "L∃∀N".next |> "L∃∀N".atEnd) = false
(0 |> "L∃∀N".next |> "L∃∀N".next |> "L∃∀N".next |> "L∃∀N".next |> "L∃∀N".atEnd) = true
"abc".atEnd ⟨4⟩ = true
"L∃∀N".atEnd ⟨7⟩ = false
"L∃∀N".atEnd ⟨8⟩ = true

def

String.endPos (s : String) : String.Pos

A UTF-8 byte position that points at the end of a string, just after the last character.

Examples:

"abc".endPos = ⟨3⟩
"L∃∀N".endPos = ⟨8⟩

def

String.next (s : String) (p : String.Pos) : String.Pos

Returns the next position in a string after position p. If p is not a valid position or p = s.endPos, returns the position one byte after p.

A run-time bounds check is performed to determine whether p is at the end of the string. If a bounds check has already been performed, use String.next' to avoid a repeated check.

Some examples of edge cases:

"abc".next ⟨3⟩ = ⟨4⟩, since 3 = "abc".endPos
"L∃∀N".next ⟨2⟩ = ⟨3⟩, since 2 points into the middle of a multi-byte UTF-8 character

Examples:

"abc".get ("abc".next 0) = 'b'
"L∃∀N".get (0 |> "L∃∀N".next |> "L∃∀N".next) = '∀'

def

String.next' (s : String) (p : String.Pos) (h : ¬s.atEnd p = true) :
  String.Pos

Returns the next position in a string after position p. The result is unspecified if p is not a valid position.

Requires evidence, h, that p is within bounds. Unlike String.next, no run-time bounds check is performed.

A typical pattern combines String.next' with a dependent if-expression to avoid the overhead of an additional bounds check. For example:

def next? (s : String) (p : String.Pos) : Option Char :=
  if h : s.atEnd p then none else s.get (s.next' p h)

Example:

let abc := "abc"; abc.get (abc.next' 0 (by decide)) = 'b'

def

String.nextWhile (s : String) (p : Char → Bool) (i : String.Pos) :
  String.Pos

Repeatedly increments a position in a string, as if by String.next, while the predicate p returns true for the character at the position. Stops incrementing at the end of the string or when p returns false for the current character.

Examples:

let s := " a "; s.get (s.nextWhile Char.isWhitespace 0) = 'a'
let s := "a "; s.get (s.nextWhile Char.isWhitespace 0) = 'a'
let s := "ba "; s.get (s.nextWhile Char.isWhitespace 0) = 'b'

def

String.nextUntil (s : String) (p : Char → Bool) (i : String.Pos) :
  String.Pos

Repeatedly increments a position in a string, as if by String.next, while the predicate p returns false for the character at the position. Stops incrementing at the end of the string or when p returns true for the current character.

Examples:

let s := " a "; s.get (s.nextUntil Char.isWhitespace 0) = ' '
let s := " a "; s.get (s.nextUntil Char.isLetter 0) = 'a'
let s := "a "; s.get (s.nextUntil Char.isWhitespace 0) = ' '

def

String.prev : String → String.Pos → String.Pos

Returns the position in a string before a specified position, p. If p = ⟨0⟩, returns 0. If p is greater than endPos, returns the position one byte before p. Otherwise, if p occurs in the middle of a multi-byte character, returns the beginning position of that character.

For example, "L∃∀N".prev ⟨3⟩ is ⟨1⟩, since byte 3 occurs in the middle of the multi-byte character '∃' that starts at byte 1.

Examples:

"abc".get ("abc".endPos |> "abc".prev) = 'c'
"L∃∀N".get ("L∃∀N".endPos |> "L∃∀N".prev |> "L∃∀N".prev |> "L∃∀N".prev) = '∃'

def

String.Pos.min (p₁ p₂ : String.Pos) : String.Pos

Returns either p₁ or p₂, whichever has the least byte index.

19.8.4.5. Lookups and Modifications

def

String.get (s : String) (p : String.Pos) : Char

Returns the character at position p of a string. If p is not a valid position, returns the fallback value (default : Char), which is 'A', but does not panic.

This function is overridden with an efficient implementation in runtime code. See String.utf8GetAux for the reference implementation.

Examples:

"abc".get ⟨1⟩ = 'b'
"abc".get ⟨3⟩ = (default : Char) because byte 3 is at the end of the string.
"L∃∀N".get ⟨2⟩ = (default : Char) because byte 2 is in the middle of '∃'.

def

String.get? : String → String.Pos → Option Char

Returns the character at position p of a string. If p is not a valid position, returns none.

This function is overridden with an efficient implementation in runtime code. See String.utf8GetAux? for the reference implementation.

Examples:

"abc".get? ⟨1⟩ = some 'b'
"abc".get? ⟨3⟩ = none
"L∃∀N".get? ⟨1⟩ = some '∃'
"L∃∀N".get? ⟨2⟩ = none

def

String.get! (s : String) (p : String.Pos) : Char

Returns the character at position p of a string. Panics if p is not a valid position.

See String.get? for a safer alternative.

This function is overridden with an efficient implementation in runtime code. See String.utf8GetAux for the reference implementation.

Examples:

"abc".get! ⟨1⟩ = 'b'

def

String.get' (s : String) (p : String.Pos) (h : ¬s.atEnd p = true) : Char

Returns the character at position p of a string. Returns (default : Char), which is 'A', if p is not a valid position.

Requires evidence, h, that p is within bounds instead of performing a run-time bounds check as in String.get.

A typical pattern combines get' with a dependent if-expression to avoid the overhead of an additional bounds check. For example:

def getInBounds? (s : String) (p : String.Pos) : Option Char :=
  if h : s.atEnd p then none else some (s.get' p h)

Even with evidence of ¬ s.atEnd p, p may be invalid if a byte index points into the middle of a multi-byte UTF-8 character. For example, "L∃∀N".get' ⟨2⟩ (by decide) = (default : Char).

Examples:

"abc".get' 0 (by decide) = 'a'
let lean := "L∃∀N"; lean.get' (0 |> lean.next |> lean.next) (by decide) = '∀'

def

String.extract : String → String.Pos → String.Pos → String

Creates a new string that consists of the region of the input string delimited by the two positions.

The result is "" if the start position is greater than or equal to the end position or if the start position is at the end of the string. If either position is invalid (that is, if either points at the middle of a multi-byte UTF-8 character) then the result is unspecified.

Examples:

"red green blue".extract ⟨0⟩ ⟨3⟩ = "red"
"red green blue".extract ⟨3⟩ ⟨0⟩ = ""
"red green blue".extract ⟨0⟩ ⟨100⟩ = "red green blue"
"red green blue".extract ⟨4⟩ ⟨100⟩ = "green blue"
"L∃∀N".extract ⟨2⟩ ⟨100⟩ has an unspecified result, because byte 2 is in the middle of '∃'.

def

String.take (s : String) (n : Nat) : String

Creates a new string that contains the first n characters (Unicode code points) of s.

If n is greater than s.length, returns s.

Examples:

"red green blue".take 3 = "red"
"red green blue".take 1 = "r"
"red green blue".take 0 = ""
"red green blue".take 100 = "red green blue"

def

String.takeWhile (s : String) (p : Char → Bool) : String

Creates a new string that contains the longest prefix of s in which p returns true for all characters.

Examples:

"red green blue".takeWhile (·.isLetter) = "red"
"red green blue".takeWhile (·== 'r') = "r"
"red green blue".takeWhile (·!= 'n') = "red gree"
"red green blue".takeWhile (fun _ => true) = "red green blue"

def

String.takeRight (s : String) (n : Nat) : String

Creates a new string that contains the last n characters (Unicode code points) of s.

If n is greater than s.length, returns s.

Examples:

"red green blue".takeRight 4 = "blue"
"red green blue".takeRight 1 = "e"
"red green blue".takeRight 0 = ""
"red green blue".takeRight 100 = "red green blue"

def

String.takeRightWhile (s : String) (p : Char → Bool) : String

Creates a new string that contains the longest suffix of s in which p returns true for all characters.

Examples:

"red green blue".takeRightWhile (·.isLetter) = "blue"
"red green blue".takeRightWhile (·== 'e') = "e"
"red green blue".takeRightWhile (·!= 'n') = " blue"
"red green blue".takeRightWhile (fun _ => true) = "red green blue"

def

String.drop (s : String) (n : Nat) : String

Removes the specified number of characters (Unicode code points) from the start of the string.

If n is greater than s.length, returns "".

Examples:

"red green blue".drop 4 = "green blue"
"red green blue".drop 10 = "blue"
"red green blue".drop 50 = ""

def

String.dropWhile (s : String) (p : Char → Bool) : String

Creates a new string by removing the longest prefix from s in which p returns true for all characters.

Examples:

"red green blue".dropWhile (·.isLetter) = " green blue"
"red green blue".dropWhile (·== 'r') = "ed green blue"
"red green blue".dropWhile (·!= 'n') = "n blue"
"red green blue".dropWhile (fun _ => true) = ""

def

String.dropRight (s : String) (n : Nat) : String

Removes the specified number of characters (Unicode code points) from the end of the string.

If n is greater than s.length, returns "".

Examples:

"red green blue".dropRight 5 = "red green"
"red green blue".dropRight 11 = "red"
"red green blue".dropRight 50 = ""

def

String.dropRightWhile (s : String) (p : Char → Bool) : String

Creates a new string by removing the longest suffix from s in which p returns true for all characters.

Examples:

"red green blue".dropRightWhile (·.isLetter) = "red green "
"red green blue".dropRightWhile (·== 'e') = "red green blu"
"red green blue".dropRightWhile (·!= 'n') = "red green"
"red green blue".dropRightWhile (fun _ => true) = ""

def

String.dropPrefix? (s pre : String) : Option Substring

If pre is a prefix of s, returns the remainder. Returns none otherwise.

The string pre is a prefix of s if there exists a t : String such that s = pre ++ t. If so, the result is some t.

Use String.stripPrefix to return the string unchanged when pre is not a prefix.
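
Because the result is an Option Substring rather than an Option String, callers that need a String have to convert it. A minimal sketch, assuming that Substring.toString (a library function not documented in this section) is available:

```lean
-- Convert the remaining `Substring` back into a `String`
#eval ("red green blue".dropPrefix? "red ").map Substring.toString
-- some "green blue"

-- When `pre` is not a prefix, the result stays `none`
#eval ("red green blue".dropPrefix? "reed ").map Substring.toString
-- none
```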

Examples:

"red green blue".dropPrefix? "red " = some "green blue"
"red green blue".dropPrefix? "reed " = none
"red green blue".dropPrefix? "" = some "red green blue"

def

String.stripPrefix (s pre : String) : String

If pre is a prefix of s, returns the remainder. Returns s unmodified otherwise.

The string pre is a prefix of s if there exists a t : String such that s = pre ++ t. If so, the result is t. Otherwise, it is s.

Use String.dropPrefix? to return none when pre is not a prefix.

Examples:

"red green blue".stripPrefix "red " = "green blue"
"red green blue".stripPrefix "reed " = "red green blue"
"red green blue".stripPrefix "" = "red green blue"

def

String.dropSuffix? (s suff : String) : Option Substring

If suff is a suffix of s, returns the remainder. Returns none otherwise.

The string suff is a suffix of s if there exists a t : String such that s = t ++ suff. If so, the result is some t.

Use String.stripSuffix to return the string unchanged when suff is not a suffix.

Examples:

"red green blue".dropSuffix? " blue" = some "red green"
"red green blue".dropSuffix? " blu " = none
"red green blue".dropSuffix? "" = some "red green blue"

def

String.stripSuffix (s suff : String) : String

If suff is a suffix of s, returns the remainder. Returns s unmodified otherwise.

The string suff is a suffix of s if there exists a t : String such that s = t ++ suff. If so, the result is t. Otherwise, it is s.

Use String.dropSuffix? to return none when suff is not a suffix.

Examples:

"red green blue".stripSuffix " blue" = "red green"
"red green blue".stripSuffix " blu " = "red green blue"
"red green blue".stripSuffix "" = "red green blue"

def

String.trim (s : String) : String

Removes leading and trailing whitespace from a string.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

Examples:

"abc".trim = "abc"
" abc".trim = "abc"
"abc \t ".trim = "abc"
" abc ".trim = "abc"
"abc\ndef\n".trim = "abc\ndef"

def

String.trimLeft (s : String) : String

Removes leading whitespace from a string.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

Examples:

"abc".trimLeft = "abc"
" abc".trimLeft = "abc"
"abc \t ".trimLeft = "abc \t "
" abc ".trimLeft = "abc "
"abc\ndef\n".trimLeft = "abc\ndef\n"

def

String.trimRight (s : String) : String

Removes trailing whitespace from a string.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

Examples:

"abc".trimRight = "abc"
" abc".trimRight = " abc"
"abc \t ".trimRight = "abc"
" abc ".trimRight = " abc"
"abc\ndef\n".trimRight = "abc\ndef"

def

String.removeLeadingSpaces (s : String) : String

Consistently de-indents the lines in a string, removing the same amount of leading whitespace from each line such that the least-indented line has no leading whitespace.

The number of leading whitespace characters to remove from each line is determined by counting the number of leading space (' ') and tab ('\t') characters on lines after the first line that also contain non-whitespace characters. No distinction is made between tab and space characters; both count equally.

The least number of leading whitespace characters found is then removed from the beginning of each line. The first line's leading whitespace is not counted when determining how far to de-indent the string, but leading whitespace is removed from it.

Examples:

"Here:\n  fun x =>\n    x + 1".removeLeadingSpaces = "Here:\nfun x =>\n  x + 1"
"Here:\n\t\tfun x =>\n\t  \tx + 1".removeLeadingSpaces = "Here:\nfun x =>\n \tx + 1"
"Here:\n\t\tfun x =>\n \n\t  \tx + 1".removeLeadingSpaces = "Here:\nfun x =>\n\n \tx + 1"

def

String.set : String → String.Pos → Char → String

Replaces the character at a specified position in a string with a new character. If the position is invalid, the string is returned unchanged.

If both the replacement character and the replaced character are 7-bit ASCII characters and the string is not shared, then it is updated in-place and not copied.

Examples:

"abc".set ⟨1⟩ 'B' = "aBc"
"abc".set ⟨3⟩ 'D' = "abc"
"L∃∀N".set ⟨4⟩ 'X' = "L∃XN"
"L∃∀N".set ⟨2⟩ 'X' = "L∃∀N" because '∃' is a multi-byte character, so the byte index 2 is an invalid position.

def

String.modify (s : String) (i : String.Pos) (f : Char → Char) : String

Replaces the character at position i in the string s with the result of applying f to that character. If i is an invalid position, the string is returned unchanged.

If both the replacement character and the replaced character are 7-bit ASCII characters and the string is not shared, then it is updated in-place and not copied.

Examples:

"abc".modify ⟨1⟩ Char.toUpper = "aBc"
"abc".modify ⟨3⟩ Char.toUpper = "abc"

def

String.front (s : String) : Char

Returns the first character in s. If s = "", returns (default : Char).

Examples:

"abc".front = 'a'
"".front = (default : Char)

def

String.back (s : String) : Char

Returns the last character in s. If s = "", returns (default : Char).

Examples:

"abc".back = 'c'
"".back = (default : Char)

def

String.posOf (s : String) (c : Char) : String.Pos

Returns the position of the first occurrence of a character, c, in a string s. If s does not contain c, returns s.endPos.

Examples:

"abcba".posOf 'a' = ⟨0⟩
"abcba".posOf 'z' = ⟨5⟩
"L∃∀N".posOf '∀' = ⟨4⟩

def

String.revPosOf (s : String) (c : Char) : Option String.Pos

Returns the position of the last occurrence of a character, c, in a string s. If s does not contain c, returns none.

Examples:

"abcabc".revPosOf 'a' = some ⟨3⟩
"abcabc".revPosOf 'z' = none
"L∃∀N".revPosOf '∀' = some ⟨4⟩

def

String.contains (s : String) (c : Char) : Bool

Checks whether a string contains the specified character.

Examples:

"green".contains 'e' = true
"green".contains 'x' = false
"".contains 'x' = false

def

String.offsetOfPos (s : String) (pos : String.Pos) : Nat

Returns the character index that corresponds to the provided position (i.e. UTF-8 byte index) in a string.

If the position is at the end of the string, then the string's length in characters is returned. If the position is invalid due to pointing at the middle of a UTF-8 byte sequence, then the character index of the next character after the position is returned.

Examples:

"L∃∀N".offsetOfPos ⟨0⟩ = 0
"L∃∀N".offsetOfPos ⟨1⟩ = 1
"L∃∀N".offsetOfPos ⟨2⟩ = 2
"L∃∀N".offsetOfPos ⟨4⟩ = 2
"L∃∀N".offsetOfPos ⟨5⟩ = 3
"L∃∀N".offsetOfPos ⟨50⟩ = 4

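The difference between byte positions and character indices can be seen by combining String.posOf with String.offsetOfPos. The following is a small sketch that relies only on the behavior shown in the examples above; the byteIdx projection is used to print the byte count stored in a String.Pos.

```lean
-- '∀' is the third character of the string, but it starts at byte 4,
-- because 'L' occupies one byte and '∃' occupies three.
#eval ("L∃∀N".posOf '∀').byteIdx             -- 4 (a byte position)
#eval "L∃∀N".offsetOfPos ("L∃∀N".posOf '∀')  -- 2 (a character index)
```
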
def

String.replace (s pattern replacement : String) : String

In the string s, replaces all occurrences of pattern with replacement.

Examples:

"red green blue".replace "e" "" = "rd grn blu"
"red green blue".replace "ee" "E" = "red grEn blue"
"red green blue".replace "e" "E" = "rEd grEEn bluE"

def

String.findLineStart (s : String) (pos : String.Pos) : String.Pos

Returns the position of the beginning of the line that contains the position pos.

Lines are ended by '\n', and the returned position is either 0 : String.Pos or immediately after a '\n' character.

def

String.find (s : String) (p : Char → Bool) : String.Pos

Finds the position of the first character in a string for which the Boolean predicate p returns true. If there is no such character in the string, then the end position of the string is returned.

Examples:

"coffee tea water".find (·.isWhitespace) = ⟨6⟩
"tea".find (· == 'X') = ⟨3⟩
"".find (· == 'X') = ⟨0⟩

def

String.revFind (s : String) (p : Char → Bool) : Option String.Pos

Finds the position of the last character in a string for which the Boolean predicate p returns true. If there is no such character in the string, then none is returned.

Examples:

"coffee tea water".revFind (·.isWhitespace) = some ⟨10⟩
"tea".revFind (· == 'X') = none
"".revFind (· == 'X') = none

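Because String.find returns the end position when no character matches, its result can be used directly as the end of a slice. The sketch below is one possible use; firstWord is a hypothetical helper, and it assumes String.extract, which is not described in this part of the section, with the signature String → String.Pos → String.Pos → String.

```lean
-- Hypothetical helper: the prefix of `s` up to the first whitespace character.
-- If `s` contains no whitespace, `find` returns `s.endPos`, so all of `s` is returned.
def firstWord (s : String) : String :=
  s.extract 0 (s.find (·.isWhitespace))

#eval firstWord "coffee tea water"  -- "coffee"
#eval firstWord "tea"               -- "tea"
```
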
19.8.4.6. Folds and Aggregation

def

String.map (f : Char → Char) (s : String) : String

Applies the function f to every character in a string, returning a string that contains the resulting characters.

Examples:

"abc123".map Char.toUpper = "ABC123"
"".map Char.toUpper = ""

def

String.foldl.{u} {α : Type u} (f : α → Char → α) (init : α)
  (s : String) : α

Folds a function over a string from the left, accumulating a value starting with init. The accumulated value is combined with each character in order, using f.

Examples:

"coffee tea water".foldl (fun n c => if c.isWhitespace then n + 1 else n) 0 = 2
"coffee tea and water".foldl (fun n c => if c.isWhitespace then n + 1 else n) 0 = 3
"coffee tea water".foldl (·.push ·) "" = "coffee tea water"

def

String.foldr.{u} {α : Type u} (f : Char → α → α) (init : α)
  (s : String) : α

Folds a function over a string from the right, accumulating a value starting with init. The accumulated value is combined with each character in reverse order, using f.

Examples:

"coffee tea water".foldr (fun c n => if c.isWhitespace then n + 1 else n) 0 = 2
"coffee tea and water".foldr (fun c n => if c.isWhitespace then n + 1 else n) 0 = 3
"coffee tea water".foldr (fun c s => s.push c) "" = "retaw aet eeffoc"

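The counting pattern used in the fold examples can be packaged as a reusable function. This is only a sketch; countIf is a hypothetical name and is not part of the String API.

```lean
-- Hypothetical helper: counts the characters of `s` that satisfy `p`.
def countIf (p : Char → Bool) (s : String) : Nat :=
  s.foldl (fun n c => if p c then n + 1 else n) 0

#eval countIf Char.isDigit "abc123"                -- 3
#eval countIf (·.isWhitespace) "coffee tea water"  -- 2
```
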
def

String.all (s : String) (p : Char → Bool) : Bool

Checks whether the Boolean predicate p returns true for every character in a string.

Short-circuits at the first character for which p returns false.

Examples:

"brown".all (·.isLetter) = true
"brown and orange".all (·.isLetter) = false
"".all (fun _ => false) = true

def

String.any (s : String) (p : Char → Bool) : Bool

Checks whether there is a character in a string for which the Boolean predicate p returns true.

Short-circuits at the first character for which p returns true.

Examples:

"brown".any (·.isLetter) = true
"brown".any (·.isWhitespace) = false
"brown and orange".any (·.isLetter) = true
"".any (fun _ => false) = false

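Since String.all is true for the empty string regardless of the predicate, validation code usually needs a separate emptiness check. A minimal sketch, in which isWord is a hypothetical helper and Char.isAlpha is used as the predicate:

```lean
-- Hypothetical helper: a non-empty string consisting only of ASCII letters.
-- Without the emptiness check, `isWord ""` would be `true`.
def isWord (s : String) : Bool :=
  !s.isEmpty && s.all Char.isAlpha

#eval isWord "brown"             -- true
#eval isWord "brown and orange"  -- false
#eval isWord ""                  -- false
```
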
19.8.4.7. Comparisons

The LT String instance is defined by the lexicographic ordering on strings based on the LT Char instance. Logically, this is modeled by the lexicographic ordering on the lists that model strings, so List.Lex defines the order. It is decidable, and the decision procedure is overridden at runtime with efficient code that takes advantage of the run-time representation of strings.

def

String.le (a b : String) : Prop

Non-strict inequality on strings, typically used via the ≤ operator.

a ≤ b is defined to mean ¬ b < a.
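
Because both orderings are decidable, they can be evaluated with decide. The following sketch illustrates the lexicographic order and the definition of ≤ in terms of <:

```lean
#eval decide ("tea" < "ten")   -- true: 'a' < 'n' at the first difference
#eval decide ("tea" ≤ "tea")   -- true: "tea" < "tea" does not hold
#eval decide ("teas" ≤ "tea")  -- false: "tea" < "teas", since "tea" is a proper prefix
```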

def

String.firstDiffPos (a b : String) : String.Pos

Returns the first position where the two strings differ.

If one string is a prefix of the other, then the returned position is the end position of the shorter string. If the strings are identical, then their end position is returned.

Examples:

"tea".firstDiffPos "ten" = ⟨2⟩
"tea".firstDiffPos "tea" = ⟨3⟩
"tea".firstDiffPos "teas" = ⟨3⟩
"teas".firstDiffPos "tea" = ⟨3⟩

def

String.substrEq (s1 : String) (pos1 : String.Pos) (s2 : String)
  (pos2 : String.Pos) (sz : Nat) : Bool

Checks whether substrings of two strings are equal. Substrings are indicated by their starting positions and a size in UTF-8 bytes. Returns false if the indicated substring does not exist in either string.
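
As an illustration of the calling convention, the sketch below compares a region of one string against the whole of another. The positions and the size are all measured in bytes.

```lean
-- Compares the 5 bytes of "hello world" that start at byte 6
-- with the 5 bytes of "world" that start at byte 0.
#eval "hello world".substrEq ⟨6⟩ "world" 0 5  -- true

-- A size that runs past the end of either string makes the check fail.
#eval "hello world".substrEq ⟨6⟩ "world" 0 6  -- false
```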

def

String.isPrefixOf (p s : String) : Bool

Checks whether the first string (p) is a prefix of the second (s).

String.startsWith is a version that takes the potential prefix after the string.

Examples:

"red".isPrefixOf "red green blue" = true
"green".isPrefixOf "red green blue" = false
"".isPrefixOf "red green blue" = true

def

String.startsWith (s pre : String) : Bool

Checks whether the first string (s) begins with the second (pre).

String.isPrefixOf is a version that takes the potential prefix before the string.

Examples:

"red green blue".startsWith "red" = true
"red green blue".startsWith "green" = false
"red green blue".startsWith "" = true
"red".startsWith "red" = true

def

String.endsWith (s post : String) : Bool

Checks whether the first string (s) ends with the second (post).

Examples:

"red green blue".endsWith "blue" = true
"red green blue".endsWith "green" = false
"red green blue".endsWith "" = true
"red".endsWith "red" = true

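Prefix and suffix checks are commonly paired with an operation that removes the matched part. The sketch below shows one way to do this; dropPrefixIfPresent and isLeanFile are hypothetical helpers, and String.drop (which removes a number of characters from the front of a string) is assumed from elsewhere in the String API.

```lean
-- Hypothetical helper: removes `pre` from the front of `s` when it is present.
def dropPrefixIfPresent (s pre : String) : String :=
  if s.startsWith pre then s.drop pre.length else s

-- Hypothetical helper: checks a file name's extension.
def isLeanFile (path : String) : Bool :=
  path.endsWith ".lean"

#eval dropPrefixIfPresent "red green blue" "red "   -- "green blue"
#eval dropPrefixIfPresent "red green blue" "green"  -- "red green blue"
#eval isLeanFile "Main.lean"                        -- true
```
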
def

String.decEq (s₁ s₂ : String) : Decidable (s₁ = s₂)

Decides whether two strings are equal. Normally used via the DecidableEq String instance and the = operator.

At runtime, this function is overridden with an efficient native implementation.

opaque

String.hash (s : String) : UInt64

Computes a hash for strings.

19.8.4.8. Manipulation

def

String.split (s : String) (p : Char → Bool) : List String

Splits a string at each character for which p returns true.

The characters that satisfy p are not included in any of the resulting strings. If multiple characters in a row satisfy p, then the resulting list will contain empty strings.

Examples:

"coffee tea water".split (·.isWhitespace) = ["coffee", "tea", "water"]
"coffee  tea  water".split (·.isWhitespace) = ["coffee", "", "tea", "", "water"]
"fun x =>\n x + 1\n".split (· == '\n') = ["fun x =>", " x + 1", ""]

def

String.splitOn (s : String) (sep : String := " ") : List String

Splits a string s on occurrences of the separator string sep. The default separator is " ".

When sep is empty, the result is [s]. When sep occurs in overlapping patterns, the first match is taken. There will always be exactly n+1 elements in the returned list if there were n non-overlapping matches of sep in the string. The separators are not included in the returned substrings.

Examples:

"here is some text ".splitOn = ["here", "is", "some", "text", ""]
"here is some text ".splitOn "some" = ["here is ", " text "]
"here is some text ".splitOn "" = ["here is some text "]
"ababacabac".splitOn "aba" = ["", "bac", "c"]

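The fact that n matches always yield n+1 pieces makes String.splitOn convenient for pattern matching on the result, while the empty strings produced by String.split for adjacent separators usually need to be filtered out. Both patterns are sketched below; parsePair and words are hypothetical helpers.

```lean
-- Hypothetical helper: accepts exactly one "=" separator.
def parsePair (line : String) : Option (String × String) :=
  match line.splitOn "=" with
  | [key, value] => some (key, value)
  | _ => none

-- Hypothetical helper: splits on whitespace and discards the empty strings
-- that arise from runs of consecutive whitespace characters.
def words (s : String) : List String :=
  (s.split (·.isWhitespace)).filter (fun w => !w.isEmpty)

#eval parsePair "color=green"      -- some ("color", "green")
#eval parsePair "a=b=c"            -- none
#eval words "coffee  tea   water"  -- ["coffee", "tea", "water"]
```
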
def

String.push : String → Char → String

Adds a character to the end of a string.

The internal implementation uses dynamic arrays and will perform destructive updates if the string is not shared.

Examples:

"abc".push 'd' = "abcd"
"".push 'a' = "a"

def

String.pushn (s : String) (c : Char) (n : Nat) : String

Adds multiple repetitions of a character to the end of a string.

Returns s, with n repetitions of c at the end. Internally, the implementation repeatedly calls String.push, so the string is modified in-place if there is a unique reference to it.

Examples:

"indeed".pushn '!' 2 = "indeed!!"
"indeed".pushn '!' 0 = "indeed"
"".pushn ' ' 4 = "    "

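One typical use of String.pushn is padding a string to a fixed width. In the sketch below, padRight is a hypothetical helper; it relies on truncating subtraction of natural numbers, so strings that are already wide enough are returned unchanged.

```lean
-- Hypothetical helper: pads `s` with spaces on the right to `width` characters.
-- `width - s.length` is 0 when `s` is already at least `width` characters long.
def padRight (s : String) (width : Nat) : String :=
  s.pushn ' ' (width - s.length)

#eval padRight "tea" 6     -- "tea   "
#eval padRight "coffee" 3  -- "coffee"
```
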
def

String.capitalize (s : String) : String

Replaces the first character in s with the result of applying Char.toUpper to it. Returns the empty string if the string is empty.

Char.toUpper has no effect on characters outside of the range 'a'–'z'.

Examples:

"orange".capitalize = "Orange"
"ORANGE".capitalize = "ORANGE"
"".capitalize = ""

def

String.decapitalize (s : String) : String

Replaces the first character in s with the result of applying Char.toLower to it. Returns the empty string if the string is empty.

Char.toLower has no effect on characters outside of the range 'A'–'Z'.

Examples:

"Orange".decapitalize = "orange"
"ORANGE".decapitalize = "oRANGE"
"".decapitalize = ""

def

String.toUpper (s : String) : String

Replaces each character in s with the result of applying Char.toUpper to it.

Char.toUpper has no effect on characters outside of the range 'a'–'z'.

Examples:

"orange".toUpper = "ORANGE"
"abc123".toUpper = "ABC123"

def

String.toLower (s : String) : String

Replaces each character in s with the result of applying Char.toLower to it.

Char.toLower has no effect on characters outside of the range 'A'–'Z'.

Examples:

"ORANGE".toLower = "orange"
"Orange".toLower = "orange"
"ABc123".toLower = "abc123"

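Because the case conversions only affect the ASCII letters, a comparison built from them is case-insensitive for ASCII only. A sketch, in which eqIgnoreAsciiCase is a hypothetical helper:

```lean
-- Hypothetical helper: compares two strings, ignoring the case of ASCII letters.
def eqIgnoreAsciiCase (a b : String) : Bool :=
  a.toLower == b.toLower

#eval eqIgnoreAsciiCase "Orange" "ORANGE"  -- true
#eval eqIgnoreAsciiCase "Ärger" "ärger"    -- false: 'Ä' is outside 'A'–'Z'
```
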
19.8.4.9. Iterators

Fundamentally, a String.Iterator is a pair of a string and a valid position in the string. Iterators provide functions for getting the current character (curr), replacing the current character (setCurr), checking whether the iterator can move to the left or the right (hasPrev and hasNext, respectively), and moving the iterator (prev and next, respectively). Clients are responsible for checking whether they've reached the beginning or end of the string; otherwise, the iterator ensures that its position always points at a character.
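
A character-by-character loop over an iterator might look like the following sketch. The function countUpper is hypothetical, and it is marked partial to avoid a termination proof, which is a simplification rather than a recommendation.

```lean
-- Hypothetical helper: counts the ASCII uppercase characters from the
-- iterator's current position to the end of the string.
partial def countUpper (it : String.Iterator) (acc : Nat := 0) : Nat :=
  if it.hasNext then
    -- `hasNext` guarantees that `curr` and `next` are meaningful here.
    countUpper it.next (if it.curr.isUpper then acc + 1 else acc)
  else
    acc

#eval countUpper "Lean ∃ Strings".iter  -- 2
```

The check with hasNext before each call to next is what keeps the iterator valid throughout the loop.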

structure

String.Iterator : Type

An iterator over the characters (Unicode code points) in a String. Typically created by String.iter.

String iterators pair a string with a valid byte index. This allows efficient character-by-character processing of strings while avoiding the need to manually ensure that byte indices are used with the correct strings.

An iterator is valid if the position i is valid for the string s, meaning 0 ≤ i ≤ s.endPos and i lies on the boundary of a UTF-8 encoded character. If i = s.endPos, the iterator is at the end of the string.

Most operations on iterators return unspecified values if the iterator is not valid. The functions in the String.Iterator API rule out the creation of invalid iterators, with two exceptions:

Iterator.next iter is invalid if iter is already at the end of the string (iter.atEnd is true), and
Iterator.forward iter n/Iterator.nextn iter n is invalid if n is strictly greater than the number of remaining characters.

Constructor

String.Iterator.mk

Fields

s : String

The string being iterated over.

i : String.Pos

The current UTF-8 byte position in the string s.

This position is not guaranteed to be valid for the string. If the position is not valid, then the current character is (default : Char), similar to String.get on an invalid position.

def

String.iter (s : String) : String.Iterator

Creates an iterator at the beginning of the string.

def

String.mkIterator (s : String) : String.Iterator

Creates an iterator at the beginning of the string.

def

String.Iterator.curr : String.Iterator → Char

Gets the character at the iterator's current position.

A run-time bounds check is performed. Use String.Iterator.curr' to avoid redundant bounds checks.

If the position is invalid, returns (default : Char).

def

String.Iterator.hasNext : String.Iterator → Bool

Checks whether the iterator is at or before the string's last character.

def

String.Iterator.next : String.Iterator → String.Iterator

Moves the iterator's position forward by one character, unconditionally.

It is only valid to call this function if the iterator is not at the end of the string (i.e. if Iterator.atEnd is false); otherwise, the resulting iterator will be invalid.

def

String.Iterator.forward : String.Iterator → Nat → String.Iterator

Moves the iterator's position forward by the specified number of characters.

The resulting iterator is only valid if the number of characters to skip is less than or equal to the number of characters left in the iterator.

def

String.Iterator.nextn : String.Iterator → Nat → String.Iterator

Moves the iterator's position forward by the specified number of characters.

The resulting iterator is only valid if the number of characters to skip is less than or equal to the number of characters left in the iterator.

def

String.Iterator.hasPrev : String.Iterator → Bool

Checks whether the iterator is after the beginning of the string.

def

String.Iterator.prev : String.Iterator → String.Iterator

Moves the iterator's position backward by one character, unconditionally.

The position is not changed if the iterator is at the beginning of the string.

def

String.Iterator.prevn : String.Iterator → Nat → String.Iterator

Moves the iterator's position back by the specified number of characters, stopping at the beginning of the string.

def

String.Iterator.atEnd : String.Iterator → Bool

Checks whether the iterator is past its string's last character.

def

String.Iterator.toEnd : String.Iterator → String.Iterator

Moves the iterator's position to the end of the string, just past the last character.

def

String.Iterator.setCurr : String.Iterator → Char → String.Iterator

Replaces the current character in the string.

Does nothing if the iterator is at the end of the string. If both the replacement character and the replaced character are 7-bit ASCII characters and the string is not shared, then it is updated in-place and not copied.

def

String.Iterator.extract : String.Iterator → String.Iterator → String

Extracts the substring between the positions of two iterators. The first iterator's position is the start of the substring, and the second iterator's position is the end.

Returns the empty string if the iterators are for different strings, or if the position of the first iterator is past the position of the second iterator.

def

String.Iterator.remainingToString : String.Iterator → String

The remaining characters in an iterator, as a string.

def

String.Iterator.remainingBytes : String.Iterator → Nat

The number of UTF-8 bytes remaining in the iterator.

def

String.Iterator.pos (self : String.Iterator) : String.Pos

The current UTF-8 byte position in the string s.

This position is not guaranteed to be valid for the string. If the position is not valid, then the current character is (default : Char), similar to String.get on an invalid position.

19.8.4.10. Substrings

def

String.toSubstring (s : String) : Substring

Converts a String into a Substring that denotes the entire string.

def

String.toSubstring' (s : String) : Substring

Converts a String into a Substring that denotes the entire string.

This is a version of String.toSubstring that doesn't have an @[inline] annotation.

structure

Substring : Type

A region or slice of some underlying string.

A substring contains a string together with the start and end byte positions of a region of interest. Actually extracting a substring requires copying and memory allocation, while many substrings of the same underlying string may exist with very little overhead, and they are more convenient than tracking the bounds by hand.

Using its constructor explicitly, it is possible to construct a Substring in which one or both of the positions is invalid for the string. Many operations will return unexpected or confusing results if the start and stop positions are not valid. Instead, it's better to use API functions that ensure the validity of the positions in a substring to create and manipulate them.

Constructor

Substring.mk

Fields

str : String

The underlying string.

startPos : String.Pos

The byte position of the start of the string slice.

stopPos : String.Pos

The byte position of the end of the string slice.
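
A typical pattern is to convert a string to a substring once, narrow it with operations that only move the two positions, and copy the result at the very end. The sketch below illustrates this with functions described later in this section; leadingDigits and afterColon are hypothetical helpers.

```lean
-- Hypothetical helper: the longest prefix of `s` that consists of digits.
def leadingDigits (s : String) : String :=
  (s.toSubstring.takeWhile Char.isDigit).toString

-- Hypothetical helper: the text after the first ':' with surrounding
-- whitespace removed. Only the final `toString` copies any characters.
def afterColon (s : String) : String :=
  ((s.toSubstring.dropWhile (· != ':')).drop 1).trim.toString

#eval leadingDigits "2024-05-01"  -- "2024"
#eval afterColon "name:  Lean "   -- "Lean"
```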

19.8.4.10.1. Properties

def

Substring.isEmpty (ss : Substring) : Bool

Checks whether a substring is empty.

A substring is empty if its start and end positions are the same.

def

Substring.bsize : Substring → Nat

The number of bytes used by the string's UTF-8 encoding.

19.8.4.10.2. Positions

def

Substring.atEnd : Substring → String.Pos → Bool

Checks whether a position in a substring is precisely equal to its ending position.

The position is understood relative to the substring's starting position, rather than the underlying string's starting position.

def

Substring.posOf (s : Substring) (c : Char) : String.Pos

Returns the substring-relative position of the first occurrence of c in s, or s.bsize if c doesn't occur.

def

Substring.next : Substring → String.Pos → String.Pos

Returns the next position in a substring after the given position. If the position is at the end of the substring, it is returned unmodified.

Both the input position and the returned position are interpreted relative to the substring's start position, not the underlying string.

def

Substring.nextn : Substring → Nat → String.Pos → String.Pos

Returns the position that's the specified number of characters forward from the given position in a substring. If the end position of the substring is reached, it is returned.

Both the input position and the returned position are interpreted relative to the substring's start position, not the underlying string.

def

Substring.prev : Substring → String.Pos → String.Pos

Returns the previous position in a substring, just prior to the given position. If the position is at the beginning of the substring, it is returned unmodified.

Both the input position and the returned position are interpreted relative to the substring's start position, not the underlying string.

def

Substring.prevn : Substring → Nat → String.Pos → String.Pos

Returns the position that's the specified number of characters prior to the given position in a substring. If the start position of the substring is reached, it is returned.

Both the input position and the returned position are interpreted relative to the substring's start position, not the underlying string.

19.8.4.10.3. Folds and Aggregation

def

Substring.foldl.{u} {α : Type u} (f : α → Char → α) (init : α)
  (s : Substring) : α

Folds a function over a substring from the left, accumulating a value starting with init. The accumulated value is combined with each character in order, using f.

def

Substring.foldr.{u} {α : Type u} (f : Char → α → α) (init : α)
  (s : Substring) : α

Folds a function over a substring from the right, accumulating a value starting with init. The accumulated value is combined with each character in reverse order, using f.

def

Substring.all (s : Substring) (p : Char → Bool) : Bool

Checks whether the Boolean predicate p returns true for every character in a substring.

Short-circuits at the first character for which p returns false.

def

Substring.any (s : Substring) (p : Char → Bool) : Bool

Checks whether the Boolean predicate p returns true for any character in a substring.

Short-circuits at the first character for which p returns true.

19.8.4.10.4. Comparisons

def

Substring.beq (ss1 ss2 : Substring) : Bool

Checks whether two substrings represent equal strings. Usually accessed via the == operator.

Two substrings do not need to have the same underlying string or the same start and end positions; instead, they are equal if they contain the same sequence of characters.

def

Substring.sameAs (ss1 ss2 : Substring) : Bool

Checks whether two substrings have the same position and content.

The two substrings do not need to have the same underlying string for this check to succeed.

19.8.4.10.5. Prefix and Suffix

def

Substring.commonPrefix (s t : Substring) : Substring

Returns the longest common prefix of two substrings.

The returned substring uses the same underlying string as s.

def

Substring.commonSuffix (s t : Substring) : Substring

Returns the longest common suffix of two substrings.

The returned substring uses the same underlying string as s.

def

Substring.dropPrefix? (s pre : Substring) : Option Substring

If pre is a prefix of s, returns the remainder. Returns none otherwise.

The substring pre is a prefix of s if there exists a t : Substring such that s.toString = pre.toString ++ t.toString. If so, the result is the substring of s without the prefix.

def

Substring.dropSuffix? (s suff : Substring) : Option Substring

If suff is a suffix of s, returns the remainder. Returns none otherwise.

The substring suff is a suffix of s if there exists a t : Substring such that s.toString = t.toString ++ suff.toString. If so, the result is the substring of s without the suffix.
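
The Option result makes these functions suitable for combined test-and-strip operations. A sketch using Substring.dropPrefix?, in which stripLineComment? is a hypothetical helper:

```lean
-- Hypothetical helper: returns the text after a leading "--", if there is one.
def stripLineComment? (line : String) : Option String :=
  (line.toSubstring.dropPrefix? "--".toSubstring).map (·.toString)

#eval stripLineComment? "-- a comment"  -- some " a comment"
#eval stripLineComment? "x + 1"         -- none
```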

19.8.4.10.6. Lookups

def

Substring.get : Substring → String.Pos → Char

Returns the character at the given position in the substring.

The position is relative to the substring, rather than the underlying string, and no bounds checking is performed with respect to the substring's end position. If the relative position is not a valid position in the underlying string, the fallback value (default : Char), which is 'A', is returned. Does not panic.

def

Substring.contains (s : Substring) (c : Char) : Bool

Checks whether a substring contains the specified character.

def

Substring.front (s : Substring) : Char

Returns the first character in the substring.

If the substring is empty, but the substring's start position is a valid position in the underlying string, then the character at the start position is returned. If the substring's start position is not a valid position in the string, the fallback value (default : Char), which is 'A', is returned. Does not panic.

19.8.4.10.7. Modifications

def

Substring.drop : Substring → Nat → Substring

Removes the specified number of characters (Unicode code points) from the beginning of a substring by advancing its start position.

If the substring's end position is reached, the start position is not advanced past it.

def

Substring.dropWhile : Substring → (Char → Bool) → Substring

Removes the longest prefix of a substring in which a Boolean predicate returns true for all characters by moving the substring's start position. The start position is moved to the position of the first character for which the predicate returns false, or to the substring's end position if the predicate always returns true.

def

Substring.dropRight : Substring → Nat → Substring

Removes the specified number of characters (Unicode code points) from the end of a substring by moving its end position towards its start position.

If the substring's start position is reached, the end position is not retracted past it.

def

Substring.dropRightWhile : Substring → (Char → Bool) → Substring

Removes the longest suffix of a substring in which a Boolean predicate returns true for all characters by moving the substring's end position. The end position is moved just after the position of the last character for which the predicate returns false, or to the substring's start position if the predicate always returns true.

def

Substring.take : Substring → Nat → Substring

Retains only the specified number of characters (Unicode code points) at the beginning of a substring, by moving its end position towards its start position.

If the substring's start position is reached, the end position is not retracted past it.

def

Substring.takeWhile : Substring → (Char → Bool) → Substring

Retains only the longest prefix of a substring in which a Boolean predicate returns true for all characters by moving the substring's end position towards its start position.

def

Substring.takeRight : Substring → Nat → Substring

Retains only the specified number of characters (Unicode code points) at the end of a substring, by moving its start position towards its end position.

If the substring's end position is reached, the start position is not advanced past it.

def

Substring.takeRightWhile : Substring → (Char → Bool) → Substring

Retains only the longest suffix of a substring in which a Boolean predicate returns true for all characters by moving the substring's start position towards its end position.

def

Substring.extract : Substring → String.Pos → String.Pos → Substring

Returns the region of the substring delimited by the provided start and stop positions, as a substring. The positions are interpreted with respect to the substring's start position, rather than the underlying string.

If the resulting substring is empty, then it is a substring of the empty string "". Otherwise, its underlying string is that of the input substring, with the start and end positions adjusted.

def

Substring.trim : Substring → Substring

Removes leading and trailing whitespace from a substring by first moving its start position to the first non-whitespace character, and then moving its end position to the last non-whitespace character.

If the substring consists only of whitespace, then the resulting substring's start position is moved to its end position.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

Examples:

" red green blue ".toSubstring.trim.toString = "red green blue"
" red green blue ".toSubstring.trim.startPos = ⟨1⟩
" red green blue ".toSubstring.trim.stopPos = ⟨15⟩
"     ".toSubstring.trim.startPos = ⟨5⟩

def

Substring.trimLeft (s : Substring) : Substring

Removes leading whitespace from a substring by moving its start position to the first non-whitespace character, or to its end position if there is no non-whitespace character.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

def

Substring.trimRight (s : Substring) : Substring

Removes trailing whitespace from a substring by moving its end position to the last non-whitespace character, or to its start position if there is no non-whitespace character.

“Whitespace” is defined as characters for which Char.isWhitespace returns true.

def

Substring.splitOn (s : Substring) (sep : String := " ") : List Substring

Splits a substring s on occurrences of the separator string sep. The default separator is " ".

19.8.4.10.8. Conversions

def

Substring.toString : Substring → String

Copies the region of the underlying string pointed to by a substring into a fresh string.

def

Substring.isNat (s : Substring) : Bool

Checks whether the substring can be interpreted as the decimal representation of a natural number.

A substring can be interpreted as a decimal natural number if it is not empty and all the characters in it are digits.

Use Substring.toNat? to convert such a substring to a natural number.

def

Substring.toNat? (s : Substring) : Option Nat

Checks whether the substring can be interpreted as the decimal representation of a natural number, returning the number if it can.

A substring can be interpreted as a decimal natural number if it is not empty and all the characters in it are digits.

Use Substring.isNat to check whether the substring is such a substring.
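
Since whitespace is not a digit, input usually needs to be trimmed before conversion. A sketch that combines Substring.trim with Substring.toNat?, in which parseNat? is a hypothetical helper:

```lean
-- Hypothetical helper: parses a decimal natural number, ignoring
-- leading and trailing whitespace.
def parseNat? (s : String) : Option Nat :=
  s.toSubstring.trim.toNat?

#eval parseNat? "  42 "  -- some 42
#eval parseNat? "4 2"    -- none: the inner space is not a digit
#eval parseNat? ""       -- none: the empty substring is not a number
```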

def

Substring.toIterator : Substring → String.Iterator

Returns an iterator into the underlying string, at the substring's starting position. The ending position is discarded, so the iterator alone cannot be used to determine whether its current position is within the original substring.
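To make the loss of the ending position concrete, the sketch below builds a three-character substring in the middle of an example string and then converts it to an iterator. The iterator starts at the right place but still sees the rest of the underlying string. The name word is hypothetical.

```lean
-- A view of the characters "wor" inside "hello world".
def word : Substring := ("hello world".toSubstring.drop 6).take 3

#eval word.toString                      -- "wor"
#eval word.toIterator.curr               -- 'w'

-- The iterator is not limited to the substring's end.
#eval word.toIterator.remainingToString  -- "world"
```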

def

Substring.toName (s : Substring) : Lean.Name

Converts a substring to the Lean compiler's representation of names. The resulting name is hierarchical, and the string is split at the dots ('.').

"a.b".toSubstring.toName is the name a.b, not «a.b». For the latter, use Name.mkSimple ∘ Substring.toString.

19.8.4.11. Metaprogramming

def

String.toName (s : String) : Lean.Name

Converts a string to the Lean compiler's representation of names. The resulting name is hierarchical, and the string is split at the dots ('.').

"a.b".toName is the name a.b, not «a.b». For the latter, use Name.mkSimple.

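The difference between a hierarchical name and a simple name can be checked by comparing against a name literal. This is only a sketch: it imports Lean to be sure that all of the functions are in scope, which may not be strictly necessary.

```lean
import Lean

-- Splitting at the dot yields the two-component name a.b.
#eval "a.b".toName == `a.b              -- true
#eval "a.b".toSubstring.toName == `a.b  -- true

-- Name.mkSimple keeps the dot inside a single component, giving «a.b».
#eval Lean.Name.mkSimple "a.b" == `a.b  -- false
```
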
def

String.quote (s : String) : String

Converts a string to its corresponding Lean string literal syntax. Double quotes are added to each end, and internal characters are escaped as needed.

Examples:

"abc".quote = "\"abc\""
"\"".quote = "\"\\\"\""

19.8.4.12. Encodings

def

String.getUtf8Byte (s : String) (n : Nat) (h : n < s.utf8ByteSize) :
  UInt8

Accesses the indicated byte in the UTF-8 encoding of a string.

At runtime, this function is implemented by efficient, constant-time code.

def

String.utf8ByteSize : String → Nat

The number of bytes used by the string's UTF-8 encoding.

At runtime, this function takes constant time because the byte length of strings is cached.
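The sketch below contrasts the byte size with the character count and shows one way to supply the bounds proof that String.getUtf8Byte requires, by testing the bound with a dependent if. The helper firstByte? is hypothetical and not part of the library, and the code assumes the signature shown above.

```lean
-- 'é' is one character but two bytes in UTF-8.
#eval "é".length        -- 1
#eval "é".utf8ByteSize  -- 2

/-- Hypothetical helper: the first byte of a string's UTF-8 encoding, if any. -/
def firstByte? (s : String) : Option UInt8 :=
  -- `h` is the proof that index 0 is in bounds.
  if h : 0 < s.utf8ByteSize then some (s.getUtf8Byte 0 h) else none

#eval firstByte? "abc"  -- some 97
#eval firstByte? ""     -- none
```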

def

String.utf8EncodeChar (c : Char) : List UInt8

Returns the sequence of bytes in a character's UTF-8 encoding.
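For illustration, the following evaluates the encoding of characters that need one, two, and three bytes. The byte values follow from the UTF-8 encoding of U+0061, U+00E9, and U+20AC.

```lean
#eval String.utf8EncodeChar 'a'  -- [97]
#eval String.utf8EncodeChar 'é'  -- [195, 169]
#eval String.utf8EncodeChar '€'  -- [226, 130, 172]
```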

def

String.utf8DecodeChar? (a : ByteArray) (i : Nat) : Option Char

Decodes the UTF-8 character sequence that starts at index i in a byte array, or returns none if i is out of bounds or is not the start of a valid UTF-8 character.
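A small sketch of the three outcomes, using the two-byte encoding of 'é' as input. The name eAcute is only for illustration.

```lean
def eAcute : ByteArray := "é".toUTF8  -- two bytes: 0xC3 0xA9

-- Index 0 is the start of a character.
#eval String.utf8DecodeChar? eAcute 0 == some 'é'  -- true

-- Index 1 is a continuation byte, and index 2 is out of bounds.
#eval String.utf8DecodeChar? eAcute 1              -- none
#eval String.utf8DecodeChar? eAcute 2              -- none
```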

def

String.fromUTF8 (a : ByteArray) (h : String.validateUTF8 a = true) :
  String

Decodes an array of bytes that encode a string as UTF-8 into the corresponding string. Invalid UTF-8 characters in the byte array result in (default : Char), or 'A', in the string.

def

String.fromUTF8? (a : ByteArray) : Option String

Decodes an array of bytes that encode a string as UTF-8 into the corresponding string, or returns none if the array is not a valid UTF-8 encoding of a string.

def

String.fromUTF8! (a : ByteArray) : String

Decodes an array of bytes that encode a string as UTF-8 into the corresponding string, or panics if the array is not a valid UTF-8 encoding of a string.
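The three decoding functions differ only in how they handle invalid input. The sketch below assumes the signatures listed in this section. The names validBytes, invalidBytes, and decodeOrEmpty are hypothetical, and falling back to the empty string is an arbitrary choice made for the example.

```lean
def validBytes : ByteArray := "héllo".toUTF8
def invalidBytes : ByteArray := ByteArray.mk #[0xff]  -- 0xFF never occurs in UTF-8

-- The Option-returning variant reports failure as `none`.
#eval String.fromUTF8? validBytes == some "héllo"  -- true
#eval (String.fromUTF8? invalidBytes).isNone       -- true

-- The panicking variant is convenient when the input is known to be valid.
#eval String.fromUTF8! validBytes == "héllo"       -- true

/-- Hypothetical helper: obtain the proof for `String.fromUTF8` by validating first. -/
def decodeOrEmpty (bytes : ByteArray) : String :=
  if h : String.validateUTF8 bytes then String.fromUTF8 bytes h else ""

#eval decodeOrEmpty invalidBytes  -- ""
```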

def

String.toUTF8 (a : String) : ByteArray

Encodes a string in UTF-8 as an array of bytes.

def

String.validateUTF8 (a : ByteArray) : Bool

Checks whether an array of bytes is a valid UTF-8 encoding of a string.
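As a quick sketch, encoding a string always produces bytes that pass validation, while a truncated multi-byte sequence does not. The single byte 0xC3 is used here because it begins a two-byte sequence that is never completed.

```lean
-- Encode, then inspect the bytes as a list.
#eval "é".toUTF8.toList                           -- [195, 169]

-- Encoded strings validate; a lone lead byte does not.
#eval String.validateUTF8 "é".toUTF8              -- true
#eval String.validateUTF8 (ByteArray.mk #[0xc3])  -- false
```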

def

String.crlfToLf (text : String) : String

Replaces each \r\n with \n to normalize line endings, but does not validate that there are no isolated \r characters.

This is an optimized version of String.replace text "\r\n" "\n".
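The behavior described above can be checked with a few comparisons. The inputs are arbitrary examples; the second one shows that an isolated \r is left in place.

```lean
-- Every \r\n pair becomes \n.
#eval "a\r\nb\r\n".crlfToLf == "a\nb\n"                  -- true

-- An isolated \r is neither removed nor reported.
#eval "a\rb".crlfToLf == "a\rb"                          -- true

-- The result agrees with the general-purpose replacement.
#eval "a\r\nb".crlfToLf == "a\r\nb".replace "\r\n" "\n"  -- true
```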

19.8.5. FFI

FFI type

typedef struct {
    lean_object m_header;
    /* byte length including '\0' terminator */
    size_t      m_size;
    size_t      m_capacity;
    /* UTF8 length */
    size_t      m_length;
    char        m_data[0];
} lean_string_object;

The representation of strings in C. See the description of run-time Strings for more details.

FFI function

bool lean_is_string(lean_object * o)

Returns true if o is a string, or false otherwise.

FFI function

lean_string_object * lean_to_string(lean_object * o)

Performs a runtime check that o is indeed a string. If o is not a string, an assertion fails.