19.6. Floating-Point Numbers

Floating-point numbers are an approximation of the real numbers that is efficiently implemented in computer hardware. Computations that use floating-point numbers are very efficient; however, the way in which they approximate the real numbers is complex, with many corner cases. The IEEE 754 standard, which defines the floating-point format that is used on modern computers, allows hardware designers to make certain choices, and real systems differ in these small details. For example, there are many distinct bit representations of NaN, the indicator that a result is undefined, and some platforms differ with respect to which NaN is returned from adding two NaNs.

Lean exposes the underlying platform's floating-point values for use in programming, but they are not encoded in Lean's logic. They are represented by an opaque type. This means that the kernel is not capable of computing with or reasoning about floating-point values without additional axioms. A consequence of this is that equality of floating-point numbers is not decidable. Furthermore, comparisons between floating-point values are decidable, but the code that does so is opaque; in practice, the decision procedure can only be used in compiled code.

Lean provides two floating-point types: Float represents 64-bit floating-point values, while Float32 represents 32-bit floating-point values. The precision of Float does not vary based on the platform that Lean is running on.

type

Float : Type

64-bit floating-point numbers.

Float corresponds to the IEEE 754 binary64 format (double in C or f64 in Rust). Floating-point numbers are a finite representation of a subset of the real numbers, extended with extra “sentinel” values that represent undefined and infinite results as well as separate positive and negative zeroes. Arithmetic on floating-point numbers approximates the corresponding operations on the real numbers by rounding the results to numbers that are representable, propagating error and infinite values.

Floating-point numbers include subnormal numbers. Their special values are:

NaN, which denotes a class of “not a number” values that result from operations such as dividing zero by zero, and
Inf and -Inf, which represent positive and negative infinities that result from dividing non-zero values by zero.

type

Float32 : Type

32-bit floating-point numbers.

Float32 corresponds to the IEEE 754 binary32 format (float in C or f32 in Rust). Floating-point numbers are a finite representation of a subset of the real numbers, extended with extra “sentinel” values that represent undefined and infinite results as well as separate positive and negative zeroes. Arithmetic on floating-point numbers approximates the corresponding operations on the real numbers by rounding the results to numbers that are representable, propagating error and infinite values.

Floating-point numbers include subnormal numbers. Their special values are:

NaN, which denotes a class of “not a number” values that result from operations such as dividing zero by zero, and
Inf and -Inf, which represent positive and negative infinities that result from dividing non-zero values by zero.

No Kernel Reasoning About Floating-Point Numbers

The Lean kernel can compare expressions of type Float for syntactic equality, so 0.0 is definitionally equal to itself.

example : (0.0 : Float) = (0.0 : Float) := by rfl

Terms that require reduction to become syntactically equal cannot be checked by the kernel:

example : (0.0 : Float) = (0.0 + 0.0 : Float) := by rfl

Tactic `rfl` failed: The left-hand side
  0.0
is not definitionally equal to the right-hand side
  0.0 + 0.0

⊢ 0.0 = 0.0 + 0.0

Similarly, the kernel cannot evaluate Bool-valued comparisons of floating-point numbers while checking definitional equality:

theorem Float.zero_eq_zero_plus_zero :
    ((0.0 : Float) == (0.0 + 0.0 : Float)) = true :=
  by rfl

Tactic `rfl` failed: The left-hand side
  0.0 == 0.0 + 0.0
is not definitionally equal to the right-hand side
  true

⊢ (0.0 == 0.0 + 0.0) = true

However, the native_decide tactic can invoke the underlying platform's floating-point primitives that are used by Lean for run-time programs:

theorem Float.zero_eq_zero_plus_zero :
    ((0.0 : Float) == (0.0 + 0.0 : Float)) = true := by
  native_decide

This tactic uses the axiom Lean.trustCompiler, which states that the Lean compiler, interpreter and the low-level implementations of built-in operators are trusted in addition to the kernel.

#print axioms Float.zero_eq_zero_plus_zero

'Float.zero_eq_zero_plus_zero' depends on axioms: [propext, Classical.choice, Lean.ofReduceBool, Lean.trustCompiler]

Floating-Point Equality Is Not Reflexive

Floating-point operations may produce NaN values that indicate an undefined result. These values are not comparable with each other; in particular, all comparisons involving NaN will return false, including equality.

#eval ((0.0 : Float) / 0.0) == ((0.0 : Float) / 0.0)

false

Floating-Point Equality Is Not a Congruence

Applying a function to two equal floating-point numbers may not result in equal numbers. In particular, positive and negative zero are distinct values that are equated by floating-point equality, but division by positive or negative zero yields positive or negative infinite values.

def neg0 : Float := -0.0

def pos0 : Float := 0.0

#eval (neg0 == pos0, 1.0 / neg0 == 1.0 / pos0)

(true, false)

19.6.1. Syntax

Lean does not have dedicated floating-point literals. Instead, floating-point literals are resolved via the appropriate instances of the OfScientific and Neg type classes.

Floating-Point Literals

The term

(-2.523 : Float)

is syntactic sugar for

(Neg.neg (OfScientific.ofScientific 2523 true 3) : Float)

and the term

(413.52 : Float32)

is syntactic sugar for

(OfScientific.ofScientific 41352 true 2 : Float32)
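The desugaring described above can be checked informally by comparing literals with explicit calls. This is only a sketch; it assumes the usual reading of the three arguments of OfScientific.ofScientific (the mantissa, a flag that is true when the decimal exponent is negative, and the magnitude of that exponent), and the results in the comments follow from that reading.

```lean
-- 413.52 is 41352 × 10⁻², so the flag is `true` and the exponent is 2.
#eval (OfScientific.ofScientific 41352 true 2 : Float32) == 413.52   -- true

-- A leading minus sign is a separate application of `Neg.neg`.
#eval (Neg.neg (OfScientific.ofScientific 5 true 1) : Float) == -0.5 -- true

-- With the flag set to `false`, the exponent is non-negative: 15 × 10² = 1500.
#eval (OfScientific.ofScientific 15 false 2 : Float) == 1500.0       -- true
```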

19.6.2. API Reference

19.6.2.1. Properties

Floating-point numbers fall into one of three categories:

Finite numbers are ordinary floating-point values.
Infinities, which may be positive or negative, result from division by zero.
NaNs, which are not numbers, result from other undefined operations, such as the square root of a negative number.
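As a small illustration of these three categories, the following evaluations classify a few values using the predicates documented in this section. The expected results in the comments rely only on the IEEE 754 behavior described above.

```lean
-- Dividing a non-zero value by zero gives an infinity.
#eval ((1.0 : Float) / 0.0).isInf      -- true

-- Dividing zero by zero is undefined, so the result is a NaN.
#eval ((0.0 : Float) / 0.0).isNaN      -- true

-- Ordinary values, including zero, are finite.
#eval (2.5 : Float).isFinite           -- true
#eval (0.0 : Float).isFinite           -- true

-- Infinities and NaNs are not finite.
#eval ((1.0 : Float) / 0.0).isFinite   -- false
#eval ((0.0 : Float) / 0.0).isFinite   -- false
```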

opaque

Float.isInf : Float → Bool

Checks whether a floating-point number is a positive or negative infinite number, but not a finite number or NaN.

This function does not reduce in the kernel. It is compiled to the C operator isinf.

opaque

Float32.isInf : Float32 → Bool

Checks whether a floating-point number is a positive or negative infinite number, but not a finite number or NaN.

This function does not reduce in the kernel. It is compiled to the C operator isinf.

opaque

Float.isNaN : Float → Bool

Checks whether a floating-point number is a NaN (“not a number”) value.

NaN values result from operations that might otherwise be errors, such as dividing zero by zero.

This function does not reduce in the kernel. It is compiled to the C operator isnan.

opaque

Float32.isNaN : Float32 → Bool

Checks whether a floating-point number is a NaN (“not a number”) value.

NaN values result from operations that might otherwise be errors, such as dividing zero by zero.

This function does not reduce in the kernel. It is compiled to the C operator isnan.

opaque

Float.isFinite : Float → Bool

Checks whether a floating-point number is finite, that is, whether it is normal, subnormal, or zero, but not infinite or NaN.

This function does not reduce in the kernel. It is compiled to the C operator isfinite.

opaque

Float32.isFinite : Float32 → Bool

Checks whether a floating-point number is finite, that is, whether it is normal, subnormal, or zero, but not infinite or NaN.

This function does not reduce in the kernel. It is compiled to the C operator isfinite.

19.6.2.2. Syntax

These operations exist to support the OfScientific Float and OfScientific Float32 instances and are normally invoked indirectly as a result of a literal value.

opaque

Float.ofScientific (m : Nat) (s : Bool) (e : Nat) : Float

Constructs a Float from the given mantissa, sign, and exponent values.

This function is part of the implementation of the OfScientific Float instance that is used to interpret floating-point literals.

opaque

Float32.ofScientific (m : Nat) (s : Bool) (e : Nat) : Float32

Constructs a Float32 from the given mantissa, sign, and exponent values.

This function is part of the implementation of the OfScientific Float32 instance that is used to interpret floating-point literals.

19.6.2.3. Conversions

opaque

Float.toBits : Float → UInt64

Bit-for-bit conversion to UInt64. Interprets a Float as a UInt64, ignoring the numeric value and treating the Float's bit pattern as a UInt64.

Floats and UInt64s have the same endianness on all supported platforms. IEEE 754 very precisely specifies the bit layout of floats.

This function is distinct from Float.toUInt64, which attempts to preserve the numeric value rather than reinterpreting the bit pattern.

opaque

Float32.toBits : Float32 → UInt32

Bit-for-bit conversion to UInt32. Interprets a Float32 as a UInt32, ignoring the numeric value and treating the Float32's bit pattern as a UInt32.

Float32s and UInt32s have the same endianness on all supported platforms. IEEE 754 very precisely specifies the bit layout of floats.

This function is distinct from Float32.toUInt32, which attempts to preserve the numeric value rather than reinterpreting the bit pattern.

This function does not reduce in the kernel.

opaque

Float.ofBits : UInt64 → Float

Bit-for-bit conversion from UInt64. Interprets a UInt64 as a Float, ignoring the numeric value and treating the UInt64's bit pattern as a Float.

Floats and UInt64s have the same endianness on all supported platforms. IEEE 754 very precisely specifies the bit layout of floats.

This function does not reduce in the kernel.

opaque

Float32.ofBits : UInt32 → Float32

Bit-for-bit conversion from UInt32. Interprets a UInt32 as a Float32, ignoring the numeric value and treating the UInt32's bit pattern as a Float32.

Float32s and UInt32s have the same endianness on all supported platforms. IEEE 754 very precisely specifies the bit layout of floats.

This function does not reduce in the kernel.
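The difference between reinterpreting a bit pattern and converting a numeric value can be seen in a short sketch. The hexadecimal constants below are the standard IEEE 754 encodings of 1.0 and positive infinity; they are supplied here for illustration and do not come from the Lean API.

```lean
-- 1.0 in binary64: sign 0, biased exponent 0x3FF, fraction 0.
#eval (1.0 : Float).toBits == 0x3FF0000000000000   -- true

-- Numeric conversion preserves the value instead of the bit pattern.
#eval (1.0 : Float).toUInt64                        -- 1

-- An all-ones exponent with a zero fraction encodes positive infinity.
#eval (Float.ofBits 0x7FF0000000000000).isInf       -- true

-- Positive and negative zero are equal as floats but differ in the sign bit.
#eval ((0.0 : Float) == -0.0,
       (0.0 : Float).toBits == (-0.0 : Float).toBits)  -- (true, false)

-- 1.0 in binary32: sign 0, biased exponent 0x7F, fraction 0.
#eval (1.0 : Float32).toBits == 0x3F800000          -- true
```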

opaque

Float.toFloat32 : Float → Float32

Converts a 64-bit floating-point number to a 32-bit floating-point number. This may lose precision.

This function does not reduce in the kernel.

opaque

Float32.toFloat : Float32 → Float

Converts a 32-bit floating-point number to a 64-bit floating-point number.

This function does not reduce in the kernel.
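The following sketch illustrates when narrowing to 32 bits loses precision. Values that are exactly representable in binary32 survive a round trip, while a value such as 0.1, which is only approximated in either format, generally does not; widening is always exact.

```lean
-- 0.5 is exactly representable in both formats, so nothing is lost.
#eval (0.5 : Float).toFloat32.toFloat == 0.5     -- true

-- The binary64 approximation of 0.1 has more significant bits than binary32 can hold.
#eval (0.1 : Float).toFloat32.toFloat == 0.1     -- false

-- Widening and then narrowing again returns the original 32-bit value.
#eval (0.1 : Float32).toFloat.toFloat32 == 0.1   -- true
```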

opaque

Float.toString : Float → String

Converts a floating-point number to a string.

This function does not reduce in the kernel.

opaque

Float32.toString : Float32 → String

Converts a floating-point number to a string.

This function does not reduce in the kernel.

opaque

Float.toUInt8 : Float → UInt8

Converts a floating-point number to an 8-bit unsigned integer.

If the given Float is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt8. Returns 0 if the Float is negative or NaN, and returns the largest UInt8 value (i.e. UInt8.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float.toInt8 : Float → Int8

Truncates a floating-point number to the nearest 8-bit signed integer, rounding towards zero.

If the Float is larger than the maximum value for Int8 (including Inf), returns the maximum value of Int8 (i.e. Int8.maxValue). If it is smaller than the minimum value for Int8 (including -Inf), returns the minimum value of Int8 (i.e. Int8.minValue). If it is NaN, returns 0.

This function does not reduce in the kernel.
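The truncating and clamping behavior described for Float.toUInt8 and Float.toInt8 can be observed with a few evaluations. The expected results in the comments are derived from the descriptions above; the wider integer conversions are described as following the same pattern with their own bounds.

```lean
-- Truncation discards the fractional part.
#eval (3.9 : Float).toUInt8           -- 3
#eval (-3.9 : Float).toInt8           -- -3

-- Out-of-range values are clamped to the bounds of the target type.
#eval (300.7 : Float).toUInt8         -- 255
#eval (-5.0 : Float).toUInt8          -- 0
#eval (1000.0 : Float).toInt8         -- 127
#eval (-1000.0 : Float).toInt8        -- -128

-- NaN is converted to zero.
#eval ((0.0 : Float) / 0.0).toUInt8   -- 0
#eval ((0.0 : Float) / 0.0).toInt8    -- 0
```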

opaque

Float32.toUInt8 : Float32 → UInt8

Converts a floating-point number to an 8-bit unsigned integer.

If the given Float32 is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt8. Returns 0 if the Float32 is negative or NaN, and returns the largest UInt8 value (i.e. UInt8.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float32.toInt8 : Float32 → Int8

Truncates a floating-point number to the nearest 8-bit signed integer, rounding towards zero.

This function does not reduce in the kernel.

opaque

Float.toUInt16 : Float → UInt16

Converts a floating-point number to a 16-bit unsigned integer.

If the given Float is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt16. Returns 0 if the Float is negative or NaN, and returns the largest UInt16 value (i.e. UInt16.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float.toInt16 : Float → Int16

Truncates a floating-point number to the nearest 16-bit signed integer, rounding towards zero.

If the Float is larger than the maximum value for Int16 (including Inf), returns the maximum value of Int16 (i.e. Int16.maxValue). If it is smaller than the minimum value for Int16 (including -Inf), returns the minimum value of Int16 (i.e. Int16.minValue). If it is NaN, returns 0.

This function does not reduce in the kernel.

opaque

Float32.toUInt16 : Float32 → UInt16

Converts a floating-point number to a 16-bit unsigned integer.

If the given Float32 is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt16. Returns 0 if the Float32 is negative or NaN, and returns the largest UInt16 value (i.e. UInt16.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float32.toInt16 : Float32 → Int16

Truncates a floating-point number to the nearest 16-bit signed integer, rounding towards zero.

This function does not reduce in the kernel.

opaque

Float.toUInt32 : Float → UInt32

Converts a floating-point number to a 32-bit unsigned integer.

If the given Float is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt32. Returns 0 if the Float is negative or NaN, and returns the largest UInt32 value (i.e. UInt32.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float32.toUInt32 : Float32 → UInt32

Converts a floating-point number to a 32-bit unsigned integer.

If the given Float32 is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt32. Returns 0 if the Float32 is negative or NaN, and returns the largest UInt32 value (i.e. UInt32.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float.toInt32 : Float → Int32

Truncates a floating-point number to the nearest 32-bit signed integer, rounding towards zero.

If the Float is larger than the maximum value for Int32 (including Inf), returns the maximum value of Int32 (i.e. Int32.maxValue). If it is smaller than the minimum value for Int32 (including -Inf), returns the minimum value of Int32 (i.e. Int32.minValue). If it is NaN, returns 0.

This function does not reduce in the kernel.

opaque

Float32.toInt32 : Float32 → Int32

Truncates a floating-point number to the nearest 32-bit signed integer, rounding towards zero.

This function does not reduce in the kernel.

opaque

Float.toUInt64 : Float → UInt64

Converts a floating-point number to a 64-bit unsigned integer.

If the given Float is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt64. Returns 0 if the Float is negative or NaN, and returns the largest UInt64 value (i.e. UInt64.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float.toInt64 : Float → Int64

Truncates a floating-point number to the nearest 64-bit signed integer, rounding towards zero.

If the Float is larger than the maximum value for Int64 (including Inf), returns the maximum value of Int64 (i.e. Int64.maxValue). If it is smaller than the minimum value for Int64 (including -Inf), returns the minimum value of Int64 (i.e. Int64.minValue). If it is NaN, returns 0.

This function does not reduce in the kernel.

opaque

Float32.toUInt64 : Float32 → UInt64

Converts a floating-point number to a 64-bit unsigned integer.

If the given Float32 is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of UInt64. Returns 0 if the Float32 is negative or NaN, and returns the largest UInt64 value (i.e. UInt64.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float32.toInt64 : Float32 → Int64

Truncates a floating-point number to the nearest 64-bit signed integer, rounding towards zero.

This function does not reduce in the kernel.

opaque

Float.toUSize : Float → USize

Converts a floating-point number to a word-sized unsigned integer.

If the given Float is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of USize. Returns 0 if the Float is negative or NaN, and returns the largest USize value (i.e. USize.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float32.toUSize : Float32 → USize

Converts a floating-point number to a word-sized unsigned integer.

If the given Float32 is non-negative, truncates the value to a positive integer, rounding down and clamping to the range of USize. Returns 0 if the Float32 is negative or NaN, and returns the largest USize value (i.e. USize.size - 1) if the float is larger than it.

This function does not reduce in the kernel.

opaque

Float.toISize : Float → ISize

Truncates a floating-point number to the nearest word-sized signed integer, rounding towards zero.

If the Float is larger than the maximum value for ISize (including Inf), returns the maximum value of ISize (i.e. ISize.maxValue). If it is smaller than the minimum value for ISize (including -Inf), returns the minimum value of ISize (i.e. ISize.minValue). If it is NaN, returns 0.

This function does not reduce in the kernel.

opaque

Float32.toISize : Float32 → ISize

Truncates a floating-point number to the nearest word-sized signed integer, rounding towards zero.

This function does not reduce in the kernel.

def

Float.ofInt : Int → Float

Converts an integer into the closest-possible 64-bit floating-point number, or positive or negative infinite floating-point value if the range of Float is exceeded.

def

Float32.ofInt : Int → Float32

Converts an integer into the closest-possible 32-bit floating-point number, or positive or negative infinite floating-point value if the range of Float32 is exceeded.

def

Float.ofNat (n : Nat) : Float

Converts a natural number into the closest-possible 64-bit floating-point number, or an infinite floating-point value if the range of Float is exceeded.

def

Float32.ofNat (n : Nat) : Float32

Converts a natural number into the closest-possible 32-bit floating-point number, or an infinite floating-point value if the range of Float32 is exceeded.

def

Float.ofBinaryScientific (m : Nat) (e : Int) : Float

Computes m * 2^e.

def

Float32.ofBinaryScientific (m : Nat) (e : Int) : Float32

Computes m * 2^e.
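A few small evaluations may help to make the conversions from Nat and Int and the meaning of ofBinaryScientific concrete. All of the values used here are exactly representable, so the comparisons in this sketch are not affected by rounding.

```lean
-- Small integers convert exactly.
#eval Float.ofNat 3 == 3.0                     -- true
#eval Float.ofInt (-2) == -2.0                 -- true

-- ofBinaryScientific m e computes m * 2^e: 3 * 2^2 = 12 and 1 * 2^(-1) = 0.5.
#eval Float.ofBinaryScientific 3 2 == 12.0     -- true
#eval Float.ofBinaryScientific 1 (-1) == 0.5   -- true
```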

opaque

Float.frExp : Float → Float × Int

Splits the given float x into a significand/exponent pair (s, i) such that x = s * 2^i where s ∈ (-1;-0.5] ∪ [0.5; 1). Returns an undefined value if x is not finite.

This function does not reduce in the kernel. It is implemented in compiled code by the C function frexp.

opaque

Float32.frExp : Float32 → Float32 × Int

Splits the given float x into a significand/exponent pair (s, i) such that x = s * 2^i where s ∈ (-1;-0.5] ∪ [0.5; 1). Returns an undefined value if x is not finite.

This function does not reduce in the kernel. It is implemented in compiled code by the C function frexpf.
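As an illustration of the significand/exponent decomposition, the sketch below splits 8.0, which equals 0.5 * 2^4, and then recombines the two parts with scaleB. The comparison against 0.5 and 4 assumes the normalization of the significand into [0.5, 1) described above.

```lean
#eval
  let (s, i) := (8.0 : Float).frExp
  -- The significand lies in [0.5, 1), the exponent is 4,
  -- and scaling the significand back by 2^i recovers the input.
  (s == 0.5, i == 4, s.scaleB i == 8.0)   -- (true, true, true)
```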

19.6.2.4. Comparisons

opaque

Float.beq (a b : Float) : Bool

Checks whether two floating-point numbers are equal according to IEEE 754.

Floating-point equality does not correspond with propositional equality. In particular, it is not reflexive since NaN != NaN, and it is not a congruence because 0.0 == -0.0, but 1.0 / 0.0 != 1.0 / -0.0.

This function does not reduce in the kernel. It is compiled to the C equality operator.

opaque

Float32.beq (a b : Float32) : Bool

Checks whether two floating-point numbers are equal according to IEEE 754.

This function does not reduce in the kernel. It is compiled to the C equality operator.

19.6.2.4.1. Inequalities

The decision procedures for inequalities are opaque constants in the logic. They can only be used via the Lean.ofReduceBool axiom, e.g. via the native_decide tactic.

def

Float.le : Float → Float → Prop

Non-strict inequality of floating-point numbers. Typically used via the ≤ operator.

def

Float32.le : Float32 → Float32 → Prop

Non-strict inequality of floating-point numbers. Typically used via the ≤ operator.

def

Float.lt : Float → Float → Prop

Strict inequality of floating-point numbers. Typically used via the < operator.

def

Float32.lt : Float32 → Float32 → Prop

Strict inequality of floating-point numbers. Typically used via the < operator.

opaque

Float.decLe (a b : Float) : Decidable (a ≤ b)

Compares two floating point numbers for non-strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.

opaque

Float32.decLe (a b : Float32) : Decidable (a ≤ b)

Compares two floating point numbers for non-strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.

opaque

Float.decLt (a b : Float) : Decidable (a < b)

Compares two floating point numbers for strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.

opaque

Float32.decLt (a b : Float32) : Decidable (a < b)

Compares two floating point numbers for strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
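The following sketch shows how the inequality decision procedures are typically reached: through decide in evaluated code, and through native_decide in proofs. The name exampleNaN is a hypothetical helper introduced only for this illustration.

```lean
-- In compiled or evaluated code, the Decidable instances can be run directly.
#eval decide ((1.0 : Float) < 2.0)    -- true
#eval decide ((2.0 : Float) ≤ 2.0)    -- true

-- In proofs, the opaque decision procedure is only usable via native_decide.
example : (1.0 : Float) < 2.0 := by native_decide

-- Hypothetical helper: a NaN produced by dividing zero by zero.
def exampleNaN : Float := 0.0 / 0.0

-- Every ordered comparison that involves NaN is false.
#eval (decide (exampleNaN < 1.0),
       decide (1.0 ≤ exampleNaN),
       decide (exampleNaN ≤ exampleNaN))   -- (false, false, false)
```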

19.6.2.5. Arithmetic

Arithmetic operations on floating-point values are typically invoked via the Add Float, Sub Float, Mul Float, Div Float, and HomogeneousPow Float instances, along with the corresponding Float32 instances.

opaque

Float.add : Float → Float → Float

Adds two 64-bit floating-point numbers according to IEEE 754. Typically used via the + operator.

This function does not reduce in the kernel. It is compiled to the C addition operator.

opaque

Float32.add : Float32 → Float32 → Float32

Adds two 32-bit floating-point numbers according to IEEE 754. Typically used via the + operator.

This function does not reduce in the kernel. It is compiled to the C addition operator.

opaque

Float.sub : Float → Float → Float

Subtracts 64-bit floating-point numbers according to IEEE 754. Typically used via the - operator.

This function does not reduce in the kernel. It is compiled to the C subtraction operator.

opaque

Float32.sub : Float32 → Float32 → Float32

Subtracts 32-bit floating-point numbers according to IEEE 754. Typically used via the - operator.

This function does not reduce in the kernel. It is compiled to the C subtraction operator.

opaque

Float.mul : Float → Float → Float

Multiplies 64-bit floating-point numbers according to IEEE 754. Typically used via the * operator.

This function does not reduce in the kernel. It is compiled to the C multiplication operator.

opaque

Float32.mul : Float32 → Float32 → Float32

Multiplies 32-bit floating-point numbers according to IEEE 754. Typically used via the * operator.

This function does not reduce in the kernel. It is compiled to the C multiplication operator.

opaque

Float.div : Float → Float → Float

Divides 64-bit floating-point numbers according to IEEE 754. Typically used via the / operator.

In Lean, division by zero typically yields zero. For Float, it instead yields either Inf, -Inf, or NaN.

This function does not reduce in the kernel. It is compiled to the C division operator.

opaque

Float32.div : Float32 → Float32 → Float32

Divides 32-bit floating-point numbers according to IEEE 754. Typically used via the / operator.

In Lean, division by zero typically yields zero. For Float32, it instead yields either Inf, -Inf, or NaN.

This function does not reduce in the kernel. It is compiled to the C division operator.
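The contrast with Lean's usual treatment of division by zero can be sketched as follows. Nat is used here as one representative of the types for which division by zero yields zero; the floating-point results follow from IEEE 754.

```lean
-- For Nat, division by zero is defined to be zero.
#eval (1 : Nat) / 0                           -- 0

-- For floating-point numbers, the sign of the dividend selects the infinity.
#eval decide (((1.0 : Float) / 0.0) > 0.0)    -- true
#eval decide (((-1.0 : Float) / 0.0) < 0.0)   -- true
#eval (((-1.0 : Float) / 0.0)).isInf          -- true

-- Zero divided by zero is a NaN rather than an infinity.
#eval ((0.0 : Float32) / 0.0).isNaN           -- true
```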

opaque

Float.pow : Float → Float → Float

Raises one floating-point number to the power of another. Typically used via the ^ operator.

This function does not reduce in the kernel. It is implemented in compiled code by the C function pow.

opaque

Float32.pow : Float32 → Float32 → Float32

Raises one floating-point number to the power of another. Typically used via the ^ operator.

This function does not reduce in the kernel. It is implemented in compiled code by the C function powf.

opaque

Float.exp (x : Float) : Float

Computes the exponential e^x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function exp.

opaque

Float32.exp : Float32 → Float32

Computes the exponential e^x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function expf.

opaque

Float.exp2 (x : Float) : Float

Computes the base-2 exponential 2^x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function exp2.

opaque

Float32.exp2 : Float32 → Float32

Computes the base-2 exponential 2^x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function exp2f.

19.6.2.5.1. Roots

Computing the square root of a negative number yields NaN.

opaque

Float.sqrt : Float → Float

Computes the square root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sqrt.

opaque

Float32.sqrt : Float32 → Float32

Computes the square root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sqrtf.
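A brief sketch of the square root on a perfect square and on a negative argument. The first comparison relies on IEEE 754 requiring the square root to be correctly rounded, so the root of 9.0 is exactly 3.0.

```lean
-- The square root of a perfect square is exact.
#eval Float.sqrt 9.0 == 3.0              -- true
#eval Float32.sqrt 9.0 == 3.0            -- true

-- A negative argument has no real square root, so the result is a NaN.
#eval (Float.sqrt (-1.0)).isNaN          -- true
```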

opaque

Float.cbrt : Float → Float

Computes the cube root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function cbrt.

opaque

Float32.cbrt : Float32 → Float32

Computes the cube root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function cbrtf.

19.6.2.6. Logarithms

opaque

Float.log (x : Float) : Float

Computes the natural logarithm ln x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function log.

opaque

Float32.log : Float32 → Float32

Computes the natural logarithm ln x of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function logf.

opaque

Float.log10 : Float → Float

Computes the base-10 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function log10.

opaque

Float32.log10 : Float32 → Float32

Computes the base-10 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function log10f.

opaque

Float.log2 : Float → Float

Computes the base-2 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function log2.

opaque

Float32.log2 : Float32 → Float32

Computes the base-2 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function log2f.

19.6.2.7. Scaling

opaque

Float.scaleB (x : Float) (i : Int) : Float

Efficiently computes x * 2^i.

This function does not reduce in the kernel.

opaque

Float32.scaleB (x : Float32) (i : Int) : Float32

Efficiently computes x * 2^i.

This function does not reduce in the kernel.
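Because scaling by a power of two only adjusts the exponent, the results are exact as long as they stay within the range of the format. A minimal sketch:

```lean
-- 3.0 * 2^4 = 48.0
#eval Float.scaleB 3.0 4 == 48.0        -- true

-- Negative exponents scale down: 1.0 * 2^(-1) = 0.5
#eval Float.scaleB 1.0 (-1) == 0.5      -- true

-- Scaling beyond the largest finite binary64 value overflows to infinity.
#eval (Float.scaleB 1.0 5000).isInf     -- true
```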

19.6.2.8. Rounding

opaque

Float.round : Float → Float

Rounds to the nearest integer, rounding away from zero at half-way points.

This function does not reduce in the kernel. It is implemented in compiled code by the C function round.

opaque

Float32.round : Float32 → Float32

Rounds to the nearest integer, rounding away from zero at half-way points.

This function does not reduce in the kernel. It is implemented in compiled code by the C function roundf.
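The treatment of half-way points can be illustrated with a few evaluations. Note that this differs from the round-half-to-even rule that IEEE 754 uses by default when rounding arithmetic results.

```lean
-- Half-way cases round away from zero, in both directions.
#eval Float.round 2.5 == 3.0        -- true
#eval Float.round (-2.5) == -3.0    -- true

-- Other values round to the nearest integer.
#eval Float.round 2.4 == 2.0        -- true
#eval Float32.round 2.6 == 3.0      -- true
```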

opaque

Float.floor : Float → Float

Computes the floor of a floating-point number, which is the largest integer that's no larger than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function floor.

Examples:

Float.floor 1.5 = 1
Float.floor (-1.5) = (-2)

opaque

Float32.floor : Float32 → Float32

Computes the floor of a floating-point number, which is the largest integer that's no larger than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function floorf.

Examples:

Float32.floor 1.5 = 1
Float32.floor (-1.5) = (-2)

opaque

Float.ceil : Float → Float

Computes the ceiling of a floating-point number, which is the smallest integer that's no smaller than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function ceil.

Examples:

Float.ceil 1.5 = 2
Float.ceil (-1.5) = (-1)

opaque

Float32.ceil : Float32 → Float32

Computes the ceiling of a floating-point number, which is the smallest integer that's no smaller than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function ceilf.

Examples:

Float32.ceil 1.5 = 2
Float32.ceil (-1.5) = (-1)
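The relationship between floor and ceil can be sketched as follows. The inputs are illustrative and exactly representable, so the comparisons are exact.

```lean
-- Each command should evaluate to `true`.

-- On an input that is already an integer, both functions return it unchanged.
#eval Float.floor 3.0 == 3.0
#eval Float.ceil 3.0 == 3.0

-- On a non-integral input, the two results bracket the input
-- and differ by exactly one.
#eval Float.floor (-1.5) ≤ (-1.5 : Float)
#eval (-1.5 : Float) ≤ Float.ceil (-1.5)
#eval Float.ceil 1.5 - Float.floor 1.5 == 1.0
```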

19.6.2.9. Trigonometry

19.6.2.9.1. Sine

opaque

Float.sin : Float → Float

Computes the sine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sin.

opaque

Float32.sin : Float32 → Float32

Computes the sine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sinf.

opaque

Float.sinh : Float → Float

Computes the hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sinh.

opaque

Float32.sinh : Float32 → Float32

Computes the hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function sinhf.

opaque

Float.asin : Float → Float

Computes the arc sine (inverse sine) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function asin.

opaque

Float32.asin : Float32 → Float32

Computes the arc sine (inverse sine) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function asinf.

opaque

Float.asinh : Float → Float

Computes the inverse hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function asinh.

opaque

Float32.asinh : Float32 → Float32

Computes the inverse hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function asinhf.
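The following sketch exercises the sine family. The exact comparisons at zero assume a C library that returns zero for a zero argument, as typical implementations do. The round-trip checks use a tolerance because each function call may introduce a small rounding error; the tolerance is an arbitrary choice.

```lean
-- Each command should evaluate to `true`.
#eval Float.sin 0.0 == 0.0
#eval Float.sinh 0.0 == 0.0

-- asin undoes sin on [-π/2, π/2]; asinh undoes sinh everywhere.
#eval (Float.asin (Float.sin 0.5) - 0.5).abs < 1e-12
#eval (Float.asinh (Float.sinh 0.5) - 0.5).abs < 1e-12
#eval (Float32.asin (Float32.sin 0.5) - 0.5).abs < 1e-5
```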

19.6.2.9.2. Cosine

opaque

Float.cos : Float → Float

Computes the cosine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function cos.

opaque

Float32.cos : Float32 → Float32

Computes the cosine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function cosf.

opaque

Float.cosh : Float → Float

Computes the hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function cosh.

opaque

Float32.cosh : Float32 → Float32

Computes the hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function coshf.

opaque

Float.acos : Float → Float

Computes the arc cosine (inverse cosine) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function acos.

opaque

Float32.acos : Float32 → Float32

Computes the arc cosine (inverse cosine) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function acosf.

opaque

Float.acosh : Float → Float

Computes the inverse hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function acosh.

opaque

Float32.acosh : Float32 → Float32

Computes the inverse hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function acoshf.
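A sketch of the cosine family, including one input outside the mathematical domain. The exact comparisons assume a C library that follows the usual conventions for these special arguments; the tolerance in the round-trip check is an arbitrary choice.

```lean
-- Each command should evaluate to `true`.
#eval Float.cos 0.0 == 1.0
#eval Float.cosh 0.0 == 1.0
#eval Float.acos 1.0 == 0.0

-- acosh undoes cosh for non-negative inputs, up to rounding.
#eval (Float.acosh (Float.cosh 1.5) - 1.5).abs < 1e-12

-- The arc cosine is only defined on [-1, 1]; outside it, the result is NaN.
#eval (Float.acos 2.0).isNaN
```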

19.6.2.9.3. Tangent

opaque

Float.tan : Float → Float

Computes the tangent of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function tan.

opaque

Float32.tan : Float32 → Float32

Computes the tangent of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function tanf.

opaque

Float.tanh : Float → Float

Computes the hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function tanh.

opaque

Float32.tanh : Float32 → Float32

Computes the hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function tanhf.

opaque

Float.atan : Float → Float

Computes the arc tangent (inverse tangent) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atan.

opaque

Float32.atan : Float32 → Float32

Computes the arc tangent (inverse tangent) of a floating-point number. The result is an angle in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atanf.

opaque

Float.atanh : Float → Float

Computes the inverse hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atanh.

opaque

Float32.atanh : Float32 → Float32

Computes the inverse hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atanhf.
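A sketch of the single-argument tangent functions and their inverses. The round-trip checks use an arbitrary tolerance to allow for rounding in each call. The last line illustrates that tanh saturates: for a large argument, the mathematical result is so close to 1 that it rounds to exactly 1.

```lean
-- Each command should evaluate to `true`.
#eval (Float.tan (Float.atan 2.0) - 2.0).abs < 1e-12
#eval (Float.atanh (Float.tanh 0.5) - 0.5).abs < 1e-12
#eval (Float32.tan (Float32.atan 2.0) - 2.0).abs < 1e-4

-- tanh approaches 1 as its argument grows.
#eval Float.tanh 1000.0 == 1.0
```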

opaque

Float.atan2 (y x : Float) : Float

Computes the arc tangent (inverse tangent) of y / x in radians, in the range [-π, π]. The signs of the arguments determine the quadrant of the result.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atan2.

opaque

Float32.atan2 : Float32 → Float32 → Float32

Computes the arc tangent (inverse tangent) of y / x in radians, in the range [-π, π], where y is the first argument and x is the second. The signs of the arguments determine the quadrant of the result.

This function does not reduce in the kernel. It is implemented in compiled code by the C function atan2f.
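The quadrant behavior can be sketched by evaluating atan2 at one point in each quadrant. The name examplePi is introduced only for this illustration and is not part of the API described here; the tolerance is an arbitrary choice.

```lean
-- A local approximation of π, hypothetical and used only in this sketch.
def examplePi : Float := 4.0 * Float.atan 1.0

-- Each command should evaluate to `true`. The first argument is y, the second is x.
#eval (Float.atan2 1.0 1.0 - examplePi / 4.0).abs < 1e-12               -- first quadrant
#eval (Float.atan2 1.0 (-1.0) - 3.0 * examplePi / 4.0).abs < 1e-12      -- second quadrant
#eval (Float.atan2 (-1.0) (-1.0) + 3.0 * examplePi / 4.0).abs < 1e-12   -- third quadrant
#eval (Float.atan2 (-1.0) 1.0 + examplePi / 4.0).abs < 1e-12            -- fourth quadrant

-- By contrast, atan applied to the quotient cannot tell the first and
-- third quadrants apart, because the signs cancel in the division.
#eval Float.atan (1.0 / 1.0) == Float.atan ((-1.0) / (-1.0))
```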

19.6.2.10. Negation and Absolute Value

opaque

Float.abs : Float → Float

Computes the absolute value of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function fabs.

opaque

Float32.abs : Float32 → Float32

Computes the absolute value of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function fabsf.

opaque

Float.neg : Float → Float

Negates 64-bit floating-point numbers according to IEEE 754. Typically used via the - prefix operator.

This function does not reduce in the kernel. It is compiled to the C negation operator.

opaque

Float32.neg : Float32 → Float32

Negates 32-bit floating-point numbers according to IEEE 754. Typically used via the - prefix operator.

This function does not reduce in the kernel. It is compiled to the C negation operator.
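A brief sketch of negation and absolute value, including their effect on the sign of zero. Because positive and negative zero compare as equal, the sign is observed indirectly by dividing 1.0 by the result, which yields positive or negative infinity.

```lean
-- Each command should evaluate to `true`.
#eval Float.abs (-2.5) == 2.5
#eval Float.neg 2.5 == (-2.5)
#eval -(2.5 : Float) == Float.neg 2.5   -- the prefix operator and Float.neg agree

-- Negating zero produces negative zero, so the quotient is negative infinity.
#eval 1.0 / Float.neg 0.0 < 0.0

-- The absolute value of negative zero is positive zero,
-- so the quotient is positive infinity.
#eval 1.0 / Float.abs (Float.neg 0.0) > 0.0
```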