From b3be7c9067b7991c2e017511094468b6756e5648 Mon Sep 17 00:00:00 2001 From: cbmarini Date: Tue, 2 Dec 2025 16:04:07 +0100 Subject: [PATCH] man: updated the sections on floating point numbers in the manual. --- doc/manual/float.tex | 245 ++++++++++++++++++++++++++------------ doc/manual/functions.tex | 56 ++++++++- doc/manual/statements.tex | 2 +- 3 files changed, 224 insertions(+), 79 deletions(-) diff --git a/doc/manual/float.tex b/doc/manual/float.tex index 53de055e..2dd97a64 100644 --- a/doc/manual/float.tex +++ b/doc/manual/float.tex @@ -1,68 +1,133 @@ -\chapter{Floating point} +\chapter{Floating point arithmetic} \label{floatingpoint} -Starting with version 5.0 \FORM\ is equiped with arbitrary floating point -capability. The low level routines are part of the GMP and mpfr libraries -which should be available on most systems. If not they can be picked up -easily from the internet. The main commands involving the floating point -system are +Starting with version 5.0, \FORM{} is equipped with arbitrary precision floating point +arithmetic. The low level routines are handled by the GMP and MPFR libraries, +which are available on most systems and, if missing, can easily be obtained +from the internet. This chapter describes the commands, functions, and behaviour +of \FORM's floating point system. + +\section{Initializing and closing the floating point system} +Before any floating point operations can be performed, \FORM{} must activate the +floating point system and set the working precision. This initialization allocates +the internal data structures used by the GMP and MPFR libraries. The system remains +active until the end of the program, or until it is explicitly closed. +The two statements that control these operations are: \begin{description} -\item[\#startfloat] This instruction is needed to startup the floating -point system. Invoking it will allocate a number of arrays. 
The instruction -has either one or two arguments: +\item[\#StartFloat] This instruction initializes the floating +point system and allocates the necessary internal arrays. +It takes either one or two arguments: \begin{verbatim} - #startfloat [,MZV=] + #StartFloat [,MZV=] \end{verbatim} The first argument is mandatory and specifies the desired precision. It must -be a positive integer followed by either \texttt{b} (for precision in bits) +be a positive integer followed by either a \texttt{b} (for precision in bits) or \texttt{d} (for precision in decimal digits). -\FORM{} will round to at least this precision. Because the internal -routines work with WORDs, the precision (in bits) will internally be rounded up to the nearest -integer number of WORDs. The second argument is optional for when one wants -to work with multiple zeta values (MZVs) or Euler sums. It specifies the -maximum weight that will be used. The evaluation of the sums requires a -number of auxiliary arrays. The default value is zero. If one would like to -change the precision during a run, this is possible. The effect would be -that the existing arrays are released and new arrays will be allocated. -\item[\#endfloat] This instruction releases all arrays allocated for the -floating point system. +\FORM{} will round to at least this precision. +The second argument is optional and only needed when working with multiple +zeta values (MZVs) or Euler sums. It specifies the maximum weight +that will be used. The evaluation of the sums requires a +number of auxiliary arrays that depend on this weight. The default weight is zero. +\item[\#EndFloat] This instruction releases all arrays allocated for the +floating point system. Note that if one would like to change the precision during a run, +this is now possible with a new \texttt{\#StartFloat} instruction. \end{description} +Example programs that illustrate the use of these statements and the +functionality of \FORM's floating point system are given below. 
+ + +\section{Conversion between rational and floating point coefficients} +A term in an expression can have a rational or floating point coefficient. +The following statements convert between the two. \begin{description} -\item[tofloat] Converts the rational coefficients at the ground level to -floating point numbers in the precision specified in the \#startfloat -instruction. From this point on the coefficient at this level will be -floating point. If one needs to convert numbers inside a function argument -one should use the argument environment. This can be nested. -\item[torational] Tries to convert the floating point coefficients to -rational numbers. To this end it uses repeated fractions as in +\item[ToFloat] Converts rational coefficients to +floating point numbers in the precision specified by \texttt{\#StartFloat}. +From this point on, the coefficient will be floating point. +\item[ToRational] Attempts to convert floating point coefficients to +rational numbers. To this end it uses continued fractions as in \begin{eqnarray} - x & \rightarrow & n_0 + 1/(n_1+1/(n_2+1/(n_3+\cdots))) \nonumber + x \;\rightarrow\; n_0 + \frac{1}{\,n_1 + \frac{1}{\,n_2 + \frac{1}{\,n_3 + \cdots}}}\;, + \nonumber \end{eqnarray} with $x$ a floating point number. The algorithm keeps track of the remaining precision and if $1/n_i$ is close to this precision it truncates -the sequence at $n_{i-1}$. After that it works out the fraction. It could -be that $x$ cannot be expressed as a fraction within the given precision. +the sequence at $n_{i-1}$. After that it works out the corresponding fraction. +It could be that $x$ cannot be expressed as a fraction within the given precision. This can usually be seen by that the fractions are `rather wild', or that the result changes when the precision is increased. This statement can also -be abbreviated to `torat'. -\item[evaluate] If this command has no arguments all floating point -functions that \FORM{} knows about will be evaluated. 
The currently allowed -arguments are the functions mzv\_, euler\_, sqrt\_ and mzvhalf\_. If any -(or more than one) of these are specified only those functions will be -evaluated. -\item[strictrounding] This statement rounds floating point numbers to a -given precision. The syntax is +be abbreviated as \texttt{ToRat}. +\end{description} + +The above statements operate on ground level coefficients only. To convert numbers +inside a function argument, one must use the \texttt{Argument} environment. +For example: +\begin{verbatim} + CFunction f; + #StartFloat 10d + Local F = 0.1666666666*f(0.1428571429); + ToRat; + Print "<1> %t"; + Argument f; + ToRat; + EndArgument; + Print "<2> %t"; + .end +<1> + 1/6*f(1.428571429e-01) +<2> + 1/6*f(1/7) +\end{verbatim} +The argument environment may be nested. +Similarly, the statements \texttt{Evaluate}, \texttt{StrictRounding} and \texttt{Chop} act at +the ground level. To have them act on function arguments, one uses the \texttt{Argument} environment. +These statements are explained further below. + +\section{Evaluation of functions and symbols} +Before version 5.0, \FORM{} already reserved function names for many common mathematical +functions. These functions can now be evaluated numerically using: + +\begin{description} +\item[Evaluate] This statement evaluates mathematical functions and/or symbols numerically: +\begin{verbatim} +Evaluate [function(s)],[symbol(s)]; +\end{verbatim} +where the argument specifies the function(s) and/or symbol(s) to evaluate. +More than one function and/or symbol may be listed. +If this statement is used without arguments, all floating point functions and symbols that \FORM{} +knows will be evaluated. 
Currently, the full list of functions that can be evaluated numerically reads +\begin{verbatim} +sqrt_, ln_, eexp_, li2_, gamma_, agm_, +sin_, cos_, tan_, asin_, acos_, atan_, atan2_, +sinh_, cosh_, tanh_, asinh_, acosh_, atanh_, +mzv_, euler_, mzvhalf_, +\end{verbatim} +where the functions on the last line denote the multiple zeta values, Euler sums and +harmonic polylogarithms of argument $1/2$, respectively. +The list of symbols/constants that can be evaluated is \begin{verbatim} - strictrounding [precision]; +pi_, ee_, em_, \end{verbatim} -where precision is an optional argument that specifies the rounding +where \texttt{ee\_}\index{ee\_} denotes the base of the natural logarithm +and \texttt{em\_}\index{em\_} the Euler-Mascheroni constant. + +In addition, the functions \texttt{lin\_}, \texttt{hpl\_} and \texttt{mpl\_} are reserved function names, +but currently have no numerical evaluation. +\end{description} + + +\section{Rounding behaviour} +\begin{description} +\item[StrictRounding] This statement rounds floating point numbers to a +given precision: +\begin{verbatim} + StrictRounding []; +\end{verbatim} +where \texttt{} is an optional argument that specifies the rounding precision in either digits or bits, using the same syntax as -\texttt{\#startfloat}. If no argument is given, this statement rounds -the floating point coefficients to the default precision. Internally, -the GMP and mpfr libraries may use extra precision beyond that set by -\texttt{\#startfloat}. As a result, terms may not merge due to this -extra precision. For example: +\texttt{\#startfloat}. If omitted, the default precision is used. + +Internally, the GMP and MPFR libraries may use extra precision beyond that set by +\texttt{\#startfloat}. As a result, terms that print the same may still differ slightly +due to this extra precision and therefore fail to merge. 
For example: \begin{verbatim} #startfloat 6d CFunction f; @@ -89,13 +154,13 @@ \chapter{Floating point} $1.1100110101011111101*2^{-14}$. When rounded to 5 bits, this becomes $1.1101*2^{-14}$, which in decimal digits appears as 1.10626220703125e-04. -\item[Chop] This statement removes floating point numbers that are smaller -in absolute magnitude than a specified threshold. It takes one argument delta: +\item[Chop] This statement removes floating point numbers that are {\em smaller} +in absolute magnitude than a specified threshold. It takes one argument: \begin{verbatim} Chop ; \end{verbatim} -All floating point numbers with absolute value less than delta are replaced by 0. -Terms with no floating point coefficient are left untouched. The threshold delta +All floating point numbers with absolute value {\em less} than \texttt{} are replaced by 0. +Terms with no floating point coefficient are left untouched. The threshold \texttt{} can be a floating point number, integer, rational number, or power. Because statements in \FORM{} act term by term, it is often important to sort before invoking the chop statement. Otherwise, terms might be removed individually, while after @@ -109,33 +174,21 @@ \chapter{Floating point} Format floatprecision; \end{verbatim} \FORM{} prints floats with the number of digits specified by the current -\#startfloat instruction. With +\texttt{\#startfloat} instruction. With \begin{verbatim} Format floatprecision ; \end{verbatim} \FORM{} prints the number of digits specified by \texttt{}. -The syntax is the same as for the precision in \#startfloat: a positive -integer followed by either \texttt{b} (for bits) or \texttt{d} (for decimal -digits). If the requested precision exceeds the precision specified by -\#startfloat, only the available digits are printed. Finally, with +The syntax is the same as for the precision in \texttt{\#startfloat}. 
+If the requested precision exceeds the precision specified by +\texttt{\#startfloat}, only the available digits are printed. Finally, with \begin{verbatim} Format floatprecision off; \end{verbatim} -the floating point numbers are printed in raw internal format. +the floating point numbers are printed in raw internal format, see also section \ref{sec:float_raw}. \end{description} -In addition to the above commands there are the following functions that -can be evaluated sqrt\_, ln\_, eexp\_, li2\_, gamma\_, agm\_, sin\_, cos\_, tan\_, -asin\_, acos\_, atan\_, atan2\_, sinh\_, cosh\_, tanh\_, asinh\_, acosh\_, atanh\_. -For the function lin\_ there is currently no code. -The agm\_ function is the arithmetic geometric mean of its two input -values. - -In addition to the above functions there are also the constant -pi\_\index{pi\_}, the basis of the natural logarithm ee\_\index{ee\_} and the -Euler-Mascheroni constant em\_\index{em\_}. These constants will also be -expanded with the evaluate command. When given as an argument to evaluate, -only the specified constants will be evaluated. +\section{Examples} The following example shows some work with Multiple Zeta Values (MZV's): \begin{verbatim} #StartFloat 500b, MZV=15 @@ -190,10 +243,10 @@ \chapter{Floating point} 0.08 sec out of 0.09 sec \end{verbatim} -The \#startfloat initializes the floating point system and allocates arrays -for 500 bits of precision. If there is a second number it indicates the -maximum weight for MZVs and Euler sums. The functions are only evaluated -when the proper command is given. In the second module we divide the +In the first module, \texttt{\#startfloat} initializes the floating point system with +500 bits of precision and a maximum weight for the MZVs and Euler sums of 15. +The \texttt{mzv\_} functions are then evaluated with the \texttt{Evaluate} +statement. In the second module we divide the numbers and convert the result to a rational. 
It is a good idea to try this with various precisions to see whether this is stable. With 60 bits the final answer would be @@ -202,5 +255,51 @@ \chapter{Floating point} \end{verbatim} while at 150 bits we have already the same answer as with 500 bits. The fraction that is obtained by this program can be proven to be correct. -\vspace{3mm} + +\section{Raw form} +\label{sec:float_raw} +Internally, floating point numbers are represented by the function \texttt{float\_}, +i.e. \texttt{float\_(prec, size, exp, limbs)}. The integer arguments encode the +internal representation of the floating point number as in the GMP library: +\begin{description} +\item[prec] The precision of the mantissa in limbs. +\item[size] The number of limbs currently in use. +\item[exp] The exponent, determining the location of the implied radix point. +\item[limbs] The limbs packed as the numerator of a \FORM{} rational. +\end{description} +In a normalized term containing \texttt{float\_}, the rational coefficient must +be either $1/1$ or $-1/1$, where the sign of the term is absorbed into the rational +coefficient. +Furthermore, the \texttt{float\_} function is protected from the pattern matcher and from +statements that act on functions -- such as \texttt{Transform}, \texttt{Argument}, +\texttt{Normalize} etc. +The following program illustrates this: +% +\begin{verbatim} + CFunction f; + #StartFloat 10d + Local F = 1.23456789 + f(1,2); + Identify f?(?a) = f(10); + Print "<1> %t"; + .sort +<1> + 1.23456789e+00 +<1> + f(10) + #EndFloat + Normalize; + Print "<2> %t"; + .sort +<2> + float_(2,3,1,420101683733788795657820481376616399786) +<2> + 10*f(1) + #StartFloat 5d + Print "<3> %t"; + .end +<3> + 1.2346e+00 +<3> + 10*f(1) +\end{verbatim} +% +As shown, the \texttt{id}-statement does not affect the \texttt{float\_} function. +Here we also see the use of the preprocessor statement \texttt{\#EndFloat}, which closes +the floating point system. 
After this statement, the \texttt{float\_} function becomes a +regular function. Its protected status, however, persists, so that \texttt{id}-statements +or statements like \texttt{Normalize} still do not modify it. \ No newline at end of file diff --git a/doc/manual/functions.tex b/doc/manual/functions.tex index 113bd8db..40522c62 100644 --- a/doc/manual/functions.tex +++ b/doc/manual/functions.tex @@ -495,6 +495,14 @@ \section{firstterm\_}\index{firstterm\_}\index{function!firstterm\_} not generate an error message. %--#] firstterm_ : +%--#[ float_ : +\section{float\_}\index{float\_}\index{function!float\_} +\label{funfloat} +\noindent Internal function to describe floating point numbers. +For a description of this function, see chapter~\ref{floatingpoint} +on the floating point system. + +%--#] float_ : %--#[ g5_ : \section{g5\_}\index{g5\_}\index{function!g5\_} @@ -1291,6 +1299,13 @@ \section{thetap\_}\index{thetap\_}\index{function!thetap\_} nothing is done. %--#] thetap_ : +%--#[ tofloat_ : + +\section{tofloat\_}\index{tofloat\_}\index{function!tofloat\_} +\label{funtofloat} +\noindent Currently not active. + +%--#] tofloat_ : %--#[ topologies_ : \section{topologies\_}\index{topologies\_}\index{function!topologies\_} @@ -1299,15 +1314,22 @@ \section{topologies\_}\index{topologies\_}\index{function!topologies\_} diagrams~\ref{diagrams}. %--#] topologies_ : +%--#[ torat_ : + +\section{torat\_}\index{torat\_}\index{function!torat\_} +\label{funtorat} +\noindent Currently not active. + +%--#] torat_ : %--#[ Reserved names : \section{Extra reserved names} -\noindent In addition there are some names that have been reserved for -future use. At the moment these functions do not do very much. It is hoped -that in the future some simplifications of the arguments can be -implemented. These functions are: +\noindent In addition there are reserved function names for the common +mathematical functions. Since \FORM{} 5.0, these functions can be +evaluated numerically. 
See also chapter~\ref{floatingpoint} on the +floating point system. These functions are: \leftvitem{3cm}{sqrt\_}\index{sqrt\_}\index{function!sqrt\_} \rightvitem{13cm}{The regular square root.} @@ -1315,9 +1337,15 @@ \section{Extra reserved names} \leftvitem{3cm}{ln\_}\index{ln\_}\index{function!ln\_} \rightvitem{13cm}{The natural logarithm.} -\leftvitem{3cm}{eexp\_}\index{eexp\_}\index{function!eexp\_} +\leftvitem{3cm}{eexp\_}\index{eexp\_}\index{function!exp\_} \rightvitem{13cm}{The exponential function.} +\leftvitem{3cm}{gamma\_}\index{gamma\_}\index{function!gamma\_} +\rightvitem{13cm}{The gamma function.} + +\leftvitem{3cm}{agm\_}\index{agm\_}\index{function!agm\_} +\rightvitem{13cm}{The arithmetic-geometric mean function.} + \leftvitem{3cm}{sin\_}\index{sin\_}\index{function!sin\_} \rightvitem{13cm}{The sine function.} @@ -1360,9 +1388,27 @@ \section{Extra reserved names} \leftvitem{3cm}{li2\_}\index{li2\_}\index{function!li2\_} \rightvitem{13cm}{The dilogarithm function.} +\leftvitem{3cm}{mzv\_}\index{mzv\_}\index{function!mzv\_} +\rightvitem{13cm}{The multiple zeta values.} + +\leftvitem{3cm}{euler\_}\index{euler\_}\index{function!euler\_} +\rightvitem{13cm}{The Euler sums.} + +\leftvitem{3cm}{mzvhalf\_}\index{mzvhalf\_}\index{function!mzvhalf\_} +\rightvitem{13cm}{The harmonic polylogarithms of argument 1/2.} + +\noindent In addition there are some names that have been reserved for +future use. + \leftvitem{3cm}{lin\_}\index{lin\_}\index{function!lin\_} \rightvitem{13cm}{The polylogarithm function.} +\leftvitem{3cm}{hpl\_}\index{hpl\_}\index{function!hpl\_} +\rightvitem{13cm}{The harmonic polylogarithm function.} + +\leftvitem{3cm}{mpl\_}\index{mpl\_}\index{function!mpl\_} +\rightvitem{13cm}{The multiple polylogarithm function.} + \noindent The user is allowed to use these functions, but it could be that in the future they will develop a nontrivial behaviour. Hence caution is required. 
diff --git a/doc/manual/statements.tex b/doc/manual/statements.tex index b6cca485..1febcfec 100644 --- a/doc/manual/statements.tex +++ b/doc/manual/statements.tex @@ -1496,7 +1496,7 @@ \section{evaluate} \noindent \begin{tabular}{ll} Type & Executable statement\\ -Syntax & evaluate [$<$function name$>$]; \\ +Syntax & evaluate [$<$function(s)$>$][,$<$symbol(s)$>$]; \\ \end{tabular} \vspace{4mm} \noindent See chapter~\ref{floatingpoint} on the floating point capability.