File 1054-Make-printing-of-floats-more-user-friendly.patch of Package erlang

From 8a1ecbf9205b9416c17563cbc5a2fbd44d9a605e Mon Sep 17 00:00:00 2001
From: Kjell Winblad <kjellwinblad@gmail.com>
Date: Mon, 5 Oct 2020 14:22:51 +0200
Subject: [PATCH] Make printing of floats more user friendly

The pretty printer of floats may round the float value before printing
as long as the following property holds (X is a float value):

X=:=erlang:list_to_float(lists:flatten(io_lib:format("~w",[X]))).

This commit makes sure that the rounded digit always appear last, or
just before e if the float is printed using scientific notation. This
is accomplished by always printing using scientific notation if the
number is so large that it might be rounded before the fractional
part.

Before this commit, users could see the following in the Erlang shell:

> 36028797018963970.0.
36028797018963970.0
> 36028797018963970.0 == 36028797018963970.
false
> 36028797018963970.0 == 36028797018963968.
true

36028797018963970.0 is represented as the float value
36028797018963968.0 but the pretty printer rounds it to
36028797018963970.0 as all numbers in the range [36028797018963966.0,
36028797018963972.0] are represented by the float value
36028797018963968.0. However, this can be confusing to users
(https://bugs.erlang.org/browse/ERL-1308) as it is difficult to guess
that this is what is happening.

After this commit, the above example becomes as follows:

> 36028797018963968.0.
3.602879701896397e16
> 3.602879701896397e16 == 36028797018963970.
false
> 3.602879701896397e16 == 36028797018963968.
true

Which is less confusing because it is natural to think that the last
digit of a pretty printed float value can be rounded.

Thanks to @zuiderkwast for feedback on the documentation
---
 erts/doc/src/erlang.xml                    | 31 ++++++++-----
 lib/stdlib/src/io_lib_format.erl           | 41 +++++++++++++----
 lib/stdlib/test/io_SUITE.erl               | 52 ++++++++++++++++++++--
 system/doc/reference_manual/data_types.xml | 34 +++++++++++---
 4 files changed, 129 insertions(+), 29 deletions(-)

diff --git a/erts/doc/src/erlang.xml b/erts/doc/src/erlang.xml
index f39949a50f..0c5f4573b7 100644
--- a/erts/doc/src/erlang.xml
+++ b/erts/doc/src/erlang.xml
@@ -2038,10 +2038,10 @@ true</pre>
 "3.00000000000000044409e-01"
 </pre>
 <p>In the last example, <c>float_to_list(0.1+0.2)</c> evaluates to
-<c>"3.00000000000000044409e-01"</c>. The reason for this is that
-<seealso marker="system/reference_manual:data_types#float_representation_problem"><c>0.1</c> and
-<c>0.2</c> cannot be represented exactly as float
-values</seealso>.</p>
+<c>"3.00000000000000044409e-01"</c>. The reason for this is explained
+in <seealso
+marker="system/reference_manual:data_types#float_representation_problem">Representation
+of Floating Point Numbers</seealso>.</p>
       </desc>
     </func>
 
@@ -6478,9 +6478,13 @@ true</pre>
 36028797018963968</pre>
       <p>In the last example, <c>round(36028797018963969.0)</c>
       evaluates to <c>36028797018963968</c>. The reason for this is
-      that <seealso
-      marker="system/reference_manual:data_types#float_representation_problem"><c>36028797018963969.0</c>
-      cannot be represented exactly as a float value</seealso>.</p>
+      that the number <c>36028797018963969.0</c> cannot be represented
+      exactly as a float value. Instead, the float literal is
+      represented as <c>36028797018963968.0</c>, which is the closest number
+      that can be represented exactly as a float value. See <seealso
+      marker="system/reference_manual:data_types#float_representation_problem">Representation
+      of Floating Point Numbers</seealso> for additional
+      information.</p>
         <p>Allowed in guard tests.</p>
       </desc>
     </func>
@@ -12277,10 +12281,15 @@ improper_end</pre>
         <pre>
 > <input>trunc(36028797018963969.0).</input>
 36028797018963968</pre>
-<p>In the last example, <c>trunc(36028797018963969.0)</c> evaluates to
-<c>36028797018963968</c>. The reason for this is that <seealso
-marker="system/reference_manual:data_types#float_representation_problem"><c>36028797018963969.0</c>
-cannot be represented exactly as a float value</seealso>.</p>
+      <p>In the last example, <c>trunc(36028797018963969.0)</c>
+      evaluates to <c>36028797018963968</c>. The reason for this is
+      that the number <c>36028797018963969.0</c> cannot be represented
+      exactly as a float value. Instead, the float literal is
+      represented as <c>36028797018963968.0</c>, which is the closest number
+      that can be represented exactly as a float value. See <seealso
+      marker="system/reference_manual:data_types#float_representation_problem">Representation
+      of Floating Point Numbers</seealso> for additional
+      information.</p>
         <p>Allowed in guard tests.</p>
       </desc>
     </func>
diff --git a/lib/stdlib/src/io_lib_format.erl b/lib/stdlib/src/io_lib_format.erl
index 157cc07e19..435b31750e 100644
--- a/lib/stdlib/src/io_lib_format.erl
+++ b/lib/stdlib/src/io_lib_format.erl
@@ -525,12 +525,23 @@ float_data([D|Cs], Ds) when D >= $0, D =< $9 ->
 float_data([_|Cs], Ds) ->
     float_data(Cs, Ds).
 
-%%  Writes the shortest, correctly rounded string that converts
-%%  to Float when read back with list_to_float/1.
+%%  Returns a correctly rounded string that converts to Float when
+%%  read back with list_to_float/1.
 %%
-%%  See also "Printing Floating-Point Numbers Quickly and Accurately"
-%%  in Proceedings of the SIGPLAN '96 Conference on Programming
-%%  Language Design and Implementation.
+%%  When abs(Float) < float(1 bsl 53) the shortest such string is
+%%  returned, and otherwise the shortest such string using scientific
+%%  notation is returned. That is, scientific notation is used if and
+%%  only if scientific notation results in a shorter string than
+%%  normal notation when abs(Float) < float(1 bsl 53), and scientific
+%%  notation is used unconditionally if abs(Float) >= float(1 bsl
+%%  53). See comment in insert_decimal/2 for an explanation for why
+%%  float(1 bsl 53) is chosen as cutoff point.
+%%
+%%  The algorithm that is used to find the decimal number that is
+%%  represented by the returned String is described in "Printing
+%%  Floating-Point Numbers Quickly and Accurately" in Proceedings of
+%%  the SIGPLAN '96 Conference on Programming Language Design and
+%%  Implementation.
 
 -spec fwrite_g(float()) -> string().
 
@@ -539,7 +550,7 @@ fwrite_g(0.0) ->
 fwrite_g(Float) when is_float(Float) ->
     {Frac, Exp} = mantissa_exponent(Float),
     {Place, Digits} = fwrite_g_1(Float, Exp, Frac),
-    R = insert_decimal(Place, [$0 + D || D <- Digits]),
+    R = insert_decimal(Place, [$0 + D || D <- Digits], Float),
     [$- || true <- [Float < 0.0]] ++ R.
 
 -define(BIG_POW, (1 bsl 52)).
@@ -629,9 +640,9 @@ generate(R0, S, MPlus, MMinus, LowOk, HighOk) ->
             [D + 1]
     end.
 
-insert_decimal(0, S) ->
+insert_decimal(0, S, _) ->
     "0." ++ S;
-insert_decimal(Place, S) ->
+insert_decimal(Place, S, Float) ->
     L = length(S),
     if
         Place < 0;
@@ -649,7 +660,19 @@ insert_decimal(Place, S) ->
                     end;
                 true ->
                     if
-                        Place - L + 2 =< ExpCost ->
+                        %% All integers in the range [-2^53, 2^53] can
+                        %% be stored without loss of precision in an
+                        %% IEEE 754 64-bit double but 2^53+1 cannot be
+                        %% stored in an IEEE 754 64-bit double without
+                        %% loss of precision (float((1 bsl 53)+1) =:=
+                        %% float(1 bsl 53)). It thus makes sense to
+                        %% show floats that are >= 2^53 or <= -2^53 in
+                        %% scientific notation to indicate that the
+                        %% number is so large that there could be loss
+                        %% in precion when adding or subtracting 1.
+                        %%
+                        %% https://stackoverflow.com/questions/1848700/biggest-integer-that-can-be-stored-in-a-double?answertab=votes#tab-top
+                        Place - L + 2 =< ExpCost andalso abs(Float) < float(1 bsl 53) ->
                             S ++ lists:duplicate(Place - L, $0) ++ ".0";
                         true ->
                             insert_exp(ExpL, S)
diff --git a/lib/stdlib/test/io_SUITE.erl b/lib/stdlib/test/io_SUITE.erl
index 4eb5b1772c..5b8e42db1c 100644
--- a/lib/stdlib/test/io_SUITE.erl
+++ b/lib/stdlib/test/io_SUITE.erl
@@ -21,7 +21,7 @@
 
 -export([all/0, suite/0]).
 
--export([error_1/1, float_g/1, otp_5403/1, otp_5813/1, otp_6230/1, 
+-export([error_1/1, float_g/1, float_w/1, otp_5403/1, otp_5813/1, otp_6230/1, 
          otp_6282/1, otp_6354/1, otp_6495/1, otp_6517/1, otp_6502/1,
          manpage/1, otp_6708/1, otp_7084/0, otp_7084/1, otp_7421/1,
 	 io_lib_collect_line_3_wb/1, cr_whitespace_in_string/1,
@@ -55,7 +55,7 @@ suite() ->
      {timetrap,{minutes,1}}].
 
 all() -> 
-    [error_1, float_g, otp_5403, otp_5813, otp_6230,
+    [error_1, float_g, float_w, otp_5403, otp_5813, otp_6230,
      otp_6282, otp_6354, otp_6495, otp_6517, otp_6502,
      manpage, otp_6708, otp_7084, otp_7421,
      io_lib_collect_line_3_wb, cr_whitespace_in_string,
@@ -164,11 +164,42 @@ float_g(Config) when is_list(Config) ->
      "-5000.00",
      "-5.00000e+4",
      "-5.00000e+5"] = float_g_1("~g", -4.9999950001, -2, 5),
+
     ok.
 
 float_g_1(Fmt, V, Min, Max) ->
     [fmt(Fmt, [V*math:pow(10, E)]) || E <- lists:seq(Min, Max)].
 
+float_w(Config) when is_list(Config) ->
+    %% All floats that are >= float(1 bsl 53) or <= -float(1 bsl 53)
+    %% should be printed with scientific notation to make it clear
+    %% that the integer part can have lost precision, for example if
+    %% the float was created from a float literal.
+    %%
+    %% All integers in the range [-2^53, 2^53] can be stored without
+    %% loss of precision in an IEEE 754 64-bit double but 2^53+1
+    %% cannot be stored in an IEEE 754 64-bit double without loss of
+    %% precision (float((1 bsl 53)+1) =:= float(1 bsl 53)).
+    %%
+    %% https://stackoverflow.com/questions/1848700/biggest-integer-that-can-be-stored-in-a-double?answertab=votes#tab-top
+    Nums = [-float((1 bsl 53) -1),
+            -float(1 bsl 53),
+            -float((1 bsl 53) + 1),
+            float((1 bsl 53) -1),
+            float(1 bsl 53),
+            float((1 bsl 53) + 1)],
+
+
+    ["-9007199254740991.0",
+     "-9.007199254740992e15",
+     "-9.007199254740992e15",
+     "9007199254740991.0",
+     "9.007199254740992e15",
+     "9.007199254740992e15"] =
+        [begin g_t(X), fmt("~w", [X]) end || X <- Nums],
+
+    ok.
+
 %% OTP-5403. ~s formats I/O lists and a single binary.
 otp_5403(Config) when is_list(Config) ->
     "atom" = fmt("~s", [atom]),
@@ -1404,11 +1435,26 @@ gcd(A, B) -> gcd(B, A rem B).
 
 %%% End of rational numbers.
 
+%% Check that there is an exponent if and only if characters are saved
+%% when abs(list_to_float(S)) < float(1 bsl 53) and that there is an
+%% exponent when abs(list_to_float(S)) >= float(1 bsl 53).
+g_choice(S) when is_list(S) ->
+    ShouldAlwaysHaveExponent = abs(list_to_float(S)) >= float(1 bsl 53),
+    HasExponent = lists:member($e, S) orelse lists:member($E, S),
+    case ShouldAlwaysHaveExponent of
+        true ->
+            case HasExponent of
+                true -> ok;
+                false -> throw(should_have_exponent)
+            end;
+        false -> g_choice_small(S)
+    end.
+
 %% Check that there is an exponent if and only if characters are
 %% saved. Note: this assumes floating point numbers "Erlang style"
 %% (with a single zero before and after the dot, and no extra leading
 %% zero in the exponent).
-g_choice(S) when is_list(S) ->
+g_choice_small(S) when is_list(S) ->
     [MS | ES0] = string:tokens(S, "eE"),
     [IS, FS] = string:tokens(MS, "."),
     Il = length(IS),
diff --git a/system/doc/reference_manual/data_types.xml b/system/doc/reference_manual/data_types.xml
index f3a85df39b..4904d92e09 100644
--- a/system/doc/reference_manual/data_types.xml
+++ b/system/doc/reference_manual/data_types.xml
@@ -91,13 +91,35 @@
       <pre>
 > <input>0.1+0.2.</input>
 0.30000000000000004
-> <input>36028797018963969.0.</input>
-36028797018963970.0
-      </pre>
-      <p>For more information see <url href="https://floating-point-gui.de/">
-        What Every Programmer Should Know About Floating-Point Arithmetic</url>
-        and <url href="https://0.30000000000000004.com/">0.30000000000000004.com/</url>.
+</pre>
+<p>The real numbers <c>0.1</c> and <c>0.2</c> cannot be represented
+exactly as floats.</p>
+<pre>
+> <input>{36028797018963968.0, 36028797018963968 == 36028797018963968.0,
+  36028797018963970.0, 36028797018963970 == 36028797018963970.0}.</input>
+{3.602879701896397e16, true,
+ 3.602879701896397e16, false}.
+</pre>
+      <p>
+        The value <c>36028797018963968</c> can be represented exactly
+        as a float value but Erlang's pretty printer rounds
+        <c>36028797018963968.0</c> to <c>3.602879701896397e16</c>
+        (<c>=36028797018963970.0</c>) as all values in the range
+        <c>[36028797018963966.0, 36028797018963972.0]</c> are
+        represented by <c>36028797018963968.0</c>.
+      </p>
+      <p>
+        For more information about floats and issues with them see:
       </p>
+        <list type="bulleted">
+          <item><url
+                    href="https://floating-point-gui.de/">What Every Programmer
+          Should Know About Floating-Point Arithmetic</url>,</item>
+          <item><url
+                    href="https://0.30000000000000004.com/">0.30000000000000004.com/</url>, and</item>
+          <item><url href="https://docs.python.org/3/tutorial/floatingpoint.html">Floating Point Arithmetic: Issues and
+          Limitations</url>.</item>
+        </list>
       <p>If you need to work with decimal fractions, for instance if you need to represent money,
         then you should use a library that handles that or work in cents instead of euros so
         that you do not need decimal fractions.
-- 
2.26.2

openSUSE Build Service is sponsored by