File 4311-Add-note-about-a-new-regular-expression-engine-in-OT.patch of Package erlang
From 2884ebed1e513c97d84d7f715a68c84f113a29ef Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= <bjorn@erlang.org>
Date: Tue, 14 Mar 2023 06:54:21 +0100
Subject: [PATCH] Add note about a new regular expression engine in OTP 27
In Erlang/OTP 27, we will stop using the PCRE regular expression
library, because it is no longer maintained. It is likely that we
will instead use RE2 (https://github.com/google/re2).
---
.../upcoming_incompatibilities.xml | 51 +++++++++++++++++++
1 file changed, 51 insertions(+)
diff --git a/system/doc/general_info/upcoming_incompatibilities.xml b/system/doc/general_info/upcoming_incompatibilities.xml
index 62cdaf75c5..3f64dddf22 100644
--- a/system/doc/general_info/upcoming_incompatibilities.xml
+++ b/system/doc/general_info/upcoming_incompatibilities.xml
@@ -150,6 +150,57 @@
In OTP 28, the <c>{pid,_}</c>element will be removed altogether.
</p>
</section>
+
+ <section>
+ <marker id="new_re_engine"/>
+ <title>The re module will use a different regular expression engine</title>
+
+ <p>The functionality of module <seeerl
+ marker="stdlib:re"><c>re</c></seeerl> is currently provided by
+ the PCRE library, which is no longer actively
+ maintained. Therefore, in OTP 27, we will switch to a different
+ regular expression library.</p>
+
+ <p>The source code for PCRE used by the <c>re</c> module has
+ been modified by the OTP team to ensure that a regular
+ expression match would yield when matching huge input binaries
+ and/or when using demanding (back-tracking) regular
+ expressions. Because of those modifications, moving to a new
+ version of PCRE has always been a time-consuming process, as
+ all of the modifications had to be reapplied by hand to the
+ updated PCRE source code.</p>
+
+ <p>Most likely, the new regular expression library will be <url
+ href="https://github.com/google/re2">RE2</url>. RE2 guarantees
+ that the match time is linear in the length of the input string, and
+ it also eschews recursion to avoid stack overflow. That should
+ make it possible to use RE2 without modifying its source
+ code. For more information about why RE2 is a good choice, see
+ <url
+ href="https://github.com/google/re2/wiki/WhyRE2">WhyRE2</url>.</p>
+
+ <p>Some of the implications of this change are:</p>
+
+ <list>
+ <item><p>We expect that the functions in the <c>re</c> module
+ will continue to be supported, although some of the options are likely
+ to be discontinued.</p></item>
+
+ <item><p>It is likely that only pattern matching of UTF-8-encoded binaries will be
+ supported (not Latin-1-encoded binaries).</p></item>
+
+ <item><p>In order to guarantee the linear-time performance,
+ RE2 does not support all the constructs in regular expression
+ patterns that PCRE does. For example, backreferences and look-around
+ assertions are not supported. See <url
+ href="https://github.com/google/re2/wiki/Syntax">Syntax</url>
+ for a description of what RE2 supports.</p></item>
+
+ <item><p>Compiling a regular expression is likely to be
+ slower than with PCRE, so more can be gained by explicitly compiling
+ the regular expression before matching with it.</p></item>
+ </list>
+ </section>
</section>
</chapter>
--
2.35.3