Fix bug when \K is used in a lookbehind in a substitute pattern
This commit is contained in:
parent
03881ebe32
commit
fee9c36df1
@ -0,0 +1,179 @@
|
||||
From 0efedaf8864d1caa8ed0e7f8fb0b50d5231cacfa Mon Sep 17 00:00:00 2001
|
||||
From: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
|
||||
Date: Fri, 22 Jun 2018 16:29:56 +0000
|
||||
Subject: [PATCH] Fix bug when \K is used in a lookbehind in a substitute
|
||||
pattern.
|
||||
MIME-Version: 1.0
|
||||
Content-Type: text/plain; charset=UTF-8
|
||||
Content-Transfer-Encoding: 8bit
|
||||
|
||||
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@948 6239d852-aaf2-0410-a92c-79f79f948069
|
||||
Petr Písař <ppisar@redhat.com>: Ported to 10.31.
|
||||
|
||||
Signed-off-by: Petr Písař <ppisar@redhat.com>
|
||||
---
|
||||
doc/html/pcre2api.html | 14 ++++++++++++--
|
||||
doc/pcre2.txt | 14 ++++++++++++--
|
||||
doc/pcre2api.3 | 3 ++-
|
||||
src/pcre2_error.c | 2 +-
|
||||
src/pcre2_substitute.c | 6 +++---
|
||||
testdata/testinput2 | 3 +++
|
||||
testdata/testoutput2 | 6 +++++-
|
||||
7 files changed, 38 insertions(+), 10 deletions(-)
|
||||
|
||||
diff --git a/doc/html/pcre2api.html b/doc/html/pcre2api.html
|
||||
index ba3b2ca..af904e6 100644
|
||||
--- a/doc/html/pcre2api.html
|
||||
+++ b/doc/html/pcre2api.html
|
||||
@@ -2549,7 +2549,7 @@ calls to <b>pcre2_match()</b> if you are making repeated calls to find other
|
||||
matches in the same subject string.
|
||||
</P>
|
||||
<P>
|
||||
-WARNING: When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid
|
||||
+<b>Warning:</b> When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid
|
||||
string as a subject, or an invalid value of <i>startoffset</i>, is undefined.
|
||||
Your program may crash or loop indefinitely.
|
||||
<pre>
|
||||
@@ -2756,6 +2756,15 @@ branch of the group, but it is not on the matching path. On the other hand,
|
||||
when this pattern fails to match "bx", the returned name is B.
|
||||
</P>
|
||||
<P>
|
||||
+<b>Warning:</b> By default, certain start-of-match optimizations are used to
|
||||
+give a fast "no match" result in some situations. For example, if the anchoring
|
||||
+is removed from the pattern above, there is an initial check for the presence
|
||||
+of "c" in the subject before running the matching engine. This check fails for
|
||||
+"bx", causing a match failure without seeing any marks. You can disable the
|
||||
+start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for
|
||||
+<b>pcre2_compile()</b> or starting the pattern with (*NO_START_OPT).
|
||||
+</P>
|
||||
+<P>
|
||||
After a successful match, a partial match, or one of the invalid UTF errors
|
||||
(for example, PCRE2_ERROR_UTF8_ERR5), <b>pcre2_get_startchar()</b> can be
|
||||
called. After a successful or partial match it returns the code unit offset of
|
||||
@@ -3310,7 +3319,8 @@ replacement string, with more particular errors being PCRE2_ERROR_BADREPESCAPE
|
||||
(invalid escape sequence), PCRE2_ERROR_REPMISSINGBRACE (closing curly bracket
|
||||
not found), PCRE2_ERROR_BADSUBSTITUTION (syntax error in extended group
|
||||
substitution), and PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before
|
||||
-it started, which can happen if \K is used in an assertion).
|
||||
+it started or the match started earlier than the current position in the
|
||||
+subject, which can happen if \K is used in an assertion).
|
||||
</P>
|
||||
<P>
|
||||
As for all PCRE2 errors, a text message that describes the error can be
|
||||
diff --git a/doc/pcre2.txt b/doc/pcre2.txt
|
||||
index 79d94e3..e5b941f 100644
|
||||
--- a/doc/pcre2.txt
|
||||
+++ b/doc/pcre2.txt
|
||||
@@ -2498,7 +2498,7 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
|
||||
second and subsequent calls to pcre2_match() if you are making repeated
|
||||
calls to find other matches in the same subject string.
|
||||
|
||||
- WARNING: When PCRE2_NO_UTF_CHECK is set, the effect of passing an
|
||||
+ Warning: When PCRE2_NO_UTF_CHECK is set, the effect of passing an
|
||||
invalid string as a subject, or an invalid value of startoffset, is
|
||||
undefined. Your program may crash or loop indefinitely.
|
||||
|
||||
@@ -2683,6 +2683,15 @@ OTHER INFORMATION ABOUT A MATCH
|
||||
the other hand, when this pattern fails to match "bx", the returned
|
||||
name is B.
|
||||
|
||||
+ Warning: By default, certain start-of-match optimizations are used to
|
||||
+ give a fast "no match" result in some situations. For example, if the
|
||||
+ anchoring is removed from the pattern above, there is an initial check
|
||||
+ for the presence of "c" in the subject before running the matching
|
||||
+ engine. This check fails for "bx", causing a match failure without see-
|
||||
+ ing any marks. You can disable the start-of-match optimizations by set-
|
||||
+ ting the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or starting
|
||||
+ the pattern with (*NO_START_OPT).
|
||||
+
|
||||
After a successful match, a partial match, or one of the invalid UTF
|
||||
errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
|
||||
be called. After a successful or partial match it returns the code unit
|
||||
@@ -3209,7 +3218,8 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
||||
PCRE2_ERROR_BADREPESCAPE (invalid escape sequence), PCRE2_ERROR_REP-
|
||||
MISSINGBRACE (closing curly bracket not found), PCRE2_ERROR_BADSUBSTI-
|
||||
TUTION (syntax error in extended group substitution), and
|
||||
- PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before it started,
|
||||
+ PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before it started
|
||||
+ or the match started earlier than the current position in the subject,
|
||||
which can happen if \K is used in an assertion).
|
||||
|
||||
As for all PCRE2 errors, a text message that describes the error can be
|
||||
diff --git a/doc/pcre2api.3 b/doc/pcre2api.3
|
||||
index 786b314..ac6e246 100644
|
||||
--- a/doc/pcre2api.3
|
||||
+++ b/doc/pcre2api.3
|
||||
@@ -3302,7 +3302,8 @@ replacement string, with more particular errors being PCRE2_ERROR_BADREPESCAPE
|
||||
(invalid escape sequence), PCRE2_ERROR_REPMISSINGBRACE (closing curly bracket
|
||||
not found), PCRE2_ERROR_BADSUBSTITUTION (syntax error in extended group
|
||||
substitution), and PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before
|
||||
-it started, which can happen if \eK is used in an assertion).
|
||||
+it started or the match started earlier than the current position in the
|
||||
+subject, which can happen if \eK is used in an assertion).
|
||||
.P
|
||||
As for all PCRE2 errors, a text message that describes the error can be
|
||||
obtained by calling the \fBpcre2_get_error_message()\fP function (see
|
||||
diff --git a/src/pcre2_error.c b/src/pcre2_error.c
|
||||
index d98cae9..a1f98d4 100644
|
||||
--- a/src/pcre2_error.c
|
||||
+++ b/src/pcre2_error.c
|
||||
@@ -255,7 +255,7 @@ static const unsigned char match_error_texts[] =
|
||||
"expected closing curly bracket in replacement string\0"
|
||||
"bad substitution in replacement string\0"
|
||||
/* 60 */
|
||||
- "match with end before start is not supported\0"
|
||||
+ "match with end before start or start moved backwards is not supported\0"
|
||||
"too many replacements (more than INT_MAX)\0"
|
||||
"bad serialized data\0"
|
||||
"heap limit exceeded\0"
|
||||
diff --git a/src/pcre2_substitute.c b/src/pcre2_substitute.c
|
||||
index 8da951f..955370a 100644
|
||||
--- a/src/pcre2_substitute.c
|
||||
+++ b/src/pcre2_substitute.c
|
||||
@@ -361,9 +361,9 @@ do
|
||||
}
|
||||
|
||||
/* Handle a successful match. Matches that use \K to end before they start
|
||||
- are not supported. */
|
||||
-
|
||||
- if (ovector[1] < ovector[0])
|
||||
+ or start before the current point in the subject are not supported. */
|
||||
+
|
||||
+ if (ovector[1] < ovector[0] || ovector[0] < start_offset)
|
||||
{
|
||||
rc = PCRE2_ERROR_BADSUBSPATTERN;
|
||||
goto EXIT;
|
||||
diff --git a/testdata/testinput2 b/testdata/testinput2
|
||||
index 5d3a80e..3499042 100644
|
||||
--- a/testdata/testinput2
|
||||
+++ b/testdata/testinput2
|
||||
@@ -4643,6 +4643,9 @@ B)x/alt_verbnames,mark
|
||||
|
||||
/(?=a\K)/replace=z
|
||||
BaCaD
|
||||
+
|
||||
+/(?<=\K.)/g,replace=-
|
||||
+ ab
|
||||
|
||||
/(?'abcdefghijklmnopqrstuvwxyzABCDEFG'toolong)/
|
||||
|
||||
diff --git a/testdata/testoutput2 b/testdata/testoutput2
|
||||
index fcaac8f..f9e128d 100644
|
||||
--- a/testdata/testoutput2
|
||||
+++ b/testdata/testoutput2
|
||||
@@ -14899,7 +14899,11 @@ Subject length lower bound = 1
|
||||
|
||||
/(?=a\K)/replace=z
|
||||
BaCaD
|
||||
-Failed: error -60: match with end before start is not supported
|
||||
+Failed: error -60: match with end before start or start moved backwards is not supported
|
||||
+
|
||||
+/(?<=\K.)/g,replace=-
|
||||
+ ab
|
||||
+Failed: error -60: match with end before start or start moved backwards is not supported
|
||||
|
||||
/(?'abcdefghijklmnopqrstuvwxyzABCDEFG'toolong)/
|
||||
Failed: error 148 at offset 36: subpattern name is too long (maximum 32 characters)
|
||||
--
|
||||
2.14.4
|
||||
|
@ -9,7 +9,7 @@
|
||||
#%%global rcversion RC1
|
||||
Name: pcre2
|
||||
Version: 10.31
|
||||
Release: %{?rcversion:0.}4%{?rcversion:.%rcversion}%{?dist}
|
||||
Release: %{?rcversion:0.}5%{?rcversion:.%rcversion}%{?dist}
|
||||
%global myversion %{version}%{?rcversion:-%rcversion}
|
||||
Summary: Perl-compatible regular expression library
|
||||
# the library: BSD with exceptions
|
||||
@ -68,6 +68,9 @@ Patch6: pcre2-10.31-A-small-fix-to-pcre2grep-to-avoid-compiler-warnings-.pat
|
||||
# Fix setting error offset zero for early errors in pcre2_pattern_convert(),
|
||||
# in upstream after 10.31
|
||||
Patch7: pcre2-10.31-Set-error-offset-zero-for-early-errors-in-pcre2_patt.patch
|
||||
# Fix bug when \K is used in a lookbehind in a substitute pattern,
|
||||
# in upstream after 10.31
|
||||
Patch8: pcre2-10.31-Fix-bug-when-K-is-used-in-a-lookbehind-in-a-substitu.patch
|
||||
BuildRequires: autoconf
|
||||
BuildRequires: automake
|
||||
BuildRequires: coreutils
|
||||
@ -151,6 +154,7 @@ Utilities demonstrating PCRE2 capabilities like pcre2grep or pcre2test.
|
||||
%patch5 -p1
|
||||
%patch6 -p1
|
||||
%patch7 -p1
|
||||
%patch8 -p1
|
||||
# Because of multilib patch
|
||||
libtoolize --copy --force
|
||||
autoreconf -vif
|
||||
@ -253,6 +257,9 @@ make %{?_smp_mflags} check VERBOSE=yes
|
||||
%{_mandir}/man1/pcre2test.*
|
||||
|
||||
%changelog
|
||||
* Mon Jun 25 2018 Petr Pisar <ppisar@redhat.com> - 10.31-5
|
||||
- Fix bug when \K is used in a lookbehind in a substitute pattern
|
||||
|
||||
* Fri Mar 16 2018 Petr Pisar <ppisar@redhat.com> - 10.31-4
|
||||
- Fix setting error offset zero for early errors in pcre2_pattern_convert()
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user