Fix incorrect first matching character when a backreference with zero minimum repeat starts a pattern

Petr Písař 2017-12-22 14:52:40 +01:00
parent 347d7363ce
commit f4e05051d1
2 changed files with 97 additions and 0 deletions
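
The one-line change to src/pcre2_compile.c is easier to follow with a toy model of the bookkeeping involved. The sketch below is not PCRE2 code. It assumes, based on how the compiler is understood to handle quantifiers, that an item with a zero minimum repeat makes the compiler roll firstcuflags back to the value saved in zerofirstcuflags. Every sketch_* name and every SK_* value is hypothetical and exists only for this illustration; only the names firstcuflags, zerofirstcuflags, REQ_UNSET and REQ_NONE come from the patch.

```c
/* Toy model of the "first code unit" bookkeeping touched by this patch.
   The SK_* values are placeholders for this sketch, not PCRE2's real
   definitions of REQ_UNSET / REQ_NONE. */
enum { SK_REQ_UNSET = -2, SK_REQ_NONE = -1, SK_REQ_SET = 0 };

typedef struct {
    int  firstcuflags;     /* state of the first-code-unit guess */
    int  zerofirstcuflags; /* value restored when the previous item may repeat zero times */
    char firstcu;          /* guessed first code unit, valid when flags == SK_REQ_SET */
} sketch_state;

/* Step 1: a back reference is the first item of the pattern. It cannot
   supply a first code unit, so the guess becomes "none". With apply_fix
   set, the saved copy is updated as well (the patched behaviour). */
static void sketch_backref(sketch_state *s, int apply_fix)
{
    if (s->firstcuflags == SK_REQ_UNSET) {
        s->firstcuflags = SK_REQ_NONE;
        if (apply_fix) s->zerofirstcuflags = SK_REQ_NONE;
    }
}

/* Step 2: the item gets a quantifier with minimum zero (for example '?'),
   so the guess is rolled back to the saved copy (assumed behaviour). */
static void sketch_zero_min_repeat(sketch_state *s)
{
    s->firstcuflags = s->zerofirstcuflags;
}

/* Step 3: a literal follows. It is recorded as the first code unit only
   if no decision has been made yet. */
static void sketch_literal(sketch_state *s, char c)
{
    if (s->firstcuflags == SK_REQ_UNSET) {
        s->firstcu = c;
        s->firstcuflags = SK_REQ_SET;
    }
}

/* Models compiling \1?b and returns the recorded first code unit,
   or 0 when none is recorded. */
char sketch_first_code_unit(int apply_fix)
{
    sketch_state s = { SK_REQ_UNSET, SK_REQ_UNSET, 0 };
    sketch_backref(&s, apply_fix);
    sketch_zero_min_repeat(&s);
    sketch_literal(&s, 'b');
    return (s.firstcuflags == SK_REQ_SET) ? s.firstcu : 0;
}
```

In this model sketch_first_code_unit(0) returns 'b', which corresponds to the matcher wrongly requiring every match to start at a 'b', while sketch_first_code_unit(1) returns 0, meaning no first code unit is recorded. That agrees with the new test output in this patch, where neither pattern reports a first code unit.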


@ -0,0 +1,91 @@
From b5343d4a647d25640e16bfa1f813c39a7f2059a6 Mon Sep 17 00:00:00 2001
From: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date: Tue, 12 Dec 2017 15:01:51 +0000
Subject: [PATCH] Fix incorrect first matching character when a backreference
with zero minimum repeat starts a pattern (possibly after assertions).
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@888 6239d852-aaf2-0410-a92c-79f79f948069
Petr Písař: Ported to 10.30.
---
src/pcre2_compile.c | 2 +-
testdata/testinput2 | 10 ++++++++++
testdata/testoutput2 | 28 ++++++++++++++++++++++++++++
diff --git a/src/pcre2_compile.c b/src/pcre2_compile.c
index 0b91d14..ad17338 100644
--- a/src/pcre2_compile.c
+++ b/src/pcre2_compile.c
@@ -7135,7 +7135,7 @@ for (;; pptr++)
later. */
HANDLE_SINGLE_REFERENCE:
- if (firstcuflags == REQ_UNSET) firstcuflags = REQ_NONE;
+ if (firstcuflags == REQ_UNSET) zerofirstcuflags = firstcuflags = REQ_NONE;
*code++ = ((options & PCRE2_CASELESS) != 0)? OP_REFI : OP_REF;
PUT2INC(code, 0, meta_arg);
diff --git a/testdata/testinput2 b/testdata/testinput2
index 022df20..695f0a4 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -5375,4 +5375,14 @@ a)"xI
/[\d-[:print:]]/
+# Perl gets the second of these wrong, giving no match.
+
+"(?<=(a))\1?b"I
+ ab
+ aaab
+
+"(?=(a))\1?b"I
+ ab
+ aaab
+
# End of testinput2
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index 2d9e347..31ccfbe 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -16340,6 +16340,34 @@ Failed: error 150 at offset 3: invalid range in character class
/[\d-[:print:]]/
Failed: error 150 at offset 3: invalid range in character class
+# Perl gets the second of these wrong, giving no match.
+
+"(?<=(a))\1?b"I
+Capturing subpattern count = 1
+Max back reference = 1
+Max lookbehind = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+ ab
+ 0: b
+ 1: a
+ aaab
+ 0: ab
+ 1: a
+
+"(?=(a))\1?b"I
+Capturing subpattern count = 1
+Max back reference = 1
+Starting code units: a
+Last code unit = 'b'
+Subject length lower bound = 1
+ ab
+ 0: ab
+ 1: a
+ aaab
+ 0: ab
+ 1: a
+
# End of testinput2
Error -65: PCRE2_ERROR_BADDATA (unknown error number)
Error -62: bad serialized data
--
2.13.6


@ -59,6 +59,9 @@ Patch5: pcre2-10.30-Fix-pcre2_jit_match-early-check.patch
# Allow pcre2grep match counter to handle values larger than 2147483647,
# upstream bug #2208, in upstream after 10.30
Patch6: pcre2-10.30-Change-pcre2grep-line-number-and-count-variables-to-.patch
# Fix incorrect first matching character when a backreference with zero minimum
# repeat starts a pattern, upstream bug #2209, in upstream after 10.30
Patch7: pcre2-10.30-Fix-incorrect-first-matching-character-when-a-backre.patch
BuildRequires: autoconf
BuildRequires: automake
BuildRequires: coreutils
@ -139,6 +142,7 @@ Utilities demonstrating PCRE2 capabilities like pcre2grep or pcre2test.
%patch4 -p1
%patch5 -p1
%patch6 -p1
%patch7 -p1
# Because of multilib patch
libtoolize --copy --force
autoreconf -vif
@ -246,6 +250,8 @@ make %{?_smp_mflags} check VERBOSE=yes
- Fix pcre2_jit_match() to properly check the pattern was JIT-compiled
- Allow pcre2grep match counter to handle values larger than 2147483647,
(upstream bug #2208)
- Fix incorrect first matching character when a backreference with zero minimum
repeat starts a pattern (upstream bug #2209)
* Mon Nov 13 2017 Petr Pisar <ppisar@redhat.com> - 10.30-3
- Fix multi-line matching in pcre2grep tool (upstream bug #2187)