pcre2/pcre2-10.22-Fix-class-bug-when-UCP-but-not-UTF-was-set-and-all-w.patch

From b3343e2c2c77b85f841a7af5e4121dab11692065 Mon Sep 17 00:00:00 2001
From: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date: Mon, 26 Dec 2016 17:11:18 +0000
Subject: [PATCH] Fix class bug when UCP but not UTF was set and all wide
characters need to be included.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Ported to 10.22:

commit a83027bb4b195c879d504da051571f22a5ac7ca3
Author: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date: Mon Dec 26 17:11:18 2016 +0000

Fix class bug when UCP but not UTF was set and all wide characters need to be
included.

git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@628 6239d852-aaf2-0410-a92c-79f79f948069

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
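Reviewer note: the guard this patch adds in src/pcre2_compile.c can be read as the small predicate sketched below. This is only an illustration of the condition, not code from PCRE2. The helper names wide_chars_possible and update_wide_flag are hypothetical, and code_unit_width stands in for PCRE2_CODE_UNIT_WIDTH, which the real code tests at compile time with #if rather than at run time.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of PCRE2. Returns true when characters
   above 0xff ("wide" characters) can occur in the subject at all.
   code_unit_width is assumed to be 8, 16 or 32. */
static bool wide_chars_possible(int code_unit_width, bool utf)
{
  /* An 8-bit code unit cannot hold a value above 0xff, so the 8-bit
     library sees wide characters only when UTF-8 decoding is enabled. */
  if (code_unit_width == 8) return utf;

  /* 16-bit and 32-bit code units can exceed 0xff even without UTF. */
  return true;
}

/* Hypothetical helper mirroring the patched "default:" case: the flag is
   only updated from local_negate when wide characters can exist. */
static bool update_wide_flag(bool flag, bool local_negate,
                             int code_unit_width, bool utf)
{
  if (wide_chars_possible(code_unit_width, utf)) flag |= local_negate;
  return flag;
}
```

Under this reading, a negated POSIX class in an 8-bit, non-UTF compile leaves the flag untouched, which is consistent with the 8-bit expected output in testoutput10 carrying no explicit wide range, while the 16-bit and 32-bit outputs do carry one.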
src/pcre2_compile.c | 10 ++++++++--
testdata/testinput10 | 2 ++
testdata/testinput12 | 2 ++
testdata/testoutput10 | 8 ++++++++
testdata/testoutput12-16 | 8 ++++++++
testdata/testoutput12-32 | 8 ++++++++
6 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/src/pcre2_compile.c b/src/pcre2_compile.c
index ae6b5e1..2c9f758 100644
--- a/src/pcre2_compile.c
+++ b/src/pcre2_compile.c
@@ -4482,10 +4482,14 @@ for (;; ptr++)
In the special case where there are no xclass items, this is
automatically handled by the use of OP_CLASS or OP_NCLASS, but an
explicit range is needed for OP_XCLASS. Setting a flag here causes
- the range to be generated later when it is known that OP_XCLASS is
- required. */
+ the range to be generated later when it is known that
+ OP_XCLASS is required. In the 8-bit library this is relevant only in
+ utf mode, since no wide characters can exist otherwise. */
default:
+#if PCRE2_CODE_UNIT_WIDTH == 8
+ if (utf)
+#endif
match_all_or_no_wide_chars |= local_negate;
break;
}
@@ -4993,6 +4997,8 @@ for (;; ptr++)
all wide characters (depending on whether the whole class is or is not
negated). This requirement is indicated by match_all_or_no_wide_chars being
true. We do this by including an explicit range, which works in both cases.
+ This applies only in UTF and 16-bit and 32-bit non-UTF modes, since there
+ cannot be any wide characters in 8-bit non-UTF mode.
When there *are* properties in a positive UTF-8 or any 16-bit or 32_bit
class where \S etc is present without PCRE2_UCP, causing an extended class
diff --git a/testdata/testinput10 b/testdata/testinput10
index 4b80778..1c6134b 100644
--- a/testdata/testinput10
+++ b/testdata/testinput10
@@ -454,4 +454,6 @@
\= Expect no match
123
+/[\s[:^ascii:]]/B,ucp
+
# End of testinput10
diff --git a/testdata/testinput12 b/testdata/testinput12
index 29934ec..d851ae6 100644
--- a/testdata/testinput12
+++ b/testdata/testinput12
@@ -354,4 +354,6 @@
\= Expect no match
123
+/[\s[:^ascii:]]/B,ucp
+
# End of testinput12
diff --git a/testdata/testoutput10 b/testdata/testoutput10
index 0c1e9b2..aef89ca 100644
--- a/testdata/testoutput10
+++ b/testdata/testoutput10
@@ -1564,4 +1564,12 @@ Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 1
123
No match
+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+ Bra
+ [\x80-\xff\p{Xsp}]
+ Ket
+ End
+------------------------------------------------------------------
+
# End of testinput10
diff --git a/testdata/testoutput12-16 b/testdata/testoutput12-16
index 9cd6640..e2d5b9f 100644
--- a/testdata/testoutput12-16
+++ b/testdata/testoutput12-16
@@ -1396,4 +1396,12 @@ Subject length lower bound = 2
123
No match
+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+ Bra
+ [\x80-\xff\p{Xsp}\x{100}-\x{ffff}]
+ Ket
+ End
+------------------------------------------------------------------
+
# End of testinput12
diff --git a/testdata/testoutput12-32 b/testdata/testoutput12-32
index 75a5ad7..7479a93 100644
--- a/testdata/testoutput12-32
+++ b/testdata/testoutput12-32
@@ -1390,4 +1390,12 @@ Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defin
123
No match
+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+ Bra
+ [\x80-\xff\p{Xsp}\x{100}-\x{ffffffff}]
+ Ket
+ End
+------------------------------------------------------------------
+
# End of testinput12
--
2.7.4