import python3-3.6.8-40.el8

This commit is contained in:
CentOS Sources 2021-10-06 13:09:52 -04:00 committed by Stepan Oksanichenko
parent b70d3dfde9
commit 251e21d580
5 changed files with 921 additions and 7 deletions

View File: 00359-CVE-2021-23336.patch

@@ -0,0 +1,684 @@
commit 9e77ec82c40ab59846f9447b7c483e7b8e368b16
Author: Petr Viktorin <pviktori@redhat.com>
Date: Thu Mar 4 13:59:56 2021 +0100
CVE-2021-23336: Add `separator` argument to parse_qs; warn with default
Partially backports https://bugs.python.org/issue42967 : [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl().
However, this solution is different from the upstream solution in Python 3.6.13.
An optional argument `separator` is added to specify the separator.
It is recommended to set it to '&' or ';' to match the application or proxy in use.
The default can be set with an env variable or a config file.
If neither the argument, the env var nor the config file specifies a separator, "&" is used,
but a warning is raised if parse_qs is used on input that contains ';'.
Co-authors of the upstream change (who do not necessarily agree with this):
Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Éric Araujo <merwok@netwok.org>
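For illustration, a minimal sketch of the patched behavior, derived from the test cases added below; it applies only to interpreters carrying this backport, not to stock Python 3.6.8:

    from urllib.parse import parse_qs

    # Explicit separator: fields split on ';', no warning.
    parse_qs('a=1;b=2', separator=';')   # -> {'a': ['1'], 'b': ['2']}

    # Default separator is '&': the ';' stays inside the value, and a
    # _QueryStringSeparatorWarning is emitted because the input contains ';'.
    parse_qs('a=1;b=2')                  # -> {'a': ['1;b=2']}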
diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst
index 41219eeaaba..ddecc0af23a 100644
--- a/Doc/library/cgi.rst
+++ b/Doc/library/cgi.rst
@@ -277,13 +277,12 @@ These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.
-.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False)
+.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False, separator=None)
Parse a query in the environment or from a file (the file defaults to
- ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are
+ ``sys.stdin``). The *keep_blank_values*, *strict_parsing* and *separator* parameters are
passed to :func:`urllib.parse.parse_qs` unchanged.
-
.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False)
This function is deprecated in this module. Use :func:`urllib.parse.parse_qs`
@@ -308,7 +307,6 @@ algorithms implemented in this module in other circumstances.
Note that this does not parse nested multipart parts --- use
:class:`FieldStorage` for that.
-
.. function:: parse_header(string)
Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index 647af613a31..bcab7c142bc 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -143,7 +143,7 @@ or on combining URL components into a URL string.
now raise :exc:`ValueError`.
-.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None)
+.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator=None)
Parse a query string given as a string argument (data of type
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a
@@ -168,6 +168,15 @@ or on combining URL components into a URL string.
read. If set, then throws a :exc:`ValueError` if there are more than
*max_num_fields* fields read.
+ The optional argument *separator* is the symbol to use for separating the
+ query arguments. It is recommended to set it to ``'&'`` or ``';'``.
+ It defaults to ``'&'``; a warning is raised if this default is used.
+ This default may be changed with the following environment variable settings:
+
+ - ``PYTHON_URLLIB_QS_SEPARATOR='&'``: use only ``&`` as separator, without warning (as in Python 3.6.13+ or 3.10)
+ - ``PYTHON_URLLIB_QS_SEPARATOR=';'``: use only ``;`` as separator
+ - ``PYTHON_URLLIB_QS_SEPARATOR=legacy``: use both ``&`` and ``;`` (as in previous versions of Python)
+
Use the :func:`urllib.parse.urlencode` function (with the ``doseq``
parameter set to ``True``) to convert such dictionaries into query
strings.
@@ -204,6 +213,9 @@ or on combining URL components into a URL string.
read. If set, then throws a :exc:`ValueError` if there are more than
*max_num_fields* fields read.
+ The optional argument *separator* is the symbol to use for separating the
+ query arguments. It works as in :py:func:`parse_qs`.
+
Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into
query strings.
@@ -213,7 +225,6 @@ or on combining URL components into a URL string.
.. versionchanged:: 3.6.8
Added *max_num_fields* parameter.
-
.. function:: urlunparse(parts)
Construct a URL from a tuple as returned by ``urlparse()``. The *parts*
diff --git a/Lib/cgi.py b/Lib/cgi.py
index 56f243e09f0..5ab2a5d6af6 100755
--- a/Lib/cgi.py
+++ b/Lib/cgi.py
@@ -117,7 +117,8 @@ log = initlog # The current logging function
# 0 ==> unlimited input
maxlen = 0
-def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0):
+def parse(fp=None, environ=os.environ, keep_blank_values=0,
+ strict_parsing=0, separator=None):
"""Parse a query in the environment or from a file (default stdin)
Arguments, all optional:
@@ -136,6 +137,8 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0):
strict_parsing: flag indicating what to do with parsing errors.
If false (the default), errors are silently ignored.
If true, errors raise a ValueError exception.
+
+ separator: str. The symbol to use for separating the query arguments.
"""
if fp is None:
fp = sys.stdin
@@ -156,7 +159,7 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0):
if environ['REQUEST_METHOD'] == 'POST':
ctype, pdict = parse_header(environ['CONTENT_TYPE'])
if ctype == 'multipart/form-data':
- return parse_multipart(fp, pdict)
+ return parse_multipart(fp, pdict, separator=separator)
elif ctype == 'application/x-www-form-urlencoded':
clength = int(environ['CONTENT_LENGTH'])
if maxlen and clength > maxlen:
@@ -182,21 +185,21 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0):
return urllib.parse.parse_qs(qs, keep_blank_values, strict_parsing,
encoding=encoding)
-
# parse query string function called from urlparse,
# this is done in order to maintain backward compatibility.
-
-def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
+def parse_qs(qs, keep_blank_values=0, strict_parsing=0, separator=None):
"""Parse a query given as a string argument."""
warn("cgi.parse_qs is deprecated, use urllib.parse.parse_qs instead",
DeprecationWarning, 2)
- return urllib.parse.parse_qs(qs, keep_blank_values, strict_parsing)
+ return urllib.parse.parse_qs(qs, keep_blank_values, strict_parsing,
+ separator=separator)
-def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
+def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, separator=None):
"""Parse a query given as a string argument."""
warn("cgi.parse_qsl is deprecated, use urllib.parse.parse_qsl instead",
DeprecationWarning, 2)
- return urllib.parse.parse_qsl(qs, keep_blank_values, strict_parsing)
+ return urllib.parse.parse_qsl(qs, keep_blank_values, strict_parsing,
+ separator=separator)
def parse_multipart(fp, pdict):
"""Parse multipart input.
@@ -297,7 +300,6 @@ def parse_multipart(fp, pdict):
return partdict
-
def _parseparam(s):
while s[:1] == ';':
s = s[1:]
@@ -405,7 +407,7 @@ class FieldStorage:
def __init__(self, fp=None, headers=None, outerboundary=b'',
environ=os.environ, keep_blank_values=0, strict_parsing=0,
limit=None, encoding='utf-8', errors='replace',
- max_num_fields=None):
+ max_num_fields=None, separator=None):
"""Constructor. Read multipart/* until last part.
Arguments, all optional:
@@ -453,6 +455,7 @@ class FieldStorage:
self.keep_blank_values = keep_blank_values
self.strict_parsing = strict_parsing
self.max_num_fields = max_num_fields
+ self.separator = separator
if 'REQUEST_METHOD' in environ:
method = environ['REQUEST_METHOD'].upper()
self.qs_on_post = None
@@ -678,7 +681,7 @@ class FieldStorage:
query = urllib.parse.parse_qsl(
qs, self.keep_blank_values, self.strict_parsing,
encoding=self.encoding, errors=self.errors,
- max_num_fields=self.max_num_fields)
+ max_num_fields=self.max_num_fields, separator=self.separator)
self.list = [MiniFieldStorage(key, value) for key, value in query]
self.skip_lines()
@@ -694,7 +697,7 @@ class FieldStorage:
query = urllib.parse.parse_qsl(
self.qs_on_post, self.keep_blank_values, self.strict_parsing,
encoding=self.encoding, errors=self.errors,
- max_num_fields=self.max_num_fields)
+ max_num_fields=self.max_num_fields, separator=self.separator)
self.list.extend(MiniFieldStorage(key, value) for key, value in query)
klass = self.FieldStorageClass or self.__class__
@@ -736,7 +739,8 @@ class FieldStorage:
part = klass(self.fp, headers, ib, environ, keep_blank_values,
strict_parsing,self.limit-self.bytes_read,
- self.encoding, self.errors, max_num_fields)
+ self.encoding, self.errors, max_num_fields,
+ separator=self.separator)
if max_num_fields is not None:
max_num_fields -= 1
diff --git a/Lib/test/test_cgi.py b/Lib/test/test_cgi.py
index b3e2d4cce8e..5ae3e085e1e 100644
--- a/Lib/test/test_cgi.py
+++ b/Lib/test/test_cgi.py
@@ -55,12 +55,9 @@ parse_strict_test_cases = [
("", ValueError("bad query field: ''")),
("&", ValueError("bad query field: ''")),
("&&", ValueError("bad query field: ''")),
- (";", ValueError("bad query field: ''")),
- (";&;", ValueError("bad query field: ''")),
# Should the next few really be valid?
("=", {}),
("=&=", {}),
- ("=;=", {}),
# This rest seem to make sense
("=a", {'': ['a']}),
("&=a", ValueError("bad query field: ''")),
@@ -75,8 +72,6 @@ parse_strict_test_cases = [
("a=a+b&b=b+c", {'a': ['a b'], 'b': ['b c']}),
("a=a+b&a=b+a", {'a': ['a b', 'b a']}),
("x=1&y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}),
- ("x=1;y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}),
- ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}),
("Hbc5161168c542333633315dee1182227:key_store_seqid=400006&cuyer=r&view=bustomer&order_id=0bb2e248638833d48cb7fed300000f1b&expire=964546263&lobale=en-US&kid=130003.300038&ss=env",
{'Hbc5161168c542333633315dee1182227:key_store_seqid': ['400006'],
'cuyer': ['r'],
@@ -164,6 +159,35 @@ class CgiTests(unittest.TestCase):
env = {'QUERY_STRING': orig}
fs = cgi.FieldStorage(environ=env)
+ if isinstance(expect, dict):
+ # test dict interface
+ self.assertEqual(len(expect), len(fs))
+ self.assertCountEqual(expect.keys(), fs.keys())
+ self.assertEqual(fs.getvalue("nonexistent field", "default"), "default")
+ # test individual fields
+ for key in expect.keys():
+ expect_val = expect[key]
+ self.assertIn(key, fs)
+ if len(expect_val) > 1:
+ self.assertEqual(fs.getvalue(key), expect_val)
+ else:
+ self.assertEqual(fs.getvalue(key), expect_val[0])
+
+ def test_separator(self):
+ parse_semicolon = [
+ ("x=1;y=2.0", {'x': ['1'], 'y': ['2.0']}),
+ ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}),
+ (";", ValueError("bad query field: ''")),
+ (";;", ValueError("bad query field: ''")),
+ ("=;a", ValueError("bad query field: 'a'")),
+ (";b=a", ValueError("bad query field: ''")),
+ ("b;=a", ValueError("bad query field: 'b'")),
+ ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}),
+ ("a=a+b;a=b+a", {'a': ['a b', 'b a']}),
+ ]
+ for orig, expect in parse_semicolon:
+ env = {'QUERY_STRING': orig}
+ fs = cgi.FieldStorage(separator=';', environ=env)
if isinstance(expect, dict):
# test dict interface
self.assertEqual(len(expect), len(fs))
diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py
index 68f633ca3a7..1ec86ba0fc2 100644
--- a/Lib/test/test_urlparse.py
+++ b/Lib/test/test_urlparse.py
@@ -2,6 +2,11 @@ import sys
import unicodedata
import unittest
import urllib.parse
+from test.support import EnvironmentVarGuard
+from warnings import catch_warnings
+import tempfile
+import contextlib
+import os.path
RFC1808_BASE = "http://a/b/c/d;p?q#f"
RFC2396_BASE = "http://a/b/c/d;p?q"
@@ -32,6 +37,9 @@ parse_qsl_test_cases = [
(b"&a=b", [(b'a', b'b')]),
(b"a=a+b&b=b+c", [(b'a', b'a b'), (b'b', b'b c')]),
(b"a=1&a=2", [(b'a', b'1'), (b'a', b'2')]),
+]
+
+parse_qsl_test_cases_semicolon = [
(";", []),
(";;", []),
(";a=b", [('a', 'b')]),
@@ -44,6 +52,21 @@ parse_qsl_test_cases = [
(b"a=1;a=2", [(b'a', b'1'), (b'a', b'2')]),
]
+parse_qsl_test_cases_legacy = [
+ (b"a=1;a=2&a=3", [(b'a', b'1'), (b'a', b'2'), (b'a', b'3')]),
+ (b"a=1;b=2&c=3", [(b'a', b'1'), (b'b', b'2'), (b'c', b'3')]),
+ (b"a=1&b=2&c=3;", [(b'a', b'1'), (b'b', b'2'), (b'c', b'3')]),
+]
+
+parse_qsl_test_cases_warn = [
+ (";a=b", [(';a', 'b')]),
+ ("a=a+b;b=b+c", [('a', 'a b;b=b c')]),
+ (b";a=b", [(b';a', b'b')]),
+ (b"a=a+b;b=b+c", [(b'a', b'a b;b=b c')]),
+ ("a=1;a=2&a=3", [('a', '1;a=2'), ('a', '3')]),
+ (b"a=1;a=2&a=3", [(b'a', b'1;a=2'), (b'a', b'3')]),
+]
+
# Each parse_qs testcase is a two-tuple that contains
# a string with the query and a dictionary with the expected result.
@@ -68,6 +91,9 @@ parse_qs_test_cases = [
(b"&a=b", {b'a': [b'b']}),
(b"a=a+b&b=b+c", {b'a': [b'a b'], b'b': [b'b c']}),
(b"a=1&a=2", {b'a': [b'1', b'2']}),
+]
+
+parse_qs_test_cases_semicolon = [
(";", {}),
(";;", {}),
(";a=b", {'a': ['b']}),
@@ -80,6 +106,24 @@ parse_qs_test_cases = [
(b"a=1;a=2", {b'a': [b'1', b'2']}),
]
+parse_qs_test_cases_legacy = [
+ ("a=1;a=2&a=3", {'a': ['1', '2', '3']}),
+ ("a=1;b=2&c=3", {'a': ['1'], 'b': ['2'], 'c': ['3']}),
+ ("a=1&b=2&c=3;", {'a': ['1'], 'b': ['2'], 'c': ['3']}),
+ (b"a=1;a=2&a=3", {b'a': [b'1', b'2', b'3']}),
+ (b"a=1;b=2&c=3", {b'a': [b'1'], b'b': [b'2'], b'c': [b'3']}),
+ (b"a=1&b=2&c=3;", {b'a': [b'1'], b'b': [b'2'], b'c': [b'3']}),
+]
+
+parse_qs_test_cases_warn = [
+ (";a=b", {';a': ['b']}),
+ ("a=a+b;b=b+c", {'a': ['a b;b=b c']}),
+ (b";a=b", {b';a': [b'b']}),
+ (b"a=a+b;b=b+c", {b'a':[ b'a b;b=b c']}),
+ ("a=1;a=2&a=3", {'a': ['1;a=2', '3']}),
+ (b"a=1;a=2&a=3", {b'a': [b'1;a=2', b'3']}),
+]
+
class UrlParseTestCase(unittest.TestCase):
def checkRoundtrips(self, url, parsed, split):
@@ -152,6 +196,40 @@ class UrlParseTestCase(unittest.TestCase):
self.assertEqual(result, expect_without_blanks,
"Error parsing %r" % orig)
+ def test_qs_default_warn(self):
+ for orig, expect in parse_qs_test_cases_warn:
+ with self.subTest(orig=orig, expect=expect):
+ with catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qs(orig, keep_blank_values=True)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 1)
+ self.assertEqual(w[0].category, urllib.parse._QueryStringSeparatorWarning)
+
+ def test_qsl_default_warn(self):
+ for orig, expect in parse_qsl_test_cases_warn:
+ with self.subTest(orig=orig, expect=expect):
+ with catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qsl(orig, keep_blank_values=True)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 1)
+ self.assertEqual(w[0].category, urllib.parse._QueryStringSeparatorWarning)
+
+ def test_default_qs_no_warnings(self):
+ for orig, expect in parse_qs_test_cases:
+ with self.subTest(orig=orig, expect=expect):
+ with catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qs(orig, keep_blank_values=True)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
+ def test_default_qsl_no_warnings(self):
+ for orig, expect in parse_qsl_test_cases:
+ with self.subTest(orig=orig, expect=expect):
+ with catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qsl(orig, keep_blank_values=True)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
def test_roundtrips(self):
str_cases = [
('file:///tmp/junk.txt',
@@ -885,8 +963,151 @@ class UrlParseTestCase(unittest.TestCase):
with self.assertRaises(ValueError):
urllib.parse.parse_qs('&'.join(['a=a']*11), max_num_fields=10)
with self.assertRaises(ValueError):
- urllib.parse.parse_qs(';'.join(['a=a']*11), max_num_fields=10)
+ urllib.parse.parse_qs(';'.join(['a=a']*11), separator=';', max_num_fields=10)
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qs('SEP'.join(['a=a']*11), separator='SEP', max_num_fields=10)
urllib.parse.parse_qs('&'.join(['a=a']*10), max_num_fields=10)
+ urllib.parse.parse_qs(';'.join(['a=a']*10), separator=';', max_num_fields=10)
+ urllib.parse.parse_qs('SEP'.join(['a=a']*10), separator='SEP', max_num_fields=10)
+
+ def test_parse_qs_separator_bytes(self):
+ expected = {b'a': [b'1'], b'b': [b'2']}
+
+ result = urllib.parse.parse_qs(b'a=1;b=2', separator=b';')
+ self.assertEqual(result, expected)
+ result = urllib.parse.parse_qs(b'a=1;b=2', separator=';')
+ self.assertEqual(result, expected)
+ result = urllib.parse.parse_qs('a=1;b=2', separator=';')
+ self.assertEqual(result, {'a': ['1'], 'b': ['2']})
+
+ @contextlib.contextmanager
+ def _qsl_sep_config(self, sep):
+ """Context for the given parse_qsl default separator configured in config file"""
+ old_filename = urllib.parse._QS_SEPARATOR_CONFIG_FILENAME
+ urllib.parse._default_qs_separator = None
+ try:
+ with tempfile.TemporaryDirectory() as tmpdirname:
+ filename = os.path.join(tmpdirname, 'conf.cfg')
+ with open(filename, 'w') as file:
+ file.write(f'[parse_qs]\n')
+ file.write(f'PYTHON_URLLIB_QS_SEPARATOR = {sep}')
+ urllib.parse._QS_SEPARATOR_CONFIG_FILENAME = filename
+ yield
+ finally:
+ urllib.parse._QS_SEPARATOR_CONFIG_FILENAME = old_filename
+ urllib.parse._default_qs_separator = None
+
+ def test_parse_qs_separator_semicolon(self):
+ for orig, expect in parse_qs_test_cases_semicolon:
+ with self.subTest(orig=orig, expect=expect, method='arg'):
+ result = urllib.parse.parse_qs(orig, separator=';')
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ with self.subTest(orig=orig, expect=expect, method='env'):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = ';'
+ result = urllib.parse.parse_qs(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+ with self.subTest(orig=orig, expect=expect, method='conf'):
+ with self._qsl_sep_config(';'), catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qs(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
+ def test_parse_qsl_separator_semicolon(self):
+ for orig, expect in parse_qsl_test_cases_semicolon:
+ with self.subTest(orig=orig, expect=expect, method='arg'):
+ result = urllib.parse.parse_qsl(orig, separator=';')
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ with self.subTest(orig=orig, expect=expect, method='env'):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = ';'
+ result = urllib.parse.parse_qsl(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+ with self.subTest(orig=orig, expect=expect, method='conf'):
+ with self._qsl_sep_config(';'), catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qsl(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
+ def test_parse_qs_separator_legacy(self):
+ for orig, expect in parse_qs_test_cases_legacy:
+ with self.subTest(orig=orig, expect=expect, method='env'):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = 'legacy'
+ result = urllib.parse.parse_qs(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+ with self.subTest(orig=orig, expect=expect, method='conf'):
+ with self._qsl_sep_config('legacy'), catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qs(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
+ def test_parse_qsl_separator_legacy(self):
+ for orig, expect in parse_qsl_test_cases_legacy:
+ with self.subTest(orig=orig, expect=expect, method='env'):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = 'legacy'
+ result = urllib.parse.parse_qsl(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+ with self.subTest(orig=orig, expect=expect, method='conf'):
+ with self._qsl_sep_config('legacy'), catch_warnings(record=True) as w:
+ result = urllib.parse.parse_qsl(orig)
+ self.assertEqual(result, expect, "Error parsing %r" % orig)
+ self.assertEqual(len(w), 0)
+
+ def test_parse_qs_separator_bad_value_env_or_config(self):
+ for bad_sep in '', 'abc', 'safe', '&;', 'SEP':
+ with self.subTest(bad_sep, method='env'):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = bad_sep
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qsl('a=1;b=2')
+ with self.subTest(bad_sep, method='conf'):
+ with self._qsl_sep_config('bad_sep'), catch_warnings(record=True) as w:
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qsl('a=1;b=2')
+
+ def test_parse_qs_separator_bad_value_arg(self):
+ for bad_sep in True, {}, '':
+ with self.subTest(bad_sep):
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qsl('a=1;b=2', separator=bad_sep)
+
+ def test_parse_qs_separator_num_fields(self):
+ for qs, sep in (
+ ('a&b&c', '&'),
+ ('a;b;c', ';'),
+ ('a&b;c', 'legacy'),
+ ):
+ with self.subTest(qs=qs, sep=sep):
+ with EnvironmentVarGuard() as environ, catch_warnings(record=True) as w:
+ if sep != 'legacy':
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qsl(qs, separator=sep, max_num_fields=2)
+ if sep:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = sep
+ with self.assertRaises(ValueError):
+ urllib.parse.parse_qsl(qs, max_num_fields=2)
+
+ def test_parse_qs_separator_priority(self):
+ # env variable trumps config file
+ with self._qsl_sep_config('~'), EnvironmentVarGuard() as environ:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = '!'
+ result = urllib.parse.parse_qs('a=1!b=2~c=3')
+ self.assertEqual(result, {'a': ['1'], 'b': ['2~c=3']})
+ # argument trumps config file
+ with self._qsl_sep_config('~'):
+ result = urllib.parse.parse_qs('a=1$b=2~c=3', separator='$')
+ self.assertEqual(result, {'a': ['1'], 'b': ['2~c=3']})
+ # argument trumps env variable
+ with EnvironmentVarGuard() as environ:
+ environ['PYTHON_URLLIB_QS_SEPARATOR'] = '~'
+ result = urllib.parse.parse_qs('a=1$b=2~c=3', separator='$')
+ self.assertEqual(result, {'a': ['1'], 'b': ['2~c=3']})
def test_urlencode_sequences(self):
# Other tests incidentally urlencode things; test non-covered cases:
diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py
index fa8827a9fa7..57b8fcf8bbd 100644
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -28,6 +28,7 @@ test_urlparse.py provides a good indicator of parsing behavior.
"""
import re
+import os
import sys
import collections
@@ -644,7 +645,8 @@ def unquote(string, encoding='utf-8', errors='replace'):
def parse_qs(qs, keep_blank_values=False, strict_parsing=False,
- encoding='utf-8', errors='replace', max_num_fields=None):
+ encoding='utf-8', errors='replace', max_num_fields=None,
+ separator=None):
"""Parse a query given as a string argument.
Arguments:
@@ -673,7 +675,8 @@ def parse_qs(qs, keep_blank_values=False, strict_parsing=False,
parsed_result = {}
pairs = parse_qsl(qs, keep_blank_values, strict_parsing,
encoding=encoding, errors=errors,
- max_num_fields=max_num_fields)
+ max_num_fields=max_num_fields,
+ separator=separator)
for name, value in pairs:
if name in parsed_result:
parsed_result[name].append(value)
@@ -681,9 +684,16 @@ def parse_qs(qs, keep_blank_values=False, strict_parsing=False,
parsed_result[name] = [value]
return parsed_result
+class _QueryStringSeparatorWarning(RuntimeWarning):
+ """Warning for using default `separator` in parse_qs or parse_qsl"""
+
+# The default "separator" for parse_qsl can be specified in a config file.
+# It's cached after first read.
+_QS_SEPARATOR_CONFIG_FILENAME = '/etc/python/urllib.cfg'
+_default_qs_separator = None
def parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
- encoding='utf-8', errors='replace', max_num_fields=None):
+ encoding='utf-8', errors='replace', max_num_fields=None, separator=None):
"""Parse a query given as a string argument.
Arguments:
@@ -710,15 +720,77 @@ def parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
"""
qs, _coerce_result = _coerce_args(qs)
+ if isinstance(separator, bytes):
+ separator = separator.decode('ascii')
+
+ if (not separator or (not isinstance(separator, (str, bytes)))) and separator is not None:
+ raise ValueError("Separator must be of type string or bytes.")
+
+ # Used when both "&" and ";" act as separators. (Need a non-string value.)
+ _legacy = object()
+
+ if separator is None:
+ global _default_qs_separator
+ separator = _default_qs_separator
+ envvar_name = 'PYTHON_URLLIB_QS_SEPARATOR'
+ if separator is None:
+ # Set default separator from environment variable
+ separator = os.environ.get(envvar_name)
+ config_source = 'environment variable'
+ if separator is None:
+ # Set default separator from the configuration file
+ try:
+ file = open(_QS_SEPARATOR_CONFIG_FILENAME)
+ except FileNotFoundError:
+ pass
+ else:
+ with file:
+ import configparser
+ config = configparser.ConfigParser(
+ interpolation=None,
+ comment_prefixes=('#', ),
+ )
+ config.read_file(file)
+ separator = config.get('parse_qs', envvar_name, fallback=None)
+ _default_qs_separator = separator
+ config_source = _QS_SEPARATOR_CONFIG_FILENAME
+ if separator is None:
+ # The default is '&', but warn if not specified explicitly
+ if ';' in qs:
+ from warnings import warn
+ warn("The default separator of urllib.parse.parse_qsl and "
+ + "parse_qs was changed to '&' to avoid a web cache "
+ + "poisoning issue (CVE-2021-23336). "
+ + "By default, semicolons no longer act as query field "
+ + "separators. "
+ + "See https://access.redhat.com/articles/5860431 for "
+ + "more details.",
+ _QueryStringSeparatorWarning, stacklevel=2)
+ separator = '&'
+ elif separator == 'legacy':
+ separator = _legacy
+ elif len(separator) != 1:
+ raise ValueError(
+ f'{envvar_name} (from {config_source}) must contain '
+ + '1 character, or "legacy". See '
+ + 'https://access.redhat.com/articles/5860431 for more details.'
+ )
+
# If max_num_fields is defined then check that the number of fields
# is less than max_num_fields. This prevents a memory exhaustion DOS
# attack via post bodies with many fields.
if max_num_fields is not None:
- num_fields = 1 + qs.count('&') + qs.count(';')
+ if separator is _legacy:
+ num_fields = 1 + qs.count('&') + qs.count(';')
+ else:
+ num_fields = 1 + qs.count(separator)
if max_num_fields < num_fields:
raise ValueError('Max number of fields exceeded')
- pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
+ if separator is _legacy:
+ pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
+ else:
+ pairs = [s1 for s1 in qs.split(separator)]
r = []
for name_value in pairs:
if not name_value and not strict_parsing:
diff --git a/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst
new file mode 100644
index 00000000000..bc82c963067
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst
@@ -0,0 +1 @@
+Make it possible to fix web cache poisoning vulnerability by allowing the user to choose a custom separator for query args.
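As a usage note: the patched parse_qs/parse_qsl resolve the separator in priority order — the separator= argument first, then the PYTHON_URLLIB_QS_SEPARATOR environment variable, then /etc/python/urllib.cfg (cached after the first read) — as exercised by test_parse_qs_separator_priority above. A sketch of the configuration file format the tests write (the value must be a single character, or the word legacy):

    # /etc/python/urllib.cfg
    [parse_qs]
    PYTHON_URLLIB_QS_SEPARATOR = ;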

View File: 00360-CVE-2021-3426.patch

@@ -0,0 +1,101 @@
From 5b1e50256b6532667b6d31debc350f6c7d3f30aa Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Mon, 29 Mar 2021 08:40:53 -0700
Subject: [PATCH] bpo-42988: Remove the pydoc getfile feature (GH-25015)
(GH-25067)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2021-3426: Remove the "getfile" feature of the pydoc module which
could be abused to read arbitrary files on the disk (directory
traversal vulnerability). Moreover, even source code of Python
modules can contain sensitive data like passwords. Vulnerability
reported by David Schwörer.
(cherry picked from commit 9b999479c0022edfc9835a8a1f06e046f3881048)
Co-authored-by: Victor Stinner <vstinner@python.org>
---
Lib/pydoc.py | 18 ------------------
Lib/test/test_pydoc.py | 6 ------
.../2021-03-24-14-16-56.bpo-42988.P2aNco.rst | 4 ++++
3 files changed, 4 insertions(+), 24 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2021-03-24-14-16-56.bpo-42988.P2aNco.rst
diff --git a/Lib/pydoc.py b/Lib/pydoc.py
index b521a5504728c4..5247ef9ea27aa1 100644
--- a/Lib/pydoc.py
+++ b/Lib/pydoc.py
@@ -2312,9 +2312,6 @@ def page(self, title, contents):
%s</head><body bgcolor="#f0f0f8">%s<div style="clear:both;padding-top:.5em;">%s</div>
</body></html>''' % (title, css_link, html_navbar(), contents)
- def filelink(self, url, path):
- return '<a href="getfile?key=%s">%s</a>' % (url, path)
-
html = _HTMLDoc()
@@ -2400,19 +2397,6 @@ def bltinlink(name):
'key = %s' % key, '#ffffff', '#ee77aa', '<br>'.join(results))
return 'Search Results', contents
- def html_getfile(path):
- """Get and display a source file listing safely."""
- path = urllib.parse.unquote(path)
- with tokenize.open(path) as fp:
- lines = html.escape(fp.read())
- body = '<pre>%s</pre>' % lines
- heading = html.heading(
- '<big><big><strong>File Listing</strong></big></big>',
- '#ffffff', '#7799ee')
- contents = heading + html.bigsection(
- 'File: %s' % path, '#ffffff', '#ee77aa', body)
- return 'getfile %s' % path, contents
-
def html_topics():
"""Index of topic texts available."""
@@ -2504,8 +2488,6 @@ def get_html_page(url):
op, _, url = url.partition('=')
if op == "search?key":
title, content = html_search(url)
- elif op == "getfile?key":
- title, content = html_getfile(url)
elif op == "topic?key":
# try topics first, then objects.
try:
diff --git a/Lib/test/test_pydoc.py b/Lib/test/test_pydoc.py
index 00803d3305cb53..49bc3eb164b19c 100644
--- a/Lib/test/test_pydoc.py
+++ b/Lib/test/test_pydoc.py
@@ -1052,18 +1052,12 @@ def test_url_requests(self):
("topic?key=def", "Pydoc: KEYWORD def"),
("topic?key=STRINGS", "Pydoc: TOPIC STRINGS"),
("foobar", "Pydoc: Error - foobar"),
- ("getfile?key=foobar", "Pydoc: Error - getfile?key=foobar"),
]
with self.restrict_walk_packages():
for url, title in requests:
self.call_url_handler(url, title)
- path = string.__file__
- title = "Pydoc: getfile " + path
- url = "getfile?key=" + path
- self.call_url_handler(url, title)
-
class TestHelper(unittest.TestCase):
def test_keywords(self):
diff --git a/Misc/NEWS.d/next/Security/2021-03-24-14-16-56.bpo-42988.P2aNco.rst b/Misc/NEWS.d/next/Security/2021-03-24-14-16-56.bpo-42988.P2aNco.rst
new file mode 100644
index 00000000000000..4b42dd05305a83
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2021-03-24-14-16-56.bpo-42988.P2aNco.rst
@@ -0,0 +1,4 @@
+CVE-2021-3426: Remove the ``getfile`` feature of the :mod:`pydoc` module which
+could be abused to read arbitrary files on the disk (directory traversal
+vulnerability). Moreover, even source code of Python modules can contain
+sensitive data like passwords. Vulnerability reported by David Schwörer.

View File: 00362-threading-enumerate-rlock.patch

@@ -0,0 +1,36 @@
bpo-44422: Fix threading.enumerate() reentrant call (GH-26727)
The threading.enumerate() function now uses a reentrant lock to
prevent a hang on reentrant call.
https://github.com/python/cpython/commit/243fd01047ddce1a7eb0f99a49732d123e942c63
Resolves: rhbz#1959459
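As a hypothetical illustration (not the rhbz#1959459 reproducer) of why the patch below swaps _allocate_lock() for RLock(): a plain lock cannot be taken twice by the thread that already holds it, which is what happens when threading.enumerate() is re-entered while _active_limbo_lock is held; a reentrant lock permits the nested acquisition:

    import threading

    rlock = threading.RLock()
    with rlock:
        with rlock:        # the same thread may re-acquire an RLock
            pass

    # A plain threading.Lock() would block forever on the second acquisition:
    #     lock = threading.Lock()
    #     with lock:
    #         with lock:   # deadlock
    #             ...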
diff --git a/Lib/threading.py b/Lib/threading.py
index 0ab1e46..7ab9ad8 100644
--- a/Lib/threading.py
+++ b/Lib/threading.py
@@ -727,8 +727,11 @@ _counter() # Consume 0 so first non-main thread has id 1.
def _newname(template="Thread-%d"):
return template % _counter()
-# Active thread administration
-_active_limbo_lock = _allocate_lock()
+# Active thread administration.
+#
+# bpo-44422: Use a reentrant lock to allow reentrant calls to functions like
+# threading.enumerate().
+_active_limbo_lock = RLock()
_active = {} # maps thread id to Thread object
_limbo = {}
_dangling = WeakSet()
@@ -1325,7 +1328,7 @@ def _after_fork():
# Reset _active_limbo_lock, in case we forked while the lock was held
# by another (non-forked) thread. http://bugs.python.org/issue874900
global _active_limbo_lock, _main_thread
- _active_limbo_lock = _allocate_lock()
+ _active_limbo_lock = RLock()
# fork() only copied the current thread; clear references to others.
new_active = {}

View File: 00364-thread-exit.patch

@@ -0,0 +1,43 @@
bpo-44434: Don't call PyThread_exit_thread() explicitly (GH-26758)
_thread.start_new_thread() no longer calls PyThread_exit_thread()
explicitly at the thread exit, the call was redundant.
On Linux with the glibc, pthread_cancel() loads dynamically the
libgcc_s.so.1 library. dlopen() can fail if there is no more
available file descriptor to open the file. In this case, the process
aborts with the error message:
"libgcc_s.so.1 must be installed for pthread_cancel to work"
pthread_cancel() unwinds back to the thread's wrapping function that
calls the thread entry point.
The unwind function is dynamically loaded from the libgcc_s library
since it is tightly coupled to the C compiler (GCC). The unwinder
depends on DWARF, the compiler generates DWARF, so the unwinder
belongs to the compiler.
Thanks Florian Weimer and Carlos O'Donell for their help on
investigating this issue.
https://github.com/python/cpython/commit/45a78f906d2d5fe5381d78466b11763fc56d57ba
Resolves: rhbz#1972293
diff --git a/Modules/_threadmodule.c b/Modules/_threadmodule.c
index a13b2e0..8cc035b 100644
--- a/Modules/_threadmodule.c
+++ b/Modules/_threadmodule.c
@@ -1027,7 +1027,10 @@ t_bootstrap(void *boot_raw)
nb_threads--;
PyThreadState_Clear(tstate);
PyThreadState_DeleteCurrent();
- PyThread_exit_thread();
+
+ // bpo-44434: Don't call explicitly PyThread_exit_thread(). On Linux with
+ // the glibc, pthread_exit() can abort the whole process if dlopen() fails
+ // to open the libgcc_s.so library (ex: EMFILE error).
}
static PyObject *

View File: python3.spec

@@ -14,7 +14,7 @@ URL: https://www.python.org/
# WARNING When rebasing to a new Python version,
# remember to update the python3-docs package as well
Version: %{pybasever}.8
-Release: 36%{?dist}
+Release: 40%{?dist}
License: Python
@@ -584,6 +584,32 @@ Patch356: 00356-k_and_a_options_for_pathfix.patch
# Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1918168
Patch357: 00357-CVE-2021-3177.patch
+# 00359 #
+# CVE-2021-23336 python: Web Cache Poisoning via urllib.parse.parse_qsl and
+# urllib.parse.parse_qs by using a semicolon in query parameters
+# Upstream: https://bugs.python.org/issue42967
+# Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1928904
+Patch359: 00359-CVE-2021-23336.patch
+# 00360 #
+# CVE-2021-3426: information disclosure via pydoc
+# Upstream: https://bugs.python.org/issue42988
+# Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1935913
+Patch360: 00360-CVE-2021-3426.patch
+# 00362 #
+# The threading.enumerate() function now uses a reentrant lock to
+# prevent a hang on reentrant call.
+# Upstream: https://bugs.python.org/issue44422
+# Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1959459
+Patch362: 00362-threading-enumerate-rlock.patch
+# 00364 #
+# Don't call PyThread_exit_thread() explicitly.
+# Upstream: https://bugs.python.org/issue44434
+# Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1972293
+Patch364: 00364-thread-exit.patch
# (New patches go here ^^^)
#
# When adding new patches to "python" and "python3" in Fedora, EL, etc.,
@@ -630,10 +656,10 @@ Requires: python3-setuptools-wheel
Requires: python3-pip-wheel
%endif
-# Runtime require alternatives
-Requires: %{_sbindir}/alternatives
-Requires(post): %{_sbindir}/alternatives
-Requires(postun): %{_sbindir}/alternatives
+# Require alternatives version that implements the --keep-foreign flag
+Requires: alternatives >= 1.19.1-1
+Requires(post): alternatives >= 1.19.1-1
+Requires(postun): alternatives >= 1.19.1-1
# This prevents ALL subpackages built from this spec to require
# /usr/bin/python3*. Granularity per subpackage is impossible.
@@ -774,6 +800,9 @@ Provides: %{name}-tools = %{version}-%{release}
Provides: %{name}-tools%{?_isa} = %{version}-%{release}
Obsoletes: %{name}-tools < %{version}-%{release}
+# Require alternatives version that implements the --keep-foreign flag
+Requires(postun): alternatives >= 1.19.1-1
# python36 installs the alternatives master symlink to which we attach a slave
Requires: python36
Requires(post): python36
@@ -910,6 +939,10 @@ git apply %{PATCH351}
%patch355 -p1
%patch356 -p1
%patch357 -p1
+%patch359 -p1
+%patch360 -p1
+%patch362 -p1
+%patch364 -p1
# Remove files that should be generated by the build
# (This is after patching, so that we can use patches directly from upstream)
@@ -1377,7 +1410,7 @@ alternatives --install %{_bindir}/unversioned-python \
%postun -n platform-python
# Do this only during uninstall process (not during update)
if [ $1 -eq 0 ]; then
-alternatives --remove python \
+alternatives --keep-foreign --remove python \
%{_libexecdir}/no-python
fi
@@ -1392,7 +1425,7 @@ alternatives --add-slave python3 %{_bindir}/python3.6 \
%postun -n python3-idle
# Do this only during uninstall process (not during update)
if [ $1 -eq 0 ]; then
-alternatives --remove-slave python3 %{_bindir}/python3.6 \
+alternatives --keep-foreign --remove-slave python3 %{_bindir}/python3.6 \
idle3
fi
@@ -1835,6 +1868,23 @@ fi
# ======================================================
%changelog
+* Thu Jul 29 2021 Tomas Orsava <torsava@redhat.com> - 3.6.8-40
+- Adjusted the postun scriptlets to enable upgrading to RHEL 9
+- Resolves: rhbz#1933055
+* Fri Jul 09 2021 Victor Stinner <vstinner@redhat.com> - 3.6.8-39
+- Fix reentrant call to threading.enumerate() (rhbz#1959459)
+- Don't exit Python with abort() when a thread exits and there is no available
+file descriptor to dynamically load the libgcc_s.so.1 library (rhbz#1972293)
+* Fri Apr 30 2021 Charalampos Stratakis <cstratak@redhat.com> - 3.6.8-38
+- Security fix for CVE-2021-3426: information disclosure via pydoc
+Resolves: rhbz#1935913
+* Thu Mar 04 2021 Petr Viktorin <pviktori@redhat.com> - 3.6.8-37
+- Fix for CVE-2021-23336
+Resolves: rhbz#1928904
* Fri Jan 22 2021 Lumír Balhar <lbalhar@redhat.com> - 3.6.8-36
- Fix for CVE-2021-3177
Resolves: rhbz#1918168