spamassassin/SOURCES/spamassassin-3.4.2-fix-rawbody-rules-documentation.patch

--- a/lib/Mail/SpamAssassin/Conf.pm	2019/08/01 12:28:38	1864149
+++ b/lib/Mail/SpamAssassin/Conf.pm	2019/08/08 08:11:36	1864686
@@ -3066,12 +3066,19 @@
 as per the header tests, C<#> must be escaped (C<\#>) or else it is considered
 the beginning of a comment.

-The 'body' in this case is the textual parts of the message body;
-any non-text MIME parts are stripped, and the message decoded from
-Quoted-Printable or Base-64-encoded format if necessary.  The message
-Subject header is considered part of the body and becomes the first
-paragraph when running the rules.  All HTML tags and line breaks will
-be removed before matching.
+The 'body' in this case is the textual parts of the message body; any
+non-text MIME parts are stripped, and the message decoded from
+Quoted-Printable or Base-64-encoded format if necessary.  Parts declared as
+text/html will be rendered from HTML to text.
+
+Each body paragraph (a double-newline-separated block of text) is turned into
+a single line, with line breaks removed and whitespace normalized.  Any line
+longer than 2kB is split into shorter separate lines (at a boundary when
+possible); this may unexpectedly prevent a pattern from matching.  Patterns
+are matched independently against each of these lines.
+
+Note that the message Subject header is considered part of the body and
+becomes the first line when running the rules.

 =item body SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])

@@ -3152,6 +3159,10 @@
 tags and line breaks will still be present.  Multiline expressions will
 need to be used to match strings that are broken by line breaks.

+Note that the text is split into 2-4kB chunks (at a word boundary when
+possible); this may unexpectedly prevent a pattern from matching.  Patterns
+are matched independently against each of these chunks.
+
 =item rawbody SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])

 Define a raw-body eval test.  See above.
--- a/lib/Mail/SpamAssassin/PerMsgStatus.pm	2019/08/03 13:55:00	1864336
+++ b/lib/Mail/SpamAssassin/PerMsgStatus.pm	2019/08/08 08:11:36	1864686
@@ -1769,8 +1769,10 @@
 Returns the message body, with B<base64> or B<quoted-printable> encodings
 decoded, and non-text parts or non-inline attachments stripped.

-It is returned as an array of strings, with each string representing
-one newline-separated line of the body.
+This is the same text that 'rawbody' rules are matched against.
+
+It is returned as an array of strings, with each string being a 2-4kB chunk
+of the body, split at boundaries where possible.

 =cut

@@ -1784,6 +1786,8 @@
 get_decoded_body_text_array()), with HTML rendered, and with whitespace
 normalized.

+This is the same text that 'body' rules are matched against.
+
 It will always render text/html, and will use a heuristic to determine if other
 text/* parts should be considered text/html.