From 552075b562a0bcea553e8c5f234e75afbfd1d5f5 Mon Sep 17 00:00:00 2001 From: Phil Sutter Date: Thu, 9 Feb 2023 15:47:30 +0100 Subject: [PATCH] nftables-1.0.4-8.el9 * Thu Feb 09 2023 Phil Sutter [1.0.4-8.el9] - monitor: Sanitize startup race condition (Phil Sutter) [2130721] - evaluate: set eval ctx for add/update statements with integer constants (Phil Sutter) [2094894] - src: allow anon set concatenation with ether and vlan (Phil Sutter) [2094887] - evaluate: search stacked header list for matching payload dep (Phil Sutter) [2094887] - netlink_delinearize: also postprocess OP_AND in set element context (Phil Sutter) [2094887] - tests: add a test case for ether and vlan listing (Phil Sutter) [2094887] - debug: dump the l2 protocol stack (Phil Sutter) [2094887] - proto: track full stack of seen l2 protocols, not just cumulative offset (Phil Sutter) [2094887] - netlink_delinearize: postprocess binary ands in concatenations (Phil Sutter) [2094887] - netlink_delinearize: allow postprocessing on concatenated elements (Phil Sutter) [2094887] - intervals: check for EXPR_F_REMOVE in case of element mismatch (Phil Sutter) [2115627] - intervals: fix crash when trying to remove element in empty set (Phil Sutter) [2115627] - scanner: don't pop active flex scanner scope (Phil Sutter) [2113874] - parser: add missing synproxy scope closure (Phil Sutter) [2113874] - tests/py: Add a test for failing ipsec after counter (Phil Sutter) [2113874] - doc: Document limitations of ipsec expression with xfrm_interface (Phil Sutter) [1806431] Resolves: rhbz#1806431, rhbz#2094887, rhbz#2094894, rhbz#2113874, rhbz#2115627, rhbz#2130721, rhbz#2094890 --- ...itations-of-ipsec-expression-with-xf.patch | 44 +++ ...test-for-failing-ipsec-after-counter.patch | 86 ++++++ ...r-add-missing-synproxy-scope-closure.patch | 38 +++ ...-don-t-pop-active-flex-scanner-scope.patch | 144 +++++++++ ...ash-when-trying-to-remove-element-in.patch | 67 ++++ ...for-EXPR_F_REMOVE-in-case-of-element.patch | 80 +++++ ...ize-allow-postprocessing-on-concaten.patch | 76 +++++ ...ize-postprocess-binary-ands-in-conca.patch | 159 ++++++++++ ...-stack-of-seen-l2-protocols-not-just.patch | 287 ++++++++++++++++++ 0015-debug-dump-the-l2-protocol-stack.patch | 44 +++ ...test-case-for-ether-and-vlan-listing.patch | 65 ++++ ...ize-also-postprocess-OP_AND-in-set-e.patch | 99 ++++++ ...stacked-header-list-for-matching-pay.patch | 198 ++++++++++++ ...et-concatenation-with-ether-and-vlan.patch | 223 ++++++++++++++ ...l-ctx-for-add-update-statements-with.patch | 200 ++++++++++++ ...itor-Sanitize-startup-race-condition.patch | 107 +++++++ nft-test.stderr.expect | 5 +- nftables.spec | 36 ++- run-tests.stderr.expect | 13 +- 19 files changed, 1963 insertions(+), 8 deletions(-) create mode 100644 0006-doc-Document-limitations-of-ipsec-expression-with-xf.patch create mode 100644 0007-tests-py-Add-a-test-for-failing-ipsec-after-counter.patch create mode 100644 0008-parser-add-missing-synproxy-scope-closure.patch create mode 100644 0009-scanner-don-t-pop-active-flex-scanner-scope.patch create mode 100644 0010-intervals-fix-crash-when-trying-to-remove-element-in.patch create mode 100644 0011-intervals-check-for-EXPR_F_REMOVE-in-case-of-element.patch create mode 100644 0012-netlink_delinearize-allow-postprocessing-on-concaten.patch create mode 100644 0013-netlink_delinearize-postprocess-binary-ands-in-conca.patch create mode 100644 0014-proto-track-full-stack-of-seen-l2-protocols-not-just.patch create mode 100644 0015-debug-dump-the-l2-protocol-stack.patch create mode 100644 0016-tests-add-a-test-case-for-ether-and-vlan-listing.patch create mode 100644 0017-netlink_delinearize-also-postprocess-OP_AND-in-set-e.patch create mode 100644 0018-evaluate-search-stacked-header-list-for-matching-pay.patch create mode 100644 0019-src-allow-anon-set-concatenation-with-ether-and-vlan.patch create mode 100644 0020-evaluate-set-eval-ctx-for-add-update-statements-with.patch create mode 100644 0021-monitor-Sanitize-startup-race-condition.patch diff --git a/0006-doc-Document-limitations-of-ipsec-expression-with-xf.patch b/0006-doc-Document-limitations-of-ipsec-expression-with-xf.patch new file mode 100644 index 0000000..de0907b --- /dev/null +++ b/0006-doc-Document-limitations-of-ipsec-expression-with-xf.patch @@ -0,0 +1,44 @@ +From f999616d8248bf51831c4eef4ea5a2fd57805857 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:15:02 +0100 +Subject: [PATCH] doc: Document limitations of ipsec expression with + xfrm_interface + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1806431 +Upstream Status: nftables commit 446e76dbde713 + +commit 446e76dbde713327358f17a8af6ce86b8541c836 +Author: Phil Sutter +Date: Thu Jun 23 17:49:20 2022 +0200 + + doc: Document limitations of ipsec expression with xfrm_interface + + Point at a possible solution to match IPsec info of locally generated + traffic routed to an xfrm-type interface. + + Signed-off-by: Phil Sutter + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + doc/primary-expression.txt | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/doc/primary-expression.txt b/doc/primary-expression.txt +index f97778b..4d6b087 100644 +--- a/doc/primary-expression.txt ++++ b/doc/primary-expression.txt +@@ -428,6 +428,10 @@ Destination address of the tunnel| + ipv4_addr/ipv6_addr + |================================= + ++*Note:* When using xfrm_interface, this expression is not useable in output ++hook as the plain packet does not traverse it with IPsec info attached - use a ++chain in postrouting hook instead. ++ + NUMGEN EXPRESSION + ~~~~~~~~~~~~~~~~~ + +-- +2.39.1 + diff --git a/0007-tests-py-Add-a-test-for-failing-ipsec-after-counter.patch b/0007-tests-py-Add-a-test-for-failing-ipsec-after-counter.patch new file mode 100644 index 0000000..73c600c --- /dev/null +++ b/0007-tests-py-Add-a-test-for-failing-ipsec-after-counter.patch @@ -0,0 +1,86 @@ +From 39c0d8ee8c8b0a667fc1c20336b3d248d28759f3 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:18:10 +0100 +Subject: [PATCH] tests/py: Add a test for failing ipsec after counter + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2113874 +Upstream Status: nftables commit ed2426bccd3ea + +commit ed2426bccd3ea954adc8a010bf1736e8ed6a81b9 +Author: Phil Sutter +Date: Thu Jun 23 16:28:42 2022 +0200 + + tests/py: Add a test for failing ipsec after counter + + This is a bug in parser/scanner due to scoping: + + | Error: syntax error, unexpected string, expecting saddr or daddr + | add rule ip ipsec-ip4 ipsec-forw counter ipsec out ip daddr 192.168.1.2 + | ^^^^^ + + Signed-off-by: Phil Sutter + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + tests/py/inet/ipsec.t | 2 ++ + tests/py/inet/ipsec.t.json | 21 +++++++++++++++++++++ + tests/py/inet/ipsec.t.payload | 6 ++++++ + 3 files changed, 29 insertions(+) + +diff --git a/tests/py/inet/ipsec.t b/tests/py/inet/ipsec.t +index e924e9b..b18df39 100644 +--- a/tests/py/inet/ipsec.t ++++ b/tests/py/inet/ipsec.t +@@ -19,3 +19,5 @@ ipsec in ip6 daddr dead::beef;ok + ipsec out ip6 saddr dead::feed;ok + + ipsec in spnum 256 reqid 1;fail ++ ++counter ipsec out ip daddr 192.168.1.2;ok +diff --git a/tests/py/inet/ipsec.t.json b/tests/py/inet/ipsec.t.json +index d7d3a03..18a64f3 100644 +--- a/tests/py/inet/ipsec.t.json ++++ b/tests/py/inet/ipsec.t.json +@@ -134,3 +134,24 @@ + } + } + ] ++ ++# counter ipsec out ip daddr 192.168.1.2 ++[ ++ { ++ "counter": null ++ }, ++ { ++ "match": { ++ "left": { ++ "ipsec": { ++ "dir": "out", ++ "family": "ip", ++ "key": "daddr", ++ "spnum": 0 ++ } ++ }, ++ "op": "==", ++ "right": "192.168.1.2" ++ } ++ } ++] +diff --git a/tests/py/inet/ipsec.t.payload b/tests/py/inet/ipsec.t.payload +index c46a226..9648255 100644 +--- a/tests/py/inet/ipsec.t.payload ++++ b/tests/py/inet/ipsec.t.payload +@@ -37,3 +37,9 @@ ip ipsec-ip4 ipsec-forw + [ xfrm load out 0 saddr6 => reg 1 ] + [ cmp eq reg 1 0x0000adde 0x00000000 0x00000000 0xedfe0000 ] + ++# counter ipsec out ip daddr 192.168.1.2 ++ip ipsec-ip4 ipsec-forw ++ [ counter pkts 0 bytes 0 ] ++ [ xfrm load out 0 daddr4 => reg 1 ] ++ [ cmp eq reg 1 0x0201a8c0 ] ++ +-- +2.39.1 + diff --git a/0008-parser-add-missing-synproxy-scope-closure.patch b/0008-parser-add-missing-synproxy-scope-closure.patch new file mode 100644 index 0000000..9327a13 --- /dev/null +++ b/0008-parser-add-missing-synproxy-scope-closure.patch @@ -0,0 +1,38 @@ +From 8d7afedbb71217b8b7dbdec5240b822fb8c5b9d3 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:18:10 +0100 +Subject: [PATCH] parser: add missing synproxy scope closure + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2113874 +Upstream Status: nftables commit 994bf5004b365 + +commit 994bf5004b365904029f0fe8c2de587178583712 +Author: Florian Westphal +Date: Thu Jun 23 18:28:14 2022 +0200 + + parser: add missing synproxy scope closure + + Fixes: 232f2c3287fc ("scanner: synproxy: Move to own scope") + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/parser_bison.y | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/src/parser_bison.y b/src/parser_bison.y +index ca5c488..b548d5b 100644 +--- a/src/parser_bison.y ++++ b/src/parser_bison.y +@@ -2016,7 +2016,7 @@ map_block_obj_type : COUNTER close_scope_counter { $$ = NFT_OBJECT_COUNTER; } + | QUOTA close_scope_quota { $$ = NFT_OBJECT_QUOTA; } + | LIMIT close_scope_limit { $$ = NFT_OBJECT_LIMIT; } + | SECMARK close_scope_secmark { $$ = NFT_OBJECT_SECMARK; } +- | SYNPROXY { $$ = NFT_OBJECT_SYNPROXY; } ++ | SYNPROXY close_scope_synproxy { $$ = NFT_OBJECT_SYNPROXY; } + ; + + map_block : /* empty */ { $$ = $-1; } +-- +2.39.1 + diff --git a/0009-scanner-don-t-pop-active-flex-scanner-scope.patch b/0009-scanner-don-t-pop-active-flex-scanner-scope.patch new file mode 100644 index 0000000..7d54bd2 --- /dev/null +++ b/0009-scanner-don-t-pop-active-flex-scanner-scope.patch @@ -0,0 +1,144 @@ +From ec8483bd878fa1c8a7b53b243e5d7017634df65e Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:18:10 +0100 +Subject: [PATCH] scanner: don't pop active flex scanner scope + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2113874 +Upstream Status: nftables commit 8623772af0610 + +commit 8623772af06103ed4ccca3d07e55afbf3d952d6d +Author: Florian Westphal +Date: Thu Jun 23 19:56:19 2022 +0200 + + scanner: don't pop active flex scanner scope + + Currently we can pop a flex scope that is still active, i.e. the + scanner_pop_start_cond() for the scope has not been done. + + Example: + counter ipsec out ip daddr 192.168.1.2 counter name "ipsec_out" + + Here, parser fails because 'daddr' is parsed as STRING, not as DADDR token. + + Bug is as follows: + COUNTER changes scope to COUNTER. (COUNTER). + Next, IPSEC scope gets pushed, stack is: COUNTER, IPSEC. + + Then, the 'COUNTER' scope close happens. Because active scope has changed, + we cannot pop (we would pop the 'ipsec' scope in flex). + The pop operation gets delayed accordingly. + + Next, IP gets pushed, stack is: COUNTER, IPSEC, IP, plus the information + that one scope closure/pop was delayed. + + Then, the IP scope is closed. Because a pop operation was delayed, we pop again, + which brings us back to COUNTER state. + + This is bogus: The pop operation CANNOT be done yet, because the ipsec scope + is still open, but the existing code lacks the information to detect this. + + After popping the IP scope, we must remain in IPSEC scope until bison + parser calls scanner_pop_start_cond(, IPSEC). + + This adds a counter per flex scope so that we can detect this case. + In above case, after the IP scope gets closed, the "new" (previous) + scope (IPSEC) will be treated as active and its close is attempted again + on the next call to scanner_pop_start_cond(). + + After this patch, transition in above rule is: + + push counter (COUNTER) + push IPSEC (COUNTER, IPSEC) + pop COUNTER (delayed: COUNTER, IPSEC, pending-pop for COUNTER), + push IP (COUNTER, IPSEC, IP, pending-pop for COUNTER) + pop IP (COUNTER, IPSEC, pending-pop for COUNTER) + parse DADDR (we're in IPSEC scope, its valid token) + pop IPSEC (pops all remaining scopes). + + We could also resurrect the commit: + "scanner: flags: move to own scope", the test case passes with the + new scope closure logic. + + Fixes: bff106c5b277 ("scanner: add support for scope nesting") + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + include/parser.h | 3 +++ + src/scanner.l | 11 +++++++++++ + 2 files changed, 14 insertions(+) + +diff --git a/include/parser.h b/include/parser.h +index f32154c..5e5ad28 100644 +--- a/include/parser.h ++++ b/include/parser.h +@@ -26,6 +26,7 @@ struct parser_state { + unsigned int flex_state_pop; + unsigned int startcond_type; + struct list_head *cmds; ++ unsigned int *startcond_active; + }; + + enum startcond_type { +@@ -82,6 +83,8 @@ enum startcond_type { + PARSER_SC_STMT_REJECT, + PARSER_SC_STMT_SYNPROXY, + PARSER_SC_STMT_TPROXY, ++ ++ __SC_MAX + }; + + struct mnl_socket; +diff --git a/src/scanner.l b/src/scanner.l +index 2154281..ed7256b 100644 +--- a/src/scanner.l ++++ b/src/scanner.l +@@ -1148,6 +1148,8 @@ void *scanner_init(struct parser_state *state) + yylex_init_extra(state, &scanner); + yyset_out(NULL, scanner); + ++ state->startcond_active = xzalloc_array(__SC_MAX, ++ sizeof(*state->startcond_active)); + return scanner; + } + +@@ -1177,6 +1179,8 @@ void scanner_destroy(struct nft_ctx *nft) + struct parser_state *state = yyget_extra(nft->scanner); + + input_descriptor_list_destroy(state); ++ xfree(state->startcond_active); ++ + yylex_destroy(nft->scanner); + } + +@@ -1185,6 +1189,7 @@ static void scanner_push_start_cond(void *scanner, enum startcond_type type) + struct parser_state *state = yyget_extra(scanner); + + state->startcond_type = type; ++ state->startcond_active[type]++; + + yy_push_state((int)type, scanner); + } +@@ -1193,6 +1198,8 @@ void scanner_pop_start_cond(void *scanner, enum startcond_type t) + { + struct parser_state *state = yyget_extra(scanner); + ++ state->startcond_active[t]--; ++ + if (state->startcond_type != t) { + state->flex_state_pop++; + return; /* Can't pop just yet! */ +@@ -1202,6 +1209,10 @@ void scanner_pop_start_cond(void *scanner, enum startcond_type t) + state->flex_state_pop--; + state->startcond_type = yy_top_state(scanner); + yy_pop_state(scanner); ++ ++ t = state->startcond_type; ++ if (state->startcond_active[t]) ++ return; + } + + state->startcond_type = yy_top_state(scanner); +-- +2.39.1 + diff --git a/0010-intervals-fix-crash-when-trying-to-remove-element-in.patch b/0010-intervals-fix-crash-when-trying-to-remove-element-in.patch new file mode 100644 index 0000000..087d0d0 --- /dev/null +++ b/0010-intervals-fix-crash-when-trying-to-remove-element-in.patch @@ -0,0 +1,67 @@ +From 70732ebb3cd1b847e72dce5ae8fe3c69580bd778 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:25:59 +0100 +Subject: [PATCH] intervals: fix crash when trying to remove element in empty + set + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2115627 +Upstream Status: nftables commit 5357cb7b5cb93 + +commit 5357cb7b5cb93fc9b20d4d95b093d6b9f86b7727 +Author: Pablo Neira Ayuso +Date: Thu Jun 23 14:20:17 2022 +0200 + + intervals: fix crash when trying to remove element in empty set + + The set deletion routine expects an initialized set, otherwise it crashes. + + Fixes: 3e8d934e4f72 ("intervals: support to partial deletion with automerge") + Signed-off-by: Pablo Neira Ayuso + +Signed-off-by: Phil Sutter +--- + src/intervals.c | 6 +++++- + tests/shell/testcases/sets/errors_0 | 14 ++++++++++++++ + 2 files changed, 19 insertions(+), 1 deletion(-) + create mode 100755 tests/shell/testcases/sets/errors_0 + +diff --git a/src/intervals.c b/src/intervals.c +index dcc06d1..c21b3ee 100644 +--- a/src/intervals.c ++++ b/src/intervals.c +@@ -475,7 +475,11 @@ int set_delete(struct list_head *msgs, struct cmd *cmd, struct set *set, + if (set->automerge) + automerge_delete(msgs, set, init, debug_mask); + +- set_to_range(existing_set->init); ++ if (existing_set->init) { ++ set_to_range(existing_set->init); ++ } else { ++ existing_set->init = set_expr_alloc(&internal_location, set); ++ } + + list_splice_init(&init->expressions, &del_list); + +diff --git a/tests/shell/testcases/sets/errors_0 b/tests/shell/testcases/sets/errors_0 +new file mode 100755 +index 0000000..2960b69 +--- /dev/null ++++ b/tests/shell/testcases/sets/errors_0 +@@ -0,0 +1,14 @@ ++#!/bin/bash ++ ++set -e ++ ++RULESET="table ip x { ++ set y { ++ type ipv4_addr ++ flags interval ++ } ++} ++ ++delete element ip x y { 2.3.4.5 }" ++ ++$NFT -f - <<< $RULESET || exit 0 +-- +2.39.1 + diff --git a/0011-intervals-check-for-EXPR_F_REMOVE-in-case-of-element.patch b/0011-intervals-check-for-EXPR_F_REMOVE-in-case-of-element.patch new file mode 100644 index 0000000..aac35c6 --- /dev/null +++ b/0011-intervals-check-for-EXPR_F_REMOVE-in-case-of-element.patch @@ -0,0 +1,80 @@ +From 8099eb08541428efe92c4a1a4cbaf3530fd125e7 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:25:59 +0100 +Subject: [PATCH] intervals: check for EXPR_F_REMOVE in case of element + mismatch + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2115627 +Upstream Status: nftables commit 6d1ee9267e7e5 + +commit 6d1ee9267e7e5e429a84d7bb8a8644f9eebddb22 +Author: Pablo Neira Ayuso +Date: Thu Jun 23 18:41:21 2022 +0200 + + intervals: check for EXPR_F_REMOVE in case of element mismatch + + If auto-merge is disable and element to be deleted finds no exact + matching, then bail out. + + Fixes: 3e8d934e4f72 ("intervals: support to partial deletion with automerge") + Signed-off-by: Pablo Neira Ayuso + +Signed-off-by: Phil Sutter +--- + src/intervals.c | 4 ++++ + tests/shell/testcases/sets/errors_0 | 20 ++++++++++++++++++-- + 2 files changed, 22 insertions(+), 2 deletions(-) + +diff --git a/src/intervals.c b/src/intervals.c +index c21b3ee..13009ca 100644 +--- a/src/intervals.c ++++ b/src/intervals.c +@@ -421,6 +421,10 @@ static int setelem_delete(struct list_head *msgs, struct set *set, + expr_error(msgs, i, "element does not exist"); + err = -1; + goto err; ++ } else if (i->flags & EXPR_F_REMOVE) { ++ expr_error(msgs, i, "element does not exist"); ++ err = -1; ++ goto err; + } + prev = NULL; + } +diff --git a/tests/shell/testcases/sets/errors_0 b/tests/shell/testcases/sets/errors_0 +index 2960b69..a676ac7 100755 +--- a/tests/shell/testcases/sets/errors_0 ++++ b/tests/shell/testcases/sets/errors_0 +@@ -1,7 +1,5 @@ + #!/bin/bash + +-set -e +- + RULESET="table ip x { + set y { + type ipv4_addr +@@ -11,4 +9,22 @@ RULESET="table ip x { + + delete element ip x y { 2.3.4.5 }" + ++$NFT -f - <<< $RULESET ++if [ $? -eq 0 ] ++then ++ exit 1 ++fi ++ ++RULESET="table ip x { ++ set y { ++ type ipv4_addr ++ flags interval ++ } ++} ++ ++add element x y { 1.1.1.1/24 } ++delete element x y { 1.1.1.1/24 } ++add element x y { 1.1.1.1/24 } ++delete element x y { 2.2.2.2/24 }" ++ + $NFT -f - <<< $RULESET || exit 0 +-- +2.39.1 + diff --git a/0012-netlink_delinearize-allow-postprocessing-on-concaten.patch b/0012-netlink_delinearize-allow-postprocessing-on-concaten.patch new file mode 100644 index 0000000..ebadc79 --- /dev/null +++ b/0012-netlink_delinearize-allow-postprocessing-on-concaten.patch @@ -0,0 +1,76 @@ +From cff78b77d409445b0490e67bf9329e69e0bbc872 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] netlink_delinearize: allow postprocessing on concatenated + elements + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit 0542a431e8dcc + +commit 0542a431e8dccfa86fa5b1744f536e61a0b204f3 +Author: Florian Westphal +Date: Tue Jun 14 21:57:58 2022 +0200 + + netlink_delinearize: allow postprocessing on concatenated elements + + Currently there is no case where the individual expressions inside a + mapped concatenation need to be munged. + + However, to support proper delinearization for an input like + 'rule netdev nt nc set update ether saddr . vlan id timeout 5s @macset' + + we need to allow this. + + Right now, this gets listed as: + + update @macset { @ll,48,48 . @ll,112,16 & 0xfff timeout 5s } + + because the ethernet protocol is replaced by vlan beforehand, + so we fail to map @ll,48,48 to a vlan protocol. + + Likewise, we can't map the vlan info either because we cannot + cope with the 'and' operation properly, nor is it removed. + + Prepare for this by deleting and re-adding so that we do not + corrupt the linked list. + + After this, the list can be safely changed and a followup patch + can start to delete/reallocate expressions. + + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/netlink_delinearize.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c +index 068c3bb..2f13990 100644 +--- a/src/netlink_delinearize.c ++++ b/src/netlink_delinearize.c +@@ -2538,16 +2538,21 @@ static void expr_postprocess(struct rule_pp_ctx *ctx, struct expr **exprp) + unsigned int type = expr->dtype->type, ntype = 0; + int off = expr->dtype->subtypes; + const struct datatype *dtype; ++ LIST_HEAD(tmp); ++ struct expr *n; + +- list_for_each_entry(i, &expr->expressions, list) { ++ list_for_each_entry_safe(i, n, &expr->expressions, list) { + if (type) { + dtype = concat_subtype_lookup(type, --off); + expr_set_type(i, dtype, dtype->byteorder); + } ++ list_del(&i->list); + expr_postprocess(ctx, &i); ++ list_add_tail(&i->list, &tmp); + + ntype = concat_subtype_add(ntype, i->dtype->type); + } ++ list_splice(&tmp, &expr->expressions); + datatype_set(expr, concat_type_alloc(ntype)); + break; + } +-- +2.39.1 + diff --git a/0013-netlink_delinearize-postprocess-binary-ands-in-conca.patch b/0013-netlink_delinearize-postprocess-binary-ands-in-conca.patch new file mode 100644 index 0000000..1cc3d39 --- /dev/null +++ b/0013-netlink_delinearize-postprocess-binary-ands-in-conca.patch @@ -0,0 +1,159 @@ +From 29f041b93d7fc4e23c62c2e2e3cbbeaafa83b4ef Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] netlink_delinearize: postprocess binary ands in + concatenations + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit 89688c947efc3 + +commit 89688c947efc36d25c58c85650414fa3a491732e +Author: Florian Westphal +Date: Tue Jun 14 21:56:48 2022 +0200 + + netlink_delinearize: postprocess binary ands in concatenations + + Input: + update ether saddr . vlan id timeout 5s @macset + ether saddr . vlan id @macset + + Before this patch, gets rendered as: + update @macset { @ll,48,48 . @ll,112,16 & 0xfff timeout 5s } + @ll,48,48 . @ll,112,16 & 0xfff @macset + + After this, listing will show: + update @macset { @ll,48,48 . vlan id timeout 5s } + @ll,48,48 . vlan id @macset + + The @ll, ... is due to vlan description replacing the ethernet one, + so payload decode fails to take the concatenation apart (the ethernet + header payload info is matched vs. vlan template). + + This will be adjusted by a followup patch. + + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + include/netlink.h | 6 ++++++ + src/netlink_delinearize.c | 45 ++++++++++++++++++++++++++++++++++----- + 2 files changed, 46 insertions(+), 5 deletions(-) + +diff --git a/include/netlink.h b/include/netlink.h +index e8e0f68..71c888f 100644 +--- a/include/netlink.h ++++ b/include/netlink.h +@@ -42,10 +42,16 @@ struct netlink_parse_ctx { + struct netlink_ctx *nlctx; + }; + ++ ++#define RULE_PP_IN_CONCATENATION (1 << 0) ++ ++#define RULE_PP_REMOVE_OP_AND (RULE_PP_IN_CONCATENATION) ++ + struct rule_pp_ctx { + struct proto_ctx pctx; + struct payload_dep_ctx pdctx; + struct stmt *stmt; ++ unsigned int flags; + }; + + extern const struct input_descriptor indesc_netlink; +diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c +index 2f13990..cba419d 100644 +--- a/src/netlink_delinearize.c ++++ b/src/netlink_delinearize.c +@@ -2259,12 +2259,13 @@ static void binop_adjust(const struct expr *binop, struct expr *right, + } + } + +-static void binop_postprocess(struct rule_pp_ctx *ctx, struct expr *expr, +- struct expr **expr_binop) ++static void __binop_postprocess(struct rule_pp_ctx *ctx, ++ struct expr *expr, ++ struct expr *left, ++ struct expr *mask, ++ struct expr **expr_binop) + { + struct expr *binop = *expr_binop; +- struct expr *left = binop->left; +- struct expr *mask = binop->right; + unsigned int shift; + + assert(binop->etype == EXPR_BINOP); +@@ -2300,15 +2301,26 @@ static void binop_postprocess(struct rule_pp_ctx *ctx, struct expr *expr, + + assert(binop->left == left); + *expr_binop = expr_get(left); +- expr_free(binop); + + if (left->etype == EXPR_PAYLOAD) + payload_match_postprocess(ctx, expr, left); + else if (left->etype == EXPR_EXTHDR && right) + expr_set_type(right, left->dtype, left->byteorder); ++ ++ expr_free(binop); + } + } + ++static void binop_postprocess(struct rule_pp_ctx *ctx, struct expr *expr, ++ struct expr **expr_binop) ++{ ++ struct expr *binop = *expr_binop; ++ struct expr *left = binop->left; ++ struct expr *mask = binop->right; ++ ++ __binop_postprocess(ctx, expr, left, mask, expr_binop); ++} ++ + static void map_binop_postprocess(struct rule_pp_ctx *ctx, struct expr *expr) + { + struct expr *binop = expr->map; +@@ -2541,6 +2553,7 @@ static void expr_postprocess(struct rule_pp_ctx *ctx, struct expr **exprp) + LIST_HEAD(tmp); + struct expr *n; + ++ ctx->flags |= RULE_PP_IN_CONCATENATION; + list_for_each_entry_safe(i, n, &expr->expressions, list) { + if (type) { + dtype = concat_subtype_lookup(type, --off); +@@ -2552,6 +2565,7 @@ static void expr_postprocess(struct rule_pp_ctx *ctx, struct expr **exprp) + + ntype = concat_subtype_add(ntype, i->dtype->type); + } ++ ctx->flags &= ~RULE_PP_IN_CONCATENATION; + list_splice(&tmp, &expr->expressions); + datatype_set(expr, concat_type_alloc(ntype)); + break; +@@ -2568,6 +2582,27 @@ static void expr_postprocess(struct rule_pp_ctx *ctx, struct expr **exprp) + expr_set_type(expr->right, &integer_type, + BYTEORDER_HOST_ENDIAN); + break; ++ case OP_AND: ++ expr_set_type(expr->right, expr->left->dtype, ++ expr->left->byteorder); ++ ++ /* Do not process OP_AND in ordinary rule context. ++ * ++ * Removal needs to be performed as part of the relational ++ * operation because the RHS constant might need to be adjusted ++ * (shifted). ++ * ++ * This is different in set element context or concatenations: ++ * There is no relational operation (eq, neq and so on), thus ++ * it needs to be processed right away. ++ */ ++ if ((ctx->flags & RULE_PP_REMOVE_OP_AND) && ++ expr->left->etype == EXPR_PAYLOAD && ++ expr->right->etype == EXPR_VALUE) { ++ __binop_postprocess(ctx, expr, expr->left, expr->right, exprp); ++ return; ++ } ++ break; + default: + expr_set_type(expr->right, expr->left->dtype, + expr->left->byteorder); +-- +2.39.1 + diff --git a/0014-proto-track-full-stack-of-seen-l2-protocols-not-just.patch b/0014-proto-track-full-stack-of-seen-l2-protocols-not-just.patch new file mode 100644 index 0000000..c90e964 --- /dev/null +++ b/0014-proto-track-full-stack-of-seen-l2-protocols-not-just.patch @@ -0,0 +1,287 @@ +From 72f0ca53bcc6586b921ff70dac9f3a911e4ec170 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] proto: track full stack of seen l2 protocols, not just + cumulative offset + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit 0d9daa0407212 + +commit 0d9daa0407212c8cc89b3ea8aee031ddf0109b08 +Author: Florian Westphal +Date: Mon Jul 25 14:32:13 2022 +0200 + + proto: track full stack of seen l2 protocols, not just cumulative offset + + For input, a cumulative size counter of all pushed l2 headers is enough, + because we have the full expression tree available to us. + + For delinearization we need to track all seen l2 headers, else we lose + information that we might need at a later time. + + Consider: + + rule netdev nt nc set update ether saddr . vlan id + + during delinearization, the vlan proto_desc replaces the ethernet one, + and by the time we try to split the concatenation apart we will search + the ether saddr offset vs. the templates for proto_vlan. + + This replaces the offset with an array that stores the protocol + descriptions seen. + + Then, if the payload offset is larger than our description, search the + l2 stack and adjust the offset until we're within the expected offset + boundary. + + Reported-by: Eric Garver + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + include/proto.h | 3 +- + src/evaluate.c | 15 +++++++-- + src/netlink_delinearize.c | 5 --- + src/payload.c | 67 ++++++++++++++++++++++++++++++++------- + src/proto.c | 2 -- + 5 files changed, 71 insertions(+), 21 deletions(-) + +diff --git a/include/proto.h b/include/proto.h +index a04240a..35e760c 100644 +--- a/include/proto.h ++++ b/include/proto.h +@@ -193,13 +193,14 @@ struct proto_ctx { + struct { + struct location location; + const struct proto_desc *desc; +- unsigned int offset; + struct { + struct location location; + const struct proto_desc *desc; + } protos[PROTO_CTX_NUM_PROTOS]; + unsigned int num_protos; + } protocol[PROTO_BASE_MAX + 1]; ++ const struct proto_desc *stacked_ll[PROTO_CTX_NUM_PROTOS]; ++ uint8_t stacked_ll_count; + }; + + extern void proto_ctx_init(struct proto_ctx *ctx, unsigned int family, +diff --git a/src/evaluate.c b/src/evaluate.c +index 82bf131..9246064 100644 +--- a/src/evaluate.c ++++ b/src/evaluate.c +@@ -678,7 +678,13 @@ static int resolve_protocol_conflict(struct eval_ctx *ctx, + conflict_resolution_gen_dependency(ctx, link, payload, &nstmt) < 0) + return 1; + +- payload->payload.offset += ctx->pctx.protocol[base].offset; ++ if (base == PROTO_BASE_LL_HDR) { ++ unsigned int i; ++ ++ for (i = 0; i < ctx->pctx.stacked_ll_count; i++) ++ payload->payload.offset += ctx->pctx.stacked_ll[i]->length; ++ } ++ + rule_stmt_insert_at(ctx->rule, nstmt, ctx->stmt); + + return 0; +@@ -727,7 +733,12 @@ static int __expr_evaluate_payload(struct eval_ctx *ctx, struct expr *expr) + if (desc == payload->payload.desc) { + const struct proto_hdr_template *tmpl; + +- payload->payload.offset += ctx->pctx.protocol[base].offset; ++ if (desc->base == PROTO_BASE_LL_HDR) { ++ unsigned int i; ++ ++ for (i = 0; i < ctx->pctx.stacked_ll_count; i++) ++ payload->payload.offset += ctx->pctx.stacked_ll[i]->length; ++ } + check_icmp: + if (desc != &proto_icmp && desc != &proto_icmp6) + return 0; +diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c +index cba419d..0b5519d 100644 +--- a/src/netlink_delinearize.c ++++ b/src/netlink_delinearize.c +@@ -1976,11 +1976,6 @@ static void payload_match_postprocess(struct rule_pp_ctx *ctx, + struct expr *expr, + struct expr *payload) + { +- enum proto_bases base = payload->payload.base; +- +- assert(payload->payload.offset >= ctx->pctx.protocol[base].offset); +- payload->payload.offset -= ctx->pctx.protocol[base].offset; +- + switch (expr->op) { + case OP_EQ: + case OP_NEQ: +diff --git a/src/payload.c b/src/payload.c +index 66418cd..2c0d0ac 100644 +--- a/src/payload.c ++++ b/src/payload.c +@@ -116,8 +116,13 @@ static void payload_expr_pctx_update(struct proto_ctx *ctx, + if (desc->base == base->base) { + assert(base->length > 0); + +- if (!left->payload.is_raw) +- ctx->protocol[base->base].offset += base->length; ++ if (!left->payload.is_raw) { ++ if (desc->base == PROTO_BASE_LL_HDR && ++ ctx->stacked_ll_count < PROTO_CTX_NUM_PROTOS) { ++ ctx->stacked_ll[ctx->stacked_ll_count] = base; ++ ctx->stacked_ll_count++; ++ } ++ } + } + proto_ctx_update(ctx, desc->base, loc, desc); + } +@@ -869,6 +874,38 @@ void exthdr_dependency_kill(struct payload_dep_ctx *ctx, struct expr *expr, + } + } + ++static const struct proto_desc *get_stacked_desc(const struct proto_ctx *ctx, ++ const struct proto_desc *top, ++ const struct expr *e, ++ unsigned int *skip) ++{ ++ unsigned int i, total, payload_offset = e->payload.offset; ++ ++ assert(e->etype == EXPR_PAYLOAD); ++ ++ if (e->payload.base != PROTO_BASE_LL_HDR || ++ payload_offset < top->length) { ++ *skip = 0; ++ return top; ++ } ++ ++ for (i = 0, total = 0; i < ctx->stacked_ll_count; i++) { ++ const struct proto_desc *stacked; ++ ++ stacked = ctx->stacked_ll[i]; ++ if (payload_offset < stacked->length) { ++ *skip = total; ++ return stacked; ++ } ++ ++ payload_offset -= stacked->length; ++ total += stacked->length; ++ } ++ ++ *skip = total; ++ return top; ++} ++ + /** + * payload_expr_complete - fill in type information of a raw payload expr + * +@@ -880,9 +917,10 @@ void exthdr_dependency_kill(struct payload_dep_ctx *ctx, struct expr *expr, + */ + void payload_expr_complete(struct expr *expr, const struct proto_ctx *ctx) + { ++ unsigned int payload_offset = expr->payload.offset; + const struct proto_desc *desc; + const struct proto_hdr_template *tmpl; +- unsigned int i; ++ unsigned int i, total; + + assert(expr->etype == EXPR_PAYLOAD); + +@@ -891,9 +929,12 @@ void payload_expr_complete(struct expr *expr, const struct proto_ctx *ctx) + return; + assert(desc->base == expr->payload.base); + ++ desc = get_stacked_desc(ctx, desc, expr, &total); ++ payload_offset -= total; ++ + for (i = 0; i < array_size(desc->templates); i++) { + tmpl = &desc->templates[i]; +- if (tmpl->offset != expr->payload.offset || ++ if (tmpl->offset != payload_offset || + tmpl->len != expr->len) + continue; + +@@ -950,6 +991,7 @@ bool payload_expr_trim(struct expr *expr, struct expr *mask, + unsigned int payload_len = expr->len; + const struct proto_desc *desc; + unsigned int off, i, len = 0; ++ unsigned int total; + + assert(expr->etype == EXPR_PAYLOAD); + +@@ -959,10 +1001,8 @@ bool payload_expr_trim(struct expr *expr, struct expr *mask, + + assert(desc->base == expr->payload.base); + +- if (ctx->protocol[expr->payload.base].offset) { +- assert(payload_offset >= ctx->protocol[expr->payload.base].offset); +- payload_offset -= ctx->protocol[expr->payload.base].offset; +- } ++ desc = get_stacked_desc(ctx, desc, expr, &total); ++ payload_offset -= total; + + off = round_up(mask->len, BITS_PER_BYTE) - mask_len; + payload_offset += off; +@@ -1009,10 +1049,11 @@ bool payload_expr_trim(struct expr *expr, struct expr *mask, + void payload_expr_expand(struct list_head *list, struct expr *expr, + const struct proto_ctx *ctx) + { ++ unsigned int payload_offset = expr->payload.offset; + const struct proto_hdr_template *tmpl; + const struct proto_desc *desc; ++ unsigned int i, total; + struct expr *new; +- unsigned int i; + + assert(expr->etype == EXPR_PAYLOAD); + +@@ -1021,13 +1062,16 @@ void payload_expr_expand(struct list_head *list, struct expr *expr, + goto raw; + assert(desc->base == expr->payload.base); + ++ desc = get_stacked_desc(ctx, desc, expr, &total); ++ payload_offset -= total; ++ + for (i = 1; i < array_size(desc->templates); i++) { + tmpl = &desc->templates[i]; + + if (tmpl->len == 0) + break; + +- if (tmpl->offset != expr->payload.offset) ++ if (tmpl->offset != payload_offset) + continue; + + if (tmpl->icmp_dep && ctx->th_dep.icmp.type && +@@ -1039,6 +1083,7 @@ void payload_expr_expand(struct list_head *list, struct expr *expr, + list_add_tail(&new->list, list); + expr->len -= tmpl->len; + expr->payload.offset += tmpl->len; ++ payload_offset += tmpl->len; + if (expr->len == 0) + return; + } else if (expr->len > 0) { +@@ -1051,7 +1096,7 @@ void payload_expr_expand(struct list_head *list, struct expr *expr, + } + raw: + new = payload_expr_alloc(&expr->location, NULL, 0); +- payload_init_raw(new, expr->payload.base, expr->payload.offset, ++ payload_init_raw(new, expr->payload.base, payload_offset, + expr->len); + list_add_tail(&new->list, list); + } +diff --git a/src/proto.c b/src/proto.c +index a013a00..2663f21 100644 +--- a/src/proto.c ++++ b/src/proto.c +@@ -160,8 +160,6 @@ static void proto_ctx_debug(const struct proto_ctx *ctx, enum proto_bases base, + proto_base_names[i], + ctx->protocol[i].desc ? ctx->protocol[i].desc->name : + "none"); +- if (ctx->protocol[i].offset) +- pr_debug(" (offset: %u)", ctx->protocol[i].offset); + if (i == base) + pr_debug(" <-"); + pr_debug("\n"); +-- +2.39.1 + diff --git a/0015-debug-dump-the-l2-protocol-stack.patch b/0015-debug-dump-the-l2-protocol-stack.patch new file mode 100644 index 0000000..2712781 --- /dev/null +++ b/0015-debug-dump-the-l2-protocol-stack.patch @@ -0,0 +1,44 @@ +From ed9b1e973f19f66cf08ea0db8947768f0240fa10 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] debug: dump the l2 protocol stack + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit dbd5f348c71de + +commit dbd5f348c71decf0baa8fb592c576f63fa232f50 +Author: Florian Westphal +Date: Mon Jul 25 16:42:23 2022 +0200 + + debug: dump the l2 protocol stack + + Previously we used to print the cumulative size of the headers, + update this to print the tracked l2 stack. + + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/proto.c | 6 ++++++ + 1 file changed, 6 insertions(+) + +diff --git a/src/proto.c b/src/proto.c +index 2663f21..c496482 100644 +--- a/src/proto.c ++++ b/src/proto.c +@@ -154,6 +154,12 @@ static void proto_ctx_debug(const struct proto_ctx *ctx, enum proto_bases base, + if (!(debug_mask & NFT_DEBUG_PROTO_CTX)) + return; + ++ if (base == PROTO_BASE_LL_HDR && ctx->stacked_ll_count) { ++ pr_debug(" saved ll headers:"); ++ for (i = 0; i < ctx->stacked_ll_count; i++) ++ pr_debug(" %s", ctx->stacked_ll[i]->name); ++ } ++ + pr_debug("update %s protocol context:\n", proto_base_names[base]); + for (i = PROTO_BASE_LL_HDR; i <= PROTO_BASE_MAX; i++) { + pr_debug(" %-20s: %s", +-- +2.39.1 + diff --git a/0016-tests-add-a-test-case-for-ether-and-vlan-listing.patch b/0016-tests-add-a-test-case-for-ether-and-vlan-listing.patch new file mode 100644 index 0000000..c28096b --- /dev/null +++ b/0016-tests-add-a-test-case-for-ether-and-vlan-listing.patch @@ -0,0 +1,65 @@ +From a273fd3aaf89391ac84d7a155b3bb177552bf115 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] tests: add a test case for ether and vlan listing + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit f680055cd4377 + +commit f680055cd4377f2f531f5f77b3aaa7550988665d +Author: Florian Westphal +Date: Mon Jul 25 19:31:22 2022 +0200 + + tests: add a test case for ether and vlan listing + + before this patch series, test fails dump validation: + - update @macset { ether saddr . vlan id timeout 5s } counter packets 0 bytes 0 + - ether saddr . vlan id @macset + + update @macset { @ll,48,48 . @ll,112,16 & 0xfff timeout 5s } counter packets 0 bytes 0 + + @ll,48,48 . @ll,112,16 & 0xfff @macset + + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + tests/shell/testcases/sets/0070stacked_l2_headers | 6 ++++++ + .../sets/dumps/0070stacked_l2_headers.nft | 14 ++++++++++++++ + 2 files changed, 20 insertions(+) + create mode 100755 tests/shell/testcases/sets/0070stacked_l2_headers + create mode 100644 tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft + +diff --git a/tests/shell/testcases/sets/0070stacked_l2_headers b/tests/shell/testcases/sets/0070stacked_l2_headers +new file mode 100755 +index 0000000..07820b7 +--- /dev/null ++++ b/tests/shell/testcases/sets/0070stacked_l2_headers +@@ -0,0 +1,6 @@ ++#!/bin/bash ++ ++set -e ++dumpfile=$(dirname $0)/dumps/$(basename $0).nft ++ ++$NFT -f "$dumpfile" +diff --git a/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft b/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft +new file mode 100644 +index 0000000..ef254b9 +--- /dev/null ++++ b/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft +@@ -0,0 +1,14 @@ ++table netdev nt { ++ set macset { ++ typeof ether saddr . vlan id ++ size 1024 ++ flags dynamic,timeout ++ } ++ ++ chain nc { ++ update @macset { ether saddr . vlan id timeout 5s } counter packets 0 bytes 0 ++ ether saddr . vlan id @macset ++ vlan pcp 1 ++ ether saddr 0a:0b:0c:0d:0e:0f vlan id 42 ++ } ++} +-- +2.39.1 + diff --git a/0017-netlink_delinearize-also-postprocess-OP_AND-in-set-e.patch b/0017-netlink_delinearize-also-postprocess-OP_AND-in-set-e.patch new file mode 100644 index 0000000..90601ac --- /dev/null +++ b/0017-netlink_delinearize-also-postprocess-OP_AND-in-set-e.patch @@ -0,0 +1,99 @@ +From 72a88a128ebb386307e9d3ef1b71cefa52c7a0af Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:57 +0100 +Subject: [PATCH] netlink_delinearize: also postprocess OP_AND in set element + context + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit b1e3ed0335d13 + +commit b1e3ed0335d13d206a2a2698a1ba189fa396dbf3 +Author: Florian Westphal +Date: Mon Aug 1 13:03:18 2022 +0200 + + netlink_delinearize: also postprocess OP_AND in set element context + + Pablo reports: + add rule netdev nt y update @macset { vlan id timeout 5s } + + listing still shows the raw expression: + update @macset { @ll,112,16 & 0xfff timeout 5s } + + so also cover the 'set element' case. + + Reported-by: Pablo Neira Ayuso + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + include/netlink.h | 4 +++- + src/netlink_delinearize.c | 2 ++ + .../sets/dumps/0070stacked_l2_headers.nft | 14 ++++++++++++++ + 3 files changed, 19 insertions(+), 1 deletion(-) + +diff --git a/include/netlink.h b/include/netlink.h +index 71c888f..63d07ed 100644 +--- a/include/netlink.h ++++ b/include/netlink.h +@@ -44,8 +44,10 @@ struct netlink_parse_ctx { + + + #define RULE_PP_IN_CONCATENATION (1 << 0) ++#define RULE_PP_IN_SET_ELEM (1 << 1) + +-#define RULE_PP_REMOVE_OP_AND (RULE_PP_IN_CONCATENATION) ++#define RULE_PP_REMOVE_OP_AND (RULE_PP_IN_CONCATENATION | \ ++ RULE_PP_IN_SET_ELEM) + + struct rule_pp_ctx { + struct proto_ctx pctx; +diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c +index 0b5519d..c6ad84d 100644 +--- a/src/netlink_delinearize.c ++++ b/src/netlink_delinearize.c +@@ -2660,7 +2660,9 @@ static void expr_postprocess(struct rule_pp_ctx *ctx, struct expr **exprp) + expr_postprocess(ctx, &expr->prefix); + break; + case EXPR_SET_ELEM: ++ ctx->flags |= RULE_PP_IN_SET_ELEM; + expr_postprocess(ctx, &expr->key); ++ ctx->flags &= ~RULE_PP_IN_SET_ELEM; + break; + case EXPR_EXTHDR: + exthdr_dependency_kill(&ctx->pdctx, expr, ctx->pctx.family); +diff --git a/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft b/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft +index ef254b9..0057e9c 100644 +--- a/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft ++++ b/tests/shell/testcases/sets/dumps/0070stacked_l2_headers.nft +@@ -1,14 +1,28 @@ + table netdev nt { ++ set vlanidset { ++ typeof vlan id ++ size 1024 ++ flags dynamic,timeout ++ } ++ + set macset { + typeof ether saddr . vlan id + size 1024 + flags dynamic,timeout + } + ++ set ipset { ++ typeof vlan id . ip saddr ++ size 1024 ++ flags dynamic,timeout ++ } ++ + chain nc { + update @macset { ether saddr . vlan id timeout 5s } counter packets 0 bytes 0 + ether saddr . vlan id @macset + vlan pcp 1 + ether saddr 0a:0b:0c:0d:0e:0f vlan id 42 ++ update @vlanidset { vlan id timeout 5s } counter packets 0 bytes 0 ++ update @ipset { vlan id . ip saddr timeout 5s } counter packets 0 bytes 0 + } + } +-- +2.39.1 + diff --git a/0018-evaluate-search-stacked-header-list-for-matching-pay.patch b/0018-evaluate-search-stacked-header-list-for-matching-pay.patch new file mode 100644 index 0000000..3e9f329 --- /dev/null +++ b/0018-evaluate-search-stacked-header-list-for-matching-pay.patch @@ -0,0 +1,198 @@ +From 71fb4e3be07548216a831dda28f4aedfc37b2df1 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:58 +0100 +Subject: [PATCH] evaluate: search stacked header list for matching payload dep + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit 87c3041bfd244 + +commit 87c3041bfd244aaf39e644d33c0df4fe04079e1c +Author: Florian Westphal +Date: Mon Jul 25 20:02:28 2022 +0200 + + evaluate: search stacked header list for matching payload dep + + "ether saddr 0:1:2:3:4:6 vlan id 2" works, but reverse fails: + + "vlan id 2 ether saddr 0:1:2:3:4:6" will give + Error: conflicting protocols specified: vlan vs. ether + + After "proto: track full stack of seen l2 protocols, not just cumulative offset", + we have a list of all l2 headers, so search those to see if we had this + proto base in the past before rejecting this. + + Reported-by: Eric Garver + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/evaluate.c | 21 +++++++--- + tests/py/bridge/vlan.t | 3 ++ + tests/py/bridge/vlan.t.json | 56 +++++++++++++++++++++++++++ + tests/py/bridge/vlan.t.payload | 16 ++++++++ + tests/py/bridge/vlan.t.payload.netdev | 20 ++++++++++ + 5 files changed, 110 insertions(+), 6 deletions(-) + +diff --git a/src/evaluate.c b/src/evaluate.c +index 9246064..d67f915 100644 +--- a/src/evaluate.c ++++ b/src/evaluate.c +@@ -659,13 +659,22 @@ static int resolve_protocol_conflict(struct eval_ctx *ctx, + struct stmt *nstmt = NULL; + int link, err; + +- if (payload->payload.base == PROTO_BASE_LL_HDR && +- proto_is_dummy(desc)) { +- err = meta_iiftype_gen_dependency(ctx, payload, &nstmt); +- if (err < 0) +- return err; ++ if (payload->payload.base == PROTO_BASE_LL_HDR) { ++ if (proto_is_dummy(desc)) { ++ err = meta_iiftype_gen_dependency(ctx, payload, &nstmt); ++ if (err < 0) ++ return err; + +- rule_stmt_insert_at(ctx->rule, nstmt, ctx->stmt); ++ rule_stmt_insert_at(ctx->rule, nstmt, ctx->stmt); ++ } else { ++ unsigned int i; ++ ++ /* payload desc stored in the L2 header stack? No conflict. */ ++ for (i = 0; i < ctx->pctx.stacked_ll_count; i++) { ++ if (ctx->pctx.stacked_ll[i] == payload->payload.desc) ++ return 0; ++ } ++ } + } + + assert(base <= PROTO_BASE_MAX); +diff --git a/tests/py/bridge/vlan.t b/tests/py/bridge/vlan.t +index 924ed4e..4920601 100644 +--- a/tests/py/bridge/vlan.t ++++ b/tests/py/bridge/vlan.t +@@ -47,3 +47,6 @@ ether type ip vlan id 1 ip saddr 10.0.0.1;fail + + # mangling + vlan id 1 vlan id set 2;ok ++ ++ether saddr 00:01:02:03:04:05 vlan id 1;ok ++vlan id 2 ether saddr 0:1:2:3:4:6;ok;ether saddr 00:01:02:03:04:06 vlan id 2 +diff --git a/tests/py/bridge/vlan.t.json b/tests/py/bridge/vlan.t.json +index e7640f9..58d4a40 100644 +--- a/tests/py/bridge/vlan.t.json ++++ b/tests/py/bridge/vlan.t.json +@@ -761,3 +761,59 @@ + } + } + ] ++ ++# ether saddr 00:01:02:03:04:05 vlan id 1 ++[ ++ { ++ "match": { ++ "left": { ++ "payload": { ++ "field": "saddr", ++ "protocol": "ether" ++ } ++ }, ++ "op": "==", ++ "right": "00:01:02:03:04:05" ++ } ++ }, ++ { ++ "match": { ++ "left": { ++ "payload": { ++ "field": "id", ++ "protocol": "vlan" ++ } ++ }, ++ "op": "==", ++ "right": 1 ++ } ++ } ++] ++ ++# vlan id 2 ether saddr 0:1:2:3:4:6 ++[ ++ { ++ "match": { ++ "left": { ++ "payload": { ++ "field": "saddr", ++ "protocol": "ether" ++ } ++ }, ++ "op": "==", ++ "right": "00:01:02:03:04:06" ++ } ++ }, ++ { ++ "match": { ++ "left": { ++ "payload": { ++ "field": "id", ++ "protocol": "vlan" ++ } ++ }, ++ "op": "==", ++ "right": 2 ++ } ++ } ++] +diff --git a/tests/py/bridge/vlan.t.payload b/tests/py/bridge/vlan.t.payload +index 6c8d595..713670e 100644 +--- a/tests/py/bridge/vlan.t.payload ++++ b/tests/py/bridge/vlan.t.payload +@@ -276,3 +276,19 @@ bridge + [ payload load 2b @ link header + 14 => reg 1 ] + [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x00000200 ] + [ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ] ++ ++# ether saddr 00:01:02:03:04:05 vlan id 1 ++bridge test-bridge input ++ [ payload load 8b @ link header + 6 => reg 1 ] ++ [ cmp eq reg 1 0x03020100 0x00810504 ] ++ [ payload load 2b @ link header + 14 => reg 1 ] ++ [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ cmp eq reg 1 0x00000100 ] ++ ++# vlan id 2 ether saddr 0:1:2:3:4:6 ++bridge test-bridge input ++ [ payload load 8b @ link header + 6 => reg 1 ] ++ [ cmp eq reg 1 0x03020100 0x00810604 ] ++ [ payload load 2b @ link header + 14 => reg 1 ] ++ [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ cmp eq reg 1 0x00000200 ] +diff --git a/tests/py/bridge/vlan.t.payload.netdev b/tests/py/bridge/vlan.t.payload.netdev +index d2c7d74..98a2a2b 100644 +--- a/tests/py/bridge/vlan.t.payload.netdev ++++ b/tests/py/bridge/vlan.t.payload.netdev +@@ -322,3 +322,23 @@ netdev + [ payload load 2b @ link header + 14 => reg 1 ] + [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x00000200 ] + [ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ] ++ ++# vlan id 2 ether saddr 0:1:2:3:4:6 ++netdev test-netdev ingress ++ [ meta load iiftype => reg 1 ] ++ [ cmp eq reg 1 0x00000001 ] ++ [ payload load 8b @ link header + 6 => reg 1 ] ++ [ cmp eq reg 1 0x03020100 0x00810604 ] ++ [ payload load 2b @ link header + 14 => reg 1 ] ++ [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ cmp eq reg 1 0x00000200 ] ++ ++# ether saddr 00:01:02:03:04:05 vlan id 1 ++netdev test-netdev ingress ++ [ meta load iiftype => reg 1 ] ++ [ cmp eq reg 1 0x00000001 ] ++ [ payload load 8b @ link header + 6 => reg 1 ] ++ [ cmp eq reg 1 0x03020100 0x00810504 ] ++ [ payload load 2b @ link header + 14 => reg 1 ] ++ [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ cmp eq reg 1 0x00000100 ] +-- +2.39.1 + diff --git a/0019-src-allow-anon-set-concatenation-with-ether-and-vlan.patch b/0019-src-allow-anon-set-concatenation-with-ether-and-vlan.patch new file mode 100644 index 0000000..d797f10 --- /dev/null +++ b/0019-src-allow-anon-set-concatenation-with-ether-and-vlan.patch @@ -0,0 +1,223 @@ +From 99e943834fc6708fb8bd8f329988979433db19e8 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 10:27:58 +0100 +Subject: [PATCH] src: allow anon set concatenation with ether and vlan + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094887 +Upstream Status: nftables commit c1c223f1b5818 + +commit c1c223f1b58188542222ee2d9a4a8cc133d1dc3b +Author: Florian Westphal +Date: Mon Jul 25 21:34:52 2022 +0200 + + src: allow anon set concatenation with ether and vlan + + vlan id uses integer type (which has a length of 0). + + Using it was possible, but listing would assert: + python: mergesort.c:24: concat_expr_msort_value: Assertion `ilen > 0' failed. + + There are two reasons for this. + First reason is that the udata/typeof information lacks the 'vlan id' + part, because internally this is 'payload . binop(payload AND mask)'. + + binop lacks an udata store. It makes little sense to store it, + 'typeof' keyword expects normal match syntax. + + So, when storing udata, store the left hand side of the binary + operation, i.e. the load of the 2-byte key. + + With that resolved, delinerization could work, but concat_elem_expr() + would splice 12 bits off the elements value, but it should be 16 (on + a byte boundary). + + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/expression.c | 17 +++++++++-- + src/netlink.c | 10 +++++-- + tests/py/bridge/vlan.t | 2 ++ + tests/py/bridge/vlan.t.json | 41 +++++++++++++++++++++++++++ + tests/py/bridge/vlan.t.payload | 12 ++++++++ + tests/py/bridge/vlan.t.payload.netdev | 14 +++++++++ + 6 files changed, 91 insertions(+), 5 deletions(-) + +diff --git a/src/expression.c b/src/expression.c +index deb649e..7390089 100644 +--- a/src/expression.c ++++ b/src/expression.c +@@ -879,17 +879,30 @@ static void concat_expr_print(const struct expr *expr, struct output_ctx *octx) + #define NFTNL_UDATA_SET_KEY_CONCAT_SUB_DATA 1 + #define NFTNL_UDATA_SET_KEY_CONCAT_SUB_MAX 2 + ++static struct expr *expr_build_udata_recurse(struct expr *e) ++{ ++ switch (e->etype) { ++ case EXPR_BINOP: ++ return e->left; ++ default: ++ break; ++ } ++ ++ return e; ++} ++ + static int concat_expr_build_udata(struct nftnl_udata_buf *udbuf, + const struct expr *concat_expr) + { + struct nftnl_udata *nest; ++ struct expr *expr, *tmp; + unsigned int i = 0; +- struct expr *expr; + +- list_for_each_entry(expr, &concat_expr->expressions, list) { ++ list_for_each_entry_safe(expr, tmp, &concat_expr->expressions, list) { + struct nftnl_udata *nest_expr; + int err; + ++ expr = expr_build_udata_recurse(expr); + if (!expr_ops(expr)->build_udata || i >= NFT_REG32_SIZE) + return -1; + +diff --git a/src/netlink.c b/src/netlink.c +index 89d864e..799cf9b 100644 +--- a/src/netlink.c ++++ b/src/netlink.c +@@ -1114,17 +1114,21 @@ static struct expr *concat_elem_expr(struct expr *key, + struct expr *data, int *off) + { + const struct datatype *subtype; ++ unsigned int sub_length; + struct expr *expr; + + if (key) { + (*off)--; +- expr = constant_expr_splice(data, key->len); ++ sub_length = round_up(key->len, BITS_PER_BYTE); ++ ++ expr = constant_expr_splice(data, sub_length); + expr->dtype = datatype_get(key->dtype); + expr->byteorder = key->byteorder; + expr->len = key->len; + } else { + subtype = concat_subtype_lookup(dtype->type, --(*off)); +- expr = constant_expr_splice(data, subtype->size); ++ sub_length = round_up(subtype->size, BITS_PER_BYTE); ++ expr = constant_expr_splice(data, sub_length); + expr->dtype = subtype; + expr->byteorder = subtype->byteorder; + } +@@ -1136,7 +1140,7 @@ static struct expr *concat_elem_expr(struct expr *key, + expr->dtype->basetype->type == TYPE_BITMASK) + expr = bitmask_expr_to_binops(expr); + +- data->len -= netlink_padding_len(expr->len); ++ data->len -= netlink_padding_len(sub_length); + + return expr; + } +diff --git a/tests/py/bridge/vlan.t b/tests/py/bridge/vlan.t +index 4920601..95bdff4 100644 +--- a/tests/py/bridge/vlan.t ++++ b/tests/py/bridge/vlan.t +@@ -50,3 +50,5 @@ vlan id 1 vlan id set 2;ok + + ether saddr 00:01:02:03:04:05 vlan id 1;ok + vlan id 2 ether saddr 0:1:2:3:4:6;ok;ether saddr 00:01:02:03:04:06 vlan id 2 ++ ++ether saddr . vlan id { 0a:0b:0c:0d:0e:0f . 42, 0a:0b:0c:0d:0e:0f . 4095 };ok +diff --git a/tests/py/bridge/vlan.t.json b/tests/py/bridge/vlan.t.json +index 58d4a40..f77756f 100644 +--- a/tests/py/bridge/vlan.t.json ++++ b/tests/py/bridge/vlan.t.json +@@ -817,3 +817,44 @@ + } + } + ] ++ ++# ether saddr . vlan id { 0a:0b:0c:0d:0e:0f . 42, 0a:0b:0c:0d:0e:0f . 4095 } ++[ ++ { ++ "match": { ++ "left": { ++ "concat": [ ++ { ++ "payload": { ++ "field": "saddr", ++ "protocol": "ether" ++ } ++ }, ++ { ++ "payload": { ++ "field": "id", ++ "protocol": "vlan" ++ } ++ } ++ ] ++ }, ++ "op": "==", ++ "right": { ++ "set": [ ++ { ++ "concat": [ ++ "0a:0b:0c:0d:0e:0f", ++ 42 ++ ] ++ }, ++ { ++ "concat": [ ++ "0a:0b:0c:0d:0e:0f", ++ 4095 ++ ] ++ } ++ ] ++ } ++ } ++ } ++] +diff --git a/tests/py/bridge/vlan.t.payload b/tests/py/bridge/vlan.t.payload +index 713670e..62e4b89 100644 +--- a/tests/py/bridge/vlan.t.payload ++++ b/tests/py/bridge/vlan.t.payload +@@ -292,3 +292,15 @@ bridge test-bridge input + [ payload load 2b @ link header + 14 => reg 1 ] + [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] + [ cmp eq reg 1 0x00000200 ] ++ ++# ether saddr . vlan id { 0a:0b:0c:0d:0e:0f . 42, 0a:0b:0c:0d:0e:0f . 4095 } ++__set%d test-bridge 3 size 2 ++__set%d test-bridge 0 ++ element 0d0c0b0a 00000f0e 00002a00 : 0 [end] element 0d0c0b0a 00000f0e 0000ff0f : 0 [end] ++bridge test-bridge input ++ [ payload load 2b @ link header + 12 => reg 1 ] ++ [ cmp eq reg 1 0x00000081 ] ++ [ payload load 6b @ link header + 6 => reg 1 ] ++ [ payload load 2b @ link header + 14 => reg 10 ] ++ [ bitwise reg 10 = ( reg 10 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ lookup reg 1 set __set%d ] +diff --git a/tests/py/bridge/vlan.t.payload.netdev b/tests/py/bridge/vlan.t.payload.netdev +index 98a2a2b..1018d4c 100644 +--- a/tests/py/bridge/vlan.t.payload.netdev ++++ b/tests/py/bridge/vlan.t.payload.netdev +@@ -342,3 +342,17 @@ netdev test-netdev ingress + [ payload load 2b @ link header + 14 => reg 1 ] + [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ] + [ cmp eq reg 1 0x00000100 ] ++ ++# ether saddr . vlan id { 0a:0b:0c:0d:0e:0f . 42, 0a:0b:0c:0d:0e:0f . 4095 } ++__set%d test-netdev 3 size 2 ++__set%d test-netdev 0 ++ element 0d0c0b0a 00000f0e 00002a00 : 0 [end] element 0d0c0b0a 00000f0e 0000ff0f : 0 [end] ++netdev test-netdev ingress ++ [ meta load iiftype => reg 1 ] ++ [ cmp eq reg 1 0x00000001 ] ++ [ payload load 2b @ link header + 12 => reg 1 ] ++ [ cmp eq reg 1 0x00000081 ] ++ [ payload load 6b @ link header + 6 => reg 1 ] ++ [ payload load 2b @ link header + 14 => reg 10 ] ++ [ bitwise reg 10 = ( reg 10 & 0x0000ff0f ) ^ 0x00000000 ] ++ [ lookup reg 1 set __set%d ] +-- +2.39.1 + diff --git a/0020-evaluate-set-eval-ctx-for-add-update-statements-with.patch b/0020-evaluate-set-eval-ctx-for-add-update-statements-with.patch new file mode 100644 index 0000000..6550439 --- /dev/null +++ b/0020-evaluate-set-eval-ctx-for-add-update-statements-with.patch @@ -0,0 +1,200 @@ +From 5a23da04e253a012adc877a82fcdec5e2f4a1534 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 12:45:30 +0100 +Subject: [PATCH] evaluate: set eval ctx for add/update statements with integer + constants + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094894 +Upstream Status: nftables commit 4cc6b20d31498 + +commit 4cc6b20d31498d90e90ff574ce8b70276afcee8f +Author: Florian Westphal +Date: Mon Jan 23 19:03:28 2023 +0100 + + evaluate: set eval ctx for add/update statements with integer constants + + Eric reports that nft asserts when using integer basetype constants with + 'typeof' sets. Example: + table netdev t { + set s { + typeof ether saddr . vlan id + flags dynamic,timeout + } + + chain c { } + } + + loads fine. But adding a rule with add/update statement fails: + nft 'add rule netdev t c set update ether saddr . 0 @s' + nft: netlink_linearize.c:867: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. + + When the 'ether saddr . 0' concat expression is processed, there is + no set definition available anymore to deduce the required size of the + integer constant. + + nft eval step then derives the required length using the data types. + '0' has integer basetype, so the deduced length is 0. + + The assertion triggers because serialization step finds that it + needs one more register. + + 2 are needed to store the ethernet address, another register is + needed for the vlan id. + + Update eval step to make the expression context store the set key + information when processing the preceeding set reference, then + let stmt_evaluate_set() preserve the existing context instead of + zeroing it again via stmt_evaluate_arg(). + + This makes concat expression evaluation compute the total size + needed based on the sets key definition. + + Reported-by: Eric Garver + Signed-off-by: Florian Westphal + +Signed-off-by: Phil Sutter +--- + src/evaluate.c | 32 +++++++++++++++++-- + .../maps/dumps/typeof_maps_concat.nft | 11 +++++++ + tests/shell/testcases/maps/typeof_maps_concat | 6 ++++ + .../sets/dumps/typeof_sets_concat.nft | 12 +++++++ + tests/shell/testcases/sets/typeof_sets_concat | 6 ++++ + 5 files changed, 65 insertions(+), 2 deletions(-) + create mode 100644 tests/shell/testcases/maps/dumps/typeof_maps_concat.nft + create mode 100755 tests/shell/testcases/maps/typeof_maps_concat + create mode 100644 tests/shell/testcases/sets/dumps/typeof_sets_concat.nft + create mode 100755 tests/shell/testcases/sets/typeof_sets_concat + +diff --git a/src/evaluate.c b/src/evaluate.c +index d67f915..7f81411 100644 +--- a/src/evaluate.c ++++ b/src/evaluate.c +@@ -1526,6 +1526,14 @@ static int interval_set_eval(struct eval_ctx *ctx, struct set *set, + return ret; + } + ++static void expr_evaluate_set_ref(struct eval_ctx *ctx, struct expr *expr) ++{ ++ struct set *set = expr->set; ++ ++ expr_set_context(&ctx->ectx, set->key->dtype, set->key->len); ++ ctx->ectx.key = set->key; ++} ++ + static int expr_evaluate_set(struct eval_ctx *ctx, struct expr **expr) + { + struct expr *set = *expr, *i, *next; +@@ -2388,6 +2396,7 @@ static int expr_evaluate(struct eval_ctx *ctx, struct expr **expr) + case EXPR_VARIABLE: + return expr_evaluate_variable(ctx, expr); + case EXPR_SET_REF: ++ expr_evaluate_set_ref(ctx, *expr); + return 0; + case EXPR_VALUE: + return expr_evaluate_value(ctx, expr); +@@ -2550,6 +2559,25 @@ static int stmt_evaluate_arg(struct eval_ctx *ctx, struct stmt *stmt, + return __stmt_evaluate_arg(ctx, stmt, dtype, len, byteorder, expr); + } + ++/* like stmt_evaluate_arg, but keep existing context created ++ * by previous expr_evaluate(). ++ * ++ * This is needed for add/update statements: ++ * ctx->ectx.key has the set key, which may be needed for 'typeof' ++ * sets: the 'add/update' expression might contain integer data types. ++ * ++ * Without the key we cannot derive the element size. ++ */ ++static int stmt_evaluate_key(struct eval_ctx *ctx, struct stmt *stmt, ++ const struct datatype *dtype, unsigned int len, ++ enum byteorder byteorder, struct expr **expr) ++{ ++ if (expr_evaluate(ctx, expr) < 0) ++ return -1; ++ ++ return __stmt_evaluate_arg(ctx, stmt, dtype, len, byteorder, expr); ++} ++ + static int stmt_evaluate_verdict(struct eval_ctx *ctx, struct stmt *stmt) + { + if (stmt_evaluate_arg(ctx, stmt, &verdict_type, 0, 0, &stmt->expr) < 0) +@@ -3762,7 +3790,7 @@ static int stmt_evaluate_set(struct eval_ctx *ctx, struct stmt *stmt) + return expr_error(ctx->msgs, stmt->set.set, + "Expression does not refer to a set"); + +- if (stmt_evaluate_arg(ctx, stmt, ++ if (stmt_evaluate_key(ctx, stmt, + stmt->set.set->set->key->dtype, + stmt->set.set->set->key->len, + stmt->set.set->set->key->byteorder, +@@ -3805,7 +3833,7 @@ static int stmt_evaluate_map(struct eval_ctx *ctx, struct stmt *stmt) + return expr_error(ctx->msgs, stmt->map.set, + "Expression does not refer to a set"); + +- if (stmt_evaluate_arg(ctx, stmt, ++ if (stmt_evaluate_key(ctx, stmt, + stmt->map.set->set->key->dtype, + stmt->map.set->set->key->len, + stmt->map.set->set->key->byteorder, +diff --git a/tests/shell/testcases/maps/dumps/typeof_maps_concat.nft b/tests/shell/testcases/maps/dumps/typeof_maps_concat.nft +new file mode 100644 +index 0000000..1ca98d8 +--- /dev/null ++++ b/tests/shell/testcases/maps/dumps/typeof_maps_concat.nft +@@ -0,0 +1,11 @@ ++table netdev t { ++ map m { ++ typeof ether saddr . vlan id : meta mark ++ size 1234 ++ flags dynamic,timeout ++ } ++ ++ chain c { ++ ether type != 8021q update @m { ether daddr . 123 timeout 1m : 0x0000002a } counter packets 0 bytes 0 return ++ } ++} +diff --git a/tests/shell/testcases/maps/typeof_maps_concat b/tests/shell/testcases/maps/typeof_maps_concat +new file mode 100755 +index 0000000..07820b7 +--- /dev/null ++++ b/tests/shell/testcases/maps/typeof_maps_concat +@@ -0,0 +1,6 @@ ++#!/bin/bash ++ ++set -e ++dumpfile=$(dirname $0)/dumps/$(basename $0).nft ++ ++$NFT -f "$dumpfile" +diff --git a/tests/shell/testcases/sets/dumps/typeof_sets_concat.nft b/tests/shell/testcases/sets/dumps/typeof_sets_concat.nft +new file mode 100644 +index 0000000..dbaf7cd +--- /dev/null ++++ b/tests/shell/testcases/sets/dumps/typeof_sets_concat.nft +@@ -0,0 +1,12 @@ ++table netdev t { ++ set s { ++ typeof ether saddr . vlan id ++ size 2048 ++ flags dynamic,timeout ++ } ++ ++ chain c { ++ ether type != 8021q add @s { ether saddr . 0 timeout 5s } counter packets 0 bytes 0 return ++ ether type != 8021q update @s { ether daddr . 123 timeout 1m } counter packets 0 bytes 0 return ++ } ++} +diff --git a/tests/shell/testcases/sets/typeof_sets_concat b/tests/shell/testcases/sets/typeof_sets_concat +new file mode 100755 +index 0000000..07820b7 +--- /dev/null ++++ b/tests/shell/testcases/sets/typeof_sets_concat +@@ -0,0 +1,6 @@ ++#!/bin/bash ++ ++set -e ++dumpfile=$(dirname $0)/dumps/$(basename $0).nft ++ ++$NFT -f "$dumpfile" +-- +2.39.1 + diff --git a/0021-monitor-Sanitize-startup-race-condition.patch b/0021-monitor-Sanitize-startup-race-condition.patch new file mode 100644 index 0000000..f1dffc7 --- /dev/null +++ b/0021-monitor-Sanitize-startup-race-condition.patch @@ -0,0 +1,107 @@ +From ef0f28bc8efdc5292ceb593c9c5a646f030e0994 Mon Sep 17 00:00:00 2001 +From: Phil Sutter +Date: Thu, 9 Feb 2023 15:25:37 +0100 +Subject: [PATCH] monitor: Sanitize startup race condition + +Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2130721 +Upstream Status: nftables commit 545edb7a8ef0a + +commit 545edb7a8ef0a8acf991b1b7857fddc24d7b151a +Author: Phil Sutter +Date: Wed Sep 28 23:26:42 2022 +0200 + + monitor: Sanitize startup race condition + + During startup, 'nft monitor' first fetches the current ruleset and then + keeps this cache up to date based on received events. This is racey, as + any ruleset changes in between the initial fetch and the socket opening + are not recognized. + + This script demonstrates the problem: + + | #!/bin/bash + | + | while true; do + | nft flush ruleset + | iptables-nft -A FORWARD + | done & + | maniploop=$! + | + | trap "kill $maniploop; kill \$!; wait" EXIT + | + | while true; do + | nft monitor rules >/dev/null & + | sleep 0.2 + | kill $! + | done + + If the table add event is missed, the rule add event callback fails to + deserialize the rule and calls abort(). + + Avoid the inconvenient program exit by returning NULL from + netlink_delinearize_rule() instead of aborting and make callers check + the return value. + + Signed-off-by: Phil Sutter + +Signed-off-by: Phil Sutter +--- + src/cache.c | 1 + + src/monitor.c | 5 +++++ + src/netlink_delinearize.c | 5 ++++- + 3 files changed, 10 insertions(+), 1 deletion(-) + +diff --git a/src/cache.c b/src/cache.c +index fd8df88..701aec6 100644 +--- a/src/cache.c ++++ b/src/cache.c +@@ -490,6 +490,7 @@ static int list_rule_cb(struct nftnl_rule *nlr, void *data) + + netlink_dump_rule(nlr, ctx); + rule = netlink_delinearize_rule(ctx, nlr); ++ assert(rule); + list_add_tail(&rule->list, &ctx->list); + + return 0; +diff --git a/src/monitor.c b/src/monitor.c +index 7fa92eb..a6b30a1 100644 +--- a/src/monitor.c ++++ b/src/monitor.c +@@ -551,6 +551,10 @@ static int netlink_events_rule_cb(const struct nlmsghdr *nlh, int type, + + nlr = netlink_rule_alloc(nlh); + r = netlink_delinearize_rule(monh->ctx, nlr); ++ if (!r) { ++ fprintf(stderr, "W: Received event for an unknown table.\n"); ++ goto out_free_nlr; ++ } + nlr_for_each_set(nlr, rule_map_decompose_cb, NULL, + &monh->ctx->nft->cache); + cmd = netlink_msg2cmd(type, nlh->nlmsg_flags); +@@ -587,6 +591,7 @@ static int netlink_events_rule_cb(const struct nlmsghdr *nlh, int type, + break; + } + rule_free(r); ++out_free_nlr: + nftnl_rule_free(nlr); + return MNL_CB_OK; + } +diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c +index c6ad84d..1d47c74 100644 +--- a/src/netlink_delinearize.c ++++ b/src/netlink_delinearize.c +@@ -3194,7 +3194,10 @@ struct rule *netlink_delinearize_rule(struct netlink_ctx *ctx, + pctx->rule = rule_alloc(&netlink_location, &h); + pctx->table = table_cache_find(&ctx->nft->cache.table_cache, + h.table.name, h.family); +- assert(pctx->table != NULL); ++ if (!pctx->table) { ++ errno = ENOENT; ++ return NULL; ++ } + + pctx->rule->comment = nftnl_rule_get_comment(nlr); + +-- +2.39.1 + diff --git a/nft-test.stderr.expect b/nft-test.stderr.expect index ff57d6d..a90ee54 100644 --- a/nft-test.stderr.expect +++ b/nft-test.stderr.expect @@ -307,7 +307,10 @@ bridge/vlan.t: ERROR: line 40: add rule netdev test-netdev egress ether type 802 bridge/vlan.t: ERROR: line 41: add rule netdev test-netdev egress ether type 8021ad vlan id 1 vlan type 8021q vlan id 2 vlan type ip counter: This rule should not have failed. bridge/vlan.t: ERROR: line 42: add rule netdev test-netdev egress ether type 8021ad vlan id 1 vlan type 8021q vlan id 2 vlan type ip ip protocol 6: This rule should not have failed. bridge/vlan.t: ERROR: line 49: add rule netdev test-netdev egress vlan id 1 vlan id set 2: This rule should not have failed. -bridge/vlan.t: ERROR: line 49: The chain egress does not exist in netdev test-netdev. I cannot delete it. +bridge/vlan.t: ERROR: line 51: add rule netdev test-netdev egress ether saddr 00:01:02:03:04:05 vlan id 1: This rule should not have failed. +bridge/vlan.t: ERROR: line 52: add rule netdev test-netdev egress vlan id 2 ether saddr 0:1:2:3:4:6: This rule should not have failed. +bridge/vlan.t: ERROR: line 54: add rule netdev test-netdev egress ether saddr . vlan id { 0a:0b:0c:0d:0e:0f . 42, 0a:0b:0c:0d:0e:0f . 4095 }: This rule should not have failed. +bridge/vlan.t: ERROR: line 54: The chain egress does not exist in netdev test-netdev. I cannot delete it. inet/dccp.t: ERROR: line 3: I cannot create the chain 'egress' inet/dccp.t: ERROR: line 10: add rule netdev test-netdev egress dccp sport 21-35: This rule should not have failed. inet/dccp.t: ERROR: line 11: add rule netdev test-netdev egress dccp sport != 21-35: This rule should not have failed. diff --git a/nftables.spec b/nftables.spec index ef5bb89..503ad47 100644 --- a/nftables.spec +++ b/nftables.spec @@ -1,5 +1,5 @@ %define rpmversion 1.0.4 -%define specrelease 7 +%define specrelease 8 Name: nftables Version: %{rpmversion} @@ -24,6 +24,22 @@ Patch2: 0002-rule-collapse-set-element-commands.patch Patch3: 0003-intervals-do-not-report-exact-overlaps-for-new-eleme.patch Patch4: 0004-intervals-do-not-empty-cache-for-maps.patch Patch5: 0005-intervals-Do-not-sort-cached-set-elements-over-and-o.patch +Patch6: 0006-doc-Document-limitations-of-ipsec-expression-with-xf.patch +Patch7: 0007-tests-py-Add-a-test-for-failing-ipsec-after-counter.patch +Patch8: 0008-parser-add-missing-synproxy-scope-closure.patch +Patch9: 0009-scanner-don-t-pop-active-flex-scanner-scope.patch +Patch10: 0010-intervals-fix-crash-when-trying-to-remove-element-in.patch +Patch11: 0011-intervals-check-for-EXPR_F_REMOVE-in-case-of-element.patch +Patch12: 0012-netlink_delinearize-allow-postprocessing-on-concaten.patch +Patch13: 0013-netlink_delinearize-postprocess-binary-ands-in-conca.patch +Patch14: 0014-proto-track-full-stack-of-seen-l2-protocols-not-just.patch +Patch15: 0015-debug-dump-the-l2-protocol-stack.patch +Patch16: 0016-tests-add-a-test-case-for-ether-and-vlan-listing.patch +Patch17: 0017-netlink_delinearize-also-postprocess-OP_AND-in-set-e.patch +Patch18: 0018-evaluate-search-stacked-header-list-for-matching-pay.patch +Patch19: 0019-src-allow-anon-set-concatenation-with-ether-and-vlan.patch +Patch20: 0020-evaluate-set-eval-ctx-for-add-update-statements-with.patch +Patch21: 0021-monitor-Sanitize-startup-race-condition.patch BuildRequires: autoconf BuildRequires: automake @@ -135,6 +151,24 @@ sed -i -e 's/\(sofile=\)".*"/\1"'$sofile'"/' \ %{python3_sitelib}/nftables/ %changelog +* Thu Feb 09 2023 Phil Sutter [1.0.4-8.el9] +- monitor: Sanitize startup race condition (Phil Sutter) [2130721] +- evaluate: set eval ctx for add/update statements with integer constants (Phil Sutter) [2094894] +- src: allow anon set concatenation with ether and vlan (Phil Sutter) [2094887] +- evaluate: search stacked header list for matching payload dep (Phil Sutter) [2094887] +- netlink_delinearize: also postprocess OP_AND in set element context (Phil Sutter) [2094887] +- tests: add a test case for ether and vlan listing (Phil Sutter) [2094887] +- debug: dump the l2 protocol stack (Phil Sutter) [2094887] +- proto: track full stack of seen l2 protocols, not just cumulative offset (Phil Sutter) [2094887] +- netlink_delinearize: postprocess binary ands in concatenations (Phil Sutter) [2094887] +- netlink_delinearize: allow postprocessing on concatenated elements (Phil Sutter) [2094887] +- intervals: check for EXPR_F_REMOVE in case of element mismatch (Phil Sutter) [2115627] +- intervals: fix crash when trying to remove element in empty set (Phil Sutter) [2115627] +- scanner: don't pop active flex scanner scope (Phil Sutter) [2113874] +- parser: add missing synproxy scope closure (Phil Sutter) [2113874] +- tests/py: Add a test for failing ipsec after counter (Phil Sutter) [2113874] +- doc: Document limitations of ipsec expression with xfrm_interface (Phil Sutter) [1806431] + * Tue Jan 31 2023 Phil Sutter [1.0.4-7.el9] - One more attempt at fixing expected error records (Phil Sutter) [1973687] diff --git a/run-tests.stderr.expect b/run-tests.stderr.expect index ab2a7b0..7a58598 100644 --- a/run-tests.stderr.expect +++ b/run-tests.stderr.expect @@ -1,6 +1,7 @@ -W: [FAILED] ././tests/shell/testcases/cache/0010_implicit_chain_0 -W: [FAILED] ././tests/shell/testcases/chains/0021prio_0 -W: [FAILED] ././tests/shell/testcases/chains/0041chain_binding_0 -W: [FAILED] ././tests/shell/testcases/maps/typeof_integer_0 -W: [FAILED] ././tests/shell/testcases/maps/typeof_raw_0 -W: [FAILED] ././tests/shell/testcases/sets/typeof_raw_0 +W: [FAILED] ././tests/shell/testcases/cache/0010_implicit_chain_0 +W: [FAILED] ././tests/shell/testcases/chains/0021prio_0 +W: [FAILED] ././tests/shell/testcases/chains/0041chain_binding_0 +W: [FAILED] ././tests/shell/testcases/maps/typeof_integer_0 +W: [DUMP FAIL] ././tests/shell/testcases/maps/typeof_maps_concat +W: [FAILED] ././tests/shell/testcases/maps/typeof_raw_0 +W: [FAILED] ././tests/shell/testcases/sets/typeof_raw_0