import iproute-5.15.0-4.el8

CentOS Sources 2022-05-10 03:16:50 -04:00 committed by Stepan Oksanichenko
parent 60a68c2bd4
commit 9bc5f8a379
34 changed files with 2877 additions and 1937 deletions

.gitignore

@@ -1 +1 @@
-SOURCES/iproute2-5.12.0.tar.xz
+SOURCES/iproute2-5.15.0.tar.xz

@@ -1 +1 @@
-4e18c1d72a29f41a5968ac8a9b266470f6ad89a7 SOURCES/iproute2-5.12.0.tar.xz
+6cae5b261051a5f54596fea6647bf76cb87515a0 SOURCES/iproute2-5.15.0.tar.xz

@@ -0,0 +1,74 @@
From b30268eda844bdebbb8e5e4f5735e3b1bb666368 Mon Sep 17 00:00:00 2001
Message-Id: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: fix parsing issue on include_dir option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit 1d819dcc
commit 1d819dcc741e25958190e31f8186c940713fa0a8
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:49 2021 +0200
configure: fix parsing issue on include_dir option
configure is stuck in an endless loop if '--include_dir' option is used
without a value:
$ ./configure --include_dir
./configure: line 506: shift: 2: shift count out of range
./configure: line 506: shift: 2: shift count out of range
[...]
Fix it by splitting 'shift 2' into two consecutive shifts, and making the
second one conditional on the number of remaining arguments.
A check is also provided after the while loop to verify the include dir
exists; this avoids producing an erroneous configuration.
Fixes: a9c3d70d902a ("configure: add options ability")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/configure b/configure
index 7f4f3bd9..ea9051ab 100755
--- a/configure
+++ b/configure
@@ -485,7 +485,7 @@ usage()
{
cat <<EOF
Usage: $0 [OPTIONS]
- --include_dir Path to iproute2 include dir
+ --include_dir <dir> Path to iproute2 include dir
--libbpf_dir Path to libbpf DESTDIR
--libbpf_force Enable/disable libbpf by force. Available options:
on: require link against libbpf, quit config if no libbpf support
@@ -502,8 +502,9 @@ else
while true; do
case "$1" in
--include_dir)
- INCLUDE=$2
- shift 2 ;;
+ shift
+ INCLUDE="$1"
+ [ "$#" -gt 0 ] && shift ;;
--libbpf_dir)
LIBBPF_DIR="$2"
shift 2 ;;
@@ -523,6 +524,8 @@ else
done
fi
+[ -d "$INCLUDE" ] || usage 1
+
echo "# Generated config based on" $INCLUDE >$CONFIG
quiet_config >> $CONFIG
--
2.31.1
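
The conditional-shift pattern described in this patch can be tried in isolation. The following is a minimal sketch, not the actual configure code: `parse_include_dir` is a hypothetical helper, the loop is driven by `$#` rather than `while true`, and `${1-}` is used instead of `$1` only so the sketch also runs under `set -u`.

```shell
#!/bin/sh
# Hypothetical helper (not part of iproute2's configure): mimics the fixed
# --include_dir branch and prints whatever value it captured.
parse_include_dir() {
    include=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --include_dir)
                shift                     # drop the option name
                include="${1-}"           # empty when no value follows
                [ "$#" -gt 0 ] && shift   # drop the value only if one exists
                ;;
            *)
                shift ;;                  # skip everything else in this sketch
        esac
    done
    printf '%s\n' "$include"
}

parse_include_dir --include_dir /usr/include   # prints /usr/include
parse_include_dir --include_dir                # prints an empty line and returns
```

With the original `shift 2`, the second call would never consume its last argument in shells where a failed `shift` is not fatal, which is the endless loop the commit message reports.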

@@ -1,73 +0,0 @@
From d9bcc70051d23c62cc802a356dc7e4324398765e Mon Sep 17 00:00:00 2001
Message-Id: <d9bcc70051d23c62cc802a356dc7e4324398765e.1624894546.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 28 Jun 2021 15:22:17 +0200
Subject: [PATCH] tc: f_flower: Add option to match on related ct state
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1957243
Upstream Status: unknown commit 7fda6c58
commit 7fda6c588a295ad381fdf0b9b9971169b2f9d9dc
Author: Ariel Levkovich <lariel@nvidia.com>
Date: Fri May 21 20:07:06 2021 +0300
tc: f_flower: Add option to match on related ct state
Add support for matching on ct_state flag related.
The related state indicates a packet is associated with an existing
connection.
Example:
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
ct_state -est-rel+trk \
action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
ct_state +rel+trk \
action mirred egress redirect dev ens1f0_1
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
man/man8/tc-flower.8 | 2 ++
tc/f_flower.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index f7336b62..4541d937 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -391,6 +391,8 @@ rpl - The packet is in the reply direction, meaning that it is in the opposite d
.TP
inv - The state is invalid. The packet couldn't be associated to a connection.
.TP
+rel - The packet is related to an existing connection.
+.TP
Example: +trk+est
.RE
.TP
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 53822a95..29db2e23 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -94,7 +94,7 @@ static void explain(void)
" LSE := lse depth DEPTH { label LABEL | tc TC | bos BOS | ttl TTL }\n"
" FILTERID := X:Y:Z\n"
" MASKED_LLADDR := { LLADDR | LLADDR/MASK | LLADDR/BITS }\n"
- " MASKED_CT_STATE := combination of {+|-} and flags trk,est,new\n"
+ " MASKED_CT_STATE := combination of {+|-} and flags trk,est,new,rel\n"
" ACTION-SPEC := ... look at individual actions\n"
"\n"
"NOTE: CLASSID, IP-PROTO are parsed as hexadecimal input.\n"
@@ -345,6 +345,7 @@ static struct flower_ct_states {
{ "trk", TCA_FLOWER_KEY_CT_FLAGS_TRACKED },
{ "new", TCA_FLOWER_KEY_CT_FLAGS_NEW },
{ "est", TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED },
+ { "rel", TCA_FLOWER_KEY_CT_FLAGS_RELATED },
{ "inv", TCA_FLOWER_KEY_CT_FLAGS_INVALID },
{ "rpl", TCA_FLOWER_KEY_CT_FLAGS_REPLY },
};
--
2.31.1

@@ -0,0 +1,79 @@
From a9cf0f0c57cf978ebe2abfd4c5a1b7df94f0a8ac Mon Sep 17 00:00:00 2001
Message-Id: <a9cf0f0c57cf978ebe2abfd4c5a1b7df94f0a8ac.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: fix parsing issue on libbpf_dir option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit 48c379bc
commit 48c379bc2afd43b3246f68ed46475f5318b1218f
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:50 2021 +0200
configure: fix parsing issue on libbpf_dir option
configure is stuck in an endless loop if '--libbpf_dir' option is used
without a value:
$ ./configure --libbpf_dir
./configure: line 515: shift: 2: shift count out of range
./configure: line 515: shift: 2: shift count out of range
[...]
Fix it by splitting 'shift 2' into two consecutive shifts, and making the
second one conditional on the number of remaining arguments.
A check is also provided after the while loop to verify the libbpf dir
exists; also, as LIBBPF_DIR does not have a default value, configure bails
out if the user does not specify a value after --libbpf_dir, thus avoiding
an erroneous configuration.
Fixes: 7ae2585b865a ("configure: convert LIBBPF environment variables to command-line options")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/configure b/configure
index ea9051ab..0f304206 100755
--- a/configure
+++ b/configure
@@ -486,7 +486,7 @@ usage()
cat <<EOF
Usage: $0 [OPTIONS]
--include_dir <dir> Path to iproute2 include dir
- --libbpf_dir Path to libbpf DESTDIR
+ --libbpf_dir <dir> Path to libbpf DESTDIR
--libbpf_force Enable/disable libbpf by force. Available options:
on: require link against libbpf, quit config if no libbpf support
off: disable libbpf probing
@@ -506,8 +506,9 @@ else
INCLUDE="$1"
[ "$#" -gt 0 ] && shift ;;
--libbpf_dir)
- LIBBPF_DIR="$2"
- shift 2 ;;
+ shift
+ LIBBPF_DIR="$1"
+ [ "$#" -gt 0 ] && shift ;;
--libbpf_force)
if [ "$2" != 'on' ] && [ "$2" != 'off' ]; then
usage 1
@@ -525,6 +526,9 @@ else
fi
[ -d "$INCLUDE" ] || usage 1
+if [ "${LIBBPF_DIR-unused}" != "unused" ]; then
+ [ -d "$LIBBPF_DIR" ] || usage 1
+fi
echo "# Generated config based on" $INCLUDE >$CONFIG
quiet_config >> $CONFIG
--
2.31.1
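
The `${LIBBPF_DIR-unused}` test added by this patch separates "never set" from "set but empty". The following is a minimal sketch of that distinction, assuming a hypothetical helper `check_libbpf_dir` that prints a label where configure would call `usage 1` or continue:

```shell
#!/bin/sh
# Hypothetical helper (not part of configure): no argument means the
# variable stays unset; one argument means it is set to that value.
check_libbpf_dir() {
    unset dir
    if [ "$#" -gt 0 ]; then
        dir="$1"
    fi
    # ${dir-unused} expands to "unused" only when dir is unset,
    # not when it is set to the empty string.
    if [ "${dir-unused}" != "unused" ]; then
        if [ -d "$dir" ]; then
            echo "ok"
        else
            echo "usage-error"
        fi
    else
        echo "skipped"
    fi
}

check_libbpf_dir        # prints skipped (option never given)
check_libbpf_dir ""     # prints usage-error (option given without a value)
check_libbpf_dir /      # prints ok (existing directory)
```

One consequence of using a sentinel string is that a directory literally named `unused` would also skip the existence check.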

@@ -1,43 +0,0 @@
From 5f12d06dac98f9085273ce548d2ed13341c920fe Mon Sep 17 00:00:00 2001
Message-Id: <5f12d06dac98f9085273ce548d2ed13341c920fe.1624894546.git.aclaudi@redhat.com>
In-Reply-To: <d9bcc70051d23c62cc802a356dc7e4324398765e.1624894546.git.aclaudi@redhat.com>
References: <d9bcc70051d23c62cc802a356dc7e4324398765e.1624894546.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 28 Jun 2021 15:22:17 +0200
Subject: [PATCH] tc: f_flower: Add missing ct_state flags to usage description
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1957243
Upstream Status: unknown commit 825bd5da
commit 825bd5dacb98597a5595b470bd275bb103a7b9c2
Author: Ariel Levkovich <lariel@nvidia.com>
Date: Fri May 21 20:07:07 2021 +0300
tc: f_flower: Add missing ct_state flags to usage description
Add ct_state flags rpl and inv to the command's usage
description
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
tc/f_flower.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 29db2e23..c5af0276 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -94,7 +94,7 @@ static void explain(void)
" LSE := lse depth DEPTH { label LABEL | tc TC | bos BOS | ttl TTL }\n"
" FILTERID := X:Y:Z\n"
" MASKED_LLADDR := { LLADDR | LLADDR/MASK | LLADDR/BITS }\n"
- " MASKED_CT_STATE := combination of {+|-} and flags trk,est,new,rel\n"
+ " MASKED_CT_STATE := combination of {+|-} and flags trk,est,new,rel,rpl,inv\n"
" ACTION-SPEC := ... look at individual actions\n"
"\n"
"NOTE: CLASSID, IP-PROTO are parsed as hexadecimal input.\n"
--
2.31.1

@@ -0,0 +1,63 @@
From 56a144f7a352d4dbd1e08585e82fad4bd6677b52 Mon Sep 17 00:00:00 2001
Message-Id: <56a144f7a352d4dbd1e08585e82fad4bd6677b52.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: fix parsing issue with more than one value per
option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit c330d097
commit c330d0979440a1dec4a436fd742bb6e28d195526
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:51 2021 +0200
configure: fix parsing issue with more than one value per option
With commit a9c3d70d902a ("configure: add options ability") users are no
longer able to provide wrong command lines like:
$ ./configure --include_dir foo bar
The script simply bails out when the user provides more than one value for a
single option. However, in doing so, it breaks backward compatibility with
some packaging systems, which expect unknown options to be ignored.
Commit a3272b93725a ("configure: restore backward compatibility") fixes this
issue, but makes it possible again for users to provide wrong command lines
such as the one above.
This fixes the issue by simply ignoring autoconf-like options such as
'--opt=value'.
Fixes: a3272b93725a ("configure: restore backward compatibility")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/configure b/configure
index 0f304206..9ec19a5b 100755
--- a/configure
+++ b/configure
@@ -517,10 +517,12 @@ else
shift 2 ;;
-h | --help)
usage 0 ;;
+ --*)
+ shift ;;
"")
break ;;
*)
- shift 1 ;;
+ usage 1 ;;
esac
done
fi
--
2.31.1
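
The effect of the new `case` ordering can be checked with a small classifier. The following is a minimal sketch, assuming a hypothetical function `classify_arg` that only labels a single argument instead of shifting through the real argument list:

```shell
#!/bin/sh
# Hypothetical helper (not part of configure): reports which branch of the
# patched case statement a single argument would reach.
classify_arg() {
    case "${1-}" in
        --include_dir | --libbpf_dir | --libbpf_force)
            echo "known" ;;
        -h | --help)
            echo "help" ;;
        --*)
            echo "ignored" ;;   # autoconf-like options are skipped
        "")
            echo "end" ;;       # no arguments left
        *)
            echo "error" ;;     # stray values now trigger 'usage 1'
    esac
}

classify_arg --prefix=/usr   # prints ignored
classify_arg bar             # prints error
classify_arg ""              # prints end
```

Because the known options are matched before `--*`, they are never swallowed by the catch-all, and a leftover value such as `bar` in `./configure --include_dir foo bar` falls through to the error branch.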

@@ -1,123 +0,0 @@
From 0ccd2dbb3eca44a892a183db8c2e4221488ecf51 Mon Sep 17 00:00:00 2001
Message-Id: <0ccd2dbb3eca44a892a183db8c2e4221488ecf51.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 9 Aug 2021 15:18:11 +0200
Subject: [PATCH] mptcp: add support for port based endpoint
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1984733
Upstream Status: iproute2.git commit 42fbca91
commit 42fbca91cd616ae714c3f6aa2d4e2c3399498e38
Author: Paolo Abeni <pabeni@redhat.com>
Date: Fri Feb 19 21:42:55 2021 +0100
mptcp: add support for port based endpoint
The feature is supported by the kernel since 5.11-net-next,
let's allow user-space to use it.
Just parse and dump an additional, per endpoint, u16 attribute
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
ip/ipmptcp.c | 16 ++++++++++++++--
man/man8/ip-mptcp.8 | 8 ++++++++
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/ip/ipmptcp.c b/ip/ipmptcp.c
index e1ffafb3..5f659b59 100644
--- a/ip/ipmptcp.c
+++ b/ip/ipmptcp.c
@@ -17,7 +17,7 @@ static void usage(void)
{
fprintf(stderr,
"Usage: ip mptcp endpoint add ADDRESS [ dev NAME ] [ id ID ]\n"
- " [ FLAG-LIST ]\n"
+ " [ port NR ] [ FLAG-LIST ]\n"
" ip mptcp endpoint delete id ID\n"
" ip mptcp endpoint show [ id ID ]\n"
" ip mptcp endpoint flush\n"
@@ -97,6 +97,7 @@ static int mptcp_parse_opt(int argc, char **argv, struct nlmsghdr *n,
bool id_set = false;
__u32 index = 0;
__u32 flags = 0;
+ __u16 port = 0;
__u8 id = 0;
ll_init_map(&rth);
@@ -123,6 +124,10 @@ static int mptcp_parse_opt(int argc, char **argv, struct nlmsghdr *n,
if (!index)
invarg("device does not exist\n", ifname);
+ } else if (matches(*argv, "port") == 0) {
+ NEXT_ARG();
+ if (get_u16(&port, *argv, 0))
+ invarg("expected port", *argv);
} else if (get_addr(&address, *argv, AF_UNSPEC) == 0) {
addr_set = true;
} else {
@@ -145,6 +150,8 @@ static int mptcp_parse_opt(int argc, char **argv, struct nlmsghdr *n,
addattr32(n, MPTCP_BUFLEN, MPTCP_PM_ADDR_ATTR_FLAGS, flags);
if (index)
addattr32(n, MPTCP_BUFLEN, MPTCP_PM_ADDR_ATTR_IF_IDX, index);
+ if (port)
+ addattr16(n, MPTCP_BUFLEN, MPTCP_PM_ADDR_ATTR_PORT, port);
if (addr_set) {
int type;
@@ -181,8 +188,8 @@ static int print_mptcp_addrinfo(struct rtattr *addrinfo)
__u8 family = AF_UNSPEC, addr_attr_type;
const char *ifname;
unsigned int flags;
+ __u16 id, port;
int index;
- __u16 id;
parse_rtattr_nested(tb, MPTCP_PM_ADDR_ATTR_MAX, addrinfo);
@@ -196,6 +203,11 @@ static int print_mptcp_addrinfo(struct rtattr *addrinfo)
print_string(PRINT_ANY, "address", "%s ",
format_host_rta(family, tb[addr_attr_type]));
}
+ if (tb[MPTCP_PM_ADDR_ATTR_PORT]) {
+ port = rta_getattr_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]);
+ if (port)
+ print_uint(PRINT_ANY, "port", "port %u ", port);
+ }
if (tb[MPTCP_PM_ADDR_ATTR_ID]) {
id = rta_getattr_u8(tb[MPTCP_PM_ADDR_ATTR_ID]);
print_uint(PRINT_ANY, "id", "id %u ", id);
diff --git a/man/man8/ip-mptcp.8 b/man/man8/ip-mptcp.8
index ef8409ea..98cb93b9 100644
--- a/man/man8/ip-mptcp.8
+++ b/man/man8/ip-mptcp.8
@@ -20,6 +20,8 @@ ip-mptcp \- MPTCP path manager configuration
.ti -8
.BR "ip mptcp endpoint add "
.IR IFADDR
+.RB "[ " port
+.IR PORT " ]"
.RB "[ " dev
.IR IFNAME " ]"
.RB "[ " id
@@ -87,6 +89,12 @@ ip mptcp endpoint flush flush all existing MPTCP endpoints
.TE
.TP
+.IR PORT
+When a port number is specified, incoming MPTCP subflows for already
+established MPTCP sockets will be accepted on the specified port, regardless
+the original listener port accepting the first MPTCP subflow and/or
+this peer being actually on the client side.
+
.IR ID
is a unique numeric identifier for the given endpoint
--
2.31.1

@@ -1,986 +0,0 @@
From 2e5b8fd1e0e8fc4135bd6a162f32df5e624262b1 Mon Sep 17 00:00:00 2001
Message-Id: <2e5b8fd1e0e8fc4135bd6a162f32df5e624262b1.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 11 Aug 2021 12:55:14 +0200
Subject: [PATCH] Update kernel headers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1981393
Upstream Status: iproute2.git commit a5b355c0
commit a5b355c08c62fb5b3a42d0e27ef05571c7b30e2e
Author: David Ahern <dsahern@kernel.org>
Date: Fri Mar 19 14:59:17 2021 +0000
Update kernel headers
Update kernel headers to commit:
38cb57602369 ("selftests: net: forwarding: Fix a typo")
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
include/uapi/linux/bpf.h | 764 ++++++++++++++++++++++++++++++++-
include/uapi/linux/btf.h | 5 +-
include/uapi/linux/nexthop.h | 47 +-
include/uapi/linux/pkt_cls.h | 2 +
include/uapi/linux/rtnetlink.h | 7 +
5 files changed, 818 insertions(+), 7 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b1aba6af..502934f7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -93,7 +93,717 @@ union bpf_iter_link_info {
} map;
};
-/* BPF syscall commands, see bpf(2) man-page for details. */
+/* BPF syscall commands, see bpf(2) man-page for more details. */
+/**
+ * DOC: eBPF Syscall Preamble
+ *
+ * The operation to be performed by the **bpf**\ () system call is determined
+ * by the *cmd* argument. Each operation takes an accompanying argument,
+ * provided via *attr*, which is a pointer to a union of type *bpf_attr* (see
+ * below). The size argument is the size of the union pointed to by *attr*.
+ */
+/**
+ * DOC: eBPF Syscall Commands
+ *
+ * BPF_MAP_CREATE
+ * Description
+ * Create a map and return a file descriptor that refers to the
+ * map. The close-on-exec file descriptor flag (see **fcntl**\ (2))
+ * is automatically enabled for the new file descriptor.
+ *
+ * Applying **close**\ (2) to the file descriptor returned by
+ * **BPF_MAP_CREATE** will delete the map (but see NOTES).
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_MAP_LOOKUP_ELEM
+ * Description
+ * Look up an element with a given *key* in the map referred to
+ * by the file descriptor *map_fd*.
+ *
+ * The *flags* argument may be specified as one of the
+ * following:
+ *
+ * **BPF_F_LOCK**
+ * Look up the value of a spin-locked map without
+ * returning the lock. This must be specified if the
+ * elements contain a spinlock.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_MAP_UPDATE_ELEM
+ * Description
+ * Create or update an element (key/value pair) in a specified map.
+ *
+ * The *flags* argument should be specified as one of the
+ * following:
+ *
+ * **BPF_ANY**
+ * Create a new element or update an existing element.
+ * **BPF_NOEXIST**
+ * Create a new element only if it did not exist.
+ * **BPF_EXIST**
+ * Update an existing element.
+ * **BPF_F_LOCK**
+ * Update a spin_lock-ed map element.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * May set *errno* to **EINVAL**, **EPERM**, **ENOMEM**,
+ * **E2BIG**, **EEXIST**, or **ENOENT**.
+ *
+ * **E2BIG**
+ * The number of elements in the map reached the
+ * *max_entries* limit specified at map creation time.
+ * **EEXIST**
+ * If *flags* specifies **BPF_NOEXIST** and the element
+ * with *key* already exists in the map.
+ * **ENOENT**
+ * If *flags* specifies **BPF_EXIST** and the element with
+ * *key* does not exist in the map.
+ *
+ * BPF_MAP_DELETE_ELEM
+ * Description
+ * Look up and delete an element by key in a specified map.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_MAP_GET_NEXT_KEY
+ * Description
+ * Look up an element by key in a specified map and return the key
+ * of the next element. Can be used to iterate over all elements
+ * in the map.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * The following cases can be used to iterate over all elements of
+ * the map:
+ *
+ * * If *key* is not found, the operation returns zero and sets
+ * the *next_key* pointer to the key of the first element.
+ * * If *key* is found, the operation returns zero and sets the
+ * *next_key* pointer to the key of the next element.
+ * * If *key* is the last element, returns -1 and *errno* is set
+ * to **ENOENT**.
+ *
+ * May set *errno* to **ENOMEM**, **EFAULT**, **EPERM**, or
+ * **EINVAL** on error.
+ *
+ * BPF_PROG_LOAD
+ * Description
+ * Verify and load an eBPF program, returning a new file
+ * descriptor associated with the program.
+ *
+ * Applying **close**\ (2) to the file descriptor returned by
+ * **BPF_PROG_LOAD** will unload the eBPF program (but see NOTES).
+ *
+ * The close-on-exec file descriptor flag (see **fcntl**\ (2)) is
+ * automatically enabled for the new file descriptor.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_OBJ_PIN
+ * Description
+ * Pin an eBPF program or map referred by the specified *bpf_fd*
+ * to the provided *pathname* on the filesystem.
+ *
+ * The *pathname* argument must not contain a dot (".").
+ *
+ * On success, *pathname* retains a reference to the eBPF object,
+ * preventing deallocation of the object when the original
+ * *bpf_fd* is closed. This allow the eBPF object to live beyond
+ * **close**\ (\ *bpf_fd*\ ), and hence the lifetime of the parent
+ * process.
+ *
+ * Applying **unlink**\ (2) or similar calls to the *pathname*
+ * unpins the object from the filesystem, removing the reference.
+ * If no other file descriptors or filesystem nodes refer to the
+ * same object, it will be deallocated (see NOTES).
+ *
+ * The filesystem type for the parent directory of *pathname* must
+ * be **BPF_FS_MAGIC**.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_OBJ_GET
+ * Description
+ * Open a file descriptor for the eBPF object pinned to the
+ * specified *pathname*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_PROG_ATTACH
+ * Description
+ * Attach an eBPF program to a *target_fd* at the specified
+ * *attach_type* hook.
+ *
+ * The *attach_type* specifies the eBPF attachment point to
+ * attach the program to, and must be one of *bpf_attach_type*
+ * (see below).
+ *
+ * The *attach_bpf_fd* must be a valid file descriptor for a
+ * loaded eBPF program of a cgroup, flow dissector, LIRC, sockmap
+ * or sock_ops type corresponding to the specified *attach_type*.
+ *
+ * The *target_fd* must be a valid file descriptor for a kernel
+ * object which depends on the attach type of *attach_bpf_fd*:
+ *
+ * **BPF_PROG_TYPE_CGROUP_DEVICE**,
+ * **BPF_PROG_TYPE_CGROUP_SKB**,
+ * **BPF_PROG_TYPE_CGROUP_SOCK**,
+ * **BPF_PROG_TYPE_CGROUP_SOCK_ADDR**,
+ * **BPF_PROG_TYPE_CGROUP_SOCKOPT**,
+ * **BPF_PROG_TYPE_CGROUP_SYSCTL**,
+ * **BPF_PROG_TYPE_SOCK_OPS**
+ *
+ * Control Group v2 hierarchy with the eBPF controller
+ * enabled. Requires the kernel to be compiled with
+ * **CONFIG_CGROUP_BPF**.
+ *
+ * **BPF_PROG_TYPE_FLOW_DISSECTOR**
+ *
+ * Network namespace (eg /proc/self/ns/net).
+ *
+ * **BPF_PROG_TYPE_LIRC_MODE2**
+ *
+ * LIRC device path (eg /dev/lircN). Requires the kernel
+ * to be compiled with **CONFIG_BPF_LIRC_MODE2**.
+ *
+ * **BPF_PROG_TYPE_SK_SKB**,
+ * **BPF_PROG_TYPE_SK_MSG**
+ *
+ * eBPF map of socket type (eg **BPF_MAP_TYPE_SOCKHASH**).
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_PROG_DETACH
+ * Description
+ * Detach the eBPF program associated with the *target_fd* at the
+ * hook specified by *attach_type*. The program must have been
+ * previously attached using **BPF_PROG_ATTACH**.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_PROG_TEST_RUN
+ * Description
+ * Run the eBPF program associated with the *prog_fd* a *repeat*
+ * number of times against a provided program context *ctx_in* and
+ * data *data_in*, and return the modified program context
+ * *ctx_out*, *data_out* (for example, packet data), result of the
+ * execution *retval*, and *duration* of the test run.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * **ENOSPC**
+ * Either *data_size_out* or *ctx_size_out* is too small.
+ * **ENOTSUPP**
+ * This command is not supported by the program type of
+ * the program referred to by *prog_fd*.
+ *
+ * BPF_PROG_GET_NEXT_ID
+ * Description
+ * Fetch the next eBPF program currently loaded into the kernel.
+ *
+ * Looks for the eBPF program with an id greater than *start_id*
+ * and updates *next_id* on success. If no other eBPF programs
+ * remain with ids higher than *start_id*, returns -1 and sets
+ * *errno* to **ENOENT**.
+ *
+ * Return
+ * Returns zero on success. On error, or when no id remains, -1
+ * is returned and *errno* is set appropriately.
+ *
+ * BPF_MAP_GET_NEXT_ID
+ * Description
+ * Fetch the next eBPF map currently loaded into the kernel.
+ *
+ * Looks for the eBPF map with an id greater than *start_id*
+ * and updates *next_id* on success. If no other eBPF maps
+ * remain with ids higher than *start_id*, returns -1 and sets
+ * *errno* to **ENOENT**.
+ *
+ * Return
+ * Returns zero on success. On error, or when no id remains, -1
+ * is returned and *errno* is set appropriately.
+ *
+ * BPF_PROG_GET_FD_BY_ID
+ * Description
+ * Open a file descriptor for the eBPF program corresponding to
+ * *prog_id*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_MAP_GET_FD_BY_ID
+ * Description
+ * Open a file descriptor for the eBPF map corresponding to
+ * *map_id*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_OBJ_GET_INFO_BY_FD
+ * Description
+ * Obtain information about the eBPF object corresponding to
+ * *bpf_fd*.
+ *
+ * Populates up to *info_len* bytes of *info*, which will be in
+ * one of the following formats depending on the eBPF object type
+ * of *bpf_fd*:
+ *
+ * * **struct bpf_prog_info**
+ * * **struct bpf_map_info**
+ * * **struct bpf_btf_info**
+ * * **struct bpf_link_info**
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_PROG_QUERY
+ * Description
+ * Obtain information about eBPF programs associated with the
+ * specified *attach_type* hook.
+ *
+ * The *target_fd* must be a valid file descriptor for a kernel
+ * object which depends on the attach type of *attach_bpf_fd*:
+ *
+ * **BPF_PROG_TYPE_CGROUP_DEVICE**,
+ * **BPF_PROG_TYPE_CGROUP_SKB**,
+ * **BPF_PROG_TYPE_CGROUP_SOCK**,
+ * **BPF_PROG_TYPE_CGROUP_SOCK_ADDR**,
+ * **BPF_PROG_TYPE_CGROUP_SOCKOPT**,
+ * **BPF_PROG_TYPE_CGROUP_SYSCTL**,
+ * **BPF_PROG_TYPE_SOCK_OPS**
+ *
+ * Control Group v2 hierarchy with the eBPF controller
+ * enabled. Requires the kernel to be compiled with
+ * **CONFIG_CGROUP_BPF**.
+ *
+ * **BPF_PROG_TYPE_FLOW_DISSECTOR**
+ *
+ * Network namespace (eg /proc/self/ns/net).
+ *
+ * **BPF_PROG_TYPE_LIRC_MODE2**
+ *
+ * LIRC device path (eg /dev/lircN). Requires the kernel
+ * to be compiled with **CONFIG_BPF_LIRC_MODE2**.
+ *
+ * **BPF_PROG_QUERY** always fetches the number of programs
+ * attached and the *attach_flags* which were used to attach those
+ * programs. Additionally, if *prog_ids* is nonzero and the number
+ * of attached programs is less than *prog_cnt*, populates
+ * *prog_ids* with the eBPF program ids of the programs attached
+ * at *target_fd*.
+ *
+ * The following flags may alter the result:
+ *
+ * **BPF_F_QUERY_EFFECTIVE**
+ * Only return information regarding programs which are
+ * currently effective at the specified *target_fd*.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_RAW_TRACEPOINT_OPEN
+ * Description
+ * Attach an eBPF program to a tracepoint *name* to access kernel
+ * internal arguments of the tracepoint in their raw form.
+ *
+ * The *prog_fd* must be a valid file descriptor associated with
+ * a loaded eBPF program of type **BPF_PROG_TYPE_RAW_TRACEPOINT**.
+ *
+ * No ABI guarantees are made about the content of tracepoint
+ * arguments exposed to the corresponding eBPF program.
+ *
+ * Applying **close**\ (2) to the file descriptor returned by
+ * **BPF_RAW_TRACEPOINT_OPEN** will delete the map (but see NOTES).
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_BTF_LOAD
+ * Description
+ * Verify and load BPF Type Format (BTF) metadata into the kernel,
+ * returning a new file descriptor associated with the metadata.
+ * BTF is described in more detail at
+ * https://www.kernel.org/doc/html/latest/bpf/btf.html.
+ *
+ * The *btf* parameter must point to valid memory providing
+ * *btf_size* bytes of BTF binary metadata.
+ *
+ * The returned file descriptor can be passed to other **bpf**\ ()
+ * subcommands such as **BPF_PROG_LOAD** or **BPF_MAP_CREATE** to
+ * associate the BTF with those objects.
+ *
+ * Similar to **BPF_PROG_LOAD**, **BPF_BTF_LOAD** has optional
+ * parameters to specify a *btf_log_buf*, *btf_log_size* and
+ * *btf_log_level* which allow the kernel to return freeform log
+ * output regarding the BTF verification process.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_BTF_GET_FD_BY_ID
+ * Description
+ * Open a file descriptor for the BPF Type Format (BTF)
+ * corresponding to *btf_id*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_TASK_FD_QUERY
+ * Description
+ * Obtain information about eBPF programs associated with the
+ * target process identified by *pid* and *fd*.
+ *
+ * If the *pid* and *fd* are associated with a tracepoint, kprobe
+ * or uprobe perf event, then the *prog_id* and *fd_type* will
+ * be populated with the eBPF program id and file descriptor type
+ * of type **bpf_task_fd_type**. If associated with a kprobe or
+ * uprobe, the *probe_offset* and *probe_addr* will also be
+ * populated. Optionally, if *buf* is provided, then up to
+ * *buf_len* bytes of *buf* will be populated with the name of
+ * the tracepoint, kprobe or uprobe.
+ *
+ * The resulting *prog_id* may be introspected in deeper detail
+ * using **BPF_PROG_GET_FD_BY_ID** and **BPF_OBJ_GET_INFO_BY_FD**.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_MAP_LOOKUP_AND_DELETE_ELEM
+ * Description
+ * Look up an element with the given *key* in the map referred to
+ * by the file descriptor *fd*, and if found, delete the element.
+ *
+ * The **BPF_MAP_TYPE_QUEUE** and **BPF_MAP_TYPE_STACK** map types
+ * implement this command as a "pop" operation, deleting the top
+ * element rather than one corresponding to *key*.
+ * The *key* and *key_len* parameters should be zeroed when
+ * issuing this operation for these map types.
+ *
+ * This command is only valid for the following map types:
+ * * **BPF_MAP_TYPE_QUEUE**
+ * * **BPF_MAP_TYPE_STACK**
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_MAP_FREEZE
+ * Description
+ * Freeze the permissions of the specified map.
+ *
+ * Write permissions may be frozen by passing zero *flags*.
+ * Upon success, no future syscall invocations may alter the
+ * map state of *map_fd*. Write operations from eBPF programs
+ * are still possible for a frozen map.
+ *
+ * Not supported for maps of type **BPF_MAP_TYPE_STRUCT_OPS**.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_BTF_GET_NEXT_ID
+ * Description
+ * Fetch the next BPF Type Format (BTF) object currently loaded
+ * into the kernel.
+ *
+ * Looks for the BTF object with an id greater than *start_id*
+ * and updates *next_id* on success. If no other BTF objects
+ * remain with ids higher than *start_id*, returns -1 and sets
+ * *errno* to **ENOENT**.
+ *
+ * Return
+ * Returns zero on success. On error, or when no id remains, -1
+ * is returned and *errno* is set appropriately.
+ *
+ * BPF_MAP_LOOKUP_BATCH
+ * Description
+ * Iterate and fetch multiple elements in a map.
+ *
+ * Two opaque values are used to manage batch operations,
+ * *in_batch* and *out_batch*. Initially, *in_batch* must be set
+ * to NULL to begin the batched operation. After each subsequent
+ * **BPF_MAP_LOOKUP_BATCH**, the caller should pass the resultant
+ * *out_batch* as the *in_batch* for the next operation to
+ * continue iteration from the current point.
+ *
+ * The *keys* and *values* are output parameters which must point
+ * to memory large enough to hold *count* items based on the key
+ * and value size of the map *map_fd*. The *keys* buffer must be
+ * of *key_size* * *count*. The *values* buffer must be of
+ * *value_size* * *count*.
+ *
+ * The *elem_flags* argument may be specified as one of the
+ * following:
+ *
+ * **BPF_F_LOCK**
+ * Look up the value of a spin-locked map without
+ * returning the lock. This must be specified if the
+ * elements contain a spinlock.
+ *
+ * On success, *count* elements from the map are copied into the
+ * user buffer, with the keys copied into *keys* and the values
+ * copied into the corresponding indices in *values*.
+ *
+ * If an error is returned and *errno* is not **EFAULT**, *count*
+ * is set to the number of successfully processed elements.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * May set *errno* to **ENOSPC** to indicate that *keys* or
+ * *values* is too small to dump an entire bucket during
+ * iteration of a hash-based map type.
+ *
+ * BPF_MAP_LOOKUP_AND_DELETE_BATCH
+ * Description
+ * Iterate and delete all elements in a map.
+ *
+ * This operation has the same behavior as
+ * **BPF_MAP_LOOKUP_BATCH** with two exceptions:
+ *
+ * * Every element that is successfully returned is also deleted
+ * from the map. This is at least *count* elements. Note that
+ * *count* is both an input and an output parameter.
+ * * Upon returning with *errno* set to **EFAULT**, up to
+ * *count* elements may be deleted without returning the keys
+ * and values of the deleted elements.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_MAP_UPDATE_BATCH
+ * Description
+ * Update multiple elements in a map by *key*.
+ *
+ * The *keys* and *values* are input parameters which must point
+ * to memory large enough to hold *count* items based on the key
+ * and value size of the map *map_fd*. The *keys* buffer must be
+ * of *key_size* * *count*. The *values* buffer must be of
+ * *value_size* * *count*.
+ *
+ * Each element specified in *keys* is sequentially updated to the
+ * value in the corresponding index in *values*. The *in_batch*
+ * and *out_batch* parameters are ignored and should be zeroed.
+ *
+ * The *elem_flags* argument should be specified as one of the
+ * following:
+ *
+ * **BPF_ANY**
+ * Create new elements or update existing elements.
+ * **BPF_NOEXIST**
+ * Create new elements only if they do not exist.
+ * **BPF_EXIST**
+ * Update existing elements.
+ * **BPF_F_LOCK**
+ * Update spin_lock-ed map elements. This must be
+ * specified if the map value contains a spinlock.
+ *
+ * On success, *count* elements from the map are updated.
+ *
+ * If an error is returned and *errno* is not **EFAULT**, *count*
+ * is set to the number of successfully processed elements.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * May set *errno* to **EINVAL**, **EPERM**, **ENOMEM**, or
+ * **E2BIG**. **E2BIG** indicates that the number of elements in
+ * the map reached the *max_entries* limit specified at map
+ * creation time.
+ *
+ * May set *errno* to one of the following error codes under
+ * specific circumstances:
+ *
+ * **EEXIST**
+ * If *flags* specifies **BPF_NOEXIST** and the element
+ * with *key* already exists in the map.
+ * **ENOENT**
+ * If *flags* specifies **BPF_EXIST** and the element with
+ * *key* does not exist in the map.
+ *
+ * BPF_MAP_DELETE_BATCH
+ * Description
+ * Delete multiple elements in a map by *key*.
+ *
+ * The *keys* parameter is an input parameter which must point
+ * to memory large enough to hold *count* items based on the key
+ * size of the map *map_fd*, that is, *key_size* * *count*.
+ *
+ * Each element specified in *keys* is sequentially deleted. The
+ * *in_batch*, *out_batch*, and *values* parameters are ignored
+ * and should be zeroed.
+ *
+ * The *elem_flags* argument may be specified as one of the
+ * following:
+ *
+ * **BPF_F_LOCK**
+ * Look up the value of a spin-locked map without
+ * returning the lock. This must be specified if the
+ * elements contain a spinlock.
+ *
+ * On success, *count* elements from the map are deleted.
+ *
+ * If an error is returned and *errno* is not **EFAULT**, *count*
+ * is set to the number of successfully processed elements. If
+ * *errno* is **EFAULT**, up to *count* elements may have been
+ * deleted.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_LINK_CREATE
+ * Description
+ * Attach an eBPF program to a *target_fd* at the specified
+ * *attach_type* hook and return a file descriptor handle for
+ * managing the link.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_LINK_UPDATE
+ * Description
+ * Update the eBPF program in the specified *link_fd* to
+ * *new_prog_fd*.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_LINK_GET_FD_BY_ID
+ * Description
+ * Open a file descriptor for the eBPF Link corresponding to
+ * *link_id*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_LINK_GET_NEXT_ID
+ * Description
+ * Fetch the next eBPF link currently loaded into the kernel.
+ *
+ * Looks for the eBPF link with an id greater than *start_id*
+ * and updates *next_id* on success. If no other eBPF links
+ * remain with ids higher than *start_id*, returns -1 and sets
+ * *errno* to **ENOENT**.
+ *
+ * Return
+ * Returns zero on success. On error, or when no id remains, -1
+ * is returned and *errno* is set appropriately.
+ *
+ * BPF_ENABLE_STATS
+ * Description
+ * Enable eBPF runtime statistics gathering.
+ *
+ * Runtime statistics gathering for the eBPF runtime is disabled
+ * by default to minimize the corresponding performance overhead.
+ * This command enables statistics globally.
+ *
+ * Multiple programs may independently enable statistics.
+ * After gathering the desired statistics, eBPF runtime statistics
+ * may be disabled again by calling **close**\ (2) for the file
+ * descriptor returned by this function. Statistics will only be
+ * disabled system-wide when all outstanding file descriptors
+ * returned by prior calls for this subcommand are closed.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_ITER_CREATE
+ * Description
+ * Create an iterator on top of the specified *link_fd* (as
+ * previously created using **BPF_LINK_CREATE**) and return a
+ * file descriptor that can be used to trigger the iteration.
+ *
+ * If the resulting file descriptor is pinned to the filesystem
+ * using **BPF_OBJ_PIN**, then subsequent **read**\ (2) syscalls
+ * for that path will trigger the iterator to read kernel state
+ * using the eBPF program attached to *link_fd*.
+ *
+ * Return
+ * A new file descriptor (a nonnegative integer), or -1 if an
+ * error occurred (in which case, *errno* is set appropriately).
+ *
+ * BPF_LINK_DETACH
+ * Description
+ * Forcefully detach the specified *link_fd* from its
+ * corresponding attachment point.
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * BPF_PROG_BIND_MAP
+ * Description
+ * Bind a map to the lifetime of an eBPF program.
+ *
+ * The map identified by *map_fd* is bound to the program
+ * identified by *prog_fd* and only released when *prog_fd* is
+ * released. This may be used in cases where metadata should be
+ * associated with a program which otherwise does not contain any
+ * references to the map (for example, embedded in the eBPF
+ * program instructions).
+ *
+ * Return
+ * Returns zero on success. On error, -1 is returned and *errno*
+ * is set appropriately.
+ *
+ * NOTES
+ * eBPF objects (maps and programs) can be shared between processes.
+ *
+ * * After **fork**\ (2), the child inherits file descriptors
+ * referring to the same eBPF objects.
+ * * File descriptors referring to eBPF objects can be transferred over
+ * **unix**\ (7) domain sockets.
+ * * File descriptors referring to eBPF objects can be duplicated in the
+ * usual way, using **dup**\ (2) and similar calls.
+ * * File descriptors referring to eBPF objects can be pinned to the
+ * filesystem using the **BPF_OBJ_PIN** command of **bpf**\ (2).
+ *
+ * An eBPF object is deallocated only after all file descriptors referring
+ * to the object have been closed and no references remain pinned to the
+ * filesystem or attached (for example, bound to a program or device).
+ */
enum bpf_cmd {
BPF_MAP_CREATE,
BPF_MAP_LOOKUP_ELEM,
@@ -393,6 +1103,15 @@ enum bpf_link_type {
* is struct/union.
*/
#define BPF_PSEUDO_BTF_ID 3
+/* insn[0].src_reg: BPF_PSEUDO_FUNC
+ * insn[0].imm: insn offset to the func
+ * insn[1].imm: 0
+ * insn[0].off: 0
+ * insn[1].off: 0
+ * ldimm64 rewrite: address of the function
+ * verifier type: PTR_TO_FUNC.
+ */
+#define BPF_PSEUDO_FUNC 4
/* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative
* offset to another bpf function
@@ -720,7 +1439,7 @@ union bpf_attr {
* parsed and used to produce a manual page. The workflow is the following,
* and requires the rst2man utility:
*
- * $ ./scripts/bpf_helpers_doc.py \
+ * $ ./scripts/bpf_doc.py \
* --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
* $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
* $ man /tmp/bpf-helpers.7
@@ -1765,6 +2484,10 @@ union bpf_attr {
* Use with ENCAP_L3/L4 flags to further specify the tunnel
* type; *len* is the length of the inner MAC header.
*
+ * * **BPF_F_ADJ_ROOM_ENCAP_L2_ETH**:
+ * Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
+ * L2 type as Ethernet.
+ *
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
* previously done by the verifier are invalidated and must be
@@ -3850,7 +4573,7 @@ union bpf_attr {
*
* long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff, u64 flags)
* Description
- * Check packet size against exceeding MTU of net device (based
+ * Check ctx packet size against exceeding MTU of net device (based
* on *ifindex*). This helper will likely be used in combination
* with helpers that adjust/change the packet size.
*
@@ -3915,6 +4638,34 @@ union bpf_attr {
* * **BPF_MTU_CHK_RET_FRAG_NEEDED**
* * **BPF_MTU_CHK_RET_SEGS_TOOBIG**
*
+ * long bpf_for_each_map_elem(struct bpf_map *map, void *callback_fn, void *callback_ctx, u64 flags)
+ * Description
+ * For each element in **map**, call **callback_fn** function with
+ * **map**, **callback_ctx** and other map-specific parameters.
+ * The **callback_fn** should be a static function and
+ * the **callback_ctx** should be a pointer to the stack.
+ * The **flags** is used to control certain aspects of the helper.
+ * Currently, the **flags** must be 0.
+ *
+ * The following is a list of supported map types and their
+ * respective expected callback signatures:
+ *
+ * BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH,
+ * BPF_MAP_TYPE_LRU_HASH, BPF_MAP_TYPE_LRU_PERCPU_HASH,
+ * BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY
+ *
+ * long (\*callback_fn)(struct bpf_map \*map, const void \*key, void \*value, void \*ctx);
+ *
+ * For per_cpu maps, the map_value is the value on the cpu where the
+ * bpf_prog is running.
+ *
+ * If **callback_fn** returns 0, the helper will continue to the next
+ * element. If the return value is 1, the helper will skip the rest of
+ * the elements and return. Other return values are not used now.
+ *
+ * Return
+ * The number of traversed map elements for success, **-EINVAL** for
+ * invalid **flags**.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -4081,6 +4832,7 @@ union bpf_attr {
FN(ima_inode_hash), \
FN(sock_from_file), \
FN(check_mtu), \
+ FN(for_each_map_elem), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@@ -4174,6 +4926,7 @@ enum {
BPF_F_ADJ_ROOM_ENCAP_L4_GRE = (1ULL << 3),
BPF_F_ADJ_ROOM_ENCAP_L4_UDP = (1ULL << 4),
BPF_F_ADJ_ROOM_NO_CSUM_RESET = (1ULL << 5),
+ BPF_F_ADJ_ROOM_ENCAP_L2_ETH = (1ULL << 6),
};
enum {
@@ -5211,7 +5964,10 @@ struct bpf_pidns_info {
/* User accessible data for SK_LOOKUP programs. Add new fields at the end. */
struct bpf_sk_lookup {
- __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */
+ union {
+ __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */
+ __u64 cookie; /* Non-zero if socket was selected in PROG_TEST_RUN */
+ };
__u32 family; /* Protocol family (AF_INET, AF_INET6) */
__u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index 4a42eb48..2c42dcac 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -52,7 +52,7 @@ struct btf_type {
};
};
-#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f)
+#define BTF_INFO_KIND(info) (((info) >> 24) & 0x1f)
#define BTF_INFO_VLEN(info) ((info) & 0xffff)
#define BTF_INFO_KFLAG(info) ((info) >> 31)
@@ -72,7 +72,8 @@ struct btf_type {
#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
#define BTF_KIND_VAR 14 /* Variable */
#define BTF_KIND_DATASEC 15 /* Section */
-#define BTF_KIND_MAX BTF_KIND_DATASEC
+#define BTF_KIND_FLOAT 16 /* Floating point */
+#define BTF_KIND_MAX BTF_KIND_FLOAT
#define NR_BTF_KINDS (BTF_KIND_MAX + 1)
/* For some specific BTF_KIND, "struct btf_type" is immediately
diff --git a/include/uapi/linux/nexthop.h b/include/uapi/linux/nexthop.h
index b0a56139..37b14b4e 100644
--- a/include/uapi/linux/nexthop.h
+++ b/include/uapi/linux/nexthop.h
@@ -21,7 +21,10 @@ struct nexthop_grp {
};
enum {
- NEXTHOP_GRP_TYPE_MPATH, /* default type if not specified */
+ NEXTHOP_GRP_TYPE_MPATH, /* hash-threshold nexthop group
+ * default type if not specified
+ */
+ NEXTHOP_GRP_TYPE_RES, /* resilient nexthop group */
__NEXTHOP_GRP_TYPE_MAX,
};
@@ -52,8 +55,50 @@ enum {
NHA_FDB, /* flag; nexthop belongs to a bridge fdb */
/* if NHA_FDB is added, OIF, BLACKHOLE, ENCAP cannot be set */
+ /* nested; resilient nexthop group attributes */
+ NHA_RES_GROUP,
+ /* nested; nexthop bucket attributes */
+ NHA_RES_BUCKET,
+
__NHA_MAX,
};
#define NHA_MAX (__NHA_MAX - 1)
+
+enum {
+ NHA_RES_GROUP_UNSPEC,
+ /* Pad attribute for 64-bit alignment. */
+ NHA_RES_GROUP_PAD = NHA_RES_GROUP_UNSPEC,
+
+ /* u16; number of nexthop buckets in a resilient nexthop group */
+ NHA_RES_GROUP_BUCKETS,
+ /* clock_t as u32; nexthop bucket idle timer (per-group) */
+ NHA_RES_GROUP_IDLE_TIMER,
+ /* clock_t as u32; nexthop unbalanced timer */
+ NHA_RES_GROUP_UNBALANCED_TIMER,
+ /* clock_t as u64; nexthop unbalanced time */
+ NHA_RES_GROUP_UNBALANCED_TIME,
+
+ __NHA_RES_GROUP_MAX,
+};
+
+#define NHA_RES_GROUP_MAX (__NHA_RES_GROUP_MAX - 1)
+
+enum {
+ NHA_RES_BUCKET_UNSPEC,
+ /* Pad attribute for 64-bit alignment. */
+ NHA_RES_BUCKET_PAD = NHA_RES_BUCKET_UNSPEC,
+
+ /* u16; nexthop bucket index */
+ NHA_RES_BUCKET_INDEX,
+ /* clock_t as u64; nexthop bucket idle time */
+ NHA_RES_BUCKET_IDLE_TIME,
+ /* u32; nexthop id assigned to the nexthop bucket */
+ NHA_RES_BUCKET_NH_ID,
+
+ __NHA_RES_BUCKET_MAX,
+};
+
+#define NHA_RES_BUCKET_MAX (__NHA_RES_BUCKET_MAX - 1)
+
#endif
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 7ea59cfe..025c40fe 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -190,6 +190,8 @@ enum {
TCA_POLICE_PAD,
TCA_POLICE_RATE64,
TCA_POLICE_PEAKRATE64,
+ TCA_POLICE_PKTRATE64,
+ TCA_POLICE_PKTBURST64,
__TCA_POLICE_MAX
#define TCA_POLICE_RESULT TCA_POLICE_RESULT
};
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index b34b9add..f62cccc1 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -178,6 +178,13 @@ enum {
RTM_GETVLAN,
#define RTM_GETVLAN RTM_GETVLAN
+ RTM_NEWNEXTHOPBUCKET = 116,
+#define RTM_NEWNEXTHOPBUCKET RTM_NEWNEXTHOPBUCKET
+ RTM_DELNEXTHOPBUCKET,
+#define RTM_DELNEXTHOPBUCKET RTM_DELNEXTHOPBUCKET
+ RTM_GETNEXTHOPBUCKET,
+#define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+
__RTM_MAX,
#define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1)
};
--
2.31.1
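
The btf.h hunk above widens the BTF_INFO_KIND mask from 0x0f to 0x1f in the same change that introduces BTF_KIND_FLOAT with value 16. A minimal sketch of the arithmetic, assuming only what the diff shows (the kind field is read after a right shift by 24 bits); btf_info_kind is a hypothetical helper name used for illustration, not part of any header:

```shell
#!/bin/sh
# Hypothetical helper: extract the kind field from a BTF "info" word.
#   $1 = info word, $2 = mask applied after shifting right by 24 bits
btf_info_kind() {
    echo $(( ($1 >> 24) & $2 ))
}

# BTF_KIND_FLOAT is 16, so its info word carries 16 in bits 24 and up.
info=$(( 16 << 24 ))

btf_info_kind "$info" 0x0f   # old 4-bit mask: 16 does not fit
btf_info_kind "$info" 0x1f   # new 5-bit mask: the kind survives
```

With the 4-bit mask the value 16 collapses to 0, which is why a kind above 15 appears to require the wider mask.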

@ -0,0 +1,114 @@
From 1b4bdce40f9244823c464f2a36a0db7cd6ba427b Mon Sep 17 00:00:00 2001
Message-Id: <1b4bdce40f9244823c464f2a36a0db7cd6ba427b.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: simplify options parsing
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit 99245d17
commit 99245d1741a85e4397973782578d4a78673eb348
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:52 2021 +0200
configure: simplify options parsing
This commit simplifies options parsing moving all the code not related to
parsing out of the case statement.
- The conditional shift after the assignments is moved right after the
case, reducing code duplication.
- The semantic check on the LIBBPF_FORCE value is moved after the loop
like we already did for INCLUDE and LIBBPF_DIR.
- Finally, the loop condition is changed to check remaining arguments, thus
making it possible to get rid of the null string case break.
As a bonus, now the help message states that on or off should follow
--libbpf_force
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 37 ++++++++++++++++++-------------------
1 file changed, 18 insertions(+), 19 deletions(-)
diff --git a/configure b/configure
index 9ec19a5b..26e06eb8 100755
--- a/configure
+++ b/configure
@@ -485,12 +485,12 @@ usage()
{
cat <<EOF
Usage: $0 [OPTIONS]
- --include_dir <dir> Path to iproute2 include dir
- --libbpf_dir <dir> Path to libbpf DESTDIR
- --libbpf_force Enable/disable libbpf by force. Available options:
- on: require link against libbpf, quit config if no libbpf support
- off: disable libbpf probing
- -h | --help Show this usage info
+ --include_dir <dir> Path to iproute2 include dir
+ --libbpf_dir <dir> Path to libbpf DESTDIR
+ --libbpf_force <on|off> Enable/disable libbpf by force. Available options:
+ on: require link against libbpf, quit config if no libbpf support
+ off: disable libbpf probing
+ -h | --help Show this usage info
EOF
exit $1
}
@@ -499,31 +499,25 @@ EOF
if [ $# -eq 1 ] && [ "$(echo $1 | cut -c 1)" != '-' ]; then
INCLUDE="$1"
else
- while true; do
+ while [ "$#" -gt 0 ]; do
case "$1" in
--include_dir)
shift
- INCLUDE="$1"
- [ "$#" -gt 0 ] && shift ;;
+ INCLUDE="$1" ;;
--libbpf_dir)
shift
- LIBBPF_DIR="$1"
- [ "$#" -gt 0 ] && shift ;;
+ LIBBPF_DIR="$1" ;;
--libbpf_force)
- if [ "$2" != 'on' ] && [ "$2" != 'off' ]; then
- usage 1
- fi
- LIBBPF_FORCE=$2
- shift 2 ;;
+ shift
+ LIBBPF_FORCE="$1" ;;
-h | --help)
usage 0 ;;
--*)
- shift ;;
- "")
- break ;;
+ ;;
*)
usage 1 ;;
esac
+ [ "$#" -gt 0 ] && shift
done
fi
@@ -531,6 +525,11 @@ fi
if [ "${LIBBPF_DIR-unused}" != "unused" ]; then
[ -d "$LIBBPF_DIR" ] || usage 1
fi
+if [ "${LIBBPF_FORCE-unused}" != "unused" ]; then
+ if [ "$LIBBPF_FORCE" != 'on' ] && [ "$LIBBPF_FORCE" != 'off' ]; then
+ usage 1
+ fi
+fi
echo "# Generated config based on" $INCLUDE >$CONFIG
quiet_config >> $CONFIG
--
2.31.1
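
The loop shape introduced by the patch above can be sketched in isolation. This is a reduced illustration, not the configure script itself: parse_opts is a hypothetical function that handles only --include_dir, and it returns 1 where configure would call "usage 1".

```shell
#!/bin/sh
# Hypothetical stand-in for the option loop in configure.
parse_opts() {
    INCLUDE="default"
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --include_dir)
                shift
                INCLUDE="$1" ;;
            --*)
                ;;
            *)
                return 1 ;;
        esac
        # Single conditional shift shared by every branch. It is skipped
        # when an option's value is missing, so the loop still terminates.
        [ "$#" -gt 0 ] && shift
    done
    echo "INCLUDE=$INCLUDE"
}

parse_opts --include_dir /tmp/inc
parse_opts --include_dir
```

The second call is the case that previously looped forever: the value is simply empty here, and the directory check that configure performs after the loop is what would reject it.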

@ -0,0 +1,53 @@
From fd03755c5b59a7c197dc9089494c08780f1669a7 Mon Sep 17 00:00:00 2001
Message-Id: <fd03755c5b59a7c197dc9089494c08780f1669a7.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: support --param=value style
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit 4b8bca5f
commit 4b8bca5f9e3e6f210b1036166dc98801e76d8ee5
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:53 2021 +0200
configure: support --param=value style
This commit makes it possible to specify values for configure params
using the common autotools configure syntax '--param=value'.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/configure b/configure
index 26e06eb8..9a2645d9 100755
--- a/configure
+++ b/configure
@@ -504,12 +504,18 @@ else
--include_dir)
shift
INCLUDE="$1" ;;
+ --include_dir=*)
+ INCLUDE="${1#*=}" ;;
--libbpf_dir)
shift
LIBBPF_DIR="$1" ;;
+ --libbpf_dir=*)
+ LIBBPF_DIR="${1#*=}" ;;
--libbpf_force)
shift
LIBBPF_FORCE="$1" ;;
+ --libbpf_force=*)
+ LIBBPF_FORCE="${1#*=}" ;;
-h | --help)
usage 0 ;;
--*)
--
2.31.1
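
The '--param=value' branches above rely on the "${1#*=}" parameter expansion. A small sketch of what that expansion does, using a hypothetical helper named opt_value that is not part of configure:

```shell
#!/bin/sh
# Hypothetical helper: print the value part of a --param=value argument.
opt_value() {
    # "#*=" strips the shortest prefix matching "*=", that is, everything
    # up to and including the first '='. Later '=' characters stay in the
    # value.
    printf '%s\n' "${1#*=}"
}

opt_value --include_dir=/usr/include
opt_value --libbpf_dir=/opt/a=b
```

Because the single "#" form removes the shortest match, a value that itself contains '=' is passed through intact; the "##" form would strip up to the last '=' instead.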

@ -1,221 +0,0 @@
From b061aeba93b1c730b7dafeece6b90aad2e7afce8 Mon Sep 17 00:00:00 2001
Message-Id: <b061aeba93b1c730b7dafeece6b90aad2e7afce8.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 11 Aug 2021 12:55:14 +0200
Subject: [PATCH] police: add support for packet-per-second rate limiting
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1981393
Upstream Status: iproute2.git commit cf9ae1bd
commit cf9ae1bd31187d8ae62bc1bb408e443dbc8bd6a0
Author: Baowen Zheng <baowen.zheng@corigine.com>
Date: Fri Mar 26 13:50:18 2021 +0100
police: add support for packet-per-second rate limiting
Allow a policer action to enforce a rate-limit based on packets-per-second,
configurable using a packet-per-second rate and burst parameters.
e.g.
# $TC actions add action police pkts_rate 1000 pkts_burst 200 index 1
# $TC actions ls action police
total acts 1
action order 0: police 0x1 rate 0bit burst 0b mtu 4096Mb pkts_rate 1000 pkts_burst 200
ref 1 bind 0
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
man/man8/tc-police.8 | 35 ++++++++++++++++++++++++-------
tc/m_police.c | 50 +++++++++++++++++++++++++++++++++++++++++---
2 files changed, 75 insertions(+), 10 deletions(-)
diff --git a/man/man8/tc-police.8 b/man/man8/tc-police.8
index 52279755..86e263bb 100644
--- a/man/man8/tc-police.8
+++ b/man/man8/tc-police.8
@@ -5,9 +5,11 @@ police - policing action
.SH SYNOPSIS
.in +8
.ti -8
-.BR tc " ... " "action police"
+.BR tc " ... " "action police ["
.BI rate " RATE " burst
-.IR BYTES [\fB/ BYTES "] ["
+.IR BYTES [\fB/ BYTES "] ] ["
+.BI pkts_rate " RATE " pkts_burst
+.IR PACKETS "] ["
.B mtu
.IR BYTES [\fB/ BYTES "] ] ["
.BI peakrate " RATE"
@@ -34,19 +36,29 @@ police - policing action
.SH DESCRIPTION
The
.B police
-action allows to limit bandwidth of traffic matched by the filter it is
-attached to. Basically there are two different algorithms available to measure
-the packet rate: The first one uses an internal dual token bucket and is
-configured using the
+action allows limiting of the byte or packet rate of traffic matched by the
+filter it is attached to.
+.P
+There are two different algorithms available to measure the byte rate: The
+first one uses an internal dual token bucket and is configured using the
.BR rate ", " burst ", " mtu ", " peakrate ", " overhead " and " linklayer
parameters. The second one uses an in-kernel sampling mechanism. It can be
fine-tuned using the
.B estimator
filter parameter.
+.P
+There is one algorithm available to measure packet rate and it is similar to
+the first algorithm described for byte rate. It is configured using the
+.BR pkts_rate " and " pkts_burst
+parameters.
+.P
+At least one of the
+.BR rate " and " pkts_rate
+parameters must be configured.
.SH OPTIONS
.TP
.BI rate " RATE"
-The maximum traffic rate of packets passing this action. Those exceeding it will
+The maximum byte rate of packets passing this action. Those exceeding it will
be treated as defined by the
.B conform-exceed
option.
@@ -55,6 +67,15 @@ option.
Set the maximum allowed burst in bytes, optionally followed by a slash ('/')
sign and cell size which must be a power of 2.
.TP
+.BI pkts_rate " RATE"
+The maximum packet rate of packets passing this action. Those exceeding it will
+be treated as defined by the
+.B conform-exceed
+option.
+.TP
+.BI pkts_burst " PACKETS"
+Set the maximum allowed burst in packets.
+.TP
.BI mtu " BYTES\fR[\fB/\fIBYTES\fR]"
This is the maximum packet size handled by the policer (larger ones will be
handled like they exceeded the configured rate). Setting this value correctly
diff --git a/tc/m_police.c b/tc/m_police.c
index bb51df68..9ef0e40b 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -38,7 +38,8 @@ struct action_util police_action_util = {
static void usage(void)
{
fprintf(stderr,
- "Usage: ... police rate BPS burst BYTES[/BYTES] [ mtu BYTES[/BYTES] ]\n"
+ "Usage: ... police [ rate BPS burst BYTES[/BYTES] ] \n"
+ " [ pkts_rate RATE pkts_burst PACKETS ] [ mtu BYTES[/BYTES] ]\n"
" [ peakrate BPS ] [ avrate BPS ] [ overhead BYTES ]\n"
" [ linklayer TYPE ] [ CONTROL ]\n"
"Where: CONTROL := conform-exceed <EXCEEDACT>[/NOTEXCEEDACT]\n"
@@ -67,6 +68,7 @@ static int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p,
int Rcell_log = -1, Pcell_log = -1;
struct rtattr *tail;
__u64 rate64 = 0, prate64 = 0;
+ __u64 pps64 = 0, ppsburst64 = 0;
if (a) /* new way of doing things */
NEXT_ARG();
@@ -144,6 +146,18 @@ static int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p,
NEXT_ARG();
if (get_linklayer(&linklayer, *argv))
invarg("linklayer", *argv);
+ } else if (matches(*argv, "pkts_rate") == 0) {
+ NEXT_ARG();
+ if (pps64)
+ duparg("pkts_rate", *argv);
+ if (get_u64(&pps64, *argv, 10))
+ invarg("pkts_rate", *argv);
+ } else if (matches(*argv, "pkts_burst") == 0) {
+ NEXT_ARG();
+ if (ppsburst64)
+ duparg("pkts_burst", *argv);
+ if (get_u64(&ppsburst64, *argv, 10))
+ invarg("pkts_burst", *argv);
} else if (strcmp(*argv, "help") == 0) {
usage();
} else {
@@ -161,8 +175,8 @@ action_ctrl_ok:
return -1;
/* Must at least do late binding, use TB or ewma policing */
- if (!rate64 && !avrate && !p.index && !mtu) {
- fprintf(stderr, "'rate' or 'avrate' or 'mtu' MUST be specified.\n");
+ if (!rate64 && !avrate && !p.index && !mtu && !pps64) {
+ fprintf(stderr, "'rate' or 'avrate' or 'mtu' or 'pkts_rate' MUST be specified.\n");
return -1;
}
@@ -172,6 +186,18 @@ action_ctrl_ok:
return -1;
}
+ /* When the packets TB policer is used, pkts_burst is required */
+ if (pps64 && !ppsburst64) {
+ fprintf(stderr, "'pkts_burst' requires 'pkts_rate'.\n");
+ return -1;
+ }
+
+ /* forbid rate and pkts_rate in same action */
+ if (pps64 && rate64) {
+ fprintf(stderr, "'rate' and 'pkts_rate' are not allowed in same action.\n");
+ return -1;
+ }
+
if (prate64) {
if (!rate64) {
fprintf(stderr, "'peakrate' requires 'rate'.\n");
@@ -223,6 +249,12 @@ action_ctrl_ok:
if (presult)
addattr32(n, MAX_MSG, TCA_POLICE_RESULT, presult);
+ if (pps64) {
+ addattr64(n, MAX_MSG, TCA_POLICE_PKTRATE64, pps64);
+ ppsburst64 = tc_calc_xmittime(pps64, ppsburst64);
+ addattr64(n, MAX_MSG, TCA_POLICE_PKTBURST64, ppsburst64);
+ }
+
addattr_nest_end(n, tail);
res = 0;
@@ -244,6 +276,7 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
unsigned int buffer;
unsigned int linklayer;
__u64 rate64, prate64;
+ __u64 pps64, ppsburst64;
if (arg == NULL)
return 0;
@@ -287,6 +320,17 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
tc_print_rate(PRINT_FP, NULL, "avrate %s ",
rta_getattr_u32(tb[TCA_POLICE_AVRATE]));
+ if ((tb[TCA_POLICE_PKTRATE64] &&
+ RTA_PAYLOAD(tb[TCA_POLICE_PKTRATE64]) >= sizeof(pps64)) &&
+ (tb[TCA_POLICE_PKTBURST64] &&
+ RTA_PAYLOAD(tb[TCA_POLICE_PKTBURST64]) >= sizeof(ppsburst64))) {
+ pps64 = rta_getattr_u64(tb[TCA_POLICE_PKTRATE64]);
+ ppsburst64 = rta_getattr_u64(tb[TCA_POLICE_PKTBURST64]);
+ ppsburst64 = tc_calc_xmitsize(pps64, ppsburst64);
+ fprintf(f, "pkts_rate %llu ", pps64);
+ fprintf(f, "pkts_burst %llu ", ppsburst64);
+ }
+
print_action_control(f, "action ", p->action, "");
if (tb[TCA_POLICE_RESULT]) {
--
2.31.1

@ -0,0 +1,72 @@
From 3be62dd57ef875f9cf4674f8665c5da48c4e2274 Mon Sep 17 00:00:00 2001
Message-Id: <3be62dd57ef875f9cf4674f8665c5da48c4e2274.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: add the --prefix option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit 0ee1950b
commit 0ee1950b5c38986ea896606810231f5f9d761a00
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:54 2021 +0200
configure: add the --prefix option
This commit adds the '--prefix' option to the iproute2 configure script.
This mimics the '--prefix' option that autotools configure provides, and
will be used later to allow users or packagers to set the lib directory.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
configure | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/configure b/configure
index 9a2645d9..05e23eff 100755
--- a/configure
+++ b/configure
@@ -3,6 +3,7 @@
# This is not an autoconf generated configure
INCLUDE="$PWD/include"
+PREFIX="/usr"
# Output file which is input to Makefile
CONFIG=config.mk
@@ -490,6 +491,7 @@ Usage: $0 [OPTIONS]
--libbpf_force <on|off> Enable/disable libbpf by force. Available options:
on: require link against libbpf, quit config if no libbpf support
off: disable libbpf probing
+ --prefix <dir> Path prefix of the lib files to install
-h | --help Show this usage info
EOF
exit $1
@@ -516,6 +518,11 @@ else
LIBBPF_FORCE="$1" ;;
--libbpf_force=*)
LIBBPF_FORCE="${1#*=}" ;;
+ --prefix)
+ shift
+ PREFIX="$1" ;;
+ --prefix=*)
+ PREFIX="${1#*=}" ;;
-h | --help)
usage 0 ;;
--*)
@@ -536,6 +543,7 @@ if [ "${LIBBPF_FORCE-unused}" != "unused" ]; then
usage 1
fi
fi
+[ -z "$PREFIX" ] && usage 1
echo "# Generated config based on" $INCLUDE >$CONFIG
quiet_config >> $CONFIG
--
2.31.1
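
The option handling added by the patch above accepts both "--prefix DIR" and "--prefix=DIR", and rejects an empty value through the final `[ -z "$PREFIX" ] && usage 1` check. A minimal sketch of the same logic in C follows; `parse_prefix` is a hypothetical helper written for illustration and is not part of iproute2.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of iproute2): scan argv for
 * "--prefix DIR" or "--prefix=DIR" and return the chosen prefix.
 * Returns default_prefix when the option is absent, and NULL when the
 * option is given without a value, which is the case the configure
 * script rejects with "usage 1".
 */
static const char *parse_prefix(int argc, char **argv,
                                const char *default_prefix)
{
    const char *prefix = default_prefix;
    int i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--prefix") == 0) {
            /* separate-argument form: the value is the next word, if any */
            prefix = (i + 1 < argc) ? argv[++i] : "";
        } else if (strncmp(argv[i], "--prefix=", 9) == 0) {
            /* "=" form: the value follows the equals sign */
            prefix = argv[i] + 9;
        }
    }
    /* an empty prefix is treated as an error, like [ -z "$PREFIX" ] */
    return prefix[0] ? prefix : NULL;
}
```

Calling `parse_prefix(argc, argv, "/usr")` mirrors the script's default of PREFIX="/usr"; a NULL return is where a caller would print the usage text and exit.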


@ -1,159 +0,0 @@
From 04b921c03a4680931df6660b88444f2478fb585c Mon Sep 17 00:00:00 2001
Message-Id: <04b921c03a4680931df6660b88444f2478fb585c.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 11 Aug 2021 12:55:14 +0200
Subject: [PATCH] police: Add support for json output
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1981393
Upstream Status: iproute2.git commit 0d5cf51e
commit 0d5cf51e0d6c7bfdc51754381b85367b5f8e254a
Author: Roi Dayan <roid@nvidia.com>
Date: Mon Jun 7 09:44:08 2021 +0300
police: Add support for json output
Change to use the print wrappers instead of fprintf().
This is example output of the options part before this commit:
"options": {
"handle": 1,
"in_hw": true,
"actions": [ {
"order": 1 police 0x2 ,
"control_action": {
"type": "drop"
},
"control_action": {
"type": "continue"
}overhead 0b linklayer unspec
ref 1 bind 1
,
"used_hw_stats": [ "delayed" ]
} ]
}
This is the output of the same dump with this commit:
"options": {
"handle": 1,
"in_hw": true,
"actions": [ {
"order": 1,
"kind": "police",
"index": 2,
"control_action": {
"type": "drop"
},
"control_action": {
"type": "continue"
},
"overhead": 0,
"linklayer": "unspec",
"ref": 1,
"bind": 1,
"used_hw_stats": [ "delayed" ]
} ]
}
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
tc/m_police.c | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/tc/m_police.c b/tc/m_police.c
index 9ef0e40b..2594c089 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -278,18 +278,19 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
__u64 rate64, prate64;
__u64 pps64, ppsburst64;
+ print_string(PRINT_ANY, "kind", "%s", "police");
if (arg == NULL)
return 0;
parse_rtattr_nested(tb, TCA_POLICE_MAX, arg);
if (tb[TCA_POLICE_TBF] == NULL) {
- fprintf(f, "[NULL police tbf]");
- return 0;
+ fprintf(stderr, "[NULL police tbf]");
+ return -1;
}
#ifndef STOOPID_8BYTE
if (RTA_PAYLOAD(tb[TCA_POLICE_TBF]) < sizeof(*p)) {
- fprintf(f, "[truncated police tbf]");
+ fprintf(stderr, "[truncated police tbf]");
return -1;
}
#endif
@@ -300,13 +301,13 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
RTA_PAYLOAD(tb[TCA_POLICE_RATE64]) >= sizeof(rate64))
rate64 = rta_getattr_u64(tb[TCA_POLICE_RATE64]);
- fprintf(f, " police 0x%x ", p->index);
+ print_uint(PRINT_ANY, "index", "\t index %u ", p->index);
tc_print_rate(PRINT_FP, NULL, "rate %s ", rate64);
buffer = tc_calc_xmitsize(rate64, p->burst);
print_size(PRINT_FP, NULL, "burst %s ", buffer);
print_size(PRINT_FP, NULL, "mtu %s ", p->mtu);
if (show_raw)
- fprintf(f, "[%08x] ", p->burst);
+ print_hex(PRINT_FP, NULL, "[%08x] ", p->burst);
prate64 = p->peakrate.rate;
if (tb[TCA_POLICE_PEAKRATE64] &&
@@ -327,8 +328,8 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
pps64 = rta_getattr_u64(tb[TCA_POLICE_PKTRATE64]);
ppsburst64 = rta_getattr_u64(tb[TCA_POLICE_PKTBURST64]);
ppsburst64 = tc_calc_xmitsize(pps64, ppsburst64);
- fprintf(f, "pkts_rate %llu ", pps64);
- fprintf(f, "pkts_burst %llu ", ppsburst64);
+ print_u64(PRINT_ANY, "pkts_rate", "pkts_rate %llu ", pps64);
+ print_u64(PRINT_ANY, "pkts_burst", "pkts_burst %llu ", ppsburst64);
}
print_action_control(f, "action ", p->action, "");
@@ -337,14 +338,17 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
__u32 action = rta_getattr_u32(tb[TCA_POLICE_RESULT]);
print_action_control(f, "/", action, " ");
- } else
- fprintf(f, " ");
+ } else {
+ print_string(PRINT_FP, NULL, " ", NULL);
+ }
- fprintf(f, "overhead %ub ", p->rate.overhead);
+ print_uint(PRINT_ANY, "overhead", "overhead %u ", p->rate.overhead);
linklayer = (p->rate.linklayer & TC_LINKLAYER_MASK);
if (linklayer > TC_LINKLAYER_ETHERNET || show_details)
- fprintf(f, "linklayer %s ", sprint_linklayer(linklayer, b2));
- fprintf(f, "\n\tref %d bind %d", p->refcnt, p->bindcnt);
+ print_string(PRINT_ANY, "linklayer", "linklayer %s ",
+ sprint_linklayer(linklayer, b2));
+ print_int(PRINT_ANY, "ref", "ref %d ", p->refcnt);
+ print_int(PRINT_ANY, "bind", "bind %d ", p->bindcnt);
if (show_stats) {
if (tb[TCA_POLICE_TM]) {
struct tcf_t *tm = RTA_DATA(tb[TCA_POLICE_TM]);
@@ -352,7 +356,7 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
print_tm(f, tm);
}
}
- fprintf(f, "\n");
+ print_nl();
return 0;
--
2.31.1


@ -0,0 +1,151 @@
From f9649a5c15f7dcee4e684854fcc75a7a3fe27683 Mon Sep 17 00:00:00 2001
Message-Id: <f9649a5c15f7dcee4e684854fcc75a7a3fe27683.1637678195.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1637678195.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 23 Nov 2021 15:28:18 +0100
Subject: [PATCH] configure: add the --libdir option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2016061
Upstream Status: iproute2.git commit cee0cf84
commit cee0cf84bd32c8d9215f0c155187ad99d52a69b1
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu Oct 14 10:50:55 2021 +0200
configure: add the --libdir option
This commit allows users/packagers to choose a lib directory to store
iproute2 lib files.
At the moment iproute2 ships lib files in /usr/lib and offers no way to
modify this setting. However, according to the FHS, distros may choose
"one or more variants of the /lib directory on systems which support
more than one binary format" (e.g. /usr/lib64 on Fedora).
As Luca states in commit a3272b93725a ("configure: restore backward
compatibility"), packaging systems may assume that 'configure' is from
autotools, and try to pass it some parameters.
By allowing the '--libdir=/path/to/libdir' syntax, we can use this to our
advantage, and let the lib directory be chosen by the distro
packaging system.
Note that LIBDIR uses "\${prefix}/lib" as default value because autoconf
allows this to be expanded to the --prefix value at configure runtime.
"\${prefix}" is replaced with the PREFIX value in check_lib_dir().
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
Makefile | 7 ++++---
configure | 18 ++++++++++++++++++
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/Makefile b/Makefile
index 5bc11477..45655ca4 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
# Top level Makefile for iproute2
+-include config.mk
+
ifeq ("$(origin V)", "command line")
VERBOSE = $(V)
endif
@@ -13,7 +15,6 @@ MAKEFLAGS += --no-print-directory
endif
PREFIX?=/usr
-LIBDIR?=$(PREFIX)/lib
SBINDIR?=/sbin
CONFDIR?=/etc/iproute2
NETNS_RUN_DIR?=/var/run/netns
@@ -60,7 +61,7 @@ SUBDIRS=lib ip tc bridge misc netem genl tipc devlink rdma dcb man vdpa
LIBNETLINK=../lib/libutil.a ../lib/libnetlink.a
LDLIBS += $(LIBNETLINK)
-all: config
+all: config.mk
@set -e; \
for i in $(SUBDIRS); \
do echo; echo $$i; $(MAKE) -C $$i; done
@@ -80,7 +81,7 @@ help:
@echo "Make Arguments:"
@echo " V=[0|1] - set build verbosity level"
-config:
+config.mk:
@if [ ! -f config.mk -o configure -nt config.mk ]; then \
sh configure $(KERNEL_INCLUDE); \
fi
diff --git a/configure b/configure
index 05e23eff..8ddff43c 100755
--- a/configure
+++ b/configure
@@ -4,6 +4,7 @@
INCLUDE="$PWD/include"
PREFIX="/usr"
+LIBDIR="\${prefix}/lib"
# Output file which is input to Makefile
CONFIG=config.mk
@@ -149,6 +150,15 @@ EOF
rm -f $TMPDIR/ipttest.c $TMPDIR/ipttest
}
+check_lib_dir()
+{
+ LIBDIR=$(echo $LIBDIR | sed "s|\${prefix}|$PREFIX|")
+
+ echo -n "lib directory: "
+ echo "$LIBDIR"
+ echo "LIBDIR:=$LIBDIR" >> $CONFIG
+}
+
check_ipt()
{
if ! grep TC_CONFIG_XT $CONFIG > /dev/null; then
@@ -487,6 +497,7 @@ usage()
cat <<EOF
Usage: $0 [OPTIONS]
--include_dir <dir> Path to iproute2 include dir
+ --libdir <dir> Path to iproute2 lib dir
--libbpf_dir <dir> Path to libbpf DESTDIR
--libbpf_force <on|off> Enable/disable libbpf by force. Available options:
on: require link against libbpf, quit config if no libbpf support
@@ -508,6 +519,11 @@ else
INCLUDE="$1" ;;
--include_dir=*)
INCLUDE="${1#*=}" ;;
+ --libdir)
+ shift
+ LIBDIR="$1" ;;
+ --libdir=*)
+ LIBDIR="${1#*=}" ;;
--libbpf_dir)
shift
LIBBPF_DIR="$1" ;;
@@ -544,6 +560,7 @@ if [ "${LIBBPF_FORCE-unused}" != "unused" ]; then
fi
fi
[ -z "$PREFIX" ] && usage 1
+[ -z "$LIBDIR" ] && usage 1
echo "# Generated config based on" $INCLUDE >$CONFIG
quiet_config >> $CONFIG
@@ -568,6 +585,7 @@ if ! grep -q TC_CONFIG_NO_XT $CONFIG; then
fi
echo
+check_lib_dir
if ! grep -q TC_CONFIG_NO_XT $CONFIG; then
echo -n "iptables modules directory: "
check_ipt_lib_dir
--
2.31.1
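
In the patch above, check_lib_dir() substitutes the literal text "${prefix}" in LIBDIR with the PREFIX value by means of sed. The same substitution can be sketched in C as below; `expand_libdir` is a hypothetical name, and like the sed expression (which has no "g" flag) it replaces only the first occurrence.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: replace the first "${prefix}" in libdir with the
 * value of prefix, in the spirit of the sed call in check_lib_dir().
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int expand_libdir(const char *libdir, const char *prefix,
                         char *out, size_t outlen)
{
    static const char token[] = "${prefix}";
    const char *hit = strstr(libdir, token);
    int n;

    if (!hit)
        /* nothing to expand: copy the value through unchanged */
        n = snprintf(out, outlen, "%s", libdir);
    else
        /* text before the token, then the prefix, then the remainder */
        n = snprintf(out, outlen, "%.*s%s%s",
                     (int)(hit - libdir), libdir, prefix,
                     hit + sizeof(token) - 1);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the defaults from the patch, `expand_libdir("${prefix}/lib", "/usr", buf, sizeof(buf))` leaves "/usr/lib" in buf, while a value passed as --libdir=/usr/lib64 goes through untouched.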


@ -1,73 +0,0 @@
From 148b286b52aa8f38d8d7587b598522310067de7b Mon Sep 17 00:00:00 2001
Message-Id: <148b286b52aa8f38d8d7587b598522310067de7b.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 11 Aug 2021 12:55:14 +0200
Subject: [PATCH] police: Fix normal output back to what it was
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1981393
Upstream Status: iproute2.git commit 71d36000
commit 71d36000dc9ce8397fc45b680e0c0340df5a28e5
Author: Roi Dayan <roid@nvidia.com>
Date: Mon Jul 12 15:26:53 2021 +0300
police: Fix normal output back to what it was
With the json support fix, the normal output was
changed. Set it back to what it was.
Print overhead with print_size().
Print newline before ref.
Fixes: 0d5cf51e0d6c ("police: Add support for json output")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
tc/m_police.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/tc/m_police.c b/tc/m_police.c
index 2594c089..f38ab90a 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -278,7 +278,7 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
__u64 rate64, prate64;
__u64 pps64, ppsburst64;
- print_string(PRINT_ANY, "kind", "%s", "police");
+ print_string(PRINT_JSON, "kind", "%s", "police");
if (arg == NULL)
return 0;
@@ -301,7 +301,8 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
RTA_PAYLOAD(tb[TCA_POLICE_RATE64]) >= sizeof(rate64))
rate64 = rta_getattr_u64(tb[TCA_POLICE_RATE64]);
- print_uint(PRINT_ANY, "index", "\t index %u ", p->index);
+ print_hex(PRINT_FP, NULL, " police 0x%x ", p->index);
+ print_uint(PRINT_JSON, "index", NULL, p->index);
tc_print_rate(PRINT_FP, NULL, "rate %s ", rate64);
buffer = tc_calc_xmitsize(rate64, p->burst);
print_size(PRINT_FP, NULL, "burst %s ", buffer);
@@ -342,12 +343,13 @@ static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
print_string(PRINT_FP, NULL, " ", NULL);
}
- print_uint(PRINT_ANY, "overhead", "overhead %u ", p->rate.overhead);
+ print_size(PRINT_ANY, "overhead", "overhead %s ", p->rate.overhead);
linklayer = (p->rate.linklayer & TC_LINKLAYER_MASK);
if (linklayer > TC_LINKLAYER_ETHERNET || show_details)
print_string(PRINT_ANY, "linklayer", "linklayer %s ",
sprint_linklayer(linklayer, b2));
- print_int(PRINT_ANY, "ref", "ref %d ", p->refcnt);
+ print_nl();
+ print_int(PRINT_ANY, "ref", "\tref %d ", p->refcnt);
print_int(PRINT_ANY, "bind", "bind %d ", p->bindcnt);
if (show_stats) {
if (tb[TCA_POLICE_TM]) {
--
2.31.1
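
The patch above restores the plain-text output by emitting the index twice: once as a PRINT_FP field with the old " police 0x%x " format and once as a PRINT_JSON field. The sketch below illustrates that split under the assumption, inferred from how the patch uses them, that PRINT_FP fields appear only in plain mode, PRINT_JSON fields only in JSON mode, and PRINT_ANY fields in both. `enum out_type` and `emit_uint` are hypothetical stand-ins, not the iproute2 json_print API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the output types used by the print wrappers. */
enum out_type { OUT_FP, OUT_JSON, OUT_ANY };

/*
 * Format one unsigned field into buf (len must be at least 1).
 * In JSON mode, OUT_JSON and OUT_ANY fields become "key":value.
 * In plain mode, OUT_FP and OUT_ANY fields use the printf-style fmt.
 * Any other combination leaves buf empty.
 */
static int emit_uint(char *buf, size_t len, bool json_mode, enum out_type t,
                     const char *key, const char *fmt, unsigned int val)
{
    buf[0] = '\0';
    if (json_mode && (t == OUT_JSON || t == OUT_ANY))
        return snprintf(buf, len, "\"%s\":%u", key, val);
    if (!json_mode && (t == OUT_FP || t == OUT_ANY))
        return snprintf(buf, len, fmt, val);
    return 0;
}
```

Emitting the same value through an OUT_FP call and an OUT_JSON call lets each mode keep its own representation: hexadecimal with the "police" keyword in plain text, a bare number under the "index" key in JSON.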


@ -1,68 +0,0 @@
From 7fcfc0e4d6949ff32df3ed749bad8eb419cebbda Mon Sep 17 00:00:00 2001
Message-Id: <7fcfc0e4d6949ff32df3ed749bad8eb419cebbda.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 11 Aug 2021 14:49:33 +0200
Subject: [PATCH] tc: u32: Fix key folding in sample option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1979425
Upstream Status: iproute2.git commit 9b7ea92b
commit 9b7ea92b9e3feff2876f772ace01148b7406839c
Author: Phil Sutter <phil@nwl.cc>
Date: Wed Aug 4 11:18:28 2021 +0200
tc: u32: Fix key folding in sample option
In between Linux kernel 2.4 and 2.6, key folding for hash tables changed
in kernel space. When iproute2 dropped support for the older algorithm,
the wrong code was removed and kernel 2.4 folding method remained in
place. To get things functional for recent kernels again, restoring the
old code alone was not sufficient - additional byteorder fixes were
needed.
While at it, make use of ffs() and thereby align the code with how
the kernel determines the shift width.
Fixes: 267480f55383c ("Backout the 2.4 utsname hash patch.")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
tc/f_u32.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/tc/f_u32.c b/tc/f_u32.c
index 2ed5254a..a5747f67 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -978,6 +978,13 @@ show_k:
goto show_k;
}
+static __u32 u32_hash_fold(struct tc_u32_key *key)
+{
+ __u8 fshift = key->mask ? ffs(ntohl(key->mask)) - 1 : 0;
+
+ return ntohl(key->val & key->mask) >> fshift;
+}
+
static int u32_parse_opt(struct filter_util *qu, char *handle,
int argc, char **argv, struct nlmsghdr *n)
{
@@ -1110,9 +1117,7 @@ static int u32_parse_opt(struct filter_util *qu, char *handle,
}
NEXT_ARG();
}
- hash = sel2.keys[0].val & sel2.keys[0].mask;
- hash ^= hash >> 16;
- hash ^= hash >> 8;
+ hash = u32_hash_fold(&sel2.keys[0]);
htid = ((hash % divisor) << 12) | (htid & 0xFFF00000);
sample_ok = 1;
continue;
--
2.31.1
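
The folding introduced by the patch above masks the key value, converts it to host byte order, and shifts it right by the position of the lowest set bit of the mask. A self-contained sketch of that computation is shown below; `struct key_sketch` and `fold_sketch` are hypothetical names standing in for struct tc_u32_key and u32_hash_fold(), and only the two fields used by the fold are modelled.

```c
#include <stdint.h>
#include <strings.h>    /* ffs() */
#include <arpa/inet.h>  /* ntohl(), htonl() */

/*
 * Hypothetical stand-in for the two fields of struct tc_u32_key used
 * by the fold; both are kept in network byte order, as in the patch.
 */
struct key_sketch {
    uint32_t mask;
    uint32_t val;
};

static uint32_t fold_sketch(const struct key_sketch *key)
{
    /* ffs() is 1-based, so subtract 1 to get the lowest set bit index */
    unsigned int shift = key->mask ? ffs(ntohl(key->mask)) - 1 : 0;

    /* mask in network order, convert, then drop the unused low bits */
    return ntohl(key->val & key->mask) >> shift;
}
```

For a mask of 0x0000ff00 and a value of 0x12345678 (both given in host order and converted with htonl()), the masked value is 0x00005600 and the shift is 8, so the fold yields 0x56. The result is then reduced modulo the divisor to select the hash bucket, as the unchanged `hash % divisor` line in the patch shows.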


@ -0,0 +1,141 @@
From 548e39858ecf9291494555466e6e931935b4a0ee Mon Sep 17 00:00:00 2001
Message-Id: <548e39858ecf9291494555466e6e931935b4a0ee.1643220552.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 26 Jan 2022 10:37:45 +0100
Subject: [PATCH] vdpa: align uapi headers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2036880
Upstream Status: iproute2.git commit fa58de9b0c73
commit 34672eae6ec8e885f76ae30af10f849720012dd6
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Thu Nov 18 09:56:57 2021 -0800
vdpa: align uapi headers
Update vdpa headers based on 5.16.0-rc1 and remove redundant
copy.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
include/uapi/linux/vdpa.h | 40 ----------------------------
vdpa/include/uapi/linux/vdpa.h | 7 +++++
vdpa/include/uapi/linux/virtio_ids.h | 26 ++++++++++++++++++
3 files changed, 33 insertions(+), 40 deletions(-)
delete mode 100644 include/uapi/linux/vdpa.h
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
deleted file mode 100644
index 37ae26b6..00000000
--- a/include/uapi/linux/vdpa.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
-/*
- * vdpa device management interface
- * Copyright (c) 2020 Mellanox Technologies Ltd. All rights reserved.
- */
-
-#ifndef _LINUX_VDPA_H_
-#define _LINUX_VDPA_H_
-
-#define VDPA_GENL_NAME "vdpa"
-#define VDPA_GENL_VERSION 0x1
-
-enum vdpa_command {
- VDPA_CMD_UNSPEC,
- VDPA_CMD_MGMTDEV_NEW,
- VDPA_CMD_MGMTDEV_GET, /* can dump */
- VDPA_CMD_DEV_NEW,
- VDPA_CMD_DEV_DEL,
- VDPA_CMD_DEV_GET, /* can dump */
-};
-
-enum vdpa_attr {
- VDPA_ATTR_UNSPEC,
-
- /* bus name (optional) + dev name together make the parent device handle */
- VDPA_ATTR_MGMTDEV_BUS_NAME, /* string */
- VDPA_ATTR_MGMTDEV_DEV_NAME, /* string */
- VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES, /* u64 */
-
- VDPA_ATTR_DEV_NAME, /* string */
- VDPA_ATTR_DEV_ID, /* u32 */
- VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
- VDPA_ATTR_DEV_MAX_VQS, /* u32 */
- VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
-
- /* new attributes must be added above here */
- VDPA_ATTR_MAX,
-};
-
-#endif
diff --git a/vdpa/include/uapi/linux/vdpa.h b/vdpa/include/uapi/linux/vdpa.h
index 37ae26b6..b7eab069 100644
--- a/vdpa/include/uapi/linux/vdpa.h
+++ b/vdpa/include/uapi/linux/vdpa.h
@@ -17,6 +17,7 @@ enum vdpa_command {
VDPA_CMD_DEV_NEW,
VDPA_CMD_DEV_DEL,
VDPA_CMD_DEV_GET, /* can dump */
+ VDPA_CMD_DEV_CONFIG_GET, /* can dump */
};
enum vdpa_attr {
@@ -32,6 +33,12 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */
+
+ VDPA_ATTR_DEV_NET_CFG_MACADDR, /* binary */
+ VDPA_ATTR_DEV_NET_STATUS, /* u8 */
+ VDPA_ATTR_DEV_NET_CFG_MAX_VQP, /* u16 */
+ VDPA_ATTR_DEV_NET_CFG_MTU, /* u16 */
/* new attributes must be added above here */
VDPA_ATTR_MAX,
diff --git a/vdpa/include/uapi/linux/virtio_ids.h b/vdpa/include/uapi/linux/virtio_ids.h
index bc1c0621..80d76b75 100644
--- a/vdpa/include/uapi/linux/virtio_ids.h
+++ b/vdpa/include/uapi/linux/virtio_ids.h
@@ -51,8 +51,34 @@
#define VIRTIO_ID_PSTORE 22 /* virtio pstore device */
#define VIRTIO_ID_IOMMU 23 /* virtio IOMMU */
#define VIRTIO_ID_MEM 24 /* virtio mem */
+#define VIRTIO_ID_SOUND 25 /* virtio sound */
#define VIRTIO_ID_FS 26 /* virtio filesystem */
#define VIRTIO_ID_PMEM 27 /* virtio pmem */
+#define VIRTIO_ID_RPMB 28 /* virtio rpmb */
#define VIRTIO_ID_MAC80211_HWSIM 29 /* virtio mac80211-hwsim */
+#define VIRTIO_ID_VIDEO_ENCODER 30 /* virtio video encoder */
+#define VIRTIO_ID_VIDEO_DECODER 31 /* virtio video decoder */
+#define VIRTIO_ID_SCMI 32 /* virtio SCMI */
+#define VIRTIO_ID_NITRO_SEC_MOD 33 /* virtio nitro secure module*/
+#define VIRTIO_ID_I2C_ADAPTER 34 /* virtio i2c adapter */
+#define VIRTIO_ID_WATCHDOG 35 /* virtio watchdog */
+#define VIRTIO_ID_CAN 36 /* virtio can */
+#define VIRTIO_ID_DMABUF 37 /* virtio dmabuf */
+#define VIRTIO_ID_PARAM_SERV 38 /* virtio parameter server */
+#define VIRTIO_ID_AUDIO_POLICY 39 /* virtio audio policy */
+#define VIRTIO_ID_BT 40 /* virtio bluetooth */
+#define VIRTIO_ID_GPIO 41 /* virtio gpio */
+
+/*
+ * Virtio Transitional IDs
+ */
+
+#define VIRTIO_TRANS_ID_NET 1000 /* transitional virtio net */
+#define VIRTIO_TRANS_ID_BLOCK 1001 /* transitional virtio block */
+#define VIRTIO_TRANS_ID_BALLOON 1002 /* transitional virtio balloon */
+#define VIRTIO_TRANS_ID_CONSOLE 1003 /* transitional virtio console */
+#define VIRTIO_TRANS_ID_SCSI 1004 /* transitional virtio SCSI */
+#define VIRTIO_TRANS_ID_RNG 1005 /* transitional virtio rng */
+#define VIRTIO_TRANS_ID_9P 1009 /* transitional virtio 9p console */
#endif /* _LINUX_VIRTIO_IDS_H */
--
2.34.1


@ -1,84 +0,0 @@
From 0b66dc13c157f4d34518c06dd774ef39be0df271 Mon Sep 17 00:00:00 2001
Message-Id: <0b66dc13c157f4d34518c06dd774ef39be0df271.1628790091.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1628790091.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Thu, 12 Aug 2021 18:26:39 +0200
Subject: [PATCH] tc: htb: improve burst error messages
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1910745
Upstream Status: iproute2.git commit e44786b2
commit e44786b26934e4fbf337b0af73a9e6f53d458a25
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Thu May 6 12:42:06 2021 +0200
tc: htb: improve burst error messages
When a wrong value is provided for "burst" or "cburst" parameters, the
resulting error message is unclear and can be misleading:
$ tc class add dev dummy0 parent 1: classid 1:1 htb rate 100KBps burst errtrigger
Illegal "buffer"
The message claims an illegal "buffer" is provided, but neither the
inline help nor the man page list "buffer" among the htb parameters, and
the only way to know that "burst", "maxburst" and "buffer" are synonyms
is to look into tc/q_htb.c.
This commit tries to improve this by simply changing the error string to
the parameter name provided in the user-given command, clearly pointing
out where the wrong value is.
$ tc class add dev dummy0 parent 1: classid 1:1 htb rate 100KBps burst errtrigger
Illegal "burst"
$ tc class add dev dummy0 parent 1: classid 1:1 htb rate 100Kbps maxburst errtrigger
Illegal "maxburst"
Reported-by: Sebastian Mitterle <smitterl@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
---
tc/q_htb.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tc/q_htb.c b/tc/q_htb.c
index 42566355..b5f95f67 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -125,6 +125,7 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
unsigned int linklayer = LINKLAYER_ETHERNET; /* Assume ethernet */
struct rtattr *tail;
__u64 ceil64 = 0, rate64 = 0;
+ char *param;
while (argc > 0) {
if (matches(*argv, "prio") == 0) {
@@ -160,17 +161,19 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
} else if (matches(*argv, "burst") == 0 ||
strcmp(*argv, "buffer") == 0 ||
strcmp(*argv, "maxburst") == 0) {
+ param = *argv;
NEXT_ARG();
if (get_size_and_cell(&buffer, &cell_log, *argv) < 0) {
- explain1("buffer");
+ explain1(param);
return -1;
}
} else if (matches(*argv, "cburst") == 0 ||
strcmp(*argv, "cbuffer") == 0 ||
strcmp(*argv, "cmaxburst") == 0) {
+ param = *argv;
NEXT_ARG();
if (get_size_and_cell(&cbuffer, &ccell_log, *argv) < 0) {
- explain1("cbuffer");
+ explain1(param);
return -1;
}
} else if (strcmp(*argv, "ceil") == 0) {
--
2.31.1
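
The fix above boils down to remembering which synonym the user typed before advancing to its value, so that the error names that spelling. A simplified sketch of the pattern follows. `parse_burst_sketch` is hypothetical: it uses exact string matches and strtoul() in place of tc's matches() and get_size_and_cell(), and writes the message to a buffer instead of calling explain1().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: accept "burst", "buffer" or "maxburst" followed
 * by a decimal value. On a bad value, the error names the spelling the
 * user typed rather than a fixed internal name.
 */
static int parse_burst_sketch(int argc, char **argv, unsigned long *burst,
                              char *err, size_t errlen)
{
    int i;

    for (i = 0; i < argc; i++) {
        if (strcmp(argv[i], "burst") == 0 ||
            strcmp(argv[i], "buffer") == 0 ||
            strcmp(argv[i], "maxburst") == 0) {
            const char *param = argv[i]; /* remember the user's spelling */
            char *end;

            if (++i >= argc) {
                snprintf(err, errlen, "\"%s\" needs a value", param);
                return -1;
            }
            *burst = strtoul(argv[i], &end, 10);
            if (end == argv[i] || *end != '\0') {
                snprintf(err, errlen, "Illegal \"%s\"", param);
                return -1;
            }
        }
    }
    return 0;
}
```

Saving the pointer before moving to the next argument matters because, after the advance, the current argument is the bad value rather than the parameter name.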


@ -0,0 +1,239 @@
From e3610d3ea2c8d88c3dd61d845a682f74a1af1d1f Mon Sep 17 00:00:00 2001
Message-Id: <e3610d3ea2c8d88c3dd61d845a682f74a1af1d1f.1643220552.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 26 Jan 2022 10:39:38 +0100
Subject: [PATCH] vdpa: Enable user to query vdpa device config layout
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2036880
Upstream Status: iproute2.git commit a311f0c4
commit a311f0c43a67be939dfafda563453a3f9bf30e42
Author: Parav Pandit <parav@nvidia.com>
Date: Fri Dec 17 10:08:25 2021 +0200
vdpa: Enable user to query vdpa device config layout
Query the device configuration layout whenever kernel supports it.
An example of configuration layout of vdpa device of type network:
$ vdpa dev add name bar mgmtdev vdpasim_net
$ vdpa dev config show
bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500
$ vdpa dev config show -jp
{
"config": {
"bar": {
"mac": "00:35:09:19:48:05",
"link ": "up",
"link_announce ": false,
"mtu": 1500,
}
}
}
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
man/man8/vdpa-dev.8 | 21 +++++++++
vdpa/vdpa.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 131 insertions(+)
diff --git a/man/man8/vdpa-dev.8 b/man/man8/vdpa-dev.8
index 36433519..5d3a3f26 100644
--- a/man/man8/vdpa-dev.8
+++ b/man/man8/vdpa-dev.8
@@ -36,6 +36,10 @@ vdpa-dev \- vdpa device configuration
.B vdpa dev del
.I DEV
+.ti -8
+.B vdpa dev config show
+.RI "[ " DEV " ]"
+
.SH "DESCRIPTION"
.SS vdpa dev show - display vdpa device attributes
@@ -65,6 +69,18 @@ Name of the management device to use for device addition.
.I "DEV"
- specifies the vdpa device to delete.
+.SS vdpa dev config show - Show configuration of specific device or all devices.
+
+.PP
+.I "DEV"
+- specifies the vdpa device to show its configuration.
+If this argument is omitted all devices configuration is listed.
+
+.in +4
+Format is:
+.in +2
+VDPA_DEVICE_NAME
+
.SH "EXAMPLES"
.PP
vdpa dev show
@@ -86,6 +102,11 @@ vdpa dev del foo
.RS 4
Delete the vdpa device named foo which was previously created.
.RE
+.PP
+vdpa dev config show foo
+.RS 4
+Shows the vdpa device configuration of device named foo.
+.RE
.SH SEE ALSO
.BR vdpa (8),
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index 7fdb36b9..ba704254 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -6,9 +6,11 @@
#include <linux/genetlink.h>
#include <linux/vdpa.h>
#include <linux/virtio_ids.h>
+#include <linux/virtio_net.h>
#include <linux/netlink.h>
#include <libmnl/libmnl.h>
#include "mnl_utils.h"
+#include <rt_names.h>
#include "version.h"
#include "json_print.h"
@@ -413,6 +415,7 @@ static void cmd_dev_help(void)
fprintf(stderr, "Usage: vdpa dev show [ DEV ]\n");
fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV\n");
fprintf(stderr, " vdpa dev del DEV\n");
+ fprintf(stderr, "Usage: vdpa dev config COMMAND [ OPTIONS ]\n");
}
static const char *device_type_name(uint32_t type)
@@ -520,6 +523,111 @@ static int cmd_dev_del(struct vdpa *vdpa, int argc, char **argv)
return mnlu_gen_socket_sndrcv(&vdpa->nlg, nlh, NULL, NULL);
}
+static void pr_out_dev_net_config(struct nlattr **tb)
+{
+ SPRINT_BUF(macaddr);
+ uint16_t val_u16;
+
+ if (tb[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
+ const unsigned char *data;
+ uint16_t len;
+
+ len = mnl_attr_get_payload_len(tb[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
+ data = mnl_attr_get_payload(tb[VDPA_ATTR_DEV_NET_CFG_MACADDR]);
+
+ print_string(PRINT_ANY, "mac", "mac %s ",
+ ll_addr_n2a(data, len, 0, macaddr, sizeof(macaddr)));
+ }
+ if (tb[VDPA_ATTR_DEV_NET_STATUS]) {
+ val_u16 = mnl_attr_get_u16(tb[VDPA_ATTR_DEV_NET_STATUS]);
+ print_string(PRINT_ANY, "link ", "link %s ",
+ (val_u16 & VIRTIO_NET_S_LINK_UP) ? "up" : "down");
+ print_bool(PRINT_ANY, "link_announce ", "link_announce %s ",
+ (val_u16 & VIRTIO_NET_S_ANNOUNCE) ? true : false);
+ }
+ if (tb[VDPA_ATTR_DEV_NET_CFG_MAX_VQP]) {
+ val_u16 = mnl_attr_get_u16(tb[VDPA_ATTR_DEV_NET_CFG_MAX_VQP]);
+ print_uint(PRINT_ANY, "max_vq_pairs", "max_vq_pairs %d ",
+ val_u16);
+ }
+ if (tb[VDPA_ATTR_DEV_NET_CFG_MTU]) {
+ val_u16 = mnl_attr_get_u16(tb[VDPA_ATTR_DEV_NET_CFG_MTU]);
+ print_uint(PRINT_ANY, "mtu", "mtu %d ", val_u16);
+ }
+}
+
+static void pr_out_dev_config(struct vdpa *vdpa, struct nlattr **tb)
+{
+ uint32_t device_id = mnl_attr_get_u32(tb[VDPA_ATTR_DEV_ID]);
+
+ pr_out_vdev_handle_start(vdpa, tb);
+ switch (device_id) {
+ case VIRTIO_ID_NET:
+ pr_out_dev_net_config(tb);
+ break;
+ default:
+ break;
+ }
+ pr_out_vdev_handle_end(vdpa);
+}
+
+static int cmd_dev_config_show_cb(const struct nlmsghdr *nlh, void *data)
+{
+ struct genlmsghdr *genl = mnl_nlmsg_get_payload(nlh);
+ struct nlattr *tb[VDPA_ATTR_MAX + 1] = {};
+ struct vdpa *vdpa = data;
+
+ mnl_attr_parse(nlh, sizeof(*genl), attr_cb, tb);
+ if (!tb[VDPA_ATTR_DEV_NAME] || !tb[VDPA_ATTR_DEV_ID])
+ return MNL_CB_ERROR;
+ pr_out_dev_config(vdpa, tb);
+ return MNL_CB_OK;
+}
+
+static int cmd_dev_config_show(struct vdpa *vdpa, int argc, char **argv)
+{
+ uint16_t flags = NLM_F_REQUEST | NLM_F_ACK;
+ struct nlmsghdr *nlh;
+ int err;
+
+ if (argc <= 0)
+ flags |= NLM_F_DUMP;
+
+ nlh = mnlu_gen_socket_cmd_prepare(&vdpa->nlg, VDPA_CMD_DEV_CONFIG_GET,
+ flags);
+ if (argc > 0) {
+ err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
+ VDPA_OPT_VDEV_HANDLE);
+ if (err)
+ return err;
+ }
+
+ pr_out_section_start(vdpa, "config");
+ err = mnlu_gen_socket_sndrcv(&vdpa->nlg, nlh, cmd_dev_config_show_cb, vdpa);
+ pr_out_section_end(vdpa);
+ return err;
+}
+
+static void cmd_dev_config_help(void)
+{
+ fprintf(stderr, "Usage: vdpa dev config show [ DEV ]\n");
+}
+
+static int cmd_dev_config(struct vdpa *vdpa, int argc, char **argv)
+{
+ if (!argc)
+ return cmd_dev_config_show(vdpa, argc - 1, argv + 1);
+
+ if (matches(*argv, "help") == 0) {
+ cmd_dev_config_help();
+ return 0;
+ } else if (matches(*argv, "show") == 0) {
+ return cmd_dev_config_show(vdpa, argc - 1, argv + 1);
+ }
+ fprintf(stderr, "Command \"%s\" not found\n", *argv);
+ return -ENOENT;
+}
+
static int cmd_dev(struct vdpa *vdpa, int argc, char **argv)
{
if (!argc)
@@ -535,6 +643,8 @@ static int cmd_dev(struct vdpa *vdpa, int argc, char **argv)
return cmd_dev_add(vdpa, argc - 1, argv + 1);
} else if (matches(*argv, "del") == 0) {
return cmd_dev_del(vdpa, argc - 1, argv + 1);
+ } else if (matches(*argv, "config") == 0) {
+ return cmd_dev_config(vdpa, argc - 1, argv + 1);
}
fprintf(stderr, "Command \"%s\" not found\n", *argv);
return -ENOENT;
--
2.34.1
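
In the patch above, pr_out_dev_net_config() derives the "link" and "link_announce" fields from two bits of the network status word. The decoding can be sketched on its own as below. `format_net_status` and the `_SKETCH` macros are hypothetical; the macro values repeat VIRTIO_NET_S_LINK_UP (1) and VIRTIO_NET_S_ANNOUNCE (2) from linux/virtio_net.h so that the sketch builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Same values as VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE. */
#define NET_S_LINK_UP_SKETCH  1
#define NET_S_ANNOUNCE_SKETCH 2

/*
 * Hypothetical helper: render the status word the way the plain-text
 * output of "vdpa dev config show" presents it.
 */
static int format_net_status(uint16_t status, char *buf, size_t len)
{
    const char *link = (status & NET_S_LINK_UP_SKETCH) ? "up" : "down";
    bool announce = (status & NET_S_ANNOUNCE_SKETCH) != 0;

    return snprintf(buf, len, "link %s link_announce %s ",
                    link, announce ? "true" : "false");
}
```

A status word of 1 gives "link up link_announce false ", which matches the example line in the commit message.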


@ -1,67 +0,0 @@
From d1f0f7f4e3e3a372a51e64bdd88f8ddecde1fbbf Mon Sep 17 00:00:00 2001
Message-Id: <d1f0f7f4e3e3a372a51e64bdd88f8ddecde1fbbf.1633614399.git.aclaudi@redhat.com>
In-Reply-To: <650694eb0120722499207078f965442ef7343bb1.1633614399.git.aclaudi@redhat.com>
References: <650694eb0120722499207078f965442ef7343bb1.1633614399.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Tue, 28 Sep 2021 11:46:43 +0200
Subject: [PATCH] lib: bpf_legacy: fix bpffs mount when /sys/fs/bpf exists
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1995082
Upstream Status: iproute2.git commit 2f5825cb
commit 2f5825cb38028a14961a79844a069be4e3057eca
Author: Andrea Claudi <aclaudi@redhat.com>
Date: Tue Sep 21 11:33:24 2021 +0200
lib: bpf_legacy: fix bpffs mount when /sys/fs/bpf exists
bpf selftests using iproute2 fail with:
$ ip link set dev veth0 xdp object ../bpf/xdp_dummy.o section xdp_dummy
Continuing without mounted eBPF fs. Too old kernel?
mkdir (null)/globals failed: No such file or directory
Unable to load program
This happens when the /sys/fs/bpf directory exists. In this case, mkdir
in bpf_mnt_check_target() fails with errno == EEXIST, and the function
returns -1. Thus bpf_get_work_dir() does not call bpf_mnt_fs() and the
bpffs is not mounted.
Fix this in bpf_mnt_check_target(), returning 0 when the mountpoint
exists.
Fixes: d4fcdbbec9df ("lib/bpf: Fix and simplify bpf_mnt_check_target()")
Reported-by: Mingyu Shi <mshi@redhat.com>
Reported-by: Jiri Benc <jbenc@redhat.com>
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/bpf_legacy.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 7ec9ce9d..f9dfad6e 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -513,9 +513,12 @@ static int bpf_mnt_check_target(const char *target)
int ret;
ret = mkdir(target, S_IRWXU);
- if (ret && errno != EEXIST)
+ if (ret) {
+ if (errno == EEXIST)
+ return 0;
fprintf(stderr, "mkdir %s failed: %s\n", target,
strerror(errno));
+ }
return ret;
}
--
2.31.1
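
The behaviour fixed by the patch above can be reproduced in isolation: mkdir() returns -1 with errno set to EEXIST when the directory is already there, and the fixed function treats that case as success. The sketch below follows the patched logic; `ensure_dir_sketch` is a hypothetical name used in place of bpf_mnt_check_target().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Sketch of the fixed logic: an already existing mountpoint is not an
 * error, so report success and let the caller go on to mount bpffs.
 */
static int ensure_dir_sketch(const char *target)
{
    int ret = mkdir(target, S_IRWXU);

    if (ret) {
        if (errno == EEXIST)
            return 0;
        fprintf(stderr, "mkdir %s failed: %s\n", target, strerror(errno));
    }
    return ret;
}
```

With the pre-patch code, the EEXIST case skipped the message but still returned -1, which is why the caller never reached the mount step when /sys/fs/bpf existed.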


@ -0,0 +1,233 @@
From 33a786460c3ef992e5a8d7a1be7ef5aac8860ba9 Mon Sep 17 00:00:00 2001
Message-Id: <33a786460c3ef992e5a8d7a1be7ef5aac8860ba9.1643220552.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 26 Jan 2022 10:39:38 +0100
Subject: [PATCH] vdpa: Enable user to set mac address of vdpa device
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2036880
Upstream Status: iproute2.git commit 384938f9
commit 384938f9b00f2d203603e0919f23ae6857a14d96
Author: Parav Pandit <parav@nvidia.com>
Date: Fri Dec 17 10:08:26 2021 +0200
vdpa: Enable user to set mac address of vdpa device
vdpa: Enable user to set mtu of the vdpa device
Implement mtu setting for vdpa device.
$ vdpa mgmtdev show
vdpasim_net:
supported_classes net
Add the device with specified mac address:
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55
View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 1500
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
man/man8/vdpa-dev.8 | 11 ++++++++++
vdpa/vdpa.c | 52 ++++++++++++++++++++++++++++++++++++---------
2 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/man/man8/vdpa-dev.8 b/man/man8/vdpa-dev.8
index 5d3a3f26..5c5ac469 100644
--- a/man/man8/vdpa-dev.8
+++ b/man/man8/vdpa-dev.8
@@ -31,6 +31,7 @@ vdpa-dev \- vdpa device configuration
.I NAME
.B mgmtdev
.I MGMTDEV
+.RI "[ mac " MACADDR " ]"
.ti -8
.B vdpa dev del
@@ -63,6 +64,11 @@ Name of the new vdpa device to add.
.BI mgmtdev " MGMTDEV"
Name of the management device to use for device addition.
+.PP
+.BI mac " MACADDR"
+- specifies the mac address for the new vdpa device.
+This is applicable only for the network type of vdpa device. This is optional.
+
.SS vdpa dev del - Delete the vdpa device.
.PP
@@ -98,6 +104,11 @@ vdpa dev add name foo mgmtdev vdpa_sim_net
Add the vdpa device named foo on the management device vdpa_sim_net.
.RE
.PP
+vdpa dev add name foo mgmtdev vdpa_sim_net mac 00:11:22:33:44:55
+.RS 4
+Add the vdpa device named foo on the management device vdpa_sim_net with mac address of 00:11:22:33:44:55.
+.RE
+.PP
vdpa dev del foo
.RS 4
Delete the vdpa device named foo which was previously created.
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index ba704254..63d464d1 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -4,6 +4,7 @@
#include <getopt.h>
#include <errno.h>
#include <linux/genetlink.h>
+#include <linux/if_ether.h>
#include <linux/vdpa.h>
#include <linux/virtio_ids.h>
#include <linux/virtio_net.h>
@@ -20,6 +21,7 @@
#define VDPA_OPT_VDEV_MGMTDEV_HANDLE BIT(1)
#define VDPA_OPT_VDEV_NAME BIT(2)
#define VDPA_OPT_VDEV_HANDLE BIT(3)
+#define VDPA_OPT_VDEV_MAC BIT(4)
struct vdpa_opts {
uint64_t present; /* flags of present items */
@@ -27,6 +29,7 @@ struct vdpa_opts {
char *mdev_name;
const char *vdev_name;
unsigned int device_id;
+ char mac[ETH_ALEN];
};
struct vdpa {
@@ -136,6 +139,21 @@ static int vdpa_argv_str(struct vdpa *vdpa, int argc, char **argv,
return 0;
}
+static int vdpa_argv_mac(struct vdpa *vdpa, int argc, char **argv, char *mac)
+{
+ int alen;
+
+ if (argc <= 0 || *argv == NULL) {
+ fprintf(stderr, "String parameter expected\n");
+ return -EINVAL;
+ }
+
+ alen = ll_addr_a2n(mac, ETH_ALEN, *argv);
+ if (alen < 0)
+ return -EINVAL;
+ return 0;
+}
+
struct vdpa_args_metadata {
uint64_t o_flag;
const char *err_msg;
@@ -183,13 +201,16 @@ static void vdpa_opts_put(struct nlmsghdr *nlh, struct vdpa *vdpa)
if ((opts->present & VDPA_OPT_VDEV_NAME) ||
(opts->present & VDPA_OPT_VDEV_HANDLE))
mnl_attr_put_strz(nlh, VDPA_ATTR_DEV_NAME, opts->vdev_name);
+ if (opts->present & VDPA_OPT_VDEV_MAC)
+ mnl_attr_put(nlh, VDPA_ATTR_DEV_NET_CFG_MACADDR,
+ sizeof(opts->mac), opts->mac);
}
static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
- uint64_t o_required)
+ uint64_t o_required, uint64_t o_optional)
{
+ uint64_t o_all = o_required | o_optional;
struct vdpa_opts *opts = &vdpa->opts;
- uint64_t o_all = o_required;
uint64_t o_found = 0;
int err;
@@ -233,6 +254,15 @@ static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
NEXT_ARG_FWD();
o_found |= VDPA_OPT_VDEV_MGMTDEV_HANDLE;
+ } else if ((strcmp(*argv, "mac") == 0) &&
+ (o_all & VDPA_OPT_VDEV_MAC)) {
+ NEXT_ARG_FWD();
+ err = vdpa_argv_mac(vdpa, argc, argv, opts->mac);
+ if (err)
+ return err;
+
+ NEXT_ARG_FWD();
+ o_found |= VDPA_OPT_VDEV_MAC;
} else {
fprintf(stderr, "Unknown option \"%s\"\n", *argv);
return -EINVAL;
@@ -246,11 +276,11 @@ static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
static int vdpa_argv_parse_put(struct nlmsghdr *nlh, struct vdpa *vdpa,
int argc, char **argv,
- uint64_t o_required)
+ uint64_t o_required, uint64_t o_optional)
{
int err;
- err = vdpa_argv_parse(vdpa, argc, argv, o_required);
+ err = vdpa_argv_parse(vdpa, argc, argv, o_required, o_optional);
if (err)
return err;
vdpa_opts_put(nlh, vdpa);
@@ -386,7 +416,7 @@ static int cmd_mgmtdev_show(struct vdpa *vdpa, int argc, char **argv)
flags);
if (argc > 0) {
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
- VDPA_OPT_MGMTDEV_HANDLE);
+ VDPA_OPT_MGMTDEV_HANDLE, 0);
if (err)
return err;
}
@@ -413,7 +443,7 @@ static int cmd_mgmtdev(struct vdpa *vdpa, int argc, char **argv)
static void cmd_dev_help(void)
{
fprintf(stderr, "Usage: vdpa dev show [ DEV ]\n");
- fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV\n");
+ fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV [ mac MACADDR ]\n");
fprintf(stderr, " vdpa dev del DEV\n");
fprintf(stderr, "Usage: vdpa dev config COMMAND [ OPTIONS ]\n");
}
@@ -483,7 +513,7 @@ static int cmd_dev_show(struct vdpa *vdpa, int argc, char **argv)
nlh = mnlu_gen_socket_cmd_prepare(&vdpa->nlg, VDPA_CMD_DEV_GET, flags);
if (argc > 0) {
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
- VDPA_OPT_VDEV_HANDLE);
+ VDPA_OPT_VDEV_HANDLE, 0);
if (err)
return err;
}
@@ -502,7 +532,8 @@ static int cmd_dev_add(struct vdpa *vdpa, int argc, char **argv)
nlh = mnlu_gen_socket_cmd_prepare(&vdpa->nlg, VDPA_CMD_DEV_NEW,
NLM_F_REQUEST | NLM_F_ACK);
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
- VDPA_OPT_VDEV_MGMTDEV_HANDLE | VDPA_OPT_VDEV_NAME);
+ VDPA_OPT_VDEV_MGMTDEV_HANDLE | VDPA_OPT_VDEV_NAME,
+ VDPA_OPT_VDEV_MAC);
if (err)
return err;
@@ -516,7 +547,8 @@ static int cmd_dev_del(struct vdpa *vdpa, int argc, char **argv)
nlh = mnlu_gen_socket_cmd_prepare(&vdpa->nlg, VDPA_CMD_DEV_DEL,
NLM_F_REQUEST | NLM_F_ACK);
- err = vdpa_argv_parse_put(nlh, vdpa, argc, argv, VDPA_OPT_VDEV_HANDLE);
+ err = vdpa_argv_parse_put(nlh, vdpa, argc, argv, VDPA_OPT_VDEV_HANDLE,
+ 0);
if (err)
return err;
@@ -597,7 +629,7 @@ static int cmd_dev_config_show(struct vdpa *vdpa, int argc, char **argv)
flags);
if (argc > 0) {
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
- VDPA_OPT_VDEV_HANDLE);
+ VDPA_OPT_VDEV_HANDLE, 0);
if (err)
return err;
}
--
2.34.1
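
The patch above relies on iproute2's ll_addr_a2n() to turn the "mac" argument into ETH_ALEN bytes. As a rough, self-contained sketch of that step, the hypothetical parse_mac() below does the same conversion with sscanf(). It is an assumption-laden stand-in, not the library routine, and it is more lenient than a production parser (for instance, single-digit groups are accepted).

```c
#include <stdio.h>

#define MAC_LEN 6	/* same value as ETH_ALEN in <linux/if_ether.h> */

/* Hypothetical stand-in for ll_addr_a2n(): parse "aa:bb:cc:dd:ee:ff"
 * into MAC_LEN bytes. Returns 0 on success, -1 on malformed input. */
static int parse_mac(const char *str, unsigned char mac[MAC_LEN])
{
	unsigned int b[MAC_LEN];
	char trailing;
	int i, n;

	/* The final %c only matches if something follows the sixth group,
	 * so a well-formed address yields exactly MAC_LEN conversions. */
	n = sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing);
	if (n != MAC_LEN)
		return -1;

	for (i = 0; i < MAC_LEN; i++)
		mac[i] = (unsigned char)b[i];
	return 0;
}
```

The parsed bytes are what vdpa_opts_put() would then place into the VDPA_ATTR_DEV_NET_CFG_MACADDR attribute.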

@ -0,0 +1,158 @@
From 6af0b1f5f1848689ee3c4cd00af224309185c644 Mon Sep 17 00:00:00 2001
Message-Id: <6af0b1f5f1848689ee3c4cd00af224309185c644.1643220552.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1643220552.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Wed, 26 Jan 2022 10:39:38 +0100
Subject: [PATCH] vdpa: Enable user to set mtu of the vdpa device
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2036880
Upstream Status: iproute2.git commit 167e33f3
commit 167e33f3be88c0fbe206df25145b850ddf3897a2
Author: Parav Pandit <parav@nvidia.com>
Date: Fri Dec 17 10:08:27 2021 +0200
vdpa: Enable user to set mtu of the vdpa device
Implement mtu setting for vdpa device.
$ vdpa mgmtdev show
vdpasim_net:
supported_classes net
Add the device with mac address and mtu:
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000
In the above command, it is also possible to set only the mac address or only the mtu.
View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
man/man8/vdpa-dev.8 | 10 ++++++++++
vdpa/vdpa.c | 28 ++++++++++++++++++++++++++--
2 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/man/man8/vdpa-dev.8 b/man/man8/vdpa-dev.8
index 5c5ac469..aa21ae3a 100644
--- a/man/man8/vdpa-dev.8
+++ b/man/man8/vdpa-dev.8
@@ -32,6 +32,7 @@ vdpa-dev \- vdpa device configuration
.B mgmtdev
.I MGMTDEV
.RI "[ mac " MACADDR " ]"
+.RI "[ mtu " MTU " ]"
.ti -8
.B vdpa dev del
@@ -69,6 +70,10 @@ Name of the management device to use for device addition.
- specifies the mac address for the new vdpa device.
This is applicable only for the network type of vdpa device. This is optional.
+.BI mtu " MTU"
+- specifies the mtu for the new vdpa device.
+This is applicable only for the network type of vdpa device. This is optional.
+
.SS vdpa dev del - Delete the vdpa device.
.PP
@@ -109,6 +114,11 @@ vdpa dev add name foo mgmtdev vdpa_sim_net mac 00:11:22:33:44:55
Add the vdpa device named foo on the management device vdpa_sim_net with mac address of 00:11:22:33:44:55.
.RE
.PP
+vdpa dev add name foo mgmtdev vdpa_sim_net mac 00:11:22:33:44:55 mtu 9000
+.RS 4
+Add the vdpa device named foo on the management device vdpa_sim_net with mac address of 00:11:22:33:44:55 and mtu of 9000 bytes.
+.RE
+.PP
vdpa dev del foo
.RS 4
Delete the vdpa device named foo which was previously created.
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index 63d464d1..f048e470 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -22,6 +22,7 @@
#define VDPA_OPT_VDEV_NAME BIT(2)
#define VDPA_OPT_VDEV_HANDLE BIT(3)
#define VDPA_OPT_VDEV_MAC BIT(4)
+#define VDPA_OPT_VDEV_MTU BIT(5)
struct vdpa_opts {
uint64_t present; /* flags of present items */
@@ -30,6 +31,7 @@ struct vdpa_opts {
const char *vdev_name;
unsigned int device_id;
char mac[ETH_ALEN];
+ uint16_t mtu;
};
struct vdpa {
@@ -154,6 +156,17 @@ static int vdpa_argv_mac(struct vdpa *vdpa, int argc, char **argv, char *mac)
return 0;
}
+static int vdpa_argv_u16(struct vdpa *vdpa, int argc, char **argv,
+ uint16_t *result)
+{
+ if (argc <= 0 || *argv == NULL) {
+ fprintf(stderr, "number expected\n");
+ return -EINVAL;
+ }
+
+ return get_u16(result, *argv, 10);
+}
+
struct vdpa_args_metadata {
uint64_t o_flag;
const char *err_msg;
@@ -204,6 +217,8 @@ static void vdpa_opts_put(struct nlmsghdr *nlh, struct vdpa *vdpa)
if (opts->present & VDPA_OPT_VDEV_MAC)
mnl_attr_put(nlh, VDPA_ATTR_DEV_NET_CFG_MACADDR,
sizeof(opts->mac), opts->mac);
+ if (opts->present & VDPA_OPT_VDEV_MTU)
+ mnl_attr_put_u16(nlh, VDPA_ATTR_DEV_NET_CFG_MTU, opts->mtu);
}
static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
@@ -263,6 +278,15 @@ static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
NEXT_ARG_FWD();
o_found |= VDPA_OPT_VDEV_MAC;
+ } else if ((strcmp(*argv, "mtu") == 0) &&
+ (o_all & VDPA_OPT_VDEV_MTU)) {
+ NEXT_ARG_FWD();
+ err = vdpa_argv_u16(vdpa, argc, argv, &opts->mtu);
+ if (err)
+ return err;
+
+ NEXT_ARG_FWD();
+ o_found |= VDPA_OPT_VDEV_MTU;
} else {
fprintf(stderr, "Unknown option \"%s\"\n", *argv);
return -EINVAL;
@@ -443,7 +467,7 @@ static int cmd_mgmtdev(struct vdpa *vdpa, int argc, char **argv)
static void cmd_dev_help(void)
{
fprintf(stderr, "Usage: vdpa dev show [ DEV ]\n");
- fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV [ mac MACADDR ]\n");
+ fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV [ mac MACADDR ] [ mtu MTU ]\n");
fprintf(stderr, " vdpa dev del DEV\n");
fprintf(stderr, "Usage: vdpa dev config COMMAND [ OPTIONS ]\n");
}
@@ -533,7 +557,7 @@ static int cmd_dev_add(struct vdpa *vdpa, int argc, char **argv)
NLM_F_REQUEST | NLM_F_ACK);
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
VDPA_OPT_VDEV_MGMTDEV_HANDLE | VDPA_OPT_VDEV_NAME,
- VDPA_OPT_VDEV_MAC);
+ VDPA_OPT_VDEV_MAC | VDPA_OPT_VDEV_MTU);
if (err)
return err;
--
2.34.1
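
vdpa_argv_u16() in the patch above delegates to iproute2's get_u16() for the "mtu" value. The sketch below shows the kind of checks such a parser needs (empty input, trailing garbage, range). The name parse_u16() and its exact return convention are assumptions made for illustration; this is not the iproute2 implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for get_u16(): parse `arg` in the given base and
 * reject empty, partially numeric or out-of-range input.
 * Returns 0 on success, -1 on error. */
static int parse_u16(uint16_t *val, const char *arg, int base)
{
	unsigned long res;
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoul(arg, &end, base);
	if (errno || end == arg || *end != '\0' || res > UINT16_MAX)
		return -1;

	*val = (uint16_t)res;
	return 0;
}
```

A value such as 9000 fits comfortably, while anything above 65535 is rejected before it could be truncated into the 16-bit VDPA_ATTR_DEV_NET_CFG_MTU attribute.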

@ -0,0 +1,206 @@
From 0a250b280fbaf8e4d6ad173cf6d9e082658954b4 Mon Sep 17 00:00:00 2001
Message-Id: <0a250b280fbaf8e4d6ad173cf6d9e082658954b4.1644243783.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1644243783.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1644243783.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 7 Feb 2022 15:16:36 +0100
Subject: [PATCH] tc: u32: add support for json output
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1989591
Upstream Status: unknown commit c733722b
commit c733722b993cb82832722b1490cbc5002035fd20
Author: Wen Liang <liangwen12year@gmail.com>
Date: Wed Jan 26 14:44:47 2022 -0500
tc: u32: add support for json output
Currently u32 filter output does not support json. This commit uses
proper json functions to add support for it.
`sprint_u32_handle` adds an extra space after the raw check; remove the
extra space.
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
tc/f_u32.c | 83 ++++++++++++++++++++++++++++++------------------------
1 file changed, 46 insertions(+), 37 deletions(-)
diff --git a/tc/f_u32.c b/tc/f_u32.c
index a5747f67..11da202e 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -109,7 +109,7 @@ static char *sprint_u32_handle(__u32 handle, char *buf)
}
}
if (show_raw)
- snprintf(b, bsize, "[%08x] ", handle);
+ snprintf(b, bsize, "[%08x]", handle);
return buf;
}
@@ -1213,11 +1213,11 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
if (handle) {
SPRINT_BUF(b1);
- fprintf(f, "fh %s ", sprint_u32_handle(handle, b1));
+ print_string(PRINT_ANY, "fh", "fh %s ", sprint_u32_handle(handle, b1));
}
if (TC_U32_NODE(handle))
- fprintf(f, "order %d ", TC_U32_NODE(handle));
+ print_int(PRINT_ANY, "order", "order %d ", TC_U32_NODE(handle));
if (tb[TCA_U32_SEL]) {
if (RTA_PAYLOAD(tb[TCA_U32_SEL]) < sizeof(*sel))
@@ -1227,15 +1227,15 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
}
if (tb[TCA_U32_DIVISOR]) {
- fprintf(f, "ht divisor %d ",
- rta_getattr_u32(tb[TCA_U32_DIVISOR]));
+ __u32 htdivisor = rta_getattr_u32(tb[TCA_U32_DIVISOR]);
+
+ print_int(PRINT_ANY, "ht_divisor", "ht divisor %d ", htdivisor);
} else if (tb[TCA_U32_HASH]) {
__u32 htid = rta_getattr_u32(tb[TCA_U32_HASH]);
-
- fprintf(f, "key ht %x bkt %x ", TC_U32_USERHTID(htid),
- TC_U32_HASH(htid));
+ print_hex(PRINT_ANY, "key_ht", "key ht %x ", TC_U32_USERHTID(htid));
+ print_hex(PRINT_ANY, "bkt", "bkt %x ", TC_U32_HASH(htid));
} else {
- fprintf(f, "??? ");
+ fprintf(stderr, "divisor and hash missing ");
}
if (tb[TCA_U32_CLASSID]) {
SPRINT_BUF(b1);
@@ -1244,27 +1244,27 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
sprint_tc_classid(rta_getattr_u32(tb[TCA_U32_CLASSID]),
b1));
} else if (sel && sel->flags & TC_U32_TERMINAL) {
- fprintf(f, "terminal flowid ??? ");
+ print_string(PRINT_FP, NULL, "terminal flowid ", NULL);
}
if (tb[TCA_U32_LINK]) {
SPRINT_BUF(b1);
- fprintf(f, "link %s ",
- sprint_u32_handle(rta_getattr_u32(tb[TCA_U32_LINK]),
- b1));
+ char *link = sprint_u32_handle(rta_getattr_u32(tb[TCA_U32_LINK]), b1);
+
+ print_string(PRINT_ANY, "link", "link %s ", link);
}
if (tb[TCA_U32_FLAGS]) {
__u32 flags = rta_getattr_u32(tb[TCA_U32_FLAGS]);
if (flags & TCA_CLS_FLAGS_SKIP_HW)
- fprintf(f, "skip_hw ");
+ print_bool(PRINT_ANY, "skip_hw", "skip_hw ", true);
if (flags & TCA_CLS_FLAGS_SKIP_SW)
- fprintf(f, "skip_sw ");
+ print_bool(PRINT_ANY, "skip_sw", "skip_sw ", true);
if (flags & TCA_CLS_FLAGS_IN_HW)
- fprintf(f, "in_hw ");
+ print_bool(PRINT_ANY, "in_hw", "in_hw ", true);
else if (flags & TCA_CLS_FLAGS_NOT_IN_HW)
- fprintf(f, "not_in_hw ");
+ print_bool(PRINT_ANY, "not_in_hw", "not_in_hw ", true);
}
if (tb[TCA_U32_PCNT]) {
@@ -1275,10 +1275,10 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
pf = RTA_DATA(tb[TCA_U32_PCNT]);
}
- if (sel && show_stats && NULL != pf)
- fprintf(f, " (rule hit %llu success %llu)",
- (unsigned long long) pf->rcnt,
- (unsigned long long) pf->rhit);
+ if (sel && show_stats && NULL != pf) {
+ print_u64(PRINT_ANY, "rule_hit", "(rule hit %llu ", pf->rcnt);
+ print_u64(PRINT_ANY, "success", "success %llu)", pf->rhit);
+ }
if (tb[TCA_U32_MARK]) {
struct tc_u32_mark *mark = RTA_DATA(tb[TCA_U32_MARK]);
@@ -1286,8 +1286,10 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
if (RTA_PAYLOAD(tb[TCA_U32_MARK]) < sizeof(*mark)) {
fprintf(f, "\n Invalid mark (kernel&iproute2 mismatch)\n");
} else {
- fprintf(f, "\n mark 0x%04x 0x%04x (success %d)",
- mark->val, mark->mask, mark->success);
+ print_nl();
+ print_0xhex(PRINT_ANY, "fwmark_value", " mark 0x%04x ", mark->val);
+ print_0xhex(PRINT_ANY, "fwmark_mask", "0x%04x ", mark->mask);
+ print_int(PRINT_ANY, "fwmark_success", "(success %d)", mark->success);
}
}
@@ -1298,38 +1300,45 @@ static int u32_print_opt(struct filter_util *qu, FILE *f, struct rtattr *opt,
for (i = 0; i < sel->nkeys; i++) {
show_keys(f, sel->keys + i);
if (show_stats && NULL != pf)
- fprintf(f, " (success %llu ) ",
- (unsigned long long) pf->kcnts[i]);
+ print_u64(PRINT_ANY, "success", " (success %llu ) ",
+ pf->kcnts[i]);
}
}
if (sel->flags & (TC_U32_VAROFFSET | TC_U32_OFFSET)) {
- fprintf(f, "\n offset ");
- if (sel->flags & TC_U32_VAROFFSET)
- fprintf(f, "%04x>>%d at %d ",
- ntohs(sel->offmask),
- sel->offshift, sel->offoff);
+ print_nl();
+ print_string(PRINT_ANY, NULL, "%s", " offset ");
+ if (sel->flags & TC_U32_VAROFFSET) {
+ print_hex(PRINT_ANY, "offset_mask", "%04x", ntohs(sel->offmask));
+ print_int(PRINT_ANY, "offset_shift", ">>%d ", sel->offshift);
+ print_int(PRINT_ANY, "offset_off", "at %d ", sel->offoff);
+ }
if (sel->off)
- fprintf(f, "plus %d ", sel->off);
+ print_int(PRINT_ANY, "plus", "plus %d ", sel->off);
}
if (sel->flags & TC_U32_EAT)
- fprintf(f, " eat ");
+ print_string(PRINT_ANY, NULL, "%s", " eat ");
if (sel->hmask) {
- fprintf(f, "\n hash mask %08x at %d ",
- (unsigned int)htonl(sel->hmask), sel->hoff);
+ print_nl();
+ unsigned int hmask = (unsigned int)htonl(sel->hmask);
+
+ print_hex(PRINT_ANY, "hash_mask", " hash mask %08x ", hmask);
+ print_int(PRINT_ANY, "hash_off", "at %d ", sel->hoff);
}
}
if (tb[TCA_U32_POLICE]) {
- fprintf(f, "\n");
+ print_nl();
tc_print_police(f, tb[TCA_U32_POLICE]);
}
if (tb[TCA_U32_INDEV]) {
struct rtattr *idev = tb[TCA_U32_INDEV];
-
- fprintf(f, "\n input dev %s\n", rta_getattr_str(idev));
+ print_nl();
+ print_string(PRINT_ANY, "input_dev", " input dev %s",
+ rta_getattr_str(idev));
+ print_nl();
}
if (tb[TCA_U32_ACT])
--
2.34.1
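
The conversion above replaces fprintf() calls with print_*(PRINT_ANY, key, fmt, value) helpers, so that one call site serves both plain-text and JSON output. The following is a deliberately simplified sketch of that idea. emit_uint() is a hypothetical helper that writes into a caller-supplied buffer; the real iproute2 helpers keep JSON writer state and handle separators, which this sketch does not attempt.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, much simplified analogue of
 * print_uint(PRINT_ANY, key, fmt, value): in JSON mode the key names the
 * member, in text mode the format string carries the label instead. */
static int emit_uint(char *buf, size_t len, bool json,
		     const char *key, const char *fmt, unsigned int value)
{
	if (json)
		return snprintf(buf, len, "\"%s\":%u", key, value);
	return snprintf(buf, len, fmt, value);
}
```

This is why every converted call in the patch gains a JSON key such as "order" or "ht_divisor" while keeping the original text format string.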

@ -0,0 +1,240 @@
From 66efa0a6dc179f814614fbd2f47c37d7e20e4405 Mon Sep 17 00:00:00 2001
Message-Id: <66efa0a6dc179f814614fbd2f47c37d7e20e4405.1644243783.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1644243783.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1644243783.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 7 Feb 2022 15:16:36 +0100
Subject: [PATCH] tc: u32: add json support in `print_raw`, `print_ipv4`,
`print_ipv6`
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1989591
Upstream Status: unknown commit 721435dc
commit 721435dcfd9274277af2fb6a4cec81d4a9bcc6b4
Author: Wen Liang <liangwen12year@gmail.com>
Date: Wed Jan 26 14:44:48 2022 -0500
tc: u32: add json support in `print_raw`, `print_ipv4`, `print_ipv6`
Currently the key struct of u32 filter does not support json. This
commit adds json support for showing key.
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
tc/f_u32.c | 121 ++++++++++++++++++++++++++++++++++-------------------
1 file changed, 79 insertions(+), 42 deletions(-)
diff --git a/tc/f_u32.c b/tc/f_u32.c
index 11da202e..d787eb91 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -824,23 +824,27 @@ static void print_ipv4(FILE *f, const struct tc_u32_key *key)
{
char abuf[256];
+ open_json_object("match");
switch (key->off) {
case 0:
switch (ntohl(key->mask)) {
case 0x0f000000:
- fprintf(f, "\n match IP ihl %u",
- ntohl(key->val) >> 24);
+ print_nl();
+ print_uint(PRINT_ANY, "ip_ihl", " match IP ihl %u",
+ ntohl(key->val) >> 24);
return;
case 0x00ff0000:
- fprintf(f, "\n match IP dsfield %#x",
- ntohl(key->val) >> 16);
+ print_nl();
+ print_0xhex(PRINT_ANY, "ip_dsfield", " match IP dsfield %#x",
+ ntohl(key->val) >> 16);
return;
}
break;
case 8:
if (ntohl(key->mask) == 0x00ff0000) {
- fprintf(f, "\n match IP protocol %d",
- ntohl(key->val) >> 16);
+ print_nl();
+ print_int(PRINT_ANY, "ip_protocol", " match IP protocol %d",
+ ntohl(key->val) >> 16);
return;
}
break;
@@ -849,11 +853,21 @@ static void print_ipv4(FILE *f, const struct tc_u32_key *key)
int bits = mask2bits(key->mask);
if (bits >= 0) {
- fprintf(f, "\n %s %s/%d",
- key->off == 12 ? "match IP src" : "match IP dst",
- inet_ntop(AF_INET, &key->val,
- abuf, sizeof(abuf)),
- bits);
+ const char *addr;
+
+ if (key->off == 12) {
+ print_nl();
+ print_null(PRINT_FP, NULL, " match IP src ", NULL);
+ open_json_object("src");
+ } else {
+ print_nl();
+ print_null(PRINT_FP, NULL, " match IP dst ", NULL);
+ open_json_object("dst");
+ }
+ addr = inet_ntop(AF_INET, &key->val, abuf, sizeof(abuf));
+ print_string(PRINT_ANY, "address", "%s", addr);
+ print_int(PRINT_ANY, "prefixlen", "/%d", bits);
+ close_json_object();
return;
}
}
@@ -862,45 +876,52 @@ static void print_ipv4(FILE *f, const struct tc_u32_key *key)
case 20:
switch (ntohl(key->mask)) {
case 0x0000ffff:
- fprintf(f, "\n match dport %u",
- ntohl(key->val) & 0xffff);
+ print_uint(PRINT_ANY, "dport", "match dport %u",
+ ntohl(key->val) & 0xffff);
return;
case 0xffff0000:
- fprintf(f, "\n match sport %u",
- ntohl(key->val) >> 16);
+ print_nl();
+ print_uint(PRINT_ANY, "sport", " match sport %u",
+ ntohl(key->val) >> 16);
return;
case 0xffffffff:
- fprintf(f, "\n match dport %u, match sport %u",
- ntohl(key->val) & 0xffff,
- ntohl(key->val) >> 16);
-
+ print_nl();
+ print_uint(PRINT_ANY, "dport", " match dport %u, ",
+ ntohl(key->val) & 0xffff);
+ print_uint(PRINT_ANY, "sport", "match sport %u",
+ ntohl(key->val) >> 16);
return;
}
/* XXX: Default print_raw */
}
+ close_json_object();
}
static void print_ipv6(FILE *f, const struct tc_u32_key *key)
{
char abuf[256];
+ open_json_object("match");
switch (key->off) {
case 0:
switch (ntohl(key->mask)) {
case 0x0f000000:
- fprintf(f, "\n match IP ihl %u",
- ntohl(key->val) >> 24);
+ print_nl();
+ print_uint(PRINT_ANY, "ip_ihl", " match IP ihl %u",
+ ntohl(key->val) >> 24);
return;
case 0x00ff0000:
- fprintf(f, "\n match IP dsfield %#x",
- ntohl(key->val) >> 16);
+ print_nl();
+ print_0xhex(PRINT_ANY, "ip_dsfield", " match IP dsfield %#x",
+ ntohl(key->val) >> 16);
return;
}
break;
case 8:
if (ntohl(key->mask) == 0x00ff0000) {
- fprintf(f, "\n match IP protocol %d",
- ntohl(key->val) >> 16);
+ print_nl();
+ print_int(PRINT_ANY, "ip_protocol", " match IP protocol %d",
+ ntohl(key->val) >> 16);
return;
}
break;
@@ -909,11 +930,21 @@ static void print_ipv6(FILE *f, const struct tc_u32_key *key)
int bits = mask2bits(key->mask);
if (bits >= 0) {
- fprintf(f, "\n %s %s/%d",
- key->off == 12 ? "match IP src" : "match IP dst",
- inet_ntop(AF_INET, &key->val,
- abuf, sizeof(abuf)),
- bits);
+ const char *addr;
+
+ if (key->off == 12) {
+ print_nl();
+ print_null(PRINT_FP, NULL, " match IP src ", NULL);
+ open_json_object("src");
+ } else {
+ print_nl();
+ print_null(PRINT_FP, NULL, " match IP dst ", NULL);
+ open_json_object("dst");
+ }
+ addr = inet_ntop(AF_INET, &key->val, abuf, sizeof(abuf));
+ print_string(PRINT_ANY, "address", "%s", addr);
+ print_int(PRINT_ANY, "prefixlen", "/%d", bits);
+ close_json_object();
return;
}
}
@@ -922,31 +953,37 @@ static void print_ipv6(FILE *f, const struct tc_u32_key *key)
case 20:
switch (ntohl(key->mask)) {
case 0x0000ffff:
- fprintf(f, "\n match sport %u",
- ntohl(key->val) & 0xffff);
+ print_nl();
+ print_uint(PRINT_ANY, "sport", " match sport %u",
+ ntohl(key->val) & 0xffff);
return;
case 0xffff0000:
- fprintf(f, "\n match dport %u",
- ntohl(key->val) >> 16);
+ print_uint(PRINT_ANY, "dport", "match dport %u",
+ ntohl(key->val) >> 16);
return;
case 0xffffffff:
- fprintf(f, "\n match sport %u, match dport %u",
- ntohl(key->val) & 0xffff,
- ntohl(key->val) >> 16);
+ print_nl();
+ print_uint(PRINT_ANY, "sport", " match sport %u, ",
+ ntohl(key->val) & 0xffff);
+ print_uint(PRINT_ANY, "dport", "match dport %u",
+ ntohl(key->val) >> 16);
return;
}
/* XXX: Default print_raw */
}
+ close_json_object();
}
static void print_raw(FILE *f, const struct tc_u32_key *key)
{
- fprintf(f, "\n match %08x/%08x at %s%d",
- (unsigned int)ntohl(key->val),
- (unsigned int)ntohl(key->mask),
- key->offmask ? "nexthdr+" : "",
- key->off);
+ open_json_object("match");
+ print_nl();
+ print_hex(PRINT_ANY, "value", " match %08x", (unsigned int)ntohl(key->val));
+ print_hex(PRINT_ANY, "mask", "/%08x ", (unsigned int)ntohl(key->mask));
+ print_string(PRINT_ANY, "offmask", "at %s", key->offmask ? "nexthdr+" : "");
+ print_int(PRINT_ANY, "off", "%d", key->off);
+ close_json_object();
}
static const struct {
--
2.34.1
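
For the key at offset 20, print_ipv4() in the patch above derives the ports from a single 32-bit word: the upper half (mask 0xffff0000) is printed as sport and the lower half (mask 0x0000ffff) as dport. The sketch below isolates that arithmetic. split_ports() is a hypothetical name used only for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Illustrative only: split a 32-bit word held in network byte order into
 * the two 16-bit ports, using the same shifts and masks as print_ipv4(). */
static void split_ports(uint32_t val_be, uint16_t *sport, uint16_t *dport)
{
	uint32_t host = ntohl(val_be);

	*sport = (uint16_t)(host >> 16);	/* bits covered by 0xffff0000 */
	*dport = (uint16_t)(host & 0xffff);	/* bits covered by 0x0000ffff */
}
```

For example, the word 0x1f900050 splits into source port 8080 and destination port 80.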

@ -0,0 +1,397 @@
From e62789f726b0d7fb1f3102e7cb26f2a59fd74231 Mon Sep 17 00:00:00 2001
Message-Id: <e62789f726b0d7fb1f3102e7cb26f2a59fd74231.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:01:57 +0100
Subject: [PATCH] Update kernel headers and import virtio_net
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2.git commit 5cb7ec0c
Conflicts: cherry-picked virtio_net header update, exclude unrelated bpf
and mptcp parts.
commit 5cb7ec0c8d554a7ea32c2f924d7a2fc66af4544a
Author: David Ahern <dsahern@kernel.org>
Date: Sat Dec 18 14:00:29 2021 -0700
Update kernel headers and import virtio_net
Update kernel headers to commit:
f85b244ee395 ("xdp: move the if dev statements to the first")
and import virtio_net.h for vdpa.
Signed-off-by: David Ahern <dsahern@kernel.org>
---
include/uapi/linux/virtio_net.h | 358 ++++++++++++++++++++++++++++++++
1 file changed, 358 insertions(+)
create mode 100644 include/uapi/linux/virtio_net.h
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
new file mode 100644
index 00000000..ab08237f
--- /dev/null
+++ b/include/uapi/linux/virtio_net.h
@@ -0,0 +1,358 @@
+#ifndef _LINUX_VIRTIO_NET_H
+#define _LINUX_VIRTIO_NET_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE. */
+#include <linux/types.h>
+#include <linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_types.h>
+#include <linux/if_ether.h>
+
+/* The feature bitmap for virtio net */
+#define VIRTIO_NET_F_CSUM 0 /* Host handles pkts w/ partial csum */
+#define VIRTIO_NET_F_GUEST_CSUM 1 /* Guest handles pkts w/ partial csum */
+#define VIRTIO_NET_F_CTRL_GUEST_OFFLOADS 2 /* Dynamic offload configuration. */
+#define VIRTIO_NET_F_MTU 3 /* Initial MTU advice */
+#define VIRTIO_NET_F_MAC 5 /* Host has given MAC address. */
+#define VIRTIO_NET_F_GUEST_TSO4 7 /* Guest can handle TSOv4 in. */
+#define VIRTIO_NET_F_GUEST_TSO6 8 /* Guest can handle TSOv6 in. */
+#define VIRTIO_NET_F_GUEST_ECN 9 /* Guest can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_GUEST_UFO 10 /* Guest can handle UFO in. */
+#define VIRTIO_NET_F_HOST_TSO4 11 /* Host can handle TSOv4 in. */
+#define VIRTIO_NET_F_HOST_TSO6 12 /* Host can handle TSOv6 in. */
+#define VIRTIO_NET_F_HOST_ECN 13 /* Host can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_HOST_UFO 14 /* Host can handle UFO in. */
+#define VIRTIO_NET_F_MRG_RXBUF 15 /* Host can merge receive buffers. */
+#define VIRTIO_NET_F_STATUS 16 /* virtio_net_config.status available */
+#define VIRTIO_NET_F_CTRL_VQ 17 /* Control channel available */
+#define VIRTIO_NET_F_CTRL_RX 18 /* Control channel RX mode support */
+#define VIRTIO_NET_F_CTRL_VLAN 19 /* Control channel VLAN filtering */
+#define VIRTIO_NET_F_CTRL_RX_EXTRA 20 /* Extra RX mode control support */
+#define VIRTIO_NET_F_GUEST_ANNOUNCE 21 /* Guest can announce device on the
+ * network */
+#define VIRTIO_NET_F_MQ 22 /* Device supports Receive Flow
+ * Steering */
+#define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
+
+#define VIRTIO_NET_F_HASH_REPORT 57 /* Supports hash report */
+#define VIRTIO_NET_F_RSS 60 /* Supports RSS RX steering */
+#define VIRTIO_NET_F_RSC_EXT 61 /* extended coalescing info */
+#define VIRTIO_NET_F_STANDBY 62 /* Act as standby for another device
+ * with the same MAC.
+ */
+#define VIRTIO_NET_F_SPEED_DUPLEX 63 /* Device set linkspeed and duplex */
+
+#ifndef VIRTIO_NET_NO_LEGACY
+#define VIRTIO_NET_F_GSO 6 /* Host handles pkts w/ any GSO type */
+#endif /* VIRTIO_NET_NO_LEGACY */
+
+#define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
+#define VIRTIO_NET_S_ANNOUNCE 2 /* Announcement is needed */
+
+/* supported/enabled hash types */
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv4 (1 << 0)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv6 (1 << 3)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
+#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX (1 << 7)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX (1 << 8)
+
+struct virtio_net_config {
+ /* The config defining mac address (if VIRTIO_NET_F_MAC) */
+ __u8 mac[ETH_ALEN];
+ /* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
+ __virtio16 status;
+ /* Maximum number of each of transmit and receive queues;
+ * see VIRTIO_NET_F_MQ and VIRTIO_NET_CTRL_MQ.
+ * Legal values are between 1 and 0x8000
+ */
+ __virtio16 max_virtqueue_pairs;
+ /* Default maximum transmit unit advice */
+ __virtio16 mtu;
+ /*
+ * speed, in units of 1Mb. All values 0 to INT_MAX are legal.
+ * Any other value stands for unknown.
+ */
+ __le32 speed;
+ /*
+ * 0x00 - half duplex
+ * 0x01 - full duplex
+ * Any other value stands for unknown.
+ */
+ __u8 duplex;
+ /* maximum size of RSS key */
+ __u8 rss_max_key_size;
+ /* maximum number of indirection table entries */
+ __le16 rss_max_indirection_table_length;
+ /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
+ __le32 supported_hash_types;
+} __attribute__((packed));
+
+/*
+ * This header comes first in the scatter-gather list. If you don't
+ * specify GSO or CSUM features, you can simply ignore the header.
+ *
+ * This is bitwise-equivalent to the legacy struct virtio_net_hdr_mrg_rxbuf,
+ * only flattened.
+ */
+struct virtio_net_hdr_v1 {
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
+#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO 4 /* rsc info in csum_ fields */
+ __u8 flags;
+#define VIRTIO_NET_HDR_GSO_NONE 0 /* Not a GSO frame */
+#define VIRTIO_NET_HDR_GSO_TCPV4 1 /* GSO frame, IPv4 TCP (TSO) */
+#define VIRTIO_NET_HDR_GSO_UDP 3 /* GSO frame, IPv4 UDP (UFO) */
+#define VIRTIO_NET_HDR_GSO_TCPV6 4 /* GSO frame, IPv6 TCP */
+#define VIRTIO_NET_HDR_GSO_ECN 0x80 /* TCP has ECN set */
+ __u8 gso_type;
+ __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
+ __virtio16 gso_size; /* Bytes to append to hdr_len per frame */
+ union {
+ struct {
+ __virtio16 csum_start;
+ __virtio16 csum_offset;
+ };
+ /* Checksum calculation */
+ struct {
+ /* Position to start checksumming from */
+ __virtio16 start;
+ /* Offset after that to place checksum */
+ __virtio16 offset;
+ } csum;
+ /* Receive Segment Coalescing */
+ struct {
+ /* Number of coalesced segments */
+ __le16 segments;
+ /* Number of duplicated acks */
+ __le16 dup_acks;
+ } rsc;
+ };
+ __virtio16 num_buffers; /* Number of merged rx buffers */
+};
+
+struct virtio_net_hdr_v1_hash {
+ struct virtio_net_hdr_v1 hdr;
+ __le32 hash_value;
+#define VIRTIO_NET_HASH_REPORT_NONE 0
+#define VIRTIO_NET_HASH_REPORT_IPv4 1
+#define VIRTIO_NET_HASH_REPORT_TCPv4 2
+#define VIRTIO_NET_HASH_REPORT_UDPv4 3
+#define VIRTIO_NET_HASH_REPORT_IPv6 4
+#define VIRTIO_NET_HASH_REPORT_TCPv6 5
+#define VIRTIO_NET_HASH_REPORT_UDPv6 6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX 8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9
+ __le16 hash_report;
+ __le16 padding;
+};
+
+#ifndef VIRTIO_NET_NO_LEGACY
+/* This header comes first in the scatter-gather list.
+ * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
+ * be the first element of the scatter-gather list. If you don't
+ * specify GSO or CSUM features, you can simply ignore the header. */
+struct virtio_net_hdr {
+ /* See VIRTIO_NET_HDR_F_* */
+ __u8 flags;
+ /* See VIRTIO_NET_HDR_GSO_* */
+ __u8 gso_type;
+ __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
+ __virtio16 gso_size; /* Bytes to append to hdr_len per frame */
+ __virtio16 csum_start; /* Position to start checksumming from */
+ __virtio16 csum_offset; /* Offset after that to place checksum */
+};
+
+/* This is the version of the header to use when the MRG_RXBUF
+ * feature has been negotiated. */
+struct virtio_net_hdr_mrg_rxbuf {
+ struct virtio_net_hdr hdr;
+ __virtio16 num_buffers; /* Number of merged rx buffers */
+};
+#endif /* ...VIRTIO_NET_NO_LEGACY */
+
+/*
+ * Control virtqueue data structures
+ *
+ * The control virtqueue expects a header in the first sg entry
+ * and an ack/status response in the last entry. Data for the
+ * command goes in between.
+ */
+struct virtio_net_ctrl_hdr {
+ __u8 class;
+ __u8 cmd;
+} __attribute__((packed));
+
+typedef __u8 virtio_net_ctrl_ack;
+
+#define VIRTIO_NET_OK 0
+#define VIRTIO_NET_ERR 1
+
+/*
+ * Control the RX mode, i.e. promiscuous, allmulti, etc.
+ * All commands require an "out" sg entry containing a 1 byte
+ * state value, zero = disable, non-zero = enable. Commands
+ * 0 and 1 are supported with the VIRTIO_NET_F_CTRL_RX feature.
+ * Commands 2-5 are added with VIRTIO_NET_F_CTRL_RX_EXTRA.
+ */
+#define VIRTIO_NET_CTRL_RX 0
+ #define VIRTIO_NET_CTRL_RX_PROMISC 0
+ #define VIRTIO_NET_CTRL_RX_ALLMULTI 1
+ #define VIRTIO_NET_CTRL_RX_ALLUNI 2
+ #define VIRTIO_NET_CTRL_RX_NOMULTI 3
+ #define VIRTIO_NET_CTRL_RX_NOUNI 4
+ #define VIRTIO_NET_CTRL_RX_NOBCAST 5
+
+/*
+ * Control the MAC
+ *
+ * The MAC filter table is managed by the hypervisor, the guest should
+ * assume the size is infinite. Filtering should be considered
+ * non-perfect, i.e. based on hypervisor resources, the guest may
+ * receive packets from sources not specified in the filter list.
+ *
+ * In addition to the class/cmd header, the TABLE_SET command requires
+ * two out scatterlists. Each contains a 4 byte count of entries followed
+ * by a concatenated byte stream of the ETH_ALEN MAC addresses. The
+ * first sg list contains unicast addresses, the second is for multicast.
+ * This functionality is present if the VIRTIO_NET_F_CTRL_RX feature
+ * is available.
+ *
+ * The ADDR_SET command requests one out scatterlist, it contains a
+ * 6 bytes MAC address. This functionality is present if the
+ * VIRTIO_NET_F_CTRL_MAC_ADDR feature is available.
+ */
+struct virtio_net_ctrl_mac {
+ __virtio32 entries;
+ __u8 macs[][ETH_ALEN];
+} __attribute__((packed));
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+ #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+
+/*
+ * Control VLAN filtering
+ *
+ * The VLAN filter table is controlled via a simple ADD/DEL interface.
+ * VLAN IDs not added may be filtered by the hypervisor. Del is the
+ * opposite of add. Both commands expect an out entry containing a 2
+ * byte VLAN ID. VLAN filtering is available with the
+ * VIRTIO_NET_F_CTRL_VLAN feature bit.
+ */
+#define VIRTIO_NET_CTRL_VLAN 2
+ #define VIRTIO_NET_CTRL_VLAN_ADD 0
+ #define VIRTIO_NET_CTRL_VLAN_DEL 1
+
+/*
+ * Control link announce acknowledgement
+ *
+ * The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that
+ * driver has received the notification; device would clear the
+ * VIRTIO_NET_S_ANNOUNCE bit in the status field after it receives
+ * this command.
+ */
+#define VIRTIO_NET_CTRL_ANNOUNCE 3
+ #define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
+
+/*
+ * Control Receive Flow Steering
+ */
+#define VIRTIO_NET_CTRL_MQ 4
+/*
+ * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
+ * enables Receive Flow Steering, specifying the number of the transmit and
+ * receive queues that will be used. After the command is consumed and acked by
+ * the device, the device will not steer new packets on receive virtqueues
+ * other than specified nor read from transmit virtqueues other than specified.
+ * Accordingly, driver should not transmit new packets on virtqueues other than
+ * specified.
+ */
+struct virtio_net_ctrl_mq {
+ __virtio16 virtqueue_pairs;
+};
+
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET 0
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
+
+/*
+ * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
+ * the receive steering to use a hash calculated for incoming packet
+ * to decide on receive virtqueue to place the packet. The command
+ * also provides parameters to calculate a hash and receive virtqueue.
+ */
+struct virtio_net_rss_config {
+ __le32 hash_types;
+ __le16 indirection_table_mask;
+ __le16 unclassified_queue;
+ __le16 indirection_table[1/* + indirection_table_mask */];
+ __le16 max_tx_vq;
+ __u8 hash_key_length;
+ __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG 1
+
+/*
+ * The command VIRTIO_NET_CTRL_MQ_HASH_CONFIG requests the device
+ * to include in the virtio header of the packet the value of the
+ * calculated hash and the report type of hash. It also provides
+ * parameters for hash calculation. The command requires feature
+ * VIRTIO_NET_F_HASH_REPORT to be negotiated to extend the
+ * layout of virtio header as defined in virtio_net_hdr_v1_hash.
+ */
+struct virtio_net_hash_config {
+ __le32 hash_types;
+ /* for compatibility with virtio_net_rss_config */
+ __le16 reserved[4];
+ __u8 hash_key_length;
+ __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2
+
+/*
+ * Control network offloads
+ *
+ * Reconfigures the network offloads that Guest can handle.
+ *
+ * Available with the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature bit.
+ *
+ * Command data format matches the feature bit mask exactly.
+ *
+ * See VIRTIO_NET_F_GUEST_* for the list of offloads
+ * that can be enabled/disabled.
+ */
+#define VIRTIO_NET_CTRL_GUEST_OFFLOADS 5
+#define VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET 0
+
+#endif /* _LINUX_VIRTIO_NET_H */
--
2.35.1
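
The control virtqueue layout described in the header above (class/cmd header first, command data next, one-byte ack last) can be sketched as follows. This is only an illustration under stated assumptions, not code from the patch: `struct ctrl_hdr`, `build_vlan_add` and the single flat buffer are hypothetical stand-ins. A real driver places each part in its own scatter-gather entry, and the byte order of the VLAN ID depends on the negotiated virtio version (little-endian is assumed here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of values defined in the imported header. */
#define CTRL_CLASS_VLAN 2 /* VIRTIO_NET_CTRL_VLAN */
#define CTRL_VLAN_ADD   0 /* VIRTIO_NET_CTRL_VLAN_ADD */
#define CTRL_ERR        1 /* VIRTIO_NET_ERR */

/* Hypothetical mirror of struct virtio_net_ctrl_hdr. */
struct ctrl_hdr {
	uint8_t class_id;
	uint8_t cmd;
} __attribute__((packed));

/*
 * Lay out a VLAN ADD request in one flat buffer:
 *   [2-byte header][2-byte VLAN ID, little-endian (assumed)][1-byte ack]
 * Returns the number of bytes used, or 0 if the buffer is too small.
 */
static size_t build_vlan_add(uint8_t *buf, size_t len, uint16_t vlan_id)
{
	struct ctrl_hdr hdr = { .class_id = CTRL_CLASS_VLAN, .cmd = CTRL_VLAN_ADD };
	const size_t need = sizeof(hdr) + sizeof(vlan_id) + 1;

	assert(buf != NULL);
	if (len < need)
		return 0;

	memcpy(buf, &hdr, sizeof(hdr));
	buf[2] = (uint8_t)(vlan_id & 0xff);
	buf[3] = (uint8_t)(vlan_id >> 8);
	buf[4] = CTRL_ERR; /* pessimistic default; the device overwrites the ack */
	return need;
}
```

Pre-setting the ack byte to the error value means a request the device never processed is not mistaken for success.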


@ -0,0 +1,51 @@
From 28786edf9fd1e0d188190cb7029ddde2bdcd8ad8 Mon Sep 17 00:00:00 2001
Message-Id: <28786edf9fd1e0d188190cb7029ddde2bdcd8ad8.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] uapi: update vdpa.h
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2.git commit 885e281e
commit 885e281eadc238e30f7c3a42ad366ea123c03a83
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Mar 11 19:16:25 2022 -0800
uapi: update vdpa.h
Update header from upstream.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
vdpa/include/uapi/linux/vdpa.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/vdpa/include/uapi/linux/vdpa.h b/vdpa/include/uapi/linux/vdpa.h
index b7eab069..cc575a82 100644
--- a/vdpa/include/uapi/linux/vdpa.h
+++ b/vdpa/include/uapi/linux/vdpa.h
@@ -23,6 +23,9 @@ enum vdpa_command {
enum vdpa_attr {
VDPA_ATTR_UNSPEC,
+ /* Pad attribute for 64b alignment */
+ VDPA_ATTR_PAD = VDPA_ATTR_UNSPEC,
+
/* bus name (optional) + dev name together make the parent device handle */
VDPA_ATTR_MGMTDEV_BUS_NAME, /* string */
VDPA_ATTR_MGMTDEV_DEV_NAME, /* string */
@@ -40,6 +43,9 @@ enum vdpa_attr {
VDPA_ATTR_DEV_NET_CFG_MAX_VQP, /* u16 */
VDPA_ATTR_DEV_NET_CFG_MTU, /* u16 */
+ VDPA_ATTR_DEV_NEGOTIATED_FEATURES, /* u64 */
+ VDPA_ATTR_DEV_MGMTDEV_MAX_VQS, /* u32 */
+ VDPA_ATTR_DEV_SUPPORTED_FEATURES, /* u64 */
/* new attributes must be added above here */
VDPA_ATTR_MAX,
};
--
2.35.1


@ -0,0 +1,46 @@
From a2ffc58207b80608f57299a297704d1e409829a5 Mon Sep 17 00:00:00 2001
Message-Id: <a2ffc58207b80608f57299a297704d1e409829a5.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] vdpa: Remove unsupported command line option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2-next.git commit 2d1954c8
commit 2d1954c8a54b61ec271ab5b36976c4efdcf30066
Author: Eli Cohen <elic@nvidia.com>
Date: Sun Mar 13 19:12:16 2022 +0200
vdpa: Remove unsupported command line option
"-v[erbose]" option is not supported.
Remove it.
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
vdpa/vdpa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index f048e470..4ccb5648 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -711,7 +711,7 @@ static void help(void)
fprintf(stderr,
"Usage: vdpa [ OPTIONS ] OBJECT { COMMAND | help }\n"
"where OBJECT := { mgmtdev | dev }\n"
- " OPTIONS := { -V[ersion] | -n[o-nice-names] | -j[son] | -p[retty] | -v[erbose] }\n");
+ " OPTIONS := { -V[ersion] | -n[o-nice-names] | -j[son] | -p[retty] }\n");
}
static int vdpa_cmd(struct vdpa *vdpa, int argc, char **argv)
--
2.35.1


@ -0,0 +1,224 @@
From fab19f1e5fe9ccf1d180874d5b0d86c99c7e16cb Mon Sep 17 00:00:00 2001
Message-Id: <fab19f1e5fe9ccf1d180874d5b0d86c99c7e16cb.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] vdpa: Allow for printing negotiated features of a device
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2-next.git commit bd91c764
commit bd91c764718997adaa1d86eee5c585e67ca85356
Author: Eli Cohen <elic@nvidia.com>
Date: Sun Mar 13 19:12:17 2022 +0200
vdpa: Allow for printing negotiated features of a device
When reading the configuration of a vdpa device, check if the
VDPA_ATTR_DEV_NEGOTIATED_FEATURES is available. If it is, parse the
feature bits and print a string representation of each of the feature
bits.
We keep the strings in two different arrays. One for net device related
devices and one for generic feature bits.
In this patch we parse only net device specific features. Support for
other devices can be added later. If the device queried is not a net
device, we print its bit number only.
Examples:
1. Standard presentation
$ vdpa dev config show vdpa-a
vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 2 mtu 9000
negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \
CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
2. json output
$ vdpa -j dev config show vdpa-a
{"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link":"up","link_announce":false,\
"max_vq_pairs":2,"mtu":9000,"negotiated_features":["CSUM","GUEST_CSUM",\
"MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR",\
"VERSION_1","ACCESS_PLATFORM"]}}}
3. Pretty json
$ vdpa -jp dev config show vdpa-a
{
"config": {
"vdpa-a": {
"mac": "00:00:00:00:88:88",
"link ": "up",
"link_announce ": false,
"max_vq_pairs": 2,
"mtu": 9000,
"negotiated_features": [
"CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ",\
"MQ","CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ]
}
}
}
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
vdpa/vdpa.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 103 insertions(+), 2 deletions(-)
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index 4ccb5648..40078b1c 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -10,6 +10,8 @@
#include <linux/virtio_net.h>
#include <linux/netlink.h>
#include <libmnl/libmnl.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_config.h>
#include "mnl_utils.h"
#include <rt_names.h>
@@ -78,6 +80,7 @@ static const enum mnl_attr_data_type vdpa_policy[VDPA_ATTR_MAX + 1] = {
[VDPA_ATTR_DEV_VENDOR_ID] = MNL_TYPE_U32,
[VDPA_ATTR_DEV_MAX_VQS] = MNL_TYPE_U32,
[VDPA_ATTR_DEV_MAX_VQ_SIZE] = MNL_TYPE_U16,
+ [VDPA_ATTR_DEV_NEGOTIATED_FEATURES] = MNL_TYPE_U64,
};
static int attr_cb(const struct nlattr *attr, void *data)
@@ -385,6 +388,94 @@ static const char *parse_class(int num)
return class ? class : "< unknown class >";
}
+static const char * const net_feature_strs[64] = {
+ [VIRTIO_NET_F_CSUM] = "CSUM",
+ [VIRTIO_NET_F_GUEST_CSUM] = "GUEST_CSUM",
+ [VIRTIO_NET_F_CTRL_GUEST_OFFLOADS] = "CTRL_GUEST_OFFLOADS",
+ [VIRTIO_NET_F_MTU] = "MTU",
+ [VIRTIO_NET_F_MAC] = "MAC",
+ [VIRTIO_NET_F_GUEST_TSO4] = "GUEST_TSO4",
+ [VIRTIO_NET_F_GUEST_TSO6] = "GUEST_TSO6",
+ [VIRTIO_NET_F_GUEST_ECN] = "GUEST_ECN",
+ [VIRTIO_NET_F_GUEST_UFO] = "GUEST_UFO",
+ [VIRTIO_NET_F_HOST_TSO4] = "HOST_TSO4",
+ [VIRTIO_NET_F_HOST_TSO6] = "HOST_TSO6",
+ [VIRTIO_NET_F_HOST_ECN] = "HOST_ECN",
+ [VIRTIO_NET_F_HOST_UFO] = "HOST_UFO",
+ [VIRTIO_NET_F_MRG_RXBUF] = "MRG_RXBUF",
+ [VIRTIO_NET_F_STATUS] = "STATUS",
+ [VIRTIO_NET_F_CTRL_VQ] = "CTRL_VQ",
+ [VIRTIO_NET_F_CTRL_RX] = "CTRL_RX",
+ [VIRTIO_NET_F_CTRL_VLAN] = "CTRL_VLAN",
+ [VIRTIO_NET_F_CTRL_RX_EXTRA] = "CTRL_RX_EXTRA",
+ [VIRTIO_NET_F_GUEST_ANNOUNCE] = "GUEST_ANNOUNCE",
+ [VIRTIO_NET_F_MQ] = "MQ",
+ [VIRTIO_F_NOTIFY_ON_EMPTY] = "NOTIFY_ON_EMPTY",
+ [VIRTIO_NET_F_CTRL_MAC_ADDR] = "CTRL_MAC_ADDR",
+ [VIRTIO_F_ANY_LAYOUT] = "ANY_LAYOUT",
+ [VIRTIO_NET_F_RSC_EXT] = "RSC_EXT",
+ [VIRTIO_NET_F_HASH_REPORT] = "HASH_REPORT",
+ [VIRTIO_NET_F_RSS] = "RSS",
+ [VIRTIO_NET_F_STANDBY] = "STANDBY",
+ [VIRTIO_NET_F_SPEED_DUPLEX] = "SPEED_DUPLEX",
+};
+
+#define VIRTIO_F_IN_ORDER 35
+#define VIRTIO_F_NOTIFICATION_DATA 38
+#define VDPA_EXT_FEATURES_SZ (VIRTIO_TRANSPORT_F_END - \
+ VIRTIO_TRANSPORT_F_START + 1)
+
+static const char * const ext_feature_strs[VDPA_EXT_FEATURES_SZ] = {
+ [VIRTIO_RING_F_INDIRECT_DESC - VIRTIO_TRANSPORT_F_START] = "RING_INDIRECT_DESC",
+ [VIRTIO_RING_F_EVENT_IDX - VIRTIO_TRANSPORT_F_START] = "RING_EVENT_IDX",
+ [VIRTIO_F_VERSION_1 - VIRTIO_TRANSPORT_F_START] = "VERSION_1",
+ [VIRTIO_F_ACCESS_PLATFORM - VIRTIO_TRANSPORT_F_START] = "ACCESS_PLATFORM",
+ [VIRTIO_F_RING_PACKED - VIRTIO_TRANSPORT_F_START] = "RING_PACKED",
+ [VIRTIO_F_IN_ORDER - VIRTIO_TRANSPORT_F_START] = "IN_ORDER",
+ [VIRTIO_F_ORDER_PLATFORM - VIRTIO_TRANSPORT_F_START] = "ORDER_PLATFORM",
+ [VIRTIO_F_SR_IOV - VIRTIO_TRANSPORT_F_START] = "SR_IOV",
+ [VIRTIO_F_NOTIFICATION_DATA - VIRTIO_TRANSPORT_F_START] = "NOTIFICATION_DATA",
+};
+
+static const char * const *dev_to_feature_str[] = {
+ [VIRTIO_ID_NET] = net_feature_strs,
+};
+
+#define NUM_FEATURE_BITS 64
+
+static void print_features(struct vdpa *vdpa, uint64_t features, bool mgmtdevf,
+ uint16_t dev_id)
+{
+ const char * const *feature_strs = NULL;
+ const char *s;
+ int i;
+
+ if (dev_id < ARRAY_SIZE(dev_to_feature_str))
+ feature_strs = dev_to_feature_str[dev_id];
+
+ if (mgmtdevf)
+ pr_out_array_start(vdpa, "dev_features");
+ else
+ pr_out_array_start(vdpa, "negotiated_features");
+
+ for (i = 0; i < NUM_FEATURE_BITS; i++) {
+ if (!(features & (1ULL << i)))
+ continue;
+
+ if (i < VIRTIO_TRANSPORT_F_START || i > VIRTIO_TRANSPORT_F_END)
+ s = feature_strs ? feature_strs[i] : NULL;
+ else
+ s = ext_feature_strs[i - VIRTIO_TRANSPORT_F_START];
+
+ if (!s)
+ print_uint(PRINT_ANY, NULL, " bit_%d", i);
+ else
+ print_string(PRINT_ANY, NULL, " %s", s);
+ }
+
+ pr_out_array_end(vdpa);
+}
+
static void pr_out_mgmtdev_show(struct vdpa *vdpa, const struct nlmsghdr *nlh,
struct nlattr **tb)
{
@@ -579,9 +670,10 @@ static int cmd_dev_del(struct vdpa *vdpa, int argc, char **argv)
return mnlu_gen_socket_sndrcv(&vdpa->nlg, nlh, NULL, NULL);
}
-static void pr_out_dev_net_config(struct nlattr **tb)
+static void pr_out_dev_net_config(struct vdpa *vdpa, struct nlattr **tb)
{
SPRINT_BUF(macaddr);
+ uint64_t val_u64;
uint16_t val_u16;
if (tb[VDPA_ATTR_DEV_NET_CFG_MACADDR]) {
@@ -610,6 +702,15 @@ static void pr_out_dev_net_config(struct nlattr **tb)
val_u16 = mnl_attr_get_u16(tb[VDPA_ATTR_DEV_NET_CFG_MTU]);
print_uint(PRINT_ANY, "mtu", "mtu %d ", val_u16);
}
+ if (tb[VDPA_ATTR_DEV_NEGOTIATED_FEATURES]) {
+ uint16_t dev_id = 0;
+
+ if (tb[VDPA_ATTR_DEV_ID])
+ dev_id = mnl_attr_get_u32(tb[VDPA_ATTR_DEV_ID]);
+
+ val_u64 = mnl_attr_get_u64(tb[VDPA_ATTR_DEV_NEGOTIATED_FEATURES]);
+ print_features(vdpa, val_u64, false, dev_id);
+ }
}
static void pr_out_dev_config(struct vdpa *vdpa, struct nlattr **tb)
@@ -619,7 +720,7 @@ static void pr_out_dev_config(struct vdpa *vdpa, struct nlattr **tb)
pr_out_vdev_handle_start(vdpa, tb);
switch (device_id) {
case VIRTIO_ID_NET:
- pr_out_dev_net_config(tb);
+ pr_out_dev_net_config(vdpa, tb);
break;
default:
break;
--
2.35.1
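
The lookup performed by print_features() in the patch above (device-specific names outside the transport range, shared names inside it, and a bit_N fallback for unnamed bits) can be sketched in isolation. This is a simplified illustration, not the patch's code: `feature_name` and `format_features` are hypothetical helpers, only a handful of names are filled in, and the bit numbers (0, 1, 3, 5, 32, 33) and the transport range 28..38 are assumed to match the virtio headers in use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TRANSPORT_F_START 28 /* assumed value of VIRTIO_TRANSPORT_F_START */
#define TRANSPORT_F_END   38 /* assumed value of VIRTIO_TRANSPORT_F_END */

/* A few net-specific names, indexed by feature bit (assumed numbering). */
static const char *const net_names[64] = {
	[0] = "CSUM",
	[1] = "GUEST_CSUM",
	[3] = "MTU",
	[5] = "MAC",
};

/* Transport names, indexed relative to TRANSPORT_F_START. */
static const char *const transport_names[TRANSPORT_F_END - TRANSPORT_F_START + 1] = {
	[32 - TRANSPORT_F_START] = "VERSION_1",
	[33 - TRANSPORT_F_START] = "ACCESS_PLATFORM",
};

/* Return the name of a feature bit, or NULL if it has none. */
static const char *feature_name(unsigned int bit)
{
	assert(bit < 64);
	if (bit >= TRANSPORT_F_START && bit <= TRANSPORT_F_END)
		return transport_names[bit - TRANSPORT_F_START];
	return net_names[bit];
}

/*
 * Write the space-separated names of all set bits into out (truncating
 * if out is too small) and return how many bits were set.
 */
static int format_features(uint64_t features, char *out, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len > 0)
		out[0] = '\0';

	for (unsigned int i = 0; i < 64; i++) {
		const char *s;
		int n;

		if (!(features & (1ULL << i)))
			continue;
		count++;
		if (used >= len)
			continue; /* buffer exhausted, keep counting only */

		s = feature_name(i);
		if (s)
			n = snprintf(out + used, len - used, "%s%s", used ? " " : "", s);
		else
			n = snprintf(out + used, len - used, "%sbit_%u", used ? " " : "", i);
		if (n > 0)
			used += (size_t)n;
	}
	return count;
}
```

Offsetting the transport table by its start bit keeps it to eleven entries instead of sixty-four, which is the same choice the patch makes for ext_feature_strs.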


@ -0,0 +1,127 @@
From b49cd0103978e0e05ca5be4d7369ab62622ff42f Mon Sep 17 00:00:00 2001
Message-Id: <b49cd0103978e0e05ca5be4d7369ab62622ff42f.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] vdpa: Support for configuring max VQ pairs for a device
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2-next.git commit 16482fd4
commit 16482fd4df1132749575c49797c8d167c316d3f7
Author: Eli Cohen <elic@nvidia.com>
Date: Sun Mar 13 19:12:18 2022 +0200
vdpa: Support for configuring max VQ pairs for a device
Use VDPA_ATTR_DEV_MGMTDEV_MAX_VQS to specify max number of virtqueue
pairs to configure for a vdpa device when adding a device.
Examples:
1. Create a device with 3 virtqueue pairs:
$ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 3
2. Read the configuration of a vdpa device
$ vdpa dev config show vdpa-a
vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 3 \
mtu 1500
negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \
CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
vdpa/vdpa.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index 40078b1c..9985b6ca 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -25,6 +25,7 @@
#define VDPA_OPT_VDEV_HANDLE BIT(3)
#define VDPA_OPT_VDEV_MAC BIT(4)
#define VDPA_OPT_VDEV_MTU BIT(5)
+#define VDPA_OPT_MAX_VQP BIT(6)
struct vdpa_opts {
uint64_t present; /* flags of present items */
@@ -34,6 +35,7 @@ struct vdpa_opts {
unsigned int device_id;
char mac[ETH_ALEN];
uint16_t mtu;
+ uint16_t max_vqp;
};
struct vdpa {
@@ -81,6 +83,7 @@ static const enum mnl_attr_data_type vdpa_policy[VDPA_ATTR_MAX + 1] = {
[VDPA_ATTR_DEV_MAX_VQS] = MNL_TYPE_U32,
[VDPA_ATTR_DEV_MAX_VQ_SIZE] = MNL_TYPE_U16,
[VDPA_ATTR_DEV_NEGOTIATED_FEATURES] = MNL_TYPE_U64,
+ [VDPA_ATTR_DEV_MGMTDEV_MAX_VQS] = MNL_TYPE_U32,
};
static int attr_cb(const struct nlattr *attr, void *data)
@@ -222,6 +225,8 @@ static void vdpa_opts_put(struct nlmsghdr *nlh, struct vdpa *vdpa)
sizeof(opts->mac), opts->mac);
if (opts->present & VDPA_OPT_VDEV_MTU)
mnl_attr_put_u16(nlh, VDPA_ATTR_DEV_NET_CFG_MTU, opts->mtu);
+ if (opts->present & VDPA_OPT_MAX_VQP)
+ mnl_attr_put_u16(nlh, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, opts->max_vqp);
}
static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
@@ -290,6 +295,14 @@ static int vdpa_argv_parse(struct vdpa *vdpa, int argc, char **argv,
NEXT_ARG_FWD();
o_found |= VDPA_OPT_VDEV_MTU;
+ } else if ((matches(*argv, "max_vqp") == 0) && (o_optional & VDPA_OPT_MAX_VQP)) {
+ NEXT_ARG_FWD();
+ err = vdpa_argv_u16(vdpa, argc, argv, &opts->max_vqp);
+ if (err)
+ return err;
+
+ NEXT_ARG_FWD();
+ o_found |= VDPA_OPT_MAX_VQP;
} else {
fprintf(stderr, "Unknown option \"%s\"\n", *argv);
return -EINVAL;
@@ -499,6 +512,14 @@ static void pr_out_mgmtdev_show(struct vdpa *vdpa, const struct nlmsghdr *nlh,
pr_out_array_end(vdpa);
}
+ if (tb[VDPA_ATTR_DEV_MGMTDEV_MAX_VQS]) {
+ uint32_t num_vqs;
+
+ print_nl();
+ num_vqs = mnl_attr_get_u32(tb[VDPA_ATTR_DEV_MGMTDEV_MAX_VQS]);
+ print_uint(PRINT_ANY, "max_supported_vqs", " max_supported_vqs %d", num_vqs);
+ }
+
pr_out_handle_end(vdpa);
}
@@ -559,6 +580,7 @@ static void cmd_dev_help(void)
{
fprintf(stderr, "Usage: vdpa dev show [ DEV ]\n");
fprintf(stderr, " vdpa dev add name NAME mgmtdev MANAGEMENTDEV [ mac MACADDR ] [ mtu MTU ]\n");
+ fprintf(stderr, " [ max_vqp MAX_VQ_PAIRS ]\n");
fprintf(stderr, " vdpa dev del DEV\n");
fprintf(stderr, "Usage: vdpa dev config COMMAND [ OPTIONS ]\n");
}
@@ -648,7 +670,8 @@ static int cmd_dev_add(struct vdpa *vdpa, int argc, char **argv)
NLM_F_REQUEST | NLM_F_ACK);
err = vdpa_argv_parse_put(nlh, vdpa, argc, argv,
VDPA_OPT_VDEV_MGMTDEV_HANDLE | VDPA_OPT_VDEV_NAME,
- VDPA_OPT_VDEV_MAC | VDPA_OPT_VDEV_MTU);
+ VDPA_OPT_VDEV_MAC | VDPA_OPT_VDEV_MTU |
+ VDPA_OPT_MAX_VQP);
if (err)
return err;
--
2.35.1
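
The new max_vqp option stores its argument in a uint16_t, so the value has to be validated as an unsigned 16-bit number before it is placed in the netlink message. The sketch below shows one way such a check could look. `parse_u16` is a hypothetical helper written for illustration; it is not the tool's vdpa_argv_u16() and its error codes are an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal string into a uint16_t.
 * Returns 0 on success, -EINVAL for malformed input and -ERANGE when
 * the value does not fit in 16 bits.
 */
static int parse_u16(const char *str, uint16_t *out)
{
	char *end;
	unsigned long val;

	assert(out != NULL);
	if (!str || *str == '\0' || *str == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(str, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end != '\0')
		return -EINVAL; /* trailing garbage such as "8x" */
	if (val > UINT16_MAX)
		return -ERANGE;

	*out = (uint16_t)val;
	return 0;
}
```

Rejecting a leading minus sign up front matters because strtoul() accepts negative input and silently wraps it to a large positive value.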


@ -0,0 +1,99 @@
From c98dd268d17c4faa19dac141f69597096bf0dfa4 Mon Sep 17 00:00:00 2001
Message-Id: <c98dd268d17c4faa19dac141f69597096bf0dfa4.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] vdpa: Support reading device features
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2-next.git commit 56eb8bf4
commit 56eb8bf45aa3d509eb119201341d0323ea81ef84
Author: Eli Cohen <elic@nvidia.com>
Date: Sun Mar 13 19:12:19 2022 +0200
vdpa: Support reading device features
When showing the available management devices, check if
VDPA_ATTR_DEV_SUPPORTED_FEATURES feature is available and print the
supported features for a management device.
Examples:
$ vdpa mgmtdev show
auxiliary/mlx5_core.sf.1:
supported_classes net
max_supported_vqs 257
dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \
CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
$ vdpa -jp mgmtdev show
{
"mgmtdev": {
"auxiliary/mlx5_core.sf.1": {
"supported_classes": [ "net" ],
"max_supported_vqs": 257,
"dev_features": [
"CSUM","GUEST_CSUM","MTU","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ",\
"CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ]
}
}
}
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
vdpa/vdpa.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/vdpa/vdpa.c b/vdpa/vdpa.c
index 9985b6ca..3ae1b78f 100644
--- a/vdpa/vdpa.c
+++ b/vdpa/vdpa.c
@@ -84,6 +84,7 @@ static const enum mnl_attr_data_type vdpa_policy[VDPA_ATTR_MAX + 1] = {
[VDPA_ATTR_DEV_MAX_VQ_SIZE] = MNL_TYPE_U16,
[VDPA_ATTR_DEV_NEGOTIATED_FEATURES] = MNL_TYPE_U64,
[VDPA_ATTR_DEV_MGMTDEV_MAX_VQS] = MNL_TYPE_U32,
+ [VDPA_ATTR_DEV_SUPPORTED_FEATURES] = MNL_TYPE_U64,
};
static int attr_cb(const struct nlattr *attr, void *data)
@@ -492,14 +493,14 @@ static void print_features(struct vdpa *vdpa, uint64_t features, bool mgmtdevf,
static void pr_out_mgmtdev_show(struct vdpa *vdpa, const struct nlmsghdr *nlh,
struct nlattr **tb)
{
+ uint64_t classes = 0;
const char *class;
unsigned int i;
pr_out_handle_start(vdpa, tb);
if (tb[VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES]) {
- uint64_t classes = mnl_attr_get_u64(tb[VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES]);
-
+ classes = mnl_attr_get_u64(tb[VDPA_ATTR_MGMTDEV_SUPPORTED_CLASSES]);
pr_out_array_start(vdpa, "supported_classes");
for (i = 1; i < 64; i++) {
@@ -520,6 +521,16 @@ static void pr_out_mgmtdev_show(struct vdpa *vdpa, const struct nlmsghdr *nlh,
print_uint(PRINT_ANY, "max_supported_vqs", " max_supported_vqs %d", num_vqs);
}
+ if (tb[VDPA_ATTR_DEV_SUPPORTED_FEATURES]) {
+ uint64_t features;
+
+ features = mnl_attr_get_u64(tb[VDPA_ATTR_DEV_SUPPORTED_FEATURES]);
+ if (classes & BIT(VIRTIO_ID_NET))
+ print_features(vdpa, features, true, VIRTIO_ID_NET);
+ else
+ print_features(vdpa, features, true, 0);
+ }
+
pr_out_handle_end(vdpa);
}
--
2.35.1


@ -0,0 +1,54 @@
From b72f22efc837d0c3e917186a4179158c35e9e690 Mon Sep 17 00:00:00 2001
Message-Id: <b72f22efc837d0c3e917186a4179158c35e9e690.1647872200.git.aclaudi@redhat.com>
In-Reply-To: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
References: <b30268eda844bdebbb8e5e4f5735e3b1bb666368.1647872200.git.aclaudi@redhat.com>
From: Andrea Claudi <aclaudi@redhat.com>
Date: Mon, 21 Mar 2022 15:03:16 +0100
Subject: [PATCH] vdpa: Update man page with added support to configure max vq
pair
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2056827
Upstream Status: iproute2-next.git commit 8130653d
commit 8130653dabe6726b46b7b19c31d85e33a67175e3
Author: Eli Cohen <elic@nvidia.com>
Date: Tue Mar 15 15:13:58 2022 +0200
vdpa: Update man page with added support to configure max vq pair
Update man page to include information on how to configure the max
virtqueue pairs for a vdpa device when creating one.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
---
man/man8/vdpa-dev.8 | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/man/man8/vdpa-dev.8 b/man/man8/vdpa-dev.8
index aa21ae3a..432867c6 100644
--- a/man/man8/vdpa-dev.8
+++ b/man/man8/vdpa-dev.8
@@ -33,6 +33,7 @@ vdpa-dev \- vdpa device configuration
.I MGMTDEV
.RI "[ mac " MACADDR " ]"
.RI "[ mtu " MTU " ]"
+.RI "[ max_vqp " MAX_VQ_PAIRS " ]"
.ti -8
.B vdpa dev del
@@ -119,6 +120,11 @@ vdpa dev add name foo mgmtdev vdpa_sim_net mac 00:11:22:33:44:55 mtu 9000
Add the vdpa device named foo on the management device vdpa_sim_net with mac address of 00:11:22:33:44:55 and mtu of 9000 bytes.
.RE
.PP
+vdpa dev add name foo mgmtdev auxiliary/mlx5_core.sf.1 mac 00:11:22:33:44:55 max_vqp 8
+.RS 4
+Add the vdpa device named foo on the management device auxiliary/mlx5_core.sf.1 with mac address of 00:11:22:33:44:55 and a maximum of 8 virtqueue pairs.
+.RE
+.PP
vdpa dev del foo
.RS 4
Delete the vdpa device named foo which was previously created.
--
2.35.1


@ -1,5 +0,0 @@
# tc initialization script (sh)
if [ -z "$TC_LIB_DIR" ]; then
export TC_LIB_DIR=/usr/lib64/tc
fi


@ -1,42 +1,50 @@
Summary: Advanced IP routing and network device configuration tools
Name: iproute
Version: 5.12.0
Version: 5.15.0
Release: 4%{?dist}%{?buildid}
%if 0%{?rhel}
Group: Applications/System
URL: http://kernel.org/pub/linux/utils/net/%{name}2/
Source0: http://kernel.org/pub/linux/utils/net/%{name}2/%{name}2-%{version}.tar.xz
%endif
URL: https://kernel.org/pub/linux/utils/net/%{name}2/
Source0: https://kernel.org/pub/linux/utils/net/%{name}2/%{name}2-%{version}.tar.xz
Source1: rt_dsfield.deprecated
Source2: iproute2.sh
Patch0: 0001-tc-f_flower-Add-option-to-match-on-related-ct-state.patch
Patch1: 0002-tc-f_flower-Add-missing-ct_state-flags-to-usage-desc.patch
Patch2: 0003-mptcp-add-support-for-port-based-endpoint.patch
Patch3: 0004-Update-kernel-headers.patch
Patch4: 0005-police-add-support-for-packet-per-second-rate-limiti.patch
Patch5: 0006-police-Add-support-for-json-output.patch
Patch6: 0007-police-Fix-normal-output-back-to-what-it-was.patch
Patch7: 0008-tc-u32-Fix-key-folding-in-sample-option.patch
Patch8: 0009-tc-htb-improve-burst-error-messages.patch
Patch9: 0010-lib-bpf_legacy-fix-bpffs-mount-when-sys-fs-bpf-exist.patch
Patch0: 0001-configure-fix-parsing-issue-on-include_dir-option.patch
Patch1: 0002-configure-fix-parsing-issue-on-libbpf_dir-option.patch
Patch2: 0003-configure-fix-parsing-issue-with-more-than-one-value.patch
Patch3: 0004-configure-simplify-options-parsing.patch
Patch4: 0005-configure-support-param-value-style.patch
Patch5: 0006-configure-add-the-prefix-option.patch
Patch6: 0007-configure-add-the-libdir-option.patch
Patch7: 0008-vdpa-align-uapi-headers.patch
Patch8: 0009-vdpa-Enable-user-to-query-vdpa-device-config-layout.patch
Patch9: 0010-vdpa-Enable-user-to-set-mac-address-of-vdpa-device.patch
Patch10: 0011-vdpa-Enable-user-to-set-mtu-of-the-vdpa-device.patch
Patch11: 0012-tc-u32-add-support-for-json-output.patch
Patch12: 0013-tc-u32-add-json-support-in-print_raw-print_ipv4-prin.patch
Patch13: 0014-Update-kernel-headers-and-import-virtio_net.patch
Patch14: 0015-uapi-update-vdpa.h.patch
Patch15: 0016-vdpa-Remove-unsupported-command-line-option.patch
Patch16: 0017-vdpa-Allow-for-printing-negotiated-features-of-a-dev.patch
Patch17: 0018-vdpa-Support-for-configuring-max-VQ-pairs-for-a-devi.patch
Patch18: 0019-vdpa-Support-reading-device-features.patch
Patch19: 0020-vdpa-Update-man-page-with-added-support-to-configure.patch
License: GPLv2+ and Public Domain
BuildRequires: bison
BuildRequires: elfutils-libelf-devel
BuildRequires: flex
BuildRequires: gcc
BuildRequires: iptables-devel >= 1.4.5
BuildRequires: libbpf-devel
BuildRequires: libcap-devel
BuildRequires: libdb-devel
BuildRequires: libmnl-devel
BuildRequires: libselinux-devel
BuildRequires: make
BuildRequires: pkgconfig
%if ! 0%{?_module_build}
%if 0%{?fedora}
BuildRequires: linux-atm-libs-devel
%endif
%endif
# For the UsrMove transition period
Conflicts: filesystem < 3
Requires: libbpf
Requires: psmisc
Provides: /sbin/ip
Obsoletes: %{name} < 4.5.0-3
%description
The iproute package contains networking utilities (ip and rtmon, for example)
@ -47,7 +55,6 @@ kernel.
Summary: Linux Traffic Control utility
Group: Applications/System
License: GPLv2+
Obsoletes: %{name} < 4.5.0-3
Requires: %{name}%{?_isa} = %{version}-%{release}
Provides: /sbin/tc
@ -98,15 +105,11 @@ install -D -m644 lib/libnetlink.a %{buildroot}%{_libdir}/libnetlink.a
# drop these files, iproute-doc package extracts files directly from _builddir
rm -rf '%{buildroot}%{_docdir}'
# Append deprecated values to rt_dsfield for compatibility reasons
# append deprecated values to rt_dsfield for compatibility reasons
cat %{SOURCE1} >>%{buildroot}%{_sysconfdir}/iproute2/rt_dsfield
# use TC_LIB_DIR environment variable
install -D -m644 %{SOURCE2} %{buildroot}%{_sysconfdir}/profile.d/iproute2.sh
%files
%dir %{_sysconfdir}/iproute2
%{!?_licensedir:%global license %%doc}
%license COPYING
%doc README README.devel
%{_mandir}/man7/*
@@ -120,9 +123,7 @@ install -D -m644 %{SOURCE2} %{buildroot}%{_sysconfdir}/profile.d/iproute2.sh
%{_datadir}/bash-completion/completions/devlink
%files tc
%{!?_licensedir:%global license %%doc}
%license COPYING
%{_sysconfdir}/profile.d/iproute2.sh
%{_mandir}/man7/tc-*
%{_mandir}/man8/tc*
%{_mandir}/man8/cbq*
@@ -133,13 +134,11 @@ install -D -m644 %{SOURCE2} %{buildroot}%{_sysconfdir}/profile.d/iproute2.sh
%if ! 0%{?_module_build}
%files doc
%{!?_licensedir:%global license %%doc}
%license COPYING
%doc examples
%endif
%files devel
%{!?_licensedir:%global license %%doc}
%license COPYING
%{_mandir}/man3/*
%{_libdir}/libnetlink.a
@@ -147,6 +146,28 @@ install -D -m644 %{SOURCE2} %{buildroot}%{_sysconfdir}/profile.d/iproute2.sh
%{_includedir}/iproute2/bpf_elf.h
%changelog
* Mon Mar 21 2022 Andrea Claudi <aclaudi@redhat.com> - 5.15.0-4.el8
- vdpa: Update man page with added support to configure max vq pair (Andrea Claudi) [2056827]
- vdpa: Support reading device features (Andrea Claudi) [2056827]
- vdpa: Support for configuring max VQ pairs for a device (Andrea Claudi) [2056827]
- vdpa: Allow for printing negotiated features of a device (Andrea Claudi) [2056827]
- vdpa: Remove unsupported command line option (Andrea Claudi) [2056827]
- uapi: update vdpa.h (Andrea Claudi) [2056827]
- Update kernel headers and import virtio_net (Andrea Claudi) [2056827]
* Mon Feb 07 2022 Andrea Claudi <aclaudi@redhat.com> - 5.15.0-3.el8
- tc: u32: add json support in `print_raw`, `print_ipv4`, `print_ipv6` (Andrea Claudi) [1989591]
- tc: u32: add support for json output (Andrea Claudi) [1989591]
* Wed Jan 26 2022 Andrea Claudi <aclaudi@redhat.com> - 5.15.0-2.el8
- vdpa: Enable user to set mtu of the vdpa device (Andrea Claudi) [2036880]
- vdpa: Enable user to set mac address of vdpa device (Andrea Claudi) [2036880]
- vdpa: Enable user to query vdpa device config layout (Andrea Claudi) [2036880]
- vdpa: align uapi headers (Andrea Claudi) [2036880]
* Tue Nov 23 2021 Andrea Claudi <aclaudi@redhat.com> - 5.15.0-1.el8
- New version 5.15.0 (Andrea Claudi) [2016061]
* Thu Oct 07 2021 Andrea Claudi <aclaudi@redhat.com> [5.12.0-4.el8]
- lib: bpf_legacy: fix bpffs mount when /sys/fs/bpf exists (Andrea Claudi) [1995082]
@@ -159,7 +180,7 @@ install -D -m644 %{SOURCE2} %{buildroot}%{_sysconfdir}/profile.d/iproute2.sh
- Update kernel headers (Andrea Claudi) [1981393]
- mptcp: add support for port based endpoint (Andrea Claudi) [1984733]
* Fri Aug 08 2021 Andrea Claudi <aclaudi@redhat.com> [5.12.0-2.el8]
* Fri Aug 06 2021 Andrea Claudi <aclaudi@redhat.com> [5.12.0-2.el8]
- add build and run-time dependencies on libbpf (Andrea Claudi) [1990402]
* Mon Jun 28 2021 Andrea Claudi <aclaudi@redhat.com> [5.12.0-1.el8]