diff --git a/COPYING-5.14.0-611.5.1.el9 b/COPYING-5.14.0-611.9.1.el9 similarity index 100% rename from COPYING-5.14.0-611.5.1.el9 rename to COPYING-5.14.0-611.9.1.el9 diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index d3cc32677a..ee3d8f6943 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -533,6 +533,7 @@ What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/srbds /sys/devices/system/cpu/vulnerabilities/tsa /sys/devices/system/cpu/vulnerabilities/tsx_async_abort + /sys/devices/system/cpu/vulnerabilities/vmscape Date: January 2018 Contact: Linux kernel mailing list Description: Information about CPU vulnerabilities diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index ce296b8430..ecbe926294 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -24,3 +24,4 @@ are configurable at compile, boot or run time. reg-file-data-sampling rsb indirect-target-selection + vmscape diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst new file mode 100644 index 0000000000..d9b9a2b6c1 --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/vmscape.rst @@ -0,0 +1,110 @@ +.. SPDX-License-Identifier: GPL-2.0 + +VMSCAPE +======= + +VMSCAPE is a vulnerability that may allow a guest to influence the branch +prediction in host userspace. It particularly affects hypervisors like QEMU. + +Even if a hypervisor holds no sensitive data of its own, such as disk encryption +keys, guest userspace may be able to attack the guest kernel using the +hypervisor as a confused deputy. + +Affected processors +------------------- + +The following CPU families are affected by VMSCAPE: + +**Intel processors:** + - Skylake generation (Parts without Enhanced-IBRS) + - Cascade Lake generation (Parts affected by ITS guest/host separation) + - Alder Lake and newer (Parts affected by BHI) + +Note that BHI-affected parts that use the BHB clearing software mitigation, e.g. +Icelake, are not vulnerable to VMSCAPE. + +**AMD processors:** + - Zen series (families 0x17, 0x19, 0x1a) + +**Hygon processors:** + - Family 0x18 + +Mitigation +---------- + +Conditional IBPB +---------------- + +The kernel tracks when a CPU has run a potentially malicious guest and issues an +IBPB before the first exit to userspace after VM-exit. If userspace did not run +between VM-exit and the next VM-entry, no IBPB is issued. + +Note that the existing userspace mitigations against Spectre-v2 are effective in +protecting userspace. They are insufficient, however, to protect the userspace +VMMs from a malicious guest. This is because Spectre-v2 mitigations are applied +at context switch time, while the userspace VMM can run after a VM-exit without +a context switch. + +Vulnerability enumeration and mitigation are not applied inside a guest. This is +because nested hypervisors should already be deploying IBPB to isolate +themselves from nested guests. + +SMT considerations +------------------ + +When Simultaneous Multi-Threading (SMT) is enabled, hypervisors can be +vulnerable to cross-thread attacks. For complete protection against VMSCAPE +attacks in SMT environments, STIBP should be enabled. + +The kernel will issue a warning if SMT is enabled without adequate STIBP +protection.
The warning is not issued when: + +- SMT is disabled +- STIBP is enabled system-wide +- Intel eIBRS is enabled (which implies STIBP protection) + +System information and options +------------------------------ + +The sysfs file showing VMSCAPE mitigation status is: + + /sys/devices/system/cpu/vulnerabilities/vmscape + +The possible values in this file are: + + * 'Not affected': + + The processor is not vulnerable to VMSCAPE attacks. + + * 'Vulnerable': + + The processor is vulnerable and no mitigation has been applied. + + * 'Mitigation: IBPB before exit to userspace': + + Conditional IBPB mitigation is enabled. The kernel tracks when a CPU has + run a potentially malicious guest and issues an IBPB before the first + exit to userspace after VM-exit. + + * 'Mitigation: IBPB on VMEXIT': + + IBPB is issued on every VM-exit. This occurs when other mitigations like + RETBLEED or SRSO are already issuing IBPB on VM-exit. + +Mitigation control on the kernel command line +---------------------------------------------- + +The mitigation can be controlled via the ``vmscape=`` command line parameter: + + * ``vmscape=off``: + + Disable the VMSCAPE mitigation. + + * ``vmscape=ibpb``: + + Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y). + + * ``vmscape=force``: + + Force vulnerability detection and mitigation even on processors that are + not known to be affected. diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 89bf79741a..d9580065ab 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3435,6 +3435,7 @@ srbds=off [X86,INTEL] ssbd=force-off [ARM64] tsx_async_abort=off [X86] + vmscape=off [X86] Exceptions: This does not have any effect on @@ -7152,6 +7153,16 @@ vmpoff= [KNL,S390] Perform z/VM CP command after power off. Format: + vmscape= [X86] Controls mitigation for VMscape attacks. + VMscape attacks can leak information from a userspace + hypervisor to a guest via speculative side-channels. + + off - disable the mitigation + ibpb - use Indirect Branch Prediction Barrier + (IBPB) mitigation (default) + force - force vulnerability detection even on + unaffected processors + vsyscall= [X86-64] Controls the behavior of vsyscalls (i.e. calls to fixed addresses of 0xffffffffff600x00 from legacy diff --git a/Documentation/arch/x86/tdx.rst b/Documentation/arch/x86/tdx.rst index 719043cd8b..61670e7df2 100644 --- a/Documentation/arch/x86/tdx.rst +++ b/Documentation/arch/x86/tdx.rst @@ -142,13 +142,6 @@ but depends on the BIOS to behave correctly. Note TDX works with CPU logical online/offline, thus the kernel still allows to offline logical CPU and online it again. -Kexec() -~~~~~~~ - -TDX host support currently lacks the ability to handle kexec. For -simplicity only one of them can be enabled in the Kconfig. This will be -fixed in the future. - Erratum ~~~~~~~ @@ -171,6 +164,13 @@ If the platform has such erratum, the kernel prints additional message in machine check handler to tell user the machine check may be caused by kernel bug on TDX private memory. +Kexec +~~~~~~~ + +Currently kexec doesn't work on the TDX platforms with the aforementioned +erratum. It fails when loading the kexec kernel image. Otherwise it +works normally.
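As a concrete illustration of the conditional IBPB bookkeeping described in vmscape.rst above: a per-CPU flag is set on VM-exit and consumed on the next exit to userspace. The following is a minimal userspace simulation of that logic only; it is not part of the patch, and the names ibpb(), vm_exit() and exit_to_user() are invented for the sketch:

#include <stdbool.h>
#include <stdio.h>

static bool ibpb_pending; /* models the per-CPU x86_ibpb_exit_to_user flag */

static void ibpb(void)
{
	/* stand-in for the real branch prediction barrier */
	puts("IBPB issued");
}

static void vm_exit(void)
{
	/* KVM marks the CPU after it has run a potentially malicious guest */
	ibpb_pending = true;
}

static void exit_to_user(void)
{
	/* the exit path consumes the flag, so at most one IBPB per guest run */
	if (ibpb_pending) {
		ibpb();
		ibpb_pending = false;
	}
}

int main(void)
{
	vm_exit();
	vm_exit();      /* two VM-exits with no userspace in between... */
	exit_to_user(); /* ...cost exactly one IBPB here */
	exit_to_user(); /* no guest ran since the last exit: no IBPB */
	return 0;
}

Note how the flag collapses any number of VM-exits into a single barrier, which is exactly why the documentation says no IBPB is issued when userspace never ran between VM-exit and the next VM-entry.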
+ Interaction vs S3 and deeper states ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Makefile.rhelver b/Makefile.rhelver index 9ec6f976df..539e233e18 100644 --- a/Makefile.rhelver +++ b/Makefile.rhelver @@ -12,7 +12,7 @@ RHEL_MINOR = 7 # # Use this spot to avoid future merge conflicts. # Do not trim this comment. -RHEL_RELEASE = 611.5.1 +RHEL_RELEASE = 611.9.1 # # ZSTREAM diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c index f48c4cf084..83be2b4085 100644 --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -53,17 +53,15 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno, syscall_set_return_value(current, regs, 0, ret); /* - * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(), - * but not enough for arm64 stack utilization comfort. To keep - * reasonable stack head room, reduce the maximum offset to 9 bits. + * This value will get limited by KSTACK_OFFSET_MAX(), which is 10 + * bits. The actual entropy will be further reduced by the compiler + * when applying stack alignment constraints: the AAPCS mandates a + * 16-byte aligned SP at function boundaries, which will remove the + * 4 low bits from any entropy chosen here. * - * The actual entropy will be further reduced by the compiler when - * applying stack alignment constraints: the AAPCS mandates a - * 16-byte (i.e. 4-bit) aligned SP at function boundaries. - * - * The resulting 5 bits of entropy is seen in SP[8:4]. + * The resulting 6 bits of entropy is seen in SP[9:4]. */ - choose_random_kstack_offset(get_random_int() & 0x1FF); + choose_random_kstack_offset(get_random_u16()); } static inline bool has_syscall_work(unsigned long flags) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index 56e0c8767a..4693a00f71 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -258,6 +258,7 @@ #define H_QUERY_INT_STATE 0x1E4 #define H_POLL_PENDING 0x1D8 #define H_ILLAN_ATTRIBUTES 0x244 +#define H_ADD_LOGICAL_LAN_BUFFERS 0x248 #define H_MODIFY_HEA_QP 0x250 #define H_QUERY_HEA_QP 0x254 #define H_QUERY_HEA 0x258 diff --git a/arch/s390/hypfs/hypfs_dbfs.c b/arch/s390/hypfs/hypfs_dbfs.c index 4024599eb4..b55bffcd72 100644 --- a/arch/s390/hypfs/hypfs_dbfs.c +++ b/arch/s390/hypfs/hypfs_dbfs.c @@ -6,6 +6,7 @@ * Author(s): Michael Holzheu */ +#include <linux/security.h> #include <linux/slab.h> #include "hypfs.h" @@ -64,24 +65,29 @@ static long dbfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) long rc; mutex_lock(&df->lock); - if (df->unlocked_ioctl) - rc = df->unlocked_ioctl(file, cmd, arg); - else - rc = -ENOTTY; + rc = df->unlocked_ioctl(file, cmd, arg); mutex_unlock(&df->lock); return rc; } -static const struct file_operations dbfs_ops = { +static const struct file_operations dbfs_ops_ioctl = { .read = dbfs_read, .llseek = no_llseek, .unlocked_ioctl = dbfs_ioctl, }; +static const struct file_operations dbfs_ops = { + .read = dbfs_read, + .llseek = no_llseek, +}; + void hypfs_dbfs_create_file(struct hypfs_dbfs_file *df) { - df->dentry = debugfs_create_file(df->name, 0400, dbfs_dir, df, - &dbfs_ops); + const struct file_operations *fops = &dbfs_ops; + + if (df->unlocked_ioctl && !security_locked_down(LOCKDOWN_DEBUGFS)) + fops = &dbfs_ops_ioctl; + df->dentry = debugfs_create_file(df->name, 0400, dbfs_dir, df, fops); mutex_init(&df->lock); } diff --git a/arch/s390/include/asm/entry-common.h b/arch/s390/include/asm/entry-common.h index b6474e779f..2bd5030746 100644 --- a/arch/s390/include/asm/entry-common.h +++
b/arch/s390/include/asm/entry-common.h @@ -55,7 +55,7 @@ static __always_inline void arch_exit_to_user_mode(void) static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, unsigned long ti_work) { - choose_random_kstack_offset(get_tod_clock_fast() & 0xff); + choose_random_kstack_offset(get_tod_clock_fast()); } #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index 2fbee3887d..6c8922ad70 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -106,6 +106,10 @@ static pci_ers_result_t zpci_event_do_error_state_clear(struct pci_dev *pdev, struct zpci_dev *zdev = to_zpci(pdev); int rc; + /* The underlying device may have been disabled by the event */ + if (!zdev_enabled(zdev)) + return PCI_ERS_RESULT_NEED_RESET; + pr_info("%s: Unblocking device access for examination\n", pci_name(pdev)); rc = zpci_reset_load_store_blocked(zdev); if (rc) { @@ -273,6 +277,8 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf) struct zpci_dev *zdev = get_zdev_by_fid(ccdf->fid); struct pci_dev *pdev = NULL; pci_ers_result_t ers_res; + u32 fh = 0; + int rc; zpci_dbg(3, "err fid:%x, fh:%x, pec:%x\n", ccdf->fid, ccdf->fh, ccdf->pec); @@ -281,6 +287,15 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf) if (zdev) { mutex_lock(&zdev->state_lock); + rc = clp_refresh_fh(zdev->fid, &fh); + if (rc) + goto no_pdev; + if (!fh || ccdf->fh != fh) { + /* Ignore events with stale handles */ + zpci_dbg(3, "err fid:%x, fh:%x (stale %x)\n", + ccdf->fid, fh, ccdf->fh); + goto no_pdev; + } zpci_update_fh(zdev, ccdf->fh); if (zdev->zbus->bus) pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn); diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 08486d8377..e05608e5b3 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1949,7 +1949,6 @@ config INTEL_TDX_HOST depends on X86_X2APIC select ARCH_KEEP_MEMBLOCK depends on CONTIG_ALLOC - depends on !KEXEC_CORE depends on X86_MCE help Intel Trust Domain Extensions (TDX) protects guest VMs from malicious @@ -2754,6 +2753,15 @@ config MITIGATION_TSA security vulnerability on AMD CPUs which can lead to forwarding of invalid info to subsequent instructions and thus can affect their timing and thereby cause a leakage. + +config MITIGATION_VMSCAPE + bool "Mitigate VMSCAPE" + depends on KVM + default y + help + Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security + vulnerability on Intel and AMD CPUs that may allow a guest to do + Spectre v2 style attacks on the userspace hypervisor.
endif config ARCH_HAS_ADD_PAGES diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 200fc42504..2248a13e80 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -491,6 +491,7 @@ #define X86_FEATURE_TSA_SQ_NO (21*32+11) /* AMD CPU not vulnerable to TSA-SQ */ #define X86_FEATURE_TSA_L1_NO (21*32+12) /* AMD CPU not vulnerable to TSA-L1 */ #define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */ +#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */ /* * BUG word(s) @@ -546,4 +547,5 @@ #define X86_BUG_ITS X86_BUG( 1*32+ 7) /* "its" CPU is affected by Indirect Target Selection */ #define X86_BUG_ITS_NATIVE_ONLY X86_BUG( 1*32+ 8) /* "its_native_only" CPU is affected by ITS, VMX is not affected */ #define X86_BUG_TSA X86_BUG( 1*32+ 9) /* "tsa" CPU is affected by Transient Scheduler Attacks */ +#define X86_BUG_VMSCAPE X86_BUG( 1*32+10) /* "vmscape" CPU is affected by VMSCAPE attacks from guests */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h index 7e523bb3d2..bb0a5ecc80 100644 --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -73,19 +73,23 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, #endif /* - * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(), - * but not enough for x86 stack utilization comfort. To keep - * reasonable stack head room, reduce the maximum offset to 8 bits. - * - * The actual entropy will be further reduced by the compiler when - * applying stack alignment constraints (see cc_stack_align4/8 in + * This value will get limited by KSTACK_OFFSET_MAX(), which is 10 + * bits. The actual entropy will be further reduced by the compiler + * when applying stack alignment constraints (see cc_stack_align4/8 in * arch/x86/Makefile), which will remove the 3 (x86_64) or 2 (ia32) * low bits from any entropy chosen here. * - * Therefore, final stack offset entropy will be 5 (x86_64) or - * 6 (ia32) bits. + * Therefore, final stack offset entropy will be 7 (x86_64) or + * 8 (ia32) bits. 
*/ - choose_random_kstack_offset(rdtsc() & 0xFF); + choose_random_kstack_offset(rdtsc()); + + /* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */ + if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) && + this_cpu_read(x86_ibpb_exit_to_user)) { + indirect_branch_prediction_barrier(); + this_cpu_write(x86_ibpb_exit_to_user, false); + } } #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index 8be622e82b..a03770fb24 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page, unsigned long page_list, unsigned long start_address, unsigned int preserve_context, - unsigned int host_mem_enc_active); + unsigned int cache_incoherent); #endif #define ARCH_HAS_KIMAGE_ARCH diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 5412599de8..17258f6063 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -545,6 +545,8 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature) : "memory"); } +DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user); + static inline void indirect_branch_prediction_barrier(void) { asm_inline volatile(ALTERNATIVE("", "call write_ibpb", X86_FEATURE_IBPB) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 8d5036457f..033847d710 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -773,6 +773,8 @@ void __noreturn stop_this_cpu(void *dummy); void microcode_check(struct cpuinfo_x86 *prev_info); void store_cpu_caps(struct cpuinfo_x86 *info); +DECLARE_PER_CPU(bool, cache_state_incoherent); + enum l1tf_mitigations { L1TF_MITIGATION_OFF, L1TF_MITIGATION_FLUSH_NOWARN, diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 3359673ddf..256682f06c 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -98,18 +98,41 @@ u64 __seamcall_ret(u64 fn, struct tdx_module_args *args); u64 __seamcall_saved_ret(u64 fn, struct tdx_module_args *args); void tdx_init(void); +#include <linux/preempt.h> #include <asm/archrandom.h> +#include <asm/processor.h> typedef u64 (*sc_func_t)(u64 fn, struct tdx_module_args *args); -static inline u64 sc_retry(sc_func_t func, u64 fn, +static __always_inline u64 __seamcall_dirty_cache(sc_func_t func, u64 fn, + struct tdx_module_args *args) +{ + lockdep_assert_preemption_disabled(); + + /* + * SEAMCALLs are made to the TDX module and can generate dirty + * cachelines of TDX private memory. Mark cache state incoherent + * so that the cache can be flushed during kexec. + * + * This needs to be done before actually making the SEAMCALL, + * because the kexec-ing CPU could send an NMI to stop remote CPUs, + * in which case even disabling IRQs won't help here.
+ */ + this_cpu_write(cache_state_incoherent, true); + + return func(fn, args); +} + +static __always_inline u64 sc_retry(sc_func_t func, u64 fn, struct tdx_module_args *args) { int retry = RDRAND_RETRY_LOOPS; u64 ret; do { - ret = func(fn, args); + preempt_disable(); + ret = __seamcall_dirty_cache(func, fn, args); + preempt_enable(); } while (ret == TDX_RND_NO_ENTROPY && --retry); return ret; @@ -199,5 +222,11 @@ static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; } static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; } #endif /* CONFIG_INTEL_TDX_HOST */ +#ifdef CONFIG_KEXEC_CORE +void tdx_cpu_flush_cache_for_kexec(void); +#else +static inline void tdx_cpu_flush_cache_for_kexec(void) { } +#endif + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_TDX_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 381249de1c..fcae36a2f7 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -530,6 +530,23 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c) { u64 msr; + /* + * Mark that WBINVD is needed during kexec on processors that + * support SME. This provides support for performing a successful + * kexec when going from SME inactive to SME active (or vice-versa). + * + * The cache must be cleared so that if there are entries with the + * same physical address, both with and without the encryption bit, + * they don't race each other when flushed and potentially end up + * with the wrong entry being committed to memory. + * + * Test the CPUID bit directly because with mem_encrypt=off the + * BSP will clear the X86_FEATURE_SME bit and the APs will not + * see it set after that. + */ + if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0))) + __this_cpu_write(cache_state_incoherent, true); + /* * BIOS support is required for SME and SEV. * For SME: If BIOS has enabled SME then adjust x86_phys_bits by diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 0017b9e4db..e23ae9b47d 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -53,6 +53,9 @@ static void __init gds_select_mitigation(void); static void __init its_select_mitigation(void); static void __init tsa_select_mitigation(void); static void __init tsa_apply_mitigation(void); +static void __init vmscape_select_mitigation(void); +static void __init vmscape_update_mitigation(void); +static void __init vmscape_apply_mitigation(void); /* The base value of the SPEC_CTRL MSR without task-specific bits set */ u64 x86_spec_ctrl_base; @@ -62,6 +65,14 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base); DEFINE_PER_CPU(u64, x86_spec_ctrl_current); EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current); +/* + * Set when the CPU has run a potentially malicious guest. An IBPB will + * be needed before running userspace. That IBPB will flush the branch + * predictor content. + */ +DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user); +EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user); + u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB; static u64 __ro_after_init x86_arch_cap_msr; @@ -197,6 +208,9 @@ void __init cpu_select_mitigations(void) its_select_mitigation(); tsa_select_mitigation(); tsa_apply_mitigation(); + vmscape_select_mitigation(); + vmscape_update_mitigation(); + vmscape_apply_mitigation(); } /* @@ -2173,88 +2187,6 @@ static void update_mds_branch_idle(void) } } -#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible.
See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" -#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n" -#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n" - -void cpu_bugs_smt_update(void) -{ - mutex_lock(&spec_ctrl_mutex); - - if (sched_smt_active() && unprivileged_ebpf_enabled() && - spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE) - pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG); - - switch (spectre_v2_user_stibp) { - case SPECTRE_V2_USER_NONE: - break; - case SPECTRE_V2_USER_STRICT: - case SPECTRE_V2_USER_STRICT_PREFERRED: - update_stibp_strict(); - break; - case SPECTRE_V2_USER_PRCTL: - case SPECTRE_V2_USER_SECCOMP: - update_indir_branch_cond(); - break; - } - - switch (mds_mitigation) { - case MDS_MITIGATION_FULL: - case MDS_MITIGATION_AUTO: - case MDS_MITIGATION_VMWERV: - if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY)) - pr_warn_once(MDS_MSG_SMT); - update_mds_branch_idle(); - break; - case MDS_MITIGATION_OFF: - break; - } - - switch (taa_mitigation) { - case TAA_MITIGATION_VERW: - case TAA_MITIGATION_AUTO: - case TAA_MITIGATION_UCODE_NEEDED: - if (sched_smt_active()) - pr_warn_once(TAA_MSG_SMT); - break; - case TAA_MITIGATION_TSX_DISABLED: - case TAA_MITIGATION_OFF: - break; - } - - switch (mmio_mitigation) { - case MMIO_MITIGATION_VERW: - case MMIO_MITIGATION_AUTO: - case MMIO_MITIGATION_UCODE_NEEDED: - if (sched_smt_active()) - pr_warn_once(MMIO_MSG_SMT); - break; - case MMIO_MITIGATION_OFF: - break; - } - - switch (tsa_mitigation) { - case TSA_MITIGATION_USER_KERNEL: - case TSA_MITIGATION_VM: - case TSA_MITIGATION_AUTO: - case TSA_MITIGATION_FULL: - /* - * TSA-SQ can potentially lead to info leakage between - * SMT threads. - */ - if (sched_smt_active()) - static_branch_enable(&cpu_buf_idle_clear); - else - static_branch_disable(&cpu_buf_idle_clear); - break; - case TSA_MITIGATION_NONE: - case TSA_MITIGATION_UCODE_NEEDED: - break; - } - - mutex_unlock(&spec_ctrl_mutex); -} - #ifdef CONFIG_DEBUG_FS /* * Provide a debugfs file to dump SPEC_CTRL MSRs of all the CPUs @@ -3037,9 +2969,185 @@ out: pr_info("%s\n", srso_strings[srso_mitigation]); } +#undef pr_fmt +#define pr_fmt(fmt) "VMSCAPE: " fmt + +enum vmscape_mitigations { + VMSCAPE_MITIGATION_NONE, + VMSCAPE_MITIGATION_AUTO, + VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER, + VMSCAPE_MITIGATION_IBPB_ON_VMEXIT, +}; + +static const char * const vmscape_strings[] = { + [VMSCAPE_MITIGATION_NONE] = "Vulnerable", + /* [VMSCAPE_MITIGATION_AUTO] */ + [VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace", + [VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT", +}; + +static enum vmscape_mitigations vmscape_mitigation __ro_after_init = + IS_ENABLED(CONFIG_MITIGATION_VMSCAPE) ? 
VMSCAPE_MITIGATION_AUTO : VMSCAPE_MITIGATION_NONE; + +static int __init vmscape_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!strcmp(str, "off")) { + vmscape_mitigation = VMSCAPE_MITIGATION_NONE; + } else if (!strcmp(str, "ibpb")) { + vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER; + } else if (!strcmp(str, "force")) { + setup_force_cpu_bug(X86_BUG_VMSCAPE); + vmscape_mitigation = VMSCAPE_MITIGATION_AUTO; + } else { + pr_err("Ignoring unknown vmscape=%s option.\n", str); + } + + return 0; +} +early_param("vmscape", vmscape_parse_cmdline); + +static void __init vmscape_select_mitigation(void) +{ + if (cpu_mitigations_off() || + !boot_cpu_has_bug(X86_BUG_VMSCAPE) || + !boot_cpu_has(X86_FEATURE_IBPB)) { + vmscape_mitigation = VMSCAPE_MITIGATION_NONE; + return; + } + + if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) + vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER; +} + +static void __init vmscape_update_mitigation(void) +{ + if (!boot_cpu_has_bug(X86_BUG_VMSCAPE)) + return; + + if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB || + srso_mitigation == SRSO_MITIGATION_IBPB_ON_VMEXIT) + vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_ON_VMEXIT; + + pr_info("%s\n", vmscape_strings[vmscape_mitigation]); +} + +static void __init vmscape_apply_mitigation(void) +{ + if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER) + setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER); +} + #undef pr_fmt #define pr_fmt(fmt) fmt +#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" +#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n" +#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n" +#define VMSCAPE_MSG_SMT "VMSCAPE: SMT on, STIBP is required for full protection. 
See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/vmscape.html for more details.\n" + +void cpu_bugs_smt_update(void) +{ + mutex_lock(&spec_ctrl_mutex); + + if (sched_smt_active() && unprivileged_ebpf_enabled() && + spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE) + pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG); + + switch (spectre_v2_user_stibp) { + case SPECTRE_V2_USER_NONE: + break; + case SPECTRE_V2_USER_STRICT: + case SPECTRE_V2_USER_STRICT_PREFERRED: + update_stibp_strict(); + break; + case SPECTRE_V2_USER_PRCTL: + case SPECTRE_V2_USER_SECCOMP: + update_indir_branch_cond(); + break; + } + + switch (mds_mitigation) { + case MDS_MITIGATION_FULL: + case MDS_MITIGATION_AUTO: + case MDS_MITIGATION_VMWERV: + if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY)) + pr_warn_once(MDS_MSG_SMT); + update_mds_branch_idle(); + break; + case MDS_MITIGATION_OFF: + break; + } + + switch (taa_mitigation) { + case TAA_MITIGATION_VERW: + case TAA_MITIGATION_AUTO: + case TAA_MITIGATION_UCODE_NEEDED: + if (sched_smt_active()) + pr_warn_once(TAA_MSG_SMT); + break; + case TAA_MITIGATION_TSX_DISABLED: + case TAA_MITIGATION_OFF: + break; + } + + switch (mmio_mitigation) { + case MMIO_MITIGATION_VERW: + case MMIO_MITIGATION_AUTO: + case MMIO_MITIGATION_UCODE_NEEDED: + if (sched_smt_active()) + pr_warn_once(MMIO_MSG_SMT); + break; + case MMIO_MITIGATION_OFF: + break; + } + + switch (tsa_mitigation) { + case TSA_MITIGATION_USER_KERNEL: + case TSA_MITIGATION_VM: + case TSA_MITIGATION_AUTO: + case TSA_MITIGATION_FULL: + /* + * TSA-SQ can potentially lead to info leakage between + * SMT threads. + */ + if (sched_smt_active()) + static_branch_enable(&cpu_buf_idle_clear); + else + static_branch_disable(&cpu_buf_idle_clear); + break; + case TSA_MITIGATION_NONE: + case TSA_MITIGATION_UCODE_NEEDED: + break; + } + + switch (vmscape_mitigation) { + case VMSCAPE_MITIGATION_NONE: + case VMSCAPE_MITIGATION_AUTO: + break; + case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT: + case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER: + /* + * Hypervisors can be attacked across threads; warn for SMT when + * STIBP is not already enabled system-wide. + * + * Intel eIBRS (!AUTOIBRS) implies STIBP on.
+ */ + if (!sched_smt_active() || + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED || + (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && + !boot_cpu_has(X86_FEATURE_AUTOIBRS))) + break; + pr_warn_once(VMSCAPE_MSG_SMT); + break; + } + + mutex_unlock(&spec_ctrl_mutex); +} + #ifdef CONFIG_SYSFS #define L1TF_DEFAULT_MSG "Mitigation: PTE Inversion" @@ -3283,6 +3391,11 @@ static ssize_t tsa_show_state(char *buf) return sysfs_emit(buf, "%s\n", tsa_strings[tsa_mitigation]); } +static ssize_t vmscape_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", vmscape_strings[vmscape_mitigation]); +} + static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, char *buf, unsigned int bug) { @@ -3347,6 +3460,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr case X86_BUG_TSA: return tsa_show_state(buf); + case X86_BUG_VMSCAPE: + return vmscape_show_state(buf); + default: break; } @@ -3436,6 +3552,11 @@ ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *bu { return cpu_show_common(dev, attr, buf, X86_BUG_TSA); } + +ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_VMSCAPE); +} #endif void __warn_thunk(void) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 5e20196dfc..ba64048fcd 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1280,55 +1280,71 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { #define ITS_NATIVE_ONLY BIT(9) /* CPU is affected by Transient Scheduler Attacks */ #define TSA BIT(10) +/* CPU is affected by VMSCAPE */ +#define VMSCAPE BIT(11) static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { - VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_HASWELL, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_HASWELL_L, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_HASWELL_G, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_HASWELL_X, X86_STEP_MAX, MMIO), - VULNBL_INTEL_STEPS(INTEL_BROADWELL_D, X86_STEP_MAX, MMIO), - VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO), - VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS), - VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS), - VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS), - VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS), - VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED), + VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE_X, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE_X, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_HASWELL, X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_HASWELL_L, X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_HASWELL_G, 
X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_HASWELL_X, X86_STEP_MAX, MMIO | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_BROADWELL_D, X86_STEP_MAX, MMIO | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED | VMSCAPE), VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), - VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS), - VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE), VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPS(INTEL_LAKEFIELD, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED), VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), - VULNBL_INTEL_TYPE(INTEL_ALDERLAKE, ATOM, RFDS), - VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS), - VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE, ATOM, RFDS), - VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P, X86_STEP_MAX, RFDS), - VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S, X86_STEP_MAX, RFDS), - VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX, RFDS), + VULNBL_INTEL_TYPE(INTEL_ALDERLAKE, ATOM, RFDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ALDERLAKE, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS | VMSCAPE), + VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE, ATOM, RFDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P, X86_STEP_MAX, RFDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S, X86_STEP_MAX, RFDS | VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_METEORLAKE_L, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_H, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ARROWLAKE, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_U, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_LUNARLAKE_M, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_SAPPHIRERAPIDS_X, X86_STEP_MAX, 
VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_GRANITERAPIDS_X, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_EMERALDRAPIDS_X, X86_STEP_MAX, VMSCAPE), + VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX, RFDS | VMSCAPE), VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT, X86_STEP_MAX, MMIO | MMIO_SBDS | RFDS), VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_D, X86_STEP_MAX, MMIO | RFDS), VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RFDS), VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT, X86_STEP_MAX, RFDS), VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_D, X86_STEP_MAX, RFDS), VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_PLUS, X86_STEP_MAX, RFDS), + VULNBL_INTEL_STEPS(INTEL_ATOM_CRESTMONT_X, X86_STEP_MAX, VMSCAPE), VULNBL_AMD(0x15, RETBLEED), VULNBL_AMD(0x16, RETBLEED), - VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO), - VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO), - VULNBL_AMD(0x19, SRSO | TSA), - VULNBL_AMD(0x1a, SRSO), + VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO | VMSCAPE), + VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO | VMSCAPE), + VULNBL_AMD(0x19, SRSO | TSA | VMSCAPE), + VULNBL_AMD(0x1a, SRSO | VMSCAPE), {} }; @@ -1551,6 +1567,14 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) } } + /* + * Set the bug only on bare-metal. A nested hypervisor should already be + * deploying IBPB to isolate itself from nested guests. + */ + if (cpu_matches(cpu_vuln_blacklist, VMSCAPE) && + !boot_cpu_has(X86_FEATURE_HYPERVISOR)) + setup_force_cpu_bug(X86_BUG_VMSCAPE); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return; diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 31af9f60cc..7d896e9bcb 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -29,6 +29,7 @@ #include #include #include +#include #ifdef CONFIG_ACPI /* @@ -299,6 +300,22 @@ int machine_kexec_prepare(struct kimage *image) unsigned long start_pgtable; int result; + /* + * Some early TDX-capable platforms have an erratum: a partial write + * by the kernel (a write transaction of less than a cacheline + * arriving at the memory controller) to TDX private memory poisons + * that memory, and a subsequent read triggers a machine check. + * + * On those platforms the old kernel must reset TDX private + * memory before jumping to the new kernel, otherwise the new + * kernel may see unexpected machine checks. For simplicity, + * just fail kexec/kdump on those platforms. + */ + if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE)) { + pr_info_once("Not allowed on platform with tdx_pw_mce bug\n"); + return -EOPNOTSUPP; + } + /* Calculate the offsets */ start_pgtable = page_to_pfn(image->control_code_page) << PAGE_SHIFT; @@ -323,6 +340,7 @@ void machine_kexec(struct kimage *image) { unsigned long page_list[PAGES_NR]; void *control_page; + unsigned int cache_incoherent; int save_ftrace_enabled; #ifdef CONFIG_KEXEC_JUMP @@ -362,6 +380,12 @@ void machine_kexec(struct kimage *image) page_list[PA_SWAP_PAGE] = (page_to_pfn(image->swap_page) << PAGE_SHIFT); + /* + * This must be done before load_segments(), since it resets + * GS to 0 and percpu data needs the correct GS to work. + */ + cache_incoherent = this_cpu_read(cache_state_incoherent); + /* * The segment registers are funny things, they have both a * visible and an invisible part. Whenever the visible part is @@ -371,6 +395,11 @@ void machine_kexec(struct kimage *image) * * I take advantage of this here by force loading the * segments, before I zap the gdt with an invalid value. + * + * load_segments() resets GS to 0.
Don't make any function calls + after this point, since call depth tracking uses percpu variables to + * operate (relocate_kernel() is explicitly ignored by call depth + * tracking). */ load_segments(); /* diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 6a3214df4e..53488481dd 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -84,6 +84,16 @@ EXPORT_PER_CPU_SYMBOL(cpu_tss_rw); DEFINE_PER_CPU(bool, __tss_limit_invalid); EXPORT_PER_CPU_SYMBOL_GPL(__tss_limit_invalid); +/* + * The cache may be in an incoherent state and needs flushing during kexec. + * E.g., on SME/TDX platforms, dirty cacheline aliases with and without + * encryption bit(s) can coexist and the cache needs to be flushed before + * booting to the new kernel to avoid silent memory corruption due to + * dirty cachelines with different encryption properties being written back + * to memory. + */ +DEFINE_PER_CPU(bool, cache_state_incoherent); + /* * this gets called so that we can store lazy state into memory and copy the * current task into the new thread. @@ -785,19 +795,7 @@ void __noreturn stop_this_cpu(void *dummy) disable_local_APIC(); mcheck_cpu_clear(c); - /* - * Use wbinvd on processors that support SME. This provides support - * for performing a successful kexec when going from SME inactive - * to SME active (or vice-versa). The cache must be cleared so that - * if there are entries with the same physical address, both with and - * without the encryption bit, they don't race each other when flushed - * and potentially end up with the wrong entry being committed to - * memory. - * - * Test the CPUID bit directly because the machine might've cleared - * X86_FEATURE_SME due to cmdline options. - */ - if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0))) + if (this_cpu_read(cache_state_incoherent)) native_wbinvd(); /* diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index 042c9a0334..6f25b357d3 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -52,7 +52,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel) * %rsi page_list * %rdx start address * %rcx preserve_context - * %r8 host_mem_enc_active + * %r8 cache_incoherent */ /* Save the CPU context, used for jumping back */ @@ -161,14 +161,21 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped) movq %r9, %cr3 /* + * If the memory cache is in an incoherent state, e.g. due to + * memory encryption, do WBINVD to flush the cache. + * * If SME is active, there could be old encrypted cache line * entries that will conflict with the now unencrypted memory * used by kexec. Flush the caches before copying the kernel. + * + * Note that SME sets this flag to true when the platform supports + * SME, so the WBINVD is performed even when SME is not activated + * by the kernel; this is harmless. */ testq %r12, %r12 - jz .Lsme_off + jz .Lnowbinvd wbinvd -.Lsme_off: +.Lnowbinvd: movq %rcx, %r11 call swap_pages diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 4f0a94346d..9c301200e9 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1980,6 +1980,9 @@ int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu) if (entries[i] == KVM_HV_TLB_FLUSHALL_ENTRY) goto out_flush_all; + if (is_noncanonical_invlpg_address(entries[i], vcpu)) + continue; + /* * Lower 12 bits of 'address' encode the number of additional * pages to flush.
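The kexec-related hunks above (machine_kexec(), stop_this_cpu(), relocate_kernel_64.S) all consume the same per-CPU cache_state_incoherent flag that the SME and SEAMCALL paths set. The following is a minimal sketch of that set-once/flush-once lifecycle, not part of the patch: a plain boolean stands in for the per-CPU variable, and the helper names mark_cache_dirty()/flush_cache_if_needed() are invented for illustration:

#include <stdbool.h>
#include <stdio.h>

static bool cache_incoherent; /* models the per-CPU cache_state_incoherent */

static void mark_cache_dirty(void)
{
	/* set wherever dirty encrypted/private cachelines may be created,
	 * e.g. before a SEAMCALL or when SME support is detected */
	cache_incoherent = true;
}

static void flush_cache_if_needed(void)
{
	/* consumed at the last safe point before handing off to kexec */
	if (!cache_incoherent)
		return;
	puts("WBINVD"); /* stand-in for native_wbinvd() */
	cache_incoherent = false;
}

int main(void)
{
	flush_cache_if_needed(); /* coherent: nothing to flush */
	mark_cache_dirty();
	mark_cache_dirty();      /* repeated marking is idempotent */
	flush_cache_if_needed(); /* a single WBINVD clears the state */
	return 0;
}

Keeping the flag per-CPU is what lets tdx_disable_virtualization_cpu() and stop_this_cpu() each flush only when their own CPU may actually hold dirty lines, instead of paying for WBINVD unconditionally.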
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 0c7bfe17e3..f8dadff19c 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -442,6 +442,16 @@ void tdx_disable_virtualization_cpu(void) tdx_flush_vp(&arg); } local_irq_restore(flags); + + /* + * Flush cache now if kexec is possible: this is necessary to avoid + * having dirty private memory cachelines when the new kernel boots, + * but WBINVD is a relatively expensive operation and doing it during + * kexec can exacerbate races in native_stop_other_cpus(). Do it + * now, since this is a safe moment and there is going to be no more + * TDX activity on this CPU from this point on. + */ + tdx_cpu_flush_cache_for_kexec(); } #define TDX_SEAMCALL_RETRIES 10000 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 48369747aa..5be10ea1f4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11160,6 +11160,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (vcpu->arch.guest_fpu.xfd_err) wrmsrl(MSR_IA32_XFD_ERR, 0); + /* + * Mark this CPU as needing a branch predictor flush before running + * userspace. Must be done before enabling preemption to ensure it gets + * set for the CPU that actually ran the guest, and not the CPU that it + * may migrate to. + */ + if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER)) + this_cpu_write(x86_ibpb_exit_to_user, true); + /* * Consume any pending interrupts, including the possible source of * VM-Exit on SVM and any ticks that occur between VM-Exit and now. diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 0df91577e7..15ac88c620 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -75,8 +75,9 @@ static inline void seamcall_err_ret(u64 fn, u64 err, args->r9, args->r10, args->r11); } -static inline int sc_retry_prerr(sc_func_t func, sc_err_func_t err_func, - u64 fn, struct tdx_module_args *args) +static __always_inline int sc_retry_prerr(sc_func_t func, + sc_err_func_t err_func, + u64 fn, struct tdx_module_args *args) { u64 sret = sc_retry(func, fn, args); @@ -1288,7 +1289,7 @@ static bool paddr_is_tdx_private(unsigned long phys) return false; /* Get page type from the TDX module */ - sret = __seamcall_ret(TDH_PHYMEM_PAGE_RDMD, &args); + sret = __seamcall_dirty_cache(__seamcall_ret, TDH_PHYMEM_PAGE_RDMD, &args); /* * The SEAMCALL will not return success unless there is a @@ -1544,7 +1545,7 @@ noinstr __flatten u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *ar { args->rcx = tdx_tdvpr_pa(td); - return __seamcall_saved_ret(TDH_VP_ENTER, args); + return __seamcall_dirty_cache(__seamcall_saved_ret, TDH_VP_ENTER, args); } EXPORT_SYMBOL_GPL(tdh_vp_enter); @@ -1892,3 +1893,22 @@ u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page) return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args); } EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid); + +#ifdef CONFIG_KEXEC_CORE +void tdx_cpu_flush_cache_for_kexec(void) +{ + lockdep_assert_preemption_disabled(); + + if (!this_cpu_read(cache_state_incoherent)) + return; + + /* + * Private memory cachelines need to be clean at the time of + * kexec. Write them back now, as the caller promises that + * there should be no more SEAMCALLs on this CPU. 
+ */ + wbinvd(); + this_cpu_write(cache_state_incoherent, false); +} +EXPORT_SYMBOL_GPL(tdx_cpu_flush_cache_for_kexec); +#endif diff --git a/configs/kernel-5.14.0-x86_64-debug.config b/configs/kernel-5.14.0-x86_64-debug.config index bcd9b49d7f..24451e2383 100644 --- a/configs/kernel-5.14.0-x86_64-debug.config +++ b/configs/kernel-5.14.0-x86_64-debug.config @@ -489,6 +489,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set CONFIG_X86_SGX=y +CONFIG_INTEL_TDX_HOST=y CONFIG_X86_USER_SHADOW_STACK=y CONFIG_EFI=y CONFIG_EFI_STUB=y @@ -568,6 +569,7 @@ CONFIG_MITIGATION_SRBDS=y CONFIG_MITIGATION_SSB=y CONFIG_MITIGATION_ITS=y CONFIG_MITIGATION_TSA=y +CONFIG_MITIGATION_VMSCAPE=y CONFIG_ARCH_HAS_ADD_PAGES=y # @@ -776,6 +778,7 @@ CONFIG_KVM=m CONFIG_KVM_SW_PROTECTED_VM=y CONFIG_KVM_INTEL=m CONFIG_X86_SGX_KVM=y +CONFIG_KVM_INTEL_TDX=y CONFIG_KVM_AMD=m CONFIG_KVM_AMD_SEV=y CONFIG_KVM_SMM=y @@ -1115,6 +1118,7 @@ CONFIG_SPARSEMEM_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_ISOLATION=y CONFIG_EXCLUSIVE_SYSTEM_RAM=y diff --git a/configs/kernel-5.14.0-x86_64-rt-debug.config b/configs/kernel-5.14.0-x86_64-rt-debug.config index df017aa857..c64f9ab66e 100644 --- a/configs/kernel-5.14.0-x86_64-rt-debug.config +++ b/configs/kernel-5.14.0-x86_64-rt-debug.config @@ -490,6 +490,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set CONFIG_X86_SGX=y +CONFIG_INTEL_TDX_HOST=y CONFIG_X86_USER_SHADOW_STACK=y CONFIG_EFI=y CONFIG_EFI_STUB=y @@ -570,6 +571,7 @@ CONFIG_MITIGATION_SRBDS=y CONFIG_MITIGATION_SSB=y CONFIG_MITIGATION_ITS=y CONFIG_MITIGATION_TSA=y +CONFIG_MITIGATION_VMSCAPE=y CONFIG_ARCH_HAS_ADD_PAGES=y # @@ -785,6 +787,7 @@ CONFIG_KVM_SW_PROTECTED_VM=y CONFIG_KVM_INTEL=m # CONFIG_KVM_INTEL_PROVE_VE is not set CONFIG_X86_SGX_KVM=y +CONFIG_KVM_INTEL_TDX=y CONFIG_KVM_AMD=m CONFIG_KVM_AMD_SEV=y CONFIG_KVM_SMM=y @@ -1123,6 +1126,7 @@ CONFIG_SPARSEMEM_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_ISOLATION=y CONFIG_EXCLUSIVE_SYSTEM_RAM=y diff --git a/configs/kernel-5.14.0-x86_64-rt.config b/configs/kernel-5.14.0-x86_64-rt.config index 906428dd61..99170c3c9d 100644 --- a/configs/kernel-5.14.0-x86_64-rt.config +++ b/configs/kernel-5.14.0-x86_64-rt.config @@ -490,6 +490,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set CONFIG_X86_SGX=y +CONFIG_INTEL_TDX_HOST=y CONFIG_X86_USER_SHADOW_STACK=y CONFIG_EFI=y CONFIG_EFI_STUB=y @@ -570,6 +571,7 @@ CONFIG_MITIGATION_SRBDS=y CONFIG_MITIGATION_SSB=y CONFIG_MITIGATION_ITS=y CONFIG_MITIGATION_TSA=y +CONFIG_MITIGATION_VMSCAPE=y CONFIG_ARCH_HAS_ADD_PAGES=y # @@ -783,6 +785,7 @@ CONFIG_KVM_SW_PROTECTED_VM=y CONFIG_KVM_INTEL=m # CONFIG_KVM_INTEL_PROVE_VE is not set CONFIG_X86_SGX_KVM=y +CONFIG_KVM_INTEL_TDX=y CONFIG_KVM_AMD=m CONFIG_KVM_AMD_SEV=y CONFIG_KVM_SMM=y @@ -1120,6 +1123,7 @@ CONFIG_SPARSEMEM_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_ISOLATION=y CONFIG_EXCLUSIVE_SYSTEM_RAM=y diff --git a/configs/kernel-5.14.0-x86_64.config 
b/configs/kernel-5.14.0-x86_64.config index d9988b53ce..14540bf494 100644 --- a/configs/kernel-5.14.0-x86_64.config +++ b/configs/kernel-5.14.0-x86_64.config @@ -486,6 +486,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set CONFIG_X86_SGX=y +CONFIG_INTEL_TDX_HOST=y CONFIG_X86_USER_SHADOW_STACK=y CONFIG_EFI=y CONFIG_EFI_STUB=y @@ -566,6 +567,7 @@ CONFIG_MITIGATION_SRBDS=y CONFIG_MITIGATION_SSB=y CONFIG_MITIGATION_ITS=y CONFIG_MITIGATION_TSA=y +CONFIG_MITIGATION_VMSCAPE=y CONFIG_ARCH_HAS_ADD_PAGES=y # @@ -772,6 +774,7 @@ CONFIG_KVM=m CONFIG_KVM_SW_PROTECTED_VM=y CONFIG_KVM_INTEL=m CONFIG_X86_SGX_KVM=y +CONFIG_KVM_INTEL_TDX=y CONFIG_KVM_AMD=m CONFIG_KVM_AMD_SEV=y CONFIG_KVM_SMM=y @@ -1111,6 +1114,7 @@ CONFIG_SPARSEMEM_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_ISOLATION=y CONFIG_EXCLUSIVE_SYSTEM_RAM=y diff --git a/crypto/seqiv.c b/crypto/seqiv.c index 86bb33644d..ae7256cfc7 100644 --- a/crypto/seqiv.c +++ b/crypto/seqiv.c @@ -23,7 +23,7 @@ static void seqiv_aead_encrypt_complete2(struct aead_request *req, int err) struct aead_request *subreq = aead_request_ctx(req); struct crypto_aead *geniv; - if (err == -EINPROGRESS) + if (err == -EINPROGRESS || err == -EBUSY) return; if (err) diff --git a/crypto/xts.c b/crypto/xts.c index 6c12f30dbd..9f90121b69 100644 --- a/crypto/xts.c +++ b/crypto/xts.c @@ -203,12 +203,12 @@ static void xts_encrypt_done(struct crypto_async_request *areq, int err) if (!err) { struct xts_request_ctx *rctx = skcipher_request_ctx(req); - rctx->subreq.base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP; + rctx->subreq.base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG; err = xts_xor_tweak_post(req, true); if (!err && unlikely(req->cryptlen % XTS_BLOCK_SIZE)) { err = xts_cts_final(req, crypto_skcipher_encrypt); - if (err == -EINPROGRESS) + if (err == -EINPROGRESS || err == -EBUSY) return; } } @@ -223,12 +223,12 @@ static void xts_decrypt_done(struct crypto_async_request *areq, int err) if (!err) { struct xts_request_ctx *rctx = skcipher_request_ctx(req); - rctx->subreq.base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP; + rctx->subreq.base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG; err = xts_xor_tweak_post(req, false); if (!err && unlikely(req->cryptlen % XTS_BLOCK_SIZE)) { err = xts_cts_final(req, crypto_skcipher_decrypt); - if (err == -EINPROGRESS) + if (err == -EINPROGRESS || err == -EBUSY) return; } } diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 8328b35722..88739ceb17 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -581,6 +581,7 @@ CPU_SHOW_VULN_FALLBACK(gds); CPU_SHOW_VULN_FALLBACK(reg_file_data_sampling); CPU_SHOW_VULN_FALLBACK(indirect_target_selection); CPU_SHOW_VULN_FALLBACK(tsa); +CPU_SHOW_VULN_FALLBACK(vmscape); static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); @@ -598,6 +599,7 @@ static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL); static DEVICE_ATTR(tsa, 0444, cpu_show_tsa, NULL); +static DEVICE_ATTR(vmscape, 0444, cpu_show_vmscape, NULL); static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -616,6 +618,7 @@ static struct attribute 
*cpu_root_vulnerabilities_attrs[] = { &dev_attr_reg_file_data_sampling.attr, &dev_attr_indirect_target_selection.attr, &dev_attr_tsa.attr, + &dev_attr_vmscape.attr, NULL }; diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c index 7d16f6fb38..db00a40579 100644 --- a/drivers/firmware/arm_scpi.c +++ b/drivers/firmware/arm_scpi.c @@ -815,7 +815,7 @@ static int scpi_init_versions(struct scpi_drvinfo *info) info->firmware_version = le32_to_cpu(caps.platform_version); } /* Ignore error if not implemented */ - if (scpi_info->is_legacy && ret == -EOPNOTSUPP) + if (info->is_legacy && ret == -EOPNOTSUPP) return 0; return ret; @@ -911,13 +911,14 @@ static int scpi_probe(struct platform_device *pdev) struct resource res; struct device *dev = &pdev->dev; struct device_node *np = dev->of_node; + struct scpi_drvinfo *scpi_drvinfo; - scpi_info = devm_kzalloc(dev, sizeof(*scpi_info), GFP_KERNEL); - if (!scpi_info) + scpi_drvinfo = devm_kzalloc(dev, sizeof(*scpi_drvinfo), GFP_KERNEL); + if (!scpi_drvinfo) return -ENOMEM; if (of_match_device(legacy_scpi_of_match, &pdev->dev)) - scpi_info->is_legacy = true; + scpi_drvinfo->is_legacy = true; count = of_count_phandle_with_args(np, "mboxes", "#mbox-cells"); if (count < 0) { @@ -925,19 +926,19 @@ static int scpi_probe(struct platform_device *pdev) return -ENODEV; } - scpi_info->channels = devm_kcalloc(dev, count, sizeof(struct scpi_chan), - GFP_KERNEL); - if (!scpi_info->channels) + scpi_drvinfo->channels = + devm_kcalloc(dev, count, sizeof(struct scpi_chan), GFP_KERNEL); + if (!scpi_drvinfo->channels) return -ENOMEM; - ret = devm_add_action(dev, scpi_free_channels, scpi_info); + ret = devm_add_action(dev, scpi_free_channels, scpi_drvinfo); if (ret) return ret; - for (; scpi_info->num_chans < count; scpi_info->num_chans++) { + for (; scpi_drvinfo->num_chans < count; scpi_drvinfo->num_chans++) { resource_size_t size; - int idx = scpi_info->num_chans; - struct scpi_chan *pchan = scpi_info->channels + idx; + int idx = scpi_drvinfo->num_chans; + struct scpi_chan *pchan = scpi_drvinfo->channels + idx; struct mbox_client *cl = &pchan->cl; struct device_node *shmem = of_parse_phandle(np, "shmem", idx); @@ -984,45 +985,53 @@ static int scpi_probe(struct platform_device *pdev) return ret; } - scpi_info->commands = scpi_std_commands; + scpi_drvinfo->commands = scpi_std_commands; - platform_set_drvdata(pdev, scpi_info); + platform_set_drvdata(pdev, scpi_drvinfo); - if (scpi_info->is_legacy) { + if (scpi_drvinfo->is_legacy) { /* Replace with legacy variants */ scpi_ops.clk_set_val = legacy_scpi_clk_set_val; - scpi_info->commands = scpi_legacy_commands; + scpi_drvinfo->commands = scpi_legacy_commands; /* Fill priority bitmap */ for (idx = 0; idx < ARRAY_SIZE(legacy_hpriority_cmds); idx++) set_bit(legacy_hpriority_cmds[idx], - scpi_info->cmd_priority); + scpi_drvinfo->cmd_priority); } - ret = scpi_init_versions(scpi_info); + scpi_info = scpi_drvinfo; + + ret = scpi_init_versions(scpi_drvinfo); if (ret) { dev_err(dev, "incorrect or no SCP firmware found\n"); + scpi_info = NULL; return ret; } - if (scpi_info->is_legacy && !scpi_info->protocol_version && - !scpi_info->firmware_version) + if (scpi_drvinfo->is_legacy && !scpi_drvinfo->protocol_version && + !scpi_drvinfo->firmware_version) dev_info(dev, "SCP Protocol legacy pre-1.0 firmware\n"); else dev_info(dev, "SCP Protocol %lu.%lu Firmware %lu.%lu.%lu version\n", FIELD_GET(PROTO_REV_MAJOR_MASK, - scpi_info->protocol_version), + scpi_drvinfo->protocol_version), FIELD_GET(PROTO_REV_MINOR_MASK, - 
scpi_info->protocol_version), + scpi_drvinfo->protocol_version), FIELD_GET(FW_REV_MAJOR_MASK, - scpi_info->firmware_version), + scpi_drvinfo->firmware_version), FIELD_GET(FW_REV_MINOR_MASK, - scpi_info->firmware_version), + scpi_drvinfo->firmware_version), FIELD_GET(FW_REV_PATCH_MASK, - scpi_info->firmware_version)); - scpi_info->scpi_ops = &scpi_ops; + scpi_drvinfo->firmware_version)); - return devm_of_platform_populate(dev); + scpi_drvinfo->scpi_ops = &scpi_ops; + + ret = devm_of_platform_populate(dev); + if (ret) + scpi_info = NULL; + + return ret; } static const struct of_device_id scpi_of_match[] = { diff --git a/drivers/infiniband/hw/mana/counters.c b/drivers/infiniband/hw/mana/counters.c index e533ce2101..6a81365d3b 100644 --- a/drivers/infiniband/hw/mana/counters.c +++ b/drivers/infiniband/hw/mana/counters.c @@ -34,6 +34,22 @@ static const struct rdma_stat_desc mana_ib_port_stats_desc[] = { [MANA_IB_CURRENT_RATE].name = "current_rate", }; +static const struct rdma_stat_desc mana_ib_device_stats_desc[] = { + [MANA_IB_SENT_CNPS].name = "sent_cnps", + [MANA_IB_RECEIVED_ECNS].name = "received_ecns", + [MANA_IB_RECEIVED_CNP_COUNT].name = "received_cnp_count", + [MANA_IB_QP_CONGESTED_EVENTS].name = "qp_congested_events", + [MANA_IB_QP_RECOVERED_EVENTS].name = "qp_recovered_events", + [MANA_IB_DEV_RATE_INC_EVENTS].name = "rate_inc_events", +}; + +struct rdma_hw_stats *mana_ib_alloc_hw_device_stats(struct ib_device *ibdev) +{ + return rdma_alloc_hw_stats_struct(mana_ib_device_stats_desc, + ARRAY_SIZE(mana_ib_device_stats_desc), + RDMA_HW_STATS_DEFAULT_LIFESPAN); +} + struct rdma_hw_stats *mana_ib_alloc_hw_port_stats(struct ib_device *ibdev, u32 port_num) { @@ -42,8 +58,39 @@ struct rdma_hw_stats *mana_ib_alloc_hw_port_stats(struct ib_device *ibdev, RDMA_HW_STATS_DEFAULT_LIFESPAN); } -int mana_ib_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats, - u32 port_num, int index) +static int mana_ib_get_hw_device_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats) +{ + struct mana_ib_dev *mdev = container_of(ibdev, struct mana_ib_dev, + ib_dev); + struct mana_rnic_query_device_cntrs_resp resp = {}; + struct mana_rnic_query_device_cntrs_req req = {}; + int err; + + mana_gd_init_req_hdr(&req.hdr, MANA_IB_QUERY_DEVICE_COUNTERS, + sizeof(req), sizeof(resp)); + req.hdr.dev_id = mdev->gdma_dev->dev_id; + req.adapter = mdev->adapter_handle; + + err = mana_gd_send_request(mdev_to_gc(mdev), sizeof(req), &req, + sizeof(resp), &resp); + if (err) { + ibdev_err(&mdev->ib_dev, "Failed to query device counters err %d", + err); + return err; + } + + stats->value[MANA_IB_SENT_CNPS] = resp.sent_cnps; + stats->value[MANA_IB_RECEIVED_ECNS] = resp.received_ecns; + stats->value[MANA_IB_RECEIVED_CNP_COUNT] = resp.received_cnp_count; + stats->value[MANA_IB_QP_CONGESTED_EVENTS] = resp.qp_congested_events; + stats->value[MANA_IB_QP_RECOVERED_EVENTS] = resp.qp_recovered_events; + stats->value[MANA_IB_DEV_RATE_INC_EVENTS] = resp.rate_inc_events; + + return ARRAY_SIZE(mana_ib_device_stats_desc); +} + +static int mana_ib_get_hw_port_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats, + u32 port_num) { struct mana_ib_dev *mdev = container_of(ibdev, struct mana_ib_dev, ib_dev); @@ -103,3 +150,12 @@ int mana_ib_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats, return ARRAY_SIZE(mana_ib_port_stats_desc); } + +int mana_ib_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats, + u32 port_num, int index) +{ + if (!port_num) + return mana_ib_get_hw_device_stats(ibdev, 
stats); + else + return mana_ib_get_hw_port_stats(ibdev, stats, port_num); +} diff --git a/drivers/infiniband/hw/mana/counters.h b/drivers/infiniband/hw/mana/counters.h index 7ff92d27f6..987a6fee83 100644 --- a/drivers/infiniband/hw/mana/counters.h +++ b/drivers/infiniband/hw/mana/counters.h @@ -37,8 +37,18 @@ enum mana_ib_port_counters { MANA_IB_CURRENT_RATE, }; +enum mana_ib_device_counters { + MANA_IB_SENT_CNPS, + MANA_IB_RECEIVED_ECNS, + MANA_IB_RECEIVED_CNP_COUNT, + MANA_IB_QP_CONGESTED_EVENTS, + MANA_IB_QP_RECOVERED_EVENTS, + MANA_IB_DEV_RATE_INC_EVENTS, +}; + struct rdma_hw_stats *mana_ib_alloc_hw_port_stats(struct ib_device *ibdev, u32 port_num); +struct rdma_hw_stats *mana_ib_alloc_hw_device_stats(struct ib_device *ibdev); int mana_ib_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats, u32 port_num, int index); #endif /* _COUNTERS_H_ */ diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c index 0eb42466b4..4f83d0f7da 100644 --- a/drivers/infiniband/hw/mana/device.c +++ b/drivers/infiniband/hw/mana/device.c @@ -66,6 +66,10 @@ static const struct ib_device_ops mana_ib_stats_ops = { .get_hw_stats = mana_ib_get_hw_stats, }; +static const struct ib_device_ops mana_ib_device_stats_ops = { + .alloc_hw_device_stats = mana_ib_alloc_hw_device_stats, +}; + static int mana_ib_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) { @@ -154,6 +158,8 @@ static int mana_ib_probe(struct auxiliary_device *adev, } ib_set_device_ops(&dev->ib_dev, &mana_ib_stats_ops); + if (dev->adapter_caps.feature_flags & MANA_IB_FEATURE_DEV_COUNTERS_SUPPORT) + ib_set_device_ops(&dev->ib_dev, &mana_ib_device_stats_ops); ret = mana_ib_create_eqs(dev); if (ret) { diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h index 42bebd6cd4..eddd0a83b9 100644 --- a/drivers/infiniband/hw/mana/mana_ib.h +++ b/drivers/infiniband/hw/mana/mana_ib.h @@ -210,6 +210,7 @@ enum mana_ib_command_code { MANA_IB_DESTROY_RC_QP = 0x3000b, MANA_IB_SET_QP_STATE = 0x3000d, MANA_IB_QUERY_VF_COUNTERS = 0x30022, + MANA_IB_QUERY_DEVICE_COUNTERS = 0x30023, }; struct mana_ib_query_adapter_caps_req { @@ -218,6 +219,7 @@ struct mana_ib_query_adapter_caps_req { enum mana_ib_adapter_features { MANA_IB_FEATURE_CLIENT_ERROR_CQE_SUPPORT = BIT(4), + MANA_IB_FEATURE_DEV_COUNTERS_SUPPORT = BIT(5), }; struct mana_ib_query_adapter_caps_resp { @@ -516,6 +518,23 @@ struct mana_rnic_query_vf_cntrs_resp { u64 current_rate; }; /* HW Data */ +struct mana_rnic_query_device_cntrs_req { + struct gdma_req_hdr hdr; + mana_handle_t adapter; +}; /* HW Data */ + +struct mana_rnic_query_device_cntrs_resp { + struct gdma_resp_hdr hdr; + u32 sent_cnps; + u32 received_ecns; + u32 reserved1; + u32 received_cnp_count; + u32 qp_congested_events; + u32 qp_recovered_events; + u32 rate_inc_events; + u32 reserved2; +}; /* HW Data */ + static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev) { return mdev->gdma_dev->gdma_context; diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c index 14fd7d6c54..a6bf4d539e 100644 --- a/drivers/infiniband/hw/mana/qp.c +++ b/drivers/infiniband/hw/mana/qp.c @@ -772,7 +772,7 @@ static int mana_ib_gd_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, req.ah_attr.dest_port = ROCE_V2_UDP_DPORT; req.ah_attr.src_port = rdma_get_udp_sport(attr->ah_attr.grh.flow_label, ibqp->qp_num, attr->dest_qp_num); - req.ah_attr.traffic_class = attr->ah_attr.grh.traffic_class; + req.ah_attr.traffic_class = 
attr->ah_attr.grh.traffic_class >> 2; req.ah_attr.hop_limit = attr->ah_attr.grh.hop_limit; } diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c index 04192190be..7f94e84d09 100644 --- a/drivers/net/ethernet/ibm/ibmveth.c +++ b/drivers/net/ethernet/ibm/ibmveth.c @@ -39,8 +39,6 @@ #include "ibmveth.h" static irqreturn_t ibmveth_interrupt(int irq, void *dev_instance); -static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter, - bool reuse); static unsigned long ibmveth_get_desired_dma(struct vio_dev *vdev); static struct kobj_type ktype_veth_pool; @@ -213,95 +211,170 @@ static inline void ibmveth_flush_buffer(void *addr, unsigned long length) static void ibmveth_replenish_buffer_pool(struct ibmveth_adapter *adapter, struct ibmveth_buff_pool *pool) { - u32 i; - u32 count = pool->size - atomic_read(&pool->available); - u32 buffers_added = 0; - struct sk_buff *skb; - unsigned int free_index, index; - u64 correlator; + union ibmveth_buf_desc descs[IBMVETH_MAX_RX_PER_HCALL] = {0}; + u32 remaining = pool->size - atomic_read(&pool->available); + u64 correlators[IBMVETH_MAX_RX_PER_HCALL] = {0}; unsigned long lpar_rc; + u32 buffers_added = 0; + u32 i, filled, batch; + struct vio_dev *vdev; dma_addr_t dma_addr; + struct device *dev; + u32 index; + + vdev = adapter->vdev; + dev = &vdev->dev; mb(); - for (i = 0; i < count; ++i) { - union ibmveth_buf_desc desc; + batch = adapter->rx_buffers_per_hcall; - free_index = pool->consumer_index; - index = pool->free_map[free_index]; - skb = NULL; + while (remaining > 0) { + unsigned int free_index = pool->consumer_index; - BUG_ON(index == IBM_VETH_INVALID_MAP); + /* Fill a batch of descriptors */ + for (filled = 0; filled < min(remaining, batch); filled++) { + index = pool->free_map[free_index]; + if (WARN_ON(index == IBM_VETH_INVALID_MAP)) { + adapter->replenish_add_buff_failure++; + netdev_info(adapter->netdev, + "Invalid map index %u, reset\n", + index); + schedule_work(&adapter->work); + break; + } - /* are we allocating a new buffer or recycling an old one */ - if (pool->skbuff[index]) - goto reuse; + if (!pool->skbuff[index]) { + struct sk_buff *skb = NULL; - skb = netdev_alloc_skb(adapter->netdev, pool->buff_size); + skb = netdev_alloc_skb(adapter->netdev, + pool->buff_size); + if (!skb) { + adapter->replenish_no_mem++; + adapter->replenish_add_buff_failure++; + break; + } - if (!skb) { - netdev_dbg(adapter->netdev, - "replenish: unable to allocate skb\n"); - adapter->replenish_no_mem++; + dma_addr = dma_map_single(dev, skb->data, + pool->buff_size, + DMA_FROM_DEVICE); + if (dma_mapping_error(dev, dma_addr)) { + dev_kfree_skb_any(skb); + adapter->replenish_add_buff_failure++; + break; + } + + pool->dma_addr[index] = dma_addr; + pool->skbuff[index] = skb; + } else { + /* re-use case */ + dma_addr = pool->dma_addr[index]; + } + + if (rx_flush) { + unsigned int len; + + len = adapter->netdev->mtu + IBMVETH_BUFF_OH; + len = min(pool->buff_size, len); + ibmveth_flush_buffer(pool->skbuff[index]->data, + len); + } + + descs[filled].fields.flags_len = IBMVETH_BUF_VALID | + pool->buff_size; + descs[filled].fields.address = dma_addr; + + correlators[filled] = ((u64)pool->index << 32) | index; + *(u64 *)pool->skbuff[index]->data = correlators[filled]; + + free_index++; + if (free_index >= pool->size) + free_index = 0; + } + + if (!filled) break; - } - - dma_addr = dma_map_single(&adapter->vdev->dev, skb->data, - pool->buff_size, DMA_FROM_DEVICE); - - if (dma_mapping_error(&adapter->vdev->dev, dma_addr)) - goto failure; 
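The qp.c hunk above programs only the upper six bits of the GRH traffic class into the adapter. A minimal sketch of the assumed encoding (helper name is illustrative, not part of the patch): the IPv6 traffic class byte carries the DSCP in bits 7:2 and the two ECN bits in bits 1:0, so the ">> 2" hands the hardware a plain DSCP value.
/* Illustrative only: what "traffic_class >> 2" extracts. */
static inline u8 tclass_to_dscp(u8 tclass)
{
	return tclass >> 2;	/* drop ECN bits 1:0, keep the 6-bit DSCP */
}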
- - pool->dma_addr[index] = dma_addr; - pool->skbuff[index] = skb; - - if (rx_flush) { - unsigned int len = min(pool->buff_size, - adapter->netdev->mtu + - IBMVETH_BUFF_OH); - ibmveth_flush_buffer(skb->data, len); - } -reuse: - dma_addr = pool->dma_addr[index]; - desc.fields.flags_len = IBMVETH_BUF_VALID | pool->buff_size; - desc.fields.address = dma_addr; - - correlator = ((u64)pool->index << 32) | index; - *(u64 *)pool->skbuff[index]->data = correlator; - - lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address, - desc.desc); + /* single buffer case */ + if (filled == 1) + lpar_rc = h_add_logical_lan_buffer(vdev->unit_address, + descs[0].desc); + else + /* Multi-buffer hcall */ + lpar_rc = h_add_logical_lan_buffers(vdev->unit_address, + descs[0].desc, + descs[1].desc, + descs[2].desc, + descs[3].desc, + descs[4].desc, + descs[5].desc, + descs[6].desc, + descs[7].desc); if (lpar_rc != H_SUCCESS) { - netdev_warn(adapter->netdev, - "%sadd_logical_lan failed %lu\n", - skb ? "" : "When recycling: ", lpar_rc); - goto failure; + dev_warn_ratelimited(dev, + "RX h_add_logical_lan failed: filled=%u, rc=%lu, batch=%u\n", + filled, lpar_rc, batch); + goto hcall_failure; } - pool->free_map[free_index] = IBM_VETH_INVALID_MAP; - pool->consumer_index++; - if (pool->consumer_index >= pool->size) - pool->consumer_index = 0; + /* Only update pool state after hcall succeeds */ + for (i = 0; i < filled; i++) { + free_index = pool->consumer_index; + pool->free_map[free_index] = IBM_VETH_INVALID_MAP; - buffers_added++; - adapter->replenish_add_buff_success++; + pool->consumer_index++; + if (pool->consumer_index >= pool->size) + pool->consumer_index = 0; + } + + buffers_added += filled; + adapter->replenish_add_buff_success += filled; + remaining -= filled; + + memset(&descs, 0, sizeof(descs)); + memset(&correlators, 0, sizeof(correlators)); + continue; + +hcall_failure: + for (i = 0; i < filled; i++) { + index = correlators[i] & 0xffffffffUL; + dma_addr = pool->dma_addr[index]; + + if (pool->skbuff[index]) { + if (dma_addr && + !dma_mapping_error(dev, dma_addr)) + dma_unmap_single(dev, dma_addr, + pool->buff_size, + DMA_FROM_DEVICE); + + dev_kfree_skb_any(pool->skbuff[index]); + pool->skbuff[index] = NULL; + } + } + adapter->replenish_add_buff_failure += filled; + + /* + * If the multi rx buffers hcall is no longer supported by FW, + * e.g.
in the case of Live Partition Migration */ + if (batch > 1 && lpar_rc == H_FUNCTION) { + /* + * Instead of retrying each buffer submission individually, + * just set the max rx buffers per hcall to 1; the buffers + * will be replenished the next time + * ibmveth_replenish_buffer_pool() is called again, + * taking the single-buffer case + */ + netdev_info(adapter->netdev, + "RX Multi buffers not supported by FW, rc=%lu\n", + lpar_rc); + adapter->rx_buffers_per_hcall = 1; + netdev_info(adapter->netdev, + "Next rx replenish will fall back to single-buffer hcall\n"); + } + break; } - mb(); - atomic_add(buffers_added, &(pool->available)); - return; - -failure: - - if (dma_addr && !dma_mapping_error(&adapter->vdev->dev, dma_addr)) - dma_unmap_single(&adapter->vdev->dev, - pool->dma_addr[index], pool->buff_size, - DMA_FROM_DEVICE); - dev_kfree_skb_any(pool->skbuff[index]); - pool->skbuff[index] = NULL; - adapter->replenish_add_buff_failure++; - mb(); atomic_add(buffers_added, &(pool->available)); } @@ -370,20 +443,36 @@ static void ibmveth_free_buffer_pool(struct ibmveth_adapter *adapter, } } -/* remove a buffer from a pool */ -static void ibmveth_remove_buffer_from_pool(struct ibmveth_adapter *adapter, - u64 correlator, bool reuse) +/** + * ibmveth_remove_buffer_from_pool - remove a buffer from a pool + * @adapter: adapter instance + * @correlator: identifies pool and index + * @reuse: whether to reuse buffer + * + * Return: + * * %0 - success + * * %-EINVAL - correlator maps to a pool or index out of range + * * %-EFAULT - pool and index map to a null skb + */ +static int ibmveth_remove_buffer_from_pool(struct ibmveth_adapter *adapter, + u64 correlator, bool reuse) { unsigned int pool = correlator >> 32; unsigned int index = correlator & 0xffffffffUL; unsigned int free_index; struct sk_buff *skb; - BUG_ON(pool >= IBMVETH_NUM_BUFF_POOLS); - BUG_ON(index >= adapter->rx_buff_pool[pool].size); + if (WARN_ON(pool >= IBMVETH_NUM_BUFF_POOLS) || + WARN_ON(index >= adapter->rx_buff_pool[pool].size)) { + schedule_work(&adapter->work); + return -EINVAL; + } skb = adapter->rx_buff_pool[pool].skbuff[index]; - BUG_ON(skb == NULL); + if (WARN_ON(!skb)) { + schedule_work(&adapter->work); + return -EFAULT; + } /* if we are going to reuse the buffer then keep the pointers around * but mark index as available.
replenish will see the skb pointer and @@ -411,6 +500,8 @@ static void ibmveth_remove_buffer_from_pool(struct ibmveth_adapter *adapter, mb(); atomic_dec(&(adapter->rx_buff_pool[pool].available)); + + return 0; } /* get the current buffer on the rx queue */ @@ -420,24 +511,44 @@ static inline struct sk_buff *ibmveth_rxq_get_buffer(struct ibmveth_adapter *ada unsigned int pool = correlator >> 32; unsigned int index = correlator & 0xffffffffUL; - BUG_ON(pool >= IBMVETH_NUM_BUFF_POOLS); - BUG_ON(index >= adapter->rx_buff_pool[pool].size); + if (WARN_ON(pool >= IBMVETH_NUM_BUFF_POOLS) || + WARN_ON(index >= adapter->rx_buff_pool[pool].size)) { + schedule_work(&adapter->work); + return NULL; + } return adapter->rx_buff_pool[pool].skbuff[index]; } -static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter, - bool reuse) +/** + * ibmveth_rxq_harvest_buffer - Harvest buffer from pool + * + * @adapter: pointer to adapter + * @reuse: whether to reuse buffer + * + * Context: called from ibmveth_poll + * + * Return: + * * %0 - success + * * other - non-zero return from ibmveth_remove_buffer_from_pool + */ +static int ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter, + bool reuse) { u64 cor; + int rc; cor = adapter->rx_queue.queue_addr[adapter->rx_queue.index].correlator; - ibmveth_remove_buffer_from_pool(adapter, cor, reuse); + rc = ibmveth_remove_buffer_from_pool(adapter, cor, reuse); + if (unlikely(rc)) + return rc; if (++adapter->rx_queue.index == adapter->rx_queue.num_slots) { adapter->rx_queue.index = 0; adapter->rx_queue.toggle = !adapter->rx_queue.toggle; } + + return 0; } static void ibmveth_free_tx_ltb(struct ibmveth_adapter *adapter, int idx) @@ -709,6 +820,35 @@ static int ibmveth_close(struct net_device *netdev) return 0; } +/** + * ibmveth_reset - Handle scheduled reset work + * + * @w: pointer to work_struct embedded in adapter structure + * + * Context: This routine acquires rtnl_mutex and disables its NAPI through + * ibmveth_close. It can't be called directly in a context that has + * already acquired rtnl_mutex or disabled its NAPI, or directly from + * a poll routine. 
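The replenish and removal paths above share one 64-bit correlator layout. A small self-contained sketch of that encoding (function names are illustrative, not from the patch):
/* Pool number in the upper 32 bits, buffer index in the lower 32 bits;
 * ibmveth_remove_buffer_from_pool() performs the reverse split.
 */
static inline u64 make_correlator(u32 pool, u32 index)
{
	return ((u64)pool << 32) | index;
}

static inline u32 correlator_pool(u64 cor)
{
	return cor >> 32;
}

static inline u32 correlator_index(u64 cor)
{
	return cor & 0xffffffffUL;
}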
+ * + * Return: void + */ +static void ibmveth_reset(struct work_struct *w) +{ + struct ibmveth_adapter *adapter = container_of(w, struct ibmveth_adapter, work); + struct net_device *netdev = adapter->netdev; + + netdev_dbg(netdev, "reset starting\n"); + + rtnl_lock(); + + dev_close(adapter->netdev); + dev_open(adapter->netdev, NULL); + + rtnl_unlock(); + + netdev_dbg(netdev, "reset complete\n"); +} + static int ibmveth_set_link_ksettings(struct net_device *dev, const struct ethtool_link_ksettings *cmd) { @@ -1324,7 +1464,8 @@ restart_poll: wmb(); /* suggested by larson1 */ adapter->rx_invalid_buffer++; netdev_dbg(netdev, "recycling invalid buffer\n"); - ibmveth_rxq_harvest_buffer(adapter, true); + if (unlikely(ibmveth_rxq_harvest_buffer(adapter, true))) + break; } else { struct sk_buff *skb, *new_skb; int length = ibmveth_rxq_frame_length(adapter); @@ -1334,6 +1475,8 @@ restart_poll: __sum16 iph_check = 0; skb = ibmveth_rxq_get_buffer(adapter); + if (unlikely(!skb)) + break; /* if the large packet bit is set in the rx queue * descriptor, the mss will be written by PHYP eight @@ -1357,10 +1500,12 @@ restart_poll: if (rx_flush) ibmveth_flush_buffer(skb->data, length + offset); - ibmveth_rxq_harvest_buffer(adapter, true); + if (unlikely(ibmveth_rxq_harvest_buffer(adapter, true))) + break; skb = new_skb; } else { - ibmveth_rxq_harvest_buffer(adapter, false); + if (unlikely(ibmveth_rxq_harvest_buffer(adapter, false))) + break; skb_reserve(skb, offset); } @@ -1407,7 +1552,10 @@ restart_poll: * then check once more to make sure we are done. */ lpar_rc = h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_ENABLE); - BUG_ON(lpar_rc != H_SUCCESS); + if (WARN_ON(lpar_rc != H_SUCCESS)) { + schedule_work(&adapter->work); + goto out; + } if (ibmveth_rxq_pending_buffer(adapter) && napi_schedule(napi)) { lpar_rc = h_vio_signal(adapter->vdev->unit_address, @@ -1428,7 +1576,7 @@ static irqreturn_t ibmveth_interrupt(int irq, void *dev_instance) if (napi_schedule_prep(&adapter->napi)) { lpar_rc = h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_DISABLE); - BUG_ON(lpar_rc != H_SUCCESS); + WARN_ON(lpar_rc != H_SUCCESS); __napi_schedule(&adapter->napi); } return IRQ_HANDLED; @@ -1670,6 +1818,7 @@ static int ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id) adapter->vdev = dev; adapter->netdev = netdev; + INIT_WORK(&adapter->work, ibmveth_reset); adapter->mcastFilterSize = be32_to_cpu(*mcastFilterSize_p); ibmveth_init_link_settings(netdev); @@ -1705,6 +1854,19 @@ static int ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id) netdev->features |= NETIF_F_FRAGLIST; } + if (ret == H_SUCCESS && + (ret_attr & IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT)) { + adapter->rx_buffers_per_hcall = IBMVETH_MAX_RX_PER_HCALL; + netdev_dbg(netdev, + "RX Multi-buffer hcall supported by FW, batch set to %u\n", + adapter->rx_buffers_per_hcall); + } else { + adapter->rx_buffers_per_hcall = 1; + netdev_dbg(netdev, + "RX Single-buffer hcall mode, batch set to %u\n", + adapter->rx_buffers_per_hcall); + } + netdev->min_mtu = IBMVETH_MIN_MTU; netdev->max_mtu = ETH_MAX_MTU - IBMVETH_BUFF_OH; @@ -1762,6 +1924,8 @@ static void ibmveth_remove(struct vio_dev *dev) struct ibmveth_adapter *adapter = netdev_priv(netdev); int i; + cancel_work_sync(&adapter->work); + for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++) kobject_put(&adapter->rx_buff_pool[i].kobj); diff --git a/drivers/net/ethernet/ibm/ibmveth.h b/drivers/net/ethernet/ibm/ibmveth.h index 8468e2c59d..47b9051840 100644 --- a/drivers/net/ethernet/ibm/ibmveth.h +++ 
b/drivers/net/ethernet/ibm/ibmveth.h @@ -28,6 +28,7 @@ #define IbmVethMcastRemoveFilter 0x2UL #define IbmVethMcastClearFilterTable 0x3UL +#define IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT 0x0000000000040000UL #define IBMVETH_ILLAN_LRG_SR_ENABLED 0x0000000000010000UL #define IBMVETH_ILLAN_LRG_SND_SUPPORT 0x0000000000008000UL #define IBMVETH_ILLAN_PADDED_PKT_CSUM 0x0000000000002000UL @@ -46,6 +47,24 @@ #define h_add_logical_lan_buffer(ua, buf) \ plpar_hcall_norets(H_ADD_LOGICAL_LAN_BUFFER, ua, buf) +static inline long h_add_logical_lan_buffers(unsigned long unit_address, + unsigned long desc1, + unsigned long desc2, + unsigned long desc3, + unsigned long desc4, + unsigned long desc5, + unsigned long desc6, + unsigned long desc7, + unsigned long desc8) +{ + unsigned long retbuf[PLPAR_HCALL9_BUFSIZE]; + + return plpar_hcall9(H_ADD_LOGICAL_LAN_BUFFERS, + retbuf, unit_address, + desc1, desc2, desc3, desc4, + desc5, desc6, desc7, desc8); +} + /* FW allows us to send 6 descriptors but we only use one so mark * the other 5 as unused (0) */ @@ -101,6 +120,7 @@ static inline long h_illan_attributes(unsigned long unit_address, #define IBMVETH_MAX_TX_BUF_SIZE (1024 * 64) #define IBMVETH_MAX_QUEUES 16U #define IBMVETH_DEFAULT_QUEUES 8U +#define IBMVETH_MAX_RX_PER_HCALL 8U static int pool_size[] = { 512, 1024 * 2, 1024 * 16, 1024 * 32, 1024 * 64 }; static int pool_count[] = { 256, 512, 256, 256, 256 }; @@ -137,6 +157,7 @@ struct ibmveth_adapter { struct vio_dev *vdev; struct net_device *netdev; struct napi_struct napi; + struct work_struct work; unsigned int mcastFilterSize; void * buffer_list_addr; void * filter_list_addr; @@ -150,6 +171,7 @@ struct ibmveth_adapter { int rx_csum; int large_send; bool is_active_trunk; + unsigned int rx_buffers_per_hcall; u64 fw_ipv6_csum_support; u64 fw_ipv4_csum_support; diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 3ffc3a7c0e..725db9e2de 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -547,6 +547,17 @@ static void deactivate_rx_pools(struct ibmvnic_adapter *adapter) adapter->rx_pool[i].active = 0; } +static void ibmvnic_set_safe_max_ind_descs(struct ibmvnic_adapter *adapter) +{ + if (adapter->cur_max_ind_descs > IBMVNIC_SAFE_IND_DESC) { + netdev_info(adapter->netdev, + "set max ind descs from %u to safe limit %u\n", + adapter->cur_max_ind_descs, + IBMVNIC_SAFE_IND_DESC); + adapter->cur_max_ind_descs = IBMVNIC_SAFE_IND_DESC; + } +} + static void replenish_rx_pool(struct ibmvnic_adapter *adapter, struct ibmvnic_rx_pool *pool) { @@ -633,7 +644,7 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, sub_crq->rx_add.len = cpu_to_be32(pool->buff_size << shift); /* if send_subcrq_indirect queue is full, flush to VIOS */ - if (ind_bufp->index == IBMVNIC_MAX_IND_DESCS || + if (ind_bufp->index == adapter->cur_max_ind_descs || i == count - 1) { lpar_rc = send_subcrq_indirect(adapter, handle, @@ -652,6 +663,14 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, failure: if (lpar_rc != H_PARAMETER && lpar_rc != H_CLOSED) dev_err_ratelimited(dev, "rx: replenish packet buffer failed\n"); + + /* Detect platform limit H_PARAMETER */ + if (lpar_rc == H_PARAMETER) + ibmvnic_set_safe_max_ind_descs(adapter); + + /* For all error cases, temporarily drop only this batch. + * Rely on TCP/IP retransmissions to retry and recover + */ for (i = ind_bufp->index - 1; i >= 0; --i) { struct ibmvnic_rx_buff *rx_buff; @@ -2103,9 +2122,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct
ibmvnic_adapter *adapter, tx_pool->num_buffers - 1 : tx_pool->consumer_index - 1; tx_buff = &tx_pool->tx_buff[index]; - adapter->netdev->stats.tx_packets--; - adapter->netdev->stats.tx_bytes -= tx_buff->skb->len; - adapter->tx_stats_buffers[queue_num].packets--; + adapter->tx_stats_buffers[queue_num].batched_packets--; adapter->tx_stats_buffers[queue_num].bytes -= tx_buff->skb->len; dev_kfree_skb_any(tx_buff->skb); @@ -2174,16 +2191,28 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter, rc = send_subcrq_direct(adapter, handle, (u64 *)ind_bufp->indir_arr); - if (rc) + if (rc) { + dev_err_ratelimited(&adapter->vdev->dev, + "tx_flush failed, rc=%u (%llu entries dma=%pad handle=%llx)\n", + rc, entries, &dma_addr, handle); + /* Detect platform limit H_PARAMETER */ + if (rc == H_PARAMETER) + ibmvnic_set_safe_max_ind_descs(adapter); + + /* For all error cases, temporarily drop only this batch. + * Rely on TCP/IP retransmissions to retry and recover + */ ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq); - else + } else { ind_bufp->index = 0; + } return rc; } static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) { struct ibmvnic_adapter *adapter = netdev_priv(netdev); + u32 cur_max_ind_descs = adapter->cur_max_ind_descs; int queue_num = skb_get_queue_mapping(skb); u8 *hdrs = (u8 *)&adapter->tx_rx_desc_req; struct device *dev = &adapter->vdev->dev; @@ -2196,7 +2225,8 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) unsigned int tx_map_failed = 0; union sub_crq indir_arr[16]; unsigned int tx_dropped = 0; - unsigned int tx_packets = 0; + unsigned int tx_dpackets = 0; + unsigned int tx_bpackets = 0; unsigned int tx_bytes = 0; dma_addr_t data_dma_addr; struct netdev_queue *txq; @@ -2370,6 +2400,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) if (lpar_rc != H_SUCCESS) goto tx_err; + tx_dpackets++; goto early_exit; } @@ -2379,7 +2410,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) tx_crq.v1.n_crq_elem = num_entries; tx_buff->num_entries = num_entries; /* flush buffer if current entry can not fit */ - if (num_entries + ind_bufp->index > IBMVNIC_MAX_IND_DESCS) { + if (num_entries + ind_bufp->index > cur_max_ind_descs) { lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true); if (lpar_rc != H_SUCCESS) goto tx_flush_err; @@ -2392,11 +2423,12 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) ind_bufp->index += num_entries; if (__netdev_tx_sent_queue(txq, skb->len, netdev_xmit_more() && - ind_bufp->index < IBMVNIC_MAX_IND_DESCS)) { + ind_bufp->index < cur_max_ind_descs)) { lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true); if (lpar_rc != H_SUCCESS) goto tx_err; } + tx_bpackets++; early_exit: if (atomic_add_return(num_entries, &tx_scrq->used) @@ -2405,7 +2437,6 @@ early_exit: netif_stop_subqueue(netdev, queue_num); } - tx_packets++; tx_bytes += skblen; txq_trans_cond_update(txq); ret = NETDEV_TX_OK; @@ -2433,12 +2464,10 @@ tx_err: } out: rcu_read_unlock(); - netdev->stats.tx_dropped += tx_dropped; - netdev->stats.tx_bytes += tx_bytes; - netdev->stats.tx_packets += tx_packets; adapter->tx_send_failed += tx_send_failed; adapter->tx_map_failed += tx_map_failed; - adapter->tx_stats_buffers[queue_num].packets += tx_packets; + adapter->tx_stats_buffers[queue_num].batched_packets += tx_bpackets; + adapter->tx_stats_buffers[queue_num].direct_packets += tx_dpackets; adapter->tx_stats_buffers[queue_num].bytes += tx_bytes;
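With the packet counter split into batched and direct variants, the aggregate TX packet count is simply the sum of the two per-queue counters, which is how the ibmvnic_get_stats64() hunk below rebuilds the netdev-level statistics. A reduced sketch (helper name is illustrative, not from the patch):
/* Total TX packets for one queue under the split-counter scheme. */
static u64 queue_tx_packets(const struct ibmvnic_tx_queue_stats *s)
{
	return s->batched_packets + s->direct_packets;
}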
adapter->tx_stats_buffers[queue_num].dropped_packets += tx_dropped; @@ -3237,6 +3266,25 @@ err: return -ret; } +static void ibmvnic_get_stats64(struct net_device *netdev, + struct rtnl_link_stats64 *stats) +{ + struct ibmvnic_adapter *adapter = netdev_priv(netdev); + int i; + + for (i = 0; i < adapter->req_rx_queues; i++) { + stats->rx_packets += adapter->rx_stats_buffers[i].packets; + stats->rx_bytes += adapter->rx_stats_buffers[i].bytes; + } + + for (i = 0; i < adapter->req_tx_queues; i++) { + stats->tx_packets += adapter->tx_stats_buffers[i].batched_packets; + stats->tx_packets += adapter->tx_stats_buffers[i].direct_packets; + stats->tx_bytes += adapter->tx_stats_buffers[i].bytes; + stats->tx_dropped += adapter->tx_stats_buffers[i].dropped_packets; + } +} + static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue) { struct ibmvnic_adapter *adapter = netdev_priv(dev); @@ -3352,8 +3400,6 @@ restart_poll: length = skb->len; napi_gro_receive(napi, skb); /* send it up */ - netdev->stats.rx_packets++; - netdev->stats.rx_bytes += length; adapter->rx_stats_buffers[scrq_num].packets++; adapter->rx_stats_buffers[scrq_num].bytes += length; frames_processed++; @@ -3463,6 +3509,7 @@ static const struct net_device_ops ibmvnic_netdev_ops = { .ndo_set_rx_mode = ibmvnic_set_multi, .ndo_set_mac_address = ibmvnic_set_mac, .ndo_validate_addr = eth_validate_addr, + .ndo_get_stats64 = ibmvnic_get_stats64, .ndo_tx_timeout = ibmvnic_tx_timeout, .ndo_change_mtu = ibmvnic_change_mtu, .ndo_features_check = ibmvnic_features_check, @@ -3627,7 +3674,10 @@ static void ibmvnic_get_strings(struct net_device *dev, u32 stringset, u8 *data) memcpy(data, ibmvnic_stats[i].name, ETH_GSTRING_LEN); for (i = 0; i < adapter->req_tx_queues; i++) { - snprintf(data, ETH_GSTRING_LEN, "tx%d_packets", i); + snprintf(data, ETH_GSTRING_LEN, "tx%d_batched_packets", i); + data += ETH_GSTRING_LEN; + + snprintf(data, ETH_GSTRING_LEN, "tx%d_direct_packets", i); data += ETH_GSTRING_LEN; snprintf(data, ETH_GSTRING_LEN, "tx%d_bytes", i); @@ -3705,7 +3755,9 @@ static void ibmvnic_get_ethtool_stats(struct net_device *dev, (adapter, ibmvnic_stats[i].offset)); for (j = 0; j < adapter->req_tx_queues; j++) { - data[i] = adapter->tx_stats_buffers[j].packets; + data[i] = adapter->tx_stats_buffers[j].batched_packets; + i++; + data[i] = adapter->tx_stats_buffers[j].direct_packets; i++; data[i] = adapter->tx_stats_buffers[j].bytes; i++; @@ -3844,7 +3896,7 @@ static void release_sub_crq_queue(struct ibmvnic_adapter *adapter, } dma_free_coherent(dev, - IBMVNIC_IND_ARR_SZ, + IBMVNIC_IND_MAX_ARR_SZ, scrq->ind_buf.indir_arr, scrq->ind_buf.indir_dma); @@ -3901,7 +3953,7 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter scrq->ind_buf.indir_arr = dma_alloc_coherent(dev, - IBMVNIC_IND_MAX_ARR_SZ, + IBMVNIC_IND_MAX_ARR_SZ, &scrq->ind_buf.indir_dma, GFP_KERNEL); @@ -6206,6 +6258,19 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset) rc = reset_sub_crq_queues(adapter); } } else { + if (adapter->reset_reason == VNIC_RESET_MOBILITY) { + /* After an LPM, reset the max number of indirect + * subcrq descriptors per H_SEND_SUB_CRQ_INDIRECT + * hcall to the default max (e.g. POWER8 -> POWER10) + * + * If the new destination platform does not support + * the higher max limit (e.g. POWER10 -> POWER8 LPM), + * H_PARAMETER will trigger an automatic fallback to + * the safe minimum limit.
+ */ + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESCS; + } + rc = init_sub_crqs(adapter); } @@ -6357,6 +6422,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) adapter->wait_for_reset = false; adapter->last_reset_time = jiffies; + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESCS; rc = register_netdev(netdev); if (rc) { diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h index a145c0d11f..08845e2a7c 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.h +++ b/drivers/net/ethernet/ibm/ibmvnic.h @@ -29,8 +29,9 @@ #define IBMVNIC_BUFFS_PER_POOL 100 #define IBMVNIC_MAX_QUEUES 16 #define IBMVNIC_MAX_QUEUE_SZ 4096 -#define IBMVNIC_MAX_IND_DESCS 16 -#define IBMVNIC_IND_ARR_SZ (IBMVNIC_MAX_IND_DESCS * 32) +#define IBMVNIC_MAX_IND_DESCS 128 +#define IBMVNIC_SAFE_IND_DESC 16 +#define IBMVNIC_IND_MAX_ARR_SZ (IBMVNIC_MAX_IND_DESCS * 32) #define IBMVNIC_TSO_BUF_SZ 65536 #define IBMVNIC_TSO_BUFS 64 @@ -175,20 +176,25 @@ struct ibmvnic_statistics { u8 reserved[72]; } __packed __aligned(8); -#define NUM_TX_STATS 3 struct ibmvnic_tx_queue_stats { - u64 packets; + u64 batched_packets; + u64 direct_packets; u64 bytes; u64 dropped_packets; }; -#define NUM_RX_STATS 3 +#define NUM_TX_STATS \ (sizeof(struct ibmvnic_tx_queue_stats) / sizeof(u64)) + struct ibmvnic_rx_queue_stats { u64 packets; u64 bytes; u64 interrupts; }; +#define NUM_RX_STATS \ (sizeof(struct ibmvnic_rx_queue_stats) / sizeof(u64)) + struct ibmvnic_acl_buffer { __be32 len; __be32 version; @@ -885,6 +891,7 @@ struct ibmvnic_adapter { dma_addr_t ip_offload_ctrl_tok; u32 msg_enable; u32 priv_flags; + u32 cur_max_ind_descs; /* Vital Product Data (VPD) */ struct ibmvnic_vpd *vpd; diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index bbeb93d84b..85f93eb0ae 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -448,7 +448,7 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id, (qtype << I40E_QINT_RQCTL_NEXTQ_TYPE_SHIFT) | (pf_queue_id << I40E_QINT_RQCTL_NEXTQ_INDX_SHIFT) | BIT(I40E_QINT_RQCTL_CAUSE_ENA_SHIFT) | - (itr_idx << I40E_QINT_RQCTL_ITR_INDX_SHIFT); + FIELD_PREP(I40E_QINT_RQCTL_ITR_INDX_MASK, itr_idx); wr32(hw, reg_idx, reg); } @@ -653,6 +653,13 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id, /* only set the required fields */ tx_ctx.base = info->dma_ring_addr / 128; + + /* ring_len has to be a multiple of 8 */ + if (!IS_ALIGNED(info->ring_len, 8) || + info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) { + ret = -EINVAL; + goto error_context; + } tx_ctx.qlen = info->ring_len; tx_ctx.rdylist = le16_to_cpu(vsi->info.qs_handle[0]); tx_ctx.rdylist_act = 0; @@ -716,6 +723,13 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id, /* only set the required fields */ rx_ctx.base = info->dma_ring_addr / 128; + + /* ring_len has to be a multiple of 32 */ + if (!IS_ALIGNED(info->ring_len, 32) || + info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) { + ret = -EINVAL; + goto error_param; + } rx_ctx.qlen = info->ring_len; if (info->splithdr_enabled) { @@ -1453,6 +1467,7 @@ static void i40e_trigger_vf_reset(struct i40e_vf *vf, bool flr) * functions that may still be running at this point.
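Two consequences of the ibmvnic.h hunk above, spelled out: each indirect descriptor occupies 32 bytes, so the descriptor array grows from 16 * 32 = 512 bytes to 128 * 32 = 4096 bytes (one 4 KiB page), and the ethtool stat counts are now derived from the struct layouts instead of being hard-coded. A sketch (these _Static_assert lines are illustrative, not part of the patch):
_Static_assert(IBMVNIC_IND_MAX_ARR_SZ == 4096,
	       "128 descriptors * 32 bytes each = one 4 KiB page");
_Static_assert(NUM_TX_STATS == 4,
	       "batched_packets, direct_packets, bytes, dropped_packets");
_Static_assert(NUM_RX_STATS == 3,
	       "packets, bytes, interrupts");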
*/ clear_bit(I40E_VF_STATE_INIT, &vf->vf_states); + clear_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states); /* In the case of a VFLR, the HW has already reset the VF and we * just need to clean up, so don't hit the VFRTRIG register. @@ -2119,7 +2134,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg) size_t len = 0; int ret; - if (!i40e_sync_vf_state(vf, I40E_VF_STATE_INIT)) { + i40e_sync_vf_state(vf, I40E_VF_STATE_INIT); + + if (!test_bit(I40E_VF_STATE_INIT, &vf->vf_states) || + test_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states)) { aq_ret = -EINVAL; goto err; } @@ -2222,6 +2240,7 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg) vf->default_lan_addr.addr); } set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states); + set_bit(I40E_VF_STATE_RESOURCES_LOADED, &vf->vf_states); err: /* send the response back to the VF */ @@ -2384,7 +2403,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg) } if (vf->adq_enabled) { - if (idx >= ARRAY_SIZE(vf->ch)) { + if (idx >= vf->num_tc) { aq_ret = -ENODEV; goto error_param; } @@ -2405,7 +2424,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg) * to its appropriate VSIs based on TC mapping */ if (vf->adq_enabled) { - if (idx >= ARRAY_SIZE(vf->ch)) { + if (idx >= vf->num_tc) { aq_ret = -ENODEV; goto error_param; } @@ -2455,8 +2474,10 @@ static int i40e_validate_queue_map(struct i40e_vf *vf, u16 vsi_id, u16 vsi_queue_id, queue_id; for_each_set_bit(vsi_queue_id, &queuemap, I40E_MAX_VSI_QP) { - if (vf->adq_enabled) { - vsi_id = vf->ch[vsi_queue_id / I40E_MAX_VF_VSI].vsi_id; + u16 idx = vsi_queue_id / I40E_MAX_VF_VSI; + + if (vf->adq_enabled && idx < vf->num_tc) { + vsi_id = vf->ch[idx].vsi_id; queue_id = (vsi_queue_id % I40E_DEFAULT_QUEUES_PER_VF); } else { queue_id = vsi_queue_id; @@ -3589,7 +3610,7 @@ static int i40e_validate_cloud_filter(struct i40e_vf *vf, /* action_meta is TC number here to which the filter is applied */ if (!tc_filter->action_meta || - tc_filter->action_meta > vf->num_tc) { + tc_filter->action_meta >= vf->num_tc) { dev_info(&pf->pdev->dev, "VF %d: Invalid TC number %u\n", vf->vf_id, tc_filter->action_meta); goto err; @@ -3887,6 +3908,8 @@ err: aq_ret); } +#define I40E_MAX_VF_CLOUD_FILTER 0xFF00 + /** * i40e_vc_add_cloud_filter * @vf: pointer to the VF info @@ -3926,6 +3949,14 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg) goto err_out; } + if (vf->num_cloud_filters >= I40E_MAX_VF_CLOUD_FILTER) { + dev_warn(&pf->pdev->dev, + "VF %d: Max number of filters reached, can't apply cloud filter\n", + vf->vf_id); + aq_ret = -ENOSPC; + goto err_out; + } + cfilter = kzalloc(sizeof(*cfilter), GFP_KERNEL); if (!cfilter) { aq_ret = -ENOMEM; diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h index 5cf74f16f4..f558b45725 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h @@ -41,7 +41,8 @@ enum i40e_vf_states { I40E_VF_STATE_MC_PROMISC, I40E_VF_STATE_UC_PROMISC, I40E_VF_STATE_PRE_ENABLE, - I40E_VF_STATE_RESETTING + I40E_VF_STATE_RESETTING, + I40E_VF_STATE_RESOURCES_LOADED, }; /* VF capabilities */ diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index aa3888e5cf..f81c5dcbe9 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -3230,12 +3230,14 @@ static irqreturn_t ice_ll_ts_intr(int __always_unused irq, void *data) hw = 
&pf->hw; tx = &pf->ptp.port.tx; spin_lock_irqsave(&tx->lock, flags); - ice_ptp_complete_tx_single_tstamp(tx); + if (tx->init) { + ice_ptp_complete_tx_single_tstamp(tx); - idx = find_next_bit_wrap(tx->in_use, tx->len, - tx->last_ll_ts_idx_read + 1); - if (idx != tx->len) - ice_ptp_req_tx_single_tstamp(tx, idx); + idx = find_next_bit_wrap(tx->in_use, tx->len, + tx->last_ll_ts_idx_read + 1); + if (idx != tx->len) + ice_ptp_req_tx_single_tstamp(tx, idx); + } spin_unlock_irqrestore(&tx->lock, flags); val = GLINT_DYN_CTL_INTENA_M | GLINT_DYN_CTL_CLEARPBA_M | diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index 190f4006ae..0f94dd4dec 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -2882,16 +2882,19 @@ irqreturn_t ice_ptp_ts_irq(struct ice_pf *pf) */ if (hw->dev_caps.ts_dev_info.ts_ll_int_read) { struct ice_ptp_tx *tx = &pf->ptp.port.tx; - u8 idx; + u8 idx, last; if (!ice_pf_state_is_nominal(pf)) return IRQ_HANDLED; spin_lock(&tx->lock); - idx = find_next_bit_wrap(tx->in_use, tx->len, - tx->last_ll_ts_idx_read + 1); - if (idx != tx->len) - ice_ptp_req_tx_single_tstamp(tx, idx); + if (tx->init) { + last = tx->last_ll_ts_idx_read + 1; + idx = find_next_bit_wrap(tx->in_use, tx->len, + last); + if (idx != tx->len) + ice_ptp_req_tx_single_tstamp(tx, idx); + } spin_unlock(&tx->lock); return IRQ_HANDLED; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 1e4f6f6ee4..72de666acc 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -865,10 +865,6 @@ ice_add_xdp_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, __skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page, rx_buf->page_offset, size); sinfo->xdp_frags_size += size; - /* remember frag count before XDP prog execution; bpf_xdp_adjust_tail() - * can pop off frags but driver has to handle it on its own - */ - rx_ring->nr_frags = sinfo->nr_frags; if (page_is_pfmemalloc(rx_buf->page)) xdp_buff_set_frag_pfmemalloc(xdp); @@ -939,20 +935,20 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size, /** * ice_get_pgcnts - grab page_count() for gathered fragments * @rx_ring: Rx descriptor ring to store the page counts on + * @ntc: the next to clean element (not included in this frame!) 
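ice_get_pgcnts() now walks the descriptor ring from first_desc up to, but not including, ntc, instead of trusting a cached fragment count. A sketch of the wrap-around arithmetic that walk implies (helper name is illustrative, not from the patch):
/* Number of ring slots the current frame spans on a ring of cnt slots. */
static u32 frame_buf_count(u32 first_desc, u32 ntc, u32 cnt)
{
	return (ntc + cnt - first_desc) % cnt;
}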
* * This function is intended to be called right before running XDP * program so that the page recycling mechanism will be able to take * a correct decision regarding underlying pages; this is done in such * way as XDP program can change the refcount of page */ -static void ice_get_pgcnts(struct ice_rx_ring *rx_ring) +static void ice_get_pgcnts(struct ice_rx_ring *rx_ring, unsigned int ntc) { - u32 nr_frags = rx_ring->nr_frags + 1; u32 idx = rx_ring->first_desc; struct ice_rx_buf *rx_buf; u32 cnt = rx_ring->count; - for (int i = 0; i < nr_frags; i++) { + while (idx != ntc) { rx_buf = &rx_ring->rx_buf[idx]; rx_buf->pgcnt = page_count(rx_buf->page); @@ -1125,62 +1121,51 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf) } /** - * ice_put_rx_mbuf - ice_put_rx_buf() caller, for all frame frags + * ice_put_rx_mbuf - ice_put_rx_buf() caller, for all buffers in frame * @rx_ring: Rx ring with all the auxiliary data * @xdp: XDP buffer carrying linear + frags part - * @xdp_xmit: XDP_TX/XDP_REDIRECT verdict storage - * @ntc: a current next_to_clean value to be stored at rx_ring + * @ntc: the next to clean element (not included in this frame!) * @verdict: return code from XDP program execution * - * Walk through gathered fragments and satisfy internal page - * recycle mechanism; we take here an action related to verdict - * returned by XDP program; + * Called after XDP program is completed, or on error with verdict set to + * ICE_XDP_CONSUMED. + * + * Walk through buffers from first_desc to the end of the frame, releasing + * buffers and satisfying internal page recycle mechanism. The action depends + * on verdict from XDP program. */ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, - u32 *xdp_xmit, u32 ntc, u32 verdict) + u32 ntc, u32 verdict) { - u32 nr_frags = rx_ring->nr_frags + 1; u32 idx = rx_ring->first_desc; u32 cnt = rx_ring->count; - u32 post_xdp_frags = 1; struct ice_rx_buf *buf; - int i; + u32 xdp_frags = 0; + int i = 0; if (unlikely(xdp_buff_has_frags(xdp))) - post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags; + xdp_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags; - for (i = 0; i < post_xdp_frags; i++) { + while (idx != ntc) { buf = &rx_ring->rx_buf[idx]; + if (++idx == cnt) + idx = 0; - if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) { + /* An XDP program could release fragments from the end of the + * buffer. For these, we need to keep the pagecnt_bias as-is. + * To do this, only adjust pagecnt_bias for fragments up to + * the total remaining after the XDP program has run. 
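A compact restatement of the per-buffer policy implemented in the rewritten loop above (illustrative helper, assuming xdp_frags counts the fragments still attached after the program ran):
enum buf_action { FLIP_PAGE_HALF, BUMP_BIAS, KEEP_BIAS };

/* Which action ice_put_rx_mbuf() takes for the i-th buffer of a frame:
 * non-consumed verdicts hand the buffer on, so the page half is flipped;
 * for ICE_XDP_CONSUMED, the linear part and the frags still attached get
 * pagecnt_bias++ so they recycle, while frags the program already popped
 * via bpf_xdp_adjust_tail() keep their bias (that page ref is gone).
 */
static enum buf_action put_rx_action(int consumed, int i, int xdp_frags)
{
	if (!consumed)
		return FLIP_PAGE_HALF;
	return i <= xdp_frags ? BUMP_BIAS : KEEP_BIAS;
}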
+ */ + if (verdict != ICE_XDP_CONSUMED) ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz); - *xdp_xmit |= verdict; - } else if (verdict & ICE_XDP_CONSUMED) { + else if (i++ <= xdp_frags) buf->pagecnt_bias++; - } else if (verdict == ICE_XDP_PASS) { - ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz); - } ice_put_rx_buf(rx_ring, buf); - - if (++idx == cnt) - idx = 0; - } - /* handle buffers that represented frags released by XDP prog; - * for these we keep pagecnt_bias as-is; refcount from struct page - * has been decremented within XDP prog and we do not have to increase - * the biased refcnt - */ - for (; i < nr_frags; i++) { - buf = &rx_ring->rx_buf[idx]; - ice_put_rx_buf(rx_ring, buf); - if (++idx == cnt) - idx = 0; } xdp->data = NULL; rx_ring->first_desc = ntc; - rx_ring->nr_frags = 0; } /** @@ -1260,6 +1245,10 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) /* retrieve a buffer from the ring */ rx_buf = ice_get_rx_buf(rx_ring, size, ntc); + /* Increment ntc before calls to ice_put_rx_mbuf() */ + if (++ntc == cnt) + ntc = 0; + if (!xdp->data) { void *hard_start; @@ -1268,24 +1257,23 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) xdp_prepare_buff(xdp, hard_start, offset, size, !!offset); xdp_buff_clear_frags_flag(xdp); } else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) { - ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED); + ice_put_rx_mbuf(rx_ring, xdp, ntc, ICE_XDP_CONSUMED); break; } - if (++ntc == cnt) - ntc = 0; /* skip if it is NOP desc */ if (ice_is_non_eop(rx_ring, rx_desc)) continue; - ice_get_pgcnts(rx_ring); + ice_get_pgcnts(rx_ring, ntc); xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc); if (xdp_verdict == ICE_XDP_PASS) goto construct_skb; total_rx_bytes += xdp_get_buff_len(xdp); total_rx_pkts++; - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict); + ice_put_rx_mbuf(rx_ring, xdp, ntc, xdp_verdict); + xdp_xmit |= xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR); continue; construct_skb: @@ -1298,7 +1286,7 @@ construct_skb: rx_ring->ring_stats->rx_stats.alloc_page_failed++; xdp_verdict = ICE_XDP_CONSUMED; } - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict); + ice_put_rx_mbuf(rx_ring, xdp, ntc, xdp_verdict); if (!skb) break; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index a4b1e95146..07155e615f 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -358,7 +358,6 @@ struct ice_rx_ring { struct ice_tx_ring *xdp_ring; struct ice_rx_ring *next; /* pointer to next ring in q_vector */ struct xsk_buff_pool *xsk_pool; - u32 nr_frags; u16 max_frame; u16 rx_buf_len; dma_addr_t dma; /* physical address of ring */ diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c index 9ada35f7d8..4ce1ad792b 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c @@ -3094,7 +3094,7 @@ static int ixgbe_get_orom_ver_info(struct ixgbe_hw *hw, if (err) return err; - combo_ver = le32_to_cpu(civd.combo_ver); + combo_ver = get_unaligned_le32(&civd.combo_ver); orom->major = (u8)FIELD_GET(IXGBE_OROM_VER_MASK, combo_ver); orom->patch = (u8)FIELD_GET(IXGBE_OROM_VER_PATCH_MASK, combo_ver); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h index bea94e5ccb..3fdb2b8b40 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h +++ 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h @@ -1136,7 +1136,7 @@ struct ixgbe_orom_civd_info { __le32 combo_ver; /* Combo Image Version number */ u8 combo_name_len; /* Length of the unicode combo image version string, max of 32 */ __le16 combo_name[32]; /* Unicode string representing the Combo Image version */ -}; +} __packed; /* Function specific capabilities */ struct ixgbe_hw_func_caps { diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index f8c3d96d8d..a20f5eef03 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -6,8 +6,10 @@ #include #include #include +#include #include +#include #include struct dentry *mana_debugfs_root; @@ -32,6 +34,9 @@ static void mana_gd_init_pf_regs(struct pci_dev *pdev) gc->db_page_base = gc->bar0_va + mana_gd_r64(gc, GDMA_PF_REG_DB_PAGE_OFF); + gc->phys_db_page_base = gc->bar0_pa + + mana_gd_r64(gc, GDMA_PF_REG_DB_PAGE_OFF); + sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF); sriov_base_va = gc->bar0_va + sriov_base_off; @@ -64,6 +69,24 @@ static void mana_gd_init_registers(struct pci_dev *pdev) mana_gd_init_vf_regs(pdev); } +/* Suppress logging when we set timeout to zero */ +bool mana_need_log(struct gdma_context *gc, int err) +{ + struct hw_channel_context *hwc; + + if (err != -ETIMEDOUT) + return true; + + if (!gc) + return true; + + hwc = gc->hwc.driver_data; + if (hwc && hwc->hwc_timeout == 0) + return false; + + return true; +} + static int mana_gd_query_max_resources(struct pci_dev *pdev) { struct gdma_context *gc = pci_get_drvdata(pdev); @@ -267,8 +290,9 @@ static int mana_gd_disable_queue(struct gdma_queue *queue) err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp); if (err || resp.hdr.status) { - dev_err(gc->dev, "Failed to disable queue: %d, 0x%x\n", err, - resp.hdr.status); + if (mana_need_log(gc, err)) + dev_err(gc->dev, "Failed to disable queue: %d, 0x%x\n", err, + resp.hdr.status); return err ? 
err : -EPROTO; } @@ -353,11 +377,113 @@ void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit) } EXPORT_SYMBOL_NS(mana_gd_ring_cq, NET_MANA); +#define MANA_SERVICE_PERIOD 10 + +static void mana_serv_fpga(struct pci_dev *pdev) +{ + struct pci_bus *bus, *parent; + + pci_lock_rescan_remove(); + + bus = pdev->bus; + if (!bus) { + dev_err(&pdev->dev, "MANA service: no bus\n"); + goto out; + } + + parent = bus->parent; + if (!parent) { + dev_err(&pdev->dev, "MANA service: no parent bus\n"); + goto out; + } + + pci_stop_and_remove_bus_device(bus->self); + + msleep(MANA_SERVICE_PERIOD * 1000); + + pci_rescan_bus(parent); + +out: + pci_unlock_rescan_remove(); +} + +static void mana_serv_reset(struct pci_dev *pdev) +{ + struct gdma_context *gc = pci_get_drvdata(pdev); + struct hw_channel_context *hwc; + + if (!gc) { + dev_err(&pdev->dev, "MANA service: no GC\n"); + return; + } + + hwc = gc->hwc.driver_data; + if (!hwc) { + dev_err(&pdev->dev, "MANA service: no HWC\n"); + goto out; + } + + /* HWC is not responding in this case, so don't wait */ + hwc->hwc_timeout = 0; + + dev_info(&pdev->dev, "MANA reset cycle start\n"); + + mana_gd_suspend(pdev, PMSG_SUSPEND); + + msleep(MANA_SERVICE_PERIOD * 1000); + + mana_gd_resume(pdev); + + dev_info(&pdev->dev, "MANA reset cycle completed\n"); + +out: + gc->in_service = false; +} + +struct mana_serv_work { + struct work_struct serv_work; + struct pci_dev *pdev; + enum gdma_eqe_type type; +}; + +static void mana_serv_func(struct work_struct *w) +{ + struct mana_serv_work *mns_wk; + struct pci_dev *pdev; + + mns_wk = container_of(w, struct mana_serv_work, serv_work); + pdev = mns_wk->pdev; + + if (!pdev) + goto out; + + switch (mns_wk->type) { + case GDMA_EQE_HWC_FPGA_RECONFIG: + mana_serv_fpga(pdev); + break; + + case GDMA_EQE_HWC_RESET_REQUEST: + mana_serv_reset(pdev); + break; + + default: + dev_err(&pdev->dev, "MANA service: unknown type %d\n", + mns_wk->type); + break; + } + +out: + pci_dev_put(pdev); + kfree(mns_wk); + module_put(THIS_MODULE); +} + static void mana_gd_process_eqe(struct gdma_queue *eq) { u32 head = eq->head % (eq->queue_size / GDMA_EQE_SIZE); struct gdma_context *gc = eq->gdma_dev->gdma_context; struct gdma_eqe *eq_eqe_ptr = eq->queue_mem_ptr; + struct mana_serv_work *mns_wk; union gdma_eqe_info eqe_info; enum gdma_eqe_type type; struct gdma_event event; @@ -402,6 +528,35 @@ static void mana_gd_process_eqe(struct gdma_queue *eq) eq->eq.callback(eq->eq.context, eq, &event); break; + case GDMA_EQE_HWC_FPGA_RECONFIG: + case GDMA_EQE_HWC_RESET_REQUEST: + dev_info(gc->dev, "Recv MANA service type:%d\n", type); + + if (gc->in_service) { + dev_info(gc->dev, "Already in service\n"); + break; + } + + if (!try_module_get(THIS_MODULE)) { + dev_info(gc->dev, "Module is unloading\n"); + break; + } + + mns_wk = kzalloc(sizeof(*mns_wk), GFP_ATOMIC); + if (!mns_wk) { + module_put(THIS_MODULE); + break; + } + + dev_info(gc->dev, "Start MANA service type:%d\n", type); + gc->in_service = true; + mns_wk->pdev = to_pci_dev(gc->dev); + mns_wk->type = type; + pci_dev_get(mns_wk->pdev); + INIT_WORK(&mns_wk->serv_work, mana_serv_func); + schedule_work(&mns_wk->serv_work); + break; + default: break; } @@ -543,7 +698,8 @@ int mana_gd_test_eq(struct gdma_context *gc, struct gdma_queue *eq) err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp); if (err) { - dev_err(dev, "test_eq failed: %d\n", err); + if (mana_need_log(gc, err)) + dev_err(dev, "test_eq failed: %d\n", err); goto out; } @@ -578,7 +734,7 @@ static void mana_gd_destroy_eq(struct 
gdma_context *gc, bool flush_evenets, if (flush_evenets) { err = mana_gd_test_eq(gc, queue); - if (err) + if (err && mana_need_log(gc, err)) dev_warn(gc->dev, "Failed to flush EQ: %d\n", err); } @@ -724,8 +880,9 @@ int mana_gd_destroy_dma_region(struct gdma_context *gc, u64 dma_region_handle) err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp); if (err || resp.hdr.status) { - dev_err(gc->dev, "Failed to destroy DMA region: %d, 0x%x\n", - err, resp.hdr.status); + if (mana_need_log(gc, err)) + dev_err(gc->dev, "Failed to destroy DMA region: %d, 0x%x\n", + err, resp.hdr.status); return -EPROTO; } @@ -1025,8 +1182,9 @@ int mana_gd_deregister_device(struct gdma_dev *gd) err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp); if (err || resp.hdr.status) { - dev_err(gc->dev, "Failed to deregister device: %d, 0x%x\n", - err, resp.hdr.status); + if (mana_need_log(gc, err)) + dev_err(gc->dev, "Failed to deregister device: %d, 0x%x\n", + err, resp.hdr.status); if (!err) err = -EPROTO; } @@ -1642,7 +1800,7 @@ static void mana_gd_remove(struct pci_dev *pdev) } /* The 'state' parameter is not used. */ -static int mana_gd_suspend(struct pci_dev *pdev, pm_message_t state) +int mana_gd_suspend(struct pci_dev *pdev, pm_message_t state) { struct gdma_context *gc = pci_get_drvdata(pdev); @@ -1658,7 +1816,7 @@ static int mana_gd_suspend(struct pci_dev *pdev, pm_message_t state) * fail -- if this happens, it's safer to just report an error than try to undo * what has been done. */ -static int mana_gd_resume(struct pci_dev *pdev) +int mana_gd_resume(struct pci_dev *pdev) { struct gdma_context *gc = pci_get_drvdata(pdev); int err; diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c b/drivers/net/ethernet/microsoft/mana/hw_channel.c index 8ac42d06ed..2a3036976c 100644 --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c @@ -2,6 +2,7 @@ /* Copyright (c) 2021, Microsoft Corporation. 
*/ #include +#include #include static int mana_hwc_get_msg_index(struct hw_channel_context *hwc, u16 *msg_id) @@ -878,7 +879,9 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len, if (!wait_for_completion_timeout(&ctx->comp_event, (msecs_to_jiffies(hwc->hwc_timeout)))) { - dev_err(hwc->dev, "HWC: Request timed out!\n"); + if (hwc->hwc_timeout != 0) + dev_err(hwc->dev, "HWC: Request timed out!\n"); + err = -ETIMEDOUT; goto out; } @@ -889,8 +892,13 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len, if (ctx->status_code && ctx->status_code != GDMA_STATUS_MORE_ENTRIES) { - dev_err(hwc->dev, "HWC: Failed hw_channel req: 0x%x\n", - ctx->status_code); + if (ctx->status_code == GDMA_STATUS_CMD_UNSUPPORTED) { + err = -EOPNOTSUPP; + goto out; + } + if (req_msg->req.msg_type != MANA_QUERY_PHY_STAT) + dev_err(hwc->dev, "HWC: Failed hw_channel req: 0x%x\n", + ctx->status_code); err = -EPROTO; goto out; } diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index c356cfe6ab..35cbb3a15f 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include @@ -45,6 +46,15 @@ static const struct file_operations mana_dbg_q_fops = { .read = mana_dbg_q_read, }; +static bool mana_en_need_log(struct mana_port_context *apc, int err) +{ + if (apc && apc->ac && apc->ac->gdma_dev && + apc->ac->gdma_dev->gdma_context) + return mana_need_log(apc->ac->gdma_dev->gdma_context, err); + else + return true; +} + /* Microsoft Azure Network Adapter (MANA) functions */ static int mana_open(struct net_device *ndev) @@ -249,10 +259,10 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) struct netdev_queue *net_txq; struct mana_stats_tx *tx_stats; struct gdma_queue *gdma_sq; + int err, len, num_gso_seg; unsigned int csum_type; struct mana_txq *txq; struct mana_cq *cq; - int err, len; if (unlikely(!apc->port_is_up)) goto tx_drop; @@ -405,6 +415,7 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) skb_queue_tail(&txq->pending_skbs, skb); len = skb->len; + num_gso_seg = skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1; net_txq = netdev_get_tx_queue(ndev, txq_idx); err = mana_gd_post_work_request(gdma_sq, &pkg.wqe_req, @@ -429,10 +440,13 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) /* skb may be freed after mana_gd_post_work_request. Do not use it. */ skb = NULL; + /* Populate the packet and byte counters based on post-GSO packet + * calculations + */ tx_stats = &txq->stats; u64_stats_update_begin(&tx_stats->syncp); - tx_stats->packets++; - tx_stats->bytes += len; + tx_stats->packets += num_gso_seg; + tx_stats->bytes += len + ((num_gso_seg - 1) * gso_hs); u64_stats_update_end(&tx_stats->syncp); tx_busy: @@ -772,8 +786,13 @@ static int mana_send_request(struct mana_context *ac, void *in_buf, err = mana_gd_send_request(gc, in_len, in_buf, out_len, out_buf); if (err || resp->status) { - dev_err(dev, "Failed to send mana message: %d, 0x%x\n", - err, resp->status); + if (err == -EOPNOTSUPP) + return err; + + if (req->req.msg_type != MANA_QUERY_PHY_STAT && + mana_need_log(gc, err)) + dev_err(dev, "Failed to send mana message: %d, 0x%x\n", + err, resp->status); return err ?
err : -EPROTO; } @@ -848,8 +867,10 @@ static void mana_pf_deregister_hw_vport(struct mana_port_context *apc) err = mana_send_request(apc->ac, &req, sizeof(req), &resp, sizeof(resp)); if (err) { - netdev_err(apc->ndev, "Failed to unregister hw vPort: %d\n", - err); + if (mana_en_need_log(apc, err)) + netdev_err(apc->ndev, "Failed to unregister hw vPort: %d\n", + err); + return; } @@ -904,8 +925,10 @@ static void mana_pf_deregister_filter(struct mana_port_context *apc) err = mana_send_request(apc->ac, &req, sizeof(req), &resp, sizeof(resp)); if (err) { - netdev_err(apc->ndev, "Failed to unregister filter: %d\n", - err); + if (mana_en_need_log(apc, err)) + netdev_err(apc->ndev, "Failed to unregister filter: %d\n", + err); + return; } @@ -1135,7 +1158,9 @@ static int mana_cfg_vport_steering(struct mana_port_context *apc, err = mana_send_request(apc->ac, req, req_buf_size, &resp, sizeof(resp)); if (err) { - netdev_err(ndev, "Failed to configure vPort RX: %d\n", err); + if (mana_en_need_log(apc, err)) + netdev_err(ndev, "Failed to configure vPort RX: %d\n", err); + goto out; } @@ -1230,7 +1255,9 @@ void mana_destroy_wq_obj(struct mana_port_context *apc, u32 wq_type, err = mana_send_request(apc->ac, &req, sizeof(req), &resp, sizeof(resp)); if (err) { - netdev_err(ndev, "Failed to destroy WQ object: %d\n", err); + if (mana_en_need_log(apc, err)) + netdev_err(ndev, "Failed to destroy WQ object: %d\n", err); + return; } @@ -2609,6 +2636,88 @@ void mana_query_gf_stats(struct mana_port_context *apc) apc->eth_stats.hc_tx_err_gdma = resp.tx_err_gdma; } +void mana_query_phy_stats(struct mana_port_context *apc) +{ + struct mana_query_phy_stat_resp resp = {}; + struct mana_query_phy_stat_req req = {}; + struct net_device *ndev = apc->ndev; + int err; + + mana_gd_init_req_hdr(&req.hdr, MANA_QUERY_PHY_STAT, + sizeof(req), sizeof(resp)); + err = mana_send_request(apc->ac, &req, sizeof(req), &resp, + sizeof(resp)); + if (err) + return; + + err = mana_verify_resp_hdr(&resp.hdr, MANA_QUERY_PHY_STAT, + sizeof(resp)); + if (err || resp.hdr.status) { + netdev_err(ndev, + "Failed to query PHY stats: %d, resp:0x%x\n", + err, resp.hdr.status); + return; + } + + /* Aggregate drop counters */ + apc->phy_stats.rx_pkt_drop_phy = resp.rx_pkt_drop_phy; + apc->phy_stats.tx_pkt_drop_phy = resp.tx_pkt_drop_phy; + + /* Per TC traffic Counters */ + apc->phy_stats.rx_pkt_tc0_phy = resp.rx_pkt_tc0_phy; + apc->phy_stats.tx_pkt_tc0_phy = resp.tx_pkt_tc0_phy; + apc->phy_stats.rx_pkt_tc1_phy = resp.rx_pkt_tc1_phy; + apc->phy_stats.tx_pkt_tc1_phy = resp.tx_pkt_tc1_phy; + apc->phy_stats.rx_pkt_tc2_phy = resp.rx_pkt_tc2_phy; + apc->phy_stats.tx_pkt_tc2_phy = resp.tx_pkt_tc2_phy; + apc->phy_stats.rx_pkt_tc3_phy = resp.rx_pkt_tc3_phy; + apc->phy_stats.tx_pkt_tc3_phy = resp.tx_pkt_tc3_phy; + apc->phy_stats.rx_pkt_tc4_phy = resp.rx_pkt_tc4_phy; + apc->phy_stats.tx_pkt_tc4_phy = resp.tx_pkt_tc4_phy; + apc->phy_stats.rx_pkt_tc5_phy = resp.rx_pkt_tc5_phy; + apc->phy_stats.tx_pkt_tc5_phy = resp.tx_pkt_tc5_phy; + apc->phy_stats.rx_pkt_tc6_phy = resp.rx_pkt_tc6_phy; + apc->phy_stats.tx_pkt_tc6_phy = resp.tx_pkt_tc6_phy; + apc->phy_stats.rx_pkt_tc7_phy = resp.rx_pkt_tc7_phy; + apc->phy_stats.tx_pkt_tc7_phy = resp.tx_pkt_tc7_phy; + + /* Per TC byte Counters */ + apc->phy_stats.rx_byte_tc0_phy = resp.rx_byte_tc0_phy; + apc->phy_stats.tx_byte_tc0_phy = resp.tx_byte_tc0_phy; + apc->phy_stats.rx_byte_tc1_phy = resp.rx_byte_tc1_phy; + apc->phy_stats.tx_byte_tc1_phy = resp.tx_byte_tc1_phy; + apc->phy_stats.rx_byte_tc2_phy = resp.rx_byte_tc2_phy; + 
apc->phy_stats.tx_byte_tc2_phy = resp.tx_byte_tc2_phy; + apc->phy_stats.rx_byte_tc3_phy = resp.rx_byte_tc3_phy; + apc->phy_stats.tx_byte_tc3_phy = resp.tx_byte_tc3_phy; + apc->phy_stats.rx_byte_tc4_phy = resp.rx_byte_tc4_phy; + apc->phy_stats.tx_byte_tc4_phy = resp.tx_byte_tc4_phy; + apc->phy_stats.rx_byte_tc5_phy = resp.rx_byte_tc5_phy; + apc->phy_stats.tx_byte_tc5_phy = resp.tx_byte_tc5_phy; + apc->phy_stats.rx_byte_tc6_phy = resp.rx_byte_tc6_phy; + apc->phy_stats.tx_byte_tc6_phy = resp.tx_byte_tc6_phy; + apc->phy_stats.rx_byte_tc7_phy = resp.rx_byte_tc7_phy; + apc->phy_stats.tx_byte_tc7_phy = resp.tx_byte_tc7_phy; + + /* Per TC pause Counters */ + apc->phy_stats.rx_pause_tc0_phy = resp.rx_pause_tc0_phy; + apc->phy_stats.tx_pause_tc0_phy = resp.tx_pause_tc0_phy; + apc->phy_stats.rx_pause_tc1_phy = resp.rx_pause_tc1_phy; + apc->phy_stats.tx_pause_tc1_phy = resp.tx_pause_tc1_phy; + apc->phy_stats.rx_pause_tc2_phy = resp.rx_pause_tc2_phy; + apc->phy_stats.tx_pause_tc2_phy = resp.tx_pause_tc2_phy; + apc->phy_stats.rx_pause_tc3_phy = resp.rx_pause_tc3_phy; + apc->phy_stats.tx_pause_tc3_phy = resp.tx_pause_tc3_phy; + apc->phy_stats.rx_pause_tc4_phy = resp.rx_pause_tc4_phy; + apc->phy_stats.tx_pause_tc4_phy = resp.tx_pause_tc4_phy; + apc->phy_stats.rx_pause_tc5_phy = resp.rx_pause_tc5_phy; + apc->phy_stats.tx_pause_tc5_phy = resp.tx_pause_tc5_phy; + apc->phy_stats.rx_pause_tc6_phy = resp.rx_pause_tc6_phy; + apc->phy_stats.tx_pause_tc6_phy = resp.tx_pause_tc6_phy; + apc->phy_stats.rx_pause_tc7_phy = resp.rx_pause_tc7_phy; + apc->phy_stats.tx_pause_tc7_phy = resp.tx_pause_tc7_phy; +} + static int mana_init_port(struct net_device *ndev) { struct mana_port_context *apc = netdev_priv(ndev); @@ -2803,11 +2912,10 @@ static int mana_dealloc_queues(struct net_device *ndev) apc->rss_state = TRI_STATE_FALSE; err = mana_config_rss(apc, TRI_STATE_FALSE, false, false); - if (err) { + if (err && mana_en_need_log(apc, err)) netdev_err(ndev, "Failed to disable vPort: %d\n", err); - return err; - } + /* Even in err case, still need to cleanup the vPort */ mana_destroy_vport(apc); return 0; diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c index c419626073..4fb3a04994 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c +++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c @@ -7,10 +7,12 @@ #include -static const struct { +struct mana_stats_desc { char name[ETH_GSTRING_LEN]; u16 offset; -} mana_eth_stats[] = { +}; + +static const struct mana_stats_desc mana_eth_stats[] = { {"stop_queue", offsetof(struct mana_ethtool_stats, stop_queue)}, {"wake_queue", offsetof(struct mana_ethtool_stats, wake_queue)}, {"hc_rx_discards_no_wqe", offsetof(struct mana_ethtool_stats, @@ -75,6 +77,59 @@ static const struct { rx_cqe_unknown_type)}, }; +static const struct mana_stats_desc mana_phy_stats[] = { + { "hc_rx_pkt_drop_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_drop_phy) }, + { "hc_tx_pkt_drop_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_drop_phy) }, + { "hc_tc0_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc0_phy) }, + { "hc_tc0_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc0_phy) }, + { "hc_tc0_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc0_phy) }, + { "hc_tc0_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc0_phy) }, + { "hc_tc1_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc1_phy) }, + { "hc_tc1_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, 
rx_byte_tc1_phy) }, + { "hc_tc1_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc1_phy) }, + { "hc_tc1_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc1_phy) }, + { "hc_tc2_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc2_phy) }, + { "hc_tc2_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc2_phy) }, + { "hc_tc2_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc2_phy) }, + { "hc_tc2_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc2_phy) }, + { "hc_tc3_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc3_phy) }, + { "hc_tc3_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc3_phy) }, + { "hc_tc3_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc3_phy) }, + { "hc_tc3_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc3_phy) }, + { "hc_tc4_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc4_phy) }, + { "hc_tc4_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc4_phy) }, + { "hc_tc4_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc4_phy) }, + { "hc_tc4_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc4_phy) }, + { "hc_tc5_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc5_phy) }, + { "hc_tc5_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc5_phy) }, + { "hc_tc5_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc5_phy) }, + { "hc_tc5_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc5_phy) }, + { "hc_tc6_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc6_phy) }, + { "hc_tc6_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc6_phy) }, + { "hc_tc6_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc6_phy) }, + { "hc_tc6_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc6_phy) }, + { "hc_tc7_rx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, rx_pkt_tc7_phy) }, + { "hc_tc7_rx_byte_phy", offsetof(struct mana_ethtool_phy_stats, rx_byte_tc7_phy) }, + { "hc_tc7_tx_pkt_phy", offsetof(struct mana_ethtool_phy_stats, tx_pkt_tc7_phy) }, + { "hc_tc7_tx_byte_phy", offsetof(struct mana_ethtool_phy_stats, tx_byte_tc7_phy) }, + { "hc_tc0_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc0_phy) }, + { "hc_tc0_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc0_phy) }, + { "hc_tc1_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc1_phy) }, + { "hc_tc1_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc1_phy) }, + { "hc_tc2_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc2_phy) }, + { "hc_tc2_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc2_phy) }, + { "hc_tc3_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc3_phy) }, + { "hc_tc3_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc3_phy) }, + { "hc_tc4_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc4_phy) }, + { "hc_tc4_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc4_phy) }, + { "hc_tc5_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc5_phy) }, + { "hc_tc5_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc5_phy) }, + { "hc_tc6_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc6_phy) }, + { "hc_tc6_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc6_phy) }, + { "hc_tc7_rx_pause_phy", offsetof(struct mana_ethtool_phy_stats, rx_pause_tc7_phy) }, + { 
"hc_tc7_tx_pause_phy", offsetof(struct mana_ethtool_phy_stats, tx_pause_tc7_phy) }, +}; + static int mana_get_sset_count(struct net_device *ndev, int stringset) { struct mana_port_context *apc = netdev_priv(ndev); @@ -83,8 +138,8 @@ static int mana_get_sset_count(struct net_device *ndev, int stringset) if (stringset != ETH_SS_STATS) return -EINVAL; - return ARRAY_SIZE(mana_eth_stats) + num_queues * - (MANA_STATS_RX_COUNT + MANA_STATS_TX_COUNT); + return ARRAY_SIZE(mana_eth_stats) + ARRAY_SIZE(mana_phy_stats) + + num_queues * (MANA_STATS_RX_COUNT + MANA_STATS_TX_COUNT); } static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data) @@ -99,6 +154,9 @@ static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data) for (i = 0; i < ARRAY_SIZE(mana_eth_stats); i++) ethtool_puts(&data, mana_eth_stats[i].name); + for (i = 0; i < ARRAY_SIZE(mana_phy_stats); i++) + ethtool_puts(&data, mana_phy_stats[i].name); + for (i = 0; i < num_queues; i++) { ethtool_sprintf(&data, "rx_%d_packets", i); ethtool_sprintf(&data, "rx_%d_bytes", i); @@ -128,6 +186,7 @@ static void mana_get_ethtool_stats(struct net_device *ndev, struct mana_port_context *apc = netdev_priv(ndev); unsigned int num_queues = apc->num_queues; void *eth_stats = &apc->eth_stats; + void *phy_stats = &apc->phy_stats; struct mana_stats_rx *rx_stats; struct mana_stats_tx *tx_stats; unsigned int start; @@ -151,9 +210,18 @@ static void mana_get_ethtool_stats(struct net_device *ndev, /* we call mana function to update stats from GDMA */ mana_query_gf_stats(apc); + /* We call this mana function to get the phy stats from GDMA and includes + * aggregate tx/rx drop counters, Per-TC(Traffic Channel) tx/rx and pause + * counters. + */ + mana_query_phy_stats(apc); + for (q = 0; q < ARRAY_SIZE(mana_eth_stats); q++) data[i++] = *(u64 *)(eth_stats + mana_eth_stats[q].offset); + for (q = 0; q < ARRAY_SIZE(mana_phy_stats); q++) + data[i++] = *(u64 *)(phy_stats + mana_phy_stats[q].offset); + for (q = 0; q < num_queues; q++) { rx_stats = &apc->rxqs[q]->stats; diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index fc77897d29..ef05e8d09f 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -1060,6 +1060,7 @@ struct net_device_context { struct net_device __rcu *vf_netdev; struct netvsc_vf_pcpu_stats __percpu *vf_stats; struct delayed_work vf_takeover; + struct delayed_work vfns_work; /* 1: allocated, serial number is valid. 0: not allocated */ u32 vf_alloc; @@ -1074,6 +1075,8 @@ struct net_device_context { struct netvsc_device_info *saved_netvsc_dev_info; }; +void netvsc_vfns_work(struct work_struct *w); + /* Azure hosts don't support non-TCP port numbers in hashing for fragmented * packets. We can use ethtool to change UDP hash level when necessary. 
*/ diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 9d7ef032a6..b47ec77cfd 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2531,6 +2531,7 @@ static int netvsc_probe(struct hv_device *dev, spin_lock_init(&net_device_ctx->lock); INIT_LIST_HEAD(&net_device_ctx->reconfig_events); INIT_DELAYED_WORK(&net_device_ctx->vf_takeover, netvsc_vf_setup); + INIT_DELAYED_WORK(&net_device_ctx->vfns_work, netvsc_vfns_work); net_device_ctx->vf_stats = netdev_alloc_pcpu_stats(struct netvsc_vf_pcpu_stats); @@ -2673,6 +2674,8 @@ static void netvsc_remove(struct hv_device *dev) cancel_delayed_work_sync(&ndev_ctx->dwork); rtnl_lock(); + cancel_delayed_work_sync(&ndev_ctx->vfns_work); + nvdev = rtnl_dereference(ndev_ctx->nvdev); if (nvdev) { cancel_work_sync(&nvdev->subchan_work); @@ -2714,6 +2717,7 @@ static int netvsc_suspend(struct hv_device *dev) cancel_delayed_work_sync(&ndev_ctx->dwork); rtnl_lock(); + cancel_delayed_work_sync(&ndev_ctx->vfns_work); nvdev = rtnl_dereference(ndev_ctx->nvdev); if (nvdev == NULL) { @@ -2807,6 +2811,27 @@ static void netvsc_event_set_vf_ns(struct net_device *ndev) } } +void netvsc_vfns_work(struct work_struct *w) +{ + struct net_device_context *ndev_ctx = + container_of(w, struct net_device_context, vfns_work.work); + struct net_device *ndev; + + if (!rtnl_trylock()) { + schedule_delayed_work(&ndev_ctx->vfns_work, 1); + return; + } + + ndev = hv_get_drvdata(ndev_ctx->device_ctx); + if (!ndev) + goto out; + + netvsc_event_set_vf_ns(ndev); + +out: + rtnl_unlock(); +} + /* * On Hyper-V, every VF interface is matched with a corresponding * synthetic interface. The synthetic interface is presented first @@ -2817,10 +2842,12 @@ static int netvsc_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *event_dev = netdev_notifier_info_to_dev(ptr); + struct net_device_context *ndev_ctx; int ret = 0; if (event_dev->netdev_ops == &device_ops && event == NETDEV_REGISTER) { - netvsc_event_set_vf_ns(event_dev); + ndev_ctx = netdev_priv(event_dev); + schedule_delayed_work(&ndev_ctx->vfns_work, 0); return NOTIFY_DONE; } diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 173b1c7c5c..a20d0c31bc 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -137,12 +137,14 @@ void nvme_mpath_start_request(struct request *rq) struct nvme_ns *ns = rq->q->queuedata; struct gendisk *disk = ns->head->disk; - if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) { + if ((READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) && + !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) { atomic_inc(&ns->ctrl->nr_active); nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE; } - if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq)) + if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq) || + (nvme_req(rq)->flags & NVME_MPATH_IO_STATS)) return; nvme_req(rq)->flags |= NVME_MPATH_IO_STATS; diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c index 79deddcbdb..84e5397c59 100644 --- a/drivers/scsi/lpfc/lpfc_nvmet.c +++ b/drivers/scsi/lpfc/lpfc_nvmet.c @@ -1245,7 +1245,7 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport, struct lpfc_nvmet_tgtport *tgtp; struct lpfc_async_xchg_ctx *ctxp = container_of(rsp, struct lpfc_async_xchg_ctx, hdlrctx.fcp_req); - struct rqb_dmabuf *nvmebuf = ctxp->rqb_buffer; + struct rqb_dmabuf *nvmebuf; struct lpfc_hba *phba = ctxp->phba; unsigned long iflag; @@ -1253,13 
+1253,18 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport, lpfc_nvmeio_data(phba, "NVMET DEFERRCV: xri x%x sz %d CPU %02x\n", ctxp->oxid, ctxp->size, raw_smp_processor_id()); + spin_lock_irqsave(&ctxp->ctxlock, iflag); + nvmebuf = ctxp->rqb_buffer; if (!nvmebuf) { + spin_unlock_irqrestore(&ctxp->ctxlock, iflag); lpfc_printf_log(phba, KERN_INFO, LOG_NVME_IOERR, "6425 Defer rcv: no buffer oxid x%x: " "flg %x ste %x\n", ctxp->oxid, ctxp->flag, ctxp->state); return; } + ctxp->rqb_buffer = NULL; + spin_unlock_irqrestore(&ctxp->ctxlock, iflag); tgtp = phba->targetport->private; if (tgtp) @@ -1267,9 +1272,6 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport, /* Free the nvmebuf since a new buffer already replaced it */ nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf); - spin_lock_irqsave(&ctxp->ctxlock, iflag); - ctxp->rqb_buffer = NULL; - spin_unlock_irqrestore(&ctxp->ctxlock, iflag); } /** diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c index 64928ff8e9..d3789c5471 100644 --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -90,6 +90,10 @@ static int efivarfs_d_compare(const struct dentry *dentry, { int guid = len - EFI_VARIABLE_GUID_LEN; + /* Parallel lookups may produce a temporary invalid filename */ + if (guid <= 0) + return 1; + if (name->len != len) return 1; diff --git a/fs/inode.c b/fs/inode.c index faae983f0a..e7e809fb10 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -193,8 +193,6 @@ int inode_init_always(struct super_block *sb, struct inode *inode) inode->i_wb_frn_history = 0; #endif - if (security_inode_alloc(inode)) - goto out; spin_lock_init(&inode->i_lock); lockdep_set_class(&inode->i_lock, &sb->s_type->i_lock_key); @@ -231,11 +229,12 @@ int inode_init_always(struct super_block *sb, struct inode *inode) inode->i_fsnotify_mask = 0; #endif inode->i_flctx = NULL; + + if (unlikely(security_inode_alloc(inode))) + return -ENOMEM; this_cpu_inc(nr_inodes); return 0; -out: - return -ENOMEM; } EXPORT_SYMBOL(inode_init_always); diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index ffe8e26d18..c6e9923d93 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -70,6 +70,24 @@ static struct kernfs_open_node *of_on(struct kernfs_open_file *of) !list_empty(&of->list)); } +/* Get active reference to kernfs node for an open file */ +static struct kernfs_open_file *kernfs_get_active_of(struct kernfs_open_file *of) +{ + /* Skip if file was already released */ + if (unlikely(of->released)) + return NULL; + + if (!kernfs_get_active(of->kn)) + return NULL; + + return of; +} + +static void kernfs_put_active_of(struct kernfs_open_file *of) +{ + return kernfs_put_active(of->kn); +} + /** * kernfs_deref_open_node_locked - Get kernfs_open_node corresponding to @kn * @@ -139,7 +157,7 @@ static void kernfs_seq_stop_active(struct seq_file *sf, void *v) if (ops->seq_stop) ops->seq_stop(sf, v); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); } static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) @@ -152,7 +170,7 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) * the ops aren't called concurrently for the same open file. */ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return ERR_PTR(-ENODEV); ops = kernfs_ops(of->kn); @@ -243,7 +261,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter) * the ops aren't called concurrently for the same open file. 
*/ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) { + if (!kernfs_get_active_of(of)) { len = -ENODEV; mutex_unlock(&of->mutex); goto out_free; @@ -257,7 +275,7 @@ static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter) else len = -EINVAL; - kernfs_put_active(of->kn); + kernfs_put_active_of(of); mutex_unlock(&of->mutex); if (len < 0) @@ -328,7 +346,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) * the ops aren't called concurrently for the same open file. */ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) { + if (!kernfs_get_active_of(of)) { mutex_unlock(&of->mutex); len = -ENODEV; goto out_free; @@ -340,7 +358,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) else len = -EINVAL; - kernfs_put_active(of->kn); + kernfs_put_active_of(of); mutex_unlock(&of->mutex); if (len > 0) @@ -362,13 +380,13 @@ static void kernfs_vma_open(struct vm_area_struct *vma) if (!of->vm_ops) return; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return; if (of->vm_ops->open) of->vm_ops->open(vma); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); } static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf) @@ -380,14 +398,14 @@ static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf) if (!of->vm_ops) return VM_FAULT_SIGBUS; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return VM_FAULT_SIGBUS; ret = VM_FAULT_SIGBUS; if (of->vm_ops->fault) ret = of->vm_ops->fault(vmf); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; } @@ -400,7 +418,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf) if (!of->vm_ops) return VM_FAULT_SIGBUS; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return VM_FAULT_SIGBUS; ret = 0; @@ -409,7 +427,7 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf) else file_update_time(file); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; } @@ -423,14 +441,14 @@ static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr, if (!of->vm_ops) return -EINVAL; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return -EINVAL; ret = -EINVAL; if (of->vm_ops->access) ret = of->vm_ops->access(vma, addr, buf, len, write); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; } @@ -460,7 +478,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma) mutex_lock(&of->mutex); rc = -ENODEV; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) goto out_unlock; ops = kernfs_ops(of->kn); @@ -493,7 +511,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma) of->vm_ops = vma->vm_ops; vma->vm_ops = &kernfs_vm_ops; out_put: - kernfs_put_active(of->kn); + kernfs_put_active_of(of); out_unlock: mutex_unlock(&of->mutex); @@ -847,7 +865,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait) struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry); __poll_t ret; - if (!kernfs_get_active(kn)) + if (!kernfs_get_active_of(of)) return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI; if (kn->attr.ops->poll) @@ -855,7 +873,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait) else ret = kernfs_generic_poll(of, wait); - kernfs_put_active(kn); + kernfs_put_active_of(of); return ret; } diff --git a/fs/namespace.c b/fs/namespace.c index dbb0e40cd2..9801f4051b 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2294,6 +2294,19 @@ static int graft_tree(struct 
mount *mnt, struct mount *p, struct mountpoint *mp) return attach_recursive_mnt(mnt, p, mp, false); } +static int may_change_propagation(const struct mount *m) +{ + struct mnt_namespace *ns = m->mnt_ns; + + // it must be mounted in some namespace + if (IS_ERR_OR_NULL(ns)) // is_mounted() + return -EINVAL; + // and the caller must be admin in userns of that namespace + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + return 0; +} + /* * Sanity check the flags to change_mnt_propagation. */ @@ -2330,6 +2343,10 @@ static int do_change_type(struct path *path, int ms_flags) return -EINVAL; namespace_lock(); + err = may_change_propagation(mnt); + if (err) + goto out_unlock; + if (type == MS_SHARED) { err = invent_group_ids(mnt, recurse); if (err) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 25182d0c6c..e8edf6a999 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1840,9 +1840,7 @@ static void block_revalidate(struct dentry *dentry) static void unblock_revalidate(struct dentry *dentry) { - /* store_release ensures wait_var_event() sees the update */ - smp_store_release(&dentry->d_fsdata, NULL); - wake_up_var(&dentry->d_fsdata); + store_release_wake_up(&dentry->d_fsdata, NULL); } /* diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 77afff68f9..3d5b80e85e 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -7815,10 +7815,10 @@ int nfs4_lock_delegation_recall(struct file_lock *fl, struct nfs4_state *state, return err; do { err = _nfs4_do_setlk(state, F_SETLK, fl, NFS_LOCK_NEW); - if (err != -NFS4ERR_DELAY) + if (err != -NFS4ERR_DELAY && err != -NFS4ERR_GRACE) break; ssleep(1); - } while (err == -NFS4ERR_DELAY); + } while (err == -NFS4ERR_DELAY || err == -NFS4ERR_GRACE); return nfs4_handle_delegation_recall_error(server, state, stateid, fl, err); } diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 04124f2266..bb01dbe6e6 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -253,13 +253,14 @@ nfs_page_group_unlock(struct nfs_page *req) nfs_page_clear_headlock(req); } -/* - * nfs_page_group_sync_on_bit_locked +/** + * nfs_page_group_sync_on_bit_locked - Test if all requests have @bit set + * @req: request in page group + * @bit: PG_* bit that is used to sync page group * * must be called with page group lock held */ -static bool -nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit) +bool nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit) { struct nfs_page *head = req->wb_head; struct nfs_page *tmp; diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 4faa6505a5..394fbcb004 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -153,20 +153,10 @@ nfs_page_set_inode_ref(struct nfs_page *req, struct inode *inode) } } -static int -nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode) +static void nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode) { - int ret; - - if (!test_bit(PG_REMOVE, &req->wb_flags)) - return 0; - ret = nfs_page_group_lock(req); - if (ret) - return ret; if (test_and_clear_bit(PG_REMOVE, &req->wb_flags)) nfs_page_set_inode_ref(req, inode); - nfs_page_group_unlock(req); - return 0; } /** @@ -584,19 +574,18 @@ retry: return ERR_PTR(ret); } + ret = nfs_page_group_lock(head); + if (ret < 0) + goto out_unlock; + /* Ensure that nobody removed the request before we locked it */ if (head != folio->private) { + nfs_page_group_unlock(head); nfs_unlock_and_release_request(head); goto retry; } - ret = nfs_cancel_remove_inode(head, inode); - if (ret < 0) - goto out_unlock; - - ret =
nfs_page_group_lock(head); - if (ret < 0) - goto out_unlock; + nfs_cancel_remove_inode(head, inode); /* lock each request in the page group */ for (subreq = head->wb_this_page; @@ -801,7 +790,8 @@ static void nfs_inode_remove_request(struct nfs_page *req) { struct nfs_inode *nfsi = NFS_I(nfs_page_to_inode(req)); - if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) { + nfs_page_group_lock(req); + if (nfs_page_group_sync_on_bit_locked(req, PG_REMOVE)) { struct folio *folio = nfs_page_to_folio(req->wb_head); struct address_space *mapping = folio->mapping; @@ -813,6 +803,7 @@ static void nfs_inode_remove_request(struct nfs_page *req) } spin_unlock(&mapping->private_lock); } + nfs_page_group_unlock(req); if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) { atomic_long_dec(&nfsi->nrequests); diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c index 46a7f9b813..b02886f389 100644 --- a/fs/nfsd/lockd.c +++ b/fs/nfsd/lockd.c @@ -48,6 +48,21 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp, switch (nfserr) { case nfs_ok: return 0; + case nfserr_jukebox: + /* this error can indicate a presence of a conflicting + * delegation to an NLM lock request. Options are: + * (1) For now, drop this request and make the client + * retry. When delegation is returned, client's lock retry + * will complete. + * (2) NLM4_DENIED as per "spec" signals to the client + * that the lock is unavailable now but client can retry. + * Linux client implementation does not. It treats + * NLM4_DENIED same as NLM4_FAILED and errors the request. + * (3) For the future, treat this as blocked lock and try + * to callback when the delegation is returned but might + * not have a proper lock request to block on. + */ + fallthrough; case nfserr_dropit: return nlm_drop_reply; case nfserr_stale: diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index d8eed853d5..3c67f52bb1 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -95,10 +95,10 @@ static int decode_cb_fattr4(struct xdr_stream *xdr, uint32_t *bitmap, fattr->ncf_cb_fsize = 0; if (bitmap[0] & FATTR4_WORD0_CHANGE) if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_change) < 0) - return -NFSERR_BAD_XDR; + return -EIO; if (bitmap[0] & FATTR4_WORD0_SIZE) if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_fsize) < 0) - return -NFSERR_BAD_XDR; + return -EIO; return 0; } @@ -605,14 +605,14 @@ static int nfs4_xdr_dec_cb_getattr(struct rpc_rqst *rqstp, return status; status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status); - if (status) + if (unlikely(status || cb->cb_status)) return status; if (xdr_stream_decode_uint32_array(xdr, bitmap, 3) < 0) - return -NFSERR_BAD_XDR; + return -EIO; if (xdr_stream_decode_u32(xdr, &attrlen) < 0) - return -NFSERR_BAD_XDR; + return -EIO; if (attrlen > (sizeof(ncf->ncf_cb_change) + sizeof(ncf->ncf_cb_fsize))) - return -NFSERR_BAD_XDR; + return -EIO; status = decode_cb_fattr4(xdr, bitmap, ncf); return status; } diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c index fe5305028c..2928405785 100644 --- a/fs/pstore/ram_core.c +++ b/fs/pstore/ram_core.c @@ -514,7 +514,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig, sig ^= PERSISTENT_RAM_SIG; if (prz->buffer->sig == sig) { - if (buffer_size(prz) == 0) { + if (buffer_size(prz) == 0 && buffer_start(prz) == 0) { pr_debug("found existing empty buffer\n"); return 0; } diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index adce4573d0..ed69205386 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ 
-87,7 +87,7 @@ #define SMB_INTERFACE_POLL_INTERVAL 600 /* maximum number of PDUs in one compound */ -#define MAX_COMPOUND 7 +#define MAX_COMPOUND 10 /* * Default number of credits to keep available for SMB3. @@ -1938,9 +1938,12 @@ static inline bool is_replayable_error(int error) /* cifs_get_writable_file() flags */ -#define FIND_WR_ANY 0 -#define FIND_WR_FSUID_ONLY 1 -#define FIND_WR_WITH_DELETE 2 +enum cifs_writable_file_flags { + FIND_WR_ANY = 0U, + FIND_WR_FSUID_ONLY = (1U << 0), + FIND_WR_WITH_DELETE = (1U << 1), + FIND_WR_NO_PENDING_DELETE = (1U << 2), +}; #define MID_FREE 0 #define MID_REQUEST_ALLOCATED 1 @@ -2374,6 +2377,8 @@ struct smb2_compound_vars { struct kvec qi_iov; struct kvec io_iov[SMB2_IOCTL_IOV_SIZE]; struct kvec si_iov[SMB2_SET_INFO_IOV_SIZE]; + struct kvec unlink_iov[SMB2_SET_INFO_IOV_SIZE]; + struct kvec rename_iov[SMB2_SET_INFO_IOV_SIZE]; struct kvec close_iov; struct smb2_file_rename_info_hdr rename_info; struct smb2_file_link_info_hdr link_info; diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 776a1fcd5a..ce111e7893 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -297,8 +297,8 @@ extern void cifs_close_deferred_file(struct cifsInodeInfo *cifs_inode); extern void cifs_close_all_deferred_files(struct cifs_tcon *cifs_tcon); -extern void cifs_close_deferred_file_under_dentry(struct cifs_tcon *cifs_tcon, - const char *path); +void cifs_close_deferred_file_under_dentry(struct cifs_tcon *cifs_tcon, + struct dentry *dentry); extern void cifs_mark_open_handles_for_deleted_file(struct inode *inode, const char *path); diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c index e951e6ac5a..1aa39f9bde 100644 --- a/fs/smb/client/file.c +++ b/fs/smb/client/file.c @@ -681,7 +681,10 @@ int cifs_open(struct inode *inode, struct file *file) /* Get the cached handle as SMB2 close is deferred */ if (OPEN_FMODE(file->f_flags) & FMODE_WRITE) { - rc = cifs_get_writable_path(tcon, full_path, FIND_WR_FSUID_ONLY, &cfile); + rc = cifs_get_writable_path(tcon, full_path, + FIND_WR_FSUID_ONLY | + FIND_WR_NO_PENDING_DELETE, + &cfile); } else { rc = cifs_get_readable_path(tcon, full_path, &cfile); } @@ -2286,6 +2289,9 @@ refind_writable: continue; if (with_delete && !(open_file->fid.access & DELETE)) continue; + if ((flags & FIND_WR_NO_PENDING_DELETE) && + open_file->status_file_deleted) + continue; if (OPEN_FMODE(open_file->f_flags) & FMODE_WRITE) { if (!open_file->invalidHandle) { /* found a good writable file */ @@ -2403,6 +2409,16 @@ cifs_get_readable_path(struct cifs_tcon *tcon, const char *name, spin_unlock(&tcon->open_file_lock); free_dentry_path(page); *ret_file = find_readable_file(cinode, 0); + if (*ret_file) { + spin_lock(&cinode->open_file_lock); + if ((*ret_file)->status_file_deleted) { + spin_unlock(&cinode->open_file_lock); + cifsFileInfo_put(*ret_file); + *ret_file = NULL; + } else { + spin_unlock(&cinode->open_file_lock); + } + } return *ret_file ? 0 : -ENOENT; } diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index 2320f1e1f2..8780d30ffd 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -1912,7 +1912,7 @@ cifs_drop_nlink(struct inode *inode) * but will return the EACCES to the caller. Note that the VFS does not call * unlink on negative dentries currently. 
*/ -int cifs_unlink(struct inode *dir, struct dentry *dentry) +static int __cifs_unlink(struct inode *dir, struct dentry *dentry, bool sillyrename) { int rc = 0; unsigned int xid; @@ -1964,7 +1964,7 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) goto unlink_out; } - cifs_close_deferred_file_under_dentry(tcon, full_path); + cifs_close_deferred_file_under_dentry(tcon, dentry); #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY if (cap_unix(tcon->ses) && (CIFS_UNIX_POSIX_PATH_OPS_CAP & le64_to_cpu(tcon->fsUnixInfo.Capability))) { @@ -1983,7 +1983,24 @@ retry_std_delete: goto psx_del_no_retry; } - rc = server->ops->unlink(xid, tcon, full_path, cifs_sb, dentry); + /* For SMB2+, if the file is open, we always perform a silly rename. + * + * We check for d_count() right after calling + * cifs_close_deferred_file_under_dentry() to make sure that the + * dentry's refcount gets dropped in case the file had any deferred + * close. + */ + if (!sillyrename && server->vals->protocol_id > SMB10_PROT_ID) { + spin_lock(&dentry->d_lock); + if (d_count(dentry) > 1) + sillyrename = true; + spin_unlock(&dentry->d_lock); + } + + if (sillyrename) + rc = -EBUSY; + else + rc = server->ops->unlink(xid, tcon, full_path, cifs_sb, dentry); psx_del_no_retry: if (!rc) { @@ -2051,6 +2068,11 @@ unlink_out: return rc; } +int cifs_unlink(struct inode *dir, struct dentry *dentry) +{ + return __cifs_unlink(dir, dentry, false); +} + static int cifs_mkdir_qinfo(struct inode *parent, struct dentry *dentry, umode_t mode, const char *full_path, struct cifs_sb_info *cifs_sb, @@ -2338,14 +2360,16 @@ int cifs_rmdir(struct inode *inode, struct dentry *direntry) rc = server->ops->rmdir(xid, tcon, full_path, cifs_sb); cifs_put_tlink(tlink); + cifsInode = CIFS_I(d_inode(direntry)); + if (!rc) { + set_bit(CIFS_INO_DELETE_PENDING, &cifsInode->flags); spin_lock(&d_inode(direntry)->i_lock); i_size_write(d_inode(direntry), 0); clear_nlink(d_inode(direntry)); spin_unlock(&d_inode(direntry)->i_lock); } - cifsInode = CIFS_I(d_inode(direntry)); /* force revalidate to go get info when needed */ cifsInode->time = 0; @@ -2438,8 +2462,11 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry, } #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */ do_rename_exit: - if (rc == 0) + if (rc == 0) { d_move(from_dentry, to_dentry); + /* Force a new lookup */ + d_drop(from_dentry); + } cifs_put_tlink(tlink); return rc; } @@ -2450,6 +2477,7 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, struct dentry *target_dentry, unsigned int flags) { const char *from_name, *to_name; + struct TCP_Server_Info *server; void *page1, *page2; struct cifs_sb_info *cifs_sb; struct tcon_link *tlink; @@ -2485,6 +2513,7 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, if (IS_ERR(tlink)) return PTR_ERR(tlink); tcon = tlink_tcon(tlink); + server = tcon->ses->server; page1 = alloc_dentry_path(); page2 = alloc_dentry_path(); @@ -2502,9 +2531,9 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, goto cifs_rename_exit; } - cifs_close_deferred_file_under_dentry(tcon, from_name); + cifs_close_deferred_file_under_dentry(tcon, source_dentry); if (d_inode(target_dentry) != NULL) - cifs_close_deferred_file_under_dentry(tcon, to_name); + cifs_close_deferred_file_under_dentry(tcon, target_dentry); rc = cifs_do_rename(xid, source_dentry, from_name, target_dentry, to_name); @@ -2569,19 +2598,52 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir, unlink_target: #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */ - - /* Try 
unlinking the target dentry if it's not negative */ - if (d_really_is_positive(target_dentry) && (rc == -EACCES || rc == -EEXIST)) { - if (d_is_dir(target_dentry)) - tmprc = cifs_rmdir(target_dir, target_dentry); - else - tmprc = cifs_unlink(target_dir, target_dentry); - if (tmprc) - goto cifs_rename_exit; - rc = cifs_do_rename(xid, source_dentry, from_name, - target_dentry, to_name); - if (!rc) - rehash = false; + if (d_really_is_positive(target_dentry)) { + if (!rc) { + struct inode *inode = d_inode(target_dentry); + /* + * Samba and ksmbd servers allow renaming a target + * directory that is open, so make sure to update + * ->i_nlink and then mark it as delete pending. + */ + if (S_ISDIR(inode->i_mode)) { + drop_cached_dir_by_name(xid, tcon, to_name, cifs_sb); + spin_lock(&inode->i_lock); + i_size_write(inode, 0); + clear_nlink(inode); + spin_unlock(&inode->i_lock); + set_bit(CIFS_INO_DELETE_PENDING, &CIFS_I(inode)->flags); + CIFS_I(inode)->time = 0; /* force reval */ + inode->i_mtime = inode_set_ctime_current(inode); + } + } else if (rc == -EACCES || rc == -EEXIST) { + /* + * Rename failed, possibly due to a busy target. + * Retry it by unlinking the target first. + */ + if (d_is_dir(target_dentry)) { + tmprc = cifs_rmdir(target_dir, target_dentry); + } else { + tmprc = __cifs_unlink(target_dir, target_dentry, + server->vals->protocol_id > SMB10_PROT_ID); + } + if (tmprc) { + /* + * Some servers will return STATUS_ACCESS_DENIED + * or STATUS_DIRECTORY_NOT_EMPTY when failing to + * rename a non-empty directory. Make sure to + * propagate the appropriate error back to + * userspace. + */ + if (tmprc == -EEXIST || tmprc == -ENOTEMPTY) + rc = tmprc; + goto cifs_rename_exit; + } + rc = cifs_do_rename(xid, source_dentry, from_name, + target_dentry, to_name); + if (!rc) + rehash = false; + } } /* force revalidate to go get info when needed */ @@ -2610,6 +2672,8 @@ cifs_dentry_needs_reval(struct dentry *dentry) struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb); struct cached_fid *cfid = NULL; + if (test_bit(CIFS_INO_DELETE_PENDING, &cifs_i->flags)) + return false; if (cifs_i->time == 0) return true; diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c index 0d4b82c448..2b121f6c0b 100644 --- a/fs/smb/client/misc.c +++ b/fs/smb/client/misc.c @@ -832,33 +832,28 @@ cifs_close_all_deferred_files(struct cifs_tcon *tcon) kfree(tmp_list); } } -void -cifs_close_deferred_file_under_dentry(struct cifs_tcon *tcon, const char *path) + +void cifs_close_deferred_file_under_dentry(struct cifs_tcon *tcon, + struct dentry *dentry) { - struct cifsFileInfo *cfile; struct file_list *tmp_list, *tmp_next_list; - void *page; - const char *full_path; + struct cifsFileInfo *cfile; LIST_HEAD(file_head); - page = alloc_dentry_path(); spin_lock(&tcon->open_file_lock); list_for_each_entry(cfile, &tcon->openFileList, tlist) { - full_path = build_path_from_dentry(cfile->dentry, page); - if (strstr(full_path, path)) { - if (delayed_work_pending(&cfile->deferred)) { - if (cancel_delayed_work(&cfile->deferred)) { - spin_lock(&CIFS_I(d_inode(cfile->dentry))->deferred_lock); - cifs_del_deferred_close(cfile); - spin_unlock(&CIFS_I(d_inode(cfile->dentry))->deferred_lock); + if ((cfile->dentry == dentry) && + delayed_work_pending(&cfile->deferred) && + cancel_delayed_work(&cfile->deferred)) { + spin_lock(&CIFS_I(d_inode(cfile->dentry))->deferred_lock); + cifs_del_deferred_close(cfile); + spin_unlock(&CIFS_I(d_inode(cfile->dentry))->deferred_lock); - tmp_list = kmalloc(sizeof(struct file_list), GFP_ATOMIC); - if (tmp_list ==
NULL) - break; - tmp_list->cfile = cfile; - list_add_tail(&tmp_list->list, &file_head); - } - } + tmp_list = kmalloc(sizeof(struct file_list), GFP_ATOMIC); + if (tmp_list == NULL) + break; + tmp_list->cfile = cfile; + list_add_tail(&tmp_list->list, &file_head); } } spin_unlock(&tcon->open_file_lock); @@ -868,7 +863,6 @@ cifs_close_deferred_file_under_dentry(struct cifs_tcon *tcon, const char *path) list_del(&tmp_list->list); kfree(tmp_list); } - free_dentry_path(page); } /* diff --git a/fs/smb/client/smb2glob.h b/fs/smb/client/smb2glob.h index 224495322a..e56e4d402f 100644 --- a/fs/smb/client/smb2glob.h +++ b/fs/smb/client/smb2glob.h @@ -30,10 +30,9 @@ enum smb2_compound_ops { SMB2_OP_QUERY_DIR, SMB2_OP_MKDIR, SMB2_OP_RENAME, - SMB2_OP_DELETE, SMB2_OP_HARDLINK, SMB2_OP_SET_EOF, - SMB2_OP_RMDIR, + SMB2_OP_UNLINK, SMB2_OP_POSIX_QUERY_INFO, SMB2_OP_SET_REPARSE, SMB2_OP_GET_REPARSE, diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c index 0a76620d4b..c534595ba9 100644 --- a/fs/smb/client/smb2inode.c +++ b/fs/smb/client/smb2inode.c @@ -207,8 +207,10 @@ replay_again: server = cifs_pick_channel(ses); vars = kzalloc(sizeof(*vars), GFP_ATOMIC); - if (vars == NULL) - return -ENOMEM; + if (vars == NULL) { + rc = -ENOMEM; + goto out; + } rqst = &vars->rqst[0]; rsp_iov = &vars->rsp_iov[0]; @@ -344,9 +346,6 @@ replay_again: trace_smb3_posix_query_info_compound_enter(xid, tcon->tid, ses->Suid, full_path); break; - case SMB2_OP_DELETE: - trace_smb3_delete_enter(xid, tcon->tid, ses->Suid, full_path); - break; case SMB2_OP_MKDIR: /* * Directories are created through parameters in the @@ -354,23 +353,40 @@ replay_again: */ trace_smb3_mkdir_enter(xid, tcon->tid, ses->Suid, full_path); break; - case SMB2_OP_RMDIR: - rqst[num_rqst].rq_iov = &vars->si_iov[0]; + case SMB2_OP_UNLINK: + rqst[num_rqst].rq_iov = vars->unlink_iov; rqst[num_rqst].rq_nvec = 1; size[0] = 1; /* sizeof __u8 See MS-FSCC section 2.4.11 */ data[0] = &delete_pending[0]; - rc = SMB2_set_info_init(tcon, server, - &rqst[num_rqst], COMPOUND_FID, - COMPOUND_FID, current->tgid, - FILE_DISPOSITION_INFORMATION, - SMB2_O_INFO_FILE, 0, data, size); - if (rc) + if (cfile) { + rc = SMB2_set_info_init(tcon, server, + &rqst[num_rqst], + cfile->fid.persistent_fid, + cfile->fid.volatile_fid, + current->tgid, + FILE_DISPOSITION_INFORMATION, + SMB2_O_INFO_FILE, 0, + data, size); + } else { + rc = SMB2_set_info_init(tcon, server, + &rqst[num_rqst], + COMPOUND_FID, + COMPOUND_FID, + current->tgid, + FILE_DISPOSITION_INFORMATION, + SMB2_O_INFO_FILE, 0, + data, size); + } + if (!rc && (!cfile || num_rqst > 1)) { + smb2_set_next_command(tcon, &rqst[num_rqst]); + smb2_set_related(&rqst[num_rqst]); + } else if (rc) { goto finished; - smb2_set_next_command(tcon, &rqst[num_rqst]); - smb2_set_related(&rqst[num_rqst++]); - trace_smb3_rmdir_enter(xid, tcon->tid, ses->Suid, full_path); + } + num_rqst++; + trace_smb3_unlink_enter(xid, tcon->tid, ses->Suid, full_path); break; case SMB2_OP_SET_EOF: rqst[num_rqst].rq_iov = &vars->si_iov[0]; @@ -440,7 +456,7 @@ replay_again: ses->Suid, full_path); break; case SMB2_OP_RENAME: - rqst[num_rqst].rq_iov = &vars->si_iov[0]; + rqst[num_rqst].rq_iov = vars->rename_iov; rqst[num_rqst].rq_nvec = 2; len = in_iov[i].iov_len; @@ -671,7 +687,7 @@ finished: } for (i = 0; i < num_cmds; i++) { - char *buf = rsp_iov[i + i].iov_base; + char *buf = rsp_iov[i + 1].iov_base; if (buf && resp_buftype[i + 1] != CIFS_NO_BUFFER) rc = server->ops->map_error(buf, false); @@ -730,19 +746,6 @@ finished: 
trace_smb3_posix_query_info_compound_done(xid, tcon->tid, ses->Suid); break; - case SMB2_OP_DELETE: - if (rc) - trace_smb3_delete_err(xid, tcon->tid, ses->Suid, rc); - else { - /* - * If dentry (hence, inode) is NULL, lease break is going to - * take care of degrading leases on handles for deleted files. - */ - if (inode) - cifs_mark_open_handles_for_deleted_file(inode, full_path); - trace_smb3_delete_done(xid, tcon->tid, ses->Suid); - } - break; case SMB2_OP_MKDIR: if (rc) trace_smb3_mkdir_err(xid, tcon->tid, ses->Suid, rc); @@ -763,11 +766,11 @@ finished: trace_smb3_rename_done(xid, tcon->tid, ses->Suid); SMB2_set_info_free(&rqst[num_rqst++]); break; - case SMB2_OP_RMDIR: - if (rc) - trace_smb3_rmdir_err(xid, tcon->tid, ses->Suid, rc); + case SMB2_OP_UNLINK: + if (!rc) + trace_smb3_unlink_done(xid, tcon->tid, ses->Suid); else - trace_smb3_rmdir_done(xid, tcon->tid, ses->Suid); + trace_smb3_unlink_err(xid, tcon->tid, ses->Suid, rc); SMB2_set_info_free(&rqst[num_rqst++]); break; case SMB2_OP_SET_EOF: @@ -864,6 +867,7 @@ finished: smb2_should_replay(tcon, &retries, &cur_sleep)) goto replay_again; +out: if (cfile) cifsFileInfo_put(cfile); @@ -1163,7 +1167,7 @@ smb2_rmdir(const unsigned int xid, struct cifs_tcon *tcon, const char *name, FILE_OPEN, CREATE_NOT_FILE, ACL_NO_MODE); return smb2_compound_op(xid, tcon, cifs_sb, name, &oparms, NULL, - &(int){SMB2_OP_RMDIR}, 1, + &(int){SMB2_OP_UNLINK}, 1, NULL, NULL, NULL, NULL); } @@ -1171,21 +1175,107 @@ int smb2_unlink(const unsigned int xid, struct cifs_tcon *tcon, const char *name, struct cifs_sb_info *cifs_sb, struct dentry *dentry) { + struct kvec open_iov[SMB2_CREATE_IOV_SIZE]; + __le16 *utf16_path __free(kfree) = NULL; + int retries = 0, cur_sleep = 1; + struct TCP_Server_Info *server; struct cifs_open_parms oparms; + struct smb2_create_req *creq; + struct inode *inode = NULL; + struct smb_rqst rqst[2]; + struct kvec rsp_iov[2]; + struct kvec close_iov; + int resp_buftype[2]; + struct cifs_fid fid; + int flags = 0; + __u8 oplock; + int rc; - oparms = CIFS_OPARMS(cifs_sb, tcon, name, - DELETE, FILE_OPEN, - CREATE_DELETE_ON_CLOSE | OPEN_REPARSE_POINT, - ACL_NO_MODE); - int rc = smb2_compound_op(xid, tcon, cifs_sb, name, &oparms, - NULL, &(int){SMB2_OP_DELETE}, 1, - NULL, NULL, NULL, dentry); - if (rc == -EINVAL) { - cifs_dbg(FYI, "invalid lease key, resending request without lease"); - rc = smb2_compound_op(xid, tcon, cifs_sb, name, &oparms, - NULL, &(int){SMB2_OP_DELETE}, 1, - NULL, NULL, NULL, NULL); + utf16_path = cifs_convert_path_to_utf16(name, cifs_sb); + if (!utf16_path) + return -ENOMEM; + + if (smb3_encryption_required(tcon)) + flags |= CIFS_TRANSFORM_REQ; +again: + oplock = SMB2_OPLOCK_LEVEL_NONE; + server = cifs_pick_channel(tcon->ses); + + memset(rqst, 0, sizeof(rqst)); + memset(resp_buftype, 0, sizeof(resp_buftype)); + memset(rsp_iov, 0, sizeof(rsp_iov)); + + rqst[0].rq_iov = open_iov; + rqst[0].rq_nvec = ARRAY_SIZE(open_iov); + + oparms = CIFS_OPARMS(cifs_sb, tcon, name, DELETE | FILE_READ_ATTRIBUTES, + FILE_OPEN, CREATE_DELETE_ON_CLOSE | + OPEN_REPARSE_POINT, ACL_NO_MODE); + oparms.fid = &fid; + + if (dentry) { + inode = d_inode(dentry); + if (CIFS_I(inode)->lease_granted && server->ops->get_lease_key) { + oplock = SMB2_OPLOCK_LEVEL_LEASE; + server->ops->get_lease_key(inode, &fid); + } } + + rc = SMB2_open_init(tcon, server, + &rqst[0], &oplock, &oparms, utf16_path); + if (rc) + goto err_free; + smb2_set_next_command(tcon, &rqst[0]); + creq = rqst[0].rq_iov[0].iov_base; + creq->ShareAccess = FILE_SHARE_DELETE_LE; + + 
rqst[1].rq_iov = &close_iov; + rqst[1].rq_nvec = 1; + + rc = SMB2_close_init(tcon, server, &rqst[1], + COMPOUND_FID, COMPOUND_FID, false); + smb2_set_related(&rqst[1]); + if (rc) + goto err_free; + + if (retries) { + for (int i = 0; i < ARRAY_SIZE(rqst); i++) + smb2_set_replay(server, &rqst[i]); + } + + rc = compound_send_recv(xid, tcon->ses, server, flags, + ARRAY_SIZE(rqst), rqst, + resp_buftype, rsp_iov); + SMB2_open_free(&rqst[0]); + SMB2_close_free(&rqst[1]); + free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base); + free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base); + + if (is_replayable_error(rc) && + smb2_should_replay(tcon, &retries, &cur_sleep)) + goto again; + + /* Retry compound request without lease */ + if (rc == -EINVAL && dentry) { + dentry = NULL; + retries = 0; + cur_sleep = 1; + goto again; + } + /* + * If dentry (hence, inode) is NULL, lease break is going to + * take care of degrading leases on handles for deleted files. + */ + if (!rc && inode) + cifs_mark_open_handles_for_deleted_file(inode, name); + + return rc; + +err_free: + SMB2_open_free(&rqst[0]); + SMB2_close_free(&rqst[1]); + free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base); + free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base); return rc; } @@ -1438,3 +1528,113 @@ out: cifs_free_open_info(&data); return rc; } + +static inline __le16 *utf16_smb2_path(struct cifs_sb_info *cifs_sb, + const char *name, size_t namelen) +{ + int len; + + if (*name == '\\' || + (cifs_sb_master_tlink(cifs_sb) && + cifs_sb_master_tcon(cifs_sb)->posix_extensions && *name == '/')) + name++; + return cifs_strndup_to_utf16(name, namelen, &len, + cifs_sb->local_nls, + cifs_remap(cifs_sb)); +} + +int smb2_rename_pending_delete(const char *full_path, + struct dentry *dentry, + const unsigned int xid) +{ + struct cifs_sb_info *cifs_sb = CIFS_SB(d_inode(dentry)->i_sb); + struct cifsInodeInfo *cinode = CIFS_I(d_inode(dentry)); + __le16 *utf16_path __free(kfree) = NULL; + __u32 co = file_create_options(dentry); + int cmds[] = { + SMB2_OP_SET_INFO, + SMB2_OP_RENAME, + SMB2_OP_UNLINK, + }; + const int num_cmds = ARRAY_SIZE(cmds); + char *to_name __free(kfree) = NULL; + __u32 attrs = cinode->cifsAttrs; + struct cifs_open_parms oparms; + static atomic_t sillycounter; + struct cifsFileInfo *cfile; + struct tcon_link *tlink; + struct cifs_tcon *tcon; + struct kvec iov[2]; + const char *ppath; + void *page; + size_t len; + int rc; + + tlink = cifs_sb_tlink(cifs_sb); + if (IS_ERR(tlink)) + return PTR_ERR(tlink); + tcon = tlink_tcon(tlink); + + page = alloc_dentry_path(); + + ppath = build_path_from_dentry(dentry->d_parent, page); + if (IS_ERR(ppath)) { + rc = PTR_ERR(ppath); + goto out; + } + + len = strlen(ppath) + strlen("/.__smb1234") + 1; + to_name = kmalloc(len, GFP_KERNEL); + if (!to_name) { + rc = -ENOMEM; + goto out; + } + + scnprintf(to_name, len, "%s%c.__smb%04X", ppath, CIFS_DIR_SEP(cifs_sb), + atomic_inc_return(&sillycounter) & 0xffff); + + utf16_path = utf16_smb2_path(cifs_sb, to_name, len); + if (!utf16_path) { + rc = -ENOMEM; + goto out; + } + + drop_cached_dir_by_name(xid, tcon, full_path, cifs_sb); + oparms = CIFS_OPARMS(cifs_sb, tcon, full_path, + DELETE | FILE_WRITE_ATTRIBUTES, + FILE_OPEN, co, ACL_NO_MODE); + + attrs &= ~ATTR_READONLY; + if (!attrs) + attrs = ATTR_NORMAL; + if (d_inode(dentry)->i_nlink <= 1) + attrs |= ATTR_HIDDEN; + iov[0].iov_base = &(FILE_BASIC_INFO) { + .Attributes = cpu_to_le32(attrs), + }; + iov[0].iov_len = sizeof(FILE_BASIC_INFO); + iov[1].iov_base = utf16_path; + iov[1].iov_len = sizeof(*utf16_path) * 
UniStrlen((wchar_t *)utf16_path); + + cifs_get_writable_path(tcon, full_path, FIND_WR_WITH_DELETE, &cfile); + rc = smb2_compound_op(xid, tcon, cifs_sb, full_path, &oparms, iov, + cmds, num_cmds, cfile, NULL, NULL, dentry); + if (rc == -EINVAL) { + cifs_dbg(FYI, "invalid lease key, resending request without lease\n"); + cifs_get_writable_path(tcon, full_path, + FIND_WR_WITH_DELETE, &cfile); + rc = smb2_compound_op(xid, tcon, cifs_sb, full_path, &oparms, iov, + cmds, num_cmds, cfile, NULL, NULL, NULL); + } + if (!rc) { + set_bit(CIFS_INO_DELETE_PENDING, &cinode->flags); + } else { + cifs_tcon_dbg(FYI, "%s: failed to rename '%s' to '%s': %d\n", + __func__, full_path, to_name, rc); + rc = -EIO; + } +out: + cifs_put_tlink(tlink); + free_dentry_path(page); + return rc; +} diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index 65dc20a16a..364f53a805 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -2596,13 +2596,35 @@ smb2_set_next_command(struct cifs_tcon *tcon, struct smb_rqst *rqst) } /* SMB headers in a compound are 8 byte aligned. */ - if (!IS_ALIGNED(len, 8)) { - num_padding = 8 - (len & 7); + if (IS_ALIGNED(len, 8)) + goto out; + + num_padding = 8 - (len & 7); + if (smb3_encryption_required(tcon)) { + int i; + + /* + * Flatten request into a single buffer with required padding as + * the encryption layer can't handle the padding iovs. + */ + for (i = 1; i < rqst->rq_nvec; i++) { + memcpy(rqst->rq_iov[0].iov_base + + rqst->rq_iov[0].iov_len, + rqst->rq_iov[i].iov_base, + rqst->rq_iov[i].iov_len); + rqst->rq_iov[0].iov_len += rqst->rq_iov[i].iov_len; + } + memset(rqst->rq_iov[0].iov_base + rqst->rq_iov[0].iov_len, + 0, num_padding); + rqst->rq_iov[0].iov_len += num_padding; + rqst->rq_nvec = 1; + } else { rqst->rq_iov[rqst->rq_nvec].iov_base = smb2_padding; rqst->rq_iov[rqst->rq_nvec].iov_len = num_padding; rqst->rq_nvec++; - len += num_padding; } + len += num_padding; +out: shdr->NextCommand = cpu_to_le32(len); } @@ -5320,6 +5342,7 @@ struct smb_version_operations smb20_operations = { .llseek = smb3_llseek, .is_status_io_timeout = smb2_is_status_io_timeout, .is_network_name_deleted = smb2_is_network_name_deleted, + .rename_pending_delete = smb2_rename_pending_delete, }; #endif /* CIFS_ALLOW_INSECURE_LEGACY */ @@ -5425,6 +5448,7 @@ struct smb_version_operations smb21_operations = { .llseek = smb3_llseek, .is_status_io_timeout = smb2_is_status_io_timeout, .is_network_name_deleted = smb2_is_network_name_deleted, + .rename_pending_delete = smb2_rename_pending_delete, }; struct smb_version_operations smb30_operations = { @@ -5541,6 +5565,7 @@ struct smb_version_operations smb30_operations = { .llseek = smb3_llseek, .is_status_io_timeout = smb2_is_status_io_timeout, .is_network_name_deleted = smb2_is_network_name_deleted, + .rename_pending_delete = smb2_rename_pending_delete, }; struct smb_version_operations smb311_operations = { @@ -5657,6 +5682,7 @@ struct smb_version_operations smb311_operations = { .llseek = smb3_llseek, .is_status_io_timeout = smb2_is_status_io_timeout, .is_network_name_deleted = smb2_is_network_name_deleted, + .rename_pending_delete = smb2_rename_pending_delete, }; #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h index 6dcbf98e83..4c98aa3fa8 100644 --- a/fs/smb/client/smb2proto.h +++ b/fs/smb/client/smb2proto.h @@ -318,5 +318,8 @@ int posix_info_sid_size(const void *beg, const void *end); int smb2_make_nfs_node(unsigned int xid, struct inode *inode, struct dentry *dentry, struct 
cifs_tcon *tcon, const char *full_path, umode_t mode, dev_t dev); +int smb2_rename_pending_delete(const char *full_path, + struct dentry *dentry, + const unsigned int xid); #endif /* _SMB2PROTO_H */ diff --git a/fs/smb/client/trace.h b/fs/smb/client/trace.h index 7e215e13fe..8a2cf2887e 100644 --- a/fs/smb/client/trace.h +++ b/fs/smb/client/trace.h @@ -544,13 +544,12 @@ DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(query_info_compound_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(posix_query_info_compound_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(hardlink_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(rename_enter); -DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(rmdir_enter); +DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(unlink_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(set_eof_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(set_info_compound_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(set_reparse_compound_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(get_reparse_compound_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(query_wsl_ea_compound_enter); -DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(delete_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(mkdir_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(tdis_enter); DEFINE_SMB3_INF_COMPOUND_ENTER_EVENT(mknod_enter); @@ -585,13 +584,12 @@ DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(query_info_compound_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(posix_query_info_compound_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(hardlink_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(rename_done); -DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(rmdir_done); +DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(unlink_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(set_eof_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(set_info_compound_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(set_reparse_compound_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(get_reparse_compound_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(query_wsl_ea_compound_done); -DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(delete_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(mkdir_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(tdis_done); DEFINE_SMB3_INF_COMPOUND_DONE_EVENT(mknod_done); @@ -631,14 +629,13 @@ DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(query_info_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(posix_query_info_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(hardlink_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(rename_err); -DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(rmdir_err); +DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(unlink_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(set_eof_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(set_info_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(set_reparse_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(get_reparse_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(query_wsl_ea_compound_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(mkdir_err); -DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(delete_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(tdis_err); DEFINE_SMB3_INF_COMPOUND_ERR_EVENT(mknod_err); diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c index 4c44ce1c8a..bff3dc226f 100644 --- a/fs/xfs/libxfs/xfs_attr_remote.c +++ b/fs/xfs/libxfs/xfs_attr_remote.c @@ -435,6 +435,13 @@ xfs_attr_rmtval_get( 0, &bp, &xfs_attr3_rmt_buf_ops); if (xfs_metadata_is_sick(error)) xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK); + /* + * ENODATA from disk implies a disk medium failure; + * ENODATA for xattrs means attribute not found, so + * disambiguate that here. 
+ */ + if (error == -ENODATA) + error = -EIO; if (error) return error; diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index 17d9e6154f..723a0643b8 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2833,6 +2833,12 @@ xfs_da_read_buf( &bp, ops); if (xfs_metadata_is_sick(error)) xfs_dirattr_mark_sick(dp, whichfork); + /* + * ENODATA from disk implies a disk medium failure; ENODATA for + * xattrs means attribute not found, so disambiguate that here. + */ + if (error == -ENODATA && whichfork == XFS_ATTR_FORK) + error = -EIO; if (error) goto out_free; diff --git a/include/linux/cpu.h b/include/linux/cpu.h index e73eb3a55a..19adf72edc 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -80,6 +80,7 @@ extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, extern ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf); extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata, diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h index 169b4ae30f..9aed39abc9 100644 --- a/include/linux/nfs_page.h +++ b/include/linux/nfs_page.h @@ -160,6 +160,7 @@ extern void nfs_join_page_group(struct nfs_page *head, extern int nfs_page_group_lock(struct nfs_page *); extern void nfs_page_group_unlock(struct nfs_page *); extern bool nfs_page_group_sync_on_bit(struct nfs_page *, unsigned int); +extern bool nfs_page_group_sync_on_bit_locked(struct nfs_page *, unsigned int); extern int nfs_page_set_headlock(struct nfs_page *req); extern void nfs_page_clear_headlock(struct nfs_page *req); extern bool nfs_async_iocounter_wait(struct rpc_task *, struct nfs_lock_context *); diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h index d9c23377dc..b6b726946b 100644 --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -6,9 +6,10 @@ * Linux wait-bit related types and methods: */ #include +#include struct wait_bit_key { - void *flags; + RH_KABI_REPLACE(void *flags, unsigned long *flags) int bit_nr; unsigned long timeout; }; @@ -23,14 +24,14 @@ struct wait_bit_queue_entry { typedef int wait_bit_action_f(struct wait_bit_key *key, int mode); -void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit); +void __wake_up_bit(struct wait_queue_head *wq_head, unsigned long *word, int bit); int __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode); int __wait_on_bit_lock(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode); -void wake_up_bit(void *word, int bit); -int out_of_line_wait_on_bit(void *word, int, wait_bit_action_f *action, unsigned int mode); -int out_of_line_wait_on_bit_timeout(void *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long timeout); -int out_of_line_wait_on_bit_lock(void *word, int, wait_bit_action_f *action, unsigned int mode); -struct wait_queue_head *bit_waitqueue(void *word, int bit); +void wake_up_bit(unsigned long *word, int bit); +int out_of_line_wait_on_bit(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode); +int out_of_line_wait_on_bit_timeout(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long 
timeout); +int out_of_line_wait_on_bit_lock(unsigned long *word, int, wait_bit_action_f *action, unsigned int mode); +struct wait_queue_head *bit_waitqueue(unsigned long *word, int bit); extern void __init wait_bit_init(void); int wake_bit_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key); @@ -52,19 +53,21 @@ extern int bit_wait_timeout(struct wait_bit_key *key, int mode); /** * wait_on_bit - wait for a bit to be cleared - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * @word: the address containing the bit being waited on + * @bit: the bit at that address being waited on * @mode: the task state to sleep in * - * There is a standard hashed waitqueue table for generic use. This - * is the part of the hashtable's accessor API that waits on a bit. - * For instance, if one were to have waiters on a bitflag, one would - * call wait_on_bit() in threads waiting for the bit to clear. - * One uses wait_on_bit() where one is waiting for the bit to clear, - * but has no intention of setting it. - * Returned value will be zero if the bit was cleared, or non-zero - * if the process received a signal and the mode permitted wakeup - * on that signal. + * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP()) + * to be cleared. The clearing of the bit must be signalled with + * wake_up_bit(), often as clear_and_wake_up_bit(). + * + * The process will wait on a waitqueue selected by hash from a shared + * pool. It will only be woken on a wake_up for the target bit, even + * if other processes on the same queue are waiting for other bits. + * + * Returned value will be zero if the bit was cleared in which case the + * call has ACQUIRE semantics, or %-EINTR if the process received a + * signal and the mode permitted wake up on that signal. */ static inline int wait_on_bit(unsigned long *word, int bit, unsigned mode) @@ -79,17 +82,20 @@ wait_on_bit(unsigned long *word, int bit, unsigned mode) /** * wait_on_bit_io - wait for a bit to be cleared - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * @word: the address containing the bit being waited on + * @bit: the bit at that address being waited on * @mode: the task state to sleep in * - * Use the standard hashed waitqueue table to wait for a bit - * to be cleared. This is similar to wait_on_bit(), but calls - * io_schedule() instead of schedule() for the actual waiting. + * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP()) + * to be cleared. The clearing of the bit must be signalled with + * wake_up_bit(), often as clear_and_wake_up_bit(). * - * Returned value will be zero if the bit was cleared, or non-zero - * if the process received a signal and the mode permitted wakeup - * on that signal. + * This is similar to wait_on_bit(), but calls io_schedule() instead of + * schedule() for the actual waiting. + * + * Returned value will be zero if the bit was cleared in which case the + * call has ACQUIRE semantics, or %-EINTR if the process received a + * signal and the mode permitted wake up on that signal. 
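 *
 * A minimal pairing sketch (editor's illustration, not part of this
 * patch; the sbi->flags word and MYFS_WRITEBACK bit are hypothetical):
 *
 *	waiter, sleeping in io_schedule() until the bit clears:
 *		wait_on_bit_io(&sbi->flags, MYFS_WRITEBACK,
 *			       TASK_UNINTERRUPTIBLE);
 *
 *	clearer, releasing the bit and then waking:
 *		clear_and_wake_up_bit(MYFS_WRITEBACK, &sbi->flags);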
*/ static inline int wait_on_bit_io(unsigned long *word, int bit, unsigned mode) @@ -103,19 +109,24 @@ wait_on_bit_io(unsigned long *word, int bit, unsigned mode) } /** - * wait_on_bit_timeout - wait for a bit to be cleared or a timeout elapses - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * wait_on_bit_timeout - wait for a bit to be cleared or a timeout to elapse + * @word: the address containing the bit being waited on + * @bit: the bit at that address being waited on * @mode: the task state to sleep in * @timeout: timeout, in jiffies * - * Use the standard hashed waitqueue table to wait for a bit - * to be cleared. This is similar to wait_on_bit(), except also takes a - * timeout parameter. + * Wait for the given bit in an unsigned long or bitmap (see + * DECLARE_BITMAP()) to be cleared, or for a timeout to expire. The + * clearing of the bit must be signalled with wake_up_bit(), often as + * clear_and_wake_up_bit(). * - * Returned value will be zero if the bit was cleared before the - * @timeout elapsed, or non-zero if the @timeout elapsed or process - * received a signal and the mode permitted wakeup on that signal. + * This is similar to wait_on_bit(), except it also takes a timeout + * parameter. + * + * Returned value will be zero if the bit was cleared in which case the + * call has ACQUIRE semantics, or %-EINTR if the process received a + * signal and the mode permitted wake up on that signal, or %-EAGAIN if the + * timeout elapsed. */ static inline int wait_on_bit_timeout(unsigned long *word, int bit, unsigned mode, @@ -131,19 +142,21 @@ wait_on_bit_timeout(unsigned long *word, int bit, unsigned mode, /** * wait_on_bit_action - wait for a bit to be cleared - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * @word: the address containing the bit waited on + * @bit: the bit at that address being waited on * @action: the function used to sleep, which may take special actions * @mode: the task state to sleep in * - * Use the standard hashed waitqueue table to wait for a bit - * to be cleared, and allow the waiting action to be specified. - * This is like wait_on_bit() but allows fine control of how the waiting - * is done. + * Wait for the given bit in an unsigned long or bitmap (see DECLARE_BITMAP()) + * to be cleared. The clearing of the bit must be signalled with + * wake_up_bit(), often as clear_and_wake_up_bit(). * - * Returned value will be zero if the bit was cleared, or non-zero - * if the process received a signal and the mode permitted wakeup - * on that signal. + * This is similar to wait_on_bit(), but calls @action() instead of + * schedule() for the actual waiting. + * + * Returned value will be zero if the bit was cleared in which case the + * call has ACQUIRE semantics, or the error code returned by @action if + * that call returned non-zero. 
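 *
 * A sketch of a custom @action (editor's illustration, not part of this
 * patch; mydrv_bit_wait() is hypothetical, modelled on the stock
 * bit_wait() helper):
 *
 *	static int mydrv_bit_wait(struct wait_bit_key *key, int mode)
 *	{
 *		schedule();
 *		if (signal_pending_state(mode, current))
 *			return -EINTR;
 *		return 0;
 *	}
 *
 *	err = wait_on_bit_action(&priv->flags, MYDRV_BUSY,
 *				 mydrv_bit_wait, TASK_INTERRUPTIBLE);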
*/ static inline int wait_on_bit_action(unsigned long *word, int bit, wait_bit_action_f *action, @@ -156,23 +169,22 @@ wait_on_bit_action(unsigned long *word, int bit, wait_bit_action_f *action, } /** - * wait_on_bit_lock - wait for a bit to be cleared, when wanting to set it - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * wait_on_bit_lock - wait for a bit to be cleared, then set it + * @word: the address containing the bit being waited on + * @bit: the bit of the word being waited on and set * @mode: the task state to sleep in * - * There is a standard hashed waitqueue table for generic use. This - * is the part of the hashtable's accessor API that waits on a bit - * when one intends to set it, for instance, trying to lock bitflags. - * For instance, if one were to have waiters trying to set bitflag - * and waiting for it to clear before setting it, one would call - * wait_on_bit() in threads waiting to be able to set the bit. - * One uses wait_on_bit_lock() where one is waiting for the bit to - * clear with the intention of setting it, and when done, clearing it. + * Wait for the given bit in an unsigned long or bitmap (see + * DECLARE_BITMAP()) to be cleared. The clearing of the bit must be + * signalled with wake_up_bit(), often as clear_and_wake_up_bit(). As + * soon as it is clear, atomically set it and return. * - * Returns zero if the bit was (eventually) found to be clear and was - * set. Returns non-zero if a signal was delivered to the process and - * the @mode allows that signal to wake the process. + * This is similar to wait_on_bit(), but sets the bit before returning. + * + * Returned value will be zero if the bit was successfully set in which + * case the call has the same memory sequencing semantics as + * test_and_clear_bit(), or %-EINTR if the process received a signal and + * the mode permitted wake up on that signal. */ static inline int wait_on_bit_lock(unsigned long *word, int bit, unsigned mode) @@ -184,15 +196,18 @@ wait_on_bit_lock(unsigned long *word, int bit, unsigned mode) } /** - * wait_on_bit_lock_io - wait for a bit to be cleared, when wanting to set it - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * wait_on_bit_lock_io - wait for a bit to be cleared, then set it + * @word: the address containing the bit being waited on + * @bit: the bit of the word being waited on and set * @mode: the task state to sleep in * - * Use the standard hashed waitqueue table to wait for a bit - * to be cleared and then to atomically set it. This is similar - * to wait_on_bit(), but calls io_schedule() instead of schedule() - * for the actual waiting. + * Wait for the given bit in an unsigned long or bitmap (see + * DECLARE_BITMAP()) to be cleared. The clearing of the bit must be + * signalled with wake_up_bit(), often as clear_and_wake_up_bit(). As + * soon as it is clear, atomically set it and return. + * + * This is similar to wait_on_bit_lock(), but calls io_schedule() instead + * of schedule(). * * Returns zero if the bit was (eventually) found to be clear and was * set. 
Returns non-zero if a signal was delivered to the process and @@ -208,21 +223,19 @@ wait_on_bit_lock_io(unsigned long *word, int bit, unsigned mode) } /** - * wait_on_bit_lock_action - wait for a bit to be cleared, when wanting to set it - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * wait_on_bit_lock_action - wait for a bit to be cleared, then set it + * @word: the address containing the bit being waited on + * @bit: the bit of the word being waited on and set * @action: the function used to sleep, which may take special actions * @mode: the task state to sleep in * - * Use the standard hashed waitqueue table to wait for a bit - * to be cleared and then to set it, and allow the waiting action - * to be specified. - * This is like wait_on_bit() but allows fine control of how the waiting - * is done. + * This is similar to wait_on_bit_lock(), but calls @action() instead of + * schedule() for the actual waiting. * - * Returns zero if the bit was (eventually) found to be clear and was - * set. Returns non-zero if a signal was delivered to the process and - * the @mode allows that signal to wake the process. + * Returned value will be zero if the bit was successfully set in which + * case the call has the same memory sequencing semantics as + * test_and_clear_bit(), or the error code returned by @action if that + * call returned non-zero. */ static inline int wait_on_bit_lock_action(unsigned long *word, int bit, wait_bit_action_f *action, @@ -269,6 +282,22 @@ __out: __ret; \ ___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \ schedule()) +/** + * wait_var_event - wait for a variable to be updated and notified + * @var: the address of variable being waited on + * @condition: the condition to wait for + * + * Wait for a @condition to be true, only re-checking when a wake up is + * received for the given @var (an arbitrary kernel address which need + * not be directly related to the given condition, but usually is). + * + * The process will wait on a waitqueue selected by hash from a shared + * pool. It will only be woken on a wake_up for the given address. + * + * The condition should normally use smp_load_acquire() or a similarly + * ordered access to ensure that any changes to memory made before the + * condition became true will be visible after the wait completes. + */ #define wait_var_event(var, condition) \ do { \ might_sleep(); \ @@ -281,6 +310,24 @@ do { \ ___wait_var_event(var, condition, TASK_KILLABLE, 0, 0, \ schedule()) +/** + * wait_var_event_killable - wait for a variable to be updated and notified + * @var: the address of variable being waited on + * @condition: the condition to wait for + * + * Wait for a @condition to be true or a fatal signal to be received, + * only re-checking the condition when a wake up is received for the given + * @var (an arbitrary kernel address which need not be directly related + * to the given condition, but usually is). + * + * This is similar to wait_var_event() but returns a value which is + * 0 if the condition became true, or %-ERESTARTSYS if a fatal signal + * was received. + * + * The condition should normally use smp_load_acquire() or a similarly + * ordered access to ensure that any changes to memory made before the + * condition became true will be visible after the wait completes. 
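 *
 * A typical wait/wake pairing (editor's sketch, not part of this patch;
 * the priv->pending counter is hypothetical):
 *
 *	waiter:
 *		wait_var_event(&priv->pending,
 *			       atomic_read(&priv->pending) == 0);
 *
 *	waker, where atomic_dec_and_test() is fully ordered so no extra
 *	barrier is needed before the wake up:
 *		if (atomic_dec_and_test(&priv->pending))
 *			wake_up_var(&priv->pending);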
+ */ #define wait_var_event_killable(var, condition) \ ({ \ int __ret = 0; \ @@ -295,6 +342,26 @@ do { \ TASK_UNINTERRUPTIBLE, 0, timeout, \ __ret = schedule_timeout(__ret)) +/** + * wait_var_event_timeout - wait for a variable to be updated or a timeout to expire + * @var: the address of variable being waited on + * @condition: the condition to wait for + * @timeout: maximum time to wait in jiffies + * + * Wait for a @condition to be true or a timeout to expire, only + * re-checking the condition when a wake up is received for the given + * @var (an arbitrary kernel address which need not be directly related + * to the given condition, but usually is). + * + * This is similar to wait_var_event() but returns a value which is 0 if + * the timeout expired and the condition was still false, or the + * remaining time left in the timeout (but at least 1) if the condition + * was found to be true. + * + * The condition should normally use smp_load_acquire() or a similarly + * ordered access to ensure that any changes to memory made before the + * condition became true will be visible after the wait completes. + */ #define wait_var_event_timeout(var, condition, timeout) \ ({ \ long __ret = timeout; \ @@ -308,6 +375,23 @@ do { \ ___wait_var_event(var, condition, TASK_INTERRUPTIBLE, 0, 0, \ schedule()) +/** + * wait_var_event_interruptible - wait for a variable to be updated and notified + * @var: the address of variable being waited on + * @condition: the condition to wait for + * + * Wait for a @condition to be true or a signal to be received, only + * re-checking the condition when a wake up is received for the given + * @var (an arbitrary kernel address which need not be directly related + * to the given condition, but usually is). + * + * This is similar to wait_var_event() but returns a value which is 0 if + * the condition became true, or %-ERESTARTSYS if a signal was received. + * + * The condition should normally use smp_load_acquire() or a similarly + * ordered access to ensure that any changes to memory made before the + * condition became true will be visible after the wait completes. + */ #define wait_var_event_interruptible(var, condition) \ ({ \ int __ret = 0; \ @@ -318,15 +402,122 @@ do { \ }) /** - * clear_and_wake_up_bit - clear a bit and wake up anyone waiting on that bit + * wait_var_event_any_lock - wait for a variable to be updated under a lock + * @var: the address of the variable being waited on + * @condition: condition to wait for + * @lock: the object that is locked to protect updates to the variable + * @type: prefix on lock and unlock operations + * @state: waiting state, %TASK_UNINTERRUPTIBLE etc. * - * @bit: the bit of the word being waited on - * @word: the word being waited on, a kernel virtual address + * Wait for a condition which can only be reliably tested while holding + * a lock. The variables assessed in the condition will normally be updated + * under the same lock, and the wake up should be signalled with + * wake_up_var_locked() under the same lock. * - * You can use this helper if bitflags are manipulated atomically rather than - * non-atomically under a lock. + * This is similar to wait_var_event(), but assumes a lock is held + * while calling this function and while updating the variable. + * + * This must be called while the given lock is held and the lock will be + * dropped when schedule() is called to wait for a wake up, and will be + * reclaimed before testing the condition again.
The functions used to + * unlock and lock the object are constructed by appending _unlock and _lock + * to @type. + * + * Return %-ERESTARTSYS if a signal arrives which is allowed to interrupt + * the wait according to @state. */ -static inline void clear_and_wake_up_bit(int bit, void *word) +#define wait_var_event_any_lock(var, condition, lock, type, state) \ +({ \ + int __ret = 0; \ + if (!(condition)) \ + __ret = ___wait_var_event(var, condition, state, 0, 0, \ + type ## _unlock(lock); \ + schedule(); \ + type ## _lock(lock)); \ + __ret; \ +}) + +/** + * wait_var_event_spinlock - wait for a variable to be updated under a spinlock + * @var: the address of the variable being waited on + * @condition: condition to wait for + * @lock: the spinlock which protects updates to the variable + * + * Wait for a condition which can only be reliably tested while holding + * a spinlock. The variables assessed in the condition will normally be updated + * under the same spinlock, and the wake up should be signalled with + * wake_up_var_locked() under the same spinlock. + * + * This is similar to wait_var_event(), but assumes a spinlock is held + * while calling this function and while updating the variable. + * + * This must be called while the given lock is held and the lock will be + * dropped when schedule() is called to wait for a wake up, and will be + * reclaimed before testing the condition again. + */ +#define wait_var_event_spinlock(var, condition, lock) \ + wait_var_event_any_lock(var, condition, lock, spin, TASK_UNINTERRUPTIBLE) + +/** + * wait_var_event_mutex - wait for a variable to be updated under a mutex + * @var: the address of the variable being waited on + * @condition: condition to wait for + * @lock: the mutex which protects updates to the variable + * + * Wait for a condition which can only be reliably tested while holding + * a mutex. The variables assessed in the condition will normally be + * updated under the same mutex, and the wake up should be signalled + * with wake_up_var_locked() under the same mutex. + * + * This is similar to wait_var_event(), but assumes a mutex is held + * while calling this function and while updating the variable. + * + * This must be called while the given mutex is held and the mutex will be + * dropped when schedule() is called to wait for a wake up, and will be + * reclaimed before testing the condition again. + */ +#define wait_var_event_mutex(var, condition, lock) \ + wait_var_event_any_lock(var, condition, lock, mutex, TASK_UNINTERRUPTIBLE) + +/** + * wake_up_var_protected - wake up waiters for a variable asserting that it is safe + * @var: the address of the variable being waited on + * @cond: the condition which affirms this is safe + * + * When waking waiters which use wait_var_event_any_lock() the waker must be + * holding the relevant lock to avoid races. This version of wake_up_var() + * asserts that the relevant lock is held and so no barrier is needed. + * The @cond is only tested when CONFIG_LOCKDEP is enabled. + */ +#define wake_up_var_protected(var, cond) \ +do { \ + lockdep_assert(cond); \ + wake_up_var(var); \ +} while (0) + +/** + * wake_up_var_locked - wake up waiters for a variable while holding a spinlock or mutex + * @var: the address of the variable being waited on + * @lock: The spinlock or mutex that protects the variable + * + * Send a wake up for the given variable which should be waited for with + * wait_var_event_spinlock() or wait_var_event_mutex().
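 *
 * A pairing sketch (editor's illustration, not part of this patch; the
 * priv->count field and priv->lock spinlock are hypothetical):
 *
 *	waiter:
 *		spin_lock(&priv->lock);
 *		wait_var_event_spinlock(&priv->count, priv->count == 0,
 *					&priv->lock);
 *		spin_unlock(&priv->lock);
 *
 *	waker:
 *		spin_lock(&priv->lock);
 *		if (!--priv->count)
 *			wake_up_var_locked(&priv->count, &priv->lock);
 *		spin_unlock(&priv->lock);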
Unlike wake_up_var(), + * no extra barriers are needed as the locking provides sufficient sequencing. + */ +#define wake_up_var_locked(var, lock) \ + wake_up_var_protected(var, lockdep_is_held(lock)) + +/** + * clear_and_wake_up_bit - clear a bit and wake up anyone waiting on that bit + * @bit: the bit of the word being waited on + * @word: the address containing the bit being waited on + * + * The designated bit is cleared and any tasks waiting in wait_on_bit() + * or similar will be woken. This call has RELEASE semantics so that + * any changes to memory made before this call are guaranteed to be visible + * after the corresponding wait_on_bit() completes. + */ +static inline void clear_and_wake_up_bit(int bit, unsigned long *word) { clear_bit_unlock(bit, word); /* See wake_up_bit() for which memory barrier you need to use. */ @@ -334,4 +525,64 @@ static inline void clear_and_wake_up_bit(int bit, void *word) wake_up_bit(word, bit); } +/** + * test_and_clear_wake_up_bit - clear a bit if it was set: wake up anyone waiting on that bit + * @bit: the bit of the word being waited on + * @word: the address of memory containing that bit + * + * If the bit is set and can be atomically cleared, any tasks waiting in + * wait_on_bit() or similar will be woken. This call has the same + * complete ordering semantics as test_and_clear_bit(). Any changes to + * memory made before this call are guaranteed to be visible after the + * corresponding wait_on_bit() completes. + * + * Returns %true if the bit was successfully cleared and the wake up was sent. + */ +static inline bool test_and_clear_wake_up_bit(int bit, unsigned long *word) +{ + if (!test_and_clear_bit(bit, word)) + return false; + /* no extra barrier required */ + wake_up_bit(word, bit); + return true; +} + +/** + * atomic_dec_and_wake_up - decrement an atomic_t and if zero, wake up waiters + * @var: the variable to decrement and test + * + * Decrements the atomic variable and, if it reaches zero, sends a wake_up to any + * processes waiting on the variable. + * + * This function has the same complete ordering semantics as atomic_dec_and_test(). + * + * Returns %true if the variable reaches zero and the wake up was sent. + */ +static inline bool atomic_dec_and_wake_up(atomic_t *var) +{ + if (!atomic_dec_and_test(var)) + return false; + /* No extra barrier required */ + wake_up_var(var); + return true; +} + +/** + * store_release_wake_up - update a variable and send a wake_up + * @var: the address of the variable to be updated and woken + * @val: the value to store in the variable. + * + * Store the given value in the variable and send a wake up to any tasks + * waiting on the variable. All necessary barriers are included to ensure + * the task calling wait_var_event() sees the new value and all values + * written to memory before this call.
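 *
 * For example (editor's sketch, not part of this patch; the req->status
 * field and REQ_DONE value are hypothetical):
 *
 *	writer:
 *		store_release_wake_up(&req->status, REQ_DONE);
 *
 *	waiter, using an acquire load as the wait_var_event() documentation
 *	recommends:
 *		wait_var_event(&req->status,
 *			       smp_load_acquire(&req->status) == REQ_DONE);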
+ */ +#define store_release_wake_up(var, val) \ +do { \ + smp_store_release(var, val); \ + smp_mb(); \ + wake_up_var(var); \ +} while (0) + #endif /* _LINUX_WAIT_BIT_H */ diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index a0de30cb6d..17e20d521c 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -1199,6 +1199,27 @@ static inline struct hci_conn *hci_conn_hash_lookup_ba(struct hci_dev *hdev, return NULL; } +static inline struct hci_conn *hci_conn_hash_lookup_role(struct hci_dev *hdev, + __u8 type, __u8 role, + bdaddr_t *ba) +{ + struct hci_conn_hash *h = &hdev->conn_hash; + struct hci_conn *c; + + rcu_read_lock(); + + list_for_each_entry_rcu(c, &h->list, list) { + if (c->type == type && c->role == role && !bacmp(&c->dst, ba)) { + rcu_read_unlock(); + return c; + } + } + + rcu_read_unlock(); + + return NULL; +} + static inline struct hci_conn *hci_conn_hash_lookup_le(struct hci_dev *hdev, bdaddr_t *ba, __u8 ba_type) diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h index 3ce56a8164..79516db61b 100644 --- a/include/net/mana/gdma.h +++ b/include/net/mana/gdma.h @@ -10,6 +10,7 @@ #include "shm_channel.h" #define GDMA_STATUS_MORE_ENTRIES 0x00000105 +#define GDMA_STATUS_CMD_UNSUPPORTED 0xffffffff /* Structures labeled with "HW DATA" are exchanged with the hardware. All of * them are naturally aligned and hence don't need __packed. @@ -58,9 +59,10 @@ enum gdma_eqe_type { GDMA_EQE_HWC_INIT_EQ_ID_DB = 129, GDMA_EQE_HWC_INIT_DATA = 130, GDMA_EQE_HWC_INIT_DONE = 131, - GDMA_EQE_HWC_SOC_RECONFIG = 132, + GDMA_EQE_HWC_FPGA_RECONFIG = 132, GDMA_EQE_HWC_SOC_RECONFIG_DATA = 133, GDMA_EQE_HWC_SOC_SERVICE = 134, + GDMA_EQE_HWC_RESET_REQUEST = 135, GDMA_EQE_RNIC_QP_FATAL = 176, }; @@ -403,6 +405,8 @@ struct gdma_context { u32 test_event_eq_id; bool is_pf; + bool in_service; + phys_addr_t bar0_pa; void __iomem *bar0_va; void __iomem *shm_base; @@ -578,12 +582,20 @@ enum { /* Driver can handle holes (zeros) in the device list */ #define GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP BIT(11) +/* Driver can self reset on EQE notification */ +#define GDMA_DRV_CAP_FLAG_1_SELF_RESET_ON_EQE BIT(14) + +/* Driver can self reset on FPGA Reconfig EQE notification */ +#define GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE BIT(17) + #define GDMA_DRV_CAP_FLAGS1 \ (GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT | \ GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX | \ GDMA_DRV_CAP_FLAG_1_HWC_TIMEOUT_RECONFIG | \ GDMA_DRV_CAP_FLAG_1_VARIABLE_INDIRECTION_TABLE_SUPPORT | \ - GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP) + GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP | \ + GDMA_DRV_CAP_FLAG_1_SELF_RESET_ON_EQE | \ + GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE) #define GDMA_DRV_CAP_FLAGS2 0 @@ -910,4 +922,9 @@ void mana_unregister_debugfs(void); int mana_rdma_service_event(struct gdma_context *gc, enum gdma_service_type event); +int mana_gd_suspend(struct pci_dev *pdev, pm_message_t state); +int mana_gd_resume(struct pci_dev *pdev); + +bool mana_need_log(struct gdma_context *gc, int err); + #endif /* _GDMA_H */ diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 123e80b332..3ce29a6c1a 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -402,6 +402,65 @@ struct mana_ethtool_stats { u64 rx_cqe_unknown_type; }; +struct mana_ethtool_phy_stats { + /* Drop Counters */ + u64 rx_pkt_drop_phy; + u64 tx_pkt_drop_phy; + + /* Per TC traffic Counters */ + u64 rx_pkt_tc0_phy; + u64 tx_pkt_tc0_phy; + u64 rx_pkt_tc1_phy; + u64 tx_pkt_tc1_phy; + u64 rx_pkt_tc2_phy; + u64 
tx_pkt_tc2_phy; + u64 rx_pkt_tc3_phy; + u64 tx_pkt_tc3_phy; + u64 rx_pkt_tc4_phy; + u64 tx_pkt_tc4_phy; + u64 rx_pkt_tc5_phy; + u64 tx_pkt_tc5_phy; + u64 rx_pkt_tc6_phy; + u64 tx_pkt_tc6_phy; + u64 rx_pkt_tc7_phy; + u64 tx_pkt_tc7_phy; + + u64 rx_byte_tc0_phy; + u64 tx_byte_tc0_phy; + u64 rx_byte_tc1_phy; + u64 tx_byte_tc1_phy; + u64 rx_byte_tc2_phy; + u64 tx_byte_tc2_phy; + u64 rx_byte_tc3_phy; + u64 tx_byte_tc3_phy; + u64 rx_byte_tc4_phy; + u64 tx_byte_tc4_phy; + u64 rx_byte_tc5_phy; + u64 tx_byte_tc5_phy; + u64 rx_byte_tc6_phy; + u64 tx_byte_tc6_phy; + u64 rx_byte_tc7_phy; + u64 tx_byte_tc7_phy; + + /* Per TC pause Counters */ + u64 rx_pause_tc0_phy; + u64 tx_pause_tc0_phy; + u64 rx_pause_tc1_phy; + u64 tx_pause_tc1_phy; + u64 rx_pause_tc2_phy; + u64 tx_pause_tc2_phy; + u64 rx_pause_tc3_phy; + u64 tx_pause_tc3_phy; + u64 rx_pause_tc4_phy; + u64 tx_pause_tc4_phy; + u64 rx_pause_tc5_phy; + u64 tx_pause_tc5_phy; + u64 rx_pause_tc6_phy; + u64 tx_pause_tc6_phy; + u64 rx_pause_tc7_phy; + u64 tx_pause_tc7_phy; +}; + struct mana_context { struct gdma_dev *gdma_dev; @@ -472,6 +531,8 @@ struct mana_port_context { struct mana_ethtool_stats eth_stats; + struct mana_ethtool_phy_stats phy_stats; + /* Debugfs */ struct dentry *mana_port_debugfs; }; @@ -499,6 +560,7 @@ struct bpf_prog *mana_xdp_get(struct mana_port_context *apc); void mana_chn_setxdp(struct mana_port_context *apc, struct bpf_prog *prog); int mana_bpf(struct net_device *ndev, struct netdev_bpf *bpf); void mana_query_gf_stats(struct mana_port_context *apc); +void mana_query_phy_stats(struct mana_port_context *apc); int mana_pre_alloc_rxbufs(struct mana_port_context *apc, int mtu, int num_queues); void mana_pre_dealloc_rxbufs(struct mana_port_context *apc); @@ -525,6 +587,7 @@ enum mana_command_code { MANA_FENCE_RQ = 0x20006, MANA_CONFIG_VPORT_RX = 0x20007, MANA_QUERY_VPORT_CONFIG = 0x20008, + MANA_QUERY_PHY_STAT = 0x2000c, /* Privileged commands for the PF mode */ MANA_REGISTER_FILTER = 0x28000, @@ -687,6 +750,74 @@ struct mana_query_gf_stat_resp { u64 tx_err_gdma; }; /* HW DATA */ +/* Query phy stats */ +struct mana_query_phy_stat_req { + struct gdma_req_hdr hdr; + u64 req_stats; +}; /* HW DATA */ + +struct mana_query_phy_stat_resp { + struct gdma_resp_hdr hdr; + u64 reported_stats; + + /* Aggregate Drop Counters */ + u64 rx_pkt_drop_phy; + u64 tx_pkt_drop_phy; + + /* Per TC(Traffic class) traffic Counters */ + u64 rx_pkt_tc0_phy; + u64 tx_pkt_tc0_phy; + u64 rx_pkt_tc1_phy; + u64 tx_pkt_tc1_phy; + u64 rx_pkt_tc2_phy; + u64 tx_pkt_tc2_phy; + u64 rx_pkt_tc3_phy; + u64 tx_pkt_tc3_phy; + u64 rx_pkt_tc4_phy; + u64 tx_pkt_tc4_phy; + u64 rx_pkt_tc5_phy; + u64 tx_pkt_tc5_phy; + u64 rx_pkt_tc6_phy; + u64 tx_pkt_tc6_phy; + u64 rx_pkt_tc7_phy; + u64 tx_pkt_tc7_phy; + + u64 rx_byte_tc0_phy; + u64 tx_byte_tc0_phy; + u64 rx_byte_tc1_phy; + u64 tx_byte_tc1_phy; + u64 rx_byte_tc2_phy; + u64 tx_byte_tc2_phy; + u64 rx_byte_tc3_phy; + u64 tx_byte_tc3_phy; + u64 rx_byte_tc4_phy; + u64 tx_byte_tc4_phy; + u64 rx_byte_tc5_phy; + u64 tx_byte_tc5_phy; + u64 rx_byte_tc6_phy; + u64 tx_byte_tc6_phy; + u64 rx_byte_tc7_phy; + u64 tx_byte_tc7_phy; + + /* Per TC(Traffic Class) pause Counters */ + u64 rx_pause_tc0_phy; + u64 tx_pause_tc0_phy; + u64 rx_pause_tc1_phy; + u64 tx_pause_tc1_phy; + u64 rx_pause_tc2_phy; + u64 tx_pause_tc2_phy; + u64 rx_pause_tc3_phy; + u64 tx_pause_tc3_phy; + u64 rx_pause_tc4_phy; + u64 tx_pause_tc4_phy; + u64 rx_pause_tc5_phy; + u64 tx_pause_tc5_phy; + u64 rx_pause_tc6_phy; + u64 tx_pause_tc6_phy; + u64 rx_pause_tc7_phy; + u64 
tx_pause_tc7_phy; +}; /* HW DATA */ + /* Configure vPort Rx Steering */ struct mana_cfg_rx_steer_req_v2 { struct gdma_req_hdr hdr; diff --git a/io_uring/waitid.c b/io_uring/waitid.c index 6362ec20ab..18217d5622 100644 --- a/io_uring/waitid.c +++ b/io_uring/waitid.c @@ -272,13 +272,14 @@ static int io_waitid_wait(struct wait_queue_entry *wait, unsigned mode, if (!pid_child_should_wake(wo, p)) return 0; + list_del_init(&wait->entry); + /* cancel is in progress */ if (atomic_fetch_inc(&iw->refs) & IO_WAITID_REF_MASK) return 1; req->io_task_work.func = io_waitid_cb; io_req_task_work_add(req); - list_del_init(&wait->entry); return 1; } diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c index c6aab3db70..b410b61cec 100644 --- a/kernel/sched/wait_bit.c +++ b/kernel/sched/wait_bit.c @@ -9,7 +9,7 @@ static wait_queue_head_t bit_wait_table[WAIT_TABLE_SIZE] __cacheline_aligned; -wait_queue_head_t *bit_waitqueue(void *word, int bit) +wait_queue_head_t *bit_waitqueue(unsigned long *word, int bit) { const int shift = BITS_PER_LONG == 32 ? 5 : 6; unsigned long val = (unsigned long)word << shift | bit; @@ -55,7 +55,7 @@ __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_ } EXPORT_SYMBOL(__wait_on_bit); -int __sched out_of_line_wait_on_bit(void *word, int bit, +int __sched out_of_line_wait_on_bit(unsigned long *word, int bit, wait_bit_action_f *action, unsigned mode) { struct wait_queue_head *wq_head = bit_waitqueue(word, bit); @@ -66,7 +66,7 @@ int __sched out_of_line_wait_on_bit(void *word, int bit, EXPORT_SYMBOL(out_of_line_wait_on_bit); int __sched out_of_line_wait_on_bit_timeout( - void *word, int bit, wait_bit_action_f *action, + unsigned long *word, int bit, wait_bit_action_f *action, unsigned mode, unsigned long timeout) { struct wait_queue_head *wq_head = bit_waitqueue(word, bit); @@ -108,7 +108,7 @@ __wait_on_bit_lock(struct wait_queue_head *wq_head, struct wait_bit_queue_entry } EXPORT_SYMBOL(__wait_on_bit_lock); -int __sched out_of_line_wait_on_bit_lock(void *word, int bit, +int __sched out_of_line_wait_on_bit_lock(unsigned long *word, int bit, wait_bit_action_f *action, unsigned mode) { struct wait_queue_head *wq_head = bit_waitqueue(word, bit); @@ -118,7 +118,7 @@ int __sched out_of_line_wait_on_bit_lock(void *word, int bit, } EXPORT_SYMBOL(out_of_line_wait_on_bit_lock); -void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit) +void __wake_up_bit(struct wait_queue_head *wq_head, unsigned long *word, int bit) { struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit); @@ -128,23 +128,31 @@ void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit) EXPORT_SYMBOL(__wake_up_bit); /** - * wake_up_bit - wake up a waiter on a bit - * @word: the word being waited on, a kernel virtual address - * @bit: the bit of the word being waited on + * wake_up_bit - wake up waiters on a bit + * @word: the address containing the bit being waited on + * @bit: the bit at that address being waited on * - * There is a standard hashed waitqueue table for generic use. This - * is the part of the hash-table's accessor API that wakes up waiters - * on a bit. For instance, if one were to have waiters on a bitflag, - * one would call wake_up_bit() after clearing the bit. + * Wake up any process waiting in wait_on_bit() or similar for the + * given bit to be cleared. * - * In order for this to function properly, as it uses waitqueue_active() - * internally, some kind of memory barrier must be done prior to calling - * this. 
Typically, this will be smp_mb__after_atomic(), but in some - cases where bitflags are manipulated non-atomically under a lock, one - may need to use a less regular barrier, such fs/inode.c's smp_mb(), - because spin_unlock() does not guarantee a memory barrier. + * The wake-up is sent to tasks in a waitqueue selected by hash from a + * shared pool. Only those tasks on that queue which have requested + * wake_up on this specific address and bit will be woken, and only if the + * bit is clear. + * + * In order for this to function properly there must be a full memory + * barrier after the bit is cleared and before this function is called. + * If the bit was cleared atomically, such as by clear_bit(), then + * smp_mb__after_atomic() can be used; otherwise smp_mb() is needed. + * If the bit was cleared with a fully-ordered operation, no further + * barrier is required. + * + * Normally the bit should be cleared by an operation with RELEASE + * semantics so that any changes to memory made before the bit is + * cleared are guaranteed to be visible after the matching wait_on_bit() + * completes. */ -void wake_up_bit(void *word, int bit) +void wake_up_bit(unsigned long *word, int bit) { __wake_up_bit(bit_waitqueue(word, bit), word, bit); } @@ -188,6 +196,36 @@ void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int } EXPORT_SYMBOL(init_wait_var_entry); +/** + * wake_up_var - wake up waiters on a variable (kernel address) + * @var: the address of the variable being waited on + * + * Wake up any process waiting in wait_var_event() or similar for the + * given variable to change. wait_var_event() can be waiting for an + * arbitrary condition to be true and associates that condition with an + * address. Calling wake_up_var() suggests that the condition has been + * made true, but does not strictly require the condition to use the + * address given. + * + * The wake-up is sent to tasks in a waitqueue selected by hash from a + * shared pool. Only those tasks on that queue which have requested + * wake_up on this specific address will be woken. + * + * In order for this to function properly there must be a full memory + * barrier after the variable is updated (or more accurately, after the + * condition waited on has been made to be true) and before this function + * is called. If the variable was updated atomically, such as by + * atomic_dec(), then smp_mb__after_atomic() can be used. If the + * variable was updated by a fully ordered operation such as + * atomic_dec_and_test() then no extra barrier is required. Otherwise + * smp_mb() is needed. + * + * Normally the variable should be updated (the condition should be made + * to be true) by an operation with RELEASE semantics such as + * smp_store_release() so that any changes to memory made before the + * variable was updated are guaranteed to be visible after the matching + * wait_var_event() completes.
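 *
 * A sketch of the fully explicit sequence (editor's illustration, not
 * part of this patch; priv->ready is hypothetical), which is exactly
 * what the store_release_wake_up() helper added to wait_bit.h by this
 * patch packages up:
 *
 *	smp_store_release(&priv->ready, true);
 *	smp_mb();
 *	wake_up_var(&priv->ready);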
+ */ void wake_up_var(void *var) { __wake_up_bit(__var_waitqueue(var), var, -1); diff --git a/mm/slub.c b/mm/slub.c index 6750fcb596..3465097596 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -897,19 +897,19 @@ static struct track *get_track(struct kmem_cache *s, void *object, } #ifdef CONFIG_STACKDEPOT -static noinline depot_stack_handle_t set_track_prepare(void) +static noinline depot_stack_handle_t set_track_prepare(gfp_t gfp_flags) { depot_stack_handle_t handle; unsigned long entries[TRACK_ADDRS_COUNT]; unsigned int nr_entries; nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3); - handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT); + handle = stack_depot_save(entries, nr_entries, gfp_flags); return handle; } #else -static inline depot_stack_handle_t set_track_prepare(void) +static inline depot_stack_handle_t set_track_prepare(gfp_t gfp_flags) { return 0; } @@ -931,9 +931,9 @@ static void set_track_update(struct kmem_cache *s, void *object, } static __always_inline void set_track(struct kmem_cache *s, void *object, - enum track_item alloc, unsigned long addr) + enum track_item alloc, unsigned long addr, gfp_t gfp_flags) { - depot_stack_handle_t handle = set_track_prepare(); + depot_stack_handle_t handle = set_track_prepare(gfp_flags); set_track_update(s, object, alloc, addr, handle); } @@ -1826,9 +1826,9 @@ static inline bool free_debug_processing(struct kmem_cache *s, static inline void slab_pad_check(struct kmem_cache *s, struct slab *slab) {} static inline int check_object(struct kmem_cache *s, struct slab *slab, void *object, u8 val) { return 1; } -static inline depot_stack_handle_t set_track_prepare(void) { return 0; } +static inline depot_stack_handle_t set_track_prepare(gfp_t gfp_flags) { return 0; } static inline void set_track(struct kmem_cache *s, void *object, - enum track_item alloc, unsigned long addr) {} + enum track_item alloc, unsigned long addr, gfp_t gfp_flags) {} static inline void add_full(struct kmem_cache *s, struct kmem_cache_node *n, struct slab *slab) {} static inline void remove_full(struct kmem_cache *s, struct kmem_cache_node *n, @@ -3514,8 +3514,26 @@ new_objects: pc.slab = &slab; pc.orig_size = orig_size; freelist = get_partial(s, node, &pc); - if (freelist) - goto check_new_slab; + if (freelist) { + if (kmem_cache_debug(s)) { + /* + * For debug caches here we had to go through + * alloc_single_from_partial() so just store the + * tracking info and return the object. + * + * Due to disabled preemption we need to disallow + * blocking. The flags are further adjusted by + * gfp_nested_mask() in stack_depot itself. 
+ */ + if (s->flags & SLAB_STORE_USER) + set_track(s, freelist, TRACK_ALLOC, addr, + gfpflags & ~(__GFP_DIRECT_RECLAIM)); + + return freelist; + } + + goto retry_load_slab; + } slub_put_cpu_ptr(s->cpu_slab); slab = new_slab(s, gfpflags, node); @@ -3535,7 +3553,8 @@ new_objects: goto new_objects; if (s->flags & SLAB_STORE_USER) - set_track(s, freelist, TRACK_ALLOC, addr); + set_track(s, freelist, TRACK_ALLOC, addr, + gfpflags & ~(__GFP_DIRECT_RECLAIM)); return freelist; } @@ -3551,20 +3570,6 @@ new_objects: inc_slabs_node(s, slab_nid(slab), slab->objects); -check_new_slab: - - if (kmem_cache_debug(s)) { - /* - * For debug caches here we had to go through - * alloc_single_from_partial() so just store the tracking info - * and return the object - */ - if (s->flags & SLAB_STORE_USER) - set_track(s, freelist, TRACK_ALLOC, addr); - - return freelist; - } - if (unlikely(!pfmemalloc_match(slab, gfpflags))) { /* * For !pfmemalloc_match() case we don't load freelist so that @@ -4027,8 +4032,12 @@ static noinline void free_to_partial_list( unsigned long flags; depot_stack_handle_t handle = 0; + /* + * We cannot use GFP_NOWAIT as there are callsites where waking up + * kswapd could deadlock + */ if (s->flags & SLAB_STORE_USER) - handle = set_track_prepare(); + handle = set_track_prepare(__GFP_NOWARN); spin_lock_irqsave(&n->list_lock, flags); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index bf4373f622..e781b27dea 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -3115,8 +3115,18 @@ static void hci_conn_complete_evt(struct hci_dev *hdev, void *data, hci_dev_lock(hdev); + /* Check for existing connection: + * + * 1. If it doesn't exist then it must be the receiver/slave role. + * 2. If it does exist confirm that it is connecting/BT_CONNECT in case + * of initiator/master role since there could be a collision where + * either side is attempting to connect or something like fuzz + * testing is trying to play tricks to destroy the hcon object before + * it even attempts to connect (e.g. hcon->state == BT_OPEN). + */ conn = hci_conn_hash_lookup_ba(hdev, ev->link_type, &ev->bdaddr); - if (!conn) { + if (!conn || + (conn->role == HCI_ROLE_MASTER && conn->state != BT_CONNECT)) { /* In case of error status and there is no connection pending * just unlock as there is nothing to cleanup. */ @@ -4422,6 +4432,8 @@ static void hci_num_comp_pkts_evt(struct hci_dev *hdev, void *data, bt_dev_dbg(hdev, "num %d", ev->num); + hci_dev_lock(hdev); + for (i = 0; i < ev->num; i++) { struct hci_comp_pkts_info *info = &ev->handles[i]; struct hci_conn *conn; @@ -4487,6 +4499,8 @@ static void hci_num_comp_pkts_evt(struct hci_dev *hdev, void *data, } queue_work(hdev->workqueue, &hdev->tx_work); + + hci_dev_unlock(hdev); } static void hci_mode_change_evt(struct hci_dev *hdev, void *data, @@ -5649,8 +5663,18 @@ static void le_conn_complete_evt(struct hci_dev *hdev, u8 status, */ hci_dev_clear_flag(hdev, HCI_LE_ADV); - conn = hci_conn_hash_lookup_ba(hdev, LE_LINK, bdaddr); - if (!conn) { + /* Check for existing connection: + * + * 1. If it doesn't exist then use the role to create a new object. + * 2. If it does exist confirm that it is connecting/BT_CONNECT in case + * of initiator/master role since there could be a collision where + * either side is attempting to connect or something like fuzz + * testing is trying to play tricks to destroy the hcon object before + * it even attempts to connect (e.g. hcon->state == BT_OPEN).
+ */ + conn = hci_conn_hash_lookup_role(hdev, LE_LINK, role, bdaddr); + if (!conn || + (conn->role == HCI_ROLE_MASTER && conn->state != BT_CONNECT)) { /* In case of error status and there is no connection pending * just unlock as there is nothing to cleanup. */ diff --git a/net/core/filter.c b/net/core/filter.c index 712c2e7943..0fe9ebb9c8 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2256,6 +2256,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev, if (IS_ERR(dst)) goto out_drop; + skb_dst_drop(skb); skb_dst_set(skb, dst); } else if (nh->nh_family != AF_INET6) { goto out_drop; @@ -2363,6 +2364,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev, goto out_drop; } + skb_dst_drop(skb); skb_dst_set(skb, &rt->dst); } diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index f65d2f7273..8392d304a7 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -204,6 +204,9 @@ static int iptunnel_pmtud_build_icmp(struct sk_buff *skb, int mtu) if (!pskb_may_pull(skb, ETH_HLEN + sizeof(struct iphdr))) return -EINVAL; + if (skb_is_gso(skb)) + skb_gso_reset(skb); + skb_copy_bits(skb, skb_mac_offset(skb), &eh, ETH_HLEN); pskb_pull(skb, ETH_HLEN); skb_reset_network_header(skb); @@ -298,6 +301,9 @@ static int iptunnel_pmtud_build_icmpv6(struct sk_buff *skb, int mtu) if (!pskb_may_pull(skb, ETH_HLEN + sizeof(struct ipv6hdr))) return -EINVAL; + if (skb_is_gso(skb)) + skb_gso_reset(skb); + skb_copy_bits(skb, skb_mac_offset(skb), &eh, ETH_HLEN); pskb_pull(skb, ETH_HLEN); skb_reset_network_header(skb); diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c index bbf5b84a70..6708ee3d43 100644 --- a/net/ipv6/seg6_hmac.c +++ b/net/ipv6/seg6_hmac.c @@ -35,6 +35,7 @@ #include #include +#include #include #include #include @@ -271,7 +272,7 @@ bool seg6_hmac_validate_skb(struct sk_buff *skb) if (seg6_hmac_compute(hinfo, srh, &ipv6_hdr(skb)->saddr, hmac_output)) return false; - if (memcmp(hmac_output, tlv->hmac, SEG6_HMAC_FIELD_LEN) != 0) + if (crypto_memneq(hmac_output, tlv->hmac, SEG6_HMAC_FIELD_LEN)) return false; return true; diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 4619340cd6..8276fb69d1 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -255,20 +255,47 @@ svc_tcp_sock_process_cmsg(struct socket *sock, struct msghdr *msg, } static int -svc_tcp_sock_recv_cmsg(struct svc_sock *svsk, struct msghdr *msg) +svc_tcp_sock_recv_cmsg(struct socket *sock, unsigned int *msg_flags) { union { struct cmsghdr cmsg; u8 buf[CMSG_SPACE(sizeof(u8))]; } u; - struct socket *sock = svsk->sk_sock; + u8 alert[2]; + struct kvec alert_kvec = { + .iov_base = alert, + .iov_len = sizeof(alert), + }; + struct msghdr msg = { + .msg_flags = *msg_flags, + .msg_control = &u, + .msg_controllen = sizeof(u), + }; int ret; - msg->msg_control = &u; - msg->msg_controllen = sizeof(u); + iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1, + alert_kvec.iov_len); + ret = sock_recvmsg(sock, &msg, MSG_DONTWAIT); + if (ret > 0 && + tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) { + iov_iter_revert(&msg.msg_iter, ret); + ret = svc_tcp_sock_process_cmsg(sock, &msg, &u.cmsg, -EAGAIN); + } + return ret; +} + +static int +svc_tcp_sock_recvmsg(struct svc_sock *svsk, struct msghdr *msg) +{ + int ret; + struct socket *sock = svsk->sk_sock; + ret = sock_recvmsg(sock, msg, MSG_DONTWAIT); - if (unlikely(msg->msg_controllen != sizeof(u))) - ret = svc_tcp_sock_process_cmsg(sock, msg, &u.cmsg, ret); + if 
(msg->msg_flags & MSG_CTRUNC) { + msg->msg_flags &= ~(MSG_CTRUNC | MSG_EOR); + if (ret == 0 || ret == -EIO) + ret = svc_tcp_sock_recv_cmsg(sock, &msg->msg_flags); + } return ret; } @@ -322,7 +349,7 @@ static ssize_t svc_tcp_read_msg(struct svc_rqst *rqstp, size_t buflen, iov_iter_advance(&msg.msg_iter, seek); buflen -= seek; } - len = svc_tcp_sock_recv_cmsg(svsk, &msg); + len = svc_tcp_sock_recvmsg(svsk, &msg); if (len > 0) svc_flush_bvec(bvec, len, seek); @@ -1018,7 +1045,7 @@ static ssize_t svc_tcp_read_marker(struct svc_sock *svsk, iov.iov_base = ((char *)&svsk->sk_marker) + svsk->sk_tcplen; iov.iov_len = want; iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, want); - len = svc_tcp_sock_recv_cmsg(svsk, &msg); + len = svc_tcp_sock_recvmsg(svsk, &msg); if (len < 0) return len; svsk->sk_tcplen += len; diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 59748783df..3345108ceb 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -359,7 +359,7 @@ xs_alloc_sparse_pages(struct xdr_buf *buf, size_t want, gfp_t gfp) static int xs_sock_process_cmsg(struct socket *sock, struct msghdr *msg, - struct cmsghdr *cmsg, int ret) + unsigned int *msg_flags, struct cmsghdr *cmsg, int ret) { u8 content_type = tls_get_record_type(sock->sk, cmsg); u8 level, description; @@ -372,7 +372,7 @@ xs_sock_process_cmsg(struct socket *sock, struct msghdr *msg, * record, even though there might be more frames * waiting to be decrypted. */ - msg->msg_flags &= ~MSG_EOR; + *msg_flags &= ~MSG_EOR; break; case TLS_RECORD_TYPE_ALERT: tls_alert_recv(sock->sk, msg, &level, &description); @@ -387,19 +387,33 @@ xs_sock_process_cmsg(struct socket *sock, struct msghdr *msg, } static int -xs_sock_recv_cmsg(struct socket *sock, struct msghdr *msg, int flags) +xs_sock_recv_cmsg(struct socket *sock, unsigned int *msg_flags, int flags) { union { struct cmsghdr cmsg; u8 buf[CMSG_SPACE(sizeof(u8))]; } u; + u8 alert[2]; + struct kvec alert_kvec = { + .iov_base = alert, + .iov_len = sizeof(alert), + }; + struct msghdr msg = { + .msg_flags = *msg_flags, + .msg_control = &u, + .msg_controllen = sizeof(u), + }; int ret; - msg->msg_control = &u; - msg->msg_controllen = sizeof(u); - ret = sock_recvmsg(sock, msg, flags); - if (msg->msg_controllen != sizeof(u)) - ret = xs_sock_process_cmsg(sock, msg, &u.cmsg, ret); + iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1, + alert_kvec.iov_len); + ret = sock_recvmsg(sock, &msg, flags); + if (ret > 0) { + if (tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) + iov_iter_revert(&msg.msg_iter, ret); + ret = xs_sock_process_cmsg(sock, &msg, msg_flags, &u.cmsg, + -EAGAIN); + } return ret; } @@ -409,7 +423,13 @@ xs_sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags, size_t seek) ssize_t ret; if (seek != 0) iov_iter_advance(&msg->msg_iter, seek); - ret = xs_sock_recv_cmsg(sock, msg, flags); + ret = sock_recvmsg(sock, msg, flags); + /* Handle TLS inband control message lazily */ + if (msg->msg_flags & MSG_CTRUNC) { + msg->msg_flags &= ~(MSG_CTRUNC | MSG_EOR); + if (ret == 0 || ret == -EIO) + ret = xs_sock_recv_cmsg(sock, &msg->msg_flags, flags); + } return ret > 0 ? 
ret + seek : ret; } @@ -435,7 +455,7 @@ xs_read_discard(struct socket *sock, struct msghdr *msg, int flags, size_t count) { iov_iter_discard(&msg->msg_iter, READ, count); - return xs_sock_recv_cmsg(sock, msg, flags); + return xs_sock_recvmsg(sock, msg, flags, 0); } #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 4f8612bd15..0196706684 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -582,8 +582,9 @@ static void virtio_transport_rx_work(struct work_struct *work) do { virtqueue_disable_cb(vq); for (;;) { + unsigned int len, payload_len; + struct virtio_vsock_hdr *hdr; struct sk_buff *skb; - unsigned int len; if (!virtio_transport_more_replies(vsock)) { /* Stop rx until the device processes already @@ -600,12 +601,19 @@ static void virtio_transport_rx_work(struct work_struct *work) vsock->rx_buf_nr--; /* Drop short/long packets */ - if (unlikely(len < sizeof(struct virtio_vsock_hdr) || + if (unlikely(len < sizeof(*hdr) || len > virtio_vsock_skb_len(skb))) { kfree_skb(skb); continue; } + hdr = virtio_vsock_hdr(skb); + payload_len = le32_to_cpu(hdr->len); + if (unlikely(payload_len > len - sizeof(*hdr))) { + kfree_skb(skb); + continue; + } + virtio_vsock_skb_rx_put(skb); virtio_transport_deliver_tap_pkt(skb); virtio_transport_recv_pkt(&virtio_transport, skb); diff --git a/net/wireless/sme.c b/net/wireless/sme.c index cf998500a9..05d0651298 100644 --- a/net/wireless/sme.c +++ b/net/wireless/sme.c @@ -901,13 +901,16 @@ void __cfg80211_connect_result(struct net_device *dev, if (!wdev->u.client.ssid_len) { rcu_read_lock(); for_each_valid_link(cr, link) { + u32 ssid_len; + ssid = ieee80211_bss_get_elem(cr->links[link].bss, WLAN_EID_SSID); if (!ssid || !ssid->datalen) continue; - memcpy(wdev->u.client.ssid, ssid->data, ssid->datalen); + ssid_len = min(ssid->datalen, IEEE80211_MAX_SSID_LEN); + memcpy(wdev->u.client.ssid, ssid->data, ssid_len); wdev->u.client.ssid_len = ssid->datalen; break; } diff --git a/redhat/configs/common/generic/x86/CONFIG_MITIGATION_VMSCAPE b/redhat/configs/common/generic/x86/CONFIG_MITIGATION_VMSCAPE new file mode 100644 index 0000000000..8df03e3732 --- /dev/null +++ b/redhat/configs/common/generic/x86/CONFIG_MITIGATION_VMSCAPE @@ -0,0 +1 @@ +CONFIG_MITIGATION_VMSCAPE=y diff --git a/redhat/configs/rhel/generic/x86/x86_64/CONFIG_INTEL_TDX_HOST b/redhat/configs/rhel/generic/x86/x86_64/CONFIG_INTEL_TDX_HOST index 0e906439c7..880e5f40c4 100644 --- a/redhat/configs/rhel/generic/x86/x86_64/CONFIG_INTEL_TDX_HOST +++ b/redhat/configs/rhel/generic/x86/x86_64/CONFIG_INTEL_TDX_HOST @@ -1 +1 @@ -# CONFIG_INTEL_TDX_HOST is not set +CONFIG_INTEL_TDX_HOST=y diff --git a/redhat/configs/rhel/generic/x86/x86_64/CONFIG_KVM_INTEL_TDX b/redhat/configs/rhel/generic/x86/x86_64/CONFIG_KVM_INTEL_TDX new file mode 100644 index 0000000000..6c3eec922a --- /dev/null +++ b/redhat/configs/rhel/generic/x86/x86_64/CONFIG_KVM_INTEL_TDX @@ -0,0 +1 @@ +CONFIG_KVM_INTEL_TDX=y diff --git a/redhat/kernel.changelog-9.7 b/redhat/kernel.changelog-9.7 index eb1ecc8d18..4480b281c6 100644 --- a/redhat/kernel.changelog-9.7 +++ b/redhat/kernel.changelog-9.7 @@ -1,3 +1,111 @@ +* Sat Nov 15 2025 CKI KWF Bot [5.14.0-611.9.1.el9_7] +- NFSv4: handle ERR_GRACE on delegation recalls (Olga Kornievskaia) [RHEL-124651] +- nfsd: nfserr_jukebox in nlm_fopen should lead to a retry (Olga Kornievskaia) [RHEL-124651] +- mm: slub: avoid wake up kswapd in set_track_prepare (Audra Mitchell) [RHEL-125521] {CVE-2025-39843} +- 
slub: Reflow ___slab_alloc() (Audra Mitchell) [RHEL-125521] {CVE-2025-39843}
+- nvme-multipath: Skip nr_active increments in RETRY disposition (Ewan D. Milne) [RHEL-123686]
+Resolves: RHEL-123686, RHEL-124651, RHEL-125521
+
+* Thu Nov 13 2025 CKI KWF Bot [5.14.0-611.8.1.el9_7]
+- NFSD: Fix callback decoder status codes (Jay Shin) [RHEL-127193]
+- NFSD: Fix CB_GETATTR status fix (Jay Shin) [RHEL-127193]
+- NFSD: fix decoding in nfs4_xdr_dec_cb_getattr (Jay Shin) [RHEL-127193]
+- kernfs: Fix UAF in polling when open file is released (Pavel Reichl) [RHEL-122087] {CVE-2025-39881}
+- gitlab-ci: disable automotive pipelines (Scott Weaver)
+- NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate() (Benjamin Coddington) [RHEL-122154]
+- sched: Add wait/wake interface for variable updated under a lock. (Benjamin Coddington) [RHEL-122154]
+- sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up() (Benjamin Coddington) [RHEL-122154]
+- sched: Document wait_var_event() family of functions and wake_up_var() (Benjamin Coddington) [RHEL-122154]
+- sched: Improve documentation for wake_up_bit/wait_on_bit family of functions (Benjamin Coddington) [RHEL-122154]
+- sched: change wake_up_bit() and related function to expect unsigned long * (Benjamin Coddington) [RHEL-122154]
+- bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6} [rhel-9.7.z] (Xin Long) [RHEL-125513]
+- redhat: use the same cert as UKI's to sign addons (Li Tian) [RHEL-125317]
+- i40e: add mask to apply valid bits for itr_idx (Michal Schmidt) [RHEL-123808]
+- i40e: add max boundary check for VF filters (Michal Schmidt) [RHEL-123808] {CVE-2025-39968}
+- i40e: fix validation of VF state in get resources (Michal Schmidt) [RHEL-123808] {CVE-2025-39969}
+- i40e: fix input validation logic for action_meta (Michal Schmidt) [RHEL-123808] {CVE-2025-39970}
+- i40e: fix idx validation in config queues msg (Michal Schmidt) [RHEL-123808] {CVE-2025-39971}
+- i40e: fix idx validation in i40e_validate_queue_map (Michal Schmidt) [RHEL-123808] {CVE-2025-39972}
+- i40e: add validation for ring_len param (Michal Schmidt) [RHEL-123808] {CVE-2025-39973}
+- io_uring/waitid: always prune wait queue entry in io_waitid_wait() (CKI Backport Bot) [RHEL-124971] {CVE-2025-40047}
+- Bluetooth: hci_event: Fix UAF in hci_conn_tx_dequeue (CKI Backport Bot) [RHEL-124129] {CVE-2025-39983}
+- Bluetooth: hci_event: Fix UAF in hci_acl_create_conn_sync (CKI Backport Bot) [RHEL-123821] {CVE-2025-39982}
+- use uniform permission checks for all mount propagation changes (Ian Kent) [RHEL-121704] {CVE-2025-38498}
+- do_change_type(): refuse to operate on unmounted/not ours mounts (Ian Kent) [RHEL-121704] {CVE-2025-38498}
+- KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush (Jon Maloy) [RHEL-117136] {CVE-2025-38351}
+- ibmveth: Add multi buffers rx replenishment hcall support (Mamatha Inamdar) [RHEL-117438]
+- net: ibmveth: Reset the adapter when unexpected states are detected (Mamatha Inamdar) [RHEL-117438]
+- NFS: Fix a race when updating an existing write (CKI Backport Bot) [RHEL-113855] {CVE-2025-39697}
+Resolves: RHEL-113855, RHEL-117136, RHEL-117438, RHEL-121704, RHEL-122087, RHEL-122154, RHEL-123808, RHEL-123821, RHEL-124129, RHEL-124971, RHEL-125317, RHEL-125513, RHEL-127193
+
+* Thu Oct 30 2025 CKI KWF Bot [5.14.0-611.7.1.el9_7]
+- The rpminspect.yaml emptyrpm list needs to be expanded (Alexandra Hájková)
+- crypto: xts - Handle EBUSY correctly (Vladis Dronov) [RHEL-119236] {CVE-2023-53494}
+- ice: fix NULL access of tx->in_use in ice_ll_ts_intr (Petr Oros) [RHEL-112874]
+- ice: fix NULL access of tx->in_use in ice_ptp_ts_irq (Petr Oros) [RHEL-112874]
+- ice: fix Rx page leak on multi-buffer frames (Petr Oros) [RHEL-116540]
+- xfs: do not propagate ENODATA disk errors into xattr code (Carlos Maiolino) [RHEL-115730]
+- ipv6: sr: Fix MAC comparison to be constant-time (CKI Backport Bot) [RHEL-116383] {CVE-2025-39702}
+- s390/hypfs: Enable limited access during lockdown (CKI Backport Bot) [RHEL-114434]
+- s390/hypfs: Avoid unnecessary ioctl registration in debugfs (CKI Backport Bot) [RHEL-114434]
+- vsock/virtio: Validate length in packet header before skb_put() (Jon Maloy) [RHEL-114298] {CVE-2025-39718}
+Resolves: RHEL-112874, RHEL-114298, RHEL-114434, RHEL-115730, RHEL-116383, RHEL-116540, RHEL-119236
+
+* Thu Oct 23 2025 CKI KWF Bot [5.14.0-611.6.1.el9_7]
+- pstore/ram: Check start of empty przs during init (CKI Backport Bot) [RHEL-122068] {CVE-2023-53331}
+- ixgbe: fix ixgbe_orom_civd_info struct layout (Michal Schmidt) [RHEL-119074]
+- scsi: lpfc: Fix buffer free/clear order in deferred receive path (CKI Backport Bot) [RHEL-119130] {CVE-2025-39841}
+- efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare (CKI Backport Bot) [RHEL-118257] {CVE-2025-39817}
+- SUNRPC: call xs_sock_process_cmsg for all cmsg (Olga Kornievskaia) [RHEL-110810]
+- sunrpc: fix client side handling of tls alerts (Olga Kornievskaia) [RHEL-110810] {CVE-2025-38571}
+- smb: client: fix wrong index reference in smb2_compound_op() (Paulo Alcantara) [RHEL-117880]
+- smb: client: handle unlink(2) of files open by different clients (Paulo Alcantara) [RHEL-117880]
+- smb: client: fix file open check in __cifs_unlink() (Paulo Alcantara) [RHEL-117880]
+- smb: client: fix filename matching of deferred files (Paulo Alcantara) [RHEL-117880]
+- smb: client: fix data loss due to broken rename(2) (Paulo Alcantara) [RHEL-117880]
+- smb: client: fix compound alignment with encryption (Paulo Alcantara) [RHEL-117880]
+- fs/smb: Fix inconsistent refcnt update (Paulo Alcantara) [RHEL-117880] {CVE-2025-39819}
+- sunrpc: fix handling of server side tls alerts (Steve Dickson) [RHEL-111069] {CVE-2025-38566}
+- wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result() (CKI Backport Bot) [RHEL-117580] {CVE-2025-39849}
+- crypto: seqiv - Handle EBUSY correctly (CKI Backport Bot) [RHEL-117235] {CVE-2023-53373}
+- ibmvnic: Increase max subcrq indirect entries with fallback (Mamatha Inamdar) [RHEL-116187]
+- fs: fix UAF/GPF bug in nilfs_mdt_destroy (CKI Backport Bot) [RHEL-116662] {CVE-2022-50367}
+- firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails (Charles Mirabile) [RHEL-113837] {CVE-2022-50087}
+- hv_netvsc: Fix panic during namespace deletion with VF (Maxim Levitsky) [RHEL-115070]
+- RDMA/mana_ib: Fix DSCP value in modify QP (Maxim Levitsky) [RHEL-115070]
+- net: mana: Handle Reset Request from MANA NIC (Maxim Levitsky) [RHEL-115070]
+- net: mana: Set tx_packets to post gso processing packet count (Maxim Levitsky) [RHEL-115070]
+- net: mana: Handle unsupported HWC commands (Maxim Levitsky) [RHEL-115070]
+- net: mana: Add handler for hardware servicing events (Maxim Levitsky) [RHEL-115070]
+- RDMA/mana_ib: Add device statistics support (Maxim Levitsky) [RHEL-115070]
+- net: mana: Expose additional hardware counters for drop and TC via ethtool. (Maxim Levitsky) [RHEL-115070]
+- net: mana: Fix warnings for missing export.h header inclusion (Maxim Levitsky) [RHEL-115070]
+- net: mana: Record doorbell physical address in PF mode (Maxim Levitsky) [RHEL-115070]
+- s390/pci: Do not try re-enabling load/store if device is disabled (CKI Backport Bot) [RHEL-114451]
+- s390/pci: Fix stale function handles in error handling (CKI Backport Bot) [RHEL-114451]
+- redhat: enable TDX host config (Paolo Bonzini) [RHEL-27146]
+- KVM: TDX: Explicitly do WBINVD when no more TDX SEAMCALLs (Paolo Bonzini) [RHEL-27146]
+- x86/virt/tdx: Update the kexec section in the TDX documentation (Paolo Bonzini) [RHEL-27146]
+- x86/virt/tdx: Remove the !KEXEC_CORE dependency (Paolo Bonzini) [RHEL-27146]
+- x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum (Paolo Bonzini) [RHEL-27146]
+- x86/virt/tdx: Mark memory cache state incoherent when making SEAMCALL (Paolo Bonzini) [RHEL-27146]
+- x86/sme: Use percpu boolean to control WBINVD during kexec (Paolo Bonzini) [RHEL-27146]
+- x86/virt/tdx: Avoid indirect calls to TDX assembly functions (Paolo Bonzini) [RHEL-27146]
+- ibmvnic: Use ndo_get_stats64 to fix inaccurate SAR reporting (Mamatha Inamdar) [RHEL-114437]
+- ibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof (Mamatha Inamdar) [RHEL-114437]
+- ibmvnic: Add stat for tx direct vs tx batched (Mamatha Inamdar) [RHEL-114437]
+- redhat/configs: Enable CONFIG_MITIGATION_VMSCAPE for x86 (Waiman Long) [RHEL-114272]
+- x86/vmscape: Add old Intel CPUs to affected list (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- x86/vmscape: Warn when STIBP is disabled with SMT (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- x86/bugs: Move cpu_bugs_smt_update() down (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- x86/vmscape: Enable the mitigation (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- x86/vmscape: Add conditional IBPB mitigation (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- x86/vmscape: Enumerate VMSCAPE bug (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- Documentation/hw-vuln: Add VMSCAPE documentation (Waiman Long) [RHEL-114272] {CVE-2025-40300}
+- randomize_kstack: Remove non-functional per-arch entropy filtering (Waiman Long) [RHEL-114272]
+- tunnels: reset the GSO metadata before reusing the skb (Antoine Tenart) [RHEL-113917]
+Resolves: RHEL-110810, RHEL-111069, RHEL-113837, RHEL-113917, RHEL-114272, RHEL-114437, RHEL-114451, RHEL-115070, RHEL-116187, RHEL-116662, RHEL-117235, RHEL-117580, RHEL-117880, RHEL-118257, RHEL-119074, RHEL-119130, RHEL-122068, RHEL-27146
+
 * Fri Oct 17 2025 Augusto Caringi [5.14.0-611.5.1.el9_7]
 - redhat: revert to using redhatsecureboot504 for RHEL UKI (Vitaly Kuznetsov) [RHEL-122230]
 Resolves: RHEL-122230
diff --git a/redhat/kernel.spec.template b/redhat/kernel.spec.template
index c7c217dfa6..bfe2c6810a 100644
--- a/redhat/kernel.spec.template
+++ b/redhat/kernel.spec.template
@@ -2433,7 +2433,7 @@ BuildKernel() {
     mv $KernelUnifiedImage.signed $KernelUnifiedImage
 
     for addon in "$KernelAddonsDirOut"/*; do
-        %pesign -s -i $addon -o $addon.signed -a %{secureboot_ca_0} -c %{secureboot_key_0} -n %{pesign_name_0}
+        %pesign -s -i $addon -o $addon.signed -a %{secureboot_ca_0} -c $UKI_secureboot_cert -n $UKI_secureboot_name
         rm -f $addon
         mv $addon.signed $addon
     done
diff --git a/redhat/rpminspect.yaml b/redhat/rpminspect.yaml
index ed799bcadb..eecdab997d 100644
--- a/redhat/rpminspect.yaml
+++ b/redhat/rpminspect.yaml
@@ -21,10 +21,21 @@ emptyrpm:
     - kernel-debug
     - kernel-debug-devel-matched
     - kernel-devel-matched
-    - kernel-lpae
     - kernel-zfcpdump
     - kernel-zfcpdump-devel-matched
     - kernel-zfcpdump-modules
+    - kernel-64k
+    - kernel-64k-debug
+    - kernel-64k-debug-devel-matched
+    - kernel-64k-devel-matched
+    - kernel-rt
+    - kernel-rt-debug
+    - kernel-rt-debug-devel-matched
+    - kernel-rt-devel-matched
+    - kernel-rt-64k
+    - kernel-rt-64k-debug
+    - kernel-rt-64k-debug-devel-matched
+    - kernel-rt-64k-devel-matched
 
 patches:
     ignore_list:
diff --git a/redhat/scripts/uki_addons/uki_create_json.py b/redhat/scripts/uki_addons/uki_create_json.py
index 99387bf987..d4e4e46491 100755
--- a/redhat/scripts/uki_addons/uki_create_json.py
+++ b/redhat/scripts/uki_addons/uki_create_json.py
@@ -86,7 +86,7 @@ def create_json(addons):
 
 def write_json(obj, dest_file):
     with open(dest_file, 'w') as f:
-        json.dump(obj , f, indent=4)
+        json.dump(obj , f, indent=4, sort_keys=True)
     print(f'Processed addons files are in {dest_file}')
 
 if __name__ == "__main__":
diff --git a/redhat/self-test/data/centos-6161a435c191.el7 b/redhat/self-test/data/centos-6161a435c191.el7
index cabc4de2eb..080aa062df 100644
--- a/redhat/self-test/data/centos-6161a435c191.el7
+++ b/redhat/self-test/data/centos-6161a435c191.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-6161a435c191.el7.spec b/redhat/self-test/data/centos-6161a435c191.el7.spec
index bdcde3fd62..1bf0b5ac65 100644
--- a/redhat/self-test/data/centos-6161a435c191.el7.spec
+++ b/redhat/self-test/data/centos-6161a435c191.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc4.6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/centos-6161a435c191.fc25 b/redhat/self-test/data/centos-6161a435c191.fc25
index 3fc9ee1fd8..9f10f7f5d0 100644
--- a/redhat/self-test/data/centos-6161a435c191.fc25
+++ b/redhat/self-test/data/centos-6161a435c191.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-6161a435c191.fc25.spec b/redhat/self-test/data/centos-6161a435c191.fc25.spec
index d12d8bcf46..1bb696fa2d 100644
--- a/redhat/self-test/data/centos-6161a435c191.fc25.spec
+++ b/redhat/self-test/data/centos-6161a435c191.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc4.6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/centos-9f4ad9e425a1.el7 b/redhat/self-test/data/centos-9f4ad9e425a1.el7
index 48a7178d7e..cd2f0004e0 100644
--- a/redhat/self-test/data/centos-9f4ad9e425a1.el7
+++ b/redhat/self-test/data/centos-9f4ad9e425a1.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-9f4ad9e425a1.el7.spec b/redhat/self-test/data/centos-9f4ad9e425a1.el7.spec
index fc841d2839..eb6507c94f 100644
--- a/redhat/self-test/data/centos-9f4ad9e425a1.el7.spec
+++ b/redhat/self-test/data/centos-9f4ad9e425a1.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/centos-9f4ad9e425a1.fc25 b/redhat/self-test/data/centos-9f4ad9e425a1.fc25
index bececbcb95..e5ab1dc157 100644
--- a/redhat/self-test/data/centos-9f4ad9e425a1.fc25
+++ b/redhat/self-test/data/centos-9f4ad9e425a1.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-9f4ad9e425a1.fc25.spec b/redhat/self-test/data/centos-9f4ad9e425a1.fc25.spec
index 774abe0dae..55c1013c7b 100644
--- a/redhat/self-test/data/centos-9f4ad9e425a1.fc25.spec
+++ b/redhat/self-test/data/centos-9f4ad9e425a1.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/centos-a5e13c6df0e4.el7 b/redhat/self-test/data/centos-a5e13c6df0e4.el7
index a267dec907..9acbe25a51 100644
--- a/redhat/self-test/data/centos-a5e13c6df0e4.el7
+++ b/redhat/self-test/data/centos-a5e13c6df0e4.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-a5e13c6df0e4.el7.spec b/redhat/self-test/data/centos-a5e13c6df0e4.el7.spec
index 423d790009..0500ad5e27 100644
--- a/redhat/self-test/data/centos-a5e13c6df0e4.el7.spec
+++ b/redhat/self-test/data/centos-a5e13c6df0e4.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc5.6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/centos-a5e13c6df0e4.fc25 b/redhat/self-test/data/centos-a5e13c6df0e4.fc25
index 05cb076489..0dad71c43a 100644
--- a/redhat/self-test/data/centos-a5e13c6df0e4.fc25
+++ b/redhat/self-test/data/centos-a5e13c6df0e4.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-a5e13c6df0e4.fc25.spec b/redhat/self-test/data/centos-a5e13c6df0e4.fc25.spec
index 968abf1b9e..dd3935967d 100644
--- a/redhat/self-test/data/centos-a5e13c6df0e4.fc25.spec
+++ b/redhat/self-test/data/centos-a5e13c6df0e4.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc5.6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/centos-edc9dd1e3c31.el7 b/redhat/self-test/data/centos-edc9dd1e3c31.el7
index 4ea008998d..c441cc9a8a 100644
--- a/redhat/self-test/data/centos-edc9dd1e3c31.el7
+++ b/redhat/self-test/data/centos-edc9dd1e3c31.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-edc9dd1e3c31.el7.spec b/redhat/self-test/data/centos-edc9dd1e3c31.el7.spec
index fc841d2839..eb6507c94f 100644
--- a/redhat/self-test/data/centos-edc9dd1e3c31.el7.spec
+++ b/redhat/self-test/data/centos-edc9dd1e3c31.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/centos-edc9dd1e3c31.fc25 b/redhat/self-test/data/centos-edc9dd1e3c31.fc25
index 7b14fcc313..fb5c2ad2e6 100644
--- a/redhat/self-test/data/centos-edc9dd1e3c31.fc25
+++ b/redhat/self-test/data/centos-edc9dd1e3c31.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=-p stream
 BUILD_TARGET=c9s-candidate
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/centos-edc9dd1e3c31.fc25.spec b/redhat/self-test/data/centos-edc9dd1e3c31.fc25.spec
index 774abe0dae..55c1013c7b 100644
--- a/redhat/self-test/data/centos-edc9dd1e3c31.fc25.spec
+++ b/redhat/self-test/data/centos-edc9dd1e3c31.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/fedora-6161a435c191.el7 b/redhat/self-test/data/fedora-6161a435c191.el7
index 684a07fcba..40a6a201b8 100644
--- a/redhat/self-test/data/fedora-6161a435c191.el7
+++ b/redhat/self-test/data/fedora-6161a435c191.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-6161a435c191.el7.spec b/redhat/self-test/data/fedora-6161a435c191.el7.spec
index 9f0a8d7146..0e65381c88 100644
--- a/redhat/self-test/data/fedora-6161a435c191.el7.spec
+++ b/redhat/self-test/data/fedora-6161a435c191.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/fedora-6161a435c191.fc25 b/redhat/self-test/data/fedora-6161a435c191.fc25
index c36712fa26..740ea30d21 100644
--- a/redhat/self-test/data/fedora-6161a435c191.fc25
+++ b/redhat/self-test/data/fedora-6161a435c191.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-6161a435c191.fc25.spec b/redhat/self-test/data/fedora-6161a435c191.fc25.spec
index 9f0a8d7146..0e65381c88 100644
--- a/redhat/self-test/data/fedora-6161a435c191.fc25.spec
+++ b/redhat/self-test/data/fedora-6161a435c191.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/fedora-9f4ad9e425a1.el7 b/redhat/self-test/data/fedora-9f4ad9e425a1.el7
index 0d5c37f5d9..c084d162c4 100644
--- a/redhat/self-test/data/fedora-9f4ad9e425a1.el7
+++ b/redhat/self-test/data/fedora-9f4ad9e425a1.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-9f4ad9e425a1.el7.spec b/redhat/self-test/data/fedora-9f4ad9e425a1.el7.spec
index 7abf5bc3fe..e2194adceb 100644
--- a/redhat/self-test/data/fedora-9f4ad9e425a1.el7.spec
+++ b/redhat/self-test/data/fedora-9f4ad9e425a1.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/fedora-9f4ad9e425a1.fc25 b/redhat/self-test/data/fedora-9f4ad9e425a1.fc25
index 8d45777b49..eec2a62dbf 100644
--- a/redhat/self-test/data/fedora-9f4ad9e425a1.fc25
+++ b/redhat/self-test/data/fedora-9f4ad9e425a1.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-9f4ad9e425a1.fc25.spec b/redhat/self-test/data/fedora-9f4ad9e425a1.fc25.spec
index 7abf5bc3fe..e2194adceb 100644
--- a/redhat/self-test/data/fedora-9f4ad9e425a1.fc25.spec
+++ b/redhat/self-test/data/fedora-9f4ad9e425a1.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/fedora-a5e13c6df0e4.el7 b/redhat/self-test/data/fedora-a5e13c6df0e4.el7
index 27fc563f73..a843dca222 100644
--- a/redhat/self-test/data/fedora-a5e13c6df0e4.el7
+++ b/redhat/self-test/data/fedora-a5e13c6df0e4.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-a5e13c6df0e4.el7.spec b/redhat/self-test/data/fedora-a5e13c6df0e4.el7.spec
index 1ada6692c6..9e26126af0 100644
--- a/redhat/self-test/data/fedora-a5e13c6df0e4.el7.spec
+++ b/redhat/self-test/data/fedora-a5e13c6df0e4.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/fedora-a5e13c6df0e4.fc25 b/redhat/self-test/data/fedora-a5e13c6df0e4.fc25
index ddeb5f0cb7..944db1f7fe 100644
--- a/redhat/self-test/data/fedora-a5e13c6df0e4.fc25
+++ b/redhat/self-test/data/fedora-a5e13c6df0e4.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-a5e13c6df0e4.fc25.spec b/redhat/self-test/data/fedora-a5e13c6df0e4.fc25.spec
index 1ada6692c6..9e26126af0 100644
--- a/redhat/self-test/data/fedora-a5e13c6df0e4.fc25.spec
+++ b/redhat/self-test/data/fedora-a5e13c6df0e4.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/fedora-edc9dd1e3c31.el7 b/redhat/self-test/data/fedora-edc9dd1e3c31.el7
index 557d595a4d..0c722f57a6 100644
--- a/redhat/self-test/data/fedora-edc9dd1e3c31.el7
+++ b/redhat/self-test/data/fedora-edc9dd1e3c31.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-edc9dd1e3c31.el7.spec b/redhat/self-test/data/fedora-edc9dd1e3c31.el7.spec
index 7abf5bc3fe..e2194adceb 100644
--- a/redhat/self-test/data/fedora-edc9dd1e3c31.el7.spec
+++ b/redhat/self-test/data/fedora-edc9dd1e3c31.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/fedora-edc9dd1e3c31.fc25 b/redhat/self-test/data/fedora-edc9dd1e3c31.fc25
index fb45bedba1..18434ab243 100644
--- a/redhat/self-test/data/fedora-edc9dd1e3c31.fc25
+++ b/redhat/self-test/data/fedora-edc9dd1e3c31.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rawhide
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/fedora-edc9dd1e3c31.fc25.spec b/redhat/self-test/data/fedora-edc9dd1e3c31.fc25.spec
index 7abf5bc3fe..e2194adceb 100644
--- a/redhat/self-test/data/fedora-edc9dd1e3c31.fc25.spec
+++ b/redhat/self-test/data/fedora-edc9dd1e3c31.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/rhel-6161a435c191.el7 b/redhat/self-test/data/rhel-6161a435c191.el7
index a12037e05b..e04c12fdfe 100644
--- a/redhat/self-test/data/rhel-6161a435c191.el7
+++ b/redhat/self-test/data/rhel-6161a435c191.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-6161a435c191.el7.spec b/redhat/self-test/data/rhel-6161a435c191.el7.spec
index bdcde3fd62..1bf0b5ac65 100644
--- a/redhat/self-test/data/rhel-6161a435c191.el7.spec
+++ b/redhat/self-test/data/rhel-6161a435c191.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc4.6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/rhel-6161a435c191.fc25 b/redhat/self-test/data/rhel-6161a435c191.fc25
index 6eed090386..41861edb2f 100644
--- a/redhat/self-test/data/rhel-6161a435c191.fc25
+++ b/redhat/self-test/data/rhel-6161a435c191.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc4.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc4
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-6161a435c191.fc25.spec b/redhat/self-test/data/rhel-6161a435c191.fc25.spec
index d12d8bcf46..1bb696fa2d 100644
--- a/redhat/self-test/data/rhel-6161a435c191.fc25.spec
+++ b/redhat/self-test/data/rhel-6161a435c191.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc4.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc4.6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc4.6.test]
diff --git a/redhat/self-test/data/rhel-9f4ad9e425a1.el7 b/redhat/self-test/data/rhel-9f4ad9e425a1.el7
index c3aea3cfe4..ac8dd2f6d2 100644
--- a/redhat/self-test/data/rhel-9f4ad9e425a1.el7
+++ b/redhat/self-test/data/rhel-9f4ad9e425a1.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-9f4ad9e425a1.el7.spec b/redhat/self-test/data/rhel-9f4ad9e425a1.el7.spec
index fc841d2839..eb6507c94f 100644
--- a/redhat/self-test/data/rhel-9f4ad9e425a1.el7.spec
+++ b/redhat/self-test/data/rhel-9f4ad9e425a1.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/rhel-9f4ad9e425a1.fc25 b/redhat/self-test/data/rhel-9f4ad9e425a1.fc25
index 3073a08915..125303552b 100644
--- a/redhat/self-test/data/rhel-9f4ad9e425a1.fc25
+++ b/redhat/self-test/data/rhel-9f4ad9e425a1.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-9f4ad9e425a1.fc25.spec b/redhat/self-test/data/rhel-9f4ad9e425a1.fc25.spec
index 774abe0dae..55c1013c7b 100644
--- a/redhat/self-test/data/rhel-9f4ad9e425a1.fc25.spec
+++ b/redhat/self-test/data/rhel-9f4ad9e425a1.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/rhel-a5e13c6df0e4.el7 b/redhat/self-test/data/rhel-a5e13c6df0e4.el7
index ab3dc93d8c..1ba258d283 100644
--- a/redhat/self-test/data/rhel-a5e13c6df0e4.el7
+++ b/redhat/self-test/data/rhel-a5e13c6df0e4.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-a5e13c6df0e4.el7.spec b/redhat/self-test/data/rhel-a5e13c6df0e4.el7.spec
index 423d790009..0500ad5e27 100644
--- a/redhat/self-test/data/rhel-a5e13c6df0e4.el7.spec
+++ b/redhat/self-test/data/rhel-a5e13c6df0e4.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc5.6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/rhel-a5e13c6df0e4.fc25 b/redhat/self-test/data/rhel-a5e13c6df0e4.fc25
index 96307cf8fa..eaef54cce6 100644
--- a/redhat/self-test/data/rhel-a5e13c6df0e4.fc25
+++ b/redhat/self-test/data/rhel-a5e13c6df0e4.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-0.rc5.6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12-rc5
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-a5e13c6df0e4.fc25.spec b/redhat/self-test/data/rhel-a5e13c6df0e4.fc25.spec
index 968abf1b9e..dd3935967d 100644
--- a/redhat/self-test/data/rhel-a5e13c6df0e4.fc25.spec
+++ b/redhat/self-test/data/rhel-a5e13c6df0e4.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 0.rc5.6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-0.rc5.6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-0.rc5.6.test]
diff --git a/redhat/self-test/data/rhel-edc9dd1e3c31.el7 b/redhat/self-test/data/rhel-edc9dd1e3c31.el7
index 6f1ea0a703..3d6238cebf 100644
--- a/redhat/self-test/data/rhel-edc9dd1e3c31.el7
+++ b/redhat/self-test/data/rhel-edc9dd1e3c31.el7
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-edc9dd1e3c31.el7.spec b/redhat/self-test/data/rhel-edc9dd1e3c31.el7.spec
index fc841d2839..eb6507c94f 100644
--- a/redhat/self-test/data/rhel-edc9dd1e3c31.el7.spec
+++ b/redhat/self-test/data/rhel-edc9dd1e3c31.el7.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.el7
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
diff --git a/redhat/self-test/data/rhel-edc9dd1e3c31.fc25 b/redhat/self-test/data/rhel-edc9dd1e3c31.fc25
index e4996d6d5c..2596d8942f 100644
--- a/redhat/self-test/data/rhel-edc9dd1e3c31.fc25
+++ b/redhat/self-test/data/rhel-edc9dd1e3c31.fc25
@@ -5,6 +5,7 @@ ARCHCONFIG=X86_64
 ARCH_LIST=aarch64 ppc64le s390x x86_64
 BASEVERSION=5.12.0-6.test
 BUILD=6
+BUILDOPTS=+kabidupchk
 BUILD_FLAGS=
 BUILD_PROFILE=
 BUILD_TARGET=rhel-9.7.0-test-pesign
@@ -81,6 +82,6 @@ UPSTREAMBUILD_GIT_ONLY=
 UPSTREAM_BRANCH=v5.14
 UPSTREAM_TARBALL_NAME=5.12
 VERSION_ON_UPSTREAM=0
-YSTREAM_FLAG=yes
-ZSTREAM_FLAG=no
+YSTREAM_FLAG=no
+ZSTREAM_FLAG=yes
 _OUTPUT=..
diff --git a/redhat/self-test/data/rhel-edc9dd1e3c31.fc25.spec b/redhat/self-test/data/rhel-edc9dd1e3c31.fc25.spec
index 774abe0dae..55c1013c7b 100644
--- a/redhat/self-test/data/rhel-edc9dd1e3c31.fc25.spec
+++ b/redhat/self-test/data/rhel-edc9dd1e3c31.fc25.spec
@@ -10,4 +10,5 @@
 %define patchlevel 12
 %define specrelease 6%{?buildid}%{?dist}
 %define kabiversion 5.12.0-6.test.fc25
+%define _with_kabidupchk 1
 Mon Mar 28 2022 Fedora Kernel Team [5.12.0-6.test]
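
A note on the sort_keys=True change to write_json() in redhat/scripts/uki_addons/uki_create_json.py above: json.dump otherwise serializes dict keys in insertion order, so two runs that assemble the addon dictionary in different orders can emit byte-different JSON files. A minimal standalone sketch of the effect (the example keys below are invented for illustration and are not taken from the script):

    import json

    # Same content, assembled in different key orders (hypothetical keys).
    a = {"cmdline": "rd.debug", "name": "debug-addon"}
    b = {"name": "debug-addon", "cmdline": "rd.debug"}

    # Insertion order leaks into the serialized text ...
    assert json.dumps(a, indent=4) != json.dumps(b, indent=4)

    # ... while sort_keys=True yields identical output for both,
    # keeping the generated addon JSON reproducible across runs.
    assert json.dumps(a, indent=4, sort_keys=True) == json.dumps(b, indent=4, sort_keys=True)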