Good Day,
I know this email might come to you as a surprise because is
coming from someone you haven’t met with before.
I am Mr. Zahiri Keen, the bank manager with BOA bank i contact you for
a deal relating to the funds which are in my position I shall furnish
you with more detail once your response.
Regards,
Mr.Zahiri
commit e89d120c4b720e232cc6a94f0fcbd59c15d41489 upstream.
The AMU counter AMEVCNTR01 (constant counter) should increment at the same
rate as the system counter. On affected Cortex-A510 cores, AMEVCNTR01
increments incorrectly giving a significantly higher output value. This
results in inaccurate task scheduler utilization tracking.
Work around this problem by keeping the reference values of affected
counters to 0. This results in disabling the single user of this
counter: Frequency Invariance Engine (FIE).
This effect is the same to firmware disabling affected counters, in
which case 0 will be returned when reading the disabled counters.
Therefore, AMU counters will not be used for frequency invariance for
affected CPUs and CPUs in the same cpufreq policy. AMUs can still be used
for frequency invariance for unaffected CPUs in the system.
The above is achieved through adding a new erratum: ARM64_ERRATUM_2457168.
Signed-off-by: Ionela Voinescu <ionela.voinescu(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: James Morse <james.morse(a)arm.com>
Link: https://lore.kernel.org/r/20220819103050.24211-1-ionela.voinescu@arm.com
---
Hi,
This is a backport to stable 5.10.142 of the upstream commit
e89d120c4b72 arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly
This is sent separately as there were conflicts that needed resolving
when applying the mainline patch. Compared to the upstream version this
no longer handles the FFH usecase, as FFH support is not present in 5.10.
Therefore the commit message and Kconfig description are modified to
only describe the effect on FIE caused by this erratum.
Thanks,
Ionela.
Documentation/arm64/silicon-errata.rst | 2 ++
arch/arm64/Kconfig | 18 ++++++++++++++++++
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/kernel/cpu_errata.c | 9 +++++++++
arch/arm64/kernel/cpufeature.c | 5 ++++-
5 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index f01eed0ee23a..22a07c208fee 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -92,6 +92,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A77 | #1508412 | ARM64_ERRATUM_1508412 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1349291 | N/A |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7c7906e9dafd..1116a8d092c0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -657,6 +657,24 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_2457168
+ bool "Cortex-A510: 2457168: workaround for AMEVCNTR01 incrementing incorrectly"
+ depends on ARM64_AMU_EXTN
+ default y
+ help
+ This option adds the workaround for ARM Cortex-A510 erratum 2457168.
+
+ The AMU counter AMEVCNTR01 (constant counter) should increment at the same rate
+ as the system counter. On affected Cortex-A510 cores AMEVCNTR01 increments
+ incorrectly giving a significantly higher output value.
+
+ Work around this problem by keeping the reference values of affected counters
+ to 0 thus signaling an error case. This effect is the same to firmware disabling
+ affected counters, in which case 0 will be returned when reading the disabled
+ counters.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f42fd0a2e81c..53030d3c03a2 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -67,7 +67,8 @@
#define ARM64_MTE 57
#define ARM64_WORKAROUND_1508412 58
#define ARM64_SPECTRE_BHB 59
+#define ARM64_WORKAROUND_2457168 60
-#define ARM64_NCAPS 60
+#define ARM64_NCAPS 61
#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 78263dadd00d..aaacca6fd52f 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -545,6 +545,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
0, 0,
1, 0),
},
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2457168
+ {
+ .desc = "ARM erratum 2457168",
+ .capability = ARM64_WORKAROUND_2457168,
+ .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+ /* Cortex-A510 r0p0-r1p1 */
+ CAP_MIDR_RANGE(MIDR_CORTEX_A510, 0, 0, 1, 1)
+ },
#endif
{
}
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 4087e2d1f39e..e72c90b82656 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1559,7 +1559,10 @@ static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
smp_processor_id());
cpumask_set_cpu(smp_processor_id(), &amu_cpus);
- init_cpu_freq_invariance_counters();
+
+ /* 0 reference values signal broken/disabled counters */
+ if (!this_cpu_has_cap(ARM64_WORKAROUND_2457168))
+ init_cpu_freq_invariance_counters();
}
}
--
2.25.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
53fc7ad6edf2 ("iommu/vt-d: Correctly calculate sagaw value of IOMMU")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 53fc7ad6edf210b497230ce74b61b322a202470c Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu(a)linux.intel.com>
Date: Tue, 23 Aug 2022 14:15:55 +0800
Subject: [PATCH] iommu/vt-d: Correctly calculate sagaw value of IOMMU
The Intel IOMMU driver possibly selects between the first-level and the
second-level translation tables for DMA address translation. However,
the levels of page-table walks for the 4KB base page size are calculated
from the SAGAW field of the capability register, which is only valid for
the second-level page table. This causes the IOMMU driver to stop working
if the hardware (or the emulated IOMMU) advertises only first-level
translation capability and reports the SAGAW field as 0.
This solves the above problem by considering both the first level and the
second level when calculating the supported page table levels.
Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b9d058c27568..b155c7af7d15 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -390,14 +390,36 @@ static inline int domain_pfn_supported(struct dmar_domain *domain,
return !(addr_width < BITS_PER_LONG && pfn >> addr_width);
}
+/*
+ * Calculate the Supported Adjusted Guest Address Widths of an IOMMU.
+ * Refer to 11.4.2 of the VT-d spec for the encoding of each bit of
+ * the returned SAGAW.
+ */
+static unsigned long __iommu_calculate_sagaw(struct intel_iommu *iommu)
+{
+ unsigned long fl_sagaw, sl_sagaw;
+
+ fl_sagaw = BIT(2) | (cap_fl1gp_support(iommu->cap) ? BIT(3) : 0);
+ sl_sagaw = cap_sagaw(iommu->cap);
+
+ /* Second level only. */
+ if (!sm_supported(iommu) || !ecap_flts(iommu->ecap))
+ return sl_sagaw;
+
+ /* First level only. */
+ if (!ecap_slts(iommu->ecap))
+ return fl_sagaw;
+
+ return fl_sagaw & sl_sagaw;
+}
+
static int __iommu_calculate_agaw(struct intel_iommu *iommu, int max_gaw)
{
unsigned long sagaw;
int agaw;
- sagaw = cap_sagaw(iommu->cap);
- for (agaw = width_to_agaw(max_gaw);
- agaw >= 0; agaw--) {
+ sagaw = __iommu_calculate_sagaw(iommu);
+ for (agaw = width_to_agaw(max_gaw); agaw >= 0; agaw--) {
if (test_bit(agaw, &sagaw))
break;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
c0a454b9044f ("arm64/bti: Disable in kernel BTI when cross section thunks are broken")
8cdd23c23c3d ("arm64: Restrict ARM64_BTI_KERNEL to clang 12.0.0 and newer")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0a454b9044fdc99486853aa424e5b3be2107078 Mon Sep 17 00:00:00 2001
From: Mark Brown <broonie(a)kernel.org>
Date: Mon, 5 Sep 2022 15:22:55 +0100
Subject: [PATCH] arm64/bti: Disable in kernel BTI when cross section thunks
are broken
GCC does not insert a `bti c` instruction at the beginning of a function
when it believes that all callers reach the function through a direct
branch[1]. Unfortunately the logic it uses to determine this is not
sufficiently robust, for example not taking account of functions being
placed in different sections which may be loaded separately, so we may
still see thunks being generated to these functions. If that happens,
the first instruction in the callee function will result in a Branch
Target Exception due to the missing landing pad.
While this has currently only been observed in the case of modules
having their main code loaded sufficiently far from their init section
to require thunks it could potentially happen for other cases so the
safest thing is to disable BTI for the kernel when building with an
affected toolchain.
[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
Reported-by: D Scott Phillips <scott(a)os.amperecomputing.com>
[Bits of the commit message are lifted from his report & workaround]
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220905142255.591990-1-broonie@kernel.org
Cc: <stable(a)vger.kernel.org> # v5.10+
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9fb9fff08c94..1ce7685ad5de 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1887,6 +1887,8 @@ config ARM64_BTI_KERNEL
depends on CC_HAS_BRANCH_PROT_PAC_RET_BTI
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94697
depends on !CC_IS_GCC || GCC_VERSION >= 100100
+ # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
+ depends on !CC_IS_GCC
# https://github.com/llvm/llvm-project/commit/a88c722e687e6780dcd6a58718350dc…
depends on !CC_IS_CLANG || CLANG_VERSION >= 120000
depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS)
commit e89d120c4b720e232cc6a94f0fcbd59c15d41489 upstream.
The AMU counter AMEVCNTR01 (constant counter) should increment at the same
rate as the system counter. On affected Cortex-A510 cores, AMEVCNTR01
increments incorrectly giving a significantly higher output value. This
results in inaccurate task scheduler utilization tracking and incorrect
feedback on CPU frequency.
Work around this problem by returning 0 when reading the affected counter
in key locations that results in disabling all users of this counter from
using it either for frequency invariance or as FFH reference counter. This
effect is the same to firmware disabling affected counters.
Details on how the two features are affected by this erratum:
- AMU counters will not be used for frequency invariance for affected
CPUs and CPUs in the same cpufreq policy. AMUs can still be used for
frequency invariance for unaffected CPUs in the system. Although
unlikely, if no alternative method can be found to support frequency
invariance for affected CPUs (cpufreq based or solution based on
platform counters) frequency invariance will be disabled. Please check
the chapter on frequency invariance at
Documentation/scheduler/sched-capacity.rst for details of its effect.
- Given that FFH can be used to fetch either the core or constant counter
values, restrictions are lifted regarding any of these counters
returning a valid (!0) value. Therefore FFH is considered supported
if there is a least one CPU that support AMUs, independent of any
counters being disabled or affected by this erratum. Clarifying
comments are now added to the cpc_ffh_supported(), cpu_read_constcnt()
and cpu_read_corecnt() functions.
The above is achieved through adding a new erratum: ARM64_ERRATUM_2457168.
Signed-off-by: Ionela Voinescu <ionela.voinescu(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: James Morse <james.morse(a)arm.com>
Link: https://lore.kernel.org/r/20220819103050.24211-1-ionela.voinescu@arm.com
---
Hi,
This is a backport to stable 5.15.67 of the upstream commit
e89d120c4b72 arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly
This is sent separately as there were minor conflicts that needed resolving
when applying the mainline patch.
Thanks,
Ionela.
Documentation/arm64/silicon-errata.rst | 2 ++
arch/arm64/Kconfig | 17 ++++++++++++++
arch/arm64/kernel/cpu_errata.c | 9 ++++++++
arch/arm64/kernel/cpufeature.c | 5 +++-
arch/arm64/kernel/topology.c | 32 ++++++++++++++++++++++++--
arch/arm64/tools/cpucaps | 1 +
6 files changed, 63 insertions(+), 3 deletions(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 46644736e583..663001f69773 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -94,6 +94,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2441009 | ARM64_ERRATUM_2441009 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1349291 | N/A |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9d80c783142f..9f1d0ca2531d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -683,6 +683,23 @@ config ARM64_ERRATUM_2441009
If unsure, say Y.
+config ARM64_ERRATUM_2457168
+ bool "Cortex-A510: 2457168: workaround for AMEVCNTR01 incrementing incorrectly"
+ depends on ARM64_AMU_EXTN
+ default y
+ help
+ This option adds the workaround for ARM Cortex-A510 erratum 2457168.
+
+ The AMU counter AMEVCNTR01 (constant counter) should increment at the same rate
+ as the system counter. On affected Cortex-A510 cores AMEVCNTR01 increments
+ incorrectly giving a significantly higher output value.
+
+ Work around this problem by returning 0 when reading the affected counter in
+ key locations that results in disabling all users of this counter. This effect
+ is the same to firmware disabling affected counters.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 23c57e0a7fd1..25c495f58f67 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -550,6 +550,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.capability = ARM64_WORKAROUND_NVIDIA_CARMEL_CNP,
ERRATA_MIDR_ALL_VERSIONS(MIDR_NVIDIA_CARMEL),
},
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2457168
+ {
+ .desc = "ARM erratum 2457168",
+ .capability = ARM64_WORKAROUND_2457168,
+ .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+ /* Cortex-A510 r0p0-r1p1 */
+ CAP_MIDR_RANGE(MIDR_CORTEX_A510, 0, 0, 1, 1)
+ },
#endif
{
}
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 474aa55c2f68..3e52a9e8b50b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1736,7 +1736,10 @@ static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
smp_processor_id());
cpumask_set_cpu(smp_processor_id(), &amu_cpus);
- update_freq_counters_refs();
+
+ /* 0 reference values signal broken/disabled counters */
+ if (!this_cpu_has_cap(ARM64_WORKAROUND_2457168))
+ update_freq_counters_refs();
}
}
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 4dd14a6620c1..acf67ef4c505 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -308,12 +308,25 @@ core_initcall(init_amu_fie);
static void cpu_read_corecnt(void *val)
{
+ /*
+ * A value of 0 can be returned if the current CPU does not support AMUs
+ * or if the counter is disabled for this CPU. A return value of 0 at
+ * counter read is properly handled as an error case by the users of the
+ * counter.
+ */
*(u64 *)val = read_corecnt();
}
static void cpu_read_constcnt(void *val)
{
- *(u64 *)val = read_constcnt();
+ /*
+ * Return 0 if the current CPU is affected by erratum 2457168. A value
+ * of 0 is also returned if the current CPU does not support AMUs or if
+ * the counter is disabled. A return value of 0 at counter read is
+ * properly handled as an error case by the users of the counter.
+ */
+ *(u64 *)val = this_cpu_has_cap(ARM64_WORKAROUND_2457168) ?
+ 0UL : read_constcnt();
}
static inline
@@ -340,7 +353,22 @@ int counters_read_on_cpu(int cpu, smp_call_func_t func, u64 *val)
*/
bool cpc_ffh_supported(void)
{
- return freq_counters_valid(get_cpu_with_amu_feat());
+ int cpu = get_cpu_with_amu_feat();
+
+ /*
+ * FFH is considered supported if there is at least one present CPU that
+ * supports AMUs. Using FFH to read core and reference counters for CPUs
+ * that do not support AMUs, have counters disabled or that are affected
+ * by errata, will result in a return value of 0.
+ *
+ * This is done to allow any enabled and valid counters to be read
+ * through FFH, knowing that potentially returning 0 as counter value is
+ * properly handled by the users of these counters.
+ */
+ if ((cpu >= nr_cpu_ids) || !cpumask_test_cpu(cpu, cpu_present_mask))
+ return false;
+
+ return true;
}
int cpc_read_ffh(int cpu, struct cpc_reg *reg, u64 *val)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index b71c6cbb2309..cfaffd3c8289 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -54,6 +54,7 @@ WORKAROUND_1418040
WORKAROUND_1463225
WORKAROUND_1508412
WORKAROUND_1542419
+WORKAROUND_2457168
WORKAROUND_CAVIUM_23154
WORKAROUND_CAVIUM_27456
WORKAROUND_CAVIUM_30115
--
2.25.1
<Note - hope this works - moved to my more opensource friendly email
account>
On 2022-09-12 08:20, Mark Pearson wrote:
>
> --------------------------------------------------------------------------------
> *From:* Jason A. Donenfeld <Jason(a)zx2c4.com>
> *Sent:* September 12, 2022 6:56
> *To:* Sebastian Reichel <sebastian.reichel(a)collabora.com>; Mark Pearson
> <mpearson(a)lenovo.com>
> *Cc:* linux-pm(a)vger.kernel.org <linux-pm(a)vger.kernel.org>;
> stable(a)vger.kernel.org <stable(a)vger.kernel.org>; Rafael J . Wysocki
> <rafael(a)kernel.org>
> *Subject:* [External] Re: [PATCH RESEND] power: supply: avoid nullptr deref in
> __power_supply_is_system_supplied
> CC+ Mark Pearson from Lenovo
> Full thread is here:
> https://lore.kernel.org/all/YwDsy3ZUgTtlKH9r@zx2c4.com/ <https://lore.kernel.org/all/YwDsy3ZUgTtlKH9r@zx2c4.com/>>
> On Mon, Sep 12, 2022 at 11:48 AM Jason A. Donenfeld <Jason(a)zx2c4.com> wrote:
>>
>> Ah another thing:
>>
>> On Mon, Sep 12, 2022 at 11:45 AM Jason A. Donenfeld <Jason(a)zx2c4.com> wrote:
>> > My machine went through three changes I know about between the threshold
>> > of "not crashing" and "crashing":
>> > - Upgraded to 5.19 and then 6.0-rc1.
>> > - I used my laptop on batteries for a prolonged period of time for the
>> > first time in a while.
>> > - I updated KDE, whose power management UI elements may or may not make
>> > frequent calls to this subsystem to update some visual representation.
>>
>> - Updated my BIOS.
>
> GASP! The plot thickens.
>
> It appears that the BIOS update I applied has been removed from
> https://pcsupport.lenovo.com/fr/en/downloads/ds551052-bios-update-utility-b… <https://pcsupport.lenovo.com/fr/en/downloads/ds551052-bios-update-utility-b…>
> and now it only shows the 1.16 version. I updated from 1.16 to 1.18.
>
> The missing release notes are still online if you futz with the URL:
> https://download.lenovo.com/pccbbs/mobiles/n40ur14w.txt
> <https://download.lenovo.com/pccbbs/mobiles/n40ur14w.txt>
> https://download.lenovo.com/pccbbs/mobiles/n40ur15w.txt
> <https://download.lenovo.com/pccbbs/mobiles/n40ur15w.txt>
>
> One of the items for 1.17 says:
>> - (Fix) Fixed an issue where it took a long time to update the battery FW.
>
> So maybe something was happening here...
>
> I'm CC'ing Mark from Lenovo to see if he has any insight as to why
> this BIOS update was pulled.
>
> Maybe the battery was appearing and disappearing rapidly. If that's
> correct, then it'd indicate that this bandaid patch is *wrong* and
> what actually is needed is some kind of reference counting or RCU
> around that sysfs interface (and maybe others).
>
> Jason
Hi Jason,
I'll have to check with the FW team but looking at the internal notes I
think the FW was pulled because of a graphics display regression.
Version 36W was fixing a brightness control issue in discrete mode and
37W (not yet released) is fixing external display - so my guess is
something about the fix in 36W has a side effect
More interesting is the EC FW updates. There isn't a new version posted
but there are fixes in the previous version (EC 33W) for a fix for a
'suspected EC-Battery communication transaction failure'. Is that
potentially related to this patch in some way? I can go and ask for more
details if we think it's related. I'll also see if I can repro on my
P1G4 - but I hadn't seen any other reports so it might be HW specific.
Can you confirm which FW you have from the BIOS setup screen (F1 during
early boot)? BIOS and EC please.
Mark
This bug is marked as fixed by commit:
net: core: netlink: add helper refcount dec and lock function
net: sched: add helper function to take reference to Qdisc
net: sched: extend Qdisc with rcu
net: sched: rename qdisc_destroy() to qdisc_put()
net: sched: use Qdisc rcu API instead of relying on rtnl lock
But I can't find it in any tested tree for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and
new crashes with the same signature are ignored.
From: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Its more intention revealing, and if we're interested in the odd cases
where this may end up truncating we can do debug checks at one
centralized place.
Motivation, of all the container builds, fedora rawhide started
complaining of:
util/machine.c: In function ‘machine__create_modules’:
util/machine.c:1419:50: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
1419 | snprintf(path, sizeof(path), "%s/%s", dir_name, dent->d_name);
| ^~
In file included from /usr/include/stdio.h:894,
from util/branch.h:9,
from util/callchain.h:8,
from util/machine.c:7:
In function ‘snprintf’,
inlined from ‘maps__set_modules_path_dir’ at util/machine.c:1419:3,
inlined from ‘machine__set_modules_path’ at util/machine.c:1473:9,
inlined from ‘machine__create_modules’ at util/machine.c:1519:7:
/usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 2 and 4352 bytes into a destination of size 4096
There are other places where we should use path__join(), but lets get rid of
this one first.
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: Ian Rogers <irogers(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Acked-by: Ian Rogers <irogers(a)google.com>
Link: Link: https://lore.kernel.org/r/YebZKjwgfdOz0lAs@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/machine.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 44e40bad0e33..55a041329990 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -16,6 +16,7 @@
#include "map_symbol.h"
#include "branch.h"
#include "mem-events.h"
+#include "path.h"
#include "srcline.h"
#include "symbol.h"
#include "sort.h"
@@ -1407,7 +1408,7 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
struct stat st;
/*sshfs might return bad dent->d_type, so we have to stat*/
- snprintf(path, sizeof(path), "%s/%s", dir_name, dent->d_name);
+ path__join(path, sizeof(path), dir_name, dent->d_name);
if (stat(path, &st))
continue;
--
2.34.1
The quilt patch titled
Subject: mm/damon: validate if the pmd entry is present before accessing
has been removed from the -mm tree. Its filename was
mm-damon-validate-if-the-pmd-entry-is-present-before-accessing.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Subject: mm/damon: validate if the pmd entry is present before accessing
Date: Thu, 18 Aug 2022 15:37:43 +0800
pmd_huge() is used to validate if the pmd entry is mapped by a huge page,
also including the case of non-present (migration or hwpoisoned) pmd entry
on arm64 or x86 architectures. This means that pmd_pfn() can not get the
correct pfn number for a non-present pmd entry, which will cause
damon_get_page() to get an incorrect page struct (also may be NULL by
pfn_to_online_page()), making the access statistics incorrect.
This means that the DAMON may make incorrect decision according to the
incorrect statistics, for example, DAMON may can not reclaim cold page
in time due to this cold page was regarded as accessed mistakenly if
DAMOS_PAGEOUT operation is specified.
Moreover it does not make sense that we still waste time to get the page
of the non-present entry. Just treat it as not-accessed and skip it,
which maintains consistency with non-present pte level entries.
So add pmd entry present validation to fix the above issues.
Link: https://lkml.kernel.org/r/58b1d1f5fbda7db49ca886d9ef6783e3dcbbbc98.16608050…
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/vaddr.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/mm/damon/vaddr.c~mm-damon-validate-if-the-pmd-entry-is-present-before-accessing
+++ a/mm/damon/vaddr.c
@@ -304,6 +304,11 @@ static int damon_mkold_pmd_entry(pmd_t *
if (pmd_huge(*pmd)) {
ptl = pmd_lock(walk->mm, pmd);
+ if (!pmd_present(*pmd)) {
+ spin_unlock(ptl);
+ return 0;
+ }
+
if (pmd_huge(*pmd)) {
damon_pmdp_mkold(pmd, walk->mm, addr);
spin_unlock(ptl);
@@ -431,6 +436,11 @@ static int damon_young_pmd_entry(pmd_t *
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
if (pmd_huge(*pmd)) {
ptl = pmd_lock(walk->mm, pmd);
+ if (!pmd_present(*pmd)) {
+ spin_unlock(ptl);
+ return 0;
+ }
+
if (!pmd_huge(*pmd)) {
spin_unlock(ptl);
goto regular_page;
_
Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are
mm-hugetlb-fix-races-when-looking-up-a-cont-pte-pmd-size-hugetlb-page.patch
mm-migrate-do-not-retry-10-times-for-the-subpages-of-fail-to-migrate-thp.patch
The quilt patch titled
Subject: mm: fix dereferencing possible ERR_PTR
has been removed from the -mm tree. Its filename was
mm-fix-dereferencing-possible-err_ptr.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Binyi Han <dantengknight(a)gmail.com>
Subject: mm: fix dereferencing possible ERR_PTR
Date: Sun, 4 Sep 2022 00:46:47 -0700
Smatch checker complains that 'secretmem_mnt' dereferencing possible
ERR_PTR(). Let the function return if 'secretmem_mnt' is ERR_PTR, to
avoid deferencing it.
Link: https://lkml.kernel.org/r/20220904074647.GA64291@cloud-MacBookPro
Fixes: 1507f51255c9f ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Binyi Han <dantengknight(a)gmail.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foudation.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Cc: Hagen Paul Pfeifer <hagen(a)jauu.net>
Cc: James Bottomley <James.Bottomley(a)HansenPartnership.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/secretmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/secretmem.c~mm-fix-dereferencing-possible-err_ptr
+++ a/mm/secretmem.c
@@ -285,7 +285,7 @@ static int secretmem_init(void)
secretmem_mnt = kern_mount(&secretmem_fs);
if (IS_ERR(secretmem_mnt))
- ret = PTR_ERR(secretmem_mnt);
+ return PTR_ERR(secretmem_mnt);
/* prevent secretmem mappings from ever getting PROT_EXEC */
secretmem_mnt->mnt_flags |= MNT_NOEXEC;
_
Patches currently in -mm which might be from dantengknight(a)gmail.com are
The quilt patch titled
Subject: mm/damon/dbgfs: fix memory leak when using debugfs_lookup()
has been removed from the -mm tree. Its filename was
mm-damon-dbgfs-fix-memory-leak-when-using.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Subject: mm/damon/dbgfs: fix memory leak when using debugfs_lookup()
Date: Fri, 2 Sep 2022 19:11:49 +0000
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. Fix this up by properly calling
dput().
Link: https://lkml.kernel.org/r/20220902191149.112434-1-sj@kernel.org
Fixes: 75c1c2b53c78b ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/dbgfs.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-fix-memory-leak-when-using
+++ a/mm/damon/dbgfs.c
@@ -884,6 +884,7 @@ static int dbgfs_rm_context(char *name)
struct dentry *root, *dir, **new_dirs;
struct damon_ctx **new_ctxs;
int i, j;
+ int ret = 0;
if (damon_nr_running_ctxs())
return -EBUSY;
@@ -898,14 +899,16 @@ static int dbgfs_rm_context(char *name)
new_dirs = kmalloc_array(dbgfs_nr_ctxs - 1, sizeof(*dbgfs_dirs),
GFP_KERNEL);
- if (!new_dirs)
- return -ENOMEM;
+ if (!new_dirs) {
+ ret = -ENOMEM;
+ goto out_dput;
+ }
new_ctxs = kmalloc_array(dbgfs_nr_ctxs - 1, sizeof(*dbgfs_ctxs),
GFP_KERNEL);
if (!new_ctxs) {
- kfree(new_dirs);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto out_new_dirs;
}
for (i = 0, j = 0; i < dbgfs_nr_ctxs; i++) {
@@ -925,7 +928,13 @@ static int dbgfs_rm_context(char *name)
dbgfs_ctxs = new_ctxs;
dbgfs_nr_ctxs--;
- return 0;
+ goto out_dput;
+
+out_new_dirs:
+ kfree(new_dirs);
+out_dput:
+ dput(dir);
+ return ret;
}
static ssize_t dbgfs_rm_context_write(struct file *file,
_
Patches currently in -mm which might be from gregkh(a)linuxfoundation.org are
The quilt patch titled
Subject: mm/migrate_device.c: copy pte dirty bit to page
has been removed from the -mm tree. Its filename was
mm-migrate_devicec-copy-pte-dirty-bit-to-page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm/migrate_device.c: copy pte dirty bit to page
Date: Fri, 2 Sep 2022 10:35:53 +1000
migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that
installs migration entries directly if it can lock the migrating page.
When removing a dirty pte the dirty bit is supposed to be carried over to
the underlying page to prevent it being lost.
Currently migrate_vma_*() can only be used for private anonymous mappings.
That means loss of the dirty bit usually doesn't result in data loss
because these pages are typically not file-backed. However pages may be
backed by swap storage which can result in data loss if an attempt is made
to migrate a dirty page that doesn't yet have the PageDirty flag set.
In this case migration will fail due to unexpected references but the
dirty pte bit will be lost. If the page is subsequently reclaimed data
won't be written back to swap storage as it is considered uptodate,
resulting in data loss if the page is subsequently accessed.
Prevent this by copying the dirty bit to the page when removing the pte to
match what try_to_migrate_one() does.
Link: https://lkml.kernel.org/r/dd48e4882ce859c295c1a77612f66d198b0403f9.16620785…
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reported-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Alex Sierra <alex.sierra(a)amd.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
Cc: huang ying <huang.ying.caritas(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Paul Mackerras <paulus(a)ozlabs.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate_device.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/migrate_device.c~mm-migrate_devicec-copy-pte-dirty-bit-to-page
+++ a/mm/migrate_device.c
@@ -7,6 +7,7 @@
#include <linux/export.h>
#include <linux/memremap.h>
#include <linux/migrate.h>
+#include <linux/mm.h>
#include <linux/mm_inline.h>
#include <linux/mmu_notifier.h>
#include <linux/oom.h>
@@ -196,7 +197,7 @@ again:
flush_cache_page(vma, addr, pte_pfn(*ptep));
anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
if (anon_exclusive) {
- ptep_clear_flush(vma, addr, ptep);
+ pte = ptep_clear_flush(vma, addr, ptep);
if (page_try_share_anon_rmap(page)) {
set_pte_at(mm, addr, ptep, pte);
@@ -206,11 +207,15 @@ again:
goto next;
}
} else {
- ptep_get_and_clear(mm, addr, ptep);
+ pte = ptep_get_and_clear(mm, addr, ptep);
}
migrate->cpages++;
+ /* Set the dirty flag on the folio now the pte is gone. */
+ if (pte_dirty(pte))
+ folio_mark_dirty(page_folio(page));
+
/* Setup special migration page table entry */
if (mpfn & MIGRATE_PFN_WRITE)
entry = make_writable_migration_entry(
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
mm-gupc-simplify-and-fix-check_and_migrate_movable_pages-return-codes.patch
selftests-hmm-tests-add-test-for-dirty-bits.patch
mm-gupc-dont-pass-gup_flags-to-check_and_migrate_movable_pages.patch
mm-gupc-refactor-check_and_migrate_movable_pages.patch
mm-migrate_devicec-fix-a-misleading-and-out-dated-comment.patch
The quilt patch titled
Subject: mm/migrate_device.c: add missing flush_cache_page()
has been removed from the -mm tree. Its filename was
mm-migrate_devicec-add-missing-flush_cache_page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm/migrate_device.c: add missing flush_cache_page()
Date: Fri, 2 Sep 2022 10:35:52 +1000
Currently we only call flush_cache_page() for the anon_exclusive case,
however in both cases we clear the pte so should flush the cache.
Link: https://lkml.kernel.org/r/5676f30436ab71d1a587ac73f835ed8bd2113ff5.16620785…
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Alex Sierra <alex.sierra(a)amd.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
Cc: huang ying <huang.ying.caritas(a)gmail.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Paul Mackerras <paulus(a)ozlabs.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/migrate_device.c~mm-migrate_devicec-add-missing-flush_cache_page
+++ a/mm/migrate_device.c
@@ -193,9 +193,9 @@ again:
bool anon_exclusive;
pte_t swp_pte;
+ flush_cache_page(vma, addr, pte_pfn(*ptep));
anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
if (anon_exclusive) {
- flush_cache_page(vma, addr, pte_pfn(*ptep));
ptep_clear_flush(vma, addr, ptep);
if (page_try_share_anon_rmap(page)) {
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
mm-gupc-simplify-and-fix-check_and_migrate_movable_pages-return-codes.patch
selftests-hmm-tests-add-test-for-dirty-bits.patch
mm-gupc-dont-pass-gup_flags-to-check_and_migrate_movable_pages.patch
mm-gupc-refactor-check_and_migrate_movable_pages.patch
mm-migrate_devicec-fix-a-misleading-and-out-dated-comment.patch
The quilt patch titled
Subject: mm/migrate_device.c: flush TLB while holding PTL
has been removed from the -mm tree. Its filename was
mm-migrate_devicec-flush-tlb-while-holding-ptl.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm/migrate_device.c: flush TLB while holding PTL
Date: Fri, 2 Sep 2022 10:35:51 +1000
When clearing a PTE the TLB should be flushed whilst still holding the PTL
to avoid a potential race with madvise/munmap/etc. For example consider
the following sequence:
CPU0 CPU1
---- ----
migrate_vma_collect_pmd()
pte_unmap_unlock()
madvise(MADV_DONTNEED)
-> zap_pte_range()
pte_offset_map_lock()
[ PTE not present, TLB not flushed ]
pte_unmap_unlock()
[ page is still accessible via stale TLB ]
flush_tlb_range()
In this case the page may still be accessed via the stale TLB entry after
madvise returns. Fix this by flushing the TLB while holding the PTL.
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Link: https://lkml.kernel.org/r/9f801e9d8d830408f2ca27821f606e09aa856899.16620785…
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reported-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Alex Sierra <alex.sierra(a)amd.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
Cc: huang ying <huang.ying.caritas(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Paul Mackerras <paulus(a)ozlabs.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate_device.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/migrate_device.c~mm-migrate_devicec-flush-tlb-while-holding-ptl
+++ a/mm/migrate_device.c
@@ -254,13 +254,14 @@ next:
migrate->dst[migrate->npages] = 0;
migrate->src[migrate->npages++] = mpfn;
}
- arch_leave_lazy_mmu_mode();
- pte_unmap_unlock(ptep - 1, ptl);
/* Only flush the TLB if we actually modified any entries */
if (unmapped)
flush_tlb_range(walk->vma, start, end);
+ arch_leave_lazy_mmu_mode();
+ pte_unmap_unlock(ptep - 1, ptl);
+
return 0;
}
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
mm-gupc-simplify-and-fix-check_and_migrate_movable_pages-return-codes.patch
selftests-hmm-tests-add-test-for-dirty-bits.patch
mm-gupc-dont-pass-gup_flags-to-check_and_migrate_movable_pages.patch
mm-gupc-refactor-check_and_migrate_movable_pages.patch
mm-migrate_devicec-fix-a-misleading-and-out-dated-comment.patch
The quilt patch titled
Subject: mm/page_alloc: fix race condition between build_all_zonelists and page allocation
has been removed from the -mm tree. Its filename was
mm-page_alloc-fix-race-condition-between-build_all_zonelists-and-page-allocation.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm/page_alloc: fix race condition between build_all_zonelists and page allocation
Date: Wed, 24 Aug 2022 12:14:50 +0100
Patrick Daly reported the following problem;
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - before offline operation
[0] - ZONE_MOVABLE
[1] - ZONE_NORMAL
[2] - NULL
For a GFP_KERNEL allocation, alloc_pages_slowpath() will save the
offset of ZONE_NORMAL in ac->preferred_zoneref. If a concurrent
memory_offline operation removes the last page from ZONE_MOVABLE,
build_all_zonelists() & build_zonerefs_node() will update
node_zonelists as shown below. Only populated zones are added.
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - after offline operation
[0] - ZONE_NORMAL
[1] - NULL
[2] - NULL
The race is simple -- page allocation could be in progress when a memory
hot-remove operation triggers a zonelist rebuild that removes zones. The
allocation request will still have a valid ac->preferred_zoneref that is
now pointing to NULL and triggers an OOM kill.
This problem probably always existed but may be slightly easier to trigger
due to 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
with pages managed by the buddy allocator") which distinguishes between
zones that are completely unpopulated versus zones that have valid pages
not managed by the buddy allocator (e.g. reserved, memblock, ballooning
etc). Memory hotplug had multiple stages with timing considerations
around managed/present page updates, the zonelist rebuild and the zone
span updates. As David Hildenbrand puts it
memory offlining adjusts managed+present pages of the zone
essentially in one go. If after the adjustments, the zone is no
longer populated (present==0), we rebuild the zone lists.
Once that's done, we try shrinking the zone (start+spanned
pages) -- which results in zone_start_pfn == 0 if there are no
more pages. That happens *after* rebuilding the zonelists via
remove_pfn_range_from_zone().
The only requirement to fix the race is that a page allocation request
identifies when a zonelist rebuild has happened since the allocation
request started and no page has yet been allocated. Use a seqlock_t to
track zonelist updates with a lockless read-side of the zonelist and
protecting the rebuild and update of the counter with a spinlock.
[akpm(a)linux-foundation.org: make zonelist_update_seq static]
Link: https://lkml.kernel.org/r/20220824110900.vh674ltxmzb3proq@techsingularity.n…
Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Reported-by: Patrick Daly <quic_pdaly(a)quicinc.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 53 +++++++++++++++++++++++++++++++++++++---------
1 file changed, 43 insertions(+), 10 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-fix-race-condition-between-build_all_zonelists-and-page-allocation
+++ a/mm/page_alloc.c
@@ -4708,6 +4708,30 @@ void fs_reclaim_release(gfp_t gfp_mask)
EXPORT_SYMBOL_GPL(fs_reclaim_release);
#endif
+/*
+ * Zonelists may change due to hotplug during allocation. Detect when zonelists
+ * have been rebuilt so allocation retries. Reader side does not lock and
+ * retries the allocation if zonelist changes. Writer side is protected by the
+ * embedded spin_lock.
+ */
+static DEFINE_SEQLOCK(zonelist_update_seq);
+
+static unsigned int zonelist_iter_begin(void)
+{
+ if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+ return read_seqbegin(&zonelist_update_seq);
+
+ return 0;
+}
+
+static unsigned int check_retry_zonelist(unsigned int seq)
+{
+ if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+ return read_seqretry(&zonelist_update_seq, seq);
+
+ return seq;
+}
+
/* Perform direct synchronous page reclaim */
static unsigned long
__perform_reclaim(gfp_t gfp_mask, unsigned int order,
@@ -5001,6 +5025,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u
int compaction_retries;
int no_progress_loops;
unsigned int cpuset_mems_cookie;
+ unsigned int zonelist_iter_cookie;
int reserve_flags;
/*
@@ -5011,11 +5036,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u
(__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
gfp_mask &= ~__GFP_ATOMIC;
-retry_cpuset:
+restart:
compaction_retries = 0;
no_progress_loops = 0;
compact_priority = DEF_COMPACT_PRIORITY;
cpuset_mems_cookie = read_mems_allowed_begin();
+ zonelist_iter_cookie = zonelist_iter_begin();
/*
* The fast path uses conservative alloc_flags to succeed only until
@@ -5187,9 +5213,13 @@ retry:
goto retry;
- /* Deal with possible cpuset update races before we start OOM killing */
- if (check_retry_cpuset(cpuset_mems_cookie, ac))
- goto retry_cpuset;
+ /*
+ * Deal with possible cpuset update races or zonelist updates to avoid
+ * a unnecessary OOM kill.
+ */
+ if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+ check_retry_zonelist(zonelist_iter_cookie))
+ goto restart;
/* Reclaim has failed us, start killing things */
page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
@@ -5209,9 +5239,13 @@ retry:
}
nopage:
- /* Deal with possible cpuset update races before we fail */
- if (check_retry_cpuset(cpuset_mems_cookie, ac))
- goto retry_cpuset;
+ /*
+ * Deal with possible cpuset update races or zonelist updates to avoid
+ * a unnecessary OOM kill.
+ */
+ if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+ check_retry_zonelist(zonelist_iter_cookie))
+ goto restart;
/*
* Make sure that __GFP_NOFAIL request doesn't leak out and make sure
@@ -6514,9 +6548,8 @@ static void __build_all_zonelists(void *
int nid;
int __maybe_unused cpu;
pg_data_t *self = data;
- static DEFINE_SPINLOCK(lock);
- spin_lock(&lock);
+ write_seqlock(&zonelist_update_seq);
#ifdef CONFIG_NUMA
memset(node_load, 0, sizeof(node_load));
@@ -6553,7 +6586,7 @@ static void __build_all_zonelists(void *
#endif
}
- spin_unlock(&lock);
+ write_sequnlock(&zonelist_update_seq);
}
static noinline void __init
_
Patches currently in -mm which might be from mgorman(a)techsingularity.net are
--
Greetings,
The Board Directors believe you are in good health, doing great and
with the hope that this mail will meet you in good condition, We are
privileged and delighted to reach you via email" And we are urgently
waiting to hear from you. and again your number is not connecting.
My regards,
Mrs. Mimi Aminu.
Sincerely,
Dr. Kim Chang Jung
The stable patch merging tools failed to automatically merge the
io_uring/LSM CMD passthrough controls into the stable v5.19.y branch,
so I'm doing the backport manually and submitting them directly to
stable for the next v5.19.y release. The backport is necessary due
to the reorg/decomposition of the io_uring code in io_uring/ during
the v5.19->v6.0 merge window. Other than the differences in the
filenames under io_uring, the code changes are pretty much the same.
I've done some basic sanity testing this afternoon with these
patches and everything looks good to me.
If you would prefer to pull these directly from a git tree instead
of email, they are available via the LSM tree on the stable-5.19
branch, using the lsm-pr-20220906 tag.
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm.git
lsm-pr-20220906
---
Paul Moore (3):
lsm,io_uring: add LSM hooks for the new uring_cmd file op
selinux: implement the security_uring_cmd() LSM hook
Smack: Provide read control for io_uring_cmd
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 3 +++
include/linux/security.h | 5 +++++
io_uring/io_uring.c | 4 ++++
security/security.c | 4 ++++
security/selinux/hooks.c | 24 ++++++++++++++++++++++
security/selinux/include/classmap.h | 2 +-
security/smack/smack_lsm.c | 32 +++++++++++++++++++++++++++++
8 files changed, 74 insertions(+), 1 deletion(-)
--
Dear Friend,
It’s just my urgent need for foreign partner that made me to contact
you via your email. The details will send to you as soon as i heard from
you.
Mrs. Susan Lee Yu-Chen
Hi Greg and Sasha,
Please apply commit 5c5c2baad2b5 ("ASoC: mchp-spdiftx: Fix clang
-Wbitfield-constant-conversion") to 5.10 and newer, as it fixes a
warning with tip of tree LLVM. There will be a minor conflict that can
be resolved by taking commit 403fcb5118a0 ("ASoC: mchp-spdiftx: remove
references to mchp_i2s_caps") before it, which seems reasonable to me.
If that is not acceptable, I can send a backport with just 5c5c2baad2b5.
Cheers,
Nathan
Hello Dear God,s Select Good Day,
I apologized, If this mail find's you disturbing, It might not be the
best way to approach you as we have not met before, but due to the
urgency of my present situation i decided to communicate this way, so
please pardon my manna, I am writing this mail to you with heavy tears
In my eyes and great sorrow in my heart, My Name is Mrs.Juliette
Morgan, and I am contacting you from my country Norway, I want to tell
you this because I don't have any other option than to tell you as I
was touched to open up to you,
I married to Mr.sami Morgan. Who worked with Norway embassy in Burkina
Faso for nine years before he died in the year 2020.We were married
for eleven years without a child He died after a brief illness that
lasted for only five days. Since his death I decided not to remarry,
When my late husband was alive he deposited the sum of € 8.5 Million
Euro (Eight million, Five hundred thousand Euros) in a bank in
Ouagadougou the capital city of Burkina Faso in west Africa Presently
this money is still in bank. He made this money available for
exportation of Gold from Burkina Faso mining.
Recently, My Doctor told me that I would not last for the period of
seven months due to cancer problem. The one that disturbs me most is
my stroke sickness.Having known my condition I decided to hand you
over this money to take care of the less-privileged people, you will
utilize this money the way I am going to instruct herein.
I want you to take 30 Percent of the total money for your personal use
While 70% of the money will go to charity, people in the street and
helping the orphanage. I grew up as an Orphan and I don't have any
body as my family member, just to endeavour that the house of God is
maintained. Am doing this so that God will forgive my sins and accept
my soul because these sicknesses have suffered me so much.
As soon as I receive your reply I shall give you the contact of the
bank in Burkina Faso and I will also instruct the Bank Manager to
issue you an authority letter that will prove you the present
beneficiary of the money in the bank that is if you assure me that you
will act accordingly as I Stated herein.
Always reply to my alternative for security purposes
Hoping to receive your reply:
From Mrs.Juliette Morgan,
From: Rob Clark <robdclark(a)chromium.org>
[ Upstream commit 174974d8463b77c2b4065e98513adb204e64de7d ]
If the previous thing cat'ing $debugfs/rd left the FIFO full, then
subsequent open could deadlock in rd_write() (because open is blocked,
not giving a chance for read() to consume any data in the FIFO). Also
it is generally a good idea to clear out old data from the FIFO.
Signed-off-by: Rob Clark <robdclark(a)chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496706/
Link: https://lore.kernel.org/r/20220807160901.2353471-2-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/msm/msm_rd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
index 4823019eb422b..a8a04d8c5ca62 100644
--- a/drivers/gpu/drm/msm/msm_rd.c
+++ b/drivers/gpu/drm/msm/msm_rd.c
@@ -188,6 +188,9 @@ static int rd_open(struct inode *inode, struct file *file)
file->private_data = rd;
rd->open = true;
+ /* Reset fifo to clear any previously unread data: */
+ rd->fifo.head = rd->fifo.tail = 0;
+
/* the parsing tools need to know gpu-id to know which
* register database to load.
*/
--
2.35.1
From: Rob Clark <robdclark(a)chromium.org>
[ Upstream commit 174974d8463b77c2b4065e98513adb204e64de7d ]
If the previous thing cat'ing $debugfs/rd left the FIFO full, then
subsequent open could deadlock in rd_write() (because open is blocked,
not giving a chance for read() to consume any data in the FIFO). Also
it is generally a good idea to clear out old data from the FIFO.
Signed-off-by: Rob Clark <robdclark(a)chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496706/
Link: https://lore.kernel.org/r/20220807160901.2353471-2-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/msm/msm_rd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
index bdce1c9434c6c..54891cbf4f50f 100644
--- a/drivers/gpu/drm/msm/msm_rd.c
+++ b/drivers/gpu/drm/msm/msm_rd.c
@@ -193,6 +193,9 @@ static int rd_open(struct inode *inode, struct file *file)
file->private_data = rd;
rd->open = true;
+ /* Reset fifo to clear any previously unread data: */
+ rd->fifo.head = rd->fifo.tail = 0;
+
/* the parsing tools need to know gpu-id to know which
* register database to load.
*/
--
2.35.1
From: Rob Clark <robdclark(a)chromium.org>
[ Upstream commit 174974d8463b77c2b4065e98513adb204e64de7d ]
If the previous thing cat'ing $debugfs/rd left the FIFO full, then
subsequent open could deadlock in rd_write() (because open is blocked,
not giving a chance for read() to consume any data in the FIFO). Also
it is generally a good idea to clear out old data from the FIFO.
Signed-off-by: Rob Clark <robdclark(a)chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496706/
Link: https://lore.kernel.org/r/20220807160901.2353471-2-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/msm/msm_rd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
index d4cc5ceb22d01..bb65aab49c214 100644
--- a/drivers/gpu/drm/msm/msm_rd.c
+++ b/drivers/gpu/drm/msm/msm_rd.c
@@ -199,6 +199,9 @@ static int rd_open(struct inode *inode, struct file *file)
file->private_data = rd;
rd->open = true;
+ /* Reset fifo to clear any previously unread data: */
+ rd->fifo.head = rd->fifo.tail = 0;
+
/* the parsing tools need to know gpu-id to know which
* register database to load.
*/
--
2.35.1
From: Rob Clark <robdclark(a)chromium.org>
[ Upstream commit 174974d8463b77c2b4065e98513adb204e64de7d ]
If the previous thing cat'ing $debugfs/rd left the FIFO full, then
subsequent open could deadlock in rd_write() (because open is blocked,
not giving a chance for read() to consume any data in the FIFO). Also
it is generally a good idea to clear out old data from the FIFO.
Signed-off-by: Rob Clark <robdclark(a)chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496706/
Link: https://lore.kernel.org/r/20220807160901.2353471-2-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/msm/msm_rd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
index c7832a951039f..a6b024b06b363 100644
--- a/drivers/gpu/drm/msm/msm_rd.c
+++ b/drivers/gpu/drm/msm/msm_rd.c
@@ -191,6 +191,9 @@ static int rd_open(struct inode *inode, struct file *file)
file->private_data = rd;
rd->open = true;
+ /* Reset fifo to clear any previously unread data: */
+ rd->fifo.head = rd->fifo.tail = 0;
+
/* the parsing tools need to know gpu-id to know which
* register database to load.
*/
--
2.35.1
From: SeongJae Park <sj(a)kernel.org>
The advertisement of the persistent grants feature (writing
'feature-persistent' to xenbus) should mean not the decision for using
the feature but only the availability of the feature. However, commit
aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent
grants") made a field of blkback, which was a place for saving only the
negotiation result, to be used for yet another purpose: caching of the
'feature_persistent' parameter value. As a result, the advertisement,
which should follow only the parameter value, becomes inconsistent.
This commit fixes the misuse of the semantic by making blkback saves the
parameter value in a separate place and advertises the support based on
only the saved value.
Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants")
Cc: <stable(a)vger.kernel.org> # 5.10.x
Suggested-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-2-sj@kernel.org
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
drivers/block/xen-blkback/common.h | 3 +++
drivers/block/xen-blkback/xenbus.c | 6 ++++--
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index bda5c815e441..a28473470e66 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -226,6 +226,9 @@ struct xen_vbd {
sector_t size;
unsigned int flush_support:1;
unsigned int discard_secure:1;
+ /* Connect-time cached feature_persistent parameter value */
+ unsigned int feature_gnt_persistent_parm:1;
+ /* Persistent grants feature negotiation result */
unsigned int feature_gnt_persistent:1;
unsigned int overflow_max_grants:1;
};
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index ee7ad2fb432d..c0227dfa4688 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -907,7 +907,7 @@ static void connect(struct backend_info *be)
xen_blkbk_barrier(xbt, be, be->blkif->vbd.flush_support);
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
- be->blkif->vbd.feature_gnt_persistent);
+ be->blkif->vbd.feature_gnt_persistent_parm);
if (err) {
xenbus_dev_fatal(dev, err, "writing %s/feature-persistent",
dev->nodename);
@@ -1085,7 +1085,9 @@ static int connect_ring(struct backend_info *be)
return -ENOSYS;
}
- blkif->vbd.feature_gnt_persistent = feature_persistent &&
+ blkif->vbd.feature_gnt_persistent_parm = feature_persistent;
+ blkif->vbd.feature_gnt_persistent =
+ blkif->vbd.feature_gnt_persistent_parm &&
xenbus_read_unsigned(dev->otherend, "feature-persistent", 0);
blkif->vbd.overflow_max_grants = 0;
--
2.34.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
672d6ca75865 ("drm/i915: Implement WaEdpLinkRateDataReload")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 672d6ca758651f0ec12cd0d59787067a5bde1c96 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala(a)linux.intel.com>
Date: Fri, 2 Sep 2022 10:03:18 +0300
Subject: [PATCH] drm/i915: Implement WaEdpLinkRateDataReload
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
A lot of modern laptops use the Parade PS8461E MUX for eDP
switching. The MUX can operate in jitter cleaning mode or
redriver mode, the first one resulting in higher link
quality. The jitter cleaning mode needs to know the link
rate used and the MUX achieves this by snooping the
LINK_BW_SET, LINK_RATE_SELECT and SUPPORTED_LINK_RATES
DPCD accesses.
When the MUX is powered down (seems this can happen whenever
the display is turned off) it loses track of the snooped
link rates so when we do the LINK_RATE_SELECT write it no
longer knowns which link rate we're selecting, and thus it
falls back to the lower quality redriver mode. This results
in unstable high link rates (eg. usually 8.1Gbps link rate
no longer works correctly).
In order to avoid all that let's re-snoop SUPPORTED_LINK_RATES
from the sink at the start of every link training.
Unfortunately we don't have a way to detect the presence of
the MUX. It looks like the set of laptops equipped with this
MUX is fairly large and contains devices from multiple
manufacturers. It may also still be growing with new models.
So a quirk doesn't seem like a very easily maintainable
option, thus we shall attempt to do this unconditionally on
all machines that use LINK_RATE_SELECT. Hopefully this extra
DPCD read doesn't cause issues for any unaffected machine.
If that turns out to be the case we'll need to convert this
into a quirk in the future.
Cc: stable(a)vger.kernel.org
Cc: Jason A. Donenfeld <Jason(a)zx2c4.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal(a)intel.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6205
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220902070319.15395-1-ville.…
Tested-by: Aaron Ma <aaron.ma(a)canonical.com>
Tested-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
Reviewed-by: Jani Nikula <jani.nikula(a)intel.com>
(cherry picked from commit 25899c590cb5ba9b9f284c6ca8e7e9086793d641)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 9feaf1a589f3..d213d8ad1ea5 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -671,6 +671,28 @@ intel_dp_prepare_link_train(struct intel_dp *intel_dp,
intel_dp_compute_rate(intel_dp, crtc_state->port_clock,
&link_bw, &rate_select);
+ /*
+ * WaEdpLinkRateDataReload
+ *
+ * Parade PS8461E MUX (used on varius TGL+ laptops) needs
+ * to snoop the link rates reported by the sink when we
+ * use LINK_RATE_SET in order to operate in jitter cleaning
+ * mode (as opposed to redriver mode). Unfortunately it
+ * loses track of the snooped link rates when powered down,
+ * so we need to make it re-snoop often. Without this high
+ * link rates are not stable.
+ */
+ if (!link_bw) {
+ struct intel_connector *connector = intel_dp->attached_connector;
+ __le16 sink_rates[DP_MAX_SUPPORTED_RATES];
+
+ drm_dbg_kms(&i915->drm, "[CONNECTOR:%d:%s] Reloading eDP link rates\n",
+ connector->base.base.id, connector->base.name);
+
+ drm_dp_dpcd_read(&intel_dp->aux, DP_SUPPORTED_LINK_RATES,
+ sink_rates, sizeof(sink_rates));
+ }
+
if (link_bw)
drm_dbg_kms(&i915->drm,
"[ENCODER:%d:%s] Using LINK_BW_SET value %02x\n",
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/base/dd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index fc27fab62f50..a7bcbb99e820 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -632,6 +632,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
@@ -774,6 +779,11 @@ static int __driver_attach(struct device *dev, void *data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
--
2.37.2.789.g6183377224-goog
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
e1cab970574c ("drm/i915/slpc: Let's fix the PCODE min freq table setup for SLPC")
4dd4375bc4ff ("drm/i915: split out intel_pcode.[ch] to separate file")
1eecf31e3c96 ("drm/i915: split out vlv sideband to a separate file")
5f5ada0bae45 ("drm/i915: De-wrapper bxt_ddi_phy_set_signal_levels()")
193299ad9d85 ("drm/i915: Nuke useless .set_signal_levels() wrappers")
e722ab8b6968 ("drm/i915: Generalize .set_signal_levels()")
5bafd85dd770 ("drm/i915: Introduce has_buf_trans_select()")
f820693bc238 ("drm/i915: Introduce has_iboost()")
80e77e30a212 ("drm/i915/dpll: move dpll modeset asserts to intel_dpll.c")
aa0813b1ba31 ("drm/i915/pps: move pps (panel) modeset asserts to intel_pps.c")
e04a911f4366 ("drm/i915/fdi: move fdi modeset asserts to intel_fdi.c")
e505d76404b1 ("drm/i915: s/ddi_translations/trans/")
4360a2b54fd7 ("drm/i915/display: add intel_fdi_link_train wrapper.")
8c66081b0b32 ("drm/i915: s/pipe/transcoder/ when dealing with PIPECONF/TRANSCONF")
a338847abc8e ("drm/i915: Call {vlv,chv}_prepare_pll() from {vlv,chv}_enable_pll()")
510e890e8222 ("drm/i915: Remove the 'reg' local variable")
8a3b3df39757 ("drm/i915: Clean up variable names in old dpll functions")
6205372b4b6d ("drm/i915: Clean dpll calling convention")
24951b5813c1 ("drm/i915: Constify struct dpll all over")
b294425e9091 ("drm/i915: Extract ilk_update_pll_dividers()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e1cab970574c001d83e59ca8388c474a57a1afb6 Mon Sep 17 00:00:00 2001
From: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Date: Wed, 31 Aug 2022 17:45:38 -0400
Subject: [PATCH] drm/i915/slpc: Let's fix the PCODE min freq table setup for
SLPC
We need to inform PCODE of a desired ring frequencies so PCODE update
the memory frequencies to us. rps->min_freq and rps->max_freq are the
frequencies used in that request. However they were unset when SLPC was
enabled and PCODE never updated the memory freq.
v2 (as Suggested by Ashutosh): if SLPC is in use, let's pick the right
frequencies from the get_ia_constants instead of the fake init of
rps' min and max.
v3: don't forget the max <= min return
v4: Move all the freq conversion to intel_rps.c. And the max <= min
check to where it belongs.
v5: (Ashutosh) Fix old comment s/50 HZ/50 MHz and add a doc explaining
the "raw format"
Fixes: 7ba79a671568 ("drm/i915/guc/slpc: Gate Host RPS when SLPC is enabled")
Cc: <stable(a)vger.kernel.org> # v5.15+
Cc: Ashutosh Dixit <ashutosh.dixit(a)intel.com>
Tested-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220831214538.143950-1-rodri…
(cherry picked from commit 018a7bdbb090b9155a6509a0d1a684db4afaa5b1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_llc.c b/drivers/gpu/drm/i915/gt/intel_llc.c
index 14fe65812e42..1d19c073ba2e 100644
--- a/drivers/gpu/drm/i915/gt/intel_llc.c
+++ b/drivers/gpu/drm/i915/gt/intel_llc.c
@@ -12,6 +12,7 @@
#include "intel_llc.h"
#include "intel_mchbar_regs.h"
#include "intel_pcode.h"
+#include "intel_rps.h"
struct ia_constants {
unsigned int min_gpu_freq;
@@ -55,9 +56,6 @@ static bool get_ia_constants(struct intel_llc *llc,
if (!HAS_LLC(i915) || IS_DGFX(i915))
return false;
- if (rps->max_freq <= rps->min_freq)
- return false;
-
consts->max_ia_freq = cpu_max_MHz();
consts->min_ring_freq =
@@ -65,13 +63,8 @@ static bool get_ia_constants(struct intel_llc *llc,
/* convert DDR frequency from units of 266.6MHz to bandwidth */
consts->min_ring_freq = mult_frac(consts->min_ring_freq, 8, 3);
- consts->min_gpu_freq = rps->min_freq;
- consts->max_gpu_freq = rps->max_freq;
- if (GRAPHICS_VER(i915) >= 9) {
- /* Convert GT frequency to 50 HZ units */
- consts->min_gpu_freq /= GEN9_FREQ_SCALER;
- consts->max_gpu_freq /= GEN9_FREQ_SCALER;
- }
+ consts->min_gpu_freq = intel_rps_get_min_raw_freq(rps);
+ consts->max_gpu_freq = intel_rps_get_max_raw_freq(rps);
return true;
}
@@ -130,6 +123,12 @@ static void gen6_update_ring_freq(struct intel_llc *llc)
if (!get_ia_constants(llc, &consts))
return;
+ /*
+ * Although this is unlikely on any platform during initialization,
+ * let's ensure we don't get accidentally into infinite loop
+ */
+ if (consts.max_gpu_freq <= consts.min_gpu_freq)
+ return;
/*
* For each potential GPU frequency, load a ring frequency we'd like
* to use for memory access. We do this by specifying the IA frequency
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index fb3f57ee450b..7bb967034679 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -2126,6 +2126,31 @@ u32 intel_rps_get_max_frequency(struct intel_rps *rps)
return intel_gpu_freq(rps, rps->max_freq_softlimit);
}
+/**
+ * intel_rps_get_max_raw_freq - returns the max frequency in some raw format.
+ * @rps: the intel_rps structure
+ *
+ * Returns the max frequency in a raw format. In newer platforms raw is in
+ * units of 50 MHz.
+ */
+u32 intel_rps_get_max_raw_freq(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+ u32 freq;
+
+ if (rps_uses_slpc(rps)) {
+ return DIV_ROUND_CLOSEST(slpc->rp0_freq,
+ GT_FREQUENCY_MULTIPLIER);
+ } else {
+ freq = rps->max_freq;
+ if (GRAPHICS_VER(rps_to_i915(rps)) >= 9) {
+ /* Convert GT frequency to 50 MHz units */
+ freq /= GEN9_FREQ_SCALER;
+ }
+ return freq;
+ }
+}
+
u32 intel_rps_get_rp0_frequency(struct intel_rps *rps)
{
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
@@ -2214,6 +2239,31 @@ u32 intel_rps_get_min_frequency(struct intel_rps *rps)
return intel_gpu_freq(rps, rps->min_freq_softlimit);
}
+/**
+ * intel_rps_get_min_raw_freq - returns the min frequency in some raw format.
+ * @rps: the intel_rps structure
+ *
+ * Returns the min frequency in a raw format. In newer platforms raw is in
+ * units of 50 MHz.
+ */
+u32 intel_rps_get_min_raw_freq(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+ u32 freq;
+
+ if (rps_uses_slpc(rps)) {
+ return DIV_ROUND_CLOSEST(slpc->min_freq,
+ GT_FREQUENCY_MULTIPLIER);
+ } else {
+ freq = rps->min_freq;
+ if (GRAPHICS_VER(rps_to_i915(rps)) >= 9) {
+ /* Convert GT frequency to 50 MHz units */
+ freq /= GEN9_FREQ_SCALER;
+ }
+ return freq;
+ }
+}
+
static int set_min_freq(struct intel_rps *rps, u32 val)
{
int ret = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
index 1e8d56491308..4509dfdc52e0 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -37,8 +37,10 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
u32 intel_rps_get_requested_frequency(struct intel_rps *rps);
u32 intel_rps_get_min_frequency(struct intel_rps *rps);
+u32 intel_rps_get_min_raw_freq(struct intel_rps *rps);
int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val);
u32 intel_rps_get_max_frequency(struct intel_rps *rps);
+u32 intel_rps_get_max_raw_freq(struct intel_rps *rps);
int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val);
u32 intel_rps_get_rp0_frequency(struct intel_rps *rps);
u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
4d9cb92ca41d ("io_uring/rw: fix short rw error handling")
f2ccb5aed7bc ("io_uring: make io_kiocb_to_cmd() typesafe")
14b146b688ad ("io_uring: notification completion optimisation")
6a9ce66f4d08 ("io_uring/net: make page accounting more consistent")
2e32ba5607ee ("io_uring/net: checks errors of zc mem accounting")
3ff1a0d395c0 ("io_uring: enable managed frags with register buffers")
492dddb4f6e3 ("io_uring: add zc notification flush requests")
4379d5f15b3f ("io_uring: rename IORING_OP_FILES_UPDATE")
63809137ebb5 ("io_uring: flush notifiers after sendzc")
10c7d33ecd51 ("io_uring: sendzc with fixed buffers")
e29e3bd4b968 ("io_uring: account locked pages for non-fixed zc")
06a5464be84e ("io_uring: wire send zc request type")
bc24d6bd32df ("io_uring: add notification slot registration")
68ef5578efc8 ("io_uring: add rsrc referencing for notifiers")
e58d498e81ba ("io_uring: complete notifiers in tw")
eb4a299b2f95 ("io_uring: cache struct io_notif")
eb42cebb2cf2 ("io_uring: add zc notification infrastructure")
e70cb60893ca ("io_uring: export io_put_task()")
72c531f8ef30 ("net: copy from user before calling __get_compat_msghdr")
7fa875b8e53c ("net: copy from user before calling __copy_msghdr")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d9cb92ca41dd8e905a4569ceba4716c2f39c75a Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Fri, 9 Sep 2022 12:11:49 +0100
Subject: [PATCH] io_uring/rw: fix short rw error handling
We have a couple of problems, first reports of unexpected link breakage
for reads when cqe->res indicates that the IO was done in full. The
reason here is partial IO with retries.
TL;DR; we compare the result in __io_complete_rw_common() against
req->cqe.res, but req->cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -> __io_complete_rw_common(), it fails.
The second problem is that we don't try to correct res in
io_complete_rw(), which, for instance, might be a problem for O_DIRECT
but when a prefix of data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().
The fix here is to leave __io_complete_rw_common() alone, always pass
not corrected result into it and fix it up as the last step just before
actually finishing the I/O.
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://github.com/axboe/liburing/issues/643
Reported-by: Beld Zhang <beldzhang(a)gmail.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1babd77da79c..1e18a44adcf5 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -206,6 +206,20 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
return false;
}
+static inline unsigned io_fixup_rw_res(struct io_kiocb *req, unsigned res)
+{
+ struct io_async_rw *io = req->async_data;
+
+ /* add previously done IO, if any */
+ if (req_has_async_data(req) && io->bytes_done > 0) {
+ if (res < 0)
+ res = io->bytes_done;
+ else
+ res += io->bytes_done;
+ }
+ return res;
+}
+
static void io_complete_rw(struct kiocb *kiocb, long res)
{
struct io_rw *rw = container_of(kiocb, struct io_rw, kiocb);
@@ -213,7 +227,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
if (__io_complete_rw_common(req, res))
return;
- io_req_set_res(req, res, 0);
+ io_req_set_res(req, io_fixup_rw_res(req, res), 0);
req->io_task_work.func = io_req_task_complete;
io_req_task_work_add(req);
}
@@ -240,22 +254,14 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
static int kiocb_done(struct io_kiocb *req, ssize_t ret,
unsigned int issue_flags)
{
- struct io_async_rw *io = req->async_data;
struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-
- /* add previously done IO, if any */
- if (req_has_async_data(req) && io->bytes_done > 0) {
- if (ret < 0)
- ret = io->bytes_done;
- else
- ret += io->bytes_done;
- }
+ unsigned final_ret = io_fixup_rw_res(req, ret);
if (req->flags & REQ_F_CUR_POS)
req->file->f_pos = rw->kiocb.ki_pos;
if (ret >= 0 && (rw->kiocb.ki_complete == io_complete_rw)) {
if (!__io_complete_rw_common(req, ret)) {
- io_req_set_res(req, req->cqe.res,
+ io_req_set_res(req, final_ret,
io_put_kbuf(req, issue_flags));
return IOU_OK;
}
@@ -268,7 +274,7 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret,
if (io_resubmit_prep(req))
io_req_task_queue_reissue(req);
else
- io_req_task_queue_fail(req, ret);
+ io_req_task_queue_fail(req, final_ret);
}
return IOU_ISSUE_SKIP_COMPLETE;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
4d9cb92ca41d ("io_uring/rw: fix short rw error handling")
f2ccb5aed7bc ("io_uring: make io_kiocb_to_cmd() typesafe")
14b146b688ad ("io_uring: notification completion optimisation")
6a9ce66f4d08 ("io_uring/net: make page accounting more consistent")
2e32ba5607ee ("io_uring/net: checks errors of zc mem accounting")
3ff1a0d395c0 ("io_uring: enable managed frags with register buffers")
492dddb4f6e3 ("io_uring: add zc notification flush requests")
4379d5f15b3f ("io_uring: rename IORING_OP_FILES_UPDATE")
63809137ebb5 ("io_uring: flush notifiers after sendzc")
10c7d33ecd51 ("io_uring: sendzc with fixed buffers")
e29e3bd4b968 ("io_uring: account locked pages for non-fixed zc")
06a5464be84e ("io_uring: wire send zc request type")
bc24d6bd32df ("io_uring: add notification slot registration")
68ef5578efc8 ("io_uring: add rsrc referencing for notifiers")
e58d498e81ba ("io_uring: complete notifiers in tw")
eb4a299b2f95 ("io_uring: cache struct io_notif")
eb42cebb2cf2 ("io_uring: add zc notification infrastructure")
e70cb60893ca ("io_uring: export io_put_task()")
72c531f8ef30 ("net: copy from user before calling __get_compat_msghdr")
7fa875b8e53c ("net: copy from user before calling __copy_msghdr")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d9cb92ca41dd8e905a4569ceba4716c2f39c75a Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Fri, 9 Sep 2022 12:11:49 +0100
Subject: [PATCH] io_uring/rw: fix short rw error handling
We have a couple of problems, first reports of unexpected link breakage
for reads when cqe->res indicates that the IO was done in full. The
reason here is partial IO with retries.
TL;DR; we compare the result in __io_complete_rw_common() against
req->cqe.res, but req->cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -> __io_complete_rw_common(), it fails.
The second problem is that we don't try to correct res in
io_complete_rw(), which, for instance, might be a problem for O_DIRECT
but when a prefix of data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().
The fix here is to leave __io_complete_rw_common() alone, always pass
not corrected result into it and fix it up as the last step just before
actually finishing the I/O.
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://github.com/axboe/liburing/issues/643
Reported-by: Beld Zhang <beldzhang(a)gmail.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1babd77da79c..1e18a44adcf5 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -206,6 +206,20 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
return false;
}
+static inline unsigned io_fixup_rw_res(struct io_kiocb *req, unsigned res)
+{
+ struct io_async_rw *io = req->async_data;
+
+ /* add previously done IO, if any */
+ if (req_has_async_data(req) && io->bytes_done > 0) {
+ if (res < 0)
+ res = io->bytes_done;
+ else
+ res += io->bytes_done;
+ }
+ return res;
+}
+
static void io_complete_rw(struct kiocb *kiocb, long res)
{
struct io_rw *rw = container_of(kiocb, struct io_rw, kiocb);
@@ -213,7 +227,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
if (__io_complete_rw_common(req, res))
return;
- io_req_set_res(req, res, 0);
+ io_req_set_res(req, io_fixup_rw_res(req, res), 0);
req->io_task_work.func = io_req_task_complete;
io_req_task_work_add(req);
}
@@ -240,22 +254,14 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
static int kiocb_done(struct io_kiocb *req, ssize_t ret,
unsigned int issue_flags)
{
- struct io_async_rw *io = req->async_data;
struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-
- /* add previously done IO, if any */
- if (req_has_async_data(req) && io->bytes_done > 0) {
- if (ret < 0)
- ret = io->bytes_done;
- else
- ret += io->bytes_done;
- }
+ unsigned final_ret = io_fixup_rw_res(req, ret);
if (req->flags & REQ_F_CUR_POS)
req->file->f_pos = rw->kiocb.ki_pos;
if (ret >= 0 && (rw->kiocb.ki_complete == io_complete_rw)) {
if (!__io_complete_rw_common(req, ret)) {
- io_req_set_res(req, req->cqe.res,
+ io_req_set_res(req, final_ret,
io_put_kbuf(req, issue_flags));
return IOU_OK;
}
@@ -268,7 +274,7 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret,
if (io_resubmit_prep(req))
io_req_task_queue_reissue(req);
else
- io_req_task_queue_fail(req, ret);
+ io_req_task_queue_fail(req, final_ret);
}
return IOU_ISSUE_SKIP_COMPLETE;
}
The patch below does not apply to the 5.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
4d9cb92ca41d ("io_uring/rw: fix short rw error handling")
f2ccb5aed7bc ("io_uring: make io_kiocb_to_cmd() typesafe")
14b146b688ad ("io_uring: notification completion optimisation")
6a9ce66f4d08 ("io_uring/net: make page accounting more consistent")
2e32ba5607ee ("io_uring/net: checks errors of zc mem accounting")
3ff1a0d395c0 ("io_uring: enable managed frags with register buffers")
492dddb4f6e3 ("io_uring: add zc notification flush requests")
4379d5f15b3f ("io_uring: rename IORING_OP_FILES_UPDATE")
63809137ebb5 ("io_uring: flush notifiers after sendzc")
10c7d33ecd51 ("io_uring: sendzc with fixed buffers")
e29e3bd4b968 ("io_uring: account locked pages for non-fixed zc")
06a5464be84e ("io_uring: wire send zc request type")
bc24d6bd32df ("io_uring: add notification slot registration")
68ef5578efc8 ("io_uring: add rsrc referencing for notifiers")
e58d498e81ba ("io_uring: complete notifiers in tw")
eb4a299b2f95 ("io_uring: cache struct io_notif")
eb42cebb2cf2 ("io_uring: add zc notification infrastructure")
e70cb60893ca ("io_uring: export io_put_task()")
72c531f8ef30 ("net: copy from user before calling __get_compat_msghdr")
7fa875b8e53c ("net: copy from user before calling __copy_msghdr")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4d9cb92ca41dd8e905a4569ceba4716c2f39c75a Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Fri, 9 Sep 2022 12:11:49 +0100
Subject: [PATCH] io_uring/rw: fix short rw error handling
We have a couple of problems, first reports of unexpected link breakage
for reads when cqe->res indicates that the IO was done in full. The
reason here is partial IO with retries.
TL;DR; we compare the result in __io_complete_rw_common() against
req->cqe.res, but req->cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -> __io_complete_rw_common(), it fails.
The second problem is that we don't try to correct res in
io_complete_rw(), which, for instance, might be a problem for O_DIRECT
but when a prefix of data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().
The fix here is to leave __io_complete_rw_common() alone, always pass
not corrected result into it and fix it up as the last step just before
actually finishing the I/O.
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://github.com/axboe/liburing/issues/643
Reported-by: Beld Zhang <beldzhang(a)gmail.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1babd77da79c..1e18a44adcf5 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -206,6 +206,20 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
return false;
}
+static inline unsigned io_fixup_rw_res(struct io_kiocb *req, unsigned res)
+{
+ struct io_async_rw *io = req->async_data;
+
+ /* add previously done IO, if any */
+ if (req_has_async_data(req) && io->bytes_done > 0) {
+ if (res < 0)
+ res = io->bytes_done;
+ else
+ res += io->bytes_done;
+ }
+ return res;
+}
+
static void io_complete_rw(struct kiocb *kiocb, long res)
{
struct io_rw *rw = container_of(kiocb, struct io_rw, kiocb);
@@ -213,7 +227,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
if (__io_complete_rw_common(req, res))
return;
- io_req_set_res(req, res, 0);
+ io_req_set_res(req, io_fixup_rw_res(req, res), 0);
req->io_task_work.func = io_req_task_complete;
io_req_task_work_add(req);
}
@@ -240,22 +254,14 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
static int kiocb_done(struct io_kiocb *req, ssize_t ret,
unsigned int issue_flags)
{
- struct io_async_rw *io = req->async_data;
struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-
- /* add previously done IO, if any */
- if (req_has_async_data(req) && io->bytes_done > 0) {
- if (ret < 0)
- ret = io->bytes_done;
- else
- ret += io->bytes_done;
- }
+ unsigned final_ret = io_fixup_rw_res(req, ret);
if (req->flags & REQ_F_CUR_POS)
req->file->f_pos = rw->kiocb.ki_pos;
if (ret >= 0 && (rw->kiocb.ki_complete == io_complete_rw)) {
if (!__io_complete_rw_common(req, ret)) {
- io_req_set_res(req, req->cqe.res,
+ io_req_set_res(req, final_ret,
io_put_kbuf(req, issue_flags));
return IOU_OK;
}
@@ -268,7 +274,7 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret,
if (io_resubmit_prep(req))
io_req_task_queue_reissue(req);
else
- io_req_task_queue_fail(req, ret);
+ io_req_task_queue_fail(req, final_ret);
}
return IOU_ISSUE_SKIP_COMPLETE;
}
May I ask how you approach funding for your projects/business deals?
I'd love to discuss your current goals and see if we can help you or your clients get your desired funding for your businesses. We offer both secured and unsecured loans to investors
Let me know your thoughts.
Thanks,
Online Agent
Peter Melek
The patch titled
Subject: mm: bring back update_mmu_cache() to finish_fault()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-bring-back-update_mmu_cache-to-finish_fault.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Sergei Antonov <saproj(a)gmail.com>
Subject: mm: bring back update_mmu_cache() to finish_fault()
Date: Thu, 8 Sep 2022 23:48:09 +0300
Running this test program on ARMv4 a few times (sometimes just once)
reproduces the bug.
int main()
{
unsigned i;
char paragon[SIZE];
void* ptr;
memset(paragon, 0xAA, SIZE);
ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_ANON | MAP_SHARED, -1, 0);
if (ptr == MAP_FAILED) return 1;
printf("ptr = %p\n", ptr);
for (i=0;i<10000;i++){
memset(ptr, 0xAA, SIZE);
if (memcmp(ptr, paragon, SIZE)) {
printf("Unexpected bytes on iteration %u!!!\n", i);
break;
}
}
munmap(ptr, SIZE);
}
In the "ptr" buffer there appear runs of zero bytes which are aligned
by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to
filemap_map_pages() as well as finish_fault(). After the commit
finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug.
Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
closely reproduce the code of alloc_set_pte() function that existed before
the commit.
On many platforms update_mmu_cache() is nop:
x86, see arch/x86/include/asm/pgtable
ARMv6+, see arch/arm/include/asm/tlbflush.h
So, it seems, few users ran into this bug.
Link: https://lkml.kernel.org/r/20220908204809.2012451-1-saproj@gmail.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Sergei Antonov <saproj(a)gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/memory.c~mm-bring-back-update_mmu_cache-to-finish_fault
+++ a/mm/memory.c
@@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
- ret = 0;
+
/* Re-check under ptl */
- if (likely(!vmf_pte_changed(vmf)))
+ if (likely(!vmf_pte_changed(vmf))) {
do_set_pte(vmf, page, vmf->address);
- else
+
+ /* no need to invalidate: a not-present page won't be cached */
+ update_mmu_cache(vma, vmf->address, vmf->pte);
+
+ ret = 0;
+ } else {
+ update_mmu_tlb(vma, vmf->address, vmf->pte);
ret = VM_FAULT_NOPAGE;
+ }
- update_mmu_tlb(vma, vmf->address, vmf->pte);
pte_unmap_unlock(vmf->pte, vmf->ptl);
return ret;
}
_
Patches currently in -mm which might be from saproj(a)gmail.com are
mm-bring-back-update_mmu_cache-to-finish_fault.patch
The patch titled
Subject: frontswap: don't call ->init if no ops are registered
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
frontswap-dont-call-init-if-no-ops-are-registered.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Christoph Hellwig <hch(a)lst.de>
Subject: frontswap: don't call ->init if no ops are registered
Date: Fri, 9 Sep 2022 15:08:29 +0200
If no frontswap module (i.e. zswap) was registered, frontswap_ops will be
NULL. In such situation, swapon crashes with the following stack trace:
Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000020a4fab000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Modules linked in: zram fsl_dpaa2_eth pcs_lynx phylink ahci_qoriq crct10dif_ce ghash_ce sbsa_gwdt fsl_mc_dpio nvme lm90 nvme_core at803x xhci_plat_hcd rtc_fsl_ftm_alarm xgmac_mdio ahci_platform i2c_imx ip6_tables ip_tables fuse
Unloaded tainted modules: cppc_cpufreq():1
CPU: 10 PID: 761 Comm: swapon Not tainted 6.0.0-rc2-00454-g22100432cf14 #1
Hardware name: SolidRun Ltd. SolidRun CEX7 Platform, BIOS EDK II Jun 21 2022
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : frontswap_init+0x38/0x60
lr : __do_sys_swapon+0x8a8/0x9f4
sp : ffff80000969bcf0
x29: ffff80000969bcf0 x28: ffff37bee0d8fc00 x27: ffff80000a7f5000
x26: fffffcdefb971e80 x25: ffffaba797453b90 x24: 0000000000000064
x23: ffff37c1f209d1a8 x22: ffff37bee880e000 x21: ffffaba797748560
x20: ffff37bee0d8fce4 x19: ffffaba797748488 x18: 0000000000000014
x17: 0000000030ec029a x16: ffffaba795a479b0 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000001
x11: ffff37c63c0aba18 x10: 0000000000000000 x9 : ffffaba7956b8c88
x8 : ffff80000969bcd0 x7 : 0000000000000000 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffaba79730f000
x2 : ffff37bee0d8fc00 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
frontswap_init+0x38/0x60
__do_sys_swapon+0x8a8/0x9f4
__arm64_sys_swapon+0x28/0x3c
invoke_syscall+0x78/0x100
el0_svc_common.constprop.0+0xd4/0xf4
do_el0_svc+0x38/0x4c
el0_svc+0x34/0x10c
el0t_64_sync_handler+0x11c/0x150
el0t_64_sync+0x190/0x194
Code: d000e283 910003fd f9006c41 f946d461 (f9400021)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20220909130829.3262926-1-hch@lst.de
Fixes: 1da0d94a3ec8 ("frontswap: remove support for multiple ops")
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/frontswap.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/frontswap.c~frontswap-dont-call-init-if-no-ops-are-registered
+++ a/mm/frontswap.c
@@ -125,6 +125,9 @@ void frontswap_init(unsigned type, unsig
* p->frontswap set to something valid to work properly.
*/
frontswap_map_set(sis, map);
+
+ if (!frontswap_enabled())
+ return;
frontswap_ops->init(type);
}
_
Patches currently in -mm which might be from hch(a)lst.de are
frontswap-dont-call-init-if-no-ops-are-registered.patch
mm-remove-the-end_write_func-argument-to-__swap_writepage.patch
On Thu, Sep 08, 2022 at 04:42:38PM +0000, Lazar, Lijo wrote:
> I am not sure if ASPM settings can be generalized by PCIE core.
> Performance vs Power savings when ASPM is enabled will require some
> additional tuning and that will be device specific.
Can you elaborate on this? In the universe of drivers, very few do
their own ASPM configuration, and it's usually to work around hardware
defects, e.g., L1 doesn't work on some e1000e devices, L0s doesn't
work on some iwlwifi devices, etc.
The core does know how to configure all the ASPM features defined in
the PCIe spec, e.g., L0s, L1, L1.1, L1.2, and LTR.
> In some of the other ASICs, this programming is done in VBIOS/SBIOS
> firmware. Having it in driver provides the advantage of additional
> tuning without forcing a VBIOS upgrade.
I think it's clearly the intent of the PCIe spec that ASPM
configuration be done by generic code. Here are some things that
require a system-level view, not just an individual device view:
- L0s, L1, and L1 Substates cannot be enabled unless both ends
support it (PCIe r6.0, secs 5.4.1.4, 7.5.3.7, 5.5.4).
- Devices advertise the "Acceptable Latency" they can accept for
transitions from L0s or L1 to L0, and the actual latency depends
on the "Exit Latencies" of all the devices in the path to the Root
Port (sec 5.4.1.3.2).
- LTR (required by L1.2) cannot be enabled unless it is already
enabled in all upstream devices (sec 6.18). This patch relies on
"ltr_path", which works now but relies on the PCI core never
reconfiguring the upstream path.
There might be amdgpu-specific features the driver needs to set up,
but if drivers fiddle with architected features like LTR behind the
PCI core's back, things are likely to break.
> From: Alex Deucher <alexdeucher(a)gmail.com>
> On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas <helgaas(a)kernel.org> wrote:
> > Do you know why the driver configures ASPM itself? If the PCI core is
> > doing something wrong (and I'm sure it is, ASPM support is kind of a
> > mess), I'd much prefer to fix up the core where *all* drivers can
> > benefit from it.
>
> This is the programming sequence we get from our hardware team and it
> is used on both windows and Linux. As far as I understand it windows
> doesn't handle this in the core, it's up to the individual drivers to
> enable it. I'm not familiar with how this should be enabled
> generically, but at least for our hardware, it seems to have some
> variation compared to what is done in the PCI core due to stability,
> etc. It seems to me that this may need asic specific implementations
> for a lot of hardware depending on the required programming sequences.
> E.g., various asics may need hardware workaround for bugs or platform
> issues, etc. I can ask for more details from our hardware team.
If the PCI core has stability issues, I want to fix them. This
hardware may have its own stability issues, and I would ideally like
to have drivers use interfaces like pci_disable_link_state() to avoid
broken things. Maybe we need new interfaces for more subtle kinds of
breakage.
Bjorn
From: Xiu Jianfeng <xiujianfeng(a)huawei.com>
[ Upstream commit 51dd64bb99e4478fc5280171acd8e1b529eadaf7 ]
This reverts commit ccf11dbaa07b328fa469415c362d33459c140a37.
Commit ccf11dbaa07b ("evm: Fix memleak in init_desc") said there is
memleak in init_desc. That may be incorrect, as we can see, tmp_tfm is
saved in one of the two global variables hmac_tfm or evm_tfm[hash_algo],
then if init_desc is called next time, there is no need to alloc tfm
again, so in the error path of kmalloc desc or crypto_shash_init(desc),
It is not a problem without freeing tmp_tfm.
And also that commit did not reset the global variable to NULL after
freeing tmp_tfm and this makes *tfm a dangling pointer which may cause a
UAF issue.
Reported-by: Guozihua (Scott) <guozihua(a)huawei.com>
Signed-off-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Mimi Zohar <zohar(a)linux.ibm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
security/integrity/evm/evm_crypto.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
index 25dac691491b..ee6bd945f3d6 100644
--- a/security/integrity/evm/evm_crypto.c
+++ b/security/integrity/evm/evm_crypto.c
@@ -75,7 +75,7 @@ static struct shash_desc *init_desc(char type, uint8_t hash_algo)
{
long rc;
const char *algo;
- struct crypto_shash **tfm, *tmp_tfm = NULL;
+ struct crypto_shash **tfm, *tmp_tfm;
struct shash_desc *desc;
if (type == EVM_XATTR_HMAC) {
@@ -120,16 +120,13 @@ static struct shash_desc *init_desc(char type, uint8_t hash_algo)
alloc:
desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(*tfm),
GFP_KERNEL);
- if (!desc) {
- crypto_free_shash(tmp_tfm);
+ if (!desc)
return ERR_PTR(-ENOMEM);
- }
desc->tfm = *tfm;
rc = crypto_shash_init(desc);
if (rc) {
- crypto_free_shash(tmp_tfm);
kfree(desc);
return ERR_PTR(rc);
}
--
2.35.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
aae7a75a821a ("vfio/type1: Add proper error unwind for vfio_iommu_replay()")
64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
bce617edecad ("mm: do page fault accounting in handle_mm_fault")
ed03d924587e ("mm/gup: use a standard migration target allocation callback")
bbe88753bd42 ("mm/hugetlb: make hugetlb migration callback CMA aware")
41b4dc14ee80 ("mm/gup: restrict CMA region by using allocation scope API")
19fc7bed252c ("mm/migrate: introduce a standard migration target allocation function")
d92bbc2719bd ("mm/hugetlb: unify migration callbacks")
b4b382238ed2 ("mm/migrate: move migration helper from .h to .c")
c7073bab5772 ("mm/page_isolation: prefer the node of the source page")
3e4e28c5a8f0 ("mmap locking API: convert mmap_sem API comments")
d8ed45c5dcd4 ("mmap locking API: use coccinelle to convert mmap_sem rwsem call sites")
ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
420c2091b65a ("mm/gup: introduce pin_user_pages_locked()")
5a36f0f3f518 ("Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson(a)redhat.com>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages
There's currently a reference count leak on the zero page. We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().
Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.
The zero page is always resident, so for our purposes there's no reason
to keep it pinned. Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.
Cc: stable(a)vger.kernel.org
Reported-by: Luboslav Pivarc <lpivarc(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@om…
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
pages, NULL, NULL);
if (ret > 0) {
+ int i;
+
+ /*
+ * The zero page is always resident, we don't need to pin it
+ * and it falls into our invalid/reserved test so we don't
+ * unpin in put_pfn(). Unpin all zero pages in the batch here.
+ */
+ for (i = 0 ; i < ret; i++) {
+ if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+ unpin_user_page(pages[i]);
+ }
+
*pfn = page_to_pfn(pages[0]);
goto done;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
aae7a75a821a ("vfio/type1: Add proper error unwind for vfio_iommu_replay()")
64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
bce617edecad ("mm: do page fault accounting in handle_mm_fault")
ed03d924587e ("mm/gup: use a standard migration target allocation callback")
bbe88753bd42 ("mm/hugetlb: make hugetlb migration callback CMA aware")
41b4dc14ee80 ("mm/gup: restrict CMA region by using allocation scope API")
19fc7bed252c ("mm/migrate: introduce a standard migration target allocation function")
d92bbc2719bd ("mm/hugetlb: unify migration callbacks")
b4b382238ed2 ("mm/migrate: move migration helper from .h to .c")
c7073bab5772 ("mm/page_isolation: prefer the node of the source page")
3e4e28c5a8f0 ("mmap locking API: convert mmap_sem API comments")
d8ed45c5dcd4 ("mmap locking API: use coccinelle to convert mmap_sem rwsem call sites")
ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
420c2091b65a ("mm/gup: introduce pin_user_pages_locked()")
5a36f0f3f518 ("Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson(a)redhat.com>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages
There's currently a reference count leak on the zero page. We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().
Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.
The zero page is always resident, so for our purposes there's no reason
to keep it pinned. Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.
Cc: stable(a)vger.kernel.org
Reported-by: Luboslav Pivarc <lpivarc(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@om…
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
pages, NULL, NULL);
if (ret > 0) {
+ int i;
+
+ /*
+ * The zero page is always resident, we don't need to pin it
+ * and it falls into our invalid/reserved test so we don't
+ * unpin in put_pfn(). Unpin all zero pages in the batch here.
+ */
+ for (i = 0 ; i < ret; i++) {
+ if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+ unpin_user_page(pages[i]);
+ }
+
*pfn = page_to_pfn(pages[0]);
goto done;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
aae7a75a821a ("vfio/type1: Add proper error unwind for vfio_iommu_replay()")
64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
bce617edecad ("mm: do page fault accounting in handle_mm_fault")
ed03d924587e ("mm/gup: use a standard migration target allocation callback")
bbe88753bd42 ("mm/hugetlb: make hugetlb migration callback CMA aware")
41b4dc14ee80 ("mm/gup: restrict CMA region by using allocation scope API")
19fc7bed252c ("mm/migrate: introduce a standard migration target allocation function")
d92bbc2719bd ("mm/hugetlb: unify migration callbacks")
b4b382238ed2 ("mm/migrate: move migration helper from .h to .c")
c7073bab5772 ("mm/page_isolation: prefer the node of the source page")
3e4e28c5a8f0 ("mmap locking API: convert mmap_sem API comments")
d8ed45c5dcd4 ("mmap locking API: use coccinelle to convert mmap_sem rwsem call sites")
ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
420c2091b65a ("mm/gup: introduce pin_user_pages_locked()")
5a36f0f3f518 ("Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson(a)redhat.com>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages
There's currently a reference count leak on the zero page. We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().
Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.
The zero page is always resident, so for our purposes there's no reason
to keep it pinned. Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.
Cc: stable(a)vger.kernel.org
Reported-by: Luboslav Pivarc <lpivarc(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@om…
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
pages, NULL, NULL);
if (ret > 0) {
+ int i;
+
+ /*
+ * The zero page is always resident, we don't need to pin it
+ * and it falls into our invalid/reserved test so we don't
+ * unpin in put_pfn(). Unpin all zero pages in the batch here.
+ */
+ for (i = 0 ; i < ret; i++) {
+ if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+ unpin_user_page(pages[i]);
+ }
+
*pfn = page_to_pfn(pages[0]);
goto done;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
aae7a75a821a ("vfio/type1: Add proper error unwind for vfio_iommu_replay()")
64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
bce617edecad ("mm: do page fault accounting in handle_mm_fault")
ed03d924587e ("mm/gup: use a standard migration target allocation callback")
bbe88753bd42 ("mm/hugetlb: make hugetlb migration callback CMA aware")
41b4dc14ee80 ("mm/gup: restrict CMA region by using allocation scope API")
19fc7bed252c ("mm/migrate: introduce a standard migration target allocation function")
d92bbc2719bd ("mm/hugetlb: unify migration callbacks")
b4b382238ed2 ("mm/migrate: move migration helper from .h to .c")
c7073bab5772 ("mm/page_isolation: prefer the node of the source page")
3e4e28c5a8f0 ("mmap locking API: convert mmap_sem API comments")
d8ed45c5dcd4 ("mmap locking API: use coccinelle to convert mmap_sem rwsem call sites")
ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
420c2091b65a ("mm/gup: introduce pin_user_pages_locked()")
5a36f0f3f518 ("Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson(a)redhat.com>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages
There's currently a reference count leak on the zero page. We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().
Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.
The zero page is always resident, so for our purposes there's no reason
to keep it pinned. Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.
Cc: stable(a)vger.kernel.org
Reported-by: Luboslav Pivarc <lpivarc(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@om…
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
pages, NULL, NULL);
if (ret > 0) {
+ int i;
+
+ /*
+ * The zero page is always resident, we don't need to pin it
+ * and it falls into our invalid/reserved test so we don't
+ * unpin in put_pfn(). Unpin all zero pages in the batch here.
+ */
+ for (i = 0 ; i < ret; i++) {
+ if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+ unpin_user_page(pages[i]);
+ }
+
*pfn = page_to_pfn(pages[0]);
goto done;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson(a)redhat.com>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages
There's currently a reference count leak on the zero page. We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().
Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.
The zero page is always resident, so for our purposes there's no reason
to keep it pinned. Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.
Cc: stable(a)vger.kernel.org
Reported-by: Luboslav Pivarc <lpivarc(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@om…
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
pages, NULL, NULL);
if (ret > 0) {
+ int i;
+
+ /*
+ * The zero page is always resident, we don't need to pin it
+ * and it falls into our invalid/reserved test so we don't
+ * unpin in put_pfn(). Unpin all zero pages in the batch here.
+ */
+ for (i = 0 ; i < ret; i++) {
+ if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+ unpin_user_page(pages[i]);
+ }
+
*pfn = page_to_pfn(pages[0]);
goto done;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed
in CALLER_ADDR0.
The function trace_hardirqs_on_caller should have been intended to use
caller_addr to represent the address that caller wants to be traced.
For example, lockdep log in riscv showing the last {enabled,disabled} at
__trace_hardirqs_{on,off} all the time(if called by):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After use trace_hardirqs_xx_caller, we can get more effective information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed
in CALLER_ADDR0.
The function trace_hardirqs_on_caller should have been intended to use
caller_addr to represent the address that caller wants to be traced.
For example, lockdep log in riscv showing the last {enabled,disabled} at
__trace_hardirqs_{on,off} all the time(if called by):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After use trace_hardirqs_xx_caller, we can get more effective information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed
in CALLER_ADDR0.
The function trace_hardirqs_on_caller should have been intended to use
caller_addr to represent the address that caller wants to be traced.
For example, lockdep log in riscv showing the last {enabled,disabled} at
__trace_hardirqs_{on,off} all the time(if called by):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After use trace_hardirqs_xx_caller, we can get more effective information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed
in CALLER_ADDR0.
The function trace_hardirqs_on_caller should have been intended to use
caller_addr to represent the address that caller wants to be traced.
For example, lockdep log in riscv showing the last {enabled,disabled} at
__trace_hardirqs_{on,off} all the time(if called by):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After use trace_hardirqs_xx_caller, we can get more effective information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
47311db8e8f3 ("tracefs: Only clobber mode/uid/gid on remount if asked")
851e99ebeec3 ("tracefs: Set the group ownership in apply_options() not parse_options()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47311db8e8f33011d90dee76b39c8886120cdda4 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Fri, 26 Aug 2022 17:44:17 -0700
Subject: [PATCH] tracefs: Only clobber mode/uid/gid on remount if asked
Users may have explicitly configured their tracefs permissions; we
shouldn't overwrite those just because a second mount appeared.
Only clobber if the options were provided at mount time.
Note: the previous behavior was especially surprising in the presence of
automounted /sys/kernel/debug/tracing/.
Existing behavior:
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwx------
## Unexpected: the automount changed mode for other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwx------
New behavior (after this change):
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwxr-xr-x
## Expected: the automount does not change other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226…
Cc: stable(a)vger.kernel.org
Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 81d26abf486f..da85b3979195 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -141,6 +141,8 @@ struct tracefs_mount_opts {
kuid_t uid;
kgid_t gid;
umode_t mode;
+ /* Opt_* bitfield. */
+ unsigned int opts;
};
enum {
@@ -241,6 +243,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
kgid_t gid;
char *p;
+ opts->opts = 0;
opts->mode = TRACEFS_DEFAULT_MODE;
while ((p = strsep(&data, ",")) != NULL) {
@@ -275,24 +278,36 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
* but traditionally tracefs has ignored all mount options
*/
}
+
+ opts->opts |= BIT(token);
}
return 0;
}
-static int tracefs_apply_options(struct super_block *sb)
+static int tracefs_apply_options(struct super_block *sb, bool remount)
{
struct tracefs_fs_info *fsi = sb->s_fs_info;
struct inode *inode = d_inode(sb->s_root);
struct tracefs_mount_opts *opts = &fsi->mount_opts;
- inode->i_mode &= ~S_IALLUGO;
- inode->i_mode |= opts->mode;
+ /*
+ * On remount, only reset mode/uid/gid if they were provided as mount
+ * options.
+ */
+
+ if (!remount || opts->opts & BIT(Opt_mode)) {
+ inode->i_mode &= ~S_IALLUGO;
+ inode->i_mode |= opts->mode;
+ }
- inode->i_uid = opts->uid;
+ if (!remount || opts->opts & BIT(Opt_uid))
+ inode->i_uid = opts->uid;
- /* Set all the group ids to the mount option */
- set_gid(sb->s_root, opts->gid);
+ if (!remount || opts->opts & BIT(Opt_gid)) {
+ /* Set all the group ids to the mount option */
+ set_gid(sb->s_root, opts->gid);
+ }
return 0;
}
@@ -307,7 +322,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
if (err)
goto fail;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, true);
fail:
return err;
@@ -359,7 +374,7 @@ static int trace_fill_super(struct super_block *sb, void *data, int silent)
sb->s_op = &tracefs_super_operations;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, false);
return 0;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
47311db8e8f3 ("tracefs: Only clobber mode/uid/gid on remount if asked")
851e99ebeec3 ("tracefs: Set the group ownership in apply_options() not parse_options()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47311db8e8f33011d90dee76b39c8886120cdda4 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Fri, 26 Aug 2022 17:44:17 -0700
Subject: [PATCH] tracefs: Only clobber mode/uid/gid on remount if asked
Users may have explicitly configured their tracefs permissions; we
shouldn't overwrite those just because a second mount appeared.
Only clobber if the options were provided at mount time.
Note: the previous behavior was especially surprising in the presence of
automounted /sys/kernel/debug/tracing/.
Existing behavior:
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwx------
## Unexpected: the automount changed mode for other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwx------
New behavior (after this change):
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwxr-xr-x
## Expected: the automount does not change other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226…
Cc: stable(a)vger.kernel.org
Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 81d26abf486f..da85b3979195 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -141,6 +141,8 @@ struct tracefs_mount_opts {
kuid_t uid;
kgid_t gid;
umode_t mode;
+ /* Opt_* bitfield. */
+ unsigned int opts;
};
enum {
@@ -241,6 +243,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
kgid_t gid;
char *p;
+ opts->opts = 0;
opts->mode = TRACEFS_DEFAULT_MODE;
while ((p = strsep(&data, ",")) != NULL) {
@@ -275,24 +278,36 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
* but traditionally tracefs has ignored all mount options
*/
}
+
+ opts->opts |= BIT(token);
}
return 0;
}
-static int tracefs_apply_options(struct super_block *sb)
+static int tracefs_apply_options(struct super_block *sb, bool remount)
{
struct tracefs_fs_info *fsi = sb->s_fs_info;
struct inode *inode = d_inode(sb->s_root);
struct tracefs_mount_opts *opts = &fsi->mount_opts;
- inode->i_mode &= ~S_IALLUGO;
- inode->i_mode |= opts->mode;
+ /*
+ * On remount, only reset mode/uid/gid if they were provided as mount
+ * options.
+ */
+
+ if (!remount || opts->opts & BIT(Opt_mode)) {
+ inode->i_mode &= ~S_IALLUGO;
+ inode->i_mode |= opts->mode;
+ }
- inode->i_uid = opts->uid;
+ if (!remount || opts->opts & BIT(Opt_uid))
+ inode->i_uid = opts->uid;
- /* Set all the group ids to the mount option */
- set_gid(sb->s_root, opts->gid);
+ if (!remount || opts->opts & BIT(Opt_gid)) {
+ /* Set all the group ids to the mount option */
+ set_gid(sb->s_root, opts->gid);
+ }
return 0;
}
@@ -307,7 +322,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
if (err)
goto fail;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, true);
fail:
return err;
@@ -359,7 +374,7 @@ static int trace_fill_super(struct super_block *sb, void *data, int silent)
sb->s_op = &tracefs_super_operations;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, false);
return 0;
This patch series backports a few VM preemption_status, steal_time and
PV TLB flushing fixes to 5.10 stable kernel.
Most of the changes backport cleanly except i had to work around a few
becauseof missing support/APIs in 5.10 kernel. I have captured those in
the changelog as well in the individual patches.
Changelog
- Use mark_page_dirty_in_slot api without kvm argument (KVM: x86: Fix
recording of guest steal time / preempted status)
- Avoid checking for xen_msr and SEV-ES conditions (KVM: x86:
do not set st->preempted when going back to user space)
- Use VCPU_STAT macro to expose preemption_reported and
preemption_other fields (KVM: x86: do not report a vCPU as preempted
outside instruction boundaries)
David Woodhouse (2):
KVM: x86: Fix recording of guest steal time / preempted status
KVM: Fix steal time asm constraints
Lai Jiangshan (1):
KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
Paolo Bonzini (5):
KVM: x86: do not set st->preempted when going back to user space
KVM: x86: do not report a vCPU as preempted outside instruction
boundaries
KVM: x86: revalidate steal time cache if MSR value changes
KVM: x86: do not report preemption if the steal time cache is stale
KVM: x86: move guest_pv_has out of user_access section
Sean Christopherson (1):
KVM: x86: Remove obsolete disabling of page faults in
kvm_arch_vcpu_put()
arch/x86/include/asm/kvm_host.h | 5 +-
arch/x86/kvm/svm/svm.c | 2 +
arch/x86/kvm/vmx/vmx.c | 1 +
arch/x86/kvm/x86.c | 164 ++++++++++++++++++++++----------
4 files changed, 122 insertions(+), 50 deletions(-)
--
2.37.1
Unsanitized pages trigger WARN_ON() unconditionally, which can panic the
whole computer, if /proc/sys/kernel/panic_on_warn is set.
In sgx_init(), if misc_register() fails or misc_register() succeeds but
neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
prematurely stopped. This may leave unsanitized pages, which will result a
false warning.
Refine __sgx_sanitize_pages() to return:
1. Zero when the sanitization process is complete or ksgxd has been
requested to stop.
2. The number of unsanitized pages otherwise.
Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org…
Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
Cc: stable(a)vger.kernel.org # v5.13+
Reported-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v9:
- Remove left_dirty declaratio from ksgxd().
- Update commit message.
v8:
- Discard changes that are not relevant for the stable fix. This
does absolutely minimum to address the bug:
https://lore.kernel.org/linux-sgx/a5fa56bdc57d6472a306bd8d795afc674b724538.…
v7:
- Rewrote commit message.
- Do not return -ECANCELED on premature stop. Instead use zero both
premature stop and complete sanitization.
v6:
- Address Reinette's feedback:
https://lore.kernel.org/linux-sgx/Yw6%2FiTzSdSw%2FY%2FVO@kernel.org/
v5:
- Add the klog dump and sysctl option to the commit message.
v4:
- Explain expectations for dirty_page_list in the function header, instead
of an inline comment.
- Improve commit message to explain the conditions better.
- Return the number of pages left dirty to ksgxd() and print warning after
the 2nd call, if there are any.
v3:
- Remove WARN_ON().
- Tuned comments and the commit message a bit.
v2:
- Replaced WARN_ON() with optional pr_info() inside
__sgx_sanitize_pages().
- Rewrote the commit message.
- Added the fixes tag.
---
arch/x86/kernel/cpu/sgx/main.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 515e2a5f25bb..0aad028f04d4 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -49,9 +49,13 @@ static LIST_HEAD(sgx_dirty_page_list);
* Reset post-kexec EPC pages to the uninitialized state. The pages are removed
* from the input list, and made available for the page allocator. SECS pages
* prepending their children in the input list are left intact.
+ *
+ * Return 0 when sanitization was successful or kthread was stopped, and the
+ * number of unsanitized pages otherwise.
*/
-static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
+static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list)
{
+ unsigned long left_dirty = 0;
struct sgx_epc_page *page;
LIST_HEAD(dirty);
int ret;
@@ -59,7 +63,7 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
/* dirty_page_list is thread-local, no need for a lock: */
while (!list_empty(dirty_page_list)) {
if (kthread_should_stop())
- return;
+ return 0;
page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
@@ -92,12 +96,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
} else {
/* The page is not yet clean - move to the dirty list. */
list_move_tail(&page->list, &dirty);
+ left_dirty++;
}
cond_resched();
}
list_splice(&dirty, dirty_page_list);
+ return left_dirty;
}
static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
@@ -395,10 +401,7 @@ static int ksgxd(void *p)
* required for SECS pages, whose child pages blocked EREMOVE.
*/
__sgx_sanitize_pages(&sgx_dirty_page_list);
- __sgx_sanitize_pages(&sgx_dirty_page_list);
-
- /* sanity check: */
- WARN_ON(!list_empty(&sgx_dirty_page_list));
+ WARN_ON(__sgx_sanitize_pages(&sgx_dirty_page_list));
while (!kthread_should_stop()) {
if (try_to_freeze())
--
2.37.2
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 81fa6fd13b5c43601fba8486f2385dbd7c1935e2
Gitweb: https://git.kernel.org/tip/81fa6fd13b5c43601fba8486f2385dbd7c1935e2
Author: Haitao Huang <haitao.huang(a)linux.intel.com>
AuthorDate: Tue, 06 Sep 2022 03:02:21 +03:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Thu, 08 Sep 2022 13:28:31 -07:00
x86/sgx: Handle VA page allocation failure for EAUG on PF.
VM_FAULT_NOPAGE is expected behaviour for -EBUSY failure path, when
augmenting a page, as this means that the reclaimer thread has been
triggered, and the intention is just to round-trip in ring-3, and
retry with a new page fault.
Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Signed-off-by: Haitao Huang <haitao.huang(a)linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Vijay Dhanraj <vijay.dhanraj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20220906000221.34286-3-jarkko@kernel.org
---
arch/x86/kernel/cpu/sgx/encl.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 24c1bb8..8bdeae2 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -344,8 +344,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
}
va_page = sgx_encl_grow(encl, false);
- if (IS_ERR(va_page))
+ if (IS_ERR(va_page)) {
+ if (PTR_ERR(va_page) == -EBUSY)
+ vmret = VM_FAULT_NOPAGE;
goto err_out_epc;
+ }
if (va_page)
list_add(&va_page->list, &encl->va_pages);
Running this test program on ARMv4 a few times (sometimes just once)
reproduces the bug.
int main()
{
unsigned i;
char paragon[SIZE];
void* ptr;
memset(paragon, 0xAA, SIZE);
ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_ANON | MAP_SHARED, -1, 0);
if (ptr == MAP_FAILED) return 1;
printf("ptr = %p\n", ptr);
for (i=0;i<10000;i++){
memset(ptr, 0xAA, SIZE);
if (memcmp(ptr, paragon, SIZE)) {
printf("Unexpected bytes on iteration %u!!!\n", i);
break;
}
}
munmap(ptr, SIZE);
}
In the "ptr" buffer there appear runs of zero bytes which are aligned
by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to
filemap_map_pages() as well as finish_fault(). After the commit
finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug.
Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
closely reproduce the code of alloc_set_pte() function that existed before
the commit.
On many platforms update_mmu_cache() is nop:
x86, see arch/x86/include/asm/pgtable
ARMv6+, see arch/arm/include/asm/tlbflush.h
So, it seems, few users ran into this bug.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Sergei Antonov <saproj(a)gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
---
mm/memory.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 4ba73f5aa8bb..a78814413ac0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
- ret = 0;
+
/* Re-check under ptl */
- if (likely(!vmf_pte_changed(vmf)))
+ if (likely(!vmf_pte_changed(vmf))) {
do_set_pte(vmf, page, vmf->address);
- else
+
+ /* no need to invalidate: a not-present page won't be cached */
+ update_mmu_cache(vma, vmf->address, vmf->pte);
+
+ ret = 0;
+ } else {
+ update_mmu_tlb(vma, vmf->address, vmf->pte);
ret = VM_FAULT_NOPAGE;
+ }
- update_mmu_tlb(vma, vmf->address, vmf->pte);
pte_unmap_unlock(vmf->pte, vmf->ptl);
return ret;
}
--
2.34.1
if iterate_dir() returns non-negative value, caller has to treat it
as normal and check there is any error while populating dentry
information. ksmbd doesn't have to do anything because ksmbd already
checks too small OutputBufferLength to store one file information.
And because ctx->pos is set to file->f_pos when iterative_dir is called,
remove restart_ctx(). And if iterate_dir() return -EIO, which mean
directory entry is corrupted, return STATUS_FILE_CORRUPT_ERROR error
response.
This patch fixes some failure of SMB2_QUERY_DIRECTORY, which happens when
ntfs3 is local filesystem.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hyunchul Lee <hyc.lee(a)gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon(a)kernel.org>
---
v2:
- remove unneeded restart_ctx().
- If directory entry is corrupted, return STATUS_FILE_CORRUPT_ERROR
error response.
fs/ksmbd/smb2pdu.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
index ba74aba2f1d3..634e21bba770 100644
--- a/fs/ksmbd/smb2pdu.c
+++ b/fs/ksmbd/smb2pdu.c
@@ -3809,11 +3809,6 @@ static int __query_dir(struct dir_context *ctx, const char *name, int namlen,
return 0;
}
-static void restart_ctx(struct dir_context *ctx)
-{
- ctx->pos = 0;
-}
-
static int verify_info_level(int info_level)
{
switch (info_level) {
@@ -3921,7 +3916,6 @@ int smb2_query_dir(struct ksmbd_work *work)
if (srch_flag & SMB2_REOPEN || srch_flag & SMB2_RESTART_SCANS) {
ksmbd_debug(SMB, "Restart directory scan\n");
generic_file_llseek(dir_fp->filp, 0, SEEK_SET);
- restart_ctx(&dir_fp->readdir_data.ctx);
}
memset(&d_info, 0, sizeof(struct ksmbd_dir_info));
@@ -3968,11 +3962,9 @@ int smb2_query_dir(struct ksmbd_work *work)
*/
if (!d_info.out_buf_len && !d_info.num_entry)
goto no_buf_len;
- if (rc == 0)
- restart_ctx(&dir_fp->readdir_data.ctx);
- if (rc == -ENOSPC)
+ if (rc > 0 || rc == -ENOSPC)
rc = 0;
- if (rc)
+ else if (rc)
goto err_out;
d_info.wptr = d_info.rptr;
@@ -4029,6 +4021,8 @@ int smb2_query_dir(struct ksmbd_work *work)
rsp->hdr.Status = STATUS_NO_MEMORY;
else if (rc == -EFAULT)
rsp->hdr.Status = STATUS_INVALID_INFO_CLASS;
+ else if (rc == -EIO)
+ rsp->hdr.Status = STATUS_FILE_CORRUPT_ERROR;
if (!rsp->hdr.Status)
rsp->hdr.Status = STATUS_UNEXPECTED_IO_ERROR;
--
2.25.1
--
I am Stefano Pessina, an Italian business tycoon, investor, and
philanthropist. the vice chairman, chief executive officer (CEO), and
the single largest shareholder of Walgreens Boots Alliance. I gave
away 25 percent of my personal wealth to charity. And I also pledged
to give away the rest of 25% this year 2022 to Individuals.. I have
decided to donate $2,200,000.00 to you. If you are interested in my
donation, do contact me for more info
Using rbtree for sorting groups by average fragment size is relatively
expensive (needs rbtree update on every block freeing or allocation) and
leads to wide spreading of allocations because selection of block group
is very sentitive both to changes in free space and amount of blocks
allocated. Furthermore selecting group with the best matching average
fragment size is not necessary anyway, even more so because the
variability of fragment sizes within a group is likely large so average
is not telling much. We just need a group with large enough average
fragment size so that we have high probability of finding large enough
free extent and we don't want average fragment size to be too big so
that we are likely to find free extent only somewhat larger than what we
need.
So instead of maintaing rbtree of groups sorted by fragment size keep
bins (lists) or groups where average fragment size is in the interval
[2^i, 2^(i+1)). This structure requires less updates on block allocation
/ freeing, generally avoids chaotic spreading of allocations into block
groups, and still is able to quickly (even faster that the rbtree)
provide a block group which is likely to have a suitably sized free
space extent.
This patch reduces number of block groups used when untarring archive
with medium sized files (size somewhat above 64k which is default
mballoc limit for avoiding locality group preallocation) to about half
and thus improves write speeds for eMMC flash significantly.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin(a)linux.ibm.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/ext4.h | 10 +-
fs/ext4/mballoc.c | 249 ++++++++++++++++++++--------------------------
fs/ext4/mballoc.h | 1 -
3 files changed, 111 insertions(+), 149 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9bca5565547b..3bf9a6926798 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -167,8 +167,6 @@ enum SHIFT_DIRECTION {
#define EXT4_MB_CR0_OPTIMIZED 0x8000
/* Avg fragment size rb tree lookup succeeded at least once for cr = 1 */
#define EXT4_MB_CR1_OPTIMIZED 0x00010000
-/* Perform linear traversal for one group */
-#define EXT4_MB_SEARCH_NEXT_LINEAR 0x00020000
struct ext4_allocation_request {
/* target inode for block we're allocating */
struct inode *inode;
@@ -1600,8 +1598,8 @@ struct ext4_sb_info {
struct list_head s_discard_list;
struct work_struct s_discard_work;
atomic_t s_retry_alloc_pending;
- struct rb_root s_mb_avg_fragment_size_root;
- rwlock_t s_mb_rb_lock;
+ struct list_head *s_mb_avg_fragment_size;
+ rwlock_t *s_mb_avg_fragment_size_locks;
struct list_head *s_mb_largest_free_orders;
rwlock_t *s_mb_largest_free_orders_locks;
@@ -3413,6 +3411,8 @@ struct ext4_group_info {
ext4_grpblk_t bb_first_free; /* first free block */
ext4_grpblk_t bb_free; /* total free blocks */
ext4_grpblk_t bb_fragments; /* nr of freespace fragments */
+ int bb_avg_fragment_size_order; /* order of average
+ fragment in BG */
ext4_grpblk_t bb_largest_free_order;/* order of largest frag in BG */
ext4_group_t bb_group; /* Group number */
struct list_head bb_prealloc_list;
@@ -3420,7 +3420,7 @@ struct ext4_group_info {
void *bb_bitmap;
#endif
struct rw_semaphore alloc_sem;
- struct rb_node bb_avg_fragment_size_rb;
+ struct list_head bb_avg_fragment_size_node;
struct list_head bb_largest_free_order_node;
ext4_grpblk_t bb_counters[]; /* Nr of free power-of-two-block
* regions, index is order.
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index af1e49c3603f..31873af0421b 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -140,13 +140,15 @@
* number of buddy bitmap orders possible) number of lists. Group-infos are
* placed in appropriate lists.
*
- * 2) Average fragment size rb tree (sbi->s_mb_avg_fragment_size_root)
+ * 2) Average fragment size lists (sbi->s_mb_avg_fragment_size)
*
- * Locking: sbi->s_mb_rb_lock (rwlock)
+ * Locking: sbi->s_mb_avg_fragment_size_locks(array of rw locks)
*
- * This is a red black tree consisting of group infos and the tree is sorted
- * by average fragment sizes (which is calculated as ext4_group_info->bb_free
- * / ext4_group_info->bb_fragments).
+ * This is an array of lists where in the i-th list there are groups with
+ * average fragment size >= 2^i and < 2^(i+1). The average fragment size
+ * is computed as ext4_group_info->bb_free / ext4_group_info->bb_fragments.
+ * Note that we don't bother with a special list for completely empty groups
+ * so we only have MB_NUM_ORDERS(sb) lists.
*
* When "mb_optimize_scan" mount option is set, mballoc consults the above data
* structures to decide the order in which groups are to be traversed for
@@ -160,7 +162,8 @@
*
* At CR = 1, we only consider groups where average fragment size > request
* size. So, we lookup a group which has average fragment size just above or
- * equal to request size using our rb tree (data structure 2) in O(log N) time.
+ * equal to request size using our average fragment size group lists (data
+ * structure 2) in O(1) time.
*
* If "mb_optimize_scan" mount option is not set, mballoc traverses groups in
* linear order which requires O(N) search time for each CR 0 and CR 1 phase.
@@ -802,65 +805,51 @@ static void ext4_mb_mark_free_simple(struct super_block *sb,
}
}
-static void ext4_mb_rb_insert(struct rb_root *root, struct rb_node *new,
- int (*cmp)(struct rb_node *, struct rb_node *))
+static int mb_avg_fragment_size_order(struct super_block *sb, ext4_grpblk_t len)
{
- struct rb_node **iter = &root->rb_node, *parent = NULL;
+ int order;
- while (*iter) {
- parent = *iter;
- if (cmp(new, *iter) > 0)
- iter = &((*iter)->rb_left);
- else
- iter = &((*iter)->rb_right);
- }
-
- rb_link_node(new, parent, iter);
- rb_insert_color(new, root);
-}
-
-static int
-ext4_mb_avg_fragment_size_cmp(struct rb_node *rb1, struct rb_node *rb2)
-{
- struct ext4_group_info *grp1 = rb_entry(rb1,
- struct ext4_group_info,
- bb_avg_fragment_size_rb);
- struct ext4_group_info *grp2 = rb_entry(rb2,
- struct ext4_group_info,
- bb_avg_fragment_size_rb);
- int num_frags_1, num_frags_2;
-
- num_frags_1 = grp1->bb_fragments ?
- grp1->bb_free / grp1->bb_fragments : 0;
- num_frags_2 = grp2->bb_fragments ?
- grp2->bb_free / grp2->bb_fragments : 0;
-
- return (num_frags_2 - num_frags_1);
+ /*
+ * We don't bother with a special lists groups with only 1 block free
+ * extents and for completely empty groups.
+ */
+ order = fls(len) - 2;
+ if (order < 0)
+ return 0;
+ if (order == MB_NUM_ORDERS(sb))
+ order--;
+ return order;
}
-/*
- * Reinsert grpinfo into the avg_fragment_size tree with new average
- * fragment size.
- */
+/* Move group to appropriate avg_fragment_size list */
static void
mb_update_avg_fragment_size(struct super_block *sb, struct ext4_group_info *grp)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
+ int new_order;
if (!test_opt2(sb, MB_OPTIMIZE_SCAN) || grp->bb_free == 0)
return;
- write_lock(&sbi->s_mb_rb_lock);
- if (!RB_EMPTY_NODE(&grp->bb_avg_fragment_size_rb)) {
- rb_erase(&grp->bb_avg_fragment_size_rb,
- &sbi->s_mb_avg_fragment_size_root);
- RB_CLEAR_NODE(&grp->bb_avg_fragment_size_rb);
- }
+ new_order = mb_avg_fragment_size_order(sb,
+ grp->bb_free / grp->bb_fragments);
+ if (new_order == grp->bb_avg_fragment_size_order)
+ return;
- ext4_mb_rb_insert(&sbi->s_mb_avg_fragment_size_root,
- &grp->bb_avg_fragment_size_rb,
- ext4_mb_avg_fragment_size_cmp);
- write_unlock(&sbi->s_mb_rb_lock);
+ if (grp->bb_avg_fragment_size_order != -1) {
+ write_lock(&sbi->s_mb_avg_fragment_size_locks[
+ grp->bb_avg_fragment_size_order]);
+ list_del(&grp->bb_avg_fragment_size_node);
+ write_unlock(&sbi->s_mb_avg_fragment_size_locks[
+ grp->bb_avg_fragment_size_order]);
+ }
+ grp->bb_avg_fragment_size_order = new_order;
+ write_lock(&sbi->s_mb_avg_fragment_size_locks[
+ grp->bb_avg_fragment_size_order]);
+ list_add_tail(&grp->bb_avg_fragment_size_node,
+ &sbi->s_mb_avg_fragment_size[grp->bb_avg_fragment_size_order]);
+ write_unlock(&sbi->s_mb_avg_fragment_size_locks[
+ grp->bb_avg_fragment_size_order]);
}
/*
@@ -909,86 +898,56 @@ static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac,
*new_cr = 1;
} else {
*group = grp->bb_group;
- ac->ac_last_optimal_group = *group;
ac->ac_flags |= EXT4_MB_CR0_OPTIMIZED;
}
}
/*
- * Choose next group by traversing average fragment size tree. Updates *new_cr
- * if cr lvel needs an update. Sets EXT4_MB_SEARCH_NEXT_LINEAR to indicate that
- * the linear search should continue for one iteration since there's lock
- * contention on the rb tree lock.
+ * Choose next group by traversing average fragment size list of suitable
+ * order. Updates *new_cr if cr level needs an update.
*/
static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac,
int *new_cr, ext4_group_t *group, ext4_group_t ngroups)
{
struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
- int avg_fragment_size, best_so_far;
- struct rb_node *node, *found;
- struct ext4_group_info *grp;
-
- /*
- * If there is contention on the lock, instead of waiting for the lock
- * to become available, just continue searching lineraly. We'll resume
- * our rb tree search later starting at ac->ac_last_optimal_group.
- */
- if (!read_trylock(&sbi->s_mb_rb_lock)) {
- ac->ac_flags |= EXT4_MB_SEARCH_NEXT_LINEAR;
- return;
- }
+ struct ext4_group_info *grp, *iter;
+ int i;
if (unlikely(ac->ac_flags & EXT4_MB_CR1_OPTIMIZED)) {
if (sbi->s_mb_stats)
atomic_inc(&sbi->s_bal_cr1_bad_suggestions);
- /* We have found something at CR 1 in the past */
- grp = ext4_get_group_info(ac->ac_sb, ac->ac_last_optimal_group);
- for (found = rb_next(&grp->bb_avg_fragment_size_rb); found != NULL;
- found = rb_next(found)) {
- grp = rb_entry(found, struct ext4_group_info,
- bb_avg_fragment_size_rb);
+ }
+
+ for (i = mb_avg_fragment_size_order(ac->ac_sb, ac->ac_g_ex.fe_len);
+ i < MB_NUM_ORDERS(ac->ac_sb); i++) {
+ if (list_empty(&sbi->s_mb_avg_fragment_size[i]))
+ continue;
+ read_lock(&sbi->s_mb_avg_fragment_size_locks[i]);
+ if (list_empty(&sbi->s_mb_avg_fragment_size[i])) {
+ read_unlock(&sbi->s_mb_avg_fragment_size_locks[i]);
+ continue;
+ }
+ grp = NULL;
+ list_for_each_entry(iter, &sbi->s_mb_avg_fragment_size[i],
+ bb_avg_fragment_size_node) {
if (sbi->s_mb_stats)
atomic64_inc(&sbi->s_bal_cX_groups_considered[1]);
- if (likely(ext4_mb_good_group(ac, grp->bb_group, 1)))
+ if (likely(ext4_mb_good_group(ac, iter->bb_group, 1))) {
+ grp = iter;
break;
- }
- goto done;
- }
-
- node = sbi->s_mb_avg_fragment_size_root.rb_node;
- best_so_far = 0;
- found = NULL;
-
- while (node) {
- grp = rb_entry(node, struct ext4_group_info,
- bb_avg_fragment_size_rb);
- avg_fragment_size = 0;
- if (ext4_mb_good_group(ac, grp->bb_group, 1)) {
- avg_fragment_size = grp->bb_fragments ?
- grp->bb_free / grp->bb_fragments : 0;
- if (!best_so_far || avg_fragment_size < best_so_far) {
- best_so_far = avg_fragment_size;
- found = node;
}
}
- if (avg_fragment_size > ac->ac_g_ex.fe_len)
- node = node->rb_right;
- else
- node = node->rb_left;
+ read_unlock(&sbi->s_mb_avg_fragment_size_locks[i]);
+ if (grp)
+ break;
}
-done:
- if (found) {
- grp = rb_entry(found, struct ext4_group_info,
- bb_avg_fragment_size_rb);
+ if (grp) {
*group = grp->bb_group;
ac->ac_flags |= EXT4_MB_CR1_OPTIMIZED;
} else {
*new_cr = 2;
}
-
- read_unlock(&sbi->s_mb_rb_lock);
- ac->ac_last_optimal_group = *group;
}
static inline int should_optimize_scan(struct ext4_allocation_context *ac)
@@ -1017,11 +976,6 @@ next_linear_group(struct ext4_allocation_context *ac, int group, int ngroups)
goto inc_and_return;
}
- if (ac->ac_flags & EXT4_MB_SEARCH_NEXT_LINEAR) {
- ac->ac_flags &= ~EXT4_MB_SEARCH_NEXT_LINEAR;
- goto inc_and_return;
- }
-
return group;
inc_and_return:
/*
@@ -1152,13 +1106,13 @@ void ext4_mb_generate_buddy(struct super_block *sb,
EXT4_GROUP_INFO_BBITMAP_CORRUPT);
}
mb_set_largest_free_order(sb, grp);
+ mb_update_avg_fragment_size(sb, grp);
clear_bit(EXT4_GROUP_INFO_NEED_INIT_BIT, &(grp->bb_state));
period = get_cycles() - period;
atomic_inc(&sbi->s_mb_buddies_generated);
atomic64_add(period, &sbi->s_mb_generation_time);
- mb_update_avg_fragment_size(sb, grp);
}
/* The buddy information is attached the buddy cache inode
@@ -2711,7 +2665,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
* from the goal value specified
*/
group = ac->ac_g_ex.fe_group;
- ac->ac_last_optimal_group = group;
ac->ac_groups_linear_remaining = sbi->s_mb_max_linear_groups;
prefetch_grp = group;
@@ -2993,9 +2946,7 @@ __acquires(&EXT4_SB(sb)->s_mb_rb_lock)
struct super_block *sb = pde_data(file_inode(seq->file));
unsigned long position;
- read_lock(&EXT4_SB(sb)->s_mb_rb_lock);
-
- if (*pos < 0 || *pos >= MB_NUM_ORDERS(sb) + 1)
+ if (*pos < 0 || *pos >= 2*MB_NUM_ORDERS(sb))
return NULL;
position = *pos + 1;
return (void *) ((unsigned long) position);
@@ -3007,7 +2958,7 @@ static void *ext4_mb_seq_structs_summary_next(struct seq_file *seq, void *v, lof
unsigned long position;
++*pos;
- if (*pos < 0 || *pos >= MB_NUM_ORDERS(sb) + 1)
+ if (*pos < 0 || *pos >= 2*MB_NUM_ORDERS(sb))
return NULL;
position = *pos + 1;
return (void *) ((unsigned long) position);
@@ -3019,29 +2970,22 @@ static int ext4_mb_seq_structs_summary_show(struct seq_file *seq, void *v)
struct ext4_sb_info *sbi = EXT4_SB(sb);
unsigned long position = ((unsigned long) v);
struct ext4_group_info *grp;
- struct rb_node *n;
- unsigned int count, min, max;
+ unsigned int count;
position--;
if (position >= MB_NUM_ORDERS(sb)) {
- seq_puts(seq, "fragment_size_tree:\n");
- n = rb_first(&sbi->s_mb_avg_fragment_size_root);
- if (!n) {
- seq_puts(seq, "\ttree_min: 0\n\ttree_max: 0\n\ttree_nodes: 0\n");
- return 0;
- }
- grp = rb_entry(n, struct ext4_group_info, bb_avg_fragment_size_rb);
- min = grp->bb_fragments ? grp->bb_free / grp->bb_fragments : 0;
- count = 1;
- while (rb_next(n)) {
- count++;
- n = rb_next(n);
- }
- grp = rb_entry(n, struct ext4_group_info, bb_avg_fragment_size_rb);
- max = grp->bb_fragments ? grp->bb_free / grp->bb_fragments : 0;
+ position -= MB_NUM_ORDERS(sb);
+ if (position == 0)
+ seq_puts(seq, "avg_fragment_size_lists:\n");
- seq_printf(seq, "\ttree_min: %u\n\ttree_max: %u\n\ttree_nodes: %u\n",
- min, max, count);
+ count = 0;
+ read_lock(&sbi->s_mb_avg_fragment_size_locks[position]);
+ list_for_each_entry(grp, &sbi->s_mb_avg_fragment_size[position],
+ bb_avg_fragment_size_node)
+ count++;
+ read_unlock(&sbi->s_mb_avg_fragment_size_locks[position]);
+ seq_printf(seq, "\tlist_order_%u_groups: %u\n",
+ (unsigned int)position, count);
return 0;
}
@@ -3051,9 +2995,11 @@ static int ext4_mb_seq_structs_summary_show(struct seq_file *seq, void *v)
seq_puts(seq, "max_free_order_lists:\n");
}
count = 0;
+ read_lock(&sbi->s_mb_largest_free_orders_locks[position]);
list_for_each_entry(grp, &sbi->s_mb_largest_free_orders[position],
bb_largest_free_order_node)
count++;
+ read_unlock(&sbi->s_mb_largest_free_orders_locks[position]);
seq_printf(seq, "\tlist_order_%u_groups: %u\n",
(unsigned int)position, count);
@@ -3061,11 +3007,7 @@ static int ext4_mb_seq_structs_summary_show(struct seq_file *seq, void *v)
}
static void ext4_mb_seq_structs_summary_stop(struct seq_file *seq, void *v)
-__releases(&EXT4_SB(sb)->s_mb_rb_lock)
{
- struct super_block *sb = pde_data(file_inode(seq->file));
-
- read_unlock(&EXT4_SB(sb)->s_mb_rb_lock);
}
const struct seq_operations ext4_mb_seq_structs_summary_ops = {
@@ -3178,8 +3120,9 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
init_rwsem(&meta_group_info[i]->alloc_sem);
meta_group_info[i]->bb_free_root = RB_ROOT;
INIT_LIST_HEAD(&meta_group_info[i]->bb_largest_free_order_node);
- RB_CLEAR_NODE(&meta_group_info[i]->bb_avg_fragment_size_rb);
+ INIT_LIST_HEAD(&meta_group_info[i]->bb_avg_fragment_size_node);
meta_group_info[i]->bb_largest_free_order = -1; /* uninit */
+ meta_group_info[i]->bb_avg_fragment_size_order = -1; /* uninit */
meta_group_info[i]->bb_group = group;
mb_group_bb_bitmap_alloc(sb, meta_group_info[i], group);
@@ -3428,7 +3371,24 @@ int ext4_mb_init(struct super_block *sb)
i++;
} while (i < MB_NUM_ORDERS(sb));
- sbi->s_mb_avg_fragment_size_root = RB_ROOT;
+ sbi->s_mb_avg_fragment_size =
+ kmalloc_array(MB_NUM_ORDERS(sb), sizeof(struct list_head),
+ GFP_KERNEL);
+ if (!sbi->s_mb_avg_fragment_size) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ sbi->s_mb_avg_fragment_size_locks =
+ kmalloc_array(MB_NUM_ORDERS(sb), sizeof(rwlock_t),
+ GFP_KERNEL);
+ if (!sbi->s_mb_avg_fragment_size_locks) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ for (i = 0; i < MB_NUM_ORDERS(sb); i++) {
+ INIT_LIST_HEAD(&sbi->s_mb_avg_fragment_size[i]);
+ rwlock_init(&sbi->s_mb_avg_fragment_size_locks[i]);
+ }
sbi->s_mb_largest_free_orders =
kmalloc_array(MB_NUM_ORDERS(sb), sizeof(struct list_head),
GFP_KERNEL);
@@ -3447,7 +3407,6 @@ int ext4_mb_init(struct super_block *sb)
INIT_LIST_HEAD(&sbi->s_mb_largest_free_orders[i]);
rwlock_init(&sbi->s_mb_largest_free_orders_locks[i]);
}
- rwlock_init(&sbi->s_mb_rb_lock);
spin_lock_init(&sbi->s_md_lock);
sbi->s_mb_free_pending = 0;
@@ -3518,6 +3477,8 @@ int ext4_mb_init(struct super_block *sb)
free_percpu(sbi->s_locality_groups);
sbi->s_locality_groups = NULL;
out:
+ kfree(sbi->s_mb_avg_fragment_size);
+ kfree(sbi->s_mb_avg_fragment_size_locks);
kfree(sbi->s_mb_largest_free_orders);
kfree(sbi->s_mb_largest_free_orders_locks);
kfree(sbi->s_mb_offsets);
@@ -3584,6 +3545,8 @@ int ext4_mb_release(struct super_block *sb)
kvfree(group_info);
rcu_read_unlock();
}
+ kfree(sbi->s_mb_avg_fragment_size);
+ kfree(sbi->s_mb_avg_fragment_size_locks);
kfree(sbi->s_mb_largest_free_orders);
kfree(sbi->s_mb_largest_free_orders_locks);
kfree(sbi->s_mb_offsets);
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index 39da92ceabf8..dcda2a943cee 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -178,7 +178,6 @@ struct ext4_allocation_context {
/* copy of the best found extent taken before preallocation efforts */
struct ext4_free_extent ac_f_ex;
- ext4_group_t ac_last_optimal_group;
__u32 ac_groups_considered;
__u32 ac_flags; /* allocation hints */
__u16 ac_groups_scanned;
--
2.35.3
The patch titled
Subject: mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-huge_memory-use-pfn_to_online_page-in-split_huge_pages_all.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()
Date: Thu, 8 Sep 2022 13:11:50 +0900
NULL pointer dereference is triggered when calling thp split via debugfs
on the system with offlined memory blocks. With debug option enabled, the
following kernel messages are printed out:
page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000
flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
page:000000007d7ab72e is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1248!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 16 PID: 20964 Comm: bash Tainted: G I 6.0.0-rc3-foll-numa+ #41
...
RIP: 0010:split_huge_pages_write+0xcf4/0xe30
This shows that page_to_nid() in page_zone() is unexpectedly called for an
offlined memmap.
Use pfn_to_online_page() to get struct page in PFN walker.
Link: https://lkml.kernel.org/r/20220908041150.3430269-1-naoya.horiguchi@linux.dev
Fixes: 49071d436b51 ("thp: add debugfs handle to split all huge pages")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Co-developed-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org> [5.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/mm/huge_memory.c~mm-huge_memory-use-pfn_to_online_page-in-split_huge_pages_all
+++ a/mm/huge_memory.c
@@ -2894,11 +2894,9 @@ static void split_huge_pages_all(void)
max_zone_pfn = zone_end_pfn(zone);
for (pfn = zone->zone_start_pfn; pfn < max_zone_pfn; pfn++) {
int nr_pages;
- if (!pfn_valid(pfn))
- continue;
- page = pfn_to_page(pfn);
- if (!get_page_unless_zero(page))
+ page = pfn_to_online_page(pfn);
+ if (!page || !get_page_unless_zero(page))
continue;
if (zone != page_zone(page))
_
Patches currently in -mm which might be from naoya.horiguchi(a)nec.com are
mm-huge_memory-use-pfn_to_online_page-in-split_huge_pages_all.patch
The patch titled
Subject: mm: fix madivse_pageout mishandling on non-LRU page
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-fix-madivse_pageout-mishandling-on-non-lru-page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Minchan Kim <minchan(a)kernel.org>
Subject: mm: fix madivse_pageout mishandling on non-LRU page
Date: Thu, 8 Sep 2022 08:12:04 -0700
MADV_PAGEOUT tries to isolate non-LRU pages and gets a warning from
isolate_lru_page below.
Fix it by checking PageLRU in advance.
------------[ cut here ]------------
trying to isolate tail page
WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140
Modules linked in:
CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:isolate_lru_page+0x130/0x140
Link: https://lore.kernel.org/linux-mm/485f8c33.2471b.182d5726afb.Coremail.hantia…
Link: https://lkml.kernel.org/r/20220908151204.762596-1-minchan@kernel.org
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Reported-by: �������`� <hantianshuo(a)iie.ac.cn>
Suggested-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/madvise.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/madvise.c~mm-fix-madivse_pageout-mishandling-on-non-lru-page
+++ a/mm/madvise.c
@@ -451,8 +451,11 @@ regular_page:
continue;
}
- /* Do not interfere with other mappings of this page */
- if (page_mapcount(page) != 1)
+ /*
+ * Do not interfere with other mappings of this page and
+ * non-LRU page.
+ */
+ if (!PageLRU(page) || page_mapcount(page) != 1)
continue;
VM_BUG_ON_PAGE(PageTransCompound(page), page);
_
Patches currently in -mm which might be from minchan(a)kernel.org are
mm-fix-madivse_pageout-mishandling-on-non-lru-page.patch
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
BugLink: https://bugs.launchpad.net/bugs/1988809
__mkroute_input() uses fib_validate_source() to trigger an icmp redirect.
My understanding is that fib_validate_source() is used to know if the src
address and the gateway address are on the same link. For that,
fib_validate_source() returns 1 (same link) or 0 (not the same network).
__mkroute_input() is the only user of these positive values, all other
callers only look if the returned value is negative.
Since the below patch, fib_validate_source() didn't return anymore 1 when
both addresses are on the same network, because the route lookup returns
RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right.
Let's adapat the test to return 1 again when both addresses are on the same
link.
CC: stable(a)vger.kernel.org
Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop")
Reported-by: kernel test robot <yujie.liu(a)intel.com>
Reported-by: Heng Qi <hengqi(a)linux.alibaba.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
(cherry-picked from commit eb55dc09b5dd040232d5de32812cc83001a23da6)
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger(a)canonical.com>
---
net/ipv4/fib_frontend.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index ef3e7a3e3a29e..d38c8ca93ba09 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -399,7 +399,7 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
dev_match = dev_match || (res.type == RTN_LOCAL &&
dev == net->loopback_dev);
if (dev_match) {
- ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_HOST;
+ ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_LINK;
return ret;
}
if (no_addr)
@@ -411,7 +411,7 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
ret = 0;
if (fib_lookup(net, &fl4, &res, FIB_LOOKUP_IGNORE_LINKSTATE) == 0) {
if (res.type == RTN_UNICAST)
- ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_HOST;
+ ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_LINK;
}
return ret;
--
2.34.1
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
BugLink: https://bugs.launchpad.net/bugs/1988809
When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1
ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 up
ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
> cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable(a)vger.kernel.org
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette(a)6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
(backported from commit 747c14307214b55dbd8250e1ab44cad8305756f1)
[lukenow: use dev_hold instead of dev_hold_track as dev_hold_track
requires some debugging infastructure that we don't want to backport]
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger(a)canonical.com>
---
net/ipv4/fib_semantics.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f99ad4a98907d..16fe034615635 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1217,7 +1217,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
nh->fib_nh_dev = in_dev->dev;
dev_hold(nh->fib_nh_dev);
- nh->fib_nh_scope = RT_SCOPE_HOST;
+ nh->fib_nh_scope = RT_SCOPE_LINK;
if (!netif_carrier_ok(nh->fib_nh_dev))
nh->fib_nh_flags |= RTNH_F_LINKDOWN;
err = 0;
--
2.34.1
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/base/dd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index ff59a1851cb4..faaa0440b294 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -590,6 +590,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
@@ -732,6 +737,11 @@ static int __driver_attach(struct device *dev, void *data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
--
2.37.2.789.g6183377224-goog
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/base/dd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 26ba7a99b7d5..63390a416b44 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -738,6 +738,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
@@ -891,6 +896,11 @@ static int __driver_attach(struct device *dev, void *data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
--
2.37.2.789.g6183377224-goog
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9b6641dd95a0 ("ext4: fix super block checksum incorrect after mount")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9a472613f5bc ("soc: fsl: select FSL_GUTS driver for DPIO")
69651bd8d303 ("soc: fsl: dpio: add Net DIM integration")
ed1d2143fee5 ("soc: fsl: dpio: add support for irq coalescing per software portal")
2cf0b6fe9bd3 ("soc: fsl: dpio: extract the QBMAN clock frequency from the attributes")
3b2abda7d28c ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
b46fe745e4f6 ("soc: fsl: dpio: QMAN performance improvement with function pointer indirection")
9d98809711ae ("soc: fsl: dpio: Adding QMAN multiple enqueue interface")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9a472613f5bccf1b36837423495ae592a9c5182f Mon Sep 17 00:00:00 2001
From: Mathew McBride <matt(a)traverse.com.au>
Date: Thu, 1 Sep 2022 05:21:49 +0000
Subject: [PATCH] soc: fsl: select FSL_GUTS driver for DPIO
The soc/fsl/dpio driver will perform a soc_device_match()
to determine the optimal cache settings for a given CPU core.
If FSL_GUTS is not enabled, this search will fail and
the driver will not configure cache stashing for the given
DPIO, and a string of "unknown SoC" messages will appear:
fsl_mc_dpio dpio.7: unknown SoC version
fsl_mc_dpio dpio.6: unknown SoC version
fsl_mc_dpio dpio.5: unknown SoC version
Fixes: 51da14e96e9b ("soc: fsl: dpio: configure cache stashing destination")
Signed-off-by: Mathew McBride <matt(a)traverse.com.au>
Reviewed-by: Ioana Ciornei <ioana.ciornei(a)nxp.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20220901052149.23873-2-matt@traverse.com.au'
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index 07d52cafbb31..fcec6ed83d5e 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -24,6 +24,7 @@ config FSL_MC_DPIO
tristate "QorIQ DPAA2 DPIO driver"
depends on FSL_MC_BUS
select SOC_BUS
+ select FSL_GUTS
select DIMLIB
help
Driver for the DPAA2 DPIO object. A DPIO provides queue and
The patch below does not apply to the 5.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
<!DOCTYPE html>
<html lang='en'>
<head>
<title>kernel/git/sashal/deps.git - Sashas dependency resolver</title>
<meta name='generator' content='cgit '/>
<meta name='robots' content='noindex, nofollow'/>
<link rel='stylesheet' type='text/css' href='/cgit-data/cgit.css'/>
<link rel='shortcut icon' href='/favicon.ico'/>
<link rel='alternate' title='Atom feed' href='http://git.kernel.org/pub/scm/linux/kernel/git/sashal/deps.git/atom/?h=mast…' type='application/atom+xml'/>
<link rel='vcs-git' href='git://git.kernel.org/pub/scm/linux/kernel/git/sashal/deps.git' title='kernel/git/sashal/deps.git Git repository'/>
<link rel='vcs-git' href='https://git.kernel.org/pub/scm/linux/kernel/git/sashal/deps.git' title='kernel/git/sashal/deps.git Git repository'/>
<link rel='vcs-git' href='https://kernel.googlesource.com/pub/scm/linux/kernel/git/sashal/deps.git' title='kernel/git/sashal/deps.git Git repository'/>
</head>
<body>
<div id='cgit'><table id='header'>
<tr>
<td class='logo' rowspan='2'><a href='/'><img src='/cgit-data/cgit.png' alt='cgit logo'/></a></td>
<td class='main'><a href='/'>index</a> : <a title='kernel/git/sashal/deps.git' href='/pub/scm/linux/kernel/git/sashal/deps.git/'>kernel/git/sashal/deps.git</a></td><td class='form'><form method='get'>
<select name='h' onchange='this.form.submit();'>
<option value='master' selected='selected'>master</option>
</select> <input type='submit' value='switch'/></form></td></tr>
<tr><td class='sub'>Sashas dependency resolver</td><td class='sub right'>Sasha Levin</td></tr></table>
<table class='tabs'><tr><td>
<a href='/pub/scm/linux/kernel/git/sashal/deps.git/'>summary</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/refs/'>refs</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/log/'>log</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/tree/'>tree</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/commit/'>commit</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/diff/'>diff</a><a href='/pub/scm/linux/kernel/git/sashal/deps.git/stats/'>stats</a></td><td class='form'><form class='right' method='get' action='/pub/scm/linux/kernel/git/sashal/deps.git/log/'>
<select name='qt'>
<option value='grep'>log msg</option>
<option value='author'>author</option>
<option value='committer'>committer</option>
<option value='range'>range</option>
</select>
<input class='txt' type='search' size='10' name='q' value=''/>
<input type='submit' value='search'/>
</form>
</td></tr></table>
<div class='content'><div class='error'>Not found</div>
</div> <!-- class=content -->
<div class='footer'>generated by <a href='https://git.zx2c4.com/cgit/about/'>cgit </a> (<a href='https://git-scm.com/'>git 2.34.1</a>) at 2022-09-08 17:04:03 +0000</div>
</div> <!-- id=cgit -->
</body>
</html>
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001
From: Ye Bin <yebin10(a)huawei.com>
Date: Wed, 25 May 2022 09:29:04 +0800
Subject: [PATCH] ext4: fix super block checksum incorrect after mount
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We got issue as follows:
[home]# mount /dev/sda test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock! Retrying...
Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.
To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Cc: stable(a)kernel.org
Reviewed-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list(a)gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b2ecae8adbfc..13d562d11235 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
GFP_KERNEL);
}
- /*
- * Update the checksum after updating free space/inode
- * counters. Otherwise the superblock can have an incorrect
- * checksum in the buffer cache until it is written out and
- * e2fsprogs programs trying to open a file system immediately
- * after it is mounted can fail.
- */
- ext4_superblock_csum_set(sb);
if (!err)
err = percpu_counter_init(&sbi->s_dirs_counter,
ext4_count_dirs(sb), GFP_KERNEL);
@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
+ /*
+ * Update the checksum after updating free space/inode counters and
+ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
+ * checksum in the buffer cache until it is written out and
+ * e2fsprogs programs trying to open a file system immediately
+ * after it is mounted can fail.
+ */
+ ext4_superblock_csum_set(sb);
if (needs_recovery) {
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9a472613f5bc ("soc: fsl: select FSL_GUTS driver for DPIO")
69651bd8d303 ("soc: fsl: dpio: add Net DIM integration")
ed1d2143fee5 ("soc: fsl: dpio: add support for irq coalescing per software portal")
2cf0b6fe9bd3 ("soc: fsl: dpio: extract the QBMAN clock frequency from the attributes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9a472613f5bccf1b36837423495ae592a9c5182f Mon Sep 17 00:00:00 2001
From: Mathew McBride <matt(a)traverse.com.au>
Date: Thu, 1 Sep 2022 05:21:49 +0000
Subject: [PATCH] soc: fsl: select FSL_GUTS driver for DPIO
The soc/fsl/dpio driver will perform a soc_device_match()
to determine the optimal cache settings for a given CPU core.
If FSL_GUTS is not enabled, this search will fail and
the driver will not configure cache stashing for the given
DPIO, and a string of "unknown SoC" messages will appear:
fsl_mc_dpio dpio.7: unknown SoC version
fsl_mc_dpio dpio.6: unknown SoC version
fsl_mc_dpio dpio.5: unknown SoC version
Fixes: 51da14e96e9b ("soc: fsl: dpio: configure cache stashing destination")
Signed-off-by: Mathew McBride <matt(a)traverse.com.au>
Reviewed-by: Ioana Ciornei <ioana.ciornei(a)nxp.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20220901052149.23873-2-matt@traverse.com.au'
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index 07d52cafbb31..fcec6ed83d5e 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -24,6 +24,7 @@ config FSL_MC_DPIO
tristate "QorIQ DPAA2 DPIO driver"
depends on FSL_MC_BUS
select SOC_BUS
+ select FSL_GUTS
select DIMLIB
help
Driver for the DPAA2 DPIO object. A DPIO provides queue and
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9a472613f5bccf1b36837423495ae592a9c5182f Mon Sep 17 00:00:00 2001
From: Mathew McBride <matt(a)traverse.com.au>
Date: Thu, 1 Sep 2022 05:21:49 +0000
Subject: [PATCH] soc: fsl: select FSL_GUTS driver for DPIO
The soc/fsl/dpio driver will perform a soc_device_match()
to determine the optimal cache settings for a given CPU core.
If FSL_GUTS is not enabled, this search will fail and
the driver will not configure cache stashing for the given
DPIO, and a string of "unknown SoC" messages will appear:
fsl_mc_dpio dpio.7: unknown SoC version
fsl_mc_dpio dpio.6: unknown SoC version
fsl_mc_dpio dpio.5: unknown SoC version
Fixes: 51da14e96e9b ("soc: fsl: dpio: configure cache stashing destination")
Signed-off-by: Mathew McBride <matt(a)traverse.com.au>
Reviewed-by: Ioana Ciornei <ioana.ciornei(a)nxp.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20220901052149.23873-2-matt@traverse.com.au'
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index 07d52cafbb31..fcec6ed83d5e 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -24,6 +24,7 @@ config FSL_MC_DPIO
tristate "QorIQ DPAA2 DPIO driver"
depends on FSL_MC_BUS
select SOC_BUS
+ select FSL_GUTS
select DIMLIB
help
Driver for the DPAA2 DPIO object. A DPIO provides queue and
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
9cb636b5f6a8 ("efi: capsule-loader: Fix use-after-free in efi_capsule_write")
5dce14b9d1a2 ("efi/capsule: Clean up pr_err/_info() messages")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9cb636b5f6a8cc6d1b50809ec8f8d33ae0c84c95 Mon Sep 17 00:00:00 2001
From: Hyunwoo Kim <imv4bel(a)gmail.com>
Date: Wed, 7 Sep 2022 09:07:14 -0700
Subject: [PATCH] efi: capsule-loader: Fix use-after-free in efi_capsule_write
A race condition may occur if the user calls close() on another thread
during a write() operation on the device node of the efi capsule.
This is a race condition that occurs between the efi_capsule_write() and
efi_capsule_flush() functions of efi_capsule_fops, which ultimately
results in UAF.
So, the page freeing process is modified to be done in
efi_capsule_release() instead of efi_capsule_flush().
Cc: <stable(a)vger.kernel.org> # v4.9+
Signed-off-by: Hyunwoo Kim <imv4bel(a)gmail.com>
Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index 4dde8edd53b6..3e8d4b51a814 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -242,29 +242,6 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
return ret;
}
-/**
- * efi_capsule_flush - called by file close or file flush
- * @file: file pointer
- * @id: not used
- *
- * If a capsule is being partially uploaded then calling this function
- * will be treated as upload termination and will free those completed
- * buffer pages and -ECANCELED will be returned.
- **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
-{
- int ret = 0;
- struct capsule_info *cap_info = file->private_data;
-
- if (cap_info->index > 0) {
- pr_err("capsule upload not complete\n");
- efi_free_all_buff_pages(cap_info);
- ret = -ECANCELED;
- }
-
- return ret;
-}
-
/**
* efi_capsule_release - called by file close
* @inode: not used
@@ -277,6 +254,13 @@ static int efi_capsule_release(struct inode *inode, struct file *file)
{
struct capsule_info *cap_info = file->private_data;
+ if (cap_info->index > 0 &&
+ (cap_info->header.headersize == 0 ||
+ cap_info->count < cap_info->total_size)) {
+ pr_err("capsule upload not complete\n");
+ efi_free_all_buff_pages(cap_info);
+ }
+
kfree(cap_info->pages);
kfree(cap_info->phys);
kfree(file->private_data);
@@ -324,7 +308,6 @@ static const struct file_operations efi_capsule_fops = {
.owner = THIS_MODULE,
.open = efi_capsule_open,
.write = efi_capsule_write,
- .flush = efi_capsule_flush,
.release = efi_capsule_release,
.llseek = no_llseek,
};
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
1a3887924a7e ("efi: libstub: Disable struct randomization")
cc49c71d2abe ("efi/libstub: Disable Shadow Call Stack")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a3887924a7e6edd331be76da7bf4c1e8eab4b1e Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ardb(a)kernel.org>
Date: Mon, 22 Aug 2022 19:20:33 +0200
Subject: [PATCH] efi: libstub: Disable struct randomization
The EFI stub is a wrapper around the core kernel that makes it look like
a EFI compatible PE/COFF application to the EFI firmware. EFI
applications run on top of the EFI runtime, which is heavily based on
so-called protocols, which are struct types consisting [mostly] of
function pointer members that are instantiated and recorded in a
protocol database.
These structs look like the ideal randomization candidates to the
randstruct plugin (as they only carry function pointers), but of course,
these protocols are contracts between the firmware that exposes them,
and the EFI applications (including our stubbed kernel) that invoke
them. This means that struct randomization for EFI protocols is not a
great idea, and given that the stub shares very little data with the
core kernel that is represented as a randomizable struct, we're better
off just disabling it completely here.
Cc: <stable(a)vger.kernel.org> # v4.14+
Reported-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Tested-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index d0537573501e..2c67f71f2375 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -37,6 +37,13 @@ KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
$(call cc-option,-fno-addrsig) \
-D__DISABLE_EXPORTS
+#
+# struct randomization only makes sense for Linux internal types, which the EFI
+# stub code never touches, so let's turn off struct randomization for the stub
+# altogether
+#
+KBUILD_CFLAGS := $(filter-out $(RANDSTRUCT_CFLAGS), $(KBUILD_CFLAGS))
+
# remove SCS flags from all objects in this directory
KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
# disable LTO
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
1a3887924a7e ("efi: libstub: Disable struct randomization")
cc49c71d2abe ("efi/libstub: Disable Shadow Call Stack")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a3887924a7e6edd331be76da7bf4c1e8eab4b1e Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ardb(a)kernel.org>
Date: Mon, 22 Aug 2022 19:20:33 +0200
Subject: [PATCH] efi: libstub: Disable struct randomization
The EFI stub is a wrapper around the core kernel that makes it look like
a EFI compatible PE/COFF application to the EFI firmware. EFI
applications run on top of the EFI runtime, which is heavily based on
so-called protocols, which are struct types consisting [mostly] of
function pointer members that are instantiated and recorded in a
protocol database.
These structs look like the ideal randomization candidates to the
randstruct plugin (as they only carry function pointers), but of course,
these protocols are contracts between the firmware that exposes them,
and the EFI applications (including our stubbed kernel) that invoke
them. This means that struct randomization for EFI protocols is not a
great idea, and given that the stub shares very little data with the
core kernel that is represented as a randomizable struct, we're better
off just disabling it completely here.
Cc: <stable(a)vger.kernel.org> # v4.14+
Reported-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Tested-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index d0537573501e..2c67f71f2375 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -37,6 +37,13 @@ KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
$(call cc-option,-fno-addrsig) \
-D__DISABLE_EXPORTS
+#
+# struct randomization only makes sense for Linux internal types, which the EFI
+# stub code never touches, so let's turn off struct randomization for the stub
+# altogether
+#
+KBUILD_CFLAGS := $(filter-out $(RANDSTRUCT_CFLAGS), $(KBUILD_CFLAGS))
+
# remove SCS flags from all objects in this directory
KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
# disable LTO
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
1a3887924a7e ("efi: libstub: Disable struct randomization")
cc49c71d2abe ("efi/libstub: Disable Shadow Call Stack")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a3887924a7e6edd331be76da7bf4c1e8eab4b1e Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ardb(a)kernel.org>
Date: Mon, 22 Aug 2022 19:20:33 +0200
Subject: [PATCH] efi: libstub: Disable struct randomization
The EFI stub is a wrapper around the core kernel that makes it look like
a EFI compatible PE/COFF application to the EFI firmware. EFI
applications run on top of the EFI runtime, which is heavily based on
so-called protocols, which are struct types consisting [mostly] of
function pointer members that are instantiated and recorded in a
protocol database.
These structs look like the ideal randomization candidates to the
randstruct plugin (as they only carry function pointers), but of course,
these protocols are contracts between the firmware that exposes them,
and the EFI applications (including our stubbed kernel) that invoke
them. This means that struct randomization for EFI protocols is not a
great idea, and given that the stub shares very little data with the
core kernel that is represented as a randomizable struct, we're better
off just disabling it completely here.
Cc: <stable(a)vger.kernel.org> # v4.14+
Reported-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Tested-by: Daniel Marth <daniel.marth(a)inso.tuwien.ac.at>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index d0537573501e..2c67f71f2375 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -37,6 +37,13 @@ KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
$(call cc-option,-fno-addrsig) \
-D__DISABLE_EXPORTS
+#
+# struct randomization only makes sense for Linux internal types, which the EFI
+# stub code never touches, so let's turn off struct randomization for the stub
+# altogether
+#
+KBUILD_CFLAGS := $(filter-out $(RANDSTRUCT_CFLAGS), $(KBUILD_CFLAGS))
+
# remove SCS flags from all objects in this directory
KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
# disable LTO
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Commit 6e71b04a82248 ("bpf: Add file mode configuration into bpf maps")
added the BPF_F_RDONLY and BPF_F_WRONLY flags, to let user space specify
whether it will just read or modify a map.
Map access control is done in two steps. First, when user space wants to
obtain a map fd, it provides to the kernel the eBPF-defined flags, which
are converted into open flags and passed to the security_bpf_map() security
hook for evaluation by LSMs.
Second, if user space successfully obtained an fd, it passes that fd to the
kernel when it requests a map operation (e.g. lookup or update). The kernel
first checks if the fd has the modes required to perform the requested
operation and, if yes, continues the execution and returns the result to
user space.
While the fd modes check was added for map_*_elem() functions, it is
currently missing for map iterators, added more recently with commit
a5cbe05a6673 ("bpf: Implement bpf iterator for map elements"). A map
iterator executes a chosen eBPF program for each key/value pair of a map
and allows that program to read and/or modify them.
Whether a map iterator allows only read or also write depends on whether
the MEM_RDONLY flag in the ctx_arg_info member of the bpf_iter_reg
structure is set. Also, write needs to be supported at verifier level (for
example, it is currently not supported for sock maps).
Since map iterators obtain a map from a user space fd with
bpf_map_get_with_uref(), add the new req_modes parameter to that function,
so that map iterators can provide the required fd modes to access a map. If
the user space fd doesn't include the required modes,
bpf_map_get_with_uref() returns with an error, and the map iterator will
not be created.
If a map iterator marks both the key and value as read-only, it calls
bpf_map_get_with_uref() with FMODE_CAN_READ as value for req_modes. If it
also allows write access to either the key or the value, it calls that
function with FMODE_CAN_READ | FMODE_CAN_WRITE as value for req_modes,
regardless of whether or not the write is supported by the verifier (the
write is intentionally allowed).
bpf_fd_probe_obj() does not require any fd mode, as the fd is only used for
the purpose of finding the eBPF object type, for pinning the object to the
bpffs filesystem.
Finally, it is worth to mention that the fd modes check was not added for
the cgroup iterator, although it registers an attach_target method like the
other iterators. The reason is that the fd is not the only way for user
space to reference a cgroup object (also by ID and by path). For the
protection to be effective, all reference methods need to be evaluated
consistently. This work is deferred to a separate patch.
Cc: stable(a)vger.kernel.org # 5.10.x
Fixes: a5cbe05a6673 ("bpf: Implement bpf iterator for map elements")
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
---
include/linux/bpf.h | 2 +-
kernel/bpf/inode.c | 2 +-
kernel/bpf/map_iter.c | 3 ++-
kernel/bpf/syscall.c | 8 +++++++-
net/core/bpf_sk_storage.c | 3 ++-
net/core/sock_map.c | 3 ++-
6 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9c1674973e03..6cd2ca910553 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1628,7 +1628,7 @@ bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_ma
void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
struct bpf_map *bpf_map_get(u32 ufd);
-struct bpf_map *bpf_map_get_with_uref(u32 ufd);
+struct bpf_map *bpf_map_get_with_uref(u32 ufd, fmode_t req_modes);
struct bpf_map *__bpf_map_get(struct fd f);
void bpf_map_inc(struct bpf_map *map);
void bpf_map_inc_with_uref(struct bpf_map *map);
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 4f841e16779e..862e1caa8b0f 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -71,7 +71,7 @@ static void *bpf_fd_probe_obj(u32 ufd, enum bpf_type *type)
{
void *raw;
- raw = bpf_map_get_with_uref(ufd);
+ raw = bpf_map_get_with_uref(ufd, 0);
if (!IS_ERR(raw)) {
*type = BPF_TYPE_MAP;
return raw;
diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c
index b0fa190b0979..1143f8960135 100644
--- a/kernel/bpf/map_iter.c
+++ b/kernel/bpf/map_iter.c
@@ -110,7 +110,8 @@ static int bpf_iter_attach_map(struct bpf_prog *prog,
if (!linfo->map.map_fd)
return -EBADF;
- map = bpf_map_get_with_uref(linfo->map.map_fd);
+ map = bpf_map_get_with_uref(linfo->map.map_fd,
+ FMODE_CAN_READ | FMODE_CAN_WRITE);
if (IS_ERR(map))
return PTR_ERR(map);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e9d4622aef7..4a2063d8e99c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1232,7 +1232,7 @@ struct bpf_map *bpf_map_get(u32 ufd)
}
EXPORT_SYMBOL(bpf_map_get);
-struct bpf_map *bpf_map_get_with_uref(u32 ufd)
+struct bpf_map *bpf_map_get_with_uref(u32 ufd, fmode_t req_modes)
{
struct fd f = fdget(ufd);
struct bpf_map *map;
@@ -1241,7 +1241,13 @@ struct bpf_map *bpf_map_get_with_uref(u32 ufd)
if (IS_ERR(map))
return map;
+ if ((map_get_sys_perms(map, f) & req_modes) != req_modes) {
+ map = ERR_PTR(-EPERM);
+ goto out;
+ }
+
bpf_map_inc_with_uref(map);
+out:
fdput(f);
return map;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 1b7f385643b4..bf9c6afed8ac 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -897,7 +897,8 @@ static int bpf_iter_attach_map(struct bpf_prog *prog,
if (!linfo->map.map_fd)
return -EBADF;
- map = bpf_map_get_with_uref(linfo->map.map_fd);
+ map = bpf_map_get_with_uref(linfo->map.map_fd,
+ FMODE_CAN_READ | FMODE_CAN_WRITE);
if (IS_ERR(map))
return PTR_ERR(map);
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index a660baedd9e7..7f7375dc39b2 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1636,7 +1636,8 @@ static int sock_map_iter_attach_target(struct bpf_prog *prog,
if (!linfo->map.map_fd)
return -EBADF;
- map = bpf_map_get_with_uref(linfo->map.map_fd);
+ map = bpf_map_get_with_uref(linfo->map.map_fd,
+ FMODE_CAN_READ | FMODE_CAN_WRITE);
if (IS_ERR(map))
return PTR_ERR(map);
--
2.25.1
MADV_PAGEOUT tries to isolate non-LRU pages and get the warning
from isolate_lru_page below.
Fix it with checking PageLRU in advance.
------------[ cut here ]------------
trying to isolate tail page
WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140
Modules linked in:
CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:isolate_lru_page+0x130/0x140
Link: https://lore.kernel.org/linux-mm/485f8c33.2471b.182d5726afb.Coremail.hantia…
Reported-by: 韩天硕 <hantianshuo(a)iie.ac.cn>
Suggested-by: Yang Shi <shy828301(a)gmail.com>
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Cc: stable(a)vger.kernel.org
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Acked-by: Yang Shi <shy828301(a)gmail.com>
---
mm/madvise.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 682e1d161aef..a3fc4cd32ed3 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -452,8 +452,11 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
continue;
}
- /* Do not interfere with other mappings of this page */
- if (page_mapcount(page) != 1)
+ /*
+ * Do not interfere with other mappings of this page and
+ * non-LRU page.
+ */
+ if (!PageLRU(page) || page_mapcount(page) != 1)
continue;
VM_BUG_ON_PAGE(PageTransCompound(page), page);
--
2.37.2.672.g94769d06f0-goog
Prior to 5.6, when /dev/random was opened with O_NONBLOCK, it would
return -EAGAIN if there was no entropy. When the pools were unified in
5.6, this was lost. The post 5.6 behavior of blocking until the pool is
initialized, and ignoring O_NONBLOCK in the process, went unnoticed,
with no reports about the regression received for two and a half years.
However, eventually this indeed did break somebody's userspace.
So we restore the old behavior, by returning -EAGAIN if the pool is not
initialized. Unlike the old /dev/random, this can only occur during
early boot, after which it never blocks again.
In order to make this O_NONBLOCK behavior consistent with other
expectations, also respect users reading with preadv2(RWF_NOWAIT) and
similar.
Fixes: 30c08efec888 ("random: make /dev/random be almost like /dev/urandom")
Reported-by: Guozihua <guozihua(a)huawei.com>
Reported-by: Zhongguohua <zhongguohua1(a)huawei.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: Andrew Lutomirski <luto(a)kernel.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/char/mem.c | 4 ++--
drivers/char/random.c | 5 +++++
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 84ca98ed1dad..c2b37009b11e 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -706,8 +706,8 @@ static const struct memdev {
#endif
[5] = { "zero", 0666, &zero_fops, FMODE_NOWAIT },
[7] = { "full", 0666, &full_fops, 0 },
- [8] = { "random", 0666, &random_fops, 0 },
- [9] = { "urandom", 0666, &urandom_fops, 0 },
+ [8] = { "random", 0666, &random_fops, FMODE_NOWAIT },
+ [9] = { "urandom", 0666, &urandom_fops, FMODE_NOWAIT },
#ifdef CONFIG_PRINTK
[11] = { "kmsg", 0644, &kmsg_fops, 0 },
#endif
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 79d7d4e4e582..c8cc23515568 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1347,6 +1347,11 @@ static ssize_t random_read_iter(struct kiocb *kiocb, struct iov_iter *iter)
{
int ret;
+ if (!crng_ready() &&
+ ((kiocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO)) ||
+ (kiocb->ki_filp->f_flags & O_NONBLOCK)))
+ return -EAGAIN;
+
ret = wait_for_random_bytes();
if (ret != 0)
return ret;
--
2.37.3
Hi,
while trying to backup a Dell R7525 system running Debian bookworm/testing
using
LVM snapshots I noticed that the system will 'freeze' sometimes (not all the
times) when creating the snapshot.
First I thought this was related to LVM so I created
https://listman.redhat.com/archives/linux-lvm/2022-July/026228.html
(continued at
https://listman.redhat.com/archives/linux-lvm/2022-August/thread.html#26229)
Long story short:
I was even able to reproduce with fsfreeze, see last strace lines
> [...]
> 14471 1659449870.984635 openat(AT_FDCWD, "/var/lib/machines", O_RDONLY) =
3
> 14471 1659449870.984658 newfstatat(3, "", {st_mode=S_IFDIR|0700,
st_size=4096, ...}, AT_EMPTY_PATH) = 0
> 14471 1659449870.984678 ioctl(3, FIFREEZE
so I started to bisect kernel and found the following bad commit:
> md: add support for REQ_NOWAIT
>
> commit 021a24460dc2 ("block: add QUEUE_FLAG_NOWAIT") added support
> for checking whether a given bdev supports handling of REQ_NOWAIT or not.
> Since then commit 6abc49468eea ("dm: add support for REQ_NOWAIT and enable
> it for linear target") added support for REQ_NOWAIT for dm. This uses
> a similar approach to incorporate REQ_NOWAIT for md based bios.
>
> This patch was tested using t/io_uring tool within FIO. A nvme drive
> was partitioned into 2 partitions and a simple raid 0 configuration
> /dev/md0 was created.
>
> md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0]
> 937423872 blocks super 1.2 512k chunks
>
> Before patch:
>
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
>
> Running top while the above runs:
>
> $ ps -eL | grep $(pidof io_uring)
>
> 38396 38396 pts/2 00:00:00 io_uring
> 38396 38397 pts/2 00:00:15 io_uring
> 38396 38398 pts/2 00:00:13 iou-wrk-38397
>
> We can see iou-wrk-38397 io worker thread created which gets created
> when io_uring sees that the underlying device (/dev/md0 in this case)
> doesn't support nowait.
>
> After patch:
>
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
>
> Running top while the above runs:
>
> $ ps -eL | grep $(pidof io_uring)
>
> 38341 38341 pts/2 00:10:22 io_uring
> 38341 38342 pts/2 00:10:37 io_uring
>
> After running this patch, we don't see any io worker thread
> being created which indicated that io_uring saw that the
> underlying device does support nowait. This is the exact behaviour
> noticed on a dm device which also supports nowait.
>
> For all the other raid personalities except raid0, we would need
> to train pieces which involves make_request fn in order for them
> to correctly handle REQ_NOWAIT.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i
d=f51d46d0e7cb5b8494aa534d276a9d8915a2443d
After reverting this commit (and follow up commit
0f9650bd838efe5c52f7e5f40c3204ad59f1964d)
v5.18.15 and v5.19 worked for me again.
At this point I still wonder why I experienced the same problem even after I
removed one nvme device from the mdraid array and tested it separately. So
maybe
there is another nowait/REQ_NOWAIT problem somewhere. During bisect I only
tested
against the mdraid array.
#regzbot introduced: f51d46d0e7cb5b8494aa534d276a9d8915a2443d
#regzbot link:
https://listman.redhat.com/archives/linux-lvm/2022-July/026228.html
#regzbot link:
https://listman.redhat.com/archives/linux-lvm/2022-August/thread.html#26229
--
Regards,
Thomas
صندوق النقد الدولي (I.M.F)
شعبة إدارة الديون الدولية ،
# 1900 ، شارع الرئيس
مرحبًا بكم في عنوان البريد الإلكتروني الرسمي للمدير I.M.F. كريستالينا جورجيفا
عزيزي المستفيد!
لقد سمح لنا وزير الخزانة المعين حديثًا والهيئة الحاكمة للسلطة النقدية
للأمم المتحدة بفحص الأموال التي لم تتم المطالبة بها والتي لطالما كانت
مدينة لحكومة الأمم المتحدة ، لذلك تم اتهام مالكيها بالاحتيال.
المحتالون الذين يستخدمون اسم الأمم المتحدة ، وفقًا لسجل تخزين البيانات
مع عنوان البريد الإلكتروني لنظامنا أثناء التحقيق الذي أجريناه ، فإن
دفعتك مدرجة في قائمة تضم 150 مستفيدًا في الفئات التالية: صندوق يانصيب
غير مُسلَّم / صندوق يانصيب غير مدفوع / وراثة نقل غير مكتملة / أموال
العقد.
قام مسؤولو البنك الفاسد ، الذين ارتكبوا الفساد من أجل الاحتيال على
أموالك ، بتأخير دفعك بشكل غير معقول ، مما أدى إلى تحملك الكثير من
التكاليف وتأخير غير معقول في قبول مدفوعاتك. اختارت الأمم المتحدة
وصندوق النقد الدولي (IMF) دفع جميع التعويضات لـ 150 مستفيدًا باستخدام
بطاقات Visa ATM من أمريكا الشمالية وأمريكا الجنوبية والولايات المتحدة
وأوروبا وآسيا وحول العالم ، حيث تتوفر تقنية الدفع العالمية هذه
للمستهلكين والشركات والمؤسسات المالية. ويسمح للحكومات باستخدام العملات
الرقمية بدلاً من النقد والشيكات.
لقد قمنا بالترتيب لسداد مدفوعاتك باستخدام بطاقة Visa ATM وسيتم إصدارها
لك وإرسالها مباشرةً إلى عنوانك عبر أي خدمات بريد سريع متاحة. بعد
الاتصال بنا ، سيتم تحويل مبلغ 1،500،000.00 دولار أمريكي إلى بطاقة Visa
ATM ، والتي ستسمح لك بسحب أموالك عن طريق سحب ما لا يقل عن 10،000 دولار
أمريكي في اليوم من أي ماكينة صراف آلي في بلدك. بناءً على طلبك ، يمكنك
زيادة الحد إلى 20،000.00 دولار في اليوم. في هذا الصدد ، يجب عليك
الاتصال بإدارة المدفوعات والتحويلات الدولية وتقديم المعلومات المطلوبة
من خلال:
1. اسمك الكامل ..............
2. عنوانك الكامل ...
3. الجنسية ................
4. تاريخ الميلاد / الجنس .........
5. التخصص ...
6. رقم الهاتف .........
7. عنوان البريد الإلكتروني لشركتك ......
8. عنوان البريد الإلكتروني الشخصي ......
لتحديد هذا الرمز (الرابط: CLIENT-966/16) ، استخدمه كموضوع للبريد
الإلكتروني الخاص بك وحاول تقديم المعلومات المذكورة أعلاه إلى الموظفين
التاليين لإصدار وتسليم بطاقة Visa ATM ؛
نوصيك بفتح عنوان بريد إلكتروني شخصي برقم جديد للسماح لوكيل البنك بتتبع
هذه المدفوعات وتبادل الرسائل لمنع المزيد من التأخير أو التوجيه الخاطئ
لأموالك. اتصل بوكيل البنك الإفريقي المتحد الآن باستخدام معلومات
الاتصال أدناه:
الشخص المسؤول: السيد توني إلوميلو
إدارة تحويل أموال التعويضات ، جهة الاتصال بالبريد الإلكتروني لبنك
إفريقيا المتحد: (mrtonyelumelu98(a)gmail.com)
نحتاج إلى رد سريع على هذا البريد الإلكتروني لتجنب المزيد من التأخير.
صديقك المخلص
السّيدة. كريستالينا جورجيفا
The lock handling problems in gsmld_write(), reported by Syzkaller, have
been fixed by the following patches which can be cleanly applied to the
5.10 branch.
commit fe8f65b018effbf473f53af3538d0c1878b8b329 upstream.
Xen blkfront advertises its support of the persistent grants feature
when it first setting up and when resuming in 'talk_to_blkback()'.
Then, blkback reads the advertised value when it connects with blkfront
and decides if it will use the persistent grants feature or not, and
advertises its decision to blkfront. Blkfront reads the blkback's
decision and it also makes the decision for the use of the feature.
Commit 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter
when connect"), however, made the blkfront's read of the parameter for
disabling the advertisement, namely 'feature_persistent', to be done
when it negotiate, not when advertise. Therefore blkfront advertises
without reading the parameter. As the field for caching the parameter
value is zero-initialized, it always advertises as the feature is
disabled, so that the persistent grants feature becomes always disabled.
This commit fixes the issue by making the blkfront does parmeter caching
just before the advertisement.
Fixes: 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect")
Cc: <stable(a)vger.kernel.org> # 5.10.x
Reported-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-4-sj@kernel.org
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
This patch is a manual backport of the upstream commit on the 5.10.y
kernel. Please note that this patch can be applied on the latest 5.10.y
only after the preceding patch[1] is applied.
[1] https://lore.kernel.org/stable/20220906132819.016040100@linuxfoundation.org/
drivers/block/xen-blkfront.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 9d5460f6e0ff..6f33d62331b1 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1852,6 +1852,12 @@ static void free_info(struct blkfront_info *info)
kfree(info);
}
+/* Enable the persistent grants feature. */
+static bool feature_persistent = true;
+module_param(feature_persistent, bool, 0644);
+MODULE_PARM_DESC(feature_persistent,
+ "Enables the persistent grants feature");
+
/* Common code used when first setting up, and when resuming. */
static int talk_to_blkback(struct xenbus_device *dev,
struct blkfront_info *info)
@@ -1943,6 +1949,7 @@ static int talk_to_blkback(struct xenbus_device *dev,
message = "writing protocol";
goto abort_transaction;
}
+ info->feature_persistent_parm = feature_persistent;
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
info->feature_persistent_parm);
if (err)
@@ -2019,12 +2026,6 @@ static int negotiate_mq(struct blkfront_info *info)
return 0;
}
-/* Enable the persistent grants feature. */
-static bool feature_persistent = true;
-module_param(feature_persistent, bool, 0644);
-MODULE_PARM_DESC(feature_persistent,
- "Enables the persistent grants feature");
-
/**
* Entry point to this code when a new device is created. Allocate the basic
* structures and the ring buffer for communication with the backend, and
@@ -2394,7 +2395,6 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
blkfront_setup_discard(info);
- info->feature_persistent_parm = feature_persistent;
if (info->feature_persistent_parm)
info->feature_persistent =
!!xenbus_read_unsigned(info->xbdev->otherend,
--
2.25.1
Apologies for the delay in reporting this: I messed up my first attempt at
bisecting, then I've spent a week going to, enjoying, returning from and
recovering from a music festival.
Up to and including 5.18.18 things are fine. With 5.19.0 (and .1 and .2) I see
lots of errors and hangs on the USB2 chipset, e.g.
$ grep "usb 9-4" dmesg.5.19.2
[ 6.669075] usb 9-4: new full-speed USB device number 2 using ohci-pci
[ 6.829087] usb 9-4: device descriptor read/64, error -32
[ 7.097094] usb 9-4: device descriptor read/64, error -32
[ 7.361087] usb 9-4: new full-speed USB device number 3 using ohci-pci
[ 7.521152] usb 9-4: device descriptor read/64, error -32
[ 7.789066] usb 9-4: device descriptor read/64, error -32
[ 8.081070] usb 9-4: new full-speed USB device number 4 using ohci-pci
[ 8.497138] usb 9-4: device not accepting address 4, error -32
[ 8.653140] usb 9-4: new full-speed USB device number 5 using ohci-pci
[ 9.069141] usb 9-4: device not accepting address 5, error -32
$
$ grep "usb 1-2" dmesg.5.19.2
[ 5.917102] usb 1-2: new high-speed USB device number 2 using ehci-pci
[ 6.277076] usb 1-2: device descriptor read/64, error -71
[ 6.513143] usb 1-2: device descriptor read/64, error -32
[ 6.753146] usb 1-2: new high-speed USB device number 3 using ehci-pci
[ 6.881143] usb 1-2: device descriptor read/64, error -32
[ 7.117144] usb 1-2: device descriptor read/64, error -32
[ 7.429141] usb 1-2: new high-speed USB device number 4 using ehci-pci
[ 7.845134] usb 1-2: device not accepting address 4, error -32
[ 7.977142] usb 1-2: new high-speed USB device number 5 using ehci-pci
[ 8.393158] usb 1-2: device not accepting address 5, error -32
$
the USB port is then no longer usable
This is not reproducible on the other chipset (USB3) on this machine,
nor on two other systems. Swapping USB cables doesn't help.
I have bisected it to
$ git bisect bad
78013eaadf696d2105982abb4018fbae394ca08f is the first bad commit
commit 78013eaadf696d2105982abb4018fbae394ca08f
Author: Christoph Hellwig <hch(a)lst.de>
Date: Mon Feb 14 14:11:44 2022 +0100
x86: remove the IOMMU table infrastructure
however it will not easily revert
I'll be more than happy to assist with any debugging/testing.
$ git revert 78013eaadf696d2105982abb4018fbae394ca08f
Auto-merging arch/x86/include/asm/dma-mapping.h
CONFLICT (content): Merge conflict in arch/x86/include/asm/dma-mapping.h
Auto-merging arch/x86/include/asm/iommu.h
Auto-merging arch/x86/include/asm/xen/swiotlb-xen.h
Auto-merging arch/x86/kernel/Makefile
Auto-merging arch/x86/kernel/pci-dma.c
CONFLICT (content): Merge conflict in arch/x86/kernel/pci-dma.c
Auto-merging arch/x86/kernel/vmlinux.lds.S
Auto-merging drivers/iommu/amd/init.c
Auto-merging drivers/iommu/amd/iommu.c
CONFLICT (content): Merge conflict in drivers/iommu/amd/iommu.c
Auto-merging drivers/iommu/intel/dmar.c
error: could not revert 78013eaadf69... x86: remove the IOMMU table infrastructure
# dmidecode | grep -A2 "^Base Board"
Base Board Information
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: 970A-DS3P
#
# lspci -nn | grep -i usb
00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
02:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805/806 xHCI USB 3.0 Controller [1106:3483] (rev 01)
#
# lspci -v -s 00:12
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at fe50a000 (32-bit, non-prefetchable) [size=4K]
Kernel driver in use: ohci-pci
Kernel modules: ohci_pci
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller (prog-if 20 [EHCI])
Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
Memory at fe509000 (32-bit, non-prefetchable) [size=256]
Capabilities: [c0] Power Management version 2
Capabilities: [e4] Debug port: BAR=1 offset=00e0
Kernel driver in use: ehci-pci
Kernel modules: ehci_pci
#
# lsusb
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 009 Device 002: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port / Mobile Action MA-8910P
Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 007 Device 002: ID 03f0:0317 HP, Inc LaserJet 1200
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 04e8:6860 Samsung Electronics Co., Ltd Galaxy A5 (MTP)
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
#
$ git bisect log
git bisect start
# good: [4b0986a3613c92f4ec1bdc7f60ec66fea135991f] Linux 5.18
git bisect good 4b0986a3613c92f4ec1bdc7f60ec66fea135991f
# good: [07e0b709cab7dc987b5071443789865e20481119] Linux 5.18.18
git bisect good 07e0b709cab7dc987b5071443789865e20481119
# bad: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
git bisect bad 3d7cb6b04c3f3115719235cc6866b10326de34cd
# bad: [c011dd537ffe47462051930413fed07dbdc80313] Merge tag 'arm-soc-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad c011dd537ffe47462051930413fed07dbdc80313
# good: [7e062cda7d90543ac8c7700fc7c5527d0c0f22ad] Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 7e062cda7d90543ac8c7700fc7c5527d0c0f22ad
# good: [f8122500a039abeabfff41b0ad8b6a2c94c1107d] Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next
git bisect good f8122500a039abeabfff41b0ad8b6a2c94c1107d
# good: [2518f226c60d8e04d18ba4295500a5b0b8ac7659] Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm
git bisect good 2518f226c60d8e04d18ba4295500a5b0b8ac7659
# good: [f7a344468105ef8c54086dfdc800e6f5a8417d3e] ASoC: max98090: Move check for invalid values before casting in max98090_put_enab_tlv()
git bisect good f7a344468105ef8c54086dfdc800e6f5a8417d3e
# good: [fbe86daca0ba878b04fa241b85e26e54d17d4229] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good fbe86daca0ba878b04fa241b85e26e54d17d4229
# good: [709c8632597c3276cd21324b0256628f1a7fd4df] xfs: rework deferred attribute operation setup
git bisect good 709c8632597c3276cd21324b0256628f1a7fd4df
# bad: [babf0bb978e3c9fce6c4eba6b744c8754fd43d8e] Merge tag 'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad babf0bb978e3c9fce6c4eba6b744c8754fd43d8e
# bad: [8b728edc5be161799434cc17e1279db2f8eabe29] Merge tag 'fs_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
git bisect bad 8b728edc5be161799434cc17e1279db2f8eabe29
# bad: [3f70356edf5611c28a68d8d5a9c2b442c9eb81e6] swiotlb: merge swiotlb-xen initialization into swiotlb
git bisect bad 3f70356edf5611c28a68d8d5a9c2b442c9eb81e6
# good: [f39f8d0eb081407e470396fd4cc376c526d13066] MIPS/octeon: use swiotlb_init instead of open coding it
git bisect good f39f8d0eb081407e470396fd4cc376c526d13066
# bad: [c6af2aa9ffc9763826607bc2664ef3ea4475ed18] swiotlb: make the swiotlb_init interface more useful
git bisect bad c6af2aa9ffc9763826607bc2664ef3ea4475ed18
# bad: [a3e230926708125205ffd06d3dc2175a8263ae7e] x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
git bisect bad a3e230926708125205ffd06d3dc2175a8263ae7e
# bad: [78013eaadf696d2105982abb4018fbae394ca08f] x86: remove the IOMMU table infrastructure
git bisect bad 78013eaadf696d2105982abb4018fbae394ca08f
# first bad commit: [78013eaadf696d2105982abb4018fbae394ca08f] x86: remove the IOMMU table infrastructure
$
--
Alan J. Wylie https://www.wylie.me.uk/
Dance like no-one's watching. / Encrypt like everyone is.
Security is inversely proportional to convenience
Curently we don't use any preallocation when a file is already closed
when allocating blocks (from writeback code when converting delayed
allocation). However for small files, using locality group preallocation
is actually desirable as that is not specific to a particular file.
Rather it is a method to pack small files together to reduce
fragmentation and for that the fact the file is closed is actually even
stronger hint the file would benefit from packing. So change the logic
to allow locality group preallocation in this case.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin(a)linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/mballoc.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 6251b4a6cc63..af1e49c3603f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -5195,6 +5195,7 @@ static void ext4_mb_group_or_file(struct ext4_allocation_context *ac)
struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
int bsbits = ac->ac_sb->s_blocksize_bits;
loff_t size, isize;
+ bool inode_pa_eligible, group_pa_eligible;
if (!(ac->ac_flags & EXT4_MB_HINT_DATA))
return;
@@ -5202,25 +5203,27 @@ static void ext4_mb_group_or_file(struct ext4_allocation_context *ac)
if (unlikely(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
return;
+ group_pa_eligible = sbi->s_mb_group_prealloc > 0;
+ inode_pa_eligible = true;
size = ac->ac_o_ex.fe_logical + EXT4_C2B(sbi, ac->ac_o_ex.fe_len);
isize = (i_size_read(ac->ac_inode) + ac->ac_sb->s_blocksize - 1)
>> bsbits;
+ /* No point in using inode preallocation for closed files */
if ((size == isize) && !ext4_fs_is_busy(sbi) &&
- !inode_is_open_for_write(ac->ac_inode)) {
- ac->ac_flags |= EXT4_MB_HINT_NOPREALLOC;
- return;
- }
+ !inode_is_open_for_write(ac->ac_inode))
+ inode_pa_eligible = false;
- if (sbi->s_mb_group_prealloc <= 0) {
- ac->ac_flags |= EXT4_MB_STREAM_ALLOC;
- return;
- }
-
- /* don't use group allocation for large files */
size = max(size, isize);
- if (size > sbi->s_mb_stream_request) {
- ac->ac_flags |= EXT4_MB_STREAM_ALLOC;
+ /* Don't use group allocation for large files */
+ if (size > sbi->s_mb_stream_request)
+ group_pa_eligible = false;
+
+ if (!group_pa_eligible) {
+ if (inode_pa_eligible)
+ ac->ac_flags |= EXT4_MB_STREAM_ALLOC;
+ else
+ ac->ac_flags |= EXT4_MB_HINT_NOPREALLOC;
return;
}
--
2.35.3
mb_set_largest_free_order() updates lists containing groups with largest
chunk of free space of given order. The way it updates it leads to
always moving the group to the tail of the list. Thus allocations
looking for free space of given order effectively end up cycling through
all groups (and due to initialization in last to first order). This
spreads allocations among block groups which reduces performance for
rotating disks or low-end flash media. Change
mb_set_largest_free_order() to only update lists if the order of the
largest free chunk in the group changed.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin(a)linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/mballoc.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 41e1cfecac3b..6251b4a6cc63 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1077,23 +1077,25 @@ mb_set_largest_free_order(struct super_block *sb, struct ext4_group_info *grp)
struct ext4_sb_info *sbi = EXT4_SB(sb);
int i;
- if (test_opt2(sb, MB_OPTIMIZE_SCAN) && grp->bb_largest_free_order >= 0) {
+ for (i = MB_NUM_ORDERS(sb) - 1; i >= 0; i--)
+ if (grp->bb_counters[i] > 0)
+ break;
+ /* No need to move between order lists? */
+ if (!test_opt2(sb, MB_OPTIMIZE_SCAN) ||
+ i == grp->bb_largest_free_order) {
+ grp->bb_largest_free_order = i;
+ return;
+ }
+
+ if (grp->bb_largest_free_order >= 0) {
write_lock(&sbi->s_mb_largest_free_orders_locks[
grp->bb_largest_free_order]);
list_del_init(&grp->bb_largest_free_order_node);
write_unlock(&sbi->s_mb_largest_free_orders_locks[
grp->bb_largest_free_order]);
}
- grp->bb_largest_free_order = -1; /* uninit */
-
- for (i = MB_NUM_ORDERS(sb) - 1; i >= 0; i--) {
- if (grp->bb_counters[i] > 0) {
- grp->bb_largest_free_order = i;
- break;
- }
- }
- if (test_opt2(sb, MB_OPTIMIZE_SCAN) &&
- grp->bb_largest_free_order >= 0 && grp->bb_free) {
+ grp->bb_largest_free_order = i;
+ if (grp->bb_largest_free_order >= 0 && grp->bb_free) {
write_lock(&sbi->s_mb_largest_free_orders_locks[
grp->bb_largest_free_order]);
list_add_tail(&grp->bb_largest_free_order_node,
--
2.35.3
From: Chuck Lever <chuck.lever(a)oracle.com>
commit f11ad7aa653130b71e2e89bed207f387718216d5 upstream.
RFC 8881 explains the purpose of the write verifier this way:
> The final portion of the result is the field writeverf. This field
> is the write verifier and is a cookie that the client can use to
> determine whether a server has changed instance state (e.g., server
> restart) between a call to WRITE and a subsequent call to either
> WRITE or COMMIT.
But then it says:
> This cookie MUST be unchanged during a single instance of the
> NFSv4.1 server and MUST be unique between instances of the NFSv4.1
> server. If the cookie changes, then the client MUST assume that
> any data written with an UNSTABLE4 value for committed and an old
> writeverf in the reply has been lost and will need to be
> recovered.
RFC 1813 has similar language for NFSv3. NFSv2 does not have a write
verifier since it doesn't implement the COMMIT procedure.
Since commit 19e0663ff9bc ("nfsd: Ensure sampling of the write
verifier is atomic with the write"), the Linux NFS server has
returned a boot-time-based verifier for UNSTABLE WRITEs, but a zero
verifier for FILE_SYNC and DATA_SYNC WRITEs. FILE_SYNC and DATA_SYNC
WRITEs are not followed up with a COMMIT, so there's no need for
clients to compare verifiers for stable writes.
However, by returning a different verifier for stable and unstable
writes, the above commit puts the Linux NFS server a step farther
out of compliance with the first MUST above. At least one NFS client
(FreeBSD) noticed the difference, making this a potential
regression.
Reported-by: Rick Macklem <rmacklem(a)uoguelph.ca>
Link: https://lore.kernel.org/linux-nfs/YQXPR0101MB096857EEACF04A6DF1FC6D9BDD749@…
Fixes: 19e0663ff9bc ("nfsd: Ensure sampling of the write verifier is atomic with the write")
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Signed-off-by: Michael Kochera <kochera(a)google.com>
---
Removed down_write to fix the conflict in the cherry-pick. The down_write functionality was no longer needed there. Upstream commit 555dbf1a9aac6d3150c8b52fa35f768a692f4eeb titled nfsd: Replace use of rwsem with errseq_t removed those and replace it with new functionality that was more scalable. This commit is already backported onto 5.10 and so removing down_write ensures consistency with that change. Tested by compiling and booting successfully.
fs/nfsd/vfs.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c852bb5ff212..a4ae1fcd2ab1 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1014,6 +1014,10 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
iov_iter_kvec(&iter, WRITE, vec, vlen, *cnt);
since = READ_ONCE(file->f_wb_err);
if (flags & RWF_SYNC) {
+ if (verf)
+ nfsd_copy_boot_verifier(verf,
+ net_generic(SVC_NET(rqstp),
+ nfsd_net_id));
host_err = vfs_iter_write(file, &iter, &pos, flags);
if (host_err < 0)
nfsd_reset_boot_verifier(net_generic(SVC_NET(rqstp),
--
2.37.2.789.g6183377224-goog
--
I am Stefano Pessina, an Italian business tycoon, investor, and
philanthropist. the vice chairman, chief executive officer (CEO), and
the single largest shareholder of Walgreens Boots Alliance. I gave
away 25 percent of my personal wealth to charity. And I also pledged
to give away the rest of 25% this year 2022 to Individuals.. I have
decided to donate $2,200,000.00 to you. If you are interested in my
donation, do contact me for more details.......
One of the side-effects of mb_optimize_scan was that the optimized
functions to select next group to try were called even before we tried
the goal group. As a result we no longer allocate files close to
corresponding inodes as well as we don't try to expand currently
allocated extent in the same group. This results in reaim regression
with workfile.disk workload of upto 8% with many clients on my test
machine:
baseline mb_optimize_scan
Hmean disk-1 2114.16 ( 0.00%) 2099.37 ( -0.70%)
Hmean disk-41 87794.43 ( 0.00%) 83787.47 * -4.56%*
Hmean disk-81 148170.73 ( 0.00%) 135527.05 * -8.53%*
Hmean disk-121 177506.11 ( 0.00%) 166284.93 * -6.32%*
Hmean disk-161 220951.51 ( 0.00%) 207563.39 * -6.06%*
Hmean disk-201 208722.74 ( 0.00%) 203235.59 ( -2.63%)
Hmean disk-241 222051.60 ( 0.00%) 217705.51 ( -1.96%)
Hmean disk-281 252244.17 ( 0.00%) 241132.72 * -4.41%*
Hmean disk-321 255844.84 ( 0.00%) 245412.84 * -4.08%*
Also this is causing huge regression (time increased by a factor of 5 or
so) when untarring archive with lots of small files on some eMMC storage
cards.
Fix the problem by making sure we try goal group first.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Reported-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Link: https://lore.kernel.org/all/20220727105123.ckwrhbilzrxqpt24@quack3/
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/mballoc.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bd8f8b5c3d30..41e1cfecac3b 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1049,8 +1049,10 @@ static void ext4_mb_choose_next_group(struct ext4_allocation_context *ac,
{
*new_cr = ac->ac_criteria;
- if (!should_optimize_scan(ac) || ac->ac_groups_linear_remaining)
+ if (!should_optimize_scan(ac) || ac->ac_groups_linear_remaining) {
+ *group = next_linear_group(ac, *group, ngroups);
return;
+ }
if (*new_cr == 0) {
ext4_mb_choose_next_group_cr0(ac, new_cr, group, ngroups);
@@ -2636,7 +2638,7 @@ static noinline_for_stack int
ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
{
ext4_group_t prefetch_grp = 0, ngroups, group, i;
- int cr = -1;
+ int cr = -1, new_cr;
int err = 0, first_err = 0;
unsigned int nr = 0, prefetch_ios = 0;
struct ext4_sb_info *sbi;
@@ -2711,13 +2713,11 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
ac->ac_groups_linear_remaining = sbi->s_mb_max_linear_groups;
prefetch_grp = group;
- for (i = 0; i < ngroups; group = next_linear_group(ac, group, ngroups),
- i++) {
- int ret = 0, new_cr;
+ for (i = 0, new_cr = cr; i < ngroups; i++,
+ ext4_mb_choose_next_group(ac, &new_cr, &group, ngroups)) {
+ int ret = 0;
cond_resched();
-
- ext4_mb_choose_next_group(ac, &new_cr, &group, ngroups);
if (new_cr != cr) {
cr = new_cr;
goto repeat;
--
2.35.3
From: Yipeng Zou <zouyipeng(a)huawei.com>
Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed
in CALLER_ADDR0.
The function trace_hardirqs_on_caller should have been intended to use
caller_addr to represent the address that caller wants to be traced.
For example, lockdep log in riscv showing the last {enabled,disabled} at
__trace_hardirqs_{on,off} all the time(if called by):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After use trace_hardirqs_xx_caller, we can get more effective information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_preemptirq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
--
2.35.1
Chevron Corporation is currently recruiting foreign applicants on
various job positions available kindly apply by sending your CV/résumé
to chevron.uk(a)europe.com for more details.
HR Management
This is a note to let you know that I've just added the patch titled
serial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 1d10cd4da593bc0196a239dcc54dac24b6b0a74e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen(a)linux.intel.com>
Date: Thu, 1 Sep 2022 17:39:34 +0300
Subject: serial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx
accounting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Tx'ing does not correctly account Tx'ed characters into icount.tx.
Using uart_xmit_advance() fixes the problem.
Fixes: 2d908b38d409 ("serial: Add Tegra Combined UART driver")
Cc: <stable(a)vger.kernel.org> # serial: Create uart_xmit_advance()
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Link: https://lore.kernel.org/r/20220901143934.8850-4-ilpo.jarvinen@linux.intel.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/tegra-tcu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/tegra-tcu.c b/drivers/tty/serial/tegra-tcu.c
index 4877c54c613d..889b701ba7c6 100644
--- a/drivers/tty/serial/tegra-tcu.c
+++ b/drivers/tty/serial/tegra-tcu.c
@@ -101,7 +101,7 @@ static void tegra_tcu_uart_start_tx(struct uart_port *port)
break;
tegra_tcu_write(tcu, &xmit->buf[xmit->tail], count);
- xmit->tail = (xmit->tail + count) & (UART_XMIT_SIZE - 1);
+ uart_xmit_advance(port, count);
}
uart_write_wakeup(port);
--
2.37.3
This is a note to let you know that I've just added the patch titled
serial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 754f68044c7dd6c52534ba3e0f664830285c4b15 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen(a)linux.intel.com>
Date: Thu, 1 Sep 2022 17:39:33 +0300
Subject: serial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
DMA complete & stop paths did not correctly account Tx'ed characters
into icount.tx. Using uart_xmit_advance() fixes the problem.
Fixes: e9ea096dd225 ("serial: tegra: add serial driver")
Cc: <stable(a)vger.kernel.org> # serial: Create uart_xmit_advance()
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Link: https://lore.kernel.org/r/20220901143934.8850-3-ilpo.jarvinen@linux.intel.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial-tegra.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
index ad4f3567ff90..a5748e41483b 100644
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -525,7 +525,7 @@ static void tegra_uart_tx_dma_complete(void *args)
count = tup->tx_bytes_requested - state.residue;
async_tx_ack(tup->tx_dma_desc);
spin_lock_irqsave(&tup->uport.lock, flags);
- xmit->tail = (xmit->tail + count) & (UART_XMIT_SIZE - 1);
+ uart_xmit_advance(&tup->uport, count);
tup->tx_in_progress = 0;
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
uart_write_wakeup(&tup->uport);
@@ -613,7 +613,6 @@ static unsigned int tegra_uart_tx_empty(struct uart_port *u)
static void tegra_uart_stop_tx(struct uart_port *u)
{
struct tegra_uart_port *tup = to_tegra_uport(u);
- struct circ_buf *xmit = &tup->uport.state->xmit;
struct dma_tx_state state;
unsigned int count;
@@ -624,7 +623,7 @@ static void tegra_uart_stop_tx(struct uart_port *u)
dmaengine_tx_status(tup->tx_dma_chan, tup->tx_cookie, &state);
count = tup->tx_bytes_requested - state.residue;
async_tx_ack(tup->tx_dma_desc);
- xmit->tail = (xmit->tail + count) & (UART_XMIT_SIZE - 1);
+ uart_xmit_advance(&tup->uport, count);
tup->tx_in_progress = 0;
}
--
2.37.3
ATTENTION
BUSINESS PARTNER,
I AM LUMAR CASEY WORKING WITH AN INSURANCE FINANCIAL INSTITUTE, WITH
MY POSITION AND PRIVILEGES I WAS ABLE TO SOURCE OUT AN OVER DUE
PAYMENT OF 12.8 MILLION POUNDS THAT IS NOW SECURED WITH A SHIPPING
DIPLOMATIC OUTLET.
I AM SEEKING YOUR PARTNERSHIP TO RECEIVE THIS CONSIGNMENT AS AS MY
PARTNER TO INVEST THIS FUND INTO A PROSPEROUS INVESTMENT VENTURE IN
YOUR COUNTRY.
I AWAIT YOUR REPLY TO ENABLE US PROCEED WITH THIS BUSINESS PARTNERSHIP TOGETHER.
REGARDS,
LUMAR CASEY
Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache
coherence") requires IOMMU drivers to advertise
IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does
not provide to userspace the ability to maintain coherency through cache
invalidations, it requires hardware coherency. Advertise the capability
in order to restore VFIO support.
The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can
enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported".
While virtio-iommu cannot enforce coherency (of PCIe no-snoop
transactions), it does support IOMMU_CACHE.
We can distinguish different cases of non-coherent DMA:
(1) When accesses from a hardware endpoint are not coherent. The host
would describe such a device using firmware methods ('dma-coherent'
in device-tree, '_CCA' in ACPI), since they are also needed without
a vIOMMU. In this case mappings are created without IOMMU_CACHE.
virtio-iommu doesn't need any additional support. It sends the same
requests as for coherent devices.
(2) When the physical IOMMU supports non-cacheable mappings. Supporting
those would require a new feature in virtio-iommu, new PROBE request
property and MAP flags. Device drivers would use a new API to
discover this since it depends on the architecture and the physical
IOMMU.
(3) When the hardware supports PCIe no-snoop. It is possible for
assigned PCIe devices to issue no-snoop transactions, and the
virtio-iommu specification is lacking any mention of this.
Arm platforms don't necessarily support no-snoop, and those that do
cannot enforce coherency of no-snoop transactions. Device drivers
must be careful about assuming that no-snoop transactions won't end
up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA
optimization for ARM and arm64"). On x86 platforms, the host may or
may not enforce coherency of no-snoop transactions with the physical
IOMMU. But according to the above commit, on x86 a driver which
assumes that no-snoop DMA is compatible with uncached CPU mappings
will also work if the host enforces coherency.
Although these issues are not specific to virtio-iommu, it could be
used to facilitate discovery and configuration of no-snoop. This
would require a new feature bit, PROBE property and ATTACH/MAP
flags.
Cc: stable(a)vger.kernel.org
Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Jean-Philippe Brucker <jean-philippe(a)linaro.org>
---
Since v2 [1], I tried to refine the commit message.
This fix is needed for v5.19 and v6.0.
I can improve the check once Robin's change [2] is merged:
capable(IOMMU_CAP_CACHE_COHERENCY) could return dev->dma_coherent for
case (1) above.
[1] https://lore.kernel.org/linux-iommu/20220818163801.1011548-1-jean-philippe@…
[2] https://lore.kernel.org/linux-iommu/d8bd8777d06929ad8f49df7fc80e1b9af32a41b…
---
drivers/iommu/virtio-iommu.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 08eeafc9529f..80151176ba12 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -1006,7 +1006,18 @@ static int viommu_of_xlate(struct device *dev, struct of_phandle_args *args)
return iommu_fwspec_add_ids(dev, args->args, 1);
}
+static bool viommu_capable(enum iommu_cap cap)
+{
+ switch (cap) {
+ case IOMMU_CAP_CACHE_COHERENCY:
+ return true;
+ default:
+ return false;
+ }
+}
+
static struct iommu_ops viommu_ops = {
+ .capable = viommu_capable,
.domain_alloc = viommu_domain_alloc,
.probe_device = viommu_probe_device,
.probe_finalize = viommu_probe_finalize,
--
2.37.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 25e9fbf0fd38868a429feabc38abebfc6dbf6542 Mon Sep 17 00:00:00 2001
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Date: Wed, 17 Aug 2022 11:40:26 -0700
Subject: [PATCH] driver core: Don't probe devices after bus_type.match() probe
deferral
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index a8916d1bfdcb..ec69b43f926a 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -911,6 +911,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
dev_dbg(dev, "Device match requests probe deferral\n");
dev->can_match = true;
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d\n", ret);
return ret;
@@ -1150,6 +1155,11 @@ static int __driver_attach(struct device *dev, void *data)
dev_dbg(dev, "Device match requests probe deferral\n");
dev->can_match = true;
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d\n", ret);
return ret;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe8f65b018effbf473f53af3538d0c1878b8b329 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Wed, 31 Aug 2022 16:58:24 +0000
Subject: [PATCH] xen-blkfront: Cache feature_persistent value before
advertisement
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Xen blkfront advertises its support of the persistent grants feature
when it first setting up and when resuming in 'talk_to_blkback()'.
Then, blkback reads the advertised value when it connects with blkfront
and decides if it will use the persistent grants feature or not, and
advertises its decision to blkfront. Blkfront reads the blkback's
decision and it also makes the decision for the use of the feature.
Commit 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter
when connect"), however, made the blkfront's read of the parameter for
disabling the advertisement, namely 'feature_persistent', to be done
when it negotiate, not when advertise. Therefore blkfront advertises
without reading the parameter. As the field for caching the parameter
value is zero-initialized, it always advertises as the feature is
disabled, so that the persistent grants feature becomes always disabled.
This commit fixes the issue by making the blkfront does parmeter caching
just before the advertisement.
Fixes: 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect")
Cc: <stable(a)vger.kernel.org> # 5.10.x
Reported-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Tested-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220831165824.94815-4-sj@kernel.org
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 276b2ee2a155..1f85750f981e 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1759,6 +1759,12 @@ static int write_per_ring_nodes(struct xenbus_transaction xbt,
return err;
}
+/* Enable the persistent grants feature. */
+static bool feature_persistent = true;
+module_param(feature_persistent, bool, 0644);
+MODULE_PARM_DESC(feature_persistent,
+ "Enables the persistent grants feature");
+
/* Common code used when first setting up, and when resuming. */
static int talk_to_blkback(struct xenbus_device *dev,
struct blkfront_info *info)
@@ -1850,6 +1856,7 @@ static int talk_to_blkback(struct xenbus_device *dev,
message = "writing protocol";
goto abort_transaction;
}
+ info->feature_persistent_parm = feature_persistent;
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
info->feature_persistent_parm);
if (err)
@@ -1919,12 +1926,6 @@ static int negotiate_mq(struct blkfront_info *info)
return 0;
}
-/* Enable the persistent grants feature. */
-static bool feature_persistent = true;
-module_param(feature_persistent, bool, 0644);
-MODULE_PARM_DESC(feature_persistent,
- "Enables the persistent grants feature");
-
/*
* Entry point to this code when a new device is created. Allocate the basic
* structures and the ring buffer for communication with the backend, and
@@ -2284,7 +2285,6 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
blkfront_setup_discard(info);
- info->feature_persistent_parm = feature_persistent;
if (info->feature_persistent_parm)
info->feature_persistent =
!!xenbus_read_unsigned(info->xbdev->otherend,
HDP flush is used early in the init sequence as part of memory controller
block initialization. Hence remapping of HDP registers needed for flush
needs to happen earlier.
This also fixes the Unsupported Request error reported through AER during
driver load. The error happens as a write happens to the remap offset
before real remapping is done.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Reported-by: Tom Seewald <tseewald(a)gmail.com>
Signed-off-by: Lijo Lazar <lijo.lazar(a)amd.com>
Cc: stable(a)vger.kernel.org
---
v2:
Take care of IP resume cases (Alex Deucher)
Add NULL check to nbio.funcs to cover older (GFXv8) ASICs (Felix Kuehling)
Add more details in commit message and associate with AER patch (Bjorn
Helgaas)
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/nv.c | 6 ------
drivers/gpu/drm/amd/amdgpu/soc15.c | 6 ------
drivers/gpu/drm/amd/amdgpu/soc21.c | 6 ------
4 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ce7d117efdb5..e420118769a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2334,6 +2334,26 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
return 0;
}
+/**
+ * amdgpu_device_prepare_ip - prepare IPs for hardware initialization
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Any common hardware initialization sequence that needs to be done before
+ * hw init of individual IPs is performed here. This is different from the
+ * 'common block' which initializes a set of IPs.
+ */
+static void amdgpu_device_prepare_ip(struct amdgpu_device *adev)
+{
+ /* Remap HDP registers to a hole in mmio space, for the purpose
+ * of exposing those registers to process space. This needs to be
+ * done before hw init of ip blocks to take care of HDP flush
+ * operations through registers during hw_init.
+ */
+ if (adev->nbio.funcs && adev->nbio.funcs->remap_hdp_registers &&
+ !amdgpu_sriov_vf(adev))
+ adev->nbio.funcs->remap_hdp_registers(adev);
+}
/**
* amdgpu_device_ip_init - run init for hardware IPs
@@ -2376,6 +2396,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r);
goto init_failed;
}
+
+ amdgpu_device_prepare_ip(adev);
r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
if (r) {
DRM_ERROR("hw_init %d failed %d\n", i, r);
@@ -3058,6 +3080,7 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
AMD_IP_BLOCK_TYPE_IH,
};
+ amdgpu_device_prepare_ip(adev);
for (i = 0; i < adev->num_ip_blocks; i++) {
int j;
struct amdgpu_ip_block *block;
@@ -3139,6 +3162,7 @@ static int amdgpu_device_ip_resume_phase1(struct amdgpu_device *adev)
{
int i, r;
+ amdgpu_device_prepare_ip(adev);
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.valid || adev->ip_blocks[i].status.hw)
continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
nv_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
- /* remap HDP registers to a hole in mmio space,
- * for the purpose of expose those registers
- * to process space
- */
- if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
- adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
nv_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
soc15_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
- /* remap HDP registers to a hole in mmio space,
- * for the purpose of expose those registers
- * to process space
- */
- if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
- adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 55284b24f113..16b447055102 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
soc21_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
- /* remap HDP registers to a hole in mmio space,
- * for the purpose of expose those registers
- * to process space
- */
- if (adev->nbio.funcs->remap_hdp_registers)
- adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
soc21_enable_doorbell_aperture(adev, true);
--
2.25.1
Good Day,
I've viewed your profile on Linkedin regarding a proposal that has something in common with you, kindly reply back for more details on my private email: js3522029(a)gmail.com
Thanks,
Jan Smith,
js3522029(a)gmail.com