hi,
new version of pahole (1.24) is causing compilation fail for 5.15
stable kernel, discussed in here [1][2]. Sending fix for that plus
one dependency patch.
Note for patch 1:
there was one extra line change in scripts/pahole-flags.sh file in
its linux tree merge commit:
fc02cb2b37fe Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
not sure how/if to track that, I squashed the change in.
thanks,
jirka
[1] https://lore.kernel.org/bpf/20220825163538.vajnsv3xcpbhl47v@altlinux.org/
[2] https://lore.kernel.org/bpf/YwQRKkmWqsf%2FDu6A@kernel.org/
---
Jiri Olsa (1):
kbuild: Unify options for BTF generation for vmlinux and modules
Martin Rodriguez Reboredo (1):
kbuild: Add skip_encoding_btf_enum64 option to pahole
Makefile | 3 +++
scripts/Makefile.modfinal | 2 +-
scripts/link-vmlinux.sh | 11 +----------
scripts/pahole-flags.sh | 24 ++++++++++++++++++++++++
4 files changed, 29 insertions(+), 11 deletions(-)
create mode 100755 scripts/pahole-flags.sh
A recent change in LLVM made CONFIG_EFI_STUB unselectable because it no
longer pretends to support '-mabi=ms', breaking the dependency in
Kconfig. Lack of CONFIG_EFI_STUB can prevent kernels from booting via
EFI in certain circumstances.
This check was added by commit 8f24f8c2fc82 ("efi/libstub: Annotate
firmware routines as __efiapi") to ensure that '__attribute__((ms_abi))'
was available, as '-mabi=ms' is not actually used in any cflags.
According to the GCC documentation, this attribute has been supported
since GCC 4.4.7. The kernel currently requires GCC 5.1 so this check is
not necessary; even when that change landed in 5.6, the kernel required
GCC 4.9 so it was unnecessary then as well. Clang supports
'__attribute__((ms_abi))' for all versions that are supported for
building the kernel so no additional check is needed. Remove the
'depends on' line altogether to allow CONFIG_EFI_STUB to be selected
when CONFIG_EFI is enabled, regardless of compiler.
Cc: stable(a)vger.kernel.org
Fixes: 8f24f8c2fc82 ("efi/libstub: Annotate firmware routines as __efiapi")
Link: https://github.com/ClangBuiltLinux/linux/issues/1725
Link: https://gcc.gnu.org/onlinedocs/gcc-4.4.7/gcc/Function-Attributes.html
Link: https://github.com/llvm/llvm-project/commit/d1ad006a8f64bdc17f618deffa9e7c9…
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/x86/Kconfig | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f9920f1341c8..81012154d9ed 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1956,7 +1956,6 @@ config EFI
config EFI_STUB
bool "EFI stub support"
depends on EFI
- depends on $(cc-option,-mabi=ms) || X86_32
select RELOCATABLE
help
This kernel feature allows a bzImage to be loaded directly
base-commit: f76349cf41451c5c42a99f18a9163377e4b364ff
--
2.37.3
If an error is detected as a result of user-space process accessing a
corrupt memory location, the CPU may take an abort. Then the platform
firmware reports kernel via NMI like notifications, e.g. NOTIFY_SEA,
NOTIFY_SOFTWARE_DELEGATED, etc.
For NMI like notifications, commit 7f17b4a121d0 ("ACPI: APEI: Kick the
memory_failure() queue for synchronous errors") keep track of whether
memory_failure() work was queued, and make task_work pending to flush out
the queue so that the work is processed before return to user-space.
The code use init_mm to check whether the error occurs in user space:
if (current->mm != &init_mm)
The condition is always true, becase _nobody_ ever has "init_mm" as a real
VM any more.
In addition to abort, errors can also be signaled as asynchronous
exceptions, such as interrupt and SError. In such case, the interrupted
current process could be any kind of thread. When a kernel thread is
interrupted, the work ghes_kick_task_work deferred to task_work will never
be processed because entry_handler returns to call ret_to_kernel() instead
of ret_to_user(). Consequently, the estatus_node alloced from
ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed.
After around 200 allocations in our platform, the ghes_estatus_pool will
run of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a
result, the event failed to be processed.
sdei: event 805 on CPU 113 failed with error: -2
Finally, a lot of unhandled events may cause platform firmware to exceed
some threshold and reboot.
The condition should generally just do
if (current->mm)
as described in active_mm.rst documentation.
Then if an asynchronous error is detected when a kernel thread is running,
(e.g. when detected by a background scrubber), do not add task_work to it
as the original patch intends to do.
Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
---
changes since v1:
- add description the side effect and give more details
drivers/acpi/apei/ghes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d91ad378c00d..80ad530583c9 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -985,7 +985,7 @@ static void ghes_proc_in_irq(struct irq_work *irq_work)
ghes_estatus_cache_add(generic, estatus);
}
- if (task_work_pending && current->mm != &init_mm) {
+ if (task_work_pending && current->mm) {
estatus_node->task_work.func = ghes_kick_task_work;
estatus_node->task_work_cpu = smp_processor_id();
ret = task_work_add(current, &estatus_node->task_work,
--
2.20.1.12.g72788fdb
Commit c462ac288f2c ("mm: Introduce arch_validate_flags()") added a late
check in mmap_region() to let architectures validate vm_flags. The check
needs to happen after calling ->mmap() as the flags can potentially be
modified during this callback.
If arch_validate_flags() check fails we unmap and free the vma. However,
the error path fails to undo the ->mmap() call that previously succeeded
and depending on the specific ->mmap() implementation this translates to
reference increments, memory allocations and other operations what will
not be cleaned up.
There are several places (mainly device drivers) where this is an issue.
However, one specific example is bpf_map_mmap() which keeps count of the
mappings in map->writecnt. The count is incremented on ->mmap() and then
decremented on vm_ops->close(). When arch_validate_flags() fails this
count is off since bpf_map_mmap_close() is never called.
One can reproduce this issue in arm64 devices with MTE support. Here the
vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
previously. From userspace then is enough to pass the PROT_MTE flag to
mmap() syscall to trigger the arch_validate_flags() failure.
The following program reproduces this issue:
---
#include <stdio.h>
#include <unistd.h>
#include <linux/unistd.h>
#include <linux/bpf.h>
#include <sys/mman.h>
int main(void)
{
union bpf_attr attr = {
.map_type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(long long),
.max_entries = 256,
.map_flags = BPF_F_MMAPABLE,
};
int fd;
fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);
return 0;
}
---
By manually adding some log statements to the vm_ops callbacks we can
confirm that when passing PROT_MTE to mmap() the map->writecnt is off
upon ->release():
With PROT_MTE flag:
root@debian:~# ./bpf-test
[ 111.263874] bpf_map_write_active_inc: map=9 writecnt=1
[ 111.288763] bpf_map_release: map=9 writecnt=1
Without PROT_MTE flag:
root@debian:~# ./bpf-test
[ 157.816912] bpf_map_write_active_inc: map=10 writecnt=1
[ 157.830442] bpf_map_write_active_dec: map=10 writecnt=0
[ 157.832396] bpf_map_release: map=10 writecnt=0
This patch fixes the above issue by calling vm_ops->close() when the
arch_validate_flags() check fails, after this we can proceed to unmap
and free the vma on the error path.
Fixes: c462ac288f2c ("mm: Introduce arch_validate_flags()")
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org> # v5.10+
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
mm/mmap.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 9d780f415be3..36c08e2c78da 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (!arch_validate_flags(vma->vm_flags)) {
error = -EINVAL;
if (file)
- goto unmap_and_free_vma;
+ goto close_and_free_vma;
else
goto free_vma;
}
@@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
return addr;
+close_and_free_vma:
+ if (vma->vm_ops && vma->vm_ops->close)
+ vma->vm_ops->close(vma);
unmap_and_free_vma:
fput(vma->vm_file);
vma->vm_file = NULL;
--
2.38.0.rc1.362.ged0d419d3c-goog
From: Gou Hao <gouhao(a)uniontech.com>
patch1: is memory leak of audit rule
patch2~3: is memory leak about 'fsname' field of struct ima_rule_entry
Tyler Hicks (3):
ima: Have the LSM free its audit rule
ima: Free the entire rule when deleting a list of rules
ima: Free the entire rule if it fails to parse
security/integrity/ima/ima.h | 5 +++++
security/integrity/ima/ima_policy.c | 24 ++++++++++++++++++------
2 files changed, 23 insertions(+), 6 deletions(-)
--
2.20.1
Commit c7b79a752871 ("mfd: intel-lpss: Add Intel Alder Lake PCH-S PCI
IDs") caused a regression on certain Gigabyte motherboards for Intel
Alder Lake-S where system crashes to NULL pointer dereference in
i2c_dw_xfer_msg() when system resumes from S3 sleep state ("deep").
I was able to debug the issue on Gigabyte Z690 AORUS ELITE and made
following notes:
- Issue happens when resuming from S3 but not when resuming from
"s2idle"
- PCI device 00:15.0 == i2c_designware.0 is already in D0 state when
system enters into pci_pm_resume_noirq() while all other i2c_designware
PCI devices are in D3. Devices were runtime suspended and in D3 prior
entering into suspend
- Interrupt comes after pci_pm_resume_noirq() when device interrupts are
re-enabled
- According to register dump the interrupt really comes from the
i2c_designware.0. Controller is enabled, I2C target address register
points to a one detectable I2C device address 0x60 and the
DW_IC_RAW_INTR_STAT register START_DET, STOP_DET, ACTIVITY and
TX_EMPTY bits are set indicating completed I2C transaction.
My guess is that the firmware uses this controller to communicate with
an on-board I2C device during resume but does not disable the controller
before giving control to an operating system.
I was told the UEFI update fixes this but never the less it revealed the
driver is not ready to handle TX_EMPTY (or RX_FULL) interrupt when device
is supposed to be idle and state variables are not set (especially the
dev->msgs pointer which may point to NULL or stale old data).
Introduce a new software status flag STATUS_ACTIVE indicating when the
controller is active in driver point of view. Now treat all interrupts
that occur when is not set as unexpected and mask all interrupts from
the controller.
Fixes: c7b79a752871 ("mfd: intel-lpss: Add Intel Alder Lake PCH-S PCI IDs")
Reported-by: Samuel Clark <slc2015(a)gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215907
Cc: stable(a)vger.kernel.org # v5.12+
Signed-off-by: Jarkko Nikula <jarkko.nikula(a)linux.intel.com>
---
Hans: Are you able to test this on your Baytrail collection with shared
I2C controller and PUNIT so that patch doesn't break anything? I believe
even if the Linux interrupt for such shared I2C controller is shared e.g.
with the i2c-i801 the PUNIT access should not get affected by this
interrupt masking. I think PUNIT access won't use interrupts but you
never know... We have a MRD7 that has shared I2C controller with PUNIT
but Linux interrupt is not shared. I.e. unexpected interrupt case and
masking doesn't get hit.
i2c-designware-slave.c is not fully in sync with this new status flag since
STATUS_ACTIVE gets overwritten there and unexpected interrupt case is
needlessly hit in case of shared interrupt and this controller is
suspended but both of those can go to an another patchset.
---
drivers/i2c/busses/i2c-designware-core.h | 7 +++++--
drivers/i2c/busses/i2c-designware-master.c | 13 +++++++++++++
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h
index 70b80e710990..4d3a3b464ecd 100644
--- a/drivers/i2c/busses/i2c-designware-core.h
+++ b/drivers/i2c/busses/i2c-designware-core.h
@@ -126,8 +126,9 @@
* status codes
*/
#define STATUS_IDLE 0x0
-#define STATUS_WRITE_IN_PROGRESS 0x1
-#define STATUS_READ_IN_PROGRESS 0x2
+#define STATUS_ACTIVE 0x1
+#define STATUS_WRITE_IN_PROGRESS 0x2
+#define STATUS_READ_IN_PROGRESS 0x4
/*
* operation modes
@@ -334,12 +335,14 @@ void i2c_dw_disable_int(struct dw_i2c_dev *dev);
static inline void __i2c_dw_enable(struct dw_i2c_dev *dev)
{
+ dev->status |= STATUS_ACTIVE;
regmap_write(dev->map, DW_IC_ENABLE, 1);
}
static inline void __i2c_dw_disable_nowait(struct dw_i2c_dev *dev)
{
regmap_write(dev->map, DW_IC_ENABLE, 0);
+ dev->status &= ~STATUS_ACTIVE;
}
void __i2c_dw_disable(struct dw_i2c_dev *dev);
diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c
index 44a94b225ed8..dc3c5a15a95b 100644
--- a/drivers/i2c/busses/i2c-designware-master.c
+++ b/drivers/i2c/busses/i2c-designware-master.c
@@ -716,6 +716,19 @@ static int i2c_dw_irq_handler_master(struct dw_i2c_dev *dev)
u32 stat;
stat = i2c_dw_read_clear_intrbits(dev);
+
+ if (!(dev->status & STATUS_ACTIVE)) {
+ /*
+ * Unexpected interrupt in driver point of view. State
+ * variables are either unset or stale so acknowledge and
+ * disable interrupts for suppressing further interrupts if
+ * interrupt really came from this HW (E.g. firmware has left
+ * the HW active).
+ */
+ regmap_write(dev->map, DW_IC_INTR_MASK, 0);
+ return 0;
+ }
+
if (stat & DW_IC_INTR_TX_ABRT) {
dev->cmd_err |= DW_IC_ERR_TX_ABRT;
dev->status = STATUS_IDLE;
--
2.35.1
This reverts dbf8e63f48af ("f2fs: remove device type check for direct IO"),
and apply the below first version, since it contributed out-of-order DIO writes.
For zoned devices, f2fs forbids direct IO and forces buffered IO
to serialize write IOs. However, the constraint does not apply to
read IOs.
Cc: stable(a)vger.kernel.org
Fixes: dbf8e63f48af ("f2fs: remove device type check for direct IO")
Signed-off-by: Eunhee Rho <eunhee83.rho(a)samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
---
fs/f2fs/f2fs.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 2ed00111a399..a0b2c8626a75 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4535,7 +4535,12 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
/* disallow direct IO if any of devices has unaligned blksize */
if (f2fs_is_multi_device(sbi) && !sbi->aligned_blksize)
return true;
-
+ /*
+ * for blkzoned device, fallback direct IO to buffered IO, so
+ * all IOs can be serialized by log-structured write.
+ */
+ if (f2fs_sb_has_blkzoned(sbi) && (rw == WRITE))
+ return true;
if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
if (block_unaligned_IO(inode, iocb, iter))
return true;
--
2.38.0.rc1.362.ged0d419d3c-goog