This is version 4 of the MSI interrupts for ntb_transport patchset.
It's mostly a resend of v3 but with spelling and grammar fixes found by Bjorn.
I've addressed the feedback so far and rebased on the latest kernel
and would like this to be considered for merging this cycle.
The only outstanding issue I know of is that it still will not work
with IDT hardware, but ntb_transport doesn't work with IDT hardware
and there is still no sensible common infrastructure to support
ntb_peer_mw_set_trans(). Thus, I decline to consider that complication
in this patchset. However, I'll be happy to review work that adds this
feature in the future.
Also, as the port number and resource index stuff is a bit complicated,
I made a quick out of tree test fixture to ensure it's correct[1]. As
an excerise I also wrote some test code[2] using the upcomming KUnit
feature.
Logan
[1] https://repl.it/repls/ExcitingPresentFile
[2] https://github.com/sbates130272/linux-p2pmem/commits/ntb_kunit
--
Changes in v4:
* Rebased onto v5.1-rc6 (No changes)
* Numerous grammar and spelling mistakes spotted by Bjorn
--
Changes in v3:
* Rebased onto v5.1-rc1 (Dropped the first two patches as they have
been merged, and cleaned up some minor conflicts in the PCI tree)
* Added a new patch (#3) to calculate logical port numbers that
are port numbers from 0 to (number of ports - 1). This is
then used in ntb_peer_resource_idx() to fix the issues brought
up by Serge.
* Fixed missing __iomem and iowrite calls (as noticed by Serge)
* Added patch 10 which describes ntb_msi_test in the documentation
file (as requested by Serge)
* A couple other minor nits and documentation fixes
--
Changes in v2:
* Cleaned up the changes in intel_irq_remapping.c to make them
less confusing and add a comment. (Per discussion with Jacob and
Joerg)
* Fixed a nit from Bjorn and collected his Ack
* Added a Kconfig dependancy on CONFIG_PCI_MSI for CONFIG_NTB_MSI
as the Kbuild robot hit a random config that didn't build
without it.
* Worked in a callback for when the MSI descriptor changes so that
the clients can resend the new address and data values to the peer.
On my test system this was never necessary, but there may be
other platforms where this can occur. I tested this by hacking
in a path to rewrite the MSI descriptor when I change the cpu
affinity of an IRQ. There's a bit of uncertainty over the latency
of the change, but without hardware this can acctually occur on
we can't test this. This was the result of a discussion with Dave.
--
This patch series adds optional support for using MSI interrupts instead
of NTB doorbells in ntb_transport. This is desirable seeing doorbells on
current hardware are quite slow and therefore switching to MSI interrupts
provides a significant performance gain. On switchtec hardware, a simple
apples-to-apples comparison shows ntb_netdev/iperf numbers going from
3.88Gb/s to 14.1Gb/s when switching to MSI interrupts.
To do this, a couple changes are required outside of the NTB tree:
1) The IOMMU must know to accept MSI requests from aliased bused numbers
seeing NTB hardware typically sends proxied request IDs through
additional requester IDs. The first patch in this series adds support
for the Intel IOMMU. A quirk to add these aliases for switchtec hardware
was already accepted. See commit ad281ecf1c7d ("PCI: Add DMA alias quirk
for Microsemi Switchtec NTB") for a description of NTB proxy IDs and why
this is necessary.
2) NTB transport (and other clients) may often need more MSI interrupts
than the NTB hardware actually advertises support for. However, seeing
these interrupts will not be triggered by the hardware but through an
NTB memory window, the hardware does not actually need support or need
to know about them. Therefore we add the concept of Virtual MSI
interrupts which are allocated just like any other MSI interrupt but
are not programmed into the hardware's MSI table. This is done in
Patch 2 and then made use of in Patch 3.
The remaining patches in this series add a library for dealing with MSI
interrupts, a test client and finally support in ntb_transport.
The series is based off of v5.1-rc6 plus the patches in ntb-next.
A git repo is available here:
https://github.com/sbates130272/linux-p2pmem/ ntb_transport_msi_v4
Thanks,
Logan
--
Logan Gunthorpe (10):
PCI/MSI: Support allocating virtual MSI interrupts
PCI/switchtec: Add module parameter to request more interrupts
NTB: Introduce helper functions to calculate logical port number
NTB: Introduce functions to calculate multi-port resource index
NTB: Rename ntb.c to support multiple source files in the module
NTB: Introduce MSI library
NTB: Introduce NTB MSI Test Client
NTB: Add ntb_msi_test support to ntb_test
NTB: Add MSI interrupt support to ntb_transport
NTB: Describe the ntb_msi_test client in the documentation.
Documentation/ntb.txt | 27 ++
drivers/ntb/Kconfig | 11 +
drivers/ntb/Makefile | 3 +
drivers/ntb/{ntb.c => core.c} | 0
drivers/ntb/msi.c | 415 +++++++++++++++++++++++
drivers/ntb/ntb_transport.c | 169 ++++++++-
drivers/ntb/test/Kconfig | 9 +
drivers/ntb/test/Makefile | 1 +
drivers/ntb/test/ntb_msi_test.c | 433 ++++++++++++++++++++++++
drivers/pci/msi.c | 54 ++-
drivers/pci/switch/switchtec.c | 12 +-
include/linux/msi.h | 8 +
include/linux/ntb.h | 196 ++++++++++-
include/linux/pci.h | 9 +
tools/testing/selftests/ntb/ntb_test.sh | 54 ++-
15 files changed, 1386 insertions(+), 15 deletions(-)
rename drivers/ntb/{ntb.c => core.c} (100%)
create mode 100644 drivers/ntb/msi.c
create mode 100644 drivers/ntb/test/ntb_msi_test.c
--
2.20.1
From: Sean Christopherson <sean.j.christopherson(a)intel.com>
[ Upstream commit 0f73bbc851ed32d22bbd86be09e0365c460bcd2e ]
Documentation/virtual/kvm/api.txt states:
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
KVM_EXIT_EPR the corresponding operations are complete (and guest
state is consistent) only after userspace has re-entered the
kernel with KVM_RUN. The kernel side will first finish incomplete
operations and then check for pending signals. Userspace can
re-enter the guest with an unmasked signal pending to complete
pending operations.
Because guest state may be inconsistent, starting state migration after
an IO exit without first completing IO may result in test failures, e.g.
a proposed change to KVM's handling of %rip in its fast PIO handling[1]
will cause the new VM, i.e. the post-migration VM, to have its %rip set
to the IN instruction that triggered KVM_EXIT_IO, leading to a test
assertion due to a stage mismatch.
For simplicitly, require KVM_CAP_IMMEDIATE_EXIT to complete IO and skip
the test if it's not available. The addition of KVM_CAP_IMMEDIATE_EXIT
predates the state selftest by more than a year.
[1] https://patchwork.kernel.org/patch/10848545/
Fixes: fa3899add1056 ("kvm: selftests: add basic test for state save and restore")
Reported-by: Jim Mattson <jmattson(a)google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/include/kvm_util.h | 1 +
tools/testing/selftests/kvm/lib/kvm_util.c | 16 ++++++++++++++++
.../testing/selftests/kvm/x86_64/state_test.c | 18 ++++++++++++++++--
3 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index a84785b02557..07b71ad9734a 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -102,6 +102,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva);
struct kvm_run *vcpu_state(struct kvm_vm *vm, uint32_t vcpuid);
void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
+void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid);
void vcpu_set_mp_state(struct kvm_vm *vm, uint32_t vcpuid,
struct kvm_mp_state *mp_state);
void vcpu_regs_get(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index b52cfdefecbf..efa0aad8b3c6 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1121,6 +1121,22 @@ int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid)
return rc;
}
+void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid)
+{
+ struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+ int ret;
+
+ TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
+
+ vcpu->state->immediate_exit = 1;
+ ret = ioctl(vcpu->fd, KVM_RUN, NULL);
+ vcpu->state->immediate_exit = 0;
+
+ TEST_ASSERT(ret == -1 && errno == EINTR,
+ "KVM_RUN IOCTL didn't exit immediately, rc: %i, errno: %i",
+ ret, errno);
+}
+
/*
* VM VCPU Set MP State
*
diff --git a/tools/testing/selftests/kvm/x86_64/state_test.c b/tools/testing/selftests/kvm/x86_64/state_test.c
index 4b3f556265f1..30f75856cf39 100644
--- a/tools/testing/selftests/kvm/x86_64/state_test.c
+++ b/tools/testing/selftests/kvm/x86_64/state_test.c
@@ -134,6 +134,11 @@ int main(int argc, char *argv[])
struct kvm_cpuid_entry2 *entry = kvm_get_supported_cpuid_entry(1);
+ if (!kvm_check_cap(KVM_CAP_IMMEDIATE_EXIT)) {
+ fprintf(stderr, "immediate_exit not available, skipping test\n");
+ exit(KSFT_SKIP);
+ }
+
/* Create VM */
vm = vm_create_default(VCPU_ID, 0, guest_code);
vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
@@ -156,8 +161,6 @@ int main(int argc, char *argv[])
stage, run->exit_reason,
exit_reason_str(run->exit_reason));
- memset(®s1, 0, sizeof(regs1));
- vcpu_regs_get(vm, VCPU_ID, ®s1);
switch (get_ucall(vm, VCPU_ID, &uc)) {
case UCALL_ABORT:
TEST_ASSERT(false, "%s at %s:%d", (const char *)uc.args[0],
@@ -176,6 +179,17 @@ int main(int argc, char *argv[])
uc.args[1] == stage, "Unexpected register values vmexit #%lx, got %lx",
stage, (ulong)uc.args[1]);
+ /*
+ * When KVM exits to userspace with KVM_EXIT_IO, KVM guarantees
+ * guest state is consistent only after userspace re-enters the
+ * kernel with KVM_RUN. Complete IO prior to migrating state
+ * to a new VM.
+ */
+ vcpu_run_complete_io(vm, VCPU_ID);
+
+ memset(®s1, 0, sizeof(regs1));
+ vcpu_regs_get(vm, VCPU_ID, ®s1);
+
state = vcpu_save_state(vm, VCPU_ID);
kvm_vm_release(vm);
--
2.19.1
From: Sean Christopherson <sean.j.christopherson(a)intel.com>
[ Upstream commit ffac839d040619847217647434b2b02469926871 ]
Since 4.8.3, gcc has enabled -fstack-protector by default. This is
problematic for the KVM selftests as they do not configure fs or gs
segments (the stack canary is pulled from fs:0x28). With the default
behavior, gcc will insert a stack canary on any function that creates
buffers of 8 bytes or more. As a result, ucall() will hit a triple
fault shutdown due to reading a bad fs segment when inserting its
stack canary, i.e. every test fails with an unexpected SHUTDOWN.
Fixes: 14c47b7530e2d ("kvm: selftests: introduce ucall")
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 212b8f0032ae..cb4a992d6dd3 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -28,8 +28,8 @@ LIBKVM += $(LIBKVM_$(UNAME_M))
INSTALL_HDR_PATH = $(top_srcdir)/usr
LINUX_HDR_PATH = $(INSTALL_HDR_PATH)/include/
LINUX_TOOL_INCLUDE = $(top_srcdir)/tools/include
-CFLAGS += -O2 -g -std=gnu99 -no-pie -I$(LINUX_TOOL_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude -I$(<D) -Iinclude/$(UNAME_M) -I..
-LDFLAGS += -pthread
+CFLAGS += -O2 -g -std=gnu99 -fno-stack-protector -fno-PIE -I$(LINUX_TOOL_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude -I$(<D) -Iinclude/$(UNAME_M) -I..
+LDFLAGS += -pthread -no-pie
# After inclusion, $(OUTPUT) is defined and
# $(TEST_GEN_PROGS) starts with $(OUTPUT)/
--
2.19.1
From: Sean Christopherson <sean.j.christopherson(a)intel.com>
[ Upstream commit 0a3f29b5a77d6c27796d7a7adabafd199dc066d5 ]
KVM selftests embed the guest "image" as a function in the test itself
and extract the guest code at runtime by manually parsing the elf
headers. The parsing is very simple and doesn't supporting fancy things
like position independent executables. Recent versions of gcc enable
pie by default, which results in triple fault shutdowns in the guest due
to the virtual address in the headers not matching up with the virtual
address retrieved from the function pointer.
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index f9a0e9938480..212b8f0032ae 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -28,7 +28,7 @@ LIBKVM += $(LIBKVM_$(UNAME_M))
INSTALL_HDR_PATH = $(top_srcdir)/usr
LINUX_HDR_PATH = $(INSTALL_HDR_PATH)/include/
LINUX_TOOL_INCLUDE = $(top_srcdir)/tools/include
-CFLAGS += -O2 -g -std=gnu99 -I$(LINUX_TOOL_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude -I$(<D) -Iinclude/$(UNAME_M) -I..
+CFLAGS += -O2 -g -std=gnu99 -no-pie -I$(LINUX_TOOL_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude -I$(<D) -Iinclude/$(UNAME_M) -I..
LDFLAGS += -pthread
# After inclusion, $(OUTPUT) is defined and
--
2.19.1
Rename the blockdev documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.
The drbd sub-directory contains some graphs and data flows.
Add those too to the documentation.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
---
.../admin-guide/kernel-parameters.txt | 18 +-
...structure-v9.txt => data-structure-v9.rst} | 6 +-
Documentation/blockdev/drbd/figures.rst | 28 +++
.../blockdev/drbd/{README.txt => index.rst} | 15 +-
.../blockdev/{floppy.txt => floppy.rst} | 88 ++++----
Documentation/blockdev/index.rst | 16 ++
Documentation/blockdev/{nbd.txt => nbd.rst} | 1 +
.../blockdev/{paride.txt => paride.rst} | 144 +++++++------
.../blockdev/{ramdisk.txt => ramdisk.rst} | 55 ++---
Documentation/blockdev/{zram.txt => zram.rst} | 195 ++++++++++++------
MAINTAINERS | 8 +-
drivers/block/Kconfig | 8 +-
drivers/block/floppy.c | 2 +-
drivers/block/zram/Kconfig | 6 +-
tools/testing/selftests/zram/README | 2 +-
15 files changed, 374 insertions(+), 218 deletions(-)
rename Documentation/blockdev/drbd/{data-structure-v9.txt => data-structure-v9.rst} (94%)
create mode 100644 Documentation/blockdev/drbd/figures.rst
rename Documentation/blockdev/drbd/{README.txt => index.rst} (55%)
rename Documentation/blockdev/{floppy.txt => floppy.rst} (81%)
create mode 100644 Documentation/blockdev/index.rst
rename Documentation/blockdev/{nbd.txt => nbd.rst} (96%)
rename Documentation/blockdev/{paride.txt => paride.rst} (85%)
rename Documentation/blockdev/{ramdisk.txt => ramdisk.rst} (84%)
rename Documentation/blockdev/{zram.txt => zram.rst} (76%)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65d66010b134..a6297aff5598 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1248,7 +1248,7 @@
See also Documentation/fault-injection/.
floppy= [HW]
- See Documentation/blockdev/floppy.txt.
+ See Documentation/blockdev/floppy.rst.
force_pal_cache_flush
[IA-64] Avoid check_sal_cache_flush which may hang on
@@ -2227,7 +2227,7 @@
memblock=debug [KNL] Enable memblock debug messages.
load_ramdisk= [RAM] List of ramdisks to load from floppy
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/blockdev/ramdisk.rst.
lockd.nlm_grace_period=P [NFS] Assign grace period.
Format: <integer>
@@ -3197,7 +3197,7 @@
pcd. [PARIDE]
See header of drivers/block/paride/pcd.c.
- See also Documentation/blockdev/paride.txt.
+ See also Documentation/blockdev/paride.rst.
pci=option[,option...] [PCI] various PCI subsystem options.
@@ -3439,7 +3439,7 @@
needed on a platform with proper driver support.
pd. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/blockdev/paride.rst.
pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at
boot time.
@@ -3454,10 +3454,10 @@
and performance comparison.
pf. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/blockdev/paride.rst.
pg. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/blockdev/paride.rst.
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/x86/i386/IO-APIC.txt.
@@ -3569,7 +3569,7 @@
prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
before loading.
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/blockdev/ramdisk.rst.
psi= [KNL] Enable or disable pressure stall information
tracking.
@@ -3591,7 +3591,7 @@
pstore.backend= Specify the name of the pstore backend to use
pt. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/blockdev/paride.rst.
pti= [X86_64] Control Page Table Isolation of user and
kernel address spaces. Disabling this feature
@@ -3620,7 +3620,7 @@
See Documentation/admin-guide/md.rst.
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/blockdev/ramdisk.rst.
random.trust_cpu={on,off}
[KNL] Enable or disable trusting the use of the
diff --git a/Documentation/blockdev/drbd/data-structure-v9.txt b/Documentation/blockdev/drbd/data-structure-v9.rst
similarity index 94%
rename from Documentation/blockdev/drbd/data-structure-v9.txt
rename to Documentation/blockdev/drbd/data-structure-v9.rst
index 1e52a0e32624..66036b901644 100644
--- a/Documentation/blockdev/drbd/data-structure-v9.txt
+++ b/Documentation/blockdev/drbd/data-structure-v9.rst
@@ -1,3 +1,7 @@
+================================
+kernel data structure for DRBD-9
+================================
+
This describes the in kernel data structure for DRBD-9. Starting with
Linux v3.14 we are reorganizing DRBD to use this data structure.
@@ -10,7 +14,7 @@ device is represented by a block device locally.
The DRBD objects are interconnected to form a matrix as depicted below; a
drbd_peer_device object sits at each intersection between a drbd_device and a
-drbd_connection:
+drbd_connection::
/--------------+---------------+.....+---------------\
| resource | device | | device |
diff --git a/Documentation/blockdev/drbd/figures.rst b/Documentation/blockdev/drbd/figures.rst
new file mode 100644
index 000000000000..3e3fd4b8a478
--- /dev/null
+++ b/Documentation/blockdev/drbd/figures.rst
@@ -0,0 +1,28 @@
+.. The here included files are intended to help understand the implementation
+
+Data flows that Relate some functions, and write packets
+========================================================
+
+.. kernel-figure:: DRBD-8.3-data-packets.svg
+ :alt: DRBD-8.3-data-packets.svg
+ :align: center
+
+.. kernel-figure:: DRBD-data-packets.svg
+ :alt: DRBD-data-packets.svg
+ :align: center
+
+
+Sub graphs of DRBD's state transitions
+======================================
+
+.. kernel-figure:: conn-states-8.dot
+ :alt: conn-states-8.dot
+ :align: center
+
+.. kernel-figure:: disk-states-8.dot
+ :alt: disk-states-8.dot
+ :align: center
+
+.. kernel-figure:: node-states-8.dot
+ :alt: node-states-8.dot
+ :align: center
diff --git a/Documentation/blockdev/drbd/README.txt b/Documentation/blockdev/drbd/index.rst
similarity index 55%
rename from Documentation/blockdev/drbd/README.txt
rename to Documentation/blockdev/drbd/index.rst
index 627b0a1bf35e..68ecd5c113e9 100644
--- a/Documentation/blockdev/drbd/README.txt
+++ b/Documentation/blockdev/drbd/index.rst
@@ -1,4 +1,9 @@
+==========================================
+Distributed Replicated Block Device - DRBD
+==========================================
+
Description
+===========
DRBD is a shared-nothing, synchronously replicated block device. It
is designed to serve as a building block for high availability
@@ -7,10 +12,8 @@ Description
Please visit http://www.drbd.org to find out more.
-The here included files are intended to help understand the implementation
+.. toctree::
+ :maxdepth: 1
-DRBD-8.3-data-packets.svg, DRBD-data-packets.svg
- relates some functions, and write packets.
-
-conn-states-8.dot, disk-states-8.dot, node-states-8.dot
- The sub graphs of DRBD's state transitions
+ data-structure-v9
+ figures
diff --git a/Documentation/blockdev/floppy.txt b/Documentation/blockdev/floppy.rst
similarity index 81%
rename from Documentation/blockdev/floppy.txt
rename to Documentation/blockdev/floppy.rst
index e2240f5ab64d..4a8f31cf4139 100644
--- a/Documentation/blockdev/floppy.txt
+++ b/Documentation/blockdev/floppy.rst
@@ -1,35 +1,37 @@
-This file describes the floppy driver.
+=============
+Floppy Driver
+=============
FAQ list:
=========
- A FAQ list may be found in the fdutils package (see below), and also
+A FAQ list may be found in the fdutils package (see below), and also
at <http://fdutils.linux.lu/faq.html>.
LILO configuration options (Thinkpad users, read this)
======================================================
- The floppy driver is configured using the 'floppy=' option in
+The floppy driver is configured using the 'floppy=' option in
lilo. This option can be typed at the boot prompt, or entered in the
lilo configuration file.
- Example: If your kernel is called linux-2.6.9, type the following line
-at the lilo boot prompt (if you have a thinkpad):
+Example: If your kernel is called linux-2.6.9, type the following line
+at the lilo boot prompt (if you have a thinkpad)::
linux-2.6.9 floppy=thinkpad
You may also enter the following line in /etc/lilo.conf, in the description
-of linux-2.6.9:
+of linux-2.6.9::
append = "floppy=thinkpad"
- Several floppy related options may be given, example:
+Several floppy related options may be given, example::
linux-2.6.9 floppy=daring floppy=two_fdc
append = "floppy=daring floppy=two_fdc"
- If you give options both in the lilo config file and on the boot
+If you give options both in the lilo config file and on the boot
prompt, the option strings of both places are concatenated, the boot
prompt options coming last. That's why there are also options to
restore the default behavior.
@@ -38,21 +40,23 @@ restore the default behavior.
Module configuration options
============================
- If you use the floppy driver as a module, use the following syntax:
-modprobe floppy floppy="<options>"
+If you use the floppy driver as a module, use the following syntax::
-Example:
- modprobe floppy floppy="omnibook messages"
+ modprobe floppy floppy="<options>"
- If you need certain options enabled every time you load the floppy driver,
-you can put:
+Example::
- options floppy floppy="omnibook messages"
+ modprobe floppy floppy="omnibook messages"
+
+If you need certain options enabled every time you load the floppy driver,
+you can put::
+
+ options floppy floppy="omnibook messages"
in a configuration file in /etc/modprobe.d/.
- The floppy driver related options are:
+The floppy driver related options are:
floppy=asus_pci
Sets the bit mask to allow only units 0 and 1. (default)
@@ -70,8 +74,7 @@ in a configuration file in /etc/modprobe.d/.
Tells the floppy driver that you have only one floppy controller.
(default)
- floppy=two_fdc
- floppy=<address>,two_fdc
+ floppy=two_fdc / floppy=<address>,two_fdc
Tells the floppy driver that you have two floppy controllers.
The second floppy controller is assumed to be at <address>.
This option is not needed if the second controller is at address
@@ -84,8 +87,7 @@ in a configuration file in /etc/modprobe.d/.
floppy=0,thinkpad
Tells the floppy driver that you don't have a Thinkpad.
- floppy=omnibook
- floppy=nodma
+ floppy=omnibook / floppy=nodma
Tells the floppy driver not to use Dma for data transfers.
This is needed on HP Omnibooks, which don't have a workable
DMA channel for the floppy driver. This option is also useful
@@ -144,14 +146,16 @@ in a configuration file in /etc/modprobe.d/.
described in the physical CMOS), or if your BIOS uses
non-standard CMOS types. The CMOS types are:
- 0 - Use the value of the physical CMOS
- 1 - 5 1/4 DD
- 2 - 5 1/4 HD
- 3 - 3 1/2 DD
- 4 - 3 1/2 HD
- 5 - 3 1/2 ED
- 6 - 3 1/2 ED
- 16 - unknown or not installed
+ == ==================================
+ 0 Use the value of the physical CMOS
+ 1 5 1/4 DD
+ 2 5 1/4 HD
+ 3 3 1/2 DD
+ 4 3 1/2 HD
+ 5 3 1/2 ED
+ 6 3 1/2 ED
+ 16 unknown or not installed
+ == ==================================
(Note: there are two valid types for ED drives. This is because 5 was
initially chosen to represent floppy *tapes*, and 6 for ED drives.
@@ -162,8 +166,7 @@ in a configuration file in /etc/modprobe.d/.
Print a warning message when an unexpected interrupt is received.
(default)
- floppy=no_unexpected_interrupts
- floppy=L40SX
+ floppy=no_unexpected_interrupts / floppy=L40SX
Don't print a message when an unexpected interrupt is received. This
is needed on IBM L40SX laptops in certain video modes. (There seems
to be an interaction between video and floppy. The unexpected
@@ -199,47 +202,54 @@ in a configuration file in /etc/modprobe.d/.
Sets the floppy DMA channel to <nr> instead of 2.
floppy=slow
- Use PS/2 stepping rate:
- " PS/2 floppies have much slower step rates than regular floppies.
+ Use PS/2 stepping rate::
+
+ PS/2 floppies have much slower step rates than regular floppies.
It's been recommended that take about 1/4 of the default speed
- in some more extreme cases."
+ in some more extreme cases.
Supporting utilities and additional documentation:
==================================================
- Additional parameters of the floppy driver can be configured at
+Additional parameters of the floppy driver can be configured at
runtime. Utilities which do this can be found in the fdutils package.
This package also contains a new version of mtools which allows to
access high capacity disks (up to 1992K on a high density 3 1/2 disk!).
It also contains additional documentation about the floppy driver.
The latest version can be found at fdutils homepage:
+
http://fdutils.linux.lu
The fdutils releases can be found at:
+
http://fdutils.linux.lu/download.html
+
http://www.tux.org/pub/knaff/fdutils/
+
ftp://metalab.unc.edu/pub/Linux/utils/disk-management/
Reporting problems about the floppy driver
==========================================
- If you have a question or a bug report about the floppy driver, mail
+If you have a question or a bug report about the floppy driver, mail
me at Alain.Knaff(a)poboxes.com . If you post to Usenet, preferably use
comp.os.linux.hardware. As the volume in these groups is rather high,
be sure to include the word "floppy" (or "FLOPPY") in the subject
line. If the reported problem happens when mounting floppy disks, be
sure to mention also the type of the filesystem in the subject line.
- Be sure to read the FAQ before mailing/posting any bug reports!
+Be sure to read the FAQ before mailing/posting any bug reports!
- Alain
+Alain
Changelog
=========
-10-30-2004 : Cleanup, updating, add reference to module configuration.
+10-30-2004 :
+ Cleanup, updating, add reference to module configuration.
James Nelson <james4765(a)gmail.com>
-6-3-2000 : Original Document
+6-3-2000 :
+ Original Document
diff --git a/Documentation/blockdev/index.rst b/Documentation/blockdev/index.rst
new file mode 100644
index 000000000000..a9af6ed8b4aa
--- /dev/null
+++ b/Documentation/blockdev/index.rst
@@ -0,0 +1,16 @@
+:orphan:
+
+===========================
+The Linux RapidIO Subsystem
+===========================
+
+.. toctree::
+ :maxdepth: 1
+
+ floppy
+ nbd
+ paride
+ ramdisk
+ zram
+
+ drbd/index
diff --git a/Documentation/blockdev/nbd.txt b/Documentation/blockdev/nbd.rst
similarity index 96%
rename from Documentation/blockdev/nbd.txt
rename to Documentation/blockdev/nbd.rst
index db242ea2bce8..db0c96e46661 100644
--- a/Documentation/blockdev/nbd.txt
+++ b/Documentation/blockdev/nbd.rst
@@ -1,3 +1,4 @@
+==================================
Network Block Device (TCP version)
==================================
diff --git a/Documentation/blockdev/paride.txt b/Documentation/blockdev/paride.rst
similarity index 85%
rename from Documentation/blockdev/paride.txt
rename to Documentation/blockdev/paride.rst
index ee6717e3771d..b7fdd77513ab 100644
--- a/Documentation/blockdev/paride.txt
+++ b/Documentation/blockdev/paride.rst
@@ -1,9 +1,11 @@
-
- Linux and parallel port IDE devices
+===================================
+Linux and parallel port IDE devices
+===================================
PARIDE v1.03 (c) 1997-8 Grant Guenther <grant(a)torque.net>
1. Introduction
+===============
Owing to the simplicity and near universality of the parallel port interface
to personal computers, many external devices such as portable hard-disk,
@@ -35,17 +37,17 @@ devices. It does not cover parallel port SCSI devices, "ditto" tape
drives or scanners. Many different devices are supported by the
parallel port IDE subsystem, including:
- MicroSolutions backpack CD-ROM
- MicroSolutions backpack PD/CD
- MicroSolutions backpack hard-drives
- MicroSolutions backpack 8000t tape drive
- SyQuest EZ-135, EZ-230 & SparQ drives
- Avatar Shark
- Imation Superdisk LS-120
- Maxell Superdisk LS-120
- FreeCom Power CD
- Hewlett-Packard 5GB and 8GB tape drives
- Hewlett-Packard 7100 and 7200 CD-RW drives
+ - MicroSolutions backpack CD-ROM
+ - MicroSolutions backpack PD/CD
+ - MicroSolutions backpack hard-drives
+ - MicroSolutions backpack 8000t tape drive
+ - SyQuest EZ-135, EZ-230 & SparQ drives
+ - Avatar Shark
+ - Imation Superdisk LS-120
+ - Maxell Superdisk LS-120
+ - FreeCom Power CD
+ - Hewlett-Packard 5GB and 8GB tape drives
+ - Hewlett-Packard 7100 and 7200 CD-RW drives
as well as most of the clone and no-name products on the market.
@@ -55,11 +57,13 @@ paride module which provides a registry and some common methods for
accessing the parallel ports. The second component is a set of
high-level drivers for each of the different types of supported devices:
+ === =============
pd IDE disk
pcd ATAPI CD-ROM
pf ATAPI disk
pt ATAPI tape
pg ATAPI generic
+ === =============
(Currently, the pg driver is only used with CD-R drives).
@@ -69,6 +73,7 @@ for each of the parallel port IDE adapter chips. Thanks to the interest
and encouragement of Linux users from many parts of the world,
support is available for almost all known adapter protocols:
+ ==== ====================================== ====
aten ATEN EH-100 (HK)
bpck Microsolutions backpack (US)
comm DataStor (old-type) "commuter" adapter (TW)
@@ -83,9 +88,11 @@ support is available for almost all known adapter protocols:
ktti KT Technology PHd adapter (SG)
on20 OnSpec 90c20 (US)
on26 OnSpec 90c26 (US)
+ ==== ====================================== ====
2. Using the PARIDE subsystem
+=============================
While configuring the Linux kernel, you may choose either to build
the PARIDE drivers into your kernel, or to build them as modules.
@@ -105,8 +112,9 @@ subsystem to try them all for you.
For the "brand-name" products listed above, here are the protocol
and high-level drivers that you would use:
+ ================ ============ ====== ========
Manufacturer Model Driver Protocol
-
+ ================ ============ ====== ========
MicroSolutions CD-ROM pcd bpck
MicroSolutions PD drive pf bpck
MicroSolutions hard-drive pd bpck
@@ -119,8 +127,10 @@ and high-level drivers that you would use:
Hewlett-Packard 5GB Tape pt epat
Hewlett-Packard 7200e (CD) pcd epat
Hewlett-Packard 7200e (CD-R) pg epat
+ ================ ============ ====== ========
2.1 Configuring built-in drivers
+---------------------------------
We recommend that you get to know how the drivers work and how to
configure them as loadable modules, before attempting to compile a
@@ -143,7 +153,7 @@ protocol identification number and, for some devices, the drive's
chain ID. While your system is booting, a number of messages are
displayed on the console. Like all such messages, they can be
reviewed with the 'dmesg' command. Among those messages will be
-some lines like:
+some lines like::
paride: bpck registered as protocol 0
paride: epat registered as protocol 1
@@ -161,7 +171,7 @@ As an example, let's assume that you have a MicroSolutions PD/CD drive
with unit ID number 36 connected to the parallel port at 0x378, a SyQuest
EZ-135 connected to the chained port on the PD/CD drive and also an
Imation Superdisk connected to port 0x278. You could give the following
-options on your boot command:
+options on your boot command::
pd.drive0=0x378,1 pf.drive0=0x278,1 pf.drive1=0x378,0,36
@@ -175,18 +185,21 @@ PARPORT parallel port sharing system that is included by the
if you want to use chains of devices on the same parallel port.
2.2 Loading and configuring PARIDE as modules
+----------------------------------------------
It is much faster and simpler to get to understand the PARIDE drivers
if you use them as loadable kernel modules.
-Note 1: using these drivers with the "kerneld" automatic module loading
-system is not recommended for beginners, and is not documented here.
+Note 1:
+ using these drivers with the "kerneld" automatic module loading
+ system is not recommended for beginners, and is not documented here.
-Note 2: if you build PARPORT support as a loadable module, PARIDE must
-also be built as loadable modules, and PARPORT must be loaded before the
-PARIDE modules.
+Note 2:
+ if you build PARPORT support as a loadable module, PARIDE must
+ also be built as loadable modules, and PARPORT must be loaded before
+ the PARIDE modules.
-To use PARIDE, you must begin by
+To use PARIDE, you must begin by::
insmod paride
@@ -196,7 +209,7 @@ among other tasks.
Then, load as many of the protocol modules as you think you might need.
As you load each module, it will register the protocols that it supports,
and print a log message to your kernel log file and your console. For
-example:
+example::
# insmod epat
paride: epat registered as protocol 0
@@ -211,7 +224,7 @@ individual co-ordinates when you load the driver.
For example, if you had two no-name CD-ROM drives both using the
KingByte KBIC-951A adapter, one on port 0x378 and the other on 0x3bc
-you could give the following command:
+you could give the following command::
# insmod pcd drive0=0x378,1 drive1=0x3bc,1
@@ -220,7 +233,7 @@ but check the source files in linux/drivers/block/paride for more
information. (Hopefully someone will write some man pages one day !).
As another example, here's what happens when PARPORT is installed, and
-a SyQuest EZ-135 is attached to port 0x378:
+a SyQuest EZ-135 is attached to port 0x378::
# insmod paride
paride: version 1.0 installed
@@ -237,46 +250,47 @@ Note that the last line is the output from the generic partition table
scanner - in this case it reports that it has found a disk with one partition.
2.3 Using a PARIDE device
+--------------------------
Once the drivers have been loaded, you can access PARIDE devices in the
same way as their traditional counterparts. You will probably need to
create the device "special files". Here is a simple script that you can
-cut to a file and execute:
+cut to a file and execute::
-#!/bin/bash
-#
-# mkd -- a script to create the device special files for the PARIDE subsystem
-#
-function mkdev {
- mknod $1 $2 $3 $4 ; chmod 0660 $1 ; chown root:disk $1
-}
-#
-function pd {
- D=$( printf \\$( printf "x%03x" $[ $1 + 97 ] ) )
- mkdev pd$D b 45 $[ $1 * 16 ]
- for P in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
- do mkdev pd$D$P b 45 $[ $1 * 16 + $P ]
- done
-}
-#
-cd /dev
-#
-for u in 0 1 2 3 ; do pd $u ; done
-for u in 0 1 2 3 ; do mkdev pcd$u b 46 $u ; done
-for u in 0 1 2 3 ; do mkdev pf$u b 47 $u ; done
-for u in 0 1 2 3 ; do mkdev pt$u c 96 $u ; done
-for u in 0 1 2 3 ; do mkdev npt$u c 96 $[ $u + 128 ] ; done
-for u in 0 1 2 3 ; do mkdev pg$u c 97 $u ; done
-#
-# end of mkd
+ #!/bin/bash
+ #
+ # mkd -- a script to create the device special files for the PARIDE subsystem
+ #
+ function mkdev {
+ mknod $1 $2 $3 $4 ; chmod 0660 $1 ; chown root:disk $1
+ }
+ #
+ function pd {
+ D=$( printf \\$( printf "x%03x" $[ $1 + 97 ] ) )
+ mkdev pd$D b 45 $[ $1 * 16 ]
+ for P in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+ do mkdev pd$D$P b 45 $[ $1 * 16 + $P ]
+ done
+ }
+ #
+ cd /dev
+ #
+ for u in 0 1 2 3 ; do pd $u ; done
+ for u in 0 1 2 3 ; do mkdev pcd$u b 46 $u ; done
+ for u in 0 1 2 3 ; do mkdev pf$u b 47 $u ; done
+ for u in 0 1 2 3 ; do mkdev pt$u c 96 $u ; done
+ for u in 0 1 2 3 ; do mkdev npt$u c 96 $[ $u + 128 ] ; done
+ for u in 0 1 2 3 ; do mkdev pg$u c 97 $u ; done
+ #
+ # end of mkd
With the device files and drivers in place, you can access PARIDE devices
-like any other Linux device. For example, to mount a CD-ROM in pcd0, use:
+like any other Linux device. For example, to mount a CD-ROM in pcd0, use::
mount /dev/pcd0 /cdrom
If you have a fresh Avatar Shark cartridge, and the drive is pda, you
-might do something like:
+might do something like::
fdisk /dev/pda -- make a new partition table with
partition 1 of type 83
@@ -289,13 +303,14 @@ might do something like:
Devices like the Imation superdisk work in the same way, except that
they do not have a partition table. For example to make a 120MB
-floppy that you could share with a DOS system:
+floppy that you could share with a DOS system::
mkdosfs /dev/pf0
mount /dev/pf0 /mnt
2.4 The pf driver
+------------------
The pf driver is intended for use with parallel port ATAPI disk
devices. The most common devices in this category are PD drives
@@ -304,6 +319,7 @@ partitioned. Consequently, the pf driver does not support partitioned
media. This may be changed in a future version of the driver.
2.5 Using the pt driver
+------------------------
The pt driver for parallel port ATAPI tape drives is a minimal driver.
It does not yet support many of the standard tape ioctl operations.
@@ -311,6 +327,7 @@ For best performance, a block size of 32KB should be used. You will
probably want to set the parallel port delay to 0, if you can.
2.6 Using the pg driver
+------------------------
The pg driver can be used in conjunction with the cdrecord program
to create CD-ROMs. Please get cdrecord version 1.6.1 or later
@@ -322,8 +339,10 @@ in EPP mode, try to use "bidirectional" or "PS/2" mode and 1x speeds only.
3. Troubleshooting
+==================
3.1 Use EPP mode if you can
+----------------------------
The most common problems that people report with the PARIDE drivers
concern the parallel port CMOS settings. At this time, none of the
@@ -332,6 +351,7 @@ If you are able to do so, please set your parallel port into EPP mode
using your CMOS setup procedure.
3.2 Check the port delay
+-------------------------
Some parallel ports cannot reliably transfer data at full speed. To
offset the errors, the PARIDE protocol modules introduce a "port
@@ -347,6 +367,7 @@ read the comments at the beginning of the driver source files in
linux/drivers/block/paride.
3.3 Some drives need a printer reset
+-------------------------------------
There appear to be a number of "noname" external drives on the market
that do not always power up correctly. We have noticed this with some
@@ -354,7 +375,7 @@ drives based on OnSpec and older Freecom adapters. In these rare cases,
the adapter can often be reinitialised by issuing a "printer reset" on
the parallel port. As the reset operation is potentially disruptive in
multiple device environments, the PARIDE drivers will not do it
-automatically. You can however, force a printer reset by doing:
+automatically. You can however, force a printer reset by doing::
insmod lp reset=1
rmmod lp
@@ -364,6 +385,7 @@ your paride drivers as modules, and arrange to do the printer reset
before loading the PARIDE drivers.
3.4 Use the verbose option and dmesg if you need help
+------------------------------------------------------
While a lot of testing has gone into these drivers to make them work
as smoothly as possible, problems will arise. If you do have problems,
@@ -373,7 +395,7 @@ clues, then please make sure that only one drive is hooked to your system,
and that either (a) PARPORT is enabled or (b) no other device driver
is using your parallel port (check in /proc/ioports). Then, load the
appropriate drivers (you can load several protocol modules if you want)
-as in:
+as in::
# insmod paride
# insmod epat
@@ -394,12 +416,14 @@ by e-mail to grant(a)torque.net, or join the linux-parport mailing list
and post your report there.
3.5 For more information or help
+---------------------------------
You can join the linux-parport mailing list by sending a mail message
-to
+to:
+
linux-parport-request(a)torque.net
-with the single word
+with the single word::
subscribe
@@ -412,6 +436,6 @@ have in your mail headers, when sending mail to the list server.
You might also find some useful information on the linux-parport
web pages (although they are not always up to date) at
- http://web.archive.org/web/*/http://www.torque.net/parport/
+ http://web.archive.org/web/%2E/http://www.torque.net/parport/
diff --git a/Documentation/blockdev/ramdisk.txt b/Documentation/blockdev/ramdisk.rst
similarity index 84%
rename from Documentation/blockdev/ramdisk.txt
rename to Documentation/blockdev/ramdisk.rst
index 501e12e0323e..b7c2268f8dec 100644
--- a/Documentation/blockdev/ramdisk.txt
+++ b/Documentation/blockdev/ramdisk.rst
@@ -1,7 +1,8 @@
+==========================================
Using the RAM disk block device with Linux
-------------------------------------------
+==========================================
-Contents:
+.. Contents:
1) Overview
2) Kernel Command Line Parameters
@@ -42,7 +43,7 @@ rescue floppy disk.
2a) Kernel Command Line Parameters
ramdisk_size=N
- ==============
+ Size of the ramdisk.
This parameter tells the RAM disk driver to set up RAM disks of N k size. The
default is 4096 (4 MB).
@@ -50,16 +51,13 @@ default is 4096 (4 MB).
2b) Module parameters
rd_nr
- =====
- /dev/ramX devices created.
+ /dev/ramX devices created.
max_part
- ========
- Maximum partition number.
+ Maximum partition number.
rd_size
- =======
- See ramdisk_size.
+ See ramdisk_size.
3) Using "rdev -r"
------------------
@@ -71,11 +69,11 @@ to 2 MB (2^11) of where to find the RAM disk (this used to be the size). Bit
prompt/wait sequence is to be given before trying to read the RAM disk. Since
the RAM disk dynamically grows as data is being written into it, a size field
is not required. Bits 11 to 13 are not currently used and may as well be zero.
-These numbers are no magical secrets, as seen below:
+These numbers are no magical secrets, as seen below::
-./arch/x86/kernel/setup.c:#define RAMDISK_IMAGE_START_MASK 0x07FF
-./arch/x86/kernel/setup.c:#define RAMDISK_PROMPT_FLAG 0x8000
-./arch/x86/kernel/setup.c:#define RAMDISK_LOAD_FLAG 0x4000
+ ./arch/x86/kernel/setup.c:#define RAMDISK_IMAGE_START_MASK 0x07FF
+ ./arch/x86/kernel/setup.c:#define RAMDISK_PROMPT_FLAG 0x8000
+ ./arch/x86/kernel/setup.c:#define RAMDISK_LOAD_FLAG 0x4000
Consider a typical two floppy disk setup, where you will have the
kernel on disk one, and have already put a RAM disk image onto disk #2.
@@ -92,20 +90,23 @@ sequence so that you have a chance to switch floppy disks.
The command line equivalent is: "prompt_ramdisk=1"
Putting that together gives 2^15 + 2^14 + 0 = 49152 for an rdev word.
-So to create disk one of the set, you would do:
+So to create disk one of the set, you would do::
/usr/src/linux# cat arch/x86/boot/zImage > /dev/fd0
/usr/src/linux# rdev /dev/fd0 /dev/fd0
/usr/src/linux# rdev -r /dev/fd0 49152
-If you make a boot disk that has LILO, then for the above, you would use:
+If you make a boot disk that has LILO, then for the above, you would use::
+
append = "ramdisk_start=0 load_ramdisk=1 prompt_ramdisk=1"
-Since the default start = 0 and the default prompt = 1, you could use:
+
+Since the default start = 0 and the default prompt = 1, you could use::
+
append = "load_ramdisk=1"
4) An Example of Creating a Compressed RAM Disk
-----------------------------------------------
+-----------------------------------------------
To create a RAM disk image, you will need a spare block device to
construct it on. This can be the RAM disk device itself, or an
@@ -120,11 +121,11 @@ a) Decide on the RAM disk size that you want. Say 2 MB for this example.
Create it by writing to the RAM disk device. (This step is not currently
required, but may be in the future.) It is wise to zero out the
area (esp. for disks) so that maximal compression is achieved for
- the unused blocks of the image that you are about to create.
+ the unused blocks of the image that you are about to create::
dd if=/dev/zero of=/dev/ram0 bs=1k count=2048
-b) Make a filesystem on it. Say ext2fs for this example.
+b) Make a filesystem on it. Say ext2fs for this example::
mke2fs -vm0 /dev/ram0 2048
@@ -133,11 +134,11 @@ c) Mount it, copy the files you want to it (eg: /etc/* /dev/* ...)
d) Compress the contents of the RAM disk. The level of compression
will be approximately 50% of the space used by the files. Unused
- space on the RAM disk will compress to almost nothing.
+ space on the RAM disk will compress to almost nothing::
dd if=/dev/ram0 bs=1k count=2048 | gzip -v9 > /tmp/ram_image.gz
-e) Put the kernel onto the floppy
+e) Put the kernel onto the floppy::
dd if=zImage of=/dev/fd0 bs=1k
@@ -146,13 +147,13 @@ f) Put the RAM disk image onto the floppy, after the kernel. Use an offset
(possibly larger) kernel onto the same floppy later without overlapping
the RAM disk image. An offset of 400 kB for kernels about 350 kB in
size would be reasonable. Make sure offset+size of ram_image.gz is
- not larger than the total space on your floppy (usually 1440 kB).
+ not larger than the total space on your floppy (usually 1440 kB)::
dd if=/tmp/ram_image.gz of=/dev/fd0 bs=1k seek=400
g) Use "rdev" to set the boot device, RAM disk offset, prompt flag, etc.
For prompt_ramdisk=1, load_ramdisk=1, ramdisk_start=400, one would
- have 2^15 + 2^14 + 400 = 49552.
+ have 2^15 + 2^14 + 400 = 49552::
rdev /dev/fd0 /dev/fd0
rdev -r /dev/fd0 49552
@@ -160,15 +161,17 @@ g) Use "rdev" to set the boot device, RAM disk offset, prompt flag, etc.
That is it. You now have your boot/root compressed RAM disk floppy. Some
users may wish to combine steps (d) and (f) by using a pipe.
---------------------------------------------------------------------------
+
Paul Gortmaker 12/95
Changelog:
----------
-10-22-04 : Updated to reflect changes in command line options, remove
+10-22-04 :
+ Updated to reflect changes in command line options, remove
obsolete references, general cleanup.
James Nelson (james4765(a)gmail.com)
-12-95 : Original Document
+12-95 :
+ Original Document
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.rst
similarity index 76%
rename from Documentation/blockdev/zram.txt
rename to Documentation/blockdev/zram.rst
index 4df0ce271085..2111231c9c0f 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.rst
@@ -1,7 +1,9 @@
+========================================
zram: Compressed RAM based block devices
-----------------------------------------
+========================================
-* Introduction
+Introduction
+============
The zram module creates RAM based block devices named /dev/zram<id>
(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
@@ -12,9 +14,11 @@ use as swap disks, various caches under /var and maybe many more :)
Statistics for individual zram devices are exported through sysfs nodes at
/sys/block/zram<id>/
-* Usage
+Usage
+=====
There are several ways to configure and manage zram device(-s):
+
a) using zram and zram_control sysfs attributes
b) using zramctl utility, provided by util-linux (util-linux(a)vger.kernel.org).
@@ -22,7 +26,7 @@ In this document we will describe only 'manual' zram configuration steps,
IOW, zram and zram_control sysfs attributes.
In order to get a better idea about zramctl please consult util-linux
-documentation, zramctl man-page or `zramctl --help'. Please be informed
+documentation, zramctl man-page or `zramctl --help`. Please be informed
that zram maintainers do not develop/maintain util-linux or zramctl, should
you have any questions please contact util-linux(a)vger.kernel.org
@@ -30,19 +34,23 @@ Following shows a typical sequence of steps for using zram.
WARNING
=======
+
For the sake of simplicity we skip error checking parts in most of the
examples below. However, it is your sole responsibility to handle errors.
zram sysfs attributes always return negative values in case of errors.
The list of possible return codes:
--EBUSY -- an attempt to modify an attribute that cannot be changed once
-the device has been initialised. Please reset device first;
--ENOMEM -- zram was not able to allocate enough memory to fulfil your
-needs;
--EINVAL -- invalid input has been provided.
+
+======== =============================================================
+-EBUSY an attempt to modify an attribute that cannot be changed once
+ the device has been initialised. Please reset device first;
+-ENOMEM zram was not able to allocate enough memory to fulfil your
+ needs;
+-EINVAL invalid input has been provided.
+======== =============================================================
If you use 'echo', the returned value that is changed by 'echo' utility,
-and, in general case, something like:
+and, in general case, something like::
echo 3 > /sys/block/zram0/max_comp_streams
if [ $? -ne 0 ];
@@ -51,7 +59,11 @@ and, in general case, something like:
should suffice.
-1) Load Module:
+1) Load Module
+==============
+
+::
+
modprobe zram num_devices=4
This creates 4 devices: /dev/zram{0,1,2,3}
@@ -59,6 +71,8 @@ num_devices parameter is optional and tells zram how many devices should be
pre-created. Default: 1.
2) Set max number of compression streams
+========================================
+
Regardless the value passed to this attribute, ZRAM will always
allocate multiple compression streams - one per online CPUs - thus
allowing several concurrent compression operations. The number of
@@ -66,16 +80,20 @@ allocated compression streams goes down when some of the CPUs
become offline. There is no single-compression-stream mode anymore,
unless you are running a UP system or has only 1 CPU online.
-To find out how many streams are currently available:
+To find out how many streams are currently available::
+
cat /sys/block/zram0/max_comp_streams
3) Select compression algorithm
+===============================
+
Using comp_algorithm device attribute one can see available and
currently selected (shown in square brackets) compression algorithms,
change selected compression algorithm (once the device is initialised
there is no way to change compression algorithm).
-Examples:
+Examples::
+
#show supported compression algorithms
cat /sys/block/zram0/comp_algorithm
lzo [lz4]
@@ -83,20 +101,23 @@ Examples:
#select lzo compression algorithm
echo lzo > /sys/block/zram0/comp_algorithm
-For the time being, the `comp_algorithm' content does not necessarily
+For the time being, the `comp_algorithm` content does not necessarily
show every compression algorithm supported by the kernel. We keep this
list primarily to simplify device configuration and one can configure
a new device with a compression algorithm that is not listed in
-`comp_algorithm'. The thing is that, internally, ZRAM uses Crypto API
+`comp_algorithm`. The thing is that, internally, ZRAM uses Crypto API
and, if some of the algorithms were built as modules, it's impossible
to list all of them using, for instance, /proc/crypto or any other
method. This, however, has an advantage of permitting the usage of
custom crypto compression modules (implementing S/W or H/W compression).
4) Set Disksize
+===============
+
Set disk size by writing the value to sysfs node 'disksize'.
The value can be either in bytes or you can use mem suffixes.
-Examples:
+Examples::
+
# Initialize /dev/zram0 with 50MB disksize
echo $((50*1024*1024)) > /sys/block/zram0/disksize
@@ -111,10 +132,13 @@ since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
size of the disk when not in use so a huge zram is wasteful.
5) Set memory limit: Optional
+=============================
+
Set memory limit by writing the value to sysfs node 'mem_limit'.
The value can be either in bytes or you can use mem suffixes.
In addition, you could change the value in runtime.
-Examples:
+Examples::
+
# limit /dev/zram0 with 50MB memory
echo $((50*1024*1024)) > /sys/block/zram0/mem_limit
@@ -126,7 +150,11 @@ Examples:
# To disable memory limit
echo 0 > /sys/block/zram0/mem_limit
-6) Activate:
+6) Activate
+===========
+
+::
+
mkswap /dev/zram0
swapon /dev/zram0
@@ -134,6 +162,7 @@ Examples:
mount /dev/zram1 /tmp
7) Add/remove zram devices
+==========================
zram provides a control interface, which enables dynamic (on-demand) device
addition and removal.
@@ -142,37 +171,44 @@ In order to add a new /dev/zramX device, perform read operation on hot_add
attribute. This will return either new device's device id (meaning that you
can use /dev/zram<id>) or error code.
-Example:
+Example::
+
cat /sys/class/zram-control/hot_add
1
To remove the existing /dev/zramX device (where X is a device id)
-execute
+execute::
+
echo X > /sys/class/zram-control/hot_remove
-8) Stats:
+8) Stats
+========
+
Per-device statistics are exported as various nodes under /sys/block/zram<id>/
A brief description of exported device attributes. For more details please
read Documentation/ABI/testing/sysfs-block-zram.
+====================== ====== ===============================================
Name access description
----- ------ -----------
+====================== ====== ===============================================
disksize RW show and set the device's disk size
initstate RO shows the initialization state of the device
reset WO trigger device reset
-mem_used_max WO reset the `mem_used_max' counter (see later)
-mem_limit WO specifies the maximum amount of memory ZRAM can use
- to store the compressed data
-writeback_limit WO specifies the maximum amount of write IO zram can
- write out to backing device as 4KB unit
+mem_used_max WO reset the `mem_used_max` counter (see later)
+mem_limit WO specifies the maximum amount of memory ZRAM can
+ use to store the compressed data
+writeback_limit WO specifies the maximum amount of write IO zram
+ can write out to backing device as 4KB unit
writeback_limit_enable RW show and set writeback_limit feature
-max_comp_streams RW the number of possible concurrent compress operations
+max_comp_streams RW the number of possible concurrent compress
+ operations
comp_algorithm RW show and change the compression algorithm
compact WO trigger memory compaction
debug_stat RO this file is used for zram debugging purposes
backing_dev RW set up backend storage for zram to write out
idle WO mark allocated slot as idle
+====================== ====== ===============================================
User space is advised to use the following files to read the device statistics.
@@ -188,23 +224,31 @@ The stat file represents device's I/O statistics not accounted by block
layer and, thus, not available in zram<id>/stat file. It consists of a
single line of text and contains the following stats separated by
whitespace:
- failed_reads the number of failed reads
- failed_writes the number of failed writes
- invalid_io the number of non-page-size-aligned I/O requests
+
+ ============= =============================================================
+ failed_reads The number of failed reads
+ failed_writes The number of failed writes
+ invalid_io The number of non-page-size-aligned I/O requests
notify_free Depending on device usage scenario it may account
+
a) the number of pages freed because of swap slot free
- notifications or b) the number of pages freed because of
- REQ_OP_DISCARD requests sent by bio. The former ones are
- sent to a swap block device when a swap slot is freed,
- which implies that this disk is being used as a swap disk.
+ notifications
+ b) the number of pages freed because of
+ REQ_OP_DISCARD requests sent by bio. The former ones are
+ sent to a swap block device when a swap slot is freed,
+ which implies that this disk is being used as a swap disk.
+
The latter ones are sent by filesystem mounted with
discard option, whenever some data blocks are getting
discarded.
+ ============= =============================================================
File /sys/block/zram<id>/mm_stat
The stat file represents device's mm statistics. It consists of a single
line of text and contains the following stats separated by whitespace:
+
+ ================ =============================================================
orig_data_size uncompressed size of data stored in this disk.
This excludes same-element-filled pages (same_pages) since
no memory is allocated for them.
@@ -223,58 +267,71 @@ line of text and contains the following stats separated by whitespace:
No memory is allocated for such pages.
pages_compacted the number of pages freed during compaction
huge_pages the number of incompressible pages
+ ================ =============================================================
File /sys/block/zram<id>/bd_stat
The stat file represents device's backing device statistics. It consists of
a single line of text and contains the following stats separated by whitespace:
+
+ ============== =============================================================
bd_count size of data written in backing device.
Unit: 4K bytes
bd_reads the number of reads from backing device
Unit: 4K bytes
bd_writes the number of writes to backing device
Unit: 4K bytes
+ ============== =============================================================
+
+9) Deactivate
+=============
+
+::
-9) Deactivate:
swapoff /dev/zram0
umount /dev/zram1
-10) Reset:
- Write any positive value to 'reset' sysfs node
- echo 1 > /sys/block/zram0/reset
- echo 1 > /sys/block/zram1/reset
+10) Reset
+=========
+
+ Write any positive value to 'reset' sysfs node::
+
+ echo 1 > /sys/block/zram0/reset
+ echo 1 > /sys/block/zram1/reset
This frees all the memory allocated for the given device and
resets the disksize to zero. You must set the disksize again
before reusing the device.
-* Optional Feature
+Optional Feature
+================
-= writeback
+writeback
+---------
With CONFIG_ZRAM_WRITEBACK, zram can write idle/incompressible page
to backing storage rather than keeping it in memory.
-To use the feature, admin should set up backing device via
+To use the feature, admin should set up backing device via::
- "echo /dev/sda5 > /sys/block/zramX/backing_dev"
+ echo /dev/sda5 > /sys/block/zramX/backing_dev
before disksize setting. It supports only partition at this moment.
-If admin want to use incompressible page writeback, they could do via
+If admin want to use incompressible page writeback, they could do via::
- "echo huge > /sys/block/zramX/write"
+ echo huge > /sys/block/zramX/write
To use idle page writeback, first, user need to declare zram pages
-as idle.
+as idle::
- "echo all > /sys/block/zramX/idle"
+ echo all > /sys/block/zramX/idle
From now on, any pages on zram are idle pages. The idle mark
will be removed until someone request access of the block.
IOW, unless there is access request, those pages are still idle pages.
-Admin can request writeback of those idle pages at right timing via
+Admin can request writeback of those idle pages at right timing via::
- "echo idle > /sys/block/zramX/writeback"
+ echo idle > /sys/block/zramX/writeback
With the command, zram writeback idle pages from memory to the storage.
@@ -285,7 +342,7 @@ to guarantee storage health for entire product life.
To overcome the concern, zram supports "writeback_limit" feature.
The "writeback_limit_enable"'s default value is 0 so that it doesn't limit
any writeback. IOW, if admin want to apply writeback budget, he should
-enable writeback_limit_enable via
+enable writeback_limit_enable via::
$ echo 1 > /sys/block/zramX/writeback_limit_enable
@@ -296,7 +353,7 @@ until admin set the budget via /sys/block/zramX/writeback_limit.
assigned via /sys/block/zramX/writeback_limit is meaninless.)
If admin want to limit writeback as per-day 400M, he could do it
-like below.
+like below::
$ MB_SHIFT=20
$ 4K_SHIFT=12
@@ -305,16 +362,16 @@ like below.
$ echo 1 > /sys/block/zram0/writeback_limit_enable
If admin want to allow further write again once the bugdet is exausted,
-he could do it like below
+he could do it like below::
$ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \
/sys/block/zram0/writeback_limit
-If admin want to see remaining writeback budget since he set,
+If admin want to see remaining writeback budget since he set::
$ cat /sys/block/zramX/writeback_limit
-If admin want to disable writeback limit, he could do
+If admin want to disable writeback limit, he could do::
$ echo 0 > /sys/block/zramX/writeback_limit_enable
@@ -326,25 +383,35 @@ budget in next setting is user's job.
If admin want to measure writeback count in a certain period, he could
know it via /sys/block/zram0/bd_stat's 3rd column.
-= memory tracking
+memory tracking
+===============
With CONFIG_ZRAM_MEMORY_TRACKING, user can know information of the
zram block. It could be useful to catch cold or incompressible
pages of the process with*pagemap.
+
If you enable the feature, you could see block state via
-/sys/kernel/debug/zram/zram0/block_state". The output is as follows,
+/sys/kernel/debug/zram/zram0/block_state". The output is as follows::
300 75.033841 .wh.
301 63.806904 s...
302 63.806919 ..hi
-First column is zram's block index.
-Second column is access time since the system was booted
-Third column is state of the block.
-(s: same page
-w: written page to backing store
-h: huge page
-i: idle page)
+First column
+ zram's block index.
+Second column
+ access time since the system was booted
+Third column
+ state of the block:
+
+ s:
+ same page
+ w:
+ written page to backing store
+ h:
+ huge page
+ i:
+ idle page
First line of above example says 300th block is accessed at 75.033841sec
and the block's state is huge so it is written back to the backing
diff --git a/MAINTAINERS b/MAINTAINERS
index ca1d09d0c44b..3ed27c0c36d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10823,7 +10823,7 @@ M: Josef Bacik <josef(a)toxicpanda.com>
S: Maintained
L: linux-block(a)vger.kernel.org
L: nbd(a)other.debian.org
-F: Documentation/blockdev/nbd.txt
+F: Documentation/blockdev/nbd.rst
F: drivers/block/nbd.c
F: include/uapi/linux/nbd.h
@@ -11801,7 +11801,7 @@ PARIDE DRIVERS FOR PARALLEL PORT IDE DEVICES
M: Tim Waugh <tim(a)cyberelk.net>
L: linux-parport(a)lists.infradead.org (subscribers-only)
S: Maintained
-F: Documentation/blockdev/paride.txt
+F: Documentation/blockdev/paride.rst
F: drivers/block/paride/
PARISC ARCHITECTURE
@@ -13077,7 +13077,7 @@ F: drivers/net/wireless/ralink/rt2x00/
RAMDISK RAM BLOCK DEVICE DRIVER
M: Jens Axboe <axboe(a)kernel.dk>
S: Maintained
-F: Documentation/blockdev/ramdisk.txt
+F: Documentation/blockdev/ramdisk.rst
F: drivers/block/brd.c
RANCHU VIRTUAL BOARD FOR MIPS
@@ -17360,7 +17360,7 @@ R: Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
L: linux-kernel(a)vger.kernel.org
S: Maintained
F: drivers/block/zram/
-F: Documentation/blockdev/zram.txt
+F: Documentation/blockdev/zram.rst
ZS DECSTATION Z85C30 SERIAL DRIVER
M: "Maciej W. Rozycki" <macro(a)linux-mips.org>
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 96ec7e0fc1ea..c43690b973d8 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -31,7 +31,7 @@ config BLK_DEV_FD
If you want to use the floppy disk drive(s) of your PC under Linux,
say Y. Information about this driver, especially important for IBM
Thinkpad users, is contained in
- <file:Documentation/blockdev/floppy.txt>.
+ <file:Documentation/blockdev/floppy.rst>.
That file also contains the location of the Floppy driver FAQ as
well as location of the fdutils package used to configure additional
parameters of the driver at run time.
@@ -96,7 +96,7 @@ config PARIDE
your computer's parallel port. Most of them are actually IDE devices
using a parallel port IDE adapter. This option enables the PARIDE
subsystem which contains drivers for many of these external drives.
- Read <file:Documentation/blockdev/paride.txt> for more information.
+ Read <file:Documentation/blockdev/paride.rst> for more information.
If you have said Y to the "Parallel-port support" configuration
option, you may share a single port between your printer and other
@@ -261,7 +261,7 @@ config BLK_DEV_NBD
userland (making server and client physically the same computer,
communicating using the loopback network device).
- Read <file:Documentation/blockdev/nbd.txt> for more information,
+ Read <file:Documentation/blockdev/nbd.rst> for more information,
especially about where to find the server code, which runs in user
space and does not need special kernel support.
@@ -303,7 +303,7 @@ config BLK_DEV_RAM
during the initial install of Linux.
Note that the kernel command line option "ramdisk=XX" is now obsolete.
- For details, read <file:Documentation/blockdev/ramdisk.txt>.
+ For details, read <file:Documentation/blockdev/ramdisk.rst>.
To compile this driver as a module, choose M here: the
module will be called brd. An alias "rd" has been defined
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index b8998abd86a5..14701be9f916 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4423,7 +4423,7 @@ static int __init floppy_setup(char *str)
pr_cont("\n");
} else
DPRINT("botched floppy option\n");
- DPRINT("Read Documentation/blockdev/floppy.txt\n");
+ DPRINT("Read Documentation/blockdev/floppy.rst\n");
return 0;
}
diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 1ffc64770643..e06b99d54816 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -12,7 +12,7 @@ config ZRAM
It has several use cases, for example: /tmp storage, use as swap
disks and maybe many more.
- See Documentation/blockdev/zram.txt for more information.
+ See Documentation/blockdev/zram.rst for more information.
config ZRAM_WRITEBACK
bool "Write back incompressible or idle page to backing device"
@@ -26,7 +26,7 @@ config ZRAM_WRITEBACK
With /sys/block/zramX/{idle,writeback}, application could ask
idle page's writeback to the backing device to save in memory.
- See Documentation/blockdev/zram.txt for more information.
+ See Documentation/blockdev/zram.rst for more information.
config ZRAM_MEMORY_TRACKING
bool "Track zRam block status"
@@ -36,4 +36,4 @@ config ZRAM_MEMORY_TRACKING
of zRAM. Admin could see the information via
/sys/kernel/debug/zram/zramX/block_state.
- See Documentation/blockdev/zram.txt for more information.
+ See Documentation/blockdev/zram.rst for more information.
diff --git a/tools/testing/selftests/zram/README b/tools/testing/selftests/zram/README
index 7972cc512408..5fa378391d3b 100644
--- a/tools/testing/selftests/zram/README
+++ b/tools/testing/selftests/zram/README
@@ -37,4 +37,4 @@ Commands required for testing:
- mkfs/ mkfs.ext4
For more information please refer:
-kernel-source-tree/Documentation/blockdev/zram.txt
+kernel-source-tree/Documentation/blockdev/zram.rst
--
2.20.1
The run_afpackettests will be marked as passed regardless the return
value of those sub-tests in the script:
--------------------
running psock_tpacket test
--------------------
[FAIL]
selftests: run_afpackettests [PASS]
Fix this by changing the return value for each tests.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/run_afpackettests | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/testing/selftests/net/run_afpackettests b/tools/testing/selftests/net/run_afpackettests
index 2dc95fd..ea5938e 100755
--- a/tools/testing/selftests/net/run_afpackettests
+++ b/tools/testing/selftests/net/run_afpackettests
@@ -6,12 +6,14 @@ if [ $(id -u) != 0 ]; then
exit 0
fi
+ret=0
echo "--------------------"
echo "running psock_fanout test"
echo "--------------------"
./in_netns.sh ./psock_fanout
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ ret=1
else
echo "[PASS]"
fi
@@ -22,6 +24,7 @@ echo "--------------------"
./in_netns.sh ./psock_tpacket
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ ret=1
else
echo "[PASS]"
fi
@@ -32,6 +35,8 @@ echo "--------------------"
./in_netns.sh ./txring_overwrite
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ ret=1
else
echo "[PASS]"
fi
+exit $ret
--
2.7.4
Build and run gpio when output directory is the src dir. gpio has
dependency on tools/gpio and builds tools/gpio objects in the src
directory in all cases making the src repo dirty even when object
relocation is specified.
This fixes the following commands from generating gpio objects in
the source repository:
make O=dir kselftest
export KBUILD_OUTPUT=dir; make kselftest
make O=dir -C tools/testing/selftests
expoert KBUILD_OUTPUT=dir; make -C tools/testing/selftests
The following still don't work (need fixing in gpio Makefile):
make O=dir kselftest TARGETS="gpio"
export KBUILD_OUTPUT=dir; make kselftest TARGETS="gpio"
make O=dir -C tools/testing/selftests TARGETS="gpio"
expoert KBUILD_OUTPUT=dir; make -C tools/testing/selftests TARGETS="gpio"
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
tools/testing/selftests/Makefile | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b43c5f41fb3e..f2ebf8cf4686 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -99,6 +99,15 @@ ARCH ?= $(SUBARCH)
export KSFT_KHDR_INSTALL_DONE := 1
export BUILD
+# build and run gpio when output directory is the src dir.
+# gpio has dependency on tools/gpio and builds tools/gpio
+# objects in the src directory in all cases making the src
+# repo dirty even when objects are relocated.
+ifneq (1,$(DEFAULT_INSTALL_HDR_PATH))
+ TMP := $(filter-out gpio, $(TARGETS))
+ TARGETS := $(TMP)
+endif
+
# set default goal to all, so make without a target runs all, even when
# all isn't the first target in the file.
.DEFAULT_GOAL := all
--
2.17.1
On smaller systems, running a test with 200 threads can take a long
time on machines with smaller number of CPUs.
Detect the number of online cpus at test runtime, and multiply that
by 6 to have 6 rseq threads per cpu preempting each other.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Dave Watson <davejwatson(a)fb.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Andi Kleen <andi(a)firstfloor.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: "H . Peter Anvin" <hpa(a)zytor.com>
Cc: Chris Lameter <cl(a)linux.com>
Cc: Russell King <linux(a)arm.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: "Paul E . McKenney" <paulmck(a)linux.vnet.ibm.com>
Cc: Paul Turner <pjt(a)google.com>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Ben Maurer <bmaurer(a)fb.com>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
---
tools/testing/selftests/rseq/run_param_test.sh | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/rseq/run_param_test.sh b/tools/testing/selftests/rseq/run_param_test.sh
index 3acd6d75ff9f..e426304fd4a0 100755
--- a/tools/testing/selftests/rseq/run_param_test.sh
+++ b/tools/testing/selftests/rseq/run_param_test.sh
@@ -1,6 +1,8 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+ or MIT
+NR_CPUS=`grep '^processor' /proc/cpuinfo | wc -l`
+
EXTRA_ARGS=${@}
OLDIFS="$IFS"
@@ -28,15 +30,16 @@ IFS="$OLDIFS"
REPS=1000
SLOW_REPS=100
+NR_THREADS=$((6*${NR_CPUS}))
function do_tests()
{
local i=0
while [ "$i" -lt "${#TEST_LIST[@]}" ]; do
echo "Running test ${TEST_NAME[$i]}"
- ./param_test ${TEST_LIST[$i]} -r ${REPS} ${@} ${EXTRA_ARGS} || exit 1
+ ./param_test ${TEST_LIST[$i]} -r ${REPS} -t ${NR_THREADS} ${@} ${EXTRA_ARGS} || exit 1
echo "Running compare-twice test ${TEST_NAME[$i]}"
- ./param_test_compare_twice ${TEST_LIST[$i]} -r ${REPS} ${@} ${EXTRA_ARGS} || exit 1
+ ./param_test_compare_twice ${TEST_LIST[$i]} -r ${REPS} -t ${NR_THREADS} ${@} ${EXTRA_ARGS} || exit 1
let "i++"
done
}
--
2.11.0
"make kselftest" fails with "Circular Makefile.o <- prepare dependency
dropped." error, when lib.mk invokes "make headers_install".
Make level 0: Main make calls selftests run_tests target
...
Make level n: selftests lib.mk invokes main make's headers_install
The secondary level make inherits builtin-rules which will use the rule
to generate Makefile.o and runs into "Circular Makefile.o <- prepare
dependency dropped." error, and kselftest compile fails.
Invoke headers_install target with --no-builtin-rules to avoid circular
error.
In addition, lib.mk installs headers in the default HDR_PATH, even when
build relocation is requested with O= or export KBUILD_OUTPUT. Fix the
problem by passing in INSTALL_HDR_PATH. The headers are installed under
the specified output "dir/usr".
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
Changes since v1:
lib.mk needs DEFAULT_GOAL to run all
tools/testing/selftests/Makefile | 52 +++++++++++++++++++++++++++-----
tools/testing/selftests/lib.mk | 38 +++++++++++++++++++++--
2 files changed, 80 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 971fc8428117..743cd374d07d 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -75,12 +75,15 @@ ifneq ($(KBUILD_SRC),)
override LDFLAGS =
endif
-BUILD := $(O)
-ifndef BUILD
- BUILD := $(KBUILD_OUTPUT)
-endif
-ifndef BUILD
- BUILD := $(shell pwd)
+ifneq ($(O),)
+ BUILD := $(O)
+else
+ ifneq ($(KBUILD_OUTPUT),)
+ BUILD := $(KBUILD_OUTPUT)
+ else
+ BUILD := $(shell pwd)
+ DEFAULT_INSTALL_HDR_PATH := 1
+ endif
endif
# KSFT_TAP_LEVEL is used from KSFT framework to prevent nested TAP header
@@ -89,8 +92,41 @@ endif
# with system() call. Export it here to cover override RUN_TESTS defines.
export KSFT_TAP_LEVEL=`echo 1`
+# Prepare for headers install
+top_srcdir ?= ../../..
+include $(top_srcdir)/scripts/subarch.include
+ARCH ?= $(SUBARCH)
+export KSFT_KHDR_INSTALL_DONE := 1
export BUILD
-all:
+
+# set default goal to all, so make without a target runs all, even when
+# all isn't the first target in the file.
+.DEFAULT_GOAL := all
+
+# Install headers here once for all tests. KSFT_KHDR_INSTALL_DONE
+# is used to avoid running headers_install from lib.mk.
+# Invoke headers install with --no-builtin-rules to avoid circular
+# dependency in "make kselftest" case. In this case, second level
+# make inherits builtin-rules which will use the rule generate
+# Makefile.o and runs into
+# "Circular Makefile.o <- prepare dependency dropped."
+# and headers_install fails and test compile fails.
+#
+# O= KBUILD_OUTPUT cases don't run into this error, since main Makefile
+# invokes them as sub-makes and --no-builtin-rules is not necessary,
+# but doesn't cause any failures. Keep it simple and use the same
+# flags in both cases.
+# Local build cases: "make kselftest", "make -C" - headers are installed
+# in the default INSTALL_HDR_PATH usr/include.
+khdr:
+ifeq (1,$(DEFAULT_INSTALL_HDR_PATH))
+ make --no-builtin-rules ARCH=$(ARCH) -C $(top_srcdir) headers_install
+else
+ make --no-builtin-rules INSTALL_HDR_PATH=$$BUILD/usr \
+ ARCH=$(ARCH) -C $(top_srcdir) headers_install
+endif
+
+all: khdr
@for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
mkdir $$BUILD_TARGET -p; \
@@ -173,4 +209,4 @@ clean:
make OUTPUT=$$BUILD_TARGET -C $$TARGET clean;\
done;
-.PHONY: all run_tests hotplug run_hotplug clean_hotplug run_pstore_crash install clean
+.PHONY: khdr all run_tests hotplug run_hotplug clean_hotplug run_pstore_crash install clean
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 8b0f16409ed7..5979fdc4f36c 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -3,7 +3,16 @@
CC := $(CROSS_COMPILE)gcc
ifeq (0,$(MAKELEVEL))
-OUTPUT := $(shell pwd)
+ ifneq ($(O),)
+ OUTPUT := $(O)
+ else
+ ifneq ($(KBUILD_OUTPUT),)
+ OUTPUT := $(KBUILD_OUTPUT)
+ else
+ OUTPUT := $(shell pwd)
+ DEFAULT_INSTALL_HDR_PATH := 1
+ endif
+ endif
endif
# The following are built by lib.mk common compile rules.
@@ -21,9 +30,34 @@ top_srcdir ?= ../../../..
include $(top_srcdir)/scripts/subarch.include
ARCH ?= $(SUBARCH)
+# set default goal to all, so make without a target runs all, even when
+# all isn't the first target in the file.
+.DEFAULT_GOAL := all
+
+# Invoke headers install with --no-builtin-rules to avoid circular
+# dependency in "make kselftest" case. In this case, second level
+# make inherits builtin-rules which will use the rule generate
+# Makefile.o and runs into
+# "Circular Makefile.o <- prepare dependency dropped."
+# and headers_install fails and test compile fails.
+# O= KBUILD_OUTPUT cases don't run into this error, since main Makefile
+# invokes them as sub-makes and --no-builtin-rules is not necessary,
+# but doesn't cause any failures. Keep it simple and use the same
+# flags in both cases.
+# Note that the support to install headers from lib.mk is necessary
+# when test Makefile is run directly with "make -C".
+# When local build is done, headers are installed in the default
+# INSTALL_HDR_PATH usr/include.
.PHONY: khdr
khdr:
- make ARCH=$(ARCH) -C $(top_srcdir) headers_install
+ifndef KSFT_KHDR_INSTALL_DONE
+ifeq (1,$(DEFAULT_INSTALL_HDR_PATH))
+ make --no-builtin-rules ARCH=$(ARCH) -C $(top_srcdir) headers_install
+else
+ make --no-builtin-rules INSTALL_HDR_PATH=$$OUTPUT/usr \
+ ARCH=$(ARCH) -C $(top_srcdir) headers_install
+endif
+endif
all: khdr $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES)
else
--
2.17.1
Hi Shuah,
This pull request moves the existing kexec_load test to the
selftests/kexec directory, defines a set of local "common" shell
functions, and defines a new kexec_file_load test. The local "common"
shell functions could be the basis for a generic set of "common" shell
functions.
Thanks,
Mimi
The following changes since commit 15ade5d2e7775667cf191cf2f94327a4889f8b9d:
Linux 5.1-rc4 (2019-04-07 14:09:59 -1000)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git next-integrity-for-shuah
for you to fetch changes up to eaa219ab1fd2c2a81704556aefd3dcde904dda33:
selftests/kexec: update get_secureboot_mode (2019-04-11 19:12:10 -0400)
----------------------------------------------------------------
Mimi Zohar (9):
selftests/kexec: move the IMA kexec_load selftest to selftests/kexec
selftests/kexec: cleanup the kexec selftest
selftests/kexec: define a set of common functions
selftests/kexec: define common logging functions
selftests/kexec: define "require_root_privileges"
selftests/kexec: kexec_file_load syscall test
selftests/kexec: check kexec_load and kexec_file_load are enabled
selftests/kexec: make kexec_load test independent of IMA being enabled
selftests/kexec: update get_secureboot_mode
Petr Vorel (1):
selftests/kexec: Add missing '=y' to config options
tools/testing/selftests/Makefile | 2 +-
tools/testing/selftests/ima/config | 4 -
tools/testing/selftests/ima/test_kexec_load.sh | 54 -----
tools/testing/selftests/{ima => kexec}/Makefile | 5 +-
tools/testing/selftests/kexec/config | 3 +
tools/testing/selftests/kexec/kexec_common_lib.sh | 220 +++++++++++++++++++++
.../selftests/kexec/test_kexec_file_load.sh | 208 +++++++++++++++++++
tools/testing/selftests/kexec/test_kexec_load.sh | 47 +++++
8 files changed, 482 insertions(+), 61 deletions(-)
delete mode 100644 tools/testing/selftests/ima/config
delete mode 100755 tools/testing/selftests/ima/test_kexec_load.sh
rename tools/testing/selftests/{ima => kexec}/Makefile (59%)
create mode 100644 tools/testing/selftests/kexec/config
create mode 100755 tools/testing/selftests/kexec/kexec_common_lib.sh
create mode 100755 tools/testing/selftests/kexec/test_kexec_file_load.sh
create mode 100755 tools/testing/selftests/kexec/test_kexec_load.sh
Fix KBUILD_OUTPUT usage instructions. The current documentation is
incorrect. Update and fix outdated information about summary option.
Add a reference to kselftest wiki for additional information on the
framework and tips on writing new tests.
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
Changes since v1:
-- Fixed the confusing language about output directory
-- Updated outdated instructions on summary option.
-- Regenerated patch with current email address.
Documentation/dev-tools/kselftest.rst | 42 ++++++++++++++++++---------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index c8c03388b9de..25604904fa6e 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -7,6 +7,11 @@ directory. These are intended to be small tests to exercise individual code
paths in the kernel. Tests are intended to be run after building, installing
and booting a kernel.
+You can find additional information on Kselftest framework, how to
+write new tests using the framework on Kselftest wiki:
+
+https://kselftest.wiki.kernel.org/
+
On some systems, hot-plug tests could hang forever waiting for cpu and
memory to be ready to be offlined. A special hot-plug target is created
to run the full range of hot-plug tests. In default mode, hot-plug tests run
@@ -35,17 +40,32 @@ To build and run the tests with a single command, use::
Note that some tests will require root privileges.
-Build and run from user specific object directory (make O=dir)::
+Kselftest supports saving output files in a separate directory and then
+running tests. To locate output files in a separate directory two syntaxes
+are supported. In both cases the working directory must be the root of the
+kernel src. This is applicable to "Running a subset of selftests" section
+below.
+
+To build, save output files in a separate directory with O= ::
$ make O=/tmp/kselftest kselftest
-Build and run KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
+To build, save output files in a separate directory with KBUILD_OUTPUT ::
+
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make kselftest
- $ make KBUILD_OUTPUT=/tmp/kselftest kselftest
+The O= assignment takes precedence over the KBUILD_OUTPUT environment
+variable.
-The above commands run the tests and print pass/fail summary to make it
-easier to understand the test results. Please find the detailed individual
-test results for each test in /tmp/testname file(s).
+The above commands by default run the tests and print full pass/fail report.
+Kselftest supports "summary" option to make it easier to understand the test
+results. Please find the detailed individual test results for each test in
+/tmp/testname file(s) when summary option is specified. This is applicable
+to "Running a subset of selftests" section below.
+
+To run kselftest with summary option enabled ::
+
+ $ make summary=1 kselftest
Running a subset of selftests
=============================
@@ -61,17 +81,13 @@ You can specify multiple tests to build and run::
$ make TARGETS="size timers" kselftest
-Build and run from user specific object directory (make O=dir)::
+To build, save output files in a separate directory with O= ::
$ make O=/tmp/kselftest TARGETS="size timers" kselftest
-Build and run KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
-
- $ make KBUILD_OUTPUT=/tmp/kselftest TARGETS="size timers" kselftest
+To build, save output files in a separate directory with KBUILD_OUTPUT ::
-The above commands run the tests and print pass/fail summary to make it
-easier to understand the test results. Please find the detailed individual
-test results for each test in /tmp/testname file(s).
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make TARGETS="size timers" kselftest
See the top-level tools/testing/selftests/Makefile for the list of all
possible targets.
--
2.17.1
Test files created by test_create() and test_create_empty() tests will
stay in the $efivarfs_mount directory until the system was rebooted.
When the tester tries to run this efivarfs test again on the same
system, the immutable characteristics in that directory will cause some
"Operation not permitted" noises, and a false-positve test result as the
file was created in previous run.
--------------------
running test_create
--------------------
./efivarfs.sh: line 59: /sys/firmware/efi/efivars/test_create-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
[PASS]
--------------------
running test_create_empty
--------------------
./efivarfs.sh: line 78: /sys/firmware/efi/efivars/test_create_empty-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
[PASS]
--------------------
Create a file_cleanup() to remove those test files in the end of each
test to solve this issue.
For the test_create_read, we can move the clean up task to the end of
the test to ensure the system is clean.
Also, use this function to replace the existing file removal code.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/efivarfs/efivarfs.sh | 32 +++++++++++-----------------
1 file changed, 13 insertions(+), 19 deletions(-)
diff --git a/tools/testing/selftests/efivarfs/efivarfs.sh b/tools/testing/selftests/efivarfs/efivarfs.sh
index d386610..a90f394 100755
--- a/tools/testing/selftests/efivarfs/efivarfs.sh
+++ b/tools/testing/selftests/efivarfs/efivarfs.sh
@@ -7,6 +7,12 @@ test_guid=210be57c-9849-4fc7-a635-e6382d1aec27
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+file_cleanup()
+{
+ chattr -i $1
+ rm -f $1
+}
+
check_prereqs()
{
local msg="skip all tests:"
@@ -58,8 +64,10 @@ test_create()
if [ $(stat -c %s $file) -ne 5 ]; then
echo "$file has invalid size" >&2
+ file_cleanup $file
exit 1
fi
+ file_cleanup $file
}
test_create_empty()
@@ -72,16 +80,14 @@ test_create_empty()
echo "$file can not be created without writing" >&2
exit 1
fi
+ file_cleanup $file
}
test_create_read()
{
local file=$efivarfs_mount/$FUNCNAME-$test_guid
- if [ -f $file]; then
- chattr -i $file
- rm -rf $file
- fi
./create-read $file
+ file_cleanup $file
}
test_delete()
@@ -96,11 +102,7 @@ test_delete()
exit 1
fi
- rm $file 2>/dev/null
- if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
- fi
+ file_cleanup $file
if [ -e $file ]; then
echo "$file couldn't be deleted" >&2
@@ -154,11 +156,7 @@ test_valid_filenames()
echo "$file could not be created" >&2
ret=1
else
- rm $file 2>/dev/null
- if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
- fi
+ file_cleanup $file
fi
done
@@ -191,11 +189,7 @@ test_invalid_filenames()
if [ -e $file ]; then
echo "Creating $file should have failed" >&2
- rm $file 2>/dev/null
- if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
- fi
+ file_cleanup $file
ret=1
fi
done
--
2.7.4
Test files created by test_create*() tests will stay in the
$efivarfs_mount directory unless the system was rebooted.
When the tester tries to run this efivarfs test again on the same
system, the immutable characteristics in that directory will cause some
"Operation not permitted" noises and a false-positve test result to the
test_create_read() test.
--------------------
running test_create
--------------------
./efivarfs.sh: line 59: /sys/firmware/efi/efivars/test_create-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
[PASS]
--------------------
running test_create_empty
--------------------
./efivarfs.sh: line 78: /sys/firmware/efi/efivars/test_create_empty-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
[PASS]
--------------------
running test_create_read
--------------------
open(O_WRONLY): Operation not permitted
[FAIL]
--------------------
Create a file_cleanup() to remove those test files in the end of each
test to solve this issue.
Also, use this function to replace the existing file removal code.
Link: https://bugs.launchpad.net/bugs/1809704
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/efivarfs/efivarfs.sh | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
mode change 100755 => 100644 tools/testing/selftests/efivarfs/efivarfs.sh
diff --git a/tools/testing/selftests/efivarfs/efivarfs.sh b/tools/testing/selftests/efivarfs/efivarfs.sh
old mode 100755
new mode 100644
index a47029a..14fa6fe
--- a/tools/testing/selftests/efivarfs/efivarfs.sh
+++ b/tools/testing/selftests/efivarfs/efivarfs.sh
@@ -7,6 +7,12 @@ test_guid=210be57c-9849-4fc7-a635-e6382d1aec27
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+file_cleanup()
+{
+ chattr -i $1
+ rm $1
+}
+
check_prereqs()
{
local msg="skip all tests:"
@@ -58,8 +64,10 @@ test_create()
if [ $(stat -c %s $file) -ne 5 ]; then
echo "$file has invalid size" >&2
+ file_cleanup $file
exit 1
fi
+ file_cleanup $file
}
test_create_empty()
@@ -72,12 +80,14 @@ test_create_empty()
echo "$file can not be created without writing" >&2
exit 1
fi
+ file_cleanup $file
}
test_create_read()
{
local file=$efivarfs_mount/$FUNCNAME-$test_guid
./create-read $file
+ file_cleanup $file
}
test_delete()
@@ -94,8 +104,7 @@ test_delete()
rm $file 2>/dev/null
if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
+ file_cleanup $file
fi
if [ -e $file ]; then
@@ -152,8 +161,7 @@ test_valid_filenames()
else
rm $file 2>/dev/null
if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
+ file_cleanup $file
fi
fi
done
@@ -189,8 +197,7 @@ test_invalid_filenames()
echo "Creating $file should have failed" >&2
rm $file 2>/dev/null
if [ $? -ne 0 ]; then
- chattr -i $file
- rm $file
+ file_cleanup $file
fi
ret=1
fi
--
2.7.4
The run_netsocktests will be marked as passed regardless the actual test
result from the ./socket:
selftests: net: run_netsocktests
========================================
--------------------
running socket test
--------------------
[FAIL]
ok 1..6 selftests: net: run_netsocktests [PASS]
This is because the test script itself has been successfully executed.
Fix this by exit 1 when the test failed.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/run_netsocktests | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/run_netsocktests b/tools/testing/selftests/net/run_netsocktests
index b093f39c2..14e41fa 100755
--- a/tools/testing/selftests/net/run_netsocktests
+++ b/tools/testing/selftests/net/run_netsocktests
@@ -7,7 +7,7 @@ echo "--------------------"
./socket
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ exit 1
else
echo "[PASS]"
fi
-
--
2.7.4
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190417
Regressions (compared to build next-20190416)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190416)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
juno-r2:
i386:
x86:
x15:
dragonboard-410c:
hi6220-hikey:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
Fix KBUILD_OUTPUT usage instructions. The current documentation
is incorrect.
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
Documentation/dev-tools/kselftest.rst | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index c8c03388b9de..6c910ddaa9f9 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -39,9 +39,9 @@ Build and run from user specific object directory (make O=dir)::
$ make O=/tmp/kselftest kselftest
-Build and run KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
+Build and run from KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
- $ make KBUILD_OUTPUT=/tmp/kselftest kselftest
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make kselftest
The above commands run the tests and print pass/fail summary to make it
easier to understand the test results. Please find the detailed individual
@@ -65,9 +65,9 @@ Build and run from user specific object directory (make O=dir)::
$ make O=/tmp/kselftest TARGETS="size timers" kselftest
-Build and run KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
+Build and run from KBUILD_OUTPUT directory (make KBUILD_OUTPUT=)::
- $ make KBUILD_OUTPUT=/tmp/kselftest TARGETS="size timers" kselftest
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make TARGETS="size timers" kselftest
The above commands run the tests and print pass/fail summary to make it
easier to understand the test results. Please find the detailed individual
--
2.17.1
Introduce in-kernel headers and other artifacts which are made available
as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
it possible to build kernel modules, run eBPF programs, and other
tracing programs that need to extend the kernel for tracing purposes
without any dependency on the file system having headers and build
artifacts.
On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Further once a
different kernel is booted, any headers stored on the file system will
no longer be useful. By storing the headers as a compressed archive
within the kernel, we can avoid these issues that have been a hindrance
for a long time.
The best way to use this feature is by building it in. Several users
have a need for this, when they switch debug kernels, they donot want to
update the filesystem or worry about it where to store the headers on
it. However, the feature is also buildable as a module in case the user
desires it not being part of the kernel image. This makes it possible to
load and unload the headers from memory on demand. A tracing program, or
a kernel module builder can load the module, do its operations, and then
unload the module to save kernel memory. The total memory needed is 3.8MB.
By having the archive available at a fixed location independent of
filesystem dependencies and conventions, all debugging tools can
directly refer to the fixed location for the archive, without concerning
with where the headers on a typical filesystem which significantly
simplifies tooling that needs kernel headers.
The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.
To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders
Additional notes:
(1) external modules must be built on the same arch as the host that
built vmlinux. This can be done either in a qemu emulated chroot on the
target, or natively. This is due to host arch dependency of kernel
scripts.
(2)
If module building is used, since Module.symvers is not available in the
archive due to a cyclic dependency with building of the archive into the
kernel or module binaries, the modules built using the archive will not
contain symbol versioning (modversion). This is usually not an issue
since the idea of this patch is to build a kernel module on the fly and
load it into the same kernel. An appropriate warning is already printed
by the kernel to alert the user of modules not having modversions when
built using the archive. For building with modversions, the user can use
traditional header packages. For our tracing usecases, we build modules
on the fly with this so it is not a concern.
(3) I have left IKHD_ST and IKHD_ED markers as is to facilitate
future patches that would extract the headers from a kernel or module
image.
(v4 was Tested-by the following folks,
v5 only has minor changes and has passed my testing).
Tested-by: qais.yousef(a)arm.com
Tested-by: dietmar.eggemann(a)arm.com
Tested-by: linux(a)manojrajarao.com
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
v4 -> v5:
(Thanks to Masahiro Yamada for several excellent suggestions)
- used incbin instead of bin2c (Masahiro did similar idea)
- added module.lds if ia64 otherwise ia64 may fail to build.
- added clean-files rule to Makefile
- removed strip-comments script and doing it inline
- added set -e to header generated to die on errorsr
- fixed a minor issue where find command was noisy.
- removed unneeded tar.xz rule from kernel/.gitignore
- added Tested-by tags from ARM folks.
Changes since v3:
- Blank tar was being generated because of a one line I
forgot to push. It is updated now.
- Added module.lds since arm64 needs it to build modules.
Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- clean ups to kheaders.c pointed by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.
Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet
Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
working.
init/Kconfig | 11 ++++++
kernel/.gitignore | 1 +
kernel/Makefile | 28 ++++++++++++++
kernel/kheaders.c | 73 +++++++++++++++++++++++++++++++++++
scripts/gen_ikh_data.sh | 84 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 197 insertions(+)
create mode 100644 kernel/kheaders.c
create mode 100755 scripts/gen_ikh_data.sh
diff --git a/init/Kconfig b/init/Kconfig
index 4592bf7997c0..ea75bfbf7dfa 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -580,6 +580,17 @@ config IKCONFIG_PROC
This option enables access to the kernel configuration file
through /proc/config.gz.
+config IKHEADERS_PROC
+ tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+ depends on PROC_FS
+ help
+ This option enables access to the kernel header and other artifacts that
+ are generated during the build process. These can be used to build kernel
+ modules or by other in-kernel programs such as those generated by eBPF
+ and systemtap tools. If you build the headers as a module, a module
+ called kheaders.ko is built which can be loaded on-demand to get access
+ to the headers.
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index 6e699100872f..34d1e77ee9df 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -1,5 +1,6 @@
#
# Generated files
#
+kheaders.md5
timeconst.h
hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6c57e78817da..7c486bc25fd7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -121,3 +122,30 @@ $(obj)/configs.o: $(obj)/config_data.gz
targets += config_data.gz
$(obj)/config_data.gz: $(KCONFIG_CONFIG) FORCE
$(call if_changed,gzip)
+
+# Build a list of in-kernel headers for building kernel modules
+ikh_file_list := include/
+ikh_file_list += arch/$(SRCARCH)/Makefile
+ikh_file_list += arch/$(SRCARCH)/include/
+ikh_file_list += arch/$(SRCARCH)/module.lds
+ikh_file_list += arch/$(SRCARCH)/kernel/module.lds
+ikh_file_list += scripts/
+ikh_file_list += Makefile
+
+# Things we need from the $objtree. "OBJDIR" is for the gen_ikh_data.sh
+# script to identify that this comes from the $objtree directory
+ikh_file_list += OBJDIR/scripts/
+ikh_file_list += OBJDIR/include/
+ikh_file_list += OBJDIR/arch/$(SRCARCH)/include/
+ifeq ($(CONFIG_STACK_VALIDATION), y)
+ikh_file_list += OBJDIR/tools/objtool/objtool
+endif
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.tar.xz
+
+quiet_cmd_genikh = GEN $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $(ikh_file_list)
+$(obj)/kheaders_data.tar.xz: FORCE
+ $(call cmd,genikh)
+
+clean-files := kheaders_data.tar.xz kheaders.md5
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..d072a958a8f1
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_end, within which the the
+ * compressed kernel headers are stpred. The file is first compressed with xz.
+ */
+
+asm (
+" .pushsection .rodata, \"a\" \n"
+" .global kernel_headers_data \n"
+"kernel_headers_data: \n"
+" .incbin \"kernel/kheaders_data.tar.xz\" \n"
+" .global kernel_headers_data_end \n"
+"kernel_headers_data_end: \n"
+" .popsection \n"
+);
+
+extern char kernel_headers_data;
+extern char kernel_headers_data_end;
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+ size_t len, loff_t *offset)
+{
+ return simple_read_from_buffer(buf, len, offset,
+ &kernel_headers_data,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+ .read = ikheaders_read_current,
+ .llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+ struct proc_dir_entry *entry;
+
+ /* create the current headers file */
+ entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+ &ikheaders_file_ops);
+ if (!entry)
+ return -ENOMEM;
+
+ proc_set_size(entry,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+ return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+ remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
new file mode 100755
index 000000000000..3f9cae72c2a4
--- /dev/null
+++ b/scripts/gen_ikh_data.sh
@@ -0,0 +1,84 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# This script generates an archive consisting of kernel headers
+# for CONFIG_IKHEADERS_PROC.
+set -e
+
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+file_list=${@:2}
+
+src_file_list=""
+for f in $file_list; do
+ if [ ! -f "$kroot/$f" ] && [ ! -d "$kroot/$f" ]; then continue; fi
+ src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
+done
+
+obj_file_list=""
+for f in $file_list; do
+ f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
+ if [ ! -f $f ] && [ ! -d $f ]; then continue; fi
+ obj_file_list="$obj_file_list $f";
+done
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# modules.order and include/generated/compile.h are ignored because these are
+# touched even when none of the source files changed. This causes pointless
+# regeneration, so let us ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f ! -name modules.order |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+ [ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+ exit
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+ do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+ do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+find $cpio_dir -type f -print0 |
+ xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
--
2.21.0.225.g810b269d1ac-goog
"make kselftest" fails with "Circular Makefile.o <- prepare dependency
dropped." error, when lib.mk invokes "make headers_install".
Make level 0: Main make calls selftests run_tests target
...
Make level n: selftests lib.mk invokes main make's headers_install
The secondary level make inherits builtin-rules which will use the rule
to generate Makefile.o and runs into "Circular Makefile.o <- prepare
dependency dropped." error, and kselftest compile fails.
Invoke headers_install target with --no-builtin-rules to avoid circular
error.
In addition, lib.mk installs headers in the default HDR_PATH, even when
build relocation is requested with O= or export KBUILD_OUTPUT. Fix the
problem by passing in INSTALL_HDR_PATH. The headers are installed under
the specified output "dir/usr".
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
tools/testing/selftests/Makefile | 52 +++++++++++++++++++++++++++-----
tools/testing/selftests/lib.mk | 34 +++++++++++++++++++--
2 files changed, 76 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 971fc8428117..743cd374d07d 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -75,12 +75,15 @@ ifneq ($(KBUILD_SRC),)
override LDFLAGS =
endif
-BUILD := $(O)
-ifndef BUILD
- BUILD := $(KBUILD_OUTPUT)
-endif
-ifndef BUILD
- BUILD := $(shell pwd)
+ifneq ($(O),)
+ BUILD := $(O)
+else
+ ifneq ($(KBUILD_OUTPUT),)
+ BUILD := $(KBUILD_OUTPUT)
+ else
+ BUILD := $(shell pwd)
+ DEFAULT_INSTALL_HDR_PATH := 1
+ endif
endif
# KSFT_TAP_LEVEL is used from KSFT framework to prevent nested TAP header
@@ -89,8 +92,41 @@ endif
# with system() call. Export it here to cover override RUN_TESTS defines.
export KSFT_TAP_LEVEL=`echo 1`
+# Prepare for headers install
+top_srcdir ?= ../../..
+include $(top_srcdir)/scripts/subarch.include
+ARCH ?= $(SUBARCH)
+export KSFT_KHDR_INSTALL_DONE := 1
export BUILD
-all:
+
+# set default goal to all, so make without a target runs all, even when
+# all isn't the first target in the file.
+.DEFAULT_GOAL := all
+
+# Install headers here once for all tests. KSFT_KHDR_INSTALL_DONE
+# is used to avoid running headers_install from lib.mk.
+# Invoke headers install with --no-builtin-rules to avoid circular
+# dependency in "make kselftest" case. In this case, second level
+# make inherits builtin-rules which will use the rule generate
+# Makefile.o and runs into
+# "Circular Makefile.o <- prepare dependency dropped."
+# and headers_install fails and test compile fails.
+#
+# O= KBUILD_OUTPUT cases don't run into this error, since main Makefile
+# invokes them as sub-makes and --no-builtin-rules is not necessary,
+# but doesn't cause any failures. Keep it simple and use the same
+# flags in both cases.
+# Local build cases: "make kselftest", "make -C" - headers are installed
+# in the default INSTALL_HDR_PATH usr/include.
+khdr:
+ifeq (1,$(DEFAULT_INSTALL_HDR_PATH))
+ make --no-builtin-rules ARCH=$(ARCH) -C $(top_srcdir) headers_install
+else
+ make --no-builtin-rules INSTALL_HDR_PATH=$$BUILD/usr \
+ ARCH=$(ARCH) -C $(top_srcdir) headers_install
+endif
+
+all: khdr
@for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
mkdir $$BUILD_TARGET -p; \
@@ -173,4 +209,4 @@ clean:
make OUTPUT=$$BUILD_TARGET -C $$TARGET clean;\
done;
-.PHONY: all run_tests hotplug run_hotplug clean_hotplug run_pstore_crash install clean
+.PHONY: khdr all run_tests hotplug run_hotplug clean_hotplug run_pstore_crash install clean
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 8b0f16409ed7..f888522a34e9 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -3,7 +3,16 @@
CC := $(CROSS_COMPILE)gcc
ifeq (0,$(MAKELEVEL))
-OUTPUT := $(shell pwd)
+ ifneq ($(O),)
+ OUTPUT := $(O)
+ else
+ ifneq ($(KBUILD_OUTPUT),)
+ OUTPUT := $(KBUILD_OUTPUT)
+ else
+ OUTPUT := $(shell pwd)
+ DEFAULT_INSTALL_HDR_PATH := 1
+ endif
+ endif
endif
# The following are built by lib.mk common compile rules.
@@ -21,9 +30,30 @@ top_srcdir ?= ../../../..
include $(top_srcdir)/scripts/subarch.include
ARCH ?= $(SUBARCH)
+# Invoke headers install with --no-builtin-rules to avoid circular
+# dependency in "make kselftest" case. In this case, second level
+# make inherits builtin-rules which will use the rule generate
+# Makefile.o and runs into
+# "Circular Makefile.o <- prepare dependency dropped."
+# and headers_install fails and test compile fails.
+# O= KBUILD_OUTPUT cases don't run into this error, since main Makefile
+# invokes them as sub-makes and --no-builtin-rules is not necessary,
+# but doesn't cause any failures. Keep it simple and use the same
+# flags in both cases.
+# Note that the support to install headers from lib.mk is necessary
+# when test Makefile is run directly with "make -C".
+# When local build is done, headers are installed in the default
+# INSTALL_HDR_PATH usr/include.
.PHONY: khdr
khdr:
- make ARCH=$(ARCH) -C $(top_srcdir) headers_install
+ifndef KSFT_KHDR_INSTALL_DONE
+ifeq (1,$(DEFAULT_INSTALL_HDR_PATH))
+ make --no-builtin-rules ARCH=$(ARCH) -C $(top_srcdir) headers_install
+else
+ make --no-builtin-rules INSTALL_HDR_PATH=$$OUTPUT/usr \
+ ARCH=$(ARCH) -C $(top_srcdir) headers_install
+endif
+endif
all: khdr $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES)
else
--
2.17.1
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190416
Regressions (compared to build next-20190415)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190415)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
x86:
i386:
dragonboard-410c:
hi6220-hikey:
juno-r2:
x15:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
TEST_PROGS variable is for test shell scripts and common clean target
in lib.mk doesn't touch them. TEST_GEN_PROGS are removed by it.
Fix it to use TEST_PROGS for test shell scripts and TEST_PROGS_EXTENDED
for common functions.sh.
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
tools/testing/selftests/livepatch/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile
index af4aee79bebb..fd405402c3ff 100644
--- a/tools/testing/selftests/livepatch/Makefile
+++ b/tools/testing/selftests/livepatch/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
-TEST_GEN_PROGS := \
+TEST_PROGS_EXTENDED := functions.sh
+TEST_PROGS := \
test-livepatch.sh \
test-callbacks.sh \
test-shadow-vars.sh
--
2.17.1
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190415
Regressions (compared to build next-20190412)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190412)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
i386:
juno-r2:
hi6220-hikey:
x15:
x86:
dragonboard-410c:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
---------- Forwarded message ---------
From: Jeffrin Thalakkottoor <jeffrin(a)rajagiritech.edu.in>
Date: Sat, Apr 13, 2019 at 6:09 PM
Subject: Re: [PATCH] selftests : netfilter: Wrote a error and exit
code for a command which needed veth kernel module.
To: Shuah Khan <shuah(a)kernel.org>
Cc: Florian Westphal <fw(a)strlen.de>, <linu-kselftest(a)vger.kernel.org>,
<pablo(a)netfilter.org>, lkml <linux-kernel(a)vger.kernel.org>
Hello Shuah,
did you get the mail related stuff below ?
On Fri, Apr 5, 2019 at 10:17 PM Florian Westphal <fw(a)strlen.de> wrote:
>
> Jeffrin Jose T <jeffrin(a)rajagiritech.edu.in> wrote:
> > A test for the basic NAT functionality uses ip command which
> > needs veth device.There is a condition where the kernel support
> > for veth is not compiled into the kernel and the test script
> > breaks.This patch contains code for reasonable error display
> > and correct code exit.
>
> Looks good to me, thanks for following up on this.
>
> Acked-by: Florian Westphal <fw(a)strlen.de>
--
software engineer
rajagiri school of engineering and technology
--
software engineer
rajagiri school of engineering and technology
Introduce in-kernel headers which are made available as an archive
through proc (/proc/kheaders.tar.xz file). This archive makes it
possible to run eBPF and other tracing programs tracing programs that
need to extend the kernel for tracing purposes without any dependency on
the file system having headers.
On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Further once a
different kernel is booted, any headers stored on the file system will
no longer be useful. By storing the headers as a compressed archive
within the kernel, we can avoid these issues that have been a hindrance
for a long time.
The best way to use this feature is by building it in. Several users
have a need for this, when they switch debug kernels, they donot want to
update the filesystem or worry about it where to store the headers on
it. However, the feature is also buildable as a module in case the user
desires it not being part of the kernel image. This makes it possible to
load and unload the headers from memory on demand. A tracing program, or
a kernel module builder can load the module, do its operations, and then
unload the module to save kernel memory. The total memory needed is 3.8MB.
By having the archive available at a fixed location independent of
filesystem dependencies and conventions, all debugging tools can
directly refer to the fixed location for the archive, without concerning
with where the headers on a typical filesystem which significantly
simplifies tooling that needs kernel headers.
The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.
IKHD_ST and IKHD_ED markers as is to facilitate future patches that
would extract the headers from a kernel or module image.
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
v5 -> v6: (Masahiro Yamada suggestions mostly)
- Dropped support for module building.
- Rebuild archive if script changes.
- Move archive file list to script.
- Move build script to kernel directory.
v4 -> v5:
(v4 was Tested-by the following folks)
Tested-by: qais.yousef(a)arm.com
Tested-by: dietmar.eggemann(a)arm.com
Tested-by: linux(a)manojrajarao.com
(Thanks to Masahiro Yamada for several excellent suggestions)
- used incbin instead of bin2c (Masahiro did similar idea)
- added module.lds if ia64 otherwise ia64 may fail to build.
- added clean-files rule to Makefile
- removed strip-comments script and doing it inline
- added set -e to header generated to die on errorsr
- fixed a minor issue where find command was noisy.
- removed unneeded tar.xz rule from kernel/.gitignore
- added Tested-by tags from ARM folks.
Changes since v3:
- Blank tar was being generated because of a one line I
forgot to push. It is updated now.
- Added module.lds since arm64 needs it to build modules.
Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- clean ups to kheaders.c pointed by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.
Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet
Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
working.
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
init/Kconfig | 11 ++++++
kernel/.gitignore | 1 +
kernel/Makefile | 10 +++++
kernel/gen_ikh_data.sh | 84 ++++++++++++++++++++++++++++++++++++++++++
kernel/kheaders.c | 73 ++++++++++++++++++++++++++++++++++++
5 files changed, 179 insertions(+)
create mode 100755 kernel/gen_ikh_data.sh
create mode 100644 kernel/kheaders.c
diff --git a/init/Kconfig b/init/Kconfig
index 4592bf7997c0..ea75bfbf7dfa 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -580,6 +580,17 @@ config IKCONFIG_PROC
This option enables access to the kernel configuration file
through /proc/config.gz.
+config IKHEADERS_PROC
+ tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+ depends on PROC_FS
+ help
+ This option enables access to the kernel header and other artifacts that
+ are generated during the build process. These can be used to build kernel
+ modules or by other in-kernel programs such as those generated by eBPF
+ and systemtap tools. If you build the headers as a module, a module
+ called kheaders.ko is built which can be loaded on-demand to get access
+ to the headers.
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index 6e699100872f..34d1e77ee9df 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -1,5 +1,6 @@
#
# Generated files
#
+kheaders.md5
timeconst.h
hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6c57e78817da..e3c581d8cde7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -121,3 +122,12 @@ $(obj)/configs.o: $(obj)/config_data.gz
targets += config_data.gz
$(obj)/config_data.gz: $(KCONFIG_CONFIG) FORCE
$(call if_changed,gzip)
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.tar.xz
+
+quiet_cmd_genikh = GEN $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/kernel/gen_ikh_data.sh $@
+$(obj)/kheaders_data.tar.xz: FORCE
+ $(call cmd,genikh)
+
+clean-files := kheaders_data.tar.xz kheaders.md5
diff --git a/kernel/gen_ikh_data.sh b/kernel/gen_ikh_data.sh
new file mode 100755
index 000000000000..ef72c2740d01
--- /dev/null
+++ b/kernel/gen_ikh_data.sh
@@ -0,0 +1,84 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# This script generates an archive consisting of kernel headers
+# for CONFIG_IKHEADERS_PROC.
+set -e
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+# Script filename relative to the kernel source root
+# We add it to the archive because it is small and any changes
+# to this script will also cause a rebuild of the archive.
+sfile="$(realpath --relative-to $kroot "$(readlink -f "$0")")"
+
+src_file_list="
+include/
+arch/$SRCARCH/include/
+$sfile
+"
+
+obj_file_list="
+include/
+arch/$SRCARCH/include/
+"
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# include/generated/compile.h is ignored because it is touched even when none
+# of the source files changed. This causes pointless regeneration, so let us
+# ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+ [ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+ exit
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+ do find "$f" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+ do find "$f" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+find $cpio_dir -type f -print0 |
+ xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..d072a958a8f1
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_end, within which the the
+ * compressed kernel headers are stpred. The file is first compressed with xz.
+ */
+
+asm (
+" .pushsection .rodata, \"a\" \n"
+" .global kernel_headers_data \n"
+"kernel_headers_data: \n"
+" .incbin \"kernel/kheaders_data.tar.xz\" \n"
+" .global kernel_headers_data_end \n"
+"kernel_headers_data_end: \n"
+" .popsection \n"
+);
+
+extern char kernel_headers_data;
+extern char kernel_headers_data_end;
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+ size_t len, loff_t *offset)
+{
+ return simple_read_from_buffer(buf, len, offset,
+ &kernel_headers_data,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+ .read = ikheaders_read_current,
+ .llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+ struct proc_dir_entry *entry;
+
+ /* create the current headers file */
+ entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+ &ikheaders_file_ops);
+ if (!entry)
+ return -ENOMEM;
+
+ proc_set_size(entry,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+ return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+ remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
--
2.21.0.392.gf8f6787159e-goog
Introduce in-kernel headers which are made available as an archive
through proc (/proc/kheaders.tar.xz file). This archive makes it
possible to run eBPF and other tracing programs that need to extend the
kernel for tracing purposes without any dependency on the file system
having headers.
A github PR is sent for the corresponding BCC patch at:
https://github.com/iovisor/bcc/pull/2312
On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Further once a
different kernel is booted, any headers stored on the file system will
no longer be useful. This is an issue even well known to distros.
By storing the headers as a compressed archive within the kernel, we can
avoid these issues that have been a hindrance for a long time.
The best way to use this feature is by building it in. Several users
have a need for this, when they switch debug kernels, they do not want to
update the filesystem or worry about it where to store the headers on
it. However, the feature is also buildable as a module in case the user
desires it not being part of the kernel image. This makes it possible to
load and unload the headers from memory on demand. A tracing program can
load the module, do its operations, and then unload the module to save
kernel memory. The total memory needed is 3.3MB.
By having the archive available at a fixed location independent of
filesystem dependencies and conventions, all debugging tools can
directly refer to the fixed location for the archive, without concerning
with where the headers on a typical filesystem which significantly
simplifies tooling that needs kernel headers.
The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.
Other approaches were discussed such as having an in-memory mountable
filesystem, but that has drawbacks such as requiring an in-kernel xz
decompressor which we don't have today, and requiring usage of 42 MB of
kernel memory to host the decompressed headers at anytime. Also this
approach is simpler than such approaches.
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
v6 -> v7:
- Minor nits from Masahiro Yamada are addressed.
v5 -> v6: (Masahiro Yamada suggestions mostly)
- Dropped support for module building.
- Rebuild archive if script changes.
- Move archive file list to script.
- Move build script to kernel directory.
v4 -> v5:
(v4 was Tested-by the following folks)
Tested-by: qais.yousef(a)arm.com
Tested-by: dietmar.eggemann(a)arm.com
Tested-by: linux(a)manojrajarao.com
(Thanks to Masahiro Yamada for several excellent suggestions)
- used incbin instead of bin2c (Masahiro did similar idea)
- added module.lds if ia64 otherwise ia64 may fail to build.
- added clean-files rule to Makefile
- removed strip-comments script and doing it inline
- added set -e to header generated to die on errorsr
- fixed a minor issue where find command was noisy.
- removed unneeded tar.xz rule from kernel/.gitignore
- added Tested-by tags from ARM folks.
Changes since v3:
- Blank tar was being generated because of a one line I
forgot to push. It is updated now.
- Added module.lds since arm64 needs it to build modules.
Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- clean ups to kheaders.c pointed by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.
Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet
Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
working.
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
init/Kconfig | 10 +++++
kernel/.gitignore | 1 +
kernel/Makefile | 10 +++++
kernel/gen_ikh_data.sh | 89 ++++++++++++++++++++++++++++++++++++++++++
kernel/kheaders.c | 74 +++++++++++++++++++++++++++++++++++
5 files changed, 184 insertions(+)
create mode 100755 kernel/gen_ikh_data.sh
create mode 100644 kernel/kheaders.c
diff --git a/init/Kconfig b/init/Kconfig
index 4592bf7997c0..47c0db6e63a5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -580,6 +580,16 @@ config IKCONFIG_PROC
This option enables access to the kernel configuration file
through /proc/config.gz.
+config IKHEADERS_PROC
+ tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+ depends on PROC_FS
+ help
+ This option enables access to the kernel header and other artifacts that
+ are generated during the build process. These can be used to build eBPF
+ tracing programs, or similar programs. If you build the headers as a
+ module, a module called kheaders.ko is built which can be loaded on-demand
+ to get access to the headers.
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index 6e699100872f..34d1e77ee9df 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -1,5 +1,6 @@
#
# Generated files
#
+kheaders.md5
timeconst.h
hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6c57e78817da..12399614c350 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -121,3 +122,12 @@ $(obj)/configs.o: $(obj)/config_data.gz
targets += config_data.gz
$(obj)/config_data.gz: $(KCONFIG_CONFIG) FORCE
$(call if_changed,gzip)
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.tar.xz
+
+quiet_cmd_genikh = CHK $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/kernel/gen_ikh_data.sh $@
+$(obj)/kheaders_data.tar.xz: FORCE
+ $(call cmd,genikh)
+
+clean-files := kheaders_data.tar.xz kheaders.md5
diff --git a/kernel/gen_ikh_data.sh b/kernel/gen_ikh_data.sh
new file mode 100755
index 000000000000..591a94f7b387
--- /dev/null
+++ b/kernel/gen_ikh_data.sh
@@ -0,0 +1,89 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# This script generates an archive consisting of kernel headers
+# for CONFIG_IKHEADERS_PROC.
+set -e
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+# Script filename relative to the kernel source root
+# We add it to the archive because it is small and any changes
+# to this script will also cause a rebuild of the archive.
+sfile="$(realpath --relative-to $kroot "$(readlink -f "$0")")"
+
+src_file_list="
+include/
+arch/$SRCARCH/include/
+$sfile
+"
+
+obj_file_list="
+include/
+arch/$SRCARCH/include/
+"
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# include/generated/compile.h is ignored because it is touched even when none
+# of the source files changed. This causes pointless regeneration, so let us
+# ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f |
+ grep -v "include/generated/compile.h" |
+ xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+ [ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+ [ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+ exit
+fi
+
+if [ "${quiet}" != "silent_" ]; then
+ echo " GEN $tarfile"
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+ do find "$f" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+ do find "$f" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+# Remove comments except SDPX lines
+find $cpio_dir -type f -print0 |
+ xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..19526294a4e8
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Provide kernel headers useful to build tracing programs
+ * such as for running eBPF tracing tools.
+ *
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_end, within which the the
+ * compressed kernel headers are stpred. The file is first compressed with xz.
+ */
+
+asm (
+" .pushsection .rodata, \"a\" \n"
+" .global kernel_headers_data \n"
+"kernel_headers_data: \n"
+" .incbin \"kernel/kheaders_data.tar.xz\" \n"
+" .global kernel_headers_data_end \n"
+"kernel_headers_data_end: \n"
+" .popsection \n"
+);
+
+extern char kernel_headers_data;
+extern char kernel_headers_data_end;
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+ size_t len, loff_t *offset)
+{
+ return simple_read_from_buffer(buf, len, offset,
+ &kernel_headers_data,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+ .read = ikheaders_read_current,
+ .llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+ struct proc_dir_entry *entry;
+
+ /* create the current headers file */
+ entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+ &ikheaders_file_ops);
+ if (!entry)
+ return -ENOMEM;
+
+ proc_set_size(entry,
+ &kernel_headers_data_end -
+ &kernel_headers_data);
+ return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+ remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
--
2.21.0.392.gf8f6787159e-goog
[Resending due to accidental HTML. I need to take Joel's advice and
switch to a real email client]
On Fri, Apr 12, 2019 at 5:54 PM Daniel Colascione <dancol(a)google.com> wrote:
>
> On Fri, Apr 12, 2019 at 5:09 PM Joel Fernandes <joel(a)joelfernandes.org> wrote:
>>
>> Hi Andy!
>>
>> On Fri, Apr 12, 2019 at 02:32:53PM -0700, Andy Lutomirski wrote:
>> > On Thu, Apr 11, 2019 at 10:51 AM Joel Fernandes (Google)
>> > <joel(a)joelfernandes.org> wrote:
>> > >
>> > > pidfd are /proc/pid directory file descriptors referring to a task group
>> > > leader. Android low memory killer (LMK) needs pidfd polling support to
>> > > replace code that currently checks for existence of /proc/pid for
>> > > knowing a process that is signalled to be killed has died, which is both
>> > > racy and slow. The pidfd poll approach is race-free, and also allows the
>> > > LMK to do other things (such as by polling on other fds) while awaiting
>> > > the process being killed to die.
>> > >
>> > > It prevents a situation where a PID is reused between when LMK sends a
>> > > kill signal and checks for existence of the PID, since the wrong PID is
>> > > now possibly checked for existence.
>> > >
>> > > In this patch, we follow the same mechanism used uhen the parent of the
>> > > task group is to be notified, that is when the tasks waiting on a poll
>> > > of pidfd are also awakened.
>> > >
>> > > We have decided to include the waitqueue in struct pid for the following
>> > > reasons:
>> > > 1. The wait queue has to survive for the lifetime of the poll. Including
>> > > it in task_struct would not be option in this case because the task can
>> > > be reaped and destroyed before the poll returns.
>> >
>> > Are you sure? I admit I'm not all that familiar with the innards of
>> > poll() on Linux, but I thought that the waitqueue only had to survive
>> > long enough to kick the polling thread and did *not* have to survive
>> > until poll() actually returned.
>>
>> I am not sure now. I thought epoll(2) was based on the wait_event APIs,
>> however more closely looking at the eventpoll code, it looks like there are 2
>> waitqueues involved, one that we pass and the other that is a part of the
>> eventpoll session itself, so you could be right about that. Daniel Colascione
>> may have some more thoughts about it since he brought up the possiblity of a
>> wq life-time issue. Daniel? We were just playing it safe.
I think you (Joel) and Andy are talking about different meanings of
poll(). Joel is talking about the VFS method; Andy is talking about
the system call. ISTM that the lifetime of wait queue we give to
poll_wait needs to last through the poll. Normally the wait queue gets
pinned by the struct file that we give to poll_wait (which takes a
reference on the struct file), but the pidfd struct file doesn't pin
the struct task, so we can't use a wait queue in struct task.
(remove_wait_queue, which poll implementations call to undo wait queue
additions, takes the wait queue head we pass to poll_wait, and we
don't want to pass a dangling pointer to remove_wait_queue.) If the
lifetime requirements for the queue aren't this strict, I don't see it
documented anywhere. Besides: if we don't actually need to pin the
waitqueue lifetime for the duration of the poll, why bother taking a
reference on the polled struct file?
=== Overview
arm64 has a feature called Top Byte Ignore, which allows to embed pointer
tags into the top byte of each pointer. Userspace programs (such as
HWASan, a memory debugging tool [1]) might use this feature and pass
tagged user pointers to the kernel through syscalls or other interfaces.
Right now the kernel is already able to handle user faults with tagged
pointers, due to these patches:
1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a
tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged
pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged
pointers")
This patchset extends tagged pointer support to syscall arguments.
As per the proposed ABI change [3], tagged pointers are only allowed to be
passed to syscalls when they point to memory ranges obtained by anonymous
mmap() or brk().
For non-memory syscalls this is done by untaging user pointers when the
kernel performs pointer checking to find out whether the pointer comes
from userspace (most notably in access_ok). The untagging is done only
when the pointer is being checked, the tag is preserved as the pointer
makes its way through the kernel and stays tagged when the kernel
dereferences the pointer when perfoming user memory accesses.
Memory syscalls (mmap, mprotect, etc.) don't do user memory accesses but
rather deal with memory ranges, and untagged pointers are better suited to
describe memory ranges internally. Thus for memory syscalls we untag
pointers completely when they enter the kernel.
=== Other approaches
One of the alternative approaches to untagging that was considered is to
completely strip the pointer tag as the pointer enters the kernel with
some kind of a syscall wrapper, but that won't work with the countless
number of different ioctl calls. With this approach we would need a custom
wrapper for each ioctl variation, which doesn't seem practical.
An alternative approach to untagging pointers in memory syscalls prologues
is to inspead allow tagged pointers to be passed to find_vma() (and other
vma related functions) and untag them there. Unfortunately, a lot of
find_vma() callers then compare or subtract the returned vma start and end
fields against the pointer that was being searched. Thus this approach
would still require changing all find_vma() callers.
=== Testing
The following testing approaches has been taken to find potential issues
with user pointer untagging:
1. Static testing (with sparse [2] and separately with a custom static
analyzer based on Clang) to track casts of __user pointers to integer
types to find places where untagging needs to be done.
2. Static testing with grep to find parts of the kernel that call
find_vma() (and other similar functions) or directly compare against
vm_start/vm_end fields of vma.
3. Static testing with grep to find parts of the kernel that compare
user pointers with TASK_SIZE or other similar consts and macros.
4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running
a modified syzkaller version that passes tagged pointers to the kernel.
Based on the results of the testing the requried patches have been added
to the patchset.
=== Notes
This patchset is meant to be merged together with "arm64 relaxed ABI" [3].
This patchset is a prerequisite for ARM's memory tagging hardware feature
support [4].
This patchset has been merged into the Pixel 2 kernel tree and is now
being used to enable testing of Pixel 2 phones with HWASan.
Thanks!
[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060…
[3] https://lkml.org/lkml/2018/12/10/402
[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architectur…
Changes in v11:
- Added "uprobes, arm64: untag user pointers in find_active_uprobe" patch.
- Added "bpf, arm64: untag user pointers in stack_map_get_build_id_offset"
patch.
- Fixed "tracing, arm64: untag user pointers in seq_print_user_ip" to
correctly perform subtration with a tagged addr.
- Moved untagged_addr() from SYSCALL_DEFINE3(mprotect) and
SYSCALL_DEFINE4(pkey_mprotect) to do_mprotect_pkey().
- Moved untagged_addr() definition for other arches from
include/linux/memory.h to include/linux/mm.h.
- Changed untagging in strn*_user() to perform userspace accesses through
tagged pointers.
- Updated the documentation to mention that passing tagged pointers to
memory syscalls is allowed.
- Updated the test to use malloc'ed memory instead of stack memory.
Changes in v10:
- Added "mm, arm64: untag user pointers passed to memory syscalls" back.
- New patch "fs, arm64: untag user pointers in fs/userfaultfd.c".
- New patch "net, arm64: untag user pointers in tcp_zerocopy_receive".
- New patch "kernel, arm64: untag user pointers in prctl_set_mm*".
- New patch "tracing, arm64: untag user pointers in seq_print_user_ip".
Changes in v9:
- Rebased onto 4.20-rc6.
- Used u64 instead of __u64 in type casts in the untagged_addr macro for
arm64.
- Added braces around (addr) in the untagged_addr macro for other arches.
Changes in v8:
- Rebased onto 65102238 (4.20-rc1).
- Added a note to the cover letter on why syscall wrappers/shims that untag
user pointers won't work.
- Added a note to the cover letter that this patchset has been merged into
the Pixel 2 kernel tree.
- Documentation fixes, in particular added a list of syscalls that don't
support tagged user pointers.
Changes in v7:
- Rebased onto 17b57b18 (4.19-rc6).
- Dropped the "arm64: untag user address in __do_user_fault" patch, since
the existing patches already handle user faults properly.
- Dropped the "usb, arm64: untag user addresses in devio" patch, since the
passed pointer must come from a vma and therefore be untagged.
- Dropped the "arm64: annotate user pointers casts detected by sparse"
patch (see the discussion to the replies of the v6 of this patchset).
- Added more context to the cover letter.
- Updated Documentation/arm64/tagged-pointers.txt.
Changes in v6:
- Added annotations for user pointer casts found by sparse.
- Rebased onto 050cdc6c (4.19-rc1+).
Changes in v5:
- Added 3 new patches that add untagging to places found with static
analysis.
- Rebased onto 44c929e1 (4.18-rc8).
Changes in v4:
- Added a selftest for checking that passing tagged pointers to the
kernel succeeds.
- Rebased onto 81e97f013 (4.18-rc1+).
Changes in v3:
- Rebased onto e5c51f30 (4.17-rc6+).
- Added linux-arch@ to the list of recipients.
Changes in v2:
- Rebased onto 2d618bdf (4.17-rc3+).
- Removed excessive untagging in gup.c.
- Removed untagging pointers returned from __uaccess_mask_ptr.
Changes in v1:
- Rebased onto 4.17-rc1.
Changes in RFC v2:
- Added "#ifndef untagged_addr..." fallback in linux/uaccess.h instead of
defining it for each arch individually.
- Updated Documentation/arm64/tagged-pointers.txt.
- Dropped "mm, arm64: untag user addresses in memory syscalls".
- Rebased onto 3eb2ce82 (4.16-rc7).
Andrey Konovalov (14):
uaccess: add untagged_addr definition for other arches
arm64: untag user pointers in access_ok and __uaccess_mask_ptr
lib, arm64: untag user pointers in strn*_user
mm, arm64: untag user pointers passed to memory syscalls
mm, arm64: untag user pointers in mm/gup.c
fs, arm64: untag user pointers in copy_mount_options
fs, arm64: untag user pointers in fs/userfaultfd.c
net, arm64: untag user pointers in tcp_zerocopy_receive
kernel, arm64: untag user pointers in prctl_set_mm*
tracing, arm64: untag user pointers in seq_print_user_ip
uprobes, arm64: untag user pointers in find_active_uprobe
bpf, arm64: untag user pointers in stack_map_get_build_id_offset
arm64: update Documentation/arm64/tagged-pointers.txt
selftests, arm64: add a selftest for passing tagged pointers to kernel
Documentation/arm64/tagged-pointers.txt | 18 +++++++---------
arch/arm64/include/asm/uaccess.h | 10 +++++----
fs/namespace.c | 2 +-
fs/userfaultfd.c | 5 +++++
include/linux/mm.h | 4 ++++
ipc/shm.c | 2 ++
kernel/bpf/stackmap.c | 6 ++++--
kernel/events/uprobes.c | 2 ++
kernel/sys.c | 14 +++++++++++++
kernel/trace/trace_output.c | 5 +++--
lib/strncpy_from_user.c | 3 ++-
lib/strnlen_user.c | 3 ++-
mm/gup.c | 4 ++++
mm/madvise.c | 2 ++
mm/mempolicy.c | 5 +++++
mm/migrate.c | 1 +
mm/mincore.c | 2 ++
mm/mlock.c | 5 +++++
mm/mmap.c | 7 +++++++
mm/mprotect.c | 1 +
mm/mremap.c | 2 ++
mm/msync.c | 2 ++
net/ipv4/tcp.c | 2 ++
tools/testing/selftests/arm64/.gitignore | 1 +
tools/testing/selftests/arm64/Makefile | 11 ++++++++++
.../testing/selftests/arm64/run_tags_test.sh | 12 +++++++++++
tools/testing/selftests/arm64/tags_test.c | 21 +++++++++++++++++++
27 files changed, 131 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/arm64/.gitignore
create mode 100644 tools/testing/selftests/arm64/Makefile
create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh
create mode 100644 tools/testing/selftests/arm64/tags_test.c
--
2.21.0.360.g471c308f928-goog
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190412
Regressions (compared to build next-20190411)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190411)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
dragonboard-410c:
x86:
hi6220-hikey:
i386:
x15:
juno-r2:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
Extend bpf_skb_adjust_room growth to mark inner MAC header so that
L2 encapsulation can be used for tc tunnels.
Patch #1 extends the existing test_tc_tunnel to support UDP
encapsulation; later we want to be able to test MPLS over UDP and
MPLS over GRE encapsulation.
Patch #2 adds the BPF_F_ADJ_ROOM_ENCAP_L2(len) macro, which
allows specification of inner mac length. Other approaches were
explored prior to taking this approach. Specifically, I tried
automatically computing the inner mac length on the basis of the
specified flags (so inner maclen for GRE/IPv4 encap is the len_diff
specified to bpf_skb_adjust_room minus GRE + IPv4 header length
for example). Problem with this is that we don't know for sure
what form of GRE/UDP header we have; is it a full GRE header,
or is it a FOU UDP header or generic UDP encap header? My fear
here was we'd end up with an explosion of flags. The other approach
tried was to support inner L2 header marking as a separate room
adjustment, i.e. adjust for L3/L4 encap, then call
bpf_skb_adjust_room for L2 encap. This can be made to work but
because it imposed an order on operations, felt a bit clunky.
Patch #3 syncs tools/ bpf.h.
Patch #4 extends the tests again to support MPLSoverGRE,
MPLSoverUDP, and transparent ethernet bridging (TEB) where
the inner L2 header is an ethernet header. Testing of BPF
encap against tunnels is done for cases where configuration
of such tunnels is possible (MPLSoverGRE[6], MPLSoverUDP,
gre[6]tap), and skipped otherwise. Testing of BPF encap/decap
is always carried out.
Changes since v2:
- updated tools/testing/selftest/bpf/config with FOU/MPLS CONFIG
variables (patches 1, 4)
- reduced noise in patch 1 by avoiding unnecessary movement of code
- eliminated inner_mac variable in bpf_skb_net_grow (patch 2)
Changes since v1:
- fixed formatting of commit references.
- BPF_F_ADJ_ROOM_FIXED_GSO flag enabled on all variants (patch 1)
- fixed fou6 options for UDP encap; checksum errors observed were
due to the fact fou6 tunnel was not set up with correct ipproto
options (41 -6). 0 checksums work fine (patch 1)
- added definitions for mask and shift used in setting L2 length
(patch 2)
- allow udp encap with fixed GSO (patch 2)
- changed "elen" to "l2_len" to be more descriptive (patch 4)
Alan Maguire (4):
selftests_bpf: extend test_tc_tunnel for UDP encap
bpf: add layer 2 encap support to bpf_skb_adjust_room
bpf: sync bpf.h to tools/ for BPF_F_ADJ_ROOM_ENCAP_L2
selftests_bpf: add L2 encap to test_tc_tunnel
include/uapi/linux/bpf.h | 10 +
net/core/filter.c | 12 +-
tools/include/uapi/linux/bpf.h | 10 +
tools/testing/selftests/bpf/config | 8 +
tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 321 +++++++++++++++++----
tools/testing/selftests/bpf/test_tc_tunnel.sh | 136 +++++++--
6 files changed, 417 insertions(+), 80 deletions(-)
--
1.8.3.1
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190411
Regressions (compared to build next-20190410)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190410)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
i386:
x15:
x86:
dragonboard-410c:
hi6220-hikey:
juno-r2:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
Hi Kees,
secomp_bpf test hangs after global.user_notification_sibling_pid_ns
when a non-root user runs it on 5.1-rc4
I thought you fixed the problem with this commit:
commit 3d244c192afeee7dd4f5fb1b916ea4e47420d401
Author: Kees Cook <keescook(a)chromium.org>
Date: Wed Jan 16 16:35:25 2019 -0800
selftests/seccomp: Abort without user notification support
seccomp_bpf.c:3380:global.user_notification_sibling_pid_ns:Expected
unshare(CLONE_NEWPID) (18446744073709551615) == 0 (0)
seccomp_bpf.c:3381:global.user_notification_sibling_pid_ns:Expected
errno (1) == 0 (0)
seccomp_bpf.c:3365:global.user_notification_sibling_pid_ns:Expected
unshare(CLONE_NEWPID) (18446744073709551615) == 0 (0)
seccomp_bpf.c:3392:global.user_notification_sibling_pid_ns:Expected
req.pid (23610) == 0 (0)
hangs right after the above message,
root:
[ RUN ] global.user_notification_sibling_pid_ns
[ OK ] global.user_notification_sibling_pid_ns
[ RUN ] global.user_notification_fault_recv
[ OK ] global.user_notification_fault_recv
[ RUN ] global.seccomp_get_notif_sizes
[ OK ] global.seccomp_get_notif_sizes
[==========] 74 / 74 tests passed.
[ PASSED ]
I would debug this myself, double edged sword of harness, makes it
hard to debug:)
Can you take a look. "make kselftest" just hangs now and doesn't
complete.
thanks,
-- Shuah
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190410
Regressions (compared to build next-20190409)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190409)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- x15 - arm
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
dragonboard-410c:
x15:
i386:
x86:
juno-r2:
hi6220-hikey:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
This changes the selftest output is several ways:
- total test count is reported at start.
- each test's result is on a single line with "# SKIP" as needed per spec.
- each test's output is prefixed with "# " as a "diagnostic line".
This creates a bit of a kernel-specific TAP dialect where the diagnostics
precede the results. The TAP spec isn't entirely clear about this, though,
so I think it's the correct solution so as to keep interactive runs making
sense. If the output _followed_ the result line in the spec-suggested
YAML form, each test would dump all of its output at once instead of as
it went, making debugging harder.
This does, however, solve the recursive TAP output problem, as sub-tests
will simply be prefixed by "# ". Parsing sub-tests becomes a simple
problem of just removing the first two characters of a given top-level
test's diagnostic output, and parsing the results.
Note that the shell construct needed to both get an exit code from
the first command in a pipe and still filter the pipe (to add the "# "
prefix) uses a POSIX solution rather than the bash "pipefail" option
which is not supported by dash.
Example:
$ tools/testing/selftests
$ make --silent -C x86 run_tests
TAP version 13
1..15 selftests: x86
# selftests: x86: single_step_syscall_64
# [RUN] Set TF and check nop
# [OK] Survived with TF set and 9 traps
# [RUN] Set TF and check syscall-less opportunistic sysret
# [OK] Survived with TF set and 12 traps
# [RUN] Set TF and check a fast syscall
# [OK] Survived with TF set and 22 traps
# [RUN] Fast syscall with TF cleared
# [OK] Nothing unexpected happened
ok 1 selftests: x86: single_step_syscall_64
# selftests: x86: sysret_ss_attrs_64
# [RUN] Syscalls followed by SS validation
# [OK] We survived
ok 2 selftests: x86: sysret_ss_attrs_64
# selftests: x86: syscall_nt_64
# [RUN] Set NT and issue a syscall
# [OK] The syscall worked and flags are still set
# [RUN] Set NT|TF and issue a syscall
# [OK] The syscall worked and flags are still set
ok 3 selftests: x86: syscall_nt_64
...[skipping example output]...
ok 15 selftests: x86: sysret_rip_64
$ make --silent -C x86 run_tests | tappy
............F..
======================================================================
FAIL: <file=stream>
selftests: x86: mov_ss_trap_64
----------------------------------------------------------------------
----------------------------------------------------------------------
Ran 15 tests in 0.000s
FAILED (failures=1)
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
No longer RFC! :)
---
tools/testing/selftests/lib.mk | 59 ++++++++++++++++++-------------
tools/testing/selftests/prefix.pl | 23 ++++++++++++
2 files changed, 57 insertions(+), 25 deletions(-)
create mode 100755 tools/testing/selftests/prefix.pl
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 8b0f16409ed7..d3267c08a7b1 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -5,6 +5,8 @@ CC := $(CROSS_COMPILE)gcc
ifeq (0,$(MAKELEVEL))
OUTPUT := $(shell pwd)
endif
+testname = $(notdir $(shell pwd))
+selfdir = $(realpath $(dir $(filter %/lib.mk,$(MAKEFILE_LIST))))
# The following are built by lib.mk common compile rules.
# TEST_CUSTOM_PROGS should be used by tests that require
@@ -32,38 +34,45 @@ endif
.ONESHELL:
define RUN_TEST_PRINT_RESULT
- TEST_HDR_MSG="selftests: "`basename $$PWD`:" $$BASENAME_TEST"; \
- echo $$TEST_HDR_MSG; \
- echo "========================================"; \
+ TEST_HDR_MSG="selftests: $(testname): $$BASENAME_TEST"; \
if [ ! -x $$TEST ]; then \
- echo "$$TEST_HDR_MSG: Warning: file $$BASENAME_TEST is not executable, correct this.";\
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- else \
- cd `dirname $$TEST` > /dev/null; \
- if [ "X$(summary)" != "X" ]; then \
- (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- else \
- (./$$BASENAME_TEST && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- fi; \
- cd - > /dev/null; \
+ if [ "X$(summary)" = "X" ]; then \
+ echo "# warning: 'file $$TEST is not executable, correct this.'";\
+ fi; \
+ echo "not ok $$test_num $$TEST_HDR_MSG"; \
+ else \
+ cd `dirname $$TEST` > /dev/null; \
+ if [ "X$(summary)" != "X" ]; then \
+ (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 &&\
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ else \
+ echo "# $$TEST_HDR_MSG"; \
+ (((((stdbuf -i0 -o0 -e0 ./$$BASENAME_TEST 2>&1; echo $$? >&3) | \
+ $(selfdir)/prefix.pl >&4) 3>&1) | \
+ (read xs; exit $$xs)) 4>&1 && \
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ fi; \
+ cd - > /dev/null; \
fi;
endef
define RUN_TESTS
@export KSFT_TAP_LEVEL=`echo 1`; \
- test_num=`echo 0`; \
- skip=`echo 4`; \
+ test_num=0; \
+ total=`echo "$(1)" | wc -w`; \
+ skip=4; \
echo "TAP version 13"; \
+ echo "1..$$total selftests: $(testname)"; \
for TEST in $(1); do \
BASENAME_TEST=`basename $$TEST`; \
test_num=`echo $$test_num+1 | bc`; \
diff --git a/tools/testing/selftests/prefix.pl b/tools/testing/selftests/prefix.pl
new file mode 100755
index 000000000000..ec7e48118183
--- /dev/null
+++ b/tools/testing/selftests/prefix.pl
@@ -0,0 +1,23 @@
+#!/usr/bin/perl
+# SPDX-License-Identifier: GPL-2.0
+# Prefix all lines with "# ", unbuffered. Command being piped in may need
+# to have unbuffering forced with "stdbuf -i0 -o0 -e0 $cmd".
+use strict;
+
+binmode STDIN;
+binmode STDOUT;
+
+STDOUT->autoflush(1);
+
+my $needed = 1;
+while (1) {
+ my $char;
+ my $bytes = sysread(STDIN, $char, 1);
+ exit 0 if ($bytes == 0);
+ if ($needed) {
+ print "# ";
+ $needed = 0;
+ }
+ print $char;
+ $needed = 1 if ($char eq "\n");
+}
--
2.17.1
--
Kees Cook
This adds most of the non-system-crashing selftests that exist in LKDTM.
The ones that leave the system in an unstable state are marked as "SKIP"
in the test run. Since output can change based on architecture and other
kernel configs, all output is reported during the tests.
Example output:
# cd tools/testing/selftests
# make --silent -C lkdtm run_tests
TAP version 13
1..64 selftests: lkdtm
# selftests: lkdtm: PANIC.sh
# PANIC: crashes entire system [SKIP]
not ok 1 selftests: lkdtm: PANIC.sh # SKIP
# selftests: lkdtm: BUG.sh
# [19716.955023] lkdtm: Performing direct entry BUG
# [19716.956236] ------------[ cut here ]------------
# [19716.957318] kernel BUG at drivers/misc/lkdtm/bugs.c:63!
...
# BUG: saw 'call trace:': ok
ok 2 selftests: lkdtm: BUG.sh
...
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
MAINTAINERS | 1 +
tools/testing/selftests/lkdtm/.gitignore | 1 +
tools/testing/selftests/lkdtm/Makefile | 11 +++
tools/testing/selftests/lkdtm/config | 1 +
tools/testing/selftests/lkdtm/run.sh | 86 ++++++++++++++++++++++++
tools/testing/selftests/lkdtm/tests.txt | 64 ++++++++++++++++++
6 files changed, 164 insertions(+)
create mode 100644 tools/testing/selftests/lkdtm/.gitignore
create mode 100644 tools/testing/selftests/lkdtm/Makefile
create mode 100644 tools/testing/selftests/lkdtm/config
create mode 100755 tools/testing/selftests/lkdtm/run.sh
create mode 100644 tools/testing/selftests/lkdtm/tests.txt
diff --git a/MAINTAINERS b/MAINTAINERS
index 3e5a5d263f29..716d6f0fbd0c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8976,6 +8976,7 @@ LINUX KERNEL DUMP TEST MODULE (LKDTM)
M: Kees Cook <keescook(a)chromium.org>
S: Maintained
F: drivers/misc/lkdtm/*
+F: tools/testing/selftests/lkdtm/*
LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
M: Alan Stern <stern(a)rowland.harvard.edu>
diff --git a/tools/testing/selftests/lkdtm/.gitignore b/tools/testing/selftests/lkdtm/.gitignore
new file mode 100644
index 000000000000..09df4a4da272
--- /dev/null
+++ b/tools/testing/selftests/lkdtm/.gitignore
@@ -0,0 +1 @@
+[A-Z][A-Z]*.sh
diff --git a/tools/testing/selftests/lkdtm/Makefile b/tools/testing/selftests/lkdtm/Makefile
new file mode 100644
index 000000000000..f322d281fd8e
--- /dev/null
+++ b/tools/testing/selftests/lkdtm/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for LKDTM regression tests
+
+include ../lib.mk
+
+# NOTE: $(OUTPUT) won't get default value if used before lib.mk
+TEST_GEN_PROGS = $(patsubst %,$(OUTPUT)/%.sh,$(shell awk '{print $$1}' tests.txt | sed -e 's/\#//'))
+all: $(TEST_GEN_PROGS)
+
+$(OUTPUT)/%: run.sh tests.txt
+ install -m 0744 run.sh $@
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config
new file mode 100644
index 000000000000..d874990e442b
--- /dev/null
+++ b/tools/testing/selftests/lkdtm/config
@@ -0,0 +1 @@
+CONFIG_LKDTM=y
diff --git a/tools/testing/selftests/lkdtm/run.sh b/tools/testing/selftests/lkdtm/run.sh
new file mode 100755
index 000000000000..a166b925c43d
--- /dev/null
+++ b/tools/testing/selftests/lkdtm/run.sh
@@ -0,0 +1,86 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# This reads tests.txt for the list of LKDTM tests to invoke. Any marked
+# with a leading "#" are skipped. The line after the test name is either
+# the text to look for in dmesg for a "success", or the rationale for why
+# a test is marked to be skipped.
+#
+set -e
+TRIGGER=/sys/kernel/debug/provoke-crash/DIRECT
+
+# Verify we have LKDTM available in the kernel.
+if [ ! -r $TRIGGER ] ; then
+ /sbin/modprobe -q lkdtm || true
+ if [ ! -r $TRIGGER ] ; then
+ echo "Cannot find $TRIGGER (missing CONFIG_LKDTM?) [SKIP]"
+ else
+ echo "Cannot write $TRIGGER (need to run as root?) [SKIP]"
+ fi
+ # Skip this test
+ exit 4
+fi
+
+# Figure out which test to run from our script name.
+test=$(basename $0 .sh)
+# Look up details about the test from master list of LKDTM tests.
+line=$(egrep '^#?'"$test"'\b' tests.txt)
+if [ -z "$line" ]; then
+ echo "Missing test '$test' [SKIP]"
+ exit 4
+fi
+# Check that the test is known to LKDTM.
+if ! egrep -q '^'"$test"'$' "$TRIGGER" ; then
+ echo "'$test' missing in $TRIGGER! [SKIP]"
+ exit 4
+fi
+
+# Extract notes/expected output from test list.
+test=$(echo "$line" | cut -d" " -f1)
+if echo "$line" | grep -q ' ' ; then
+ expect=$(echo "$line" | cut -d" " -f2-)
+else
+ expect=""
+fi
+
+# If the test is commented out, report a skip
+if echo "$test" | grep -q '^#' ; then
+ test=$(echo "$test" | cut -c2-)
+ if [ -z "$expect" ]; then
+ expect="crashes entire system"
+ fi
+ echo "$test: $expect [SKIP]"
+ exit 4
+fi
+
+# If no expected output given, assume an Oops with back trace is success.
+if [ -z "$expect" ]; then
+ expect="call trace:"
+fi
+
+# Clear out dmesg for output reporting
+dmesg -c >/dev/null
+
+# Prepare log for report checking
+LOG=$(mktemp --tmpdir -t lkdtm-XXXXXX)
+cleanup() {
+ rm -f "$LOG"
+}
+trap cleanup EXIT
+
+# Most shells yell about signals and we're expecting the "cat" process
+# to usually be killed by the kernel. So we have to run it in a sub-shell
+# and silence errors.
+($SHELL -c 'cat <(echo '"$test"') >'"$TRIGGER" 2>/dev/null) || true
+
+# Record and dump the results
+dmesg -c >"$LOG"
+cat "$LOG"
+# Check for expected output
+if grep -qi "$expect" "$LOG" ; then
+ echo "$test: saw '$expect': ok"
+ exit 0
+else
+ echo "$test: missing '$expect': [FAIL]"
+ exit 1
+fi
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
new file mode 100644
index 000000000000..3ee43df87d71
--- /dev/null
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -0,0 +1,64 @@
+#PANIC
+BUG
+WARNING
+EXCEPTION
+#LOOP Hangs the system
+#EXHAUST_STACK Corrupts memory on failure
+#CORRUPT_STACK Crashes entire system on success
+#CORRUPT_STACK_STRONG Crashes entire system on success
+CORRUPT_LIST_ADD
+CORRUPT_LIST_DEL
+CORRUPT_USER_DS
+STACK_GUARD_PAGE_LEADING
+STACK_GUARD_PAGE_TRAILING
+UNALIGNED_LOAD_STORE_WRITE
+#OVERWRITE_ALLOCATION Corrupts memory on failure
+#WRITE_AFTER_FREE Corrupts memory on failure
+READ_AFTER_FREE
+#WRITE_BUDDY_AFTER_FREE Corrupts memory on failure
+READ_BUDDY_AFTER_FREE
+#SOFTLOCKUP Hangs the system
+#HARDLOCKUP Hangs the system
+#SPINLOCKUP Hangs the system
+#HUNG_TASK Hangs the system
+EXEC_DATA
+EXEC_STACK
+EXEC_KMALLOC
+EXEC_VMALLOC
+EXEC_RODATA
+EXEC_USERSPACE
+EXEC_NULL
+ACCESS_USERSPACE
+ACCESS_NULL
+WRITE_RO
+WRITE_RO_AFTER_INIT
+WRITE_KERN
+REFCOUNT_INC_OVERFLOW
+REFCOUNT_ADD_OVERFLOW
+REFCOUNT_INC_NOT_ZERO_OVERFLOW
+REFCOUNT_ADD_NOT_ZERO_OVERFLOW
+REFCOUNT_DEC_ZERO
+REFCOUNT_DEC_NEGATIVE Negative detected: saturated
+REFCOUNT_DEC_AND_TEST_NEGATIVE Negative detected: saturated
+REFCOUNT_SUB_AND_TEST_NEGATIVE Negative detected: saturated
+REFCOUNT_INC_ZERO
+REFCOUNT_ADD_ZERO
+REFCOUNT_INC_SATURATED Saturation detected: still saturated
+REFCOUNT_DEC_SATURATED Saturation detected: still saturated
+REFCOUNT_ADD_SATURATED Saturation detected: still saturated
+REFCOUNT_INC_NOT_ZERO_SATURATED
+REFCOUNT_ADD_NOT_ZERO_SATURATED
+REFCOUNT_DEC_AND_TEST_SATURATED Saturation detected: still saturated
+REFCOUNT_SUB_AND_TEST_SATURATED Saturation detected: still saturated
+#REFCOUNT_TIMING timing only
+#ATOMIC_TIMING timing only
+USERCOPY_HEAP_SIZE_TO
+USERCOPY_HEAP_SIZE_FROM
+USERCOPY_HEAP_WHITELIST_TO
+USERCOPY_HEAP_WHITELIST_FROM
+USERCOPY_STACK_FRAME_TO
+USERCOPY_STACK_FRAME_FROM
+USERCOPY_STACK_BEYOND
+USERCOPY_KERNEL
+USERCOPY_KERNEL_DS
+STACKLEAK_ERASING
--
2.17.1
--
Kees Cook
This changes the selftest output is several ways:
- total test count is reported at start.
- each test's result is on a single line with "# SKIP" as needed per spec.
- each test's output is report _before_ the result line as a commented
"diagnostic" line.
This creates a bit of a kernel-specific TAP output where the diagnostics
precede the results. The TAP spec isn't entirely clear about this, though,
so I think it's the correct solution so as to not keep interactive
runs making sense. If the output _followed_ the result line in the
spec-suggested YAML form, each test would dump all of its output at once
instead of as it went, making debugging harder.
Also, while "sed -u" is used to add the "# " line prefixes, this
still doesn't work for all output. For example, the "timer" lists
print out text, then do the work, then print a result and a newline.
This isn't visible any more. And some tests still show nothing until
they finish. I haven't found a way to force the prefixing while keeping
the output entirely unbuffered. :(
Note that the shell construct needed to both get an exit code from
the first command in a pipe and still filter the pipe (to add the "# "
prefix) uses a POSIX solution rather than the bash "pipefail" option
which is not supported by dash.
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
tools/testing/selftests/lib.mk | 54 +++++++++++++++++++---------------
1 file changed, 31 insertions(+), 23 deletions(-)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 8b0f16409ed7..056dac8f5701 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -5,6 +5,7 @@ CC := $(CROSS_COMPILE)gcc
ifeq (0,$(MAKELEVEL))
OUTPUT := $(shell pwd)
endif
+srcdir = $(notdir $(shell pwd))
# The following are built by lib.mk common compile rules.
# TEST_CUSTOM_PROGS should be used by tests that require
@@ -32,38 +33,45 @@ endif
.ONESHELL:
define RUN_TEST_PRINT_RESULT
- TEST_HDR_MSG="selftests: "`basename $$PWD`:" $$BASENAME_TEST"; \
- echo $$TEST_HDR_MSG; \
- echo "========================================"; \
+ TEST_HDR_MSG="selftests: $(srcdir): $$BASENAME_TEST"; \
if [ ! -x $$TEST ]; then \
- echo "$$TEST_HDR_MSG: Warning: file $$BASENAME_TEST is not executable, correct this.";\
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- else \
- cd `dirname $$TEST` > /dev/null; \
- if [ "X$(summary)" != "X" ]; then \
- (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- else \
- (./$$BASENAME_TEST && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- fi; \
- cd - > /dev/null; \
+ if [ "X$(summary)" = "X" ]; then \
+ echo "# warning: 'file $$TEST is not executable, correct this.'";\
+ fi; \
+ echo "not ok $$test_num $$TEST_HDR_MSG"; \
+ else \
+ cd `dirname $$TEST` > /dev/null; \
+ if [ "X$(summary)" != "X" ]; then \
+ (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 &&\
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ else \
+ echo "# $$TEST_HDR_MSG"; \
+ (((((./$$BASENAME_TEST 2>&1; echo $$? >&3) | \
+ sed -ue 's/^/# /' >&4) 3>&1) | \
+ (read xs; exit $$xs)) 4>&1 && \
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ fi; \
+ cd - > /dev/null; \
fi;
endef
define RUN_TESTS
@export KSFT_TAP_LEVEL=`echo 1`; \
test_num=`echo 0`; \
+ total=`echo "$(1)" | wc -w`; \
skip=`echo 4`; \
echo "TAP version 13"; \
+ echo "1..$$total selftests: $(srcdir)"; \
for TEST in $(1); do \
BASENAME_TEST=`basename $$TEST`; \
test_num=`echo $$test_num+1 | bc`; \
--
2.17.1
--
Kees Cook
Extend bpf_skb_adjust_room growth to mark inner MAC header so that
L2 encapsulation can be used for tc tunnels.
Patch #1 extends the existing test_tc_tunnel to support UDP
encapsulation; later we want to be able to test MPLS over UDP and
MPLS over GRE encapsulation.
Patch #2 adds the BPF_F_ADJ_ROOM_ENCAP_L2(len) macro, which
allows specification of inner mac length. Other approaches were
explored prior to taking this approach. Specifically, I tried
automatically computing the inner mac length on the basis of the
specified flags (so inner maclen for GRE/IPv4 encap is the len_diff
specified to bpf_skb_adjust_room minus GRE + IPv4 header length
for example). Problem with this is that we don't know for sure
what form of GRE/UDP header we have; is it a full GRE header,
or is it a FOU UDP header or generic UDP encap header? My fear
here was we'd end up with an explosion of flags. The other approach
tried was to support inner L2 header marking as a separate room
adjustment, i.e. adjust for L3/L4 encap, then call
bpf_skb_adjust_room for L2 encap. This can be made to work but
because it imposed an order on operations, felt a bit clunky.
Patch #3 syncs tools/ bpf.h.
Patch #4 extends the tests again to support MPLSoverGRE,
MPLSoverUDP, and transparent ethernet bridging (TEB) where
the inner L2 header is an ethernet header. Testing of BPF
encap against tunnels is done for cases where configuration
of such tunnels is possible (MPLSoverGRE[6], MPLSoverUDP,
gre[6]tap), and skipped otherwise. Testing of BPF encap/decap
is always carried out.
Changes since v1:
- fixed formatting of commit references.
- BPF_F_ADJ_ROOM_FIXED_GSO flag enabled on all variants (patch 1)
- fixed fou6 options for UDP encap; checksum errors observed were
due to the fact fou6 tunnel was not set up with correct ipproto
options (41 -6). 0 checksums work fine (patch 1)
- added definitions for mask and shift used in setting L2 length
(patch 2)
- allow udp encap with fixed GSO (patch 2)
- changed "elen" to "l2_len" to be more descriptive (patch 4)
- added support for ethernet L2 encap to tests; testing against
gre[6]tap tunnels (patch 4).
Alan Maguire (4):
selftests_bpf: add UDP encap to test_tc_tunnel
bpf: add layer 2 encap support to bpf_skb_adjust_room
bpf: sync bpf.h to tools/ for BPF_F_ADJ_ROOM_ENCAP_L2
selftests_bpf: add L2 encap to test_tc_tunnel
include/uapi/linux/bpf.h | 10 +
net/core/filter.c | 15 +-
tools/include/uapi/linux/bpf.h | 10 +
tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 335 +++++++++++++++++----
tools/testing/selftests/bpf/test_tc_tunnel.sh | 147 +++++++--
5 files changed, 427 insertions(+), 90 deletions(-)
--
1.8.3.1
On 4/3/19 7:34 PM, shuah wrote:
> Hi Linus,
>
> Please pull the following Kselftest update for Linux 5.1-rc4
>
> This Kselftest update for Linux 5.1-rc4 consists of fixes to rseq,
> cgroup, and efivarfs tests.
>
> diff is attached.
>
> thanks,
> -- Shuah
>
Hi Linus,
I just noticed that one commit in this pull request has mismatched
author and signed-off. Please disregard the pull request.
I will fix this and send a new pull request. Sorry for the noise.
thanks,
-- Shuah
[I suggest to stop waiting for more acks and merge this into linux-next as is.]
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.
There are two reasons for a special syscall-related ptrace request.
Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up. But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared. On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.
Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee. For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.
PTRACE_GET_SYSCALL_INFO returns the following structure:
struct ptrace_syscall_info {
__u8 op; /* PTRACE_SYSCALL_INFO_* */
__u32 arch __attribute__((__aligned__(sizeof(__u32))));
__u64 instruction_pointer;
__u64 stack_pointer;
union {
struct {
__u64 nr;
__u64 args[6];
} entry;
struct {
__s64 rval;
__u8 is_error;
} exit;
struct {
__u64 nr;
__u64 args[6];
__u32 ret_data;
} seccomp;
};
};
The structure was chosen according to [2], except for the following
changes:
* seccomp substructure was added as a superset of entry substructure;
* the type of nr field was changed from int to __u64 because syscall
numbers are, as a practical matter, 64 bits;
* stack_pointer field was added along with instruction_pointer field
since it is readily available and can save the tracer from extra
PTRACE_GETREGS/PTRACE_GETREGSET calls;
* arch is always initialized to aid with tracing system calls
* such as execve();
* instruction_pointer and stack_pointer are always initialized
so they could be easily obtained for non-syscall stops;
* a boolean is_error field was added along with rval field, this way
the tracer can more reliably distinguish a return value
from an error value.
strace has been ported to PTRACE_GET_SYSCALL_INFO.
Starting with release 4.26, strace uses PTRACE_GET_SYSCALL_INFO API
as the preferred mechanism of obtaining syscall information.
[1] https://lore.kernel.org/lkml/CA+55aFzcSVmdDj9Lh_gdbz1OzHyEm6ZrGPBDAJnywm2LF…
[2] https://lore.kernel.org/lkml/CAObL_7GM0n80N7J_DFw_eQyfLyzq+sf4y2AvsCCV88Tb3…
---
Notes:
v9:
* Rebased to linux-next again due to syscall_get_arguments() signature change.
v8:
* Moved syscall_get_arch() specific patches to a separate patchset
which is now merged into audit/next tree.
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef CONFIG_HAVE_ARCH_TRACEHOOK,
narrowing down the set of architectures supported by this implementation
back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
I failed to get all syscall_get_*(), instruction_pointer(),
and user_stack_pointer() functions implemented on some niche
architectures. This leaves the following architectures out:
alpha, h8300, m68k, microblaze, and unicore32.
v7:
* Rebased to v5.0-rc1.
* 5 arch-specific preparatory patches out of 25 have been merged
into v5.0-rc1 via arch trees.
v6:
* Add syscall_get_arguments and syscall_set_arguments wrappers
to asm-generic/syscall.h, requested by Geert.
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
into account, use the end of the last field of the structure being written.
* Change struct ptrace_syscall_info:
* remove .frame_pointer field, is is not needed and not portable;
* make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
* remove trailing pads, they are no longer needed.
v5:
* Merge separate series and patches into the single series.
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
stack_pointer, and frame_pointer fields by moving them from
ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
instruction_pointer when the tracee is in a signal stop.
* Patch all remaining architectures to provide all necessary
syscall_get_* functions.
* Make available for all architectures: do not conditionalize on
CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
are implemented on all architectures.
* Add a test for PTRACE_GET_SYSCALL_INFO to selftests/ptrace.
v4:
* Do not introduce task_struct.ptrace_event,
use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
ptrace_syscall_info.{entry,exit}.
v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_EVENT_SECCOMP by adding ptrace_event to task_struct.
* Add proper defines for ptrace_syscall_info.op values.
* Rename PT_SYSCALL_IS_ENTERING and PT_SYSCALL_IS_EXITING to
PTRACE_EVENTMSG_SYSCALL_ENTRY and PTRACE_EVENTMSG_SYSCALL_EXIT
* and move them to uapi.
v2:
* Do not use task->ptrace.
* Replace entry_info.is_compat with entry_info.arch, use syscall_get_arch().
* Use addr argument of sys_ptrace to get expected size of the struct;
return full size of the struct.
Dmitry V. Levin (6):
nds32: fix asm/syscall.h # waiting for ack since early January
hexagon: define syscall_get_error() and syscall_get_return_value() # waiting for ack since November
mips: define syscall_get_error() # acked
parisc: define syscall_get_error() # acked
powerpc: define syscall_get_error() # waiting for ack since early December
selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO
Elvira Khabirova (1):
ptrace: add PTRACE_GET_SYSCALL_INFO request # reviewed
arch/hexagon/include/asm/syscall.h | 14 +
arch/mips/include/asm/syscall.h | 6 +
arch/nds32/include/asm/syscall.h | 27 +-
arch/parisc/include/asm/syscall.h | 7 +
arch/powerpc/include/asm/syscall.h | 10 +
include/linux/tracehook.h | 9 +-
include/uapi/linux/ptrace.h | 35 +++
kernel/ptrace.c | 103 ++++++-
tools/testing/selftests/ptrace/.gitignore | 1 +
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/get_syscall_info.c | 271 ++++++++++++++++++
11 files changed, 470 insertions(+), 15 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c
--
ldv
Hi,
I'm working on writing a LKDTM selftest, and I noticed that the output
format for the existing selftest tests don't seem to actually conform
to the TAP 13 specification[1]. Namely, there is a ton of output
between the "ok" and "not ok" lines, in various places for various
reasons. IIUC, the spec says these lines should follow the "ok" lines
and be (at least) indented to be parsed as YAML.
How does the existing CI do the selftest output parsing? Is there some
other spec I should have read for the correct interpretation here?
(TAP 13 says this part isn't exactly agreed on yet.)
e.g., I see this right now:
# cd tools/testing/selftests
# make --silent -C lib run_tests
TAP version 13
selftests: lib: printf.sh
========================================
printf: module test_printf is not found [SKIP]
not ok 1..1 selftests: lib: printf.sh [SKIP]
selftests: lib: bitmap.sh
========================================
bitmap: module test_bitmap is not found [SKIP]
not ok 1..2 selftests: lib: bitmap.sh [SKIP]
selftests: lib: prime_numbers.sh
========================================
prime_numbers: module prime_numbers is not found [SKIP]
not ok 1..3 selftests: lib: prime_numbers.sh [SKIP]
There is no "plan" line, lots of lines between test lines, and the
test numbering is non-spec. I would have expected the following
output:
TAP version 13
1..3 lib
not ok 1 lib: printf.sh # SKIP module test_printf is not found
not ok 2 lib: bitmap.sh # SKIP module test_bitmap is not found
not ok 3 lib: prime_numbers.sh # SKIP module prime_numbers is not found
And for output, here's another:
# make --silent -C ipc run_tests
TAP version 13
selftests: ipc: msgque
========================================
Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
1..0
ok 1..1 selftests: ipc: msgque [PASS]
The test-specific output should be after the "ok" line:
TAP version 13
1..1 ipc
ok 1 ipc: msgque
---
output: |+
Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
...
The test runner knows how many tests are going to be run, so the
"plan" line can be correct. The ok/not-ok lines can be fixed to avoid
the prefixed "1..", and the output during the tests can be captured
and indented for valid YAML (above uses "output" to capture the text
output with a literal block quote style of "|+"[2]).
It seems like the current output cannot be parsed by things like
Test::Harness. The spec is pretty specific about "Any output that is
not a version, a plan, a test line, a YAML block, a diagnostic or a
bail out is incorrect."
Has this been discussed before? I spent a little time looking but
couldn't find an earlier thread...
Thanks!
-Kees
[1] https://testanything.org/tap-version-13-specification.html
[2] https://yaml-multiline.info
--
Kees Cook
Hi,
this is a draft trying to define some API in order to remove some
redundancy from kselftest shell scripts. Existing kselftest.h already
defines some sort of API for C, there is none for shell.
It's just a small example how things could be. Draft, not meant to be
really merged. But instead of defining shell library (with more useful
helpers), I'd rather adopt LTP shell [1] and C [2] API to kselftest.
LTP API [1] is more like a framework, easy to use with a lot of helpers
making tests 1) small, concentrating on the problem itself 2) have
unique output. API is well documented [3] [4], it's creator Cyril Hrubis
made it after years experience of handling (at the time) quite bad
quality LTP code. Rewriting LTP tests to use this API improved tests a
lot (less buggy, easier to read).
Some examples of advantages of LTP API:
* SAFE_*() macros for C, which handles errors inside a library
* unified messages, unified test status, unified way to exit testing due
missing functionality, at the end of testing there is summary of passed,
failed and skipped tests
* many prepared functionality for both C and shell
* handling threads, parent-child synchronization
* setup and cleanup functions
* "flags" for defining requirements or certain functionality (need root, temporary
directory, ...)
* and many other
kselftest and LTP has a bit different goals and approach. Probably
not all of LTP API is needed atm, but I guess it's at least worth of
thinking to adopt it.
There are of course other options: reinvent a wheel or left kselftest
code in a state it is now (code quality varies, some of the code is
really messy, buggy, not even compile).
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/lib/tst_tes…
[2] https://github.com/linux-test-project/ltp/tree/master/lib
[3] https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines#22-w…
[4] https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines#23-w…
Petr Vorel (2):
selftests: Start shell API
selftest/kexec: Use kselftest shell API
.../selftests/kexec/kexec_common_lib.sh | 74 +++++--------------
.../selftests/kexec/test_kexec_file_load.sh | 53 ++++++-------
.../selftests/kexec/test_kexec_load.sh | 20 ++---
tools/testing/selftests/kselftest.sh | 53 +++++++++++++
4 files changed, 105 insertions(+), 95 deletions(-)
mode change 100755 => 100644 tools/testing/selftests/kexec/kexec_common_lib.sh
create mode 100644 tools/testing/selftests/kselftest.sh
--
2.20.1