[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Merged some fixes from Alex.
Starting to test kvm-inside-tcg.
[VIRT-327 # Richard's upstream QEMU work ]
Posted v1 of aa32 base isa conversion to decodetree.
Reviewed v5 of the sparc64 invert endian tlb bit patches.
r~
[ACTIVITY] On buildbot and Linux kernel CI duty
- One obscure alias analysis bug, failing only with GCC on AArch64,
that suddenly started failing a unit test. Tracked down to an
uninitialised variable when the API is called directly via the unit
test. The test has been reverted, but I've put enough information in a
PR for an AA expert to either fix the test or initialise the class
member pointer to nullptr.
- A long review of an LLD patch that changes page-alignment rules,
which has an impact on the TLS address. Spent way too long trying to
reverse engineer why the formula used would work; needed to find the
corresponding loader code to find the other side.
- Assertion added to LLVM triggered a failure in Linux kernel build,
reproduced example and forwarded on
- Code size investigation Linaro Connect proposal accepted, now have
to write it.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc2 tagged
+ investigated and sent patches for a problem reported by Mark Rutland
where we still were sometimes putting the initrd on top of the kernel
in our built-in bootloader code
+ spent a couple of days tracking down the cause of LP:1696773, an
intermittent bug when running AArch64 Go programs under QEMU linux-user
mode. This turned out to be an incorrect implementation of 'sigaltstack'
as setting a process-wide signal stack rather than a per-thread one.
+ prompted by a patch from Greensocs fixing a vmstate migration bug
in the pl330 model, improved the vmstate macros to catch that category
of bug; this detected one other bug lurking in a different device.
thanks
-- PMM
== Progress ==
* LLVM 8.0.1 final binaries uploaded
* Use ninja in release job [LLVM-536]
- Patches ready, but waiting for 9.0.0-rc1 so I can test (was
supposed to come out this week but got postponed to Monday)
* Investigate running benchmarks in containers [TCWG-1513]
- Still having trouble, but working on it
* IR SVE reviews [LLVM-545]
- Posted some feedback to D53137
* A bit of fuzzing of GlobalISel CallLowering for AArch32
== Plan ==
* Upcoming vacation: 6 - 13 August
[LLVM-478] Clang and GCC code size comparison
- Looked at CMSIS-DSP, clang consistently produces smaller code than
GCC in contrast to Zephyr. Recorded some specific areas
- Tidied up scripts and build system patches to post for review
- Wrote up a readme for how to reproduce the results
[Misc]
- LLD reviews for reducing image-size using some VA/PA address alignment tricks.
- MC and LLD changes to add MOV[WNZ] relocations.
- Looked into why our container might be giving Resource unavailable
during ninja check-all
Some thoughts on ABI process.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc1 tagged
+ investigating various bugs:
- LP:1836501 (assert using vexpress-a15 board with KVM) : couldn't
reproduce; this reminded me I needed to do a debian upgrade on my
cubietruck devboard, though.
- LP:1830864 (assert using -cpu host,aarch64=off with KVM on a host
kernel older than 4.15): diagnosed problem and sent a patch to fix it
- sent a patch to fix a URL in our configure script which pointed
users at the binary downloads page of the project website when it
wanted the source downloads page
- sent a patch fixing a problem building elf2dmp when libcurl's header
files are in a non-standard location
thanks
-- PMM
== Progress ==
* LLVM 8.0.1 rc4 binaries uploaded
* Buildbot and kernel builds monitoring
- Investigated/reported/fixed a couple of issues
- Tried to reproduce a clang-native-arm-lnt-perf failure that's been
keeping the bot red since the 3rd of July, but it turns out to be very
tricky
* Investigate running benchmarks in containers [TCWG-1513]
- Got quite far with this, but got a bit stuck with a file that
cannot be accessed for reasons that are not clear to me yet
* IR SVE reviews [LLVM-545]
- Looking at Graham's rebased size queries patch
== Plan ==
* Upcoming vacation: 6 - 13 August
Good afternoon,
GDB has a valuable feature consisting of process record and replay. In
fact, GDB can record a log of process execution and save it. This record
can be loaded later on, and used for debugging. This is called offline
debugging. It offers the advantage that you can catch the issue once, and
replay it as often as needed to find the root cause and fix it. This is
extremely valuable for non-reproducible or hard-to-reproduce bugs. You
can replay the record either forwards or backwards, which is very
convenient for observing and analyzing the software.
To realize this functionality, GDB is in fact executing the software, one
assembly instruction after another and recording relevant registers and
memory locations. This is a slow operation that can drastically change the
timing of process execution, and thus change the conditions that raise the
bug. To overcome this limitation, GDB can use available SoC IPs to
accelerate the operation. As of today, GDB has support for the "Intel
Processor Trace" and "Branch Trace Store" IPs on Intel processors.
ARM-based SoCs also have IPs that can be used to assist process record,
namely CoreSight trace sources (ETM, PTM, ...), trace links (funnels, ...)
and trace sinks (ETB, ETR, TPIU, ...). They are now supported in the Linux
kernel, through corresponding drivers and the extension of perf. A library
for decoding ETM traces (OpenCSD) is also available. The way is now paved
to bring acceleration of process record for ARM-based SoCs to GDB.
I am re-sending the RFC and making it available as a basis for discussion
on implementing this feature. It is also attached as a text file.
B.R.
Zied Guermazi
Non intrusive execution recording
for GDB using ARM CoreSight
Status of this Memo
This memo provides information for Linaro coresight and toolchain
communities.
Distribution of this memo is unlimited.
Abstract
A method of realizing execution recording in GDB in a non-intrusive way.
This method is based on the use of CoreSight hardware tracing, available on
ARM Cortex devices.
Table of Contents
1 Introduction
2 State of the art
3 Use cases
3.1 Self hosted debug monitor
3.2 Remote debug monitor
3.3 External debugger
4 Implementation needs
4.1 Self hosted debug monitor
4.2 Remote Debug monitor
4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Alternatives and discussions
7.1 Scope definition
7.2 CoreSight infrastructure exposure to the user
7.3 Parameters needed for parsing traces
1. Introduction
CoreSight technology offers a toolset for tracing the execution of a
program on a CPU, as well as routing the traces to an external trace port
analyzer or storing them in a dedicated internal memory. These traces do
not affect system performance, and can be used as a record of program
execution.
GDB offers reverse debugging by recording program execution and storing
it. GDB offers either full record or program flow (branch) record. Records
can be replayed later on for forwards or backwards debugging.
This request for comments is about realizing GDB record and replay
functionality using CoreSight technology. It presents typical use cases
and discusses different alternatives for realizing the above-mentioned
feature.
2. State of the art
GDB currently supports two execution recording variants:
- full record: registers as well as memory are recorded for each
instruction. In this case GDB collects the registers as well as the
involved memory area after each instruction. Currently this has no
support for hardware accelerators.
- branch record: only program flow is recorded. In this case GDB
collects the program execution flow. Currently branch record is
implemented either with or without hardware acceleration, using the
Intel Branch Trace Store ("bts") and Intel Processor Trace ("pt")
hardware accelerators on supported CPUs.
3. Use cases
Programs running on ARM processors can be debugged in many
configurations. Three of them are selected in this RFC as a basis for
discussion:
3.1. Self hosted debug monitor
These are systems where the debugger program runs on the same CPU as the
debugged program and monitors it. The user interacts with the debugging
session on the target host itself.
Linux GDB is an example of such systems. In such a system the following
setup is considered:
- Target: a process running on an ARM Cortex-A
- Debugger: GNU GDB via the ptrace API (arm-linux-gnueabihf-gdb)
+-----------------------------------+
| Target |
| +------------+ |
| +------+ | Coresight | |
| | | | components:| |
| | GDB |<--------->| | |
| | | ^ | DWT, ETM, | |
| +------+ | | ITM, TPIU | |
| ^ | | TMC, ETB | |
| | | +------------+ |
+----|---------|--------------------+
| |
| |
arm-linux- |
gnueabihf- |
gdb |
debug: ptrace
trace: perf/CoreSight drivers
3.2. Remote debug monitor
These are systems where the debugger program runs on the same CPU as the
debugged program and monitors it. The user interacts with the debugging
session remotely from a PC.
Linux GDB is an example of such systems. In such a system the following
setup is considered:
- Target: a process running on an ARM Cortex-A
- GDB server: GNU gdbserver (arm-linux-gnueabihf-gdbserver)
- GDB client: GNU GDB (arm-linux-gnueabihf-gdb)
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------+ +---------------------------------------+
| Host | | Target |
| | | +------------+ |
| +-----+ +------+ | | +------+ | Coresight | |
| | | | GDB | | | | GDB | | components:| |
| | UI |<--->| |<--->|<--->|<--->| |<--------->| | |
| | | ^ |Client| ^ | ^ | |Server| ^ | DWT, ETM, | |
| +-----+ | +------+ | | | | +------+ | | ITM, TPIU | |
| ^ | ^ | | | | ^ | | TMC, ETB | |
| | | | | | | | | | +------------+ |
+----|-----|-----|------|--+ | +--------|---------|--------------------+
| | | | | | |
| | | | | | |
Eclipse | arm-linux- | | arm-linux- |
| gnueabihf- | TCP/IP gnueabihf- |
| gdb | UART gdbserver |
GDB MI GDB remote debug: ptrace
protocol trace: perf/CoreSight drivers
3.3. External debugger
These are systems where an external debugger is used. It accesses the
target using JTAG or SWD. The target is usually a bare-metal embedded
system or a system with an RTOS.
As an example, the following setup is considered:
- Target: firmware running on an ARM Cortex-M.
- Debugger: external debug and trace device.
- GDB server: OpenOCD.
- GDB client: arm-none-eabi-gdb.
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------------------+ +-------+ +-------------+
| Host | | dbggr | | Target |
| | | | | |
| +-----+ +------+ +------+ | | | | Coresight |
| | | | GDB | | GDB | | | Debug | | components: |
| | UI |<--->| |<--->| |<-->|<--->| + |<--->| |
| | | ^ |Client| ^ |Server| | ^ | Trace | ^ | DWT, ETM, |
| +-----+ | +------+ | +------+ | | | | | | ITM, TPIU |
| ^ | ^ | ^ | | | | | | |
| | | | | | | | | | | | |
+----|-----|-----|------|-----|--------+ | +-------+ | +-------------+
| | | | | | |
| | | | | | |
Eclipse | arm-none- | OpenOcd | |
| eabi-gdb | PyOcd | |
| | | |
GDB MI GDB remote Ethernet debug: JTAG/SWD
protocol USB trace: Serial/Parallel
4. Implementation needs
4.1 Self hosted debug monitor
GDB : arm-linux-gnueabihf-gdb
The interface defined in btrace.h for capturing and processing traces
has to be implemented for ARM CoreSight.
Needed actions:
- in btrace-common.h: add needed structures for capturing and
handling etm traces
- in linux-btrace.h:
- add btrace_tinfo_etm
- amend btrace_target_info
- in linux-btrace.c: change following functions to support etm
traces
- linux_enable_btrace
- linux_disable_btrace
- linux_read_btrace
- linux_btrace_conf
- in arm-linux-nat.c: add an API to
- configure btrace
- enable btrace
- disable btrace
- read btrace
- in btrace.c
- btrace_add_pc, btrace_fetch: have to be implemented for
CoreSight. This means using the OpenCSD library to parse ETM traces and
then reconstruct the executed instructions accordingly
(btrace_compute_ftrace_1)
- in record-btrace.c
- add command for showing record btrace etm options
- add command for starting tracing with CoreSight and its
handler (cmd_record_btrace_etm_start)
- adapt cmd_show_record_btrace_cpu
...
perf:
needed actions:
- make sure that perf can start/stop tracing a process with its
threads, collect etm traces and deliver them to the user
4.2 Remote Debug monitor
The changes described in 4.1 are needed. In addition, to support the
remote protocol, the following changes are needed:
GDB server: arm-linux-gnueabihf-gdbserver
needed actions:
- in linux-low
- linux_low_read_btrace: add support for etm traces formatting.
- linux_low_btrace_conf: add support for etm configuration
formatting.
GDB client: arm-linux-gnueabihf-gdb
needed actions:
- in remote.c
- adapt enable_btrace
- adapt disable_btrace
- in btrace.c
- parse_xml_btrace: update btrace.dtd [2] and related data
structures btrace_xxx
- parse_xml_btrace_conf: update btrace-conf.dtd [3] and related
data structures btrace_conf_xxx
- extend Remote protocol handling to support coresight etm traces
UI: eclipse
needed actions:
make sure that the plugin for recording execution and replaying it
copes well with arm-linux
Remote protocol needs to be extended by
-1- Adding Qbtrace:CoreSight (or etm) to start collecting etm traces
-2- Amending 'Branch Trace Format' xml specification to consider etm
traces transfer
-3- Amending 'Branch Trace Configuration Format' xml specification to
consider parameters needed for etm
4.3 External debugger
The changes described in 4.2 are needed. In addition, to support tracing
a remote target through an external debugger (bare-metal embedded system),
the following changes are needed:
GDB server: OpenOcd
needed actions:
- rework etm driver to make it up to date.
- add a driver for configuring trace interconnect IPs
- rework the driver for TPIU.
- integrate support for a Trace port analyzer.
- extend remote protocol implementation to support recording
The CoreSight infrastructure of the SoC is to be set up in OpenOCD through
configuration files. Parameters that are not relevant for GDB are also
specified in configuration files (trace sink, trace protocol, port size,
trace synch frequency, cycle-accurate tracing, etc.)
GDB client: arm-none-eabi-gdb
needed actions:
- extend Remote protocol to support coresight etm traces
- integrate etm trace parsing library
- interface the parser to record_btrace_target
The remote protocol needs, in addition to 4.2, to be extended by:
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoC
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing
multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support
filtering traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value
and Qbtrace-conf:CoreSight:filter:end-address=value to
support filtering traces for given functions/blocks/lib
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value
and Qbtrace-conf:CoreSight:trigger:off-address=value to
support triggering tracing or stop tracing if a certain function/block/lib
is executed
Alternatively, some of the configurations related to filtering and
triggering can be delegated to the GDB server.
UI: eclipse
test and verify that existing plugins cope well with GDB extensions
5. Remote protocol execution sequence
GDB and gdbserver communicate using the GDB remote protocol.
On a semantic level, a tracing session runs through the following sequence:
(1) GDB client queries gdb server support for branch trace
(2) GDB server answers with
- qXfer:btrace:read
- qXfer:btrace-conf:read
- Qbtrace:off
- Qbtrace:CoreSight
- Qbtrace-conf:CoreSight:xxx where xxx is the parameter name
(3) GDB client sends the command to start emitting and collecting
traces (Qbtrace:CoreSight)
(4) GDB server executes the command
(5) GDB client sends the command to stop emitting and collecting traces
(Qbtrace:off)
(6) GDB server executes the command
(7) GDB client sends command to get collected traces from trace sink
(qXfer:btrace:read:annex:offset,length)
(8) GDB server executes the command and sends back collected traces
(9) GDB client parses the traces and reconstructs target states
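As an illustration of the sequence above, the following Python sketch
frames the client-side packets of steps (1), (3), (5) and (7) the way the
GDB remote protocol frames any packet: '$', the payload, '#', and a
two-digit hex checksum (the sum of the payload bytes modulo 256).
Qbtrace:CoreSight is the packet proposed in this RFC and does not exist in
GDB today. The function names, the choice of the "all" annex and the read
length are assumptions made for the example only.

```python
# Sketch only: "Qbtrace:CoreSight" is the packet proposed in this RFC.
# frame() and tracing_session() are hypothetical helper names.


def frame(payload: str) -> str:
    """Wrap a payload as $payload#xx, with xx = sum of payload bytes mod 256."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def tracing_session(annex: str = "all", offset: int = 0, length: int = 0x1000):
    """Return the client-side packets of steps (1), (3), (5) and (7), in order.

    The annex, offset and length defaults are placeholders; offset and
    length are written in hex, as the remote protocol does for qXfer reads.
    """
    return [
        frame("qSupported"),         # (1) query the server's feature list
        frame("Qbtrace:CoreSight"),  # (3) start emitting and collecting traces
        frame("Qbtrace:off"),        # (5) stop emitting and collecting traces
        frame(f"qXfer:btrace:read:{annex}:{offset:x},{length:x}"),  # (7) fetch
    ]


if __name__ == "__main__":
    for packet in tracing_session():
        print(packet)
```

The server replies of steps (2), (4), (6) and (8) are left out: their
content depends on the choices discussed in section 7.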
6. Remote protocol extensions
The remote protocol needs to be extended with the following primitive to
support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)
The remote protocol can be extended with the following primitives to take
advantage of ETM functionality:
- select the core to trace on in the case of a multicore system
GDB client sends command to select the core to trace
(Qbtrace-conf:CoreSight:core=value)
- set the trace id for the traces
GDB client sends command to set trace id
(Qbtrace-conf:CoreSight:id=value)
- select the context to trace
GDB client sends command to select the context to trace
(Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace
GDB client sends command to select the address range to trace
(Qbtrace-conf:CoreSight:filter:start-address=value)
(Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing
GDB client sends command to select the address to trigger tracing
(Qbtrace-conf:CoreSight:trigger:on-address=value)
(Qbtrace-conf:CoreSight:trigger:off-address=value)
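The optional primitives listed above could be generated from a single
configuration record, as in the following Python sketch. The packet names
are the ones proposed in this RFC. The class, its field names and the
choice of writing values in hex are assumptions made for illustration and
are not part of the proposal.

```python
# Sketch only: CoreSightConf and its fields are hypothetical names; the
# packet names are the ones proposed in this RFC.
from dataclasses import dataclass
from typing import List, Optional

PREFIX = "Qbtrace-conf:CoreSight:"


@dataclass
class CoreSightConf:
    core: Optional[int] = None           # core to trace on a multicore SoC
    trace_id: Optional[int] = None       # trace id used for demultiplexing
    context: Optional[int] = None        # process/thread context filter
    start_address: Optional[int] = None  # start of the traced address range
    end_address: Optional[int] = None    # end of the traced address range
    trigger_on: Optional[int] = None     # address that starts tracing
    trigger_off: Optional[int] = None    # address that stops tracing

    def payloads(self) -> List[str]:
        """Return one packet payload per parameter that has been set."""
        fields = [
            ("core", self.core),
            ("id", self.trace_id),
            ("filter:context", self.context),
            ("filter:start-address", self.start_address),
            ("filter:end-address", self.end_address),
            ("trigger:on-address", self.trigger_on),
            ("trigger:off-address", self.trigger_off),
        ]
        # Assumption: values are sent as hex digits without a 0x prefix.
        return [f"{PREFIX}{name}={value:x}"
                for name, value in fields if value is not None]


if __name__ == "__main__":
    conf = CoreSightConf(core=1, start_address=0x8000, end_address=0x8FFF)
    for payload in conf.payloads():
        print(payload)
```

Parameters left unset produce no packet, so a client that only needs
Qbtrace:CoreSight sends nothing extra.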
7. Alternatives and discussions
7.1. Scope definition
The CoreSight ETM IP comes in many versions and many implementations.
According to its capabilities, it can trace instructions only or
instructions and the involved data/data address. All ETM variants support
instruction tracing and can therefore be used for branch tracing.
7.2. CoreSight infrastructure exposure to the user
This concerns assigning the responsibility for configuring the CoreSight
infrastructure to generate and route traces. Two alternatives are possible:
- CoreSight infrastructure exposed to the GDB client (and UI):
in this alternative the user or the UI is responsible for
configuring the CoreSight IPs in the SoC, by accessing their registers
directly or via the CoreSight driver. The remote protocol is used to
configure the trace sink (ETB or TPA) to start/stop collecting traces.
- CoreSight infrastructure not exposed outside of gdbserver:
in this case high-level commands can be provided by the gdbserver
remote protocol to set up and configure the CoreSight IPs in the SoC.
My recommendation is to extend the remote protocol to provide high-level
commands to set up and configure the CoreSight IPs in the SoC, or to use a
different channel to pass configuration parameters to the GDB server.
7.3. Parameters needed for parsing traces
Some configuration parameters such as the ETM version, trace ID, ...
(content of registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR) are needed for
extracting and parsing ETM traces. These parameters need to be exchanged
between the GDB server and client. The following alternatives are possible:
- extend the remote protocol to get those params with explicit queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
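To make the second alternative more concrete, the following Python sketch
shows one way the four register values could travel as attributes of an
XML element inside a btrace-conf document. The "coresight" element name,
its attribute names and the hex encoding are purely hypothetical; no such
element exists in btrace-conf.dtd, and the register values used in the
example are arbitrary placeholders rather than values read from hardware.

```python
# Sketch only: the <coresight> element and its attributes are hypothetical.
import xml.etree.ElementTree as ET

REGISTERS = ("etmcr", "etmidr", "etmccer", "etmtraceidr")


def encode_conf(regs: dict) -> str:
    """Server side: serialise the register values into a btrace-conf document."""
    root = ET.Element("btrace-conf", version="1.0")
    ET.SubElement(root, "coresight",
                  {name: f"0x{regs[name]:x}" for name in REGISTERS})
    return ET.tostring(root, encoding="unicode")


def decode_conf(xml_text: str) -> dict:
    """Client side: recover the register values needed by the trace decoder."""
    node = ET.fromstring(xml_text).find("coresight")
    if node is None:
        raise ValueError("no <coresight> element in btrace-conf")
    return {name: int(node.attrib[name], 16) for name in REGISTERS}


if __name__ == "__main__":
    # Placeholder values, chosen only to exercise the round trip.
    sample = {"etmcr": 0x400, "etmidr": 0x4100F250,
              "etmccer": 0x0, "etmtraceidr": 0x10}
    document = encode_conf(sample)
    print(document)
    print(decode_conf(document))
```

The same record could equally be returned by an explicit query or embedded
in the qXfer:btrace:read response; only the transport differs.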
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Got to "kvm enabled with vhe" kernel message, but then
the kernel hangs there. Irritatingly works with the
kernel I built myself, but not a distro supplied kernel.
Need to track down the config difference so I can continue
using gdbstub.
[VIRT-344 # ARMv8.5-MemTag, Memory Tagging Extension ]
Regenerated an mte+linux-user branch for Google engineers
to use to develop llvm. This is code previously posted,
but my current branch stripped out linux-user for ease of
review of the system code.
[VIRT-327 # Richard's upstream QEMU work ]
Fixed mmap assert, signal handler method.
Fixed constant folding of extract2.
Fixed aarch64 host output of extract2.
Posted pull request for those.
Started reviewing the GSoC risugen patches, v3 for avx.
r~
== This Week ==
* PR90723
- Committed fix to trunk in r273466.
* PR90724
- Posted patch upstream, waiting for feedback.
* Validation
- SPEC2k6 with SVE seems to compile, but spotted a couple of infra issues.
- Sent abe patch to store sum files.
* Misc
- Meetings
== Progress ==
* GCC:
- FDPIC: received feedback on generic patches, will address after holidays
- noinit attribute: iterated on generic attribute patch, not approved yet
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- Non-contiguous memory regions support in the BFD linker: proposal
looks OK, will start implementation after holidays
* misc:
- infra fixes / troubleshooting
- reported several regressions in QEMU, promptly fixed by our awesome
team members
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== Holidays ==
July 13-27
Aug 2-11
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- posted {PATCH for semihosting-tests} semihosting tests: add v7m
tests Message-Id: <20190711135726.14191-1-alex.bennee(a)linaro.org>
- semihosting re-factor now in v4 branch
- cleaned up translator_ld stuff for arm
- posted {PATCH for 4.1?} includes: remove stale {smp|max}_cpus
externs Message-Id: <20190711130546.18578-1-alex.bennee(a)linaro.org>
- fixed up code needing smp/max_cpus
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[v4 branch] https://github.com/stsquad/qemu/tree/plugins/plugins-v4
GSoC Mentoring ([VIRT-348])
- starting to look quite workable
- looks like chunks of CONFIG_PROFILER can be made runtime
select-able
Upstream Work ([VIRT-109])
==========================
- more regression hunting for 4.1 release
- looked at bugs [1834496] and [1836078]
- posted {PATCH v2 for 4.1} target/arm: report ARMv8-A FP support
for AArch32 -cpu max Message-Id:
<20190711103737.10017-1-alex.bennee(a)linaro.org>
- rth and pm215 also posted various fixes
- ieee_6 test looks like a [fortran/gcc runtime issue]
- posted {PATCH for 4.1? v1 0/7} testing/next (docker, win-cross)
Message-Id: <20190712111849.9006-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[1834496] https://bugs.launchpad.net/bugs/1834496
[1836078] https://bugs.launchpad.net/bugs/1836078
[fortran/gcc runtime issue]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314
Completed Reviews [5/5]
=======================
{PATCH 0/5} tcg: Fix mmap_lock assertion failure, take 2
Message-Id: <87zhlned2x.fsf(a)zen.linaroharston>
{PATCH for-4.1} target/arm: Set VFP-related MVFR0 fields for arm926 and arm1026
Message-Id: <20190711121231.3601-1-peter.maydell(a)linaro.org>
{PATCH for-4.1 0/2} Compatibility fixes for nettle 2.7 vs 3.0 vs 3.5
Message-Id: <20190712101849.8993-3-berrange(a)redhat.com>
{PATCH v2 0/5} tests/docker: add podman support
Message-Id: <20190709194330.837-1-marcandre.lureau(a)redhat.com>
- CLOSING NOTE [2019-07-12 Fri 18:07]
Looks ok - need to get a podman system up for testing
{RISU PATCH v3 00/18} Support for generating x86 SIMD test images
Message-Id: <20190711223300.6061-5-jan.bobek(a)gmail.com>
Absences
========
- 18-19th July
Current Review Queue
====================
* {PATCH 0/2} tests/acceptance: Add test of NeXTcube framebuffer using OCR
Message-Id: <20190629150056.9071-1-f4bug(a)amsat.org>
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
* {PATCH 0/3} tests/acceptance: Add tests for the Leon3 board
Message-Id: <20190627115331.2373-1-f4bug(a)amsat.org>
* {PATCH 0/5} tests/acceptance: Add bFLT loader linux-user test
Message-Id: <20190625101524.13447-1-philmd(a)redhat.com>
* {PATCH v2 0/9} KVM: arm/arm64: vgic: ITS translation cache
Message-Id: <20190611170336.121706-1-marc.zyngier(a)arm.com>
* {Qemu-devel} {RFC PATCH 0/7} Proof of concept for Meson integration
Message-Id: <1560165301-39026-1-git-send-email-pbonzini(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc0 sent out of the door
+ investigated and sent fix for a regression with arm926 and arm1026
emulation: we accidentally turned off VFP double-precision support
on these cores with the recent VFP refactoring
+ helped track down a booting failure Beata ran into on aarch64 hosts
to a regression in the TCG backend, which RTH has now sent a patch for
+ sent a cleanup for some dead code spotted by Coverity in the imx6ul SoC
thanks
-- PMM
== Progress ==
* Investigate running benchmarks in containers [TCWG-1513]
- Faffing about with our benchmarking scripts, not sure how to test
changes without disrupting our infrastructure
- Cooked up some viz scripts so I can easily look at the noise
levels in benchmark results with/without containers
* Started LLVM 8.0.1 rc4 build
- In progress on ARM, infrastructure issues on AArch64
== Plan ==
* Upcoming vacation: 6 - 13 August
[LLVM-583] LLVM Code Size reduction ideas from Zephyr and CMSIS
- Started a ticket to record areas of improvement where GCC does
better than LLVM.
- Upstream defaults to -mno-unaligned-access for clang which needs to
be corrected for.
- Much of the difference goes away when inlining is disabled, implying
that different inlining strategies could be the most significant
difference.
- Sent in Linaro Connect presentation submission to cover all of TCWG's
code-size improvement work.
Planned absences
- Rest of this week, back in the office on the 15th July
[Code size investigation]
Results (clang 2% larger than gcc) replicated on cortex-m0 and
cortex-m4 on Zephyr.
- Clang optimisation to use BLX rather than BL when same function
called multiple times is a pessimisation on Zephyr, especially on M0.
- GCC register allocation seems to result in fewer spills
TODO: Get an estimate of how much code-size difference is down to
different inlining decisions.
On CMSIS DSP cortex-m4f clang appears to be producing smaller code than
GCC, not measured averages yet.
[LLD]
- Quite a few upstream reviews, PRs and investigations surrounding them.
- Likely that LLD will be converting to the new variable naming convention.
- Received a request to add cortex-a8 erratum fix for Google Android team.
[Linaro Connect]
Registered and contacted travel.
Drafted a submission for presentation, will submit next week.
Planned Absences:
On holiday Wednesday, Thursday, Friday next week
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ catching up with email and code review -- managed to get some
series reviewed in time for softfreeze on Tuesday, notably the
'sbsa-ref' reference platform model that Hongbo Zhang was working on
+ a lot of release-herding, working through the huge pile of pull
requests that need merging
+ fixed a silly bug in recent VFP refactoring, spotted by coverity
+ fixed a memory leak that broke our CI sanitizer build (not a new
piece of code, but we currently only sanitize the x86-64 targets
and a recent change meant this old code is now used on x86-64 for
the ATI PCI display device model)
* Misc:
+ first KVM Forum Programme Committee meeting (and attendant
review of all the submitted abstracts; bumper crop this year)
thanks
-- PMM
== Progress ==
* GCC:
- FDPIC: No progress, still waiting for feedback
- noinit attribute: reviewers asked to make it a generic attribute,
rather than target-specific. New patch sent.
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- PR24709 (linker crash and assertion failure with CMSE). Use case not
considered worth the headache of correctly supporting CMSE+long-branch
stubs (tricky to get right). Replaced the linker crash with a user
error message.
- Non-contiguous memory regions support in the BFD linker: Received a
good summary of the consensus reached in 2017.
* misc:
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== Holidays ==
July 13-27
Aug 2-11
== This Week ==
* PR88833
- Fixed pending issues with x86 and committed fix to trunk
* PR90723
- Issue seems to be infinite recursion overflowing the stack, investigating.
* Misc
- Meetings
== Next Week ==
- PR90723
- Add testsuite comparison to tcwg_gnu
== Progress ==
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Still working around perf version mismatches, going to investigate
if we can use a newer version of perf to collect data
- Had a look at the assembly for sphinx from gcc-6, clang-3.9.1 and
clang-8.0.0, but going to wait for better perf before I rush to
conclusions
* IR SVE Reviews [LLVM-545]
- Had a look at the clang patches, gave some feedback for one of
them; the other ones are very subtle and best left to the clang
maintainers
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Fixed a little ASAN snag, should be done for good now
* Trying to get an ABI fuzzer up and running for GlobalISel
== Plan ==
* Update benchmarking infrastructure (in support of LLVM-134)
* Deprioritize GlobalISel
* Upcoming vacation: 6 - 13 August
Short week; 3 days.
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Fixed 3 bugs:
* SVE length calculation,
* PNX bit while in EL2&0 regime,
* Interrupt routing w/ TGE bit.
A bare 5.2 kernel boots to root file system not found.
A 4.19 kernel hangs during boot somewhere.
Now doing a fedora30 install, which I believe has a 5.x kernel...
[VIRT-327 # Richard's upstream QEMU work ]
Patch review:
* GSoC x86 risugen,
* arm semihost cleanups.
r~
== This Week ==
* PR88833 (6/10)
- Patch approved by Richard.
- Regresses pr88152.C due to a possibly latent issue with combine.
* PR90722 (3/10)
- Investigating the issue
* Misc (1/10)
- Meetings
o LLVM:
* 8.0.1-rc3: Built ARM Binaries, AArch64 on-going.
* Machine Outliner:
- Uncovered issues in stack fixup handling, working on a fix.
o Misc
* Various meetings and discussions.
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Rebased on master. Now trying to remember the incantation that produced a
minimum number of insns before the kernel actually tries to use this. At
present things seem to be crashing before I even get that far, as if I've
misconfigured something.
[VIRT-327 # Richard's upstream QEMU work ]
Fixed a couple of bugs in my tcg/ppc host vector patch set.
Reviewed qemu kvm sve patches.
Reviewed target/ppc altivec optimization patches.
Reviewed GSoC risugen x86 patches.
Other misc upstream review.
r~
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- more work on [v4 branch] but cutting it too fine for 4.1
- reworking the memory tracing to track mmu_idx
- hope to have v4 posted on Monday if I can squash the bugs
- the plugin call isn't getting the full TCGMemopidx (maybe only
TCGmemop?)
- some interest on list from HW manufacturers
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[v4 branch] https://github.com/stsquad/qemu/tree/plugins/plugins-v4
GSoC Mentoring ([VIRT-348])
- reviewed {Qemu-devel} {PATCH v2 0/4} dumping hot TBs Message-Id:
<20190624055442.2973-1-vandersonmr2(a)gmail.com>
- worked up some [suggestions for HMP interface and refactoring]
- first evaluation period work
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
[suggestions for HMP interface and refactoring]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
Upstream Work ([VIRT-109])
==========================
- posted {PULL 00/19} testing/next (tests/vm, Travis and hyperv build
fix) Message-Id: <20190624134337.10532-1-alex.bennee(a)linaro.org>
- bit late, will wait until pm215 returns
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Other
=====
- booking flights for Connect
- submitted a talk for KVM Forum (plugins)
Completed Reviews [2/2]
=======================
{Qemu-devel} {PATCH v2 0/4} dumping hot TBs
Message-Id: <20190624055442.2973-1-vandersonmr2(a)gmail.com>
- CLOSING NOTE [2019-06-27 Thu 12:39]
Made a bunch of notes and [tweaks]
[tweaks]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
{PATCH} Makefile: Rename the 'vm-test' target as 'vm-help'
Message-Id: <20190531064341.29730-1-philmd(a)redhat.com>
- CLOSING NOTE [2019-06-27 Thu 12:40]
Queued to my tree
Absences
========
- June 21st
Current Review Queue
====================
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
* {PATCH 0/3} tests/acceptance: Add tests for the Leon3 board
Message-Id: <20190627115331.2373-1-f4bug(a)amsat.org>
* {PATCH 0/5} tests/acceptance: Add bFLT loader linux-user test
Message-Id: <20190625101524.13447-1-philmd(a)redhat.com>
* {PATCH v2 0/9} KVM: arm/arm64: vgic: ITS translation cache
Message-Id: <20190611170336.121706-1-marc.zyngier(a)arm.com>
* {Qemu-devel} {RFC PATCH 0/7} Proof of concept for Meson integration
Message-Id: <1560165301-39026-1-git-send-email-pbonzini(a)redhat.com>
* {PATCH 00/59} KVM: arm64: ARMv8.3 Nested Virtualization support
Message-Id: <20190621093843.220980-1-marc.zyngier(a)arm.com>
--
Alex Bennée
== Progress ==
* GCC:
- FDPIC: No progress, still waiting for feedback
- noinit attribute: no feedback yet
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- managed to reproduce similar error messages
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- PR24709 (linker crash and assertion failure with CMSE). CMSE stubs
do not support long branches. Tried two approaches, but haven't found a
solution yet
- resurrected a thread about non-contiguous memory regions support in
the BFD linker
* misc:
- reported regressions in qemu, found with the GCC testsuite
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
- GNU-583
== Progress ==
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Committed upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- In progress
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Got some results with clang-3.9.1 and clang-8.0.0, trying to work
around a perf annotate issue so I can investigate
== Plan ==
* Continue with these
* Friday off
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: no feedback yet
* Binutils:
- started looking at PR24709 (linker crash and assertion failure with CMSE)
* misc:
- ABE: pushed fix to glibc-2.29+ builds, tcwg-backport job now works again
- forwarded GCC/Linaro bug #5314 to upstream; quickly fixed by richi
- looked at a couple of old Jira cards. Not sure how to resume work on
GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
o 4 days week (1 training day)
o LLVM:
* 8.0.1-rc2: ARM and AArch64 binaries built and uploaded.
* Buildbots babysitting: couple of issues reported.
* Machine Outliner:
- Adding testcases before upstream submission.
o Misc
* Various meetings and discussions.
== Progress ==
* Out of office on Friday (bank holiday)
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Patches upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- Started looking into call lowering for 64-bit types
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Trying to reproduce results from Connect, hit a little snag with Jenkins
== Plan ==
* Continue with these
[LLVM-542] Build Zephyr with clang
- Spent quite a bit of time working out why a Clang-built Zephyr hello
world wouldn't boot; tracked it down to a missing clobber list in an
inline assembly block
- Wrote some scripts to collect code size information on the samples.
Some initial figures on mostly cortex-m3 put llvm -Oz trunk about 2%
larger than arm-none-eabi-gcc (9.1.1) -Os with frame pointers
disabled. The samples are making very little use of the library
(newlib built with arm-none-eabi-gcc).
- Working out which samples will build with cortex-m0.
- Investigated the latest version of Bloaty McBloatface, a code size tool
from Google. It has some interesting features, including an inline
detection feature that can map a portion of a function's code size to
inlined functions.
Misc:
LLD reviews and mailing list comments.
Planned Absences:
Likely to take some holiday around 13th July
Week ending 23 June is very short.
* Patch review
- target/ppc vsr cleanup
- target/tricore translator loop conversion
- target/arm vfp decodetree cleanup
- cortex-strings strrchr fix
- continuing on the plugin api
* Xilinx meeting
* Fix qemu assert for clyon
- Found two other bugs in the process.
* Debugging my own USHR/SSHR patch vs aa32.
r~
Progress: (very short week, 2 days)
* VIRT-65 [QEMU upstream maintainership]
+ found and sent patches for a handful of minor M-profile bugs
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ the final patches for this have now gone upstream and I have
marked the JIRA issue as resolved. There may still be minor
bug fixes but we can handle those under the usual 'upstream
maintainership' JIRA
thanks
-- PMM
Hello folks,
I got a few people asking me to do this in the last Connect, so I've
proposed a beginner session that explores gcc under the hood. The
tentative plan I have for the talk is:
1. A high level view of how the source code is laid out
2. Front end, middle end, backend. This includes a high level
introduction of GIMPLE and machine descriptions
3. A walkthrough of one or two simple programs and usage of diagnostic
flags like -fdump-*
Additional suggestions are most welcome. Also, I was thinking maybe
it would be good to have a llvm under the hood talk along similar
lines. Thoughts?
Siddhesh
[LLVM-571] Build GNU rmprofile toolchain with Linaro scripts (abe)
The existing build-system was only set up to build the A-profile bare
metal toolchain. Managed to find the right combination of flags and
modifications to get a toolchain that Zephyr can use.
[LLVM-158] Buildbot maintenance
An interesting failure introduced in LLD, but causing segfaults in
2-stage build, now fixed.
[LLVM-542] Zephyr code size investigation
- Rebased modifications to Zephyr
- Wrote script to build all the examples with GCC and Clang
- Fixed problems with modifications found by building all the examples
- Clang built helloworld no longer booting, need to investigate
- Found some areas for more investigation:
-- llvm-objcopy missing --gap-fill (used by one of the sample programs)
-- lld missing --print-memory-usage. Although I'm using gcc for the main
link, the Zephyr build system seems to feature-test using clang's bare
metal default linker (lld)
-- clang always generates .note.GNU-stack, gcc embedded does not,
leading to lots of orphan section warnings. Probably best solved by
linker script modification.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ some work on getting Sphinx to generate manpages, with a conversion
of the qemu-ga manpage from texinfo as the demonstration case
+ set up to run pre-merge tests on our packet.net qemu-test machine
rather than the gcc compile farm one (as the latter is running an
ancient ubuntu whose python is now too old to build QEMU with)
+ sent some follow-on cleanup patches now that VFP decodetree is in master
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ sent out patches which correctly make the Cortex-M33 (and the -M4)
implement single-precision-only floating point, so the double-precision
instructions UNDEF as they should
+ once those and the other on-list patches have made their way through
code review and into master this epic will be complete
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: patch posted for upstream review
* misc:
- training 3 days
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- UBSAN/bare-metal: do more testing
== Progress ==
* Catching up after holidays
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Still in progress, but I fixed the AMDGPU failure
- Decided to go a bit deeper than initially intended, since it gets
really awkward otherwise
* IR SVE Reviews [LLVM-545]
- New version of the patch was posted, had a quick look
== Plan ==
* Continue LLVM-568
* Start LLVM-134 (LLVM SPEC2k6 Performance Analysis Leftovers from BUD17)
* LLVM-545 (IR SVE Reviews) - Have a look at the new, revised patch
Hi all,

Considering the big progress achieved in the CoreSight drivers, perf and
OpenCSD, the prerequisites for developing ETM-based branch tracing in GDB
for ARM processors are now available. I am therefore publishing this
request for comments, and I look forward to your feedback on this
proposal.
Non intrusive execution recording for GDB using ARM CoreSight

Status of this Memo

This memo provides information for the Linaro CoreSight and toolchain
communities. Distribution of this memo is unlimited.

Abstract

A method of realizing execution recording in GDB in a non-intrusive way.
This method is based on the use of CoreSight hardware tracing, available
on ARM Cortex devices.

Table of Contents

1 Introduction
2 State of the art
3 Use cases
  3.1 Self hosted debug monitor
  3.2 Remote debug monitor
  3.3 External debugger
4 Implementation needs
  4.1 Self hosted debug monitor
  4.2 Remote Debug monitor
  4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Solutions and alternatives
  7.1 Scope definition
  7.2 CoreSight infrastructure exposure to the user
  7.3 Parameters needed for parsing traces
1. Introduction

CoreSight technology offers a toolset for tracing the execution of a
program on a CPU, as well as routing the traces to an external trace port
analyzer or storing them in a dedicated internal memory. Those traces do
not affect system performance, and can be used as a record of program
execution. GDB offers reverse debugging by recording program execution and
storing it. GDB offers either full record or program flow (branch) record.
Records can be replayed later on for forward or backward debugging. This
request for comments is about realizing GDB record and replay
functionality using CoreSight technology. It presents typical use cases
and discusses different alternatives for realizing the above-mentioned
feature.

2. State of the art

GDB currently supports two execution recording variants:
- full record: registers as well as memory are recorded for each
  instruction. In this case GDB collects the registers as well as the
  memory areas involved after each instruction. Currently this has no
  support for hardware accelerators.
- branch record: only program flow is recorded. In this case GDB collects
  a list of linear execution sequences called blocks. Each branch
  terminates the previous block and starts a new one. Currently branch
  record is implemented either without hardware acceleration or using the
  Intel Branch Trace Store ("bts") and Intel Processor Trace ("pt")
  hardware accelerators on supported CPUs.
3. Use cases

Programs running on ARM processors can be debugged in many
configurations. Three of them are selected in this RFC as a basis for
discussion:

3.1. Self hosted debug monitor

These are systems where the debugger program runs on the same CPU as the
debugged program and monitors it. The user interacts with the debugging
session on the target host itself. Linux gdb is an example of such
systems. In such a system the following setup is considered:
- Target: a process running on an ARM Cortex-A
- Debugger: GNU gdb via the ptrace API (arm-linux-gnueabihf-gdb)

+-----------------------------------+
|              Target               |
|                    +------------+ |
| +------+           | CoreSight  | |
| |      |           | components:| |
| | GDB  |<--------->|            | |
| |      |     ^     | DWT, ETM,  | |
| +------+     |     | ITM, TPIU  | |
|    ^         |     | TMC, ETB   | |
|    |         |     +------------+ |
+----|---------|--------------------+
     |         |
     |         |
 arm-linux-    |
 gnueabihf-    |
 gdb           |
          debug: ptrace
          trace: perf/CoreSight drivers
3.2. Remote debug monitor

These are usually systems where the debugger program runs on the same CPU
as the debugged program and monitors it. The user interacts with the
debugging session remotely from a PC. Linux gdb is an example of such
systems. In such a system the following setup is considered:
- Target: a process running on an ARM Cortex-A
- Gdb server: GNU gdbserver (arm-linux-gnueabihf-gdbserver)
- Gdb client: GNU gdb (arm-linux-gnueabihf-gdb)
- UI: eclipse with the needed plugins; the MI interface is used.

+--------------------------+     +---------------------------------------+
|           Host           |     |                Target                 |
|                          |     |                        +------------+ |
| +-----+     +------+     |     |     +------+           | CoreSight  | |
| |     |     | GDB  |     |     |     | GDB  |           | components:| |
| | UI  |<--->|      |<--->|<--->|<--->|      |<--------->|            | |
| |     |  ^  |Client|  ^  |  ^  |     |Server|     ^     | DWT, ETM,  | |
| +-----+  |  +------+  |  |  |  |     +------+     |     | ITM, TPIU  | |
|    ^     |     ^      |  |  |  |        ^         |     | TMC, ETB   | |
|    |     |     |      |  |  |  |        |         |     +------------+ |
+----|-----|-----|------|--+  |  +--------|---------|--------------------+
     |     |     |      |     |           |         |
     |     |     |      |     |           |         |
  Eclipse  |arm-linux-  |     |      arm-linux-     |
           |gnueabihf-  |  TCP/IP    gnueabihf-     |
           |gdb         |   UART     gdbserver      |
        GDB MI     GDB remote               debug: ptrace
                    protocol                trace: perf/CoreSight drivers
3.3. External debugger

These are systems where an external debugger is used. It accesses the
target using JTAG or SWD. The target is usually a bare metal embedded
system or a system with an RTOS. As an example, the following setup is
considered:
- Target: firmware running on an ARM Cortex-M.
- Debugger: external debug and trace device.
- Gdb server: OpenOCD.
- Gdb client: arm-none-eabi-gdb.
- UI: eclipse with the needed plugins; the MI interface is used.

+--------------------------------------+     +-------+     +-------------+
|                 Host                 |     | dbggr |     |   Target    |
|                                      |     |       |     |             |
| +-----+     +------+     +------+    |     |       |     | CoreSight   |
| |     |     | GDB  |     | GDB  |    |     | Debug |     | components: |
| | UI  |<--->|      |<--->|      |<-->|<--->|   +   |<--->|             |
| |     |  ^  |Client|  ^  |Server|    |  ^  | Trace |  ^  | DWT, ETM,   |
| +-----+  |  +------+  |  +------+    |  |  |       |  |  | ITM, TPIU   |
|    ^     |     ^      |     ^        |  |  |       |  |  |             |
|    |     |     |      |     |        |  |  |       |  |  |             |
+----|-----|-----|------|-----|--------+  |  +-------+  |  +-------------+
     |     |     |      |     |           |             |
     |     |     |      |     |           |             |
  Eclipse  | arm-none-  |  OpenOCD        |             |
           | eabi-gdb   |   pyOCD         |             |
           |            |                 |             |
        GDB MI     GDB remote         Ethernet    debug: JTAG/SWD
                    protocol            USB       trace: Serial/Parallel
4. Implementation needs

4.1 Self hosted debug monitor

gdb: arm-linux-gnueabihf-gdb
The interface defined in btrace.h for capturing and processing traces has
to be implemented for ARM CoreSight. Needed actions:
- in btrace-common.h: add the structures needed for capturing and
  handling ETM traces
- in linux-btrace.h:
  - add btrace_tinfo_etm
  - amend btrace_target_info
- in linux-btrace.c: change the following functions to support ETM traces
  - linux_enable_btrace
  - linux_disable_btrace
  - linux_read_btrace
  - linux_btrace_conf
- in arm-linux-nat.c: add an API to
  - configure btrace
  - enable btrace
  - disable btrace
  - read btrace
- in btrace.c:
  - btrace_add_pc
  - btrace_fetch has to be implemented for CoreSight; this means using the
    OpenCSD library to parse ETM traces and then reconstructing the
    executed instructions accordingly (btrace_compute_ftrace_1)
- in record-btrace.c:
  - add a command for showing record btrace etm options
  - add a command for starting tracing with CoreSight, and its handler
    (cmd_record_btrace_etm_start)
  - adapt cmd_show_record_btrace_cpu
  ...

perf:
Needed actions:
- make sure that perf can start/stop tracing a process with its threads,
  collect ETM traces and deliver them to the user
4.2 Remote Debug monitor

The changes described in 4.1 are needed. In addition, to support the
remote protocol, the following changes are needed.

gdb server: arm-linux-gnueabihf-gdbserver
Needed actions:
- in linux-low:
  - linux_low_read_btrace: add support for ETM traces formatting
  - linux_low_btrace_conf: add support for ETM configuration formatting

gdb client: arm-linux-gnueabihf-gdb
Needed actions:
- in remote.c:
  - adapt enable_btrace
  - adapt disable_btrace
- in btrace.c:
  - parse_xml_btrace: update btrace.dtd [2] and the related data
    structures btrace_xxx
  - parse_xml_btrace_conf: update btrace-conf.dtd [3] and the related data
    structures btrace_conf_xxx
- extend remote protocol handling to support CoreSight ETM traces

UI: eclipse
Needed actions:
- make sure that the plugin for recording execution and replaying it
  copes well with arm-linux

The remote protocol needs to be extended by:
1. Adding Qbtrace:CoreSight (or etm) to start collecting ETM traces
2. Amending the 'Branch Trace Format' xml specification to cover the
   transfer of ETM traces
3. Amending the 'Branch Trace Configuration Format' xml specification to
   cover the parameters needed for ETM
4.3 External debugger

The changes described in 4.2 are needed. In addition, to support tracing
a remote target through an external debugger (bare metal embedded
system), the following changes are needed.

gdb server: OpenOCD
Needed actions:
- rework the ETM driver to bring it up to date
- add a driver for configuring trace interconnect IPs
- rework the driver for the TPIU
- integrate support for a trace port analyzer
- extend the remote protocol implementation to support recording

The CoreSight infrastructure of the SoC is to be set in OpenOCD through
configuration files. Parameters that are not relevant for gdb are also
specified in configuration files (trace sink, trace protocol, port size,
trace synch frequency, cycle accurate tracing etc.).

gdb client: arm-none-eabi-gdb
Needed actions:
- extend the remote protocol to support CoreSight ETM traces
- integrate an ETM trace parsing library
- interface the parser to record_btrace_target

In addition to 4.2, the remote protocol needs to be extended by:
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoCs
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing
  multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support filtering
  traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value and
  Qbtrace-conf:CoreSight:filter:end-address=value to support filtering
  traces for given functions/blocks/libs
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value and
  Qbtrace-conf:CoreSight:trigger:off-address=value to support starting or
  stopping tracing when a certain function/block/lib is executed

Alternatively, some of the configuration related to filtering and
triggering can be delegated to the GDB server.

UI: eclipse
- test and verify that existing plugins cope well with the gdb extensions
5. Remote protocol execution sequence

gdb and gdbserver communicate using the gdb remote protocol. On a
semantic level a tracing session runs through the following sequence:
(1) gdb client queries gdb server support for branch trace
(2) gdb server answers with
    - qXfer:btrace:read
    - qXfer:btrace-conf:read
    - Qbtrace:off
    - Qbtrace:CoreSight
    - Qbtrace-conf:CoreSight:xxx, where xxx is the parameter name
(3) gdb client sends the command to start emitting and collecting traces
    (Qbtrace:CoreSight)
(4) gdb server executes the command
(5) gdb client sends the command to stop emitting and collecting traces
    (Qbtrace:off)
(6) gdb server executes the command
(7) gdb client sends the command to get the collected traces from the
    trace sink (qXfer:btrace:read:annex:offset,length)
(8) gdb server executes the command and sends back the collected traces
(9) gdb client parses the traces and reconstructs the target states
6. Remote protocol extensions

The remote protocol needs to be extended with the following primitive to
support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)

The remote protocol can be extended with the following primitives to take
advantage of ETM functionality:
- select the core to trace on: in the case of a multicore system the gdb
  client sends a command to select the core to trace
  (Qbtrace-conf:CoreSight:core=value)
- set the trace id for the traces: gdb client sends a command to set the
  trace id (Qbtrace-conf:CoreSight:id=value)
- select the context to trace: gdb client sends a command to select the
  context to trace (Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace: gdb client sends commands to select
  the address range to trace
  (Qbtrace-conf:CoreSight:filter:start-address=value)
  (Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing: gdb client sends
  commands to select the addresses that trigger tracing
  (Qbtrace-conf:CoreSight:trigger:on-address=value)
  (Qbtrace-conf:CoreSight:trigger:off-address=value)
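
To make the proposed packets concrete, the sketch below frames them the way the GDB remote serial protocol frames any packet: `$`, the payload, `#`, and a two-digit hex checksum (the sum of the payload bytes modulo 256). The Qbtrace:CoreSight and Qbtrace-conf:CoreSight payloads are the ones proposed in this memo and do not exist in GDB today; the core value is a placeholder. Escaping of special characters and run-length encoding are deliberately left out, so this only handles payloads without `$`, `#`, `}` or `*`.

```python
def checksum(payload: str) -> int:
    """Sum of the payload bytes, modulo 256."""
    return sum(payload.encode("ascii")) % 256


def frame(payload: str) -> str:
    """Wrap a payload as $payload#xx (no escaping, see note above)."""
    return f"${payload}#{checksum(payload):02x}"


def unframe(packet: str) -> str:
    """Validate the framing and checksum and return the payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if checksum(payload) != int(packet[-2:], 16):
        raise ValueError("checksum mismatch")
    return payload


# Payloads proposed in this memo (hypothetical, not implemented in GDB).
PROPOSED_SEQUENCE = [
    "Qbtrace-conf:CoreSight:core=0",  # core number is a placeholder
    "Qbtrace:CoreSight",              # start emitting/collecting traces
    "Qbtrace:off",                    # stop emitting/collecting traces
]

framed = [frame(p) for p in PROPOSED_SEQUENCE]
```

A server-side handler would call `unframe()` on each received packet and dispatch on the payload prefix.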
7. Alternatives and discussions

7.1. Scope definition

The CoreSight ETM IP comes in many versions and many implementations.
Depending on its capabilities, it can trace instructions only, or
instructions and the data/data addresses involved. All ETM variants
support instruction tracing and can therefore be used for branch tracing.

7.2. CoreSight infrastructure exposure to the user

This is about assigning the responsibility for configuring the CoreSight
infrastructure to generate and route traces. Two alternatives are
possible:
- CoreSight infrastructure exposed to the gdb client (and UI): in this
  alternative the user or the UI is responsible for configuring the
  CoreSight IPs in the SoC, by accessing their registers directly or via
  the CoreSight driver. The remote protocol is used to configure the
  trace sink (ETB or TPA) to start/stop collecting traces.
- CoreSight infrastructure not exposed outside of gdbserver: in this case
  high level commands can be provided by the gdbserver remote protocol to
  set up and configure the CoreSight IPs in the SoC.

My recommendation is to extend the remote protocol to provide high level
commands to set up and configure the CoreSight IPs in the SoC, or to use
a different channel to pass configuration parameters to the gdb server.

7.3. Parameters needed for parsing traces

Some configuration parameters such as the ETM version, trace id ...
(content of registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR) are needed for
extracting and parsing ETM traces. Those parameters need to be exchanged
between gdb server and client. The following alternatives are possible:
- extend the remote protocol to get those parameters with explicit
  queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
Best Regards
Zied Guermazi
o LLVM
* Machine outliner:
- Rebased on upstream.
- Improved stack alignment handling
- More cleanup before submission
o Misc
* Various meetings and discussions.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Posted v6. Some review from Dave Martin; will need at least
one further revision, and to wait til the kernel patches land.
[VIRT-327 # Richard's upstream QEMU work ]
Fix a reported bug in pauth Auth results.
Fix a reported bug in vector variable shift.
Review Peter's vfp decodetree patch set.
Review of v18 of target/rx. Never-ending, it seems...
Review of some target/ppc vector patches.
Posted v4 of CPUNegativeOffsetState. This looks ready to pull.
r~
Upstream Work ([VIRT-109])
==========================
- spent ages tracking down 64-on-32 cputlb errors which led to:
- adding x86_64 support to TCG system tests
- {PATCH v1 0/4} softmmu de-macro fix with tests Message-Id:
<20190605162326.13896-1-alex.bennee(a)linaro.org>
- {PATCH} cputlb: cast size_t to target_ulong before using for
address masks Message-Id:
<20190606154310.15830-1-alex.bennee(a)linaro.org>
- took over maintainership of orphaned gdbstub
- posted {PULL 00/52} testing, gdbstub and cputlb fixes Message-Id:
<20190607090552.12434-1-alex.bennee(a)linaro.org>
- problems on hackbox led to {PATCH} tests/vm: favour the locally
built QEMU for bootstrapping Message-Id:
<20190607185337.14524-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
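The class of bug behind the "cast size_t to target_ulong" patch above can be modelled in a few lines. This is a Python sketch, not the cputlb code: it assumes a 32-bit host where size_t is 32 bits wide and a 64-bit guest address, and the address and size values are made up. Computing ~(size - 1) in the narrow type and then zero-extending it clears the upper half of the address.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def align_mask_narrow(size: int) -> int:
    """~(size - 1) evaluated in 32 bits, then zero-extended to 64 bits."""
    return ~(size - 1) & MASK32


def align_mask_wide(size: int) -> int:
    """size widened to 64 bits first, then ~(size - 1)."""
    return ~(size - 1) & MASK64


addr = 0xFFFF_0000_1234_5678  # hypothetical 64-bit guest address
size = 0x1000                 # hypothetical region size

buggy = addr & align_mask_narrow(size)  # upper 32 bits are lost
fixed = addr & align_mask_wide(size)    # upper 32 bits are preserved
```

On a 64-bit host both masks are identical, which is one reason such a bug only shows up in 64-on-32 configurations.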
Other Tasks
===========
- Did some initial reading on RPMB
- Half day fire training
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {Qemu-devel} {PATCH v4 00/39} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190604203351.27778-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH v2} target/arm: Vectorize USHL and SSHL
Message-Id: <20190603232209.20704-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH resend} test-thread-pool: be more reliable
Message-Id: <20190530093417.23370-1-pbonzini(a)redhat.com>
* {PATCH 0/4} tests/docker: add podman support
Message-Id: <20190523234011.583-1-marcandre.lureau(a)redhat.com>
* {Qemu-devel} {PATCH 0/2} Implement PowerPC FPSCR flag Fraction Rounded
Message-Id: <20190525022008.24788-1-programmingkidx(a)gmail.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
--
Alex Bennée
[LLVM-122] BTI and PAC support
Committed the LLD work. Modulo bugs this should now be done.
[LLVM-542] LLVM/GCC code size investigation
- Revisited my Zephyr build with clang patches and updated so that it
works with trunk.
- Work out next steps of work.
- Work out how to build an embedded gcc toolchain using the linaro infrastructure.
[Misc]
Reported a bug in gold where it would generate v4t veneers for v8-a CPUs
Still waiting for TK-1 board to finish building clang so that it can
run the testsuite. Hopefully finished over the weekend.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ Got the VFP decodetree conversion patchset out for review
(42 patches, 8 files changed, 3024 insertions(+), 1476 deletions(-))
+ sent a patchset which does the (easy) first step in my plan
for converting QEMU's documentation to Sphinx; sadly all the other
steps are much trickier...
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
- noinit attribute: started work
* Infra
- Fixed Dejagnu configuration issue in ABE which prevented us from
using target variant specifications
- Investigated problems with ABE and failure to cross-build "recent"
glibc (trouble with C++ compiler detection). ABE patches under review.
- Reduced load on APM machines to avoid depending on them too much
== Next ==
GCC:
- handle feedback on FDPIC patches
- noinit attribute
- UBSAN/bare-metal: do more testing
Hi,
Food for thought for today's sync up. I've been writing QEMU plugins to
exercise the plugin system and see what sort of useful information you
can extract when you can control the instruction stream.
For example I now have a plugin that can break down instruction counts
for any given run, for example a kernel boot:
Instruction Classes:
Class: UDEF not counted
Class: SVE (68 hits)
Class: Reserved (0 hits)
Class: PCrel addr (4589078 hits)
Class: Add/Sub (imm,tags) (0 hits)
Class: Add/Sub (imm) (26832113 hits)
Class: Logical (imm) (74304974 hits)
Class: Move Wide (imm) (10933759 hits)
Class: Bitfield (71470957 hits)
Class: Extract (85655 hits)
Class: Data Proc Imm (0 hits)
Class: Cond Branch (imm) (37227632 hits)
Class: Exception Gen (6 hits)
Class: NOP not counted
Class: Hints (244825554 hits)
Class: Barriers (1668558 hits)
Class: PSTATE (202144 hits)
Class: System Insn (7132992 hits)
Class: System Reg (2268308 hits)
Class: Branch (reg) (6280976 hits)
Class: Branch (imm) (18347905 hits)
Class: Cmp & Branch (180167025 hits)
Class: Tst & Branch (4092972 hits)
Class: Branches (0 hits)
Class: AdvSimd ldstmult (0 hits)
Class: AdvSimd ldstmult++ (0 hits)
Class: AdvSimd ldst (0 hits)
Class: AdvSimd ldst++ (0 hits)
Class: ldst excl (160861365 hits)
Class: Prefetch (0 hits)
Class: Load Reg (lit) (12828544 hits)
Class: ldst noalloc pair (0 hits)
Class: ldst pair (60381349 hits)
Class: ldst reg (0 hits)
Class: Atomic ldst (0 hits)
Class: ldst reg (reg off) (0 hits)
Class: ldst reg (pac) (0 hits)
Class: ldst reg (imm) (119597941 hits)
Class: Loads & Stores (0 hits)
Class: Data Proc Reg (113586343 hits)
Class: Scalar FP (0 hits)
Class: Unclassified (0 hits)
You can break down each class to individual instructions. For example
the Hints are mostly:
Individual Instructions:
Instr: wfe (132400072 hits) (op=0xd503205f/ Hints)
Instr: sevl (66433640 hits) (op=0xd50320bf/ Hints)
Instr: yield (29619246 hits) (op=0xd503203f/ Hints)
Instr: wfi (2865 hits) (op=0xd503207f/ Hints)
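For readers who have not seen this kind of classifier, a breakdown like the one above can be produced by matching each 32-bit opcode against mask/pattern pairs. The sketch below is not the plugin's code: it handles a single class, using the A64 hint encoding (0xd503201f with the CRm:op2 field in bits 11:5 masked out) and the four opcodes from the listing above. The sample stream is made up, and a real table would list more specific entries (such as NOP) ahead of the generic hint pattern.

```python
from collections import Counter

# A64 hint space: 0xd503201f with CRm:op2 (bits 11:5) free to vary.
HINT_MASK = 0xFFFFF01F
HINT_PATTERN = 0xD503201F

# Opcode values taken from the listing above.
HINT_NAMES = {
    0xD503205F: "wfe",
    0xD50320BF: "sevl",
    0xD503203F: "yield",
    0xD503207F: "wfi",
}


def classify(opcode: int) -> str:
    """Return the class name for an opcode (only 'Hints' is modelled)."""
    if (opcode & HINT_MASK) == HINT_PATTERN:
        return "Hints"
    return "Unclassified"


def count(opcodes):
    """Count hits per class and, for hints, per individual instruction."""
    classes, insns = Counter(), Counter()
    for op in opcodes:
        cls = classify(op)
        classes[cls] += 1
        if cls == "Hints":
            insns[HINT_NAMES.get(op, hex(op))] += 1
    return classes, insns


# Made-up sample stream: two wfe, one sevl and one non-hint opcode.
SAMPLE = [0xD503205F, 0xD503205F, 0xD50320BF, 0x91000000]
classes, insns = count(SAMPLE)
```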
So I'm looking for a similar experiment that would be useful for the
memory sub-system. When I chatted to Maxim we thought maybe a simplified
cache line simulator might be useful. The aim wouldn't be to simulate
what a real cache might do but to be useful say for identifying regions
of code which might be susceptible to cache line bouncing. So as
compiler writers what sort of run time memory behaviour would you like
to track? What sort of information would be useful to extract with such
a tool?
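As a strawman for the simplified cache line idea, the sketch below counts how often the last writer of a line changes from one CPU to another. It is not a cache model: it assumes 64-byte lines, ignores capacity, associativity and reads, and the `store()` hook is a hypothetical entry point that a memory-tracing plugin callback would feed.

```python
from collections import defaultdict

LINE_SHIFT = 6  # assumes 64-byte cache lines


class BounceTracker:
    """Count writer changes per cache line as a proxy for line bouncing."""

    def __init__(self):
        self.last_writer = {}            # line number -> cpu index
        self.bounces = defaultdict(int)  # line number -> writer changes

    def store(self, cpu: int, addr: int) -> None:
        """Record a store by `cpu` to `addr`."""
        line = addr >> LINE_SHIFT
        prev = self.last_writer.get(line)
        if prev is not None and prev != cpu:
            self.bounces[line] += 1
        self.last_writer[line] = cpu

    def hottest(self, n: int = 10):
        """Return up to n (line base address, bounce count) pairs."""
        ranked = sorted(self.bounces.items(), key=lambda kv: kv[1],
                        reverse=True)
        return [(line << LINE_SHIFT, hits) for line, hits in ranked[:n]]


# Made-up trace: two CPUs ping-pong on one line, one CPU owns another.
tracker = BounceTracker()
for cpu, addr in [(0, 0x1000), (1, 0x1008), (0, 0x1010),
                  (0, 0x2000), (0, 0x2004)]:
    tracker.store(cpu, addr)
```

Mapping the reported line addresses back to symbols would then point at the code or data layout responsible.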
I'm open to ideas ;-)
--
Alex Bennée
Four day week.
[VIRT-327 # Richard's upstream QEMU work ]
Reviewed s390 fp vector patch set.
Posted v16 rx. This seemed so close to being ready
last week, but now I don't know. I think I should
quit pushing it myself and let Yoshinori do more of
the lifting here.
Reviewed avr v20 patch set.
Reviewed Alex's testing patch set.
Submitted patches to constify upstream capstone
(500k from .data to .rodata).
r~
Short week (off Thursday/Friday)
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions / fixed some testcases
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
* Infra
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: do more testing
== This Week ==
* PR88837 (7/10)
- Addressed all upstream suggestions.
- Found a (hackish) way to test patch with qemu (multiple issues).
- Sorting thru "strange" testsuite fallout most of which seems
unrelated to my patch.
* PR88833 (2/10)
- Looking at fwprop pass
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks.
(Short week, three days.)
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ continuing with the conversion of the VFP decoder to 'decodetree'.
With some useful advice from RTH I have now got a big chunk of
it done, and it looks like this will provide:
- better places to put "UNDEF if CPU doesn't have double support" checks
- checks of "VFP enabled?" only after all UNDEF checks have happened
- cleanup of a lot of code that uses some TCG globals cpu_F0 and cpu_F1,
which is weird ancient style and overdue for a cleanup
- a VFP decoder which isn't a single thousand-line function with multiple
nested switch statements
thanks
-- PMM
3 day week.
[LLVM-122] BTI and PAC support in lld, llvm-readobj, llvm-objdump
- Now in upstream review. Most of the week spent writing and updating tests.
Some time reviewing some asm goto patches.
Planned absences:
Holiday Friday 31st May.
== Progress ==
* Out of office 22 & 24 May
* [GlobalISel] Refactor CallLowering [LLVM-568]
- In progress, likely going to take a while
- Found a minor bug in the lowering for AArch64 (I can get it to
crash on some edge case), not sure if it's worth fixing independently
since it gets fixed anyway with the refactoring that I have in
progress
- Trying to understand an AMDGPU failure
== Plan ==
* Out of office 29 May - 10 June
* More of the same
o LLVM
* 8.0.1-rc1 ARM and AArch64 Binaries uploaded.
* Buildbots: One fix committed upstream.
* Machine outliner:
- Fixed liveness issue.
- Preparing patch for re-submission
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Merged!
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Started dusting off and rebasing wip.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Started reviewing the kernel patch set for this feature.
[VIRT-327 # Richard's upstream QEMU work ]
PR for tcg gvec work.
PR for Sato-san's RX target.
Patch set to update capstone and enable s390x.
GSOC: Review v3 of Jan's enable risu for x86 patch set.
[Other]
Travel arrangements for Xilinx meeting in San Jose, June 13.
Will need to pick Peter's brain re m-profile before then.
r~
== This Week ==
* PR88833 (4/10)
- Started investigating the issue; it seems that code-movement RTL
passes such as cse2 do not remove identical register copies,
resulting in an extra register move.
* PR88837 (5/10)
- Patch almost approved offline by Richard, who suggested I move the
discussion upstream.
- Observed a "strange" issue with return value vectors on qemu in
run-time tests for fixed-length vectors. It turned out to be due to a
mismatch in vector length between compile time and run time ;-)
- Trying to run SVE tests with qemu.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
[LLVM-122] BTI and PAC support in LLD
Implementation now working, have written BTI tests, next step is
finishing off PAC tests.
[Misc]
Helped out debugging an asm-goto problem on ARM targets.
Investigated a GNU ld LMA overlap when VMA and LMA got out of sync.
Helped out with CMSIS use of ld scripts when using a fast-model,
needed to get LMA == VMA for program header covering BSS.
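A rough sketch of the kind of check involved in the LMA overlap
investigation above: given each output segment's VMA, LMA and size,
report pairs whose load-address ranges intersect. The Segment type,
the function name and the addresses in the usage note are made up for
illustration and are not taken from the actual linker scripts.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    vma: int   # address the segment runs at
    lma: int   # address the segment is loaded at
    size: int


def lma_overlaps(segments):
    """Return pairs of segment names whose load-address ranges intersect."""
    placed = sorted((s for s in segments if s.size > 0), key=lambda s: s.lma)
    clashes = []
    for i, a in enumerate(placed):
        for b in placed[i + 1:]:
            # placed is sorted by LMA, so once b starts at or beyond the
            # end of a, no later segment can overlap a either.
            if b.lma >= a.lma + a.size:
                break
            clashes.append((a.name, b.name))
    return clashes
```

For example, a .data segment with lma=0x0800 overlaps a 0x1000-byte
.text loaded at 0, while placing it at lma=0x1000 does not.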
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- synced up with Emilio, will take over branch and submit
- latest branch is [plugins/plugins-v3]
- will peel off simple clean-ups and tweaks next week
- then need to split up some more and better separate code
- exposed plugin_disas for the "howvec" instruction counter
- some [example] [output] while booting kernel
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[plugins/plugins-v3]
https://github.com/stsquad/qemu/tree/plugins/plugins-v3
[example] http://ix.io/1JXC
[output] http://ix.io/1JXl
GSoC Mentoring ([VIRT-348])
- planning for start of coding next week
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
Upstream Work ([VIRT-109])
==========================
- prepared [testing/next] for PR
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[testing/next] https://github.com/stsquad/qemu/tree/testing/next
Completed Reviews [3/3]
=======================
{RISU v3 00/11} Support for i386/x86_64 with vector extensions
Message-Id: <20190523204409.21068-1-jan.bobek(a)gmail.com>
{PATCH v10 00/20} gdbstub: Refactor command packets handler
Message-Id: <20190521095948.8204-1-arilou(a)gmail.com>
- CLOSING NOTE [2019-05-24 Fri 17:30]
awaiting re-spin with tags applied.
{RFC v2 00/38} Plugin support
Message-Id: <20181209193749.12277-1-cota(a)braap.org>
- CLOSING NOTE [2019-05-24 Fri 17:47]
taking over the tree
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {PATCH 0/5} tests/vm: Python 3, improve image caching, and misc
Message-Id: <20190329210804.22121-1-wainersm(a)redhat.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {RFC v4 0/7} Baby steps towards saner headers
Message-Id: <20190523081538.2291-1-armbru(a)redhat.com>
* {Qemu-arm} {PATCH v2 0/4} hw/arm/boot: handle large Images more gracefully
Message-Id: <20190516144733.32399-1-peter.maydell(a)linaro.org>
* {Qemu-devel} {PATCH v12 00/12} Add RX archtecture support
Message-Id: <20190514061458.125225-1-ysato(a)users.sourceforge.jp>
* {PATCH 00/13} target/arm/kvm: enable SVE in guests
Message-Id: <20190512083624.8916-1-drjones(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ sent patchset fixing a handful of simple GICv3 bugs
+ usual codereview work
+ sent out a sketch of how we can transition our documentation
from the current texinfo manual to a set of sphinx manuals
+ had a look at the practicalities of converting our hand-written
VFP decoder to 'decodetree' -- this may be the easiest way to
support FPU configs which only support single-precision, like Cortex-M33
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: Updated patch 03/21 with changes in the handling of -static
according to feedback. Pinged the whole series.
* GCC upstream validation:
- reported a couple of regressions
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
== This Week ==
* PR88837 (9/10)
- Tweaked patch to handle few more special cases with suggestions from
Richard.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
o LLVM
* 8.0.1-rc1 Started Binaries build.
* Buildbots babysitting:
- Two fixes committed upstream
* Machine outliner:
- Liveness information is not accurate after FrameLowering,
investigation ongoing.
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Posted v7 and v8. I think this is now ready for merge,
but I said that last week as well. :-P
[VIRT-327 # Richard's upstream QEMU work ]
More gvec work, some of which applies to target/arm,
and some to tcg/aarch64/, but all of which is in support
of David's target/s390x work. Should be coming to a
close on that soon.
Posted v7 of my do_syscall split.
Reviewed v13 of the RX target, adjusted it slightly for
my tlb_fill changes. I think this is now ready to merge.
r~
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ finally managed to complete review of Damien's reset handling rework
+ rolled v2 of patchset to support booting large kernel images
+ sent a cleanup patchset to rename arm.h to boot.h
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ working on making the CPU model configurable without FPU or DSP,
so that we can correctly model the Musca-A and MPS2-AN521 boards as
not having FPU or DSP on CPU #0.
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: sent updated patch series (v5). Received feedback about -static
support vs dynamic-linker need. Discussing options.
* GCC upstream validation:
- reported a couple of regressions
- found a bug in qemu while testing v4.0.0, preparing a reproducer
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- cleanup
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
== Progress ==
* [GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- While looking into this I found and fixed a bug in the generic
part of IRTranslator, which reduced the number of fallbacks on the ARM
test-suite by about 20%
- Currently working on lowering function calls etc for 64-bit types
* [GlobalISel] Refactor CallLowering [LLVM-568]
- The CallLowering interface really needs a cleanup before I
continue with LLVM-310
- This has been discussed upstream in the past and would benefit all
targets, so I'm going to give it a shot
* SVE2 code reviews
== Plan ==
* More of the same
* Out of office 29 May - 10 June
== Progress ==
* Out of office on Friday (sick)
* [GlobalISel] Better support for small types [LLVM-553]
- Committed upstream
* GlobalISel
- quickfix for a DBG_VALUE-related bug
- code reviews
* SVE code reviews
* Catching up on Connect / EuroLLVM
== Plan ==
* More of the same
* Out of office end of May - beginning of June
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Posted v4, v5, v6. I think this is now ready for merge.
[VIRT-327 # Richard's upstream QEMU work ]
Posted v3 of the CPUNegativeOffset patch set.
Posted v2, v3, and a pull request for the tlb_fill patch set.
Debugged one more fix for Sparc testthreads.
Reposted some long dormant linux-user fixes.
Started reviving the do_syscall split patch set,
since Laurent asked after it.
r~
o 4 days week.
o LLVM
* Machine outliner:
- Fixed LR save issue, when saved into a register.
- Dealing with LR save/restore when outlined region is a
pop{...,PC} tail-call.
- Investigating potential issue with condition flags.
o Misc
* Various meetings and discussions.
[LLVM-158] buildbot maintenance
- Increased timeouts on some libfuzzer tests, aarch64 full bots should
fail less frequently under load.
[LLVM-534] -n -N support in LLD (needed for Linux kernel allyesconfig
CI with LLD on AArch64)
Rewrote using a different approach after upstream comments
[LLVM-122] BTI and PAC support in LLD
Wrote an implementation, it compiles, but completely untested as of today.
(short week: 3 days)
Brief writeup of a pair of talks I attended on Tuesday at the
Cambridge University Computer Lab by some people from Amazon:
Diana Popa talked about Amazon's new "Firecracker" VMM (virtual
machine monitor -- the userspace component that uses the kernel's KVM
APIs to create and control virtual machines; kvmtool and QEMU are both
VMMs). Their use case is the AWS Lambda service, where VMs are
generally fairly short-lived (on the order of hours), startup time
matters a lot, and the VMs typically don't need very much CPU/RAM
resource. Firecracker is written in Rust, and provides a very simple
guest device model (virtio block and network devices), booting a
kernel that knows it is virtualized. It boots the kernel directly,
without running a BIOS. It has a memory footprint of less than 5MB and
a boot time of 125ms. They are currently working on Arm support (they
have it booting, but some bits still need work, eg the VM doesn't get
the right time because there is no RTC device exposed to the guest).
My feeling was that this shows an advantage of the KVM design: the
kernel/userspace split makes it easy to replace the userspace VMM
part with something customised for the task at hand if you don't
need a full-fat all-bells-and-whistles general-purpose solution.
Andreea Florescu talked next, about the "rust-vmm" libraries. This is
a set of open-source Rust crates which are intended to abstract out
some of the common building blocks for VMMs. Firecracker started as a
fork of Google's crosvm project, but since the use-case requirements
for the two projects are markedly different the code diverged fairly
rapidly. rust-vmm is intended to allow the projects to share code for
things like "nice Rust interfaces to the KVM ioctls" and
"implementations of virtio devices". The project is still in quite an
early stage of development -- they have a few crates that have made it
to the "stable, published on crates.io" phase, but most are either in
"being developed" or still just "planned/proposed/discussed". It's
currently Apache-2.0 licensed, but they are planning to dual-license
to Apache-2.0 | 3-BSD because Apache-2.0 isn't GPL-2.0 compatible, and
they have had some interest in being able to experiment with using
these crates with QEMU. (That sounds a bit outlandish but it's
actually something I'm planning to look into myself -- the nice thing
about Rust is that you can potentially incrementally add it to an
existing C codebase without requiring a ground-up rewrite, so allowing
security hardening of the more "risky" parts. This is very definitely
all still just "exploratory prototyping" though.)
Progress:
* just miscellaneous upstream stuff
thanks
-- PMM
* 1 day off (public holiday)
== Progress ==
* FDPIC
- rebased GCC FDPIC patches. Fixing conflict with fstack-protector.
* GCC upstream validation:
- Fixed ST internal validation broken since GCC bumped to version 10.
Still some spurious failures probably caused by NFS. Testing
workarounds.
- reported a couple of regressions
* GCC
- ubsan on bare-metal toolchain: no news.
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
== Next ==
FDPIC:
- GCC: fix problems with fstack-protector
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
o 4 days week.
o LLVM
* Machine outliner:
- Identified an issue related to LR saving inside an outlined
chunk, working on a proper fix.
o Misc
* Various meetings and discussions.
[VIRT-327 # Richard's upstream QEMU work ]
Review Mark's target/ppc getVSR patch set.
Two rounds of "tcg vector improvements"; hopefully that's
ready to go in on Monday.
More work on "bit select" and "compare select" primitives.
I can now vectorize Neon VSHL/VSHR variable shift (where
positive values are left shift and negative values are
right shift). Waiting on posting this while previous tcg
vector patch set is still in flight.
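For reference, a per-element sketch of the VSHL (register) semantics
mentioned above: the shift count is the signed low byte of the second
operand's element, positive counts shift left and negative counts
shift right. This is a plain Python model of a single lane, not the
tcg vector implementation; the function names are invented here.

```python
def signed_low_byte(x):
    """Interpret the low 8 bits of x as a signed value in [-128, 127]."""
    b = x & 0xFF
    return b - 0x100 if b >= 0x80 else b


def vshl_u(value, shift_operand, esize=32):
    """Unsigned variable shift of one element of esize bits."""
    mask = (1 << esize) - 1
    value &= mask
    shift = signed_low_byte(shift_operand)
    if shift >= esize or shift <= -esize:
        return 0                      # everything shifted out
    if shift >= 0:
        return (value << shift) & mask
    return value >> -shift            # logical right shift


def vshl_s(value, shift_operand, esize=32):
    """Signed variable shift of one element of esize bits."""
    mask = (1 << esize) - 1
    value &= mask
    if value >> (esize - 1):
        value -= 1 << esize           # sign-extend into a Python int
    shift = signed_low_byte(shift_operand)
    if shift >= esize:
        return 0
    if shift >= 0:
        return (value << shift) & mask
    # Arithmetic right shift; large counts saturate to all sign bits.
    return (value >> min(-shift, esize - 1)) & mask
```

The two branches per lane are what make "bit select" / "compare
select" style primitives useful: both shifts can be computed for the
whole vector and the result picked per element.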
Review Alex's demacrofy v5. Wrote a boot.S for Alpha.
Review David's latest target/s390 vector patch set.
Review Sato-san's target/rx v8. Played around with a few
disassembler improvements, but I'll not confuse the review
process by posting them now.
r~
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ pushed QEMU 4.0 out the door
+ code review:
- RTH's patchset that cleans up the softmmu TLB structs
- Nios2 nommu and semihosting patchset from codesourcery
- cleanup series removing a "bucket of random stuff" header file
- RTH's patchset adding BTI support for linux-user mode
- RTH's patchset cleaning up the tlb_fill API
- RTH's patchset implementing Cortex-A73, A75, A76
- "SBSA reference platform" new board model
- patchset adding Netduino Plus 2 board model
- linux-user patch to correctly handle loading ELF segments which
have no file data (ie only bss)
- patch adding the RTC device to the ASpeed board models
- patchset fixing various minor problems preventing QEMU building
cleanly for Windows-on-Arm
- started looking at Damien's patchset that overhauls how we do
device reset; this is good work that's long overdue, but reviewing
it requires me to wrap my head around the problem space...
+ sent out v2 versions of a few minor patches that needed respins
+ wrote email to qemu-devel asking for volunteers to help with
QEMU release work so it's not only me doing this every cycle
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ FPU support now upstream
+ a few loose ends remain to be tidied up, but this epic is
now essentially complete
NB: out of office Tues 7th afternoon to attend a couple of lectures
at the CL by people from Amazon on their virtualization stack written
in Rust (http://talks.cam.ac.uk/talk/index/119491 and
http://talks.cam.ac.uk/talk/index/121069)
thanks
-- PMM
[LLVM-158] Buildbot monitoring duty
- Reported bug that libc++ when built as part of libfuzzer is not
built with PIC or PIE, yet some tests for non-x86 force PIE which then
fails at link-time.
- Reported bugs in libstdc++ and clang where exception specifications
didn't match due to extra parentheses. libstdc++ now fixed to not have
any discrepancy, clang bug for not ignoring the extra parentheses
still active.
- Investigated libfuzzer intermittent failures, 2 look like timeouts
not being long enough, submitted patch to get this increased.
[LLVM-122] BTI/PAC Started prototyping an implementation based on top
of the yet to land LLD patch for Intel CET.
Think about how to add crypto extensions without overriding
architecture in a complex build system.
Review comments for LLD and compiler-rt, and mailing list proposal for
something similar to __attribute__((at(address))).
* 1 day off (public holiday)
== Progress ==
* FDPIC
- Looked at the increase in gdbserver memory consumption since release 7.5.
Found similar results to Prathamesh. On arm-linux-gnueabihf with a
sample test program, gdbserver memory usage increased from ~500kB to
~1.5MB. But that should not prevent execution on board (which has
16MB); maybe memory fragmentation?
- rebased GCC FDPIC patches. There's a regression since Thomas
committed fixes to fstack-protector.
* GCC upstream validation:
- ST internal validation broken since GCC bumped to version 10. I was
using an old glibc. Upgrading glibc proved to be painful (requiring
new versions of make, bison, python....). Still using RH6 servers.
* GCC
- ubsan on bare-metal toolchain: Sent an email to llvm-dev list,
requesting help in how-to-cross-build runtime libs in clang/llvm. No
response so far....
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- fixed legacy binutils regression testing by switching to new slaves
- sent patches to support new slave (tcwg-lc-01)
- sent ABE patches to support new gcc9 config, and update to latest-rel config
== Next ==
FDPIC:
- GCC: fix problems with fstack-protector
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
Infra:
- Fix ST internal validation
== Progress ==
* Out of office 1 day (public holiday)
* [GlobalISel] Better support for small types [LLVM-553]
- Fixed the bug that I'd been looking into
- Committed support for several instructions, only 3 left to commit next week
* IR SVE Reviews [LLVM-545]
- Looked into the patches for stack management
* GlobalISel code review
- Currently looking into an unpleasant patch adding a new opcode
* Catching up on Connect / EuroLLVM
== Plan ==
* More of the same
* Out of office end of May - beginning of June
[VIRT-327 # Richard's upstream QEMU work ]
Another round on launchpad 1824853, TB overflow.
This time handling relocation overflow. Which would
not be seen on an x86 host (2GB displacement), but
would affect some of the risc hosts.
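A small sketch of why the host matters here: whether a branch or
relocation displacement fits depends on the width of the immediate
field. The helper names are invented for illustration, and the model
ignores details such as x86 measuring rel32 from the end of the
instruction. The 32-bit byte displacement (about +/-2GB) and the
AArch64 B/BL 26-bit word displacement (+/-128MB) are the only field
widths assumed.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement field."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def branch_in_range(src, dst, imm_bits, scale=1):
    """True if a branch from src to dst fits an imm_bits field.

    scale is the unit of the immediate in bytes (4 for AArch64 B/BL).
    """
    disp = dst - src
    if disp % scale:
        return False                  # target not aligned to the unit
    return fits_signed(disp // scale, imm_bits)
```

A 1GB jump passes branch_in_range(0, 1 << 30, 32) but fails
branch_in_range(0, 1 << 30, 26, scale=4), so a buffer layout that never
overflows on an x86 host can still overflow on a RISC host.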
Reviewed Peter's v7m fpu patches.
Another round on util/path.c, fixing the startup loop
that we get into for using a full chroot for -L.
Poked my nose into Alex's cputlb demacrofy patch set.
Hopefully the feedback was helpful...
First two pull requests for 4.1.
r~
[PR40542] Sent patch for -n and -N support in LLD for upstream review
[LTO]
Investigated problems when using -Os -Oz with LTO, raised 2 PRs
- error if clang linker invocation uses -Os and -Oz
- strange error message when .bc used as a file extension for a
separate compile and link step
Crash in GNU ld when linking LLVM lto via the gold plugin. Looks like
a memory access/corruption problem in the conversion from .bc to bfd.
Other miscellaneous reviews for some linker script support in LLD.
Investigation into why ld.bfd with NOLOAD on the .gnu.build-id section
corrupts debug information.
== This Week ==
* PR88837 (7/10)
- Discussed and finalized algorithm for vector construction with Richard.
* GNU-606 (1/10)
- Experimented with gdbserver memory consumption under docker.
* Public Holiday (2/10)
== Next Week ==
- Continue ongoing tasks.
== Progress ==
* Short week (Out of office 22 - 24 April)
* [GlobalISel] Better support for small types [LLVM-553]
- Investigated my bug some more, it doesn't seem to be related to my
recent patches but rather an existing issue which is exposed because
we select more functions now
- Might be related to LLVM-456 (Fix frame index sizes for i<32)
* Catching up on Connect / EuroLLVM
== Plan ==
* Try to confirm root cause of LLVM-553 issue
[VIRT-327 # Richard's upstream QEMU work ]
Fix TranslationBlock overflow, launchpad 1824853.
Investigated launchpad 1824768, i386 emulation on arm32.
But works-for-me.
A bunch of work on new gvec primitives. Primarily to
support David Hildenbrand's target/s390 conversion, but
it does enable more vectorization in target/arm as well.
r~