[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Rebased on master. Now trying to remember the incantation that produced a
minimum number of insns before the kernel actually tries to use this. At
present things seem to be crashing before I even get that far, as if I've
misconfigured something.
[VIRT-327 # Richard's upstream QEMU work ]
Fixed a couple of bugs in my tcg/ppc host vector patch set.
Reviewed qemu kvm sve patches.
Reviewed target/ppc altivec optimization patches.
Reviewed GSoC risugen x86 patches.
Other misc upstream review.
r~
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- more work on [v4 branch] but cutting it too fine for 4.1
- reworking the memory tracing to track mmu_idx
- hope to have v4 posted on Monday if I can squash the bugs
- the plugin call isn't getting the full TCGMemOpIdx (maybe only
TCGMemOp?)
- some interest on list from HW manufacturers
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[v4 branch] https://github.com/stsquad/qemu/tree/plugins/plugins-v4
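For readers unfamiliar with the memop index mentioned in the plugin notes above: as far as I recall, QEMU packs the memory operation and the MMU index into a single integer, with the MMU index in the low 4 bits. A minimal Python sketch of that packing follows. The function names mirror QEMU's C helpers, but the 4-bit width is stated from memory and this is an illustration only, not the plugin API. It shows why a callback that receives only the memop part cannot recover mmu_idx.

```python
MMU_IDX_BITS = 4  # assumption: the low 4 bits carry mmu_idx


def make_memop_idx(memop: int, mmu_idx: int) -> int:
    """Pack a memop and an MMU index into one combined value."""
    if not 0 <= mmu_idx < (1 << MMU_IDX_BITS):
        raise ValueError("mmu_idx does not fit in the low bits")
    return (memop << MMU_IDX_BITS) | mmu_idx


def get_memop(oi: int) -> int:
    """Recover the memop part of a combined value."""
    return oi >> MMU_IDX_BITS


def get_mmuidx(oi: int) -> int:
    """Recover the MMU index part of a combined value."""
    return oi & ((1 << MMU_IDX_BITS) - 1)
```

If the plugin hook is handed get_memop(oi) instead of oi, the mmu_idx bits have already been shifted away, which would match the symptom described.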
GSoC Mentoring ([VIRT-348])
- reviewed {Qemu-devel} {PATCH v2 0/4} dumping hot TBs Message-Id:
<20190624055442.2973-1-vandersonmr2(a)gmail.com>
- worked up some [suggestions for HMP interface and refactoring]
- first evaluation period work
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
[suggestions for HMP interface and refactoring]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
Upstream Work ([VIRT-109])
==========================
- posted {PULL 00/19} testing/next (tests/vm, Travis and hyperv build
fix) Message-Id: <20190624134337.10532-1-alex.bennee(a)linaro.org>
- bit late, will wait until pm215 returns
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Other
=====
- booking flights for Connect
- submitted a talk for KVM Forum (plugins)
Completed Reviews [2/2]
=======================
{Qemu-devel} {PATCH v2 0/4} dumping hot TBs
Message-Id: <20190624055442.2973-1-vandersonmr2(a)gmail.com>
- CLOSING NOTE [2019-06-27 Thu 12:39]
Made a bunch of notes and [tweaks]
[tweaks]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
{PATCH} Makefile: Rename the 'vm-test' target as 'vm-help'
Message-Id: <20190531064341.29730-1-philmd(a)redhat.com>
- CLOSING NOTE [2019-06-27 Thu 12:40]
Queued to my tree
Absences
========
- June 21st
Current Review Queue
====================
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
* {PATCH 0/3} tests/acceptance: Add tests for the Leon3 board
Message-Id: <20190627115331.2373-1-f4bug(a)amsat.org>
* {PATCH 0/5} tests/acceptance: Add bFLT loader linux-user test
Message-Id: <20190625101524.13447-1-philmd(a)redhat.com>
* {PATCH v2 0/9} KVM: arm/arm64: vgic: ITS translation cache
Message-Id: <20190611170336.121706-1-marc.zyngier(a)arm.com>
* {Qemu-devel} {RFC PATCH 0/7} Proof of concept for Meson integration
Message-Id: <1560165301-39026-1-git-send-email-pbonzini(a)redhat.com>
* {PATCH 00/59} KVM: arm64: ARMv8.3 Nested Virtualization support
Message-Id: <20190621093843.220980-1-marc.zyngier(a)arm.com>
--
Alex Bennée
== Progress ==
* GCC:
- FDPIC: No progress, still waiting for feedback
- noinit attribute: no feedback yet
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- managed to reproduce similar error messages
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- PR24709 (linker crash and assertion failure with CMSE). CMSE stubs
do not support long branches. Tried two approaches, but didn't find a
solution yet
- resurrected a thread about non-contiguous memory regions support in
the BFD linker
* misc:
- reported regressions in qemu, found with the GCC testsuite
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
- GNU-583
== Progress ==
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Committed upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- In progress
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Got some results with clang-3.9.1 and clang-8.0.0, trying to work
around a perf annotate issue so I can investigate
== Plan ==
* Continue with these
* Friday off
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: no feedback yet
* Binutils:
- started looking at PR24709 (linker crash and assertion failure with CMSE)
* misc:
- ABE: pushed fix to glibc-2.29+ builds, tcwg-backport job now works again
- forwarded GCC/Linaro bug #5314 to upstream; quickly fixed by richi
- looked at a couple of old Jira cards. Not sure how to resume work on
GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
o 4 days week (1 training day)
o LLVM:
* 8.0.1-rc2: ARM and AArch64 binaries built and uploaded.
* Buildbots babysitting: couple of issues reported.
* Machine Outliner:
- Adding testcases before upstream submission.
o Misc
* Various meetings and discussions.
== Progress ==
* Out of office on Friday (bank holiday)
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Patches upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- Started looking into call lowering for 64-bit types
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Trying to reproduce results from Connect, hit a little snag with Jenkins
== Plan ==
* Continue with these
[LLVM-542] Build Zephyr with clang
- Spent quite a bit of time working out why a Clang-built Zephyr hello
world wouldn't boot, tracked down to a missing clobber list in an
inline assembly block
- Wrote some scripts to collect code size information on the samples.
Some initial figures on mostly cortex-m3 put llvm -Oz trunk about 2%
larger than arm-none-eabi-gcc (9.1.1) -Os with frame pointers
disabled. The samples are making very little use of the library
(newlib built with arm-none-eabi-gcc).
- Working out which samples will build with cortex-m0.
- Investigated the latest version of Bloaty McBloatface, a code size tool
from Google. Has some interesting features including an inline
detection feature that can map a portion of a function's code size to
inlined functions.
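The code size comparison described above boils down to a relative difference per sample and over the whole set. A minimal sketch of that arithmetic, assuming the sizes have already been collected into a mapping of sample name to (gcc bytes, clang bytes); the function names and the input shape are hypothetical and are not taken from the actual scripts.

```python
def size_delta_percent(baseline_bytes: int, candidate_bytes: int) -> float:
    """Return how much larger (positive) or smaller (negative) the candidate is, in percent."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * (candidate_bytes - baseline_bytes) / baseline_bytes


def summarize(samples: dict) -> tuple:
    """samples maps a sample name to a (gcc_bytes, clang_bytes) pair.

    Returns the per-sample deltas and the delta over the summed sizes.
    """
    per_sample = {
        name: size_delta_percent(gcc, clang)
        for name, (gcc, clang) in samples.items()
    }
    total_gcc = sum(gcc for gcc, _ in samples.values())
    total_clang = sum(clang for _, clang in samples.values())
    return per_sample, size_delta_percent(total_gcc, total_clang)
```

Reporting the delta over the summed sizes as well as per sample keeps a few tiny samples from dominating the headline figure.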
Misc:
LLD reviews and mailing list comments.
Planned Absences:
Likely to take some holiday around 13th July
Week ending 23 June is very short.
* Patch review
- target/ppc vsr cleanup
- target/tricore translator loop conversion
- target/arm vfp decodetree cleanup
- cortex-strings strrchr fix
- continuing on the plugin api
* Xilinx meeting
* Fix qemu assert for clyon
- Found two other bugs in the process.
* Debugging my own USHR/SSHR patch vs aa32.
r~
Progress: (very short week, 2 days)
* VIRT-65 [QEMU upstream maintainership]
+ found and sent patches for a handful of minor M-profile bugs
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ the final patches for this have now gone upstream and I have
marked the JIRA issue as resolved. There may still be minor
bug fixes but we can handle those under the usual 'upstream
maintainership' JIRA
thanks
-- PMM
Hello folks,
I got a few people asking me to do this in the last Connect, so I've
proposed a beginner session that explores gcc under the hood. The
tentative plan I have for the talk is:
1. A high level view of how the source code is laid out
2. Front end, middle end, backend. This includes a high level
introduction of GIMPLE and machine descriptions
3. A walkthrough of one or two simple programs and usage of diagnostic
flags like -fdump-*
Additional suggestions are most welcome. Also, I was thinking maybe
it would be good to have a llvm under the hood talk along similar
lines. Thoughts?
Siddhesh
[LLVM-571] Build GNU rmprofile toolchain with Linaro scripts (abe)
The existing build-system was only set up to build the A-profile bare
metal toolchain. Managed to find the right combination of flags and
modifications to get a toolchain that zephyr can use.
[LLVM-158] Buildbot maintenance
An interesting failure introduced in LLD, but causing segfaults in
2-stage build, now fixed.
[LLVM-542] Zephyr code size investigation
- Rebased modifications to Zephyr
- Wrote script to build all the examples with GCC and Clang
- Fixed problems with modifications found by building all the examples
- Clang built helloworld no longer booting, need to investigate
- Found some areas for more investigation:
-- llvm-objcopy missing --gap-fill (used by one of the sample programs)
-- lld missing --print-memory-usage, while I'm using gcc for the main
link, zephyr build system seems to be feature testing using clang bare
metal default linker (lld)
-- clang always generates .note.GNU.stack, gcc embedded does not,
leading to lots of orphan section warnings. Probably best solved by
linker script modification.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ some work on getting Sphinx to generate manpages, with a conversion
of the qemu-ga manpage from texinfo as the demonstration case
+ set up to run pre-merge tests on our packet.net qemu-test machine
rather than the gcc compile farm one (as the latter is running an
ancient ubuntu whose python is now too old to build QEMU with)
+ sent some follow-on cleanup patches now that VFP decodetree is in master
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ sent out patches which correctly make the Cortex-M33 (and the -M4)
implement single-precision-only floating point, so the double-precision
instructions UNDEF as they should
+ once those and the other on-list patches have made their way through
code review and into master this epic will be complete
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: patch posted for upstream review
* misc:
- training 3 days
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- UBSAN/bare-metal: do more testing
== Progress ==
* Catching up after holidays
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Still in progress, but I fixed the AMDGPU failure
- Decided to go a bit deeper than initially intended, since it gets
really awkward otherwise
* IR SVE Reviews [LLVM-545]
- New version of the patch was posted, had a quick look
== Plan ==
* Continue LLVM-568
* Start LLVM-134 (LLVM SPEC2k6 Performance Analysis Leftovers from BUD17)
* LLVM-545 (IR SVE Reviews) - Have a look at the new, revised patch
Hi all,
Considering the big progress achieved in the CoreSight drivers, perf, and OpenCSD, the prerequisites for developing branch tracing in GDB for ARM processors, based on ETM, are now available. I am therefore publishing this request for comments, and I look forward to your feedback on this proposal.
Non intrusive execution recording for GDB using ARM CoreSight
Status of this Memo
This memo provides information for Linaro coresight and toolchain communities. Distribution of this memo is unlimited.
Abstract
A method of realizing execution recording in GDB in a non-intrusive way. This method is based on the use of CoreSight hardware tracing, available on ARM Cortex devices.
Table of Contents
1 Introduction
2 State of the art
3 Use cases
  3.1 Self hosted debug monitor
  3.2 Remote debug monitor
  3.3 External debugger
4 Implementation needs
  4.1 Self hosted debug monitor
  4.2 Remote Debug monitor
  4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Solutions and alternatives
  7.1 Scope definition
  7.2 CoreSight infrastructure exposure to the user
  7.3 Parameters needed for parsing traces
1. Introduction
CoreSight technology offers a toolset for tracing the execution of a program on a CPU, as well as routing the traces to an external trace port analyzer or storing them in a dedicated internal memory. Those traces do not affect system performance, and can be used as a record of program execution.
GDB offers reverse debugging by recording program execution and storing it. GDB offers either full record or program flow (branch) record. Records can be replayed later on for forwards or backwards debugging.
This request for comments is about realizing GDB record and replay functionality using CoreSight technology. It presents typical use cases and discusses different alternatives for realizing the above mentioned feature.
2. State of the art
GDB currently supports two execution recording variants:
- full record: registers as well as memory are recorded for each instruction. In this case GDB collects the registers and the involved memory area after each instruction. Currently this has no support for hardware accelerators.
- branch record: only program flow is recorded. In this case GDB collects a list of linear execution sequences called blocks. Each branch terminates the previous block and starts a new one. Currently branch record is implemented either without hardware acceleration or using the Intel Branch Trace Store ("bts") and Intel Processor Trace ("pt") hardware accelerators on supported CPUs.
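To make the notion of a block concrete, here is a minimal Python sketch that splits a stream of executed instructions into linear blocks. It assumes the trace is available as (pc, size) pairs in execution order, and it treats any instruction that does not directly follow its predecessor as the target of a taken branch. The Block type and function name are hypothetical and only loosely modelled on GDB's btrace blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    begin: int  # address of the first instruction in the block
    end: int    # address of the last instruction in the block


def split_into_blocks(trace):
    """Group (pc, size) pairs into blocks of sequentially executed instructions."""
    blocks = []
    expected_next = None
    for pc, size in trace:
        if blocks and pc == expected_next:
            # Fall-through execution: extend the current block.
            blocks[-1].end = pc
        else:
            # First instruction, or a taken branch landed here: start a new block.
            blocks.append(Block(begin=pc, end=pc))
        expected_next = pc + size
    return blocks
```

A hardware trace such as ETM does not deliver every pc directly; the decoder has to reconstruct this stream from the program image plus the traced branch outcomes, which is the part the rest of this memo is concerned with.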
3. Use cases
Programs running on ARM processors can be debugged in many configurations. Three of them are selected in this RFC as a base for discussion:
3.1. Self hosted debug monitor
Those are systems where the debugger program runs on the same CPU as the debugged program and monitors it. The user interacts with the debugging session on the target host itself. Linux gdb is an example of such a system. The following setup is considered:
- Target: a process running on an ARM Cortex-A
- Debugger: GNU gdb via the ptrace API (arm-linux-gnueabihf-gdb)
+-----------------------------------+
|              Target               |
|                    +------------+ |
| +------+           | Coresight  | |
| |      |           | components:| |
| | GDB  |<--------->|            | |
| |      |     ^     | DWT, ETM,  | |
| +------+     |     | ITM, TPIU  | |
|    ^         |     | TMC, ETB   | |
|    |         |     +------------+ |
+----|---------|--------------------+
     |         |
     |         |
 arm-linux-    |
 gnueabihf-    |
    gdb        |
         debug: ptrace
         trace: perf/CoreSight drivers
3.2. Remote debug monitor
Those are usually systems where the debugger program runs on the same CPU as the debugged program and monitors it. The user interacts with the debugging session remotely from a PC. Linux gdb is an example of such a system. The following setup is considered:
- Target: a process running on an ARM Cortex-A
- Gdb server: GNU gdbserver (arm-linux-gnueabihf-gdbserver)
- Gdb client: GNU gdb (arm-linux-gnueabihf-gdb)
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------+     +---------------------------------------+
|           Host           |     |                Target                 |
|                          |     |                        +------------+ |
| +-----+     +------+     |     |     +------+           | Coresight  | |
| |     |     | GDB  |     |     |     | GDB  |           | components:| |
| | UI  |<--->|      |<--->|<--->|<--->|      |<--------->|            | |
| |     |  ^  |Client|  ^  |  ^  |     |Server|     ^     | DWT, ETM,  | |
| +-----+  |  +------+  |  |  |  |     +------+     |     | ITM, TPIU  | |
|    ^     |     ^      |  |  |  |        ^         |     | TMC, ETB   | |
|    |     |     |      |  |  |  |        |         |     +------------+ |
+----|-----|-----|------|--+  |  +--------|---------|--------------------+
     |     |     |      |     |           |         |
     |     |     |      |     |           |         |
  Eclipse  | arm-linux- |     |      arm-linux-     |
           | gnueabihf- |  TCP/IP    gnueabihf-     |
           |    gdb     |   UART     gdbserver      |
        GDB MI     GDB remote                 debug: ptrace
                    protocol                  trace: perf/CoreSight drivers
3.3. External debugger
Those are systems where an external debugger is used. It accesses the target using JTAG or SWD. The target is usually a bare metal embedded system or a system with an RTOS. As an example, the following setup is considered:
- Target: firmware running on an ARM Cortex-M.
- Debugger: external debug and trace device.
- Gdb server: OpenOCD.
- Gdb client: arm-none-eabi-gdb.
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------------------+     +-------+     +-------------+
|                 Host                 |     | dbggr |     |   Target    |
|                                      |     |       |     |             |
| +-----+     +------+     +------+    |     |       |     |  Coresight  |
| |     |     | GDB  |     | GDB  |    |     | Debug |     | components: |
| | UI  |<--->|      |<--->|      |<-->|<--->|   +   |<--->|             |
| |     |  ^  |Client|  ^  |Server|    |  ^  | Trace |  ^  | DWT, ETM,   |
| +-----+  |  +------+  |  +------+    |  |  |       |  |  | ITM, TPIU   |
|    ^     |     ^      |     ^        |  |  |       |  |  |             |
|    |     |     |      |     |        |  |  |       |  |  |             |
+----|-----|-----|------|-----|--------+  |  +-------+  |  +-------------+
     |     |     |      |     |           |             |
     |     |     |      |     |           |             |
  Eclipse  | arm-none-  |  OpenOCD        |             |
           | eabi-gdb   |   pyOCD         |             |
           |            |                 |             |
        GDB MI     GDB remote         Ethernet    debug: JTAG/SWD
                    protocol             USB      trace: Serial/Parallel
4. Implementation needs
4.1 Self hosted debug monitor
gdb: arm-linux-gnueabihf-gdb
The interface defined in btrace.h for capturing and processing traces has to be implemented for ARM CoreSight.
Needed actions:
- in btrace-common.h: add the structures needed for capturing and handling etm traces
- in linux-btrace.h:
  - add btrace_tinfo_etm
  - amend btrace_target_info
- in linux-btrace.c: change the following functions to support etm traces
  - linux_enable_btrace
  - linux_disable_btrace
  - linux_read_btrace
  - linux_btrace_conf
- in arm-linux-nat.c: add an API to
  - configure btrace
  - enable btrace
  - disable btrace
  - read btrace
- in btrace.c
  - btrace_add_pc
  - btrace_fetch has to be implemented for CoreSight; this means using the OpenCSD library to parse etm traces and then reconstruct the executed instructions accordingly (btrace_compute_ftrace_1)
- in record-btrace.c
  - add a command for showing record btrace etm options
  - add a command for starting tracing with CoreSight and its handler (cmd_record_btrace_etm_start)
  - adapt cmd_show_record_btrace_cpu
  ...
perf:
Needed actions:
- make sure that perf can start/stop tracing a process with its threads, collect etm traces and deliver them to the user
4.2 Remote Debug monitor
The changes described in 4.1 are needed. In addition, to support the remote protocol, the following changes are needed.
gdb server: arm-linux-gnueabihf-gdbserver
Needed actions:
- in linux-low
  - linux_low_read_btrace: add support for etm traces formatting
  - linux_low_btrace_conf: add support for etm configuration formatting
gdb client: arm-linux-gnueabihf-gdb
Needed actions:
- in remote.c
  - adapt enable_btrace
  - adapt disable_btrace
- in btrace.c
  - parse_xml_btrace: update btrace.dtd [2] and the related data structures btrace_xxx
  - parse_xml_btrace_conf: update btrace-conf.dtd [3] and the related data structures btrace_conf_xxx
- extend the remote protocol handling to support CoreSight etm traces
UI: Eclipse
Needed actions:
- make sure that the plugin for recording execution and replaying it copes well in the case of arm-linux
The remote protocol needs to be extended by:
-1- Adding Qbtrace:CoreSight (or etm) to start collecting etm traces
-2- Amending the 'Branch Trace Format' xml specification to consider etm traces transfer
-3- Amending the 'Branch Trace Configuration Format' xml specification to consider the parameters needed for etm
4.3 External debugger
The changes described in 4.2 are needed. In addition, to support tracing a remote target through an external debugger (bare metal embedded system), the following changes are needed.
gdb server: OpenOCD
Needed actions:
- rework the etm driver to bring it up to date
- add a driver for configuring the trace interconnect IPs
- rework the driver for the TPIU
- integrate support for a trace port analyzer
- extend the remote protocol implementation to support recording
The CoreSight infrastructure of the SoC is to be set in OpenOCD through configuration files. Parameters that are not relevant for gdb are also specified in configuration files (trace sink, trace protocol, port size, trace synch frequency, cycle accurate tracing etc ...)
gdb client: arm-none-eabi-gdb
Needed actions:
- extend the remote protocol to support CoreSight etm traces
- integrate an etm trace parsing library
- interface the parser to record_btrace_target
The remote protocol needs, in addition to 4.2, to be extended by:
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoCs
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support filtering traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value and Qbtrace-conf:CoreSight:filter:end-address=value to support filtering traces for given functions/blocks/libs
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value and Qbtrace-conf:CoreSight:trigger:off-address=value to support starting or stopping tracing when a certain function/block/lib is executed
Alternatively, some of the configuration related to filtering and triggering can be delegated to the GDB server.
UI: Eclipse
- test and verify that existing plugins cope well with the gdb extensions
5. Remote protocol execution sequence
gdb and gdbserver communicate using the gdb remote protocol. On a semantic level a tracing session runs through the following sequence:
(1) gdb client queries gdb server support for branch trace
(2) gdb server answers with
    - qXfer:btrace:read
    - qXfer:btrace-conf:read
    - Qbtrace:off
    - Qbtrace:CoreSight
    - Qbtrace-conf:CoreSight:xxx where xxx is the parameter name
(3) gdb client sends the command to start emitting and collecting traces (Qbtrace:CoreSight)
(4) gdb server executes the command
(5) gdb client sends the command to stop emitting and collecting traces (Qbtrace:off)
(6) gdb server executes the command
(7) gdb client sends the command to get the collected traces from the trace sink (qXfer:btrace:read:annex:offset,length)
(8) gdb server executes the command and sends back the collected traces
(9) gdb client parses the traces and reconstructs the target states
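On the wire each of these commands travels as a remote protocol packet of the form $data#checksum, where the checksum is the modulo-256 sum of the data bytes written as two hex digits. The sketch below frames the client side of such a session in Python. Qbtrace:CoreSight is the packet proposed in this memo and does not exist in GDB today, the "all" annex and the offset/length values are only examples, and escaping of special characters in the payload is deliberately left out.

```python
def rsp_checksum(data: bytes) -> int:
    """Modulo-256 sum of the payload bytes, as used by the gdb remote protocol."""
    return sum(data) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a payload as $payload#xx (no escaping or run-length encoding)."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def rsp_unframe(packet: bytes) -> str:
    """Check the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    data = packet[1:-3]
    received = int(packet[-2:].decode("ascii"), 16)
    if rsp_checksum(data) != received:
        raise ValueError("checksum mismatch")
    return data.decode("ascii")


# Client-side commands for steps (3), (5) and (7); the first one is the
# packet proposed in this RFC, the read parameters are placeholders.
SESSION = [
    "Qbtrace:CoreSight",
    "Qbtrace:off",
    "qXfer:btrace:read:all:0,1000",
]

if __name__ == "__main__":
    for command in SESSION:
        print(rsp_frame(command).decode("ascii"))
```

The server acknowledges each well-formed packet and replies with its own framed packet, so the same two helpers cover both directions.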
6. Remote protocol extensions
The remote protocol needs to be extended with the following primitive to support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)
The remote protocol can be extended with the following primitives to take advantage of etm functionalities:
- select the core to trace on: in the case of a multicore system the gdb client sends a command to select the core to trace (Qbtrace-conf:CoreSight:core=value)
- set the trace id for the traces: the gdb client sends a command to set the trace id (Qbtrace-conf:CoreSight:id=value)
- select the context to trace: the gdb client sends a command to select the context to trace (Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace: the gdb client sends commands to select the address range to trace (Qbtrace-conf:CoreSight:filter:start-address=value) (Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing: the gdb client sends commands to select the addresses that trigger tracing (Qbtrace-conf:CoreSight:trigger:on-address=value) (Qbtrace-conf:CoreSight:trigger:off-address=value)
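As a rough illustration of how a client could turn a tracing configuration into the proposed configuration packets, consider the Python sketch below. The packet names are the ones proposed in this memo, the dictionary keys and the function name are hypothetical, and encoding the values as hex is an assumption borrowed from the existing Qbtrace-conf packets.

```python
# Hypothetical configuration keys mapped to the packet names proposed above.
PARAM_TO_PACKET = {
    "core": "Qbtrace-conf:CoreSight:core",
    "id": "Qbtrace-conf:CoreSight:id",
    "context": "Qbtrace-conf:CoreSight:filter:context",
    "start_address": "Qbtrace-conf:CoreSight:filter:start-address",
    "end_address": "Qbtrace-conf:CoreSight:filter:end-address",
    "on_address": "Qbtrace-conf:CoreSight:trigger:on-address",
    "off_address": "Qbtrace-conf:CoreSight:trigger:off-address",
}


def conf_packets(config: dict) -> list:
    """Return one packet payload per configured parameter, in insertion order."""
    packets = []
    for key, value in config.items():
        if key not in PARAM_TO_PACKET:
            raise KeyError(f"unknown CoreSight parameter: {key}")
        # Assumption: values are sent as lower-case hex without a 0x prefix.
        packets.append(f"{PARAM_TO_PACKET[key]}={value:x}")
    return packets
```

For example, conf_packets({"core": 1, "start_address": 0x8000}) yields the payloads for selecting core 1 and for starting the traced range at 0x8000; each payload would then be framed and sent before Qbtrace:CoreSight.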
7. Solutions and alternatives
7.1. Scope definition
The CoreSight ETM IP comes in many versions and many implementations. Depending on its capabilities, it can trace instructions only, or instructions and the involved data/data addresses. All ETM variants support instruction tracing and can therefore be used for branch tracing.
7.2. CoreSight infrastructure exposure to the user
This is about assigning the responsibility for configuring the CoreSight infrastructure to generate and route traces. Two alternatives are possible:
- CoreSight infrastructure exposed to the gdb client (and UI): in this alternative the user or the UI is responsible for configuring the CoreSight IPs in the SoC, by accessing their registers directly or via the CoreSight driver. The remote protocol is used to configure the trace sink (ETB or TPA) to start/stop collecting traces.
- CoreSight infrastructure not exposed outside of gdbserver: in this case high level commands can be provided by the gdbserver remote protocol to set up and configure the CoreSight IPs in the SoC.
My recommendation is to extend the remote protocol to provide high level commands to set up and configure the CoreSight IPs in the SoC, or to use a different channel to pass configuration parameters to the gdb server.
7.3. Parameters needed for parsing traces
Some configuration parameters such as the etm version and trace id (content of registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR) are needed for extracting and parsing the etm trace. Those parameters need to be exchanged between the gdb server and client. The following alternatives are possible:
- extend the remote protocol to get those parameters with explicit queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
Best Regards,
Zied Guermazi
o LLVM
* Machine outliner:
- Rebased on upstream.
- Improved stack alignment handling
- More cleanup before submission
o Misc
* Various meetings and discussions.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Posted v6. Some review from Dave Martin; will need at least
one further revision, and to wait til the kernel patches land.
[VIRT-327 # Richard's upstream QEMU work ]
Fix a reported bug in pauth Auth results.
Fix a reported bug in vector variable shift.
Review Peter's vfp decodetree patch set.
Review of v18 of target/rx. Never-ending, it seems...
Review of some target/ppc vector patches.
Posted v4 of CPUNegativeOffsetState. This looks ready to pull.
r~
Upstream Work ([VIRT-109])
==========================
- spent ages tracking down 64-on-32 cputlb errors which led to:
- adding x86_64 support to TCG system tests
- {PATCH v1 0/4} softmmu de-macro fix with tests Message-Id:
<20190605162326.13896-1-alex.bennee(a)linaro.org>
- {PATCH} cputlb: cast size_t to target_ulong before using for
address masks Message-Id:
<20190606154310.15830-1-alex.bennee(a)linaro.org>
- took over maintainership of orphaned gdbstub
- posted {PULL 00/52} testing, gdbstub and cputlb fixes Message-Id:
<20190607090552.12434-1-alex.bennee(a)linaro.org>
- problems on hackbox led to {PATCH} tests/vm: favour the locally
built QEMU for bootstrapping Message-Id:
<20190607185337.14524-1-alex.bennee(a)linaro.org>
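To illustrate the class of 64-on-32 error that the cast in the cputlb patch guards against (this is my reading of the patch subject, not a copy of the QEMU code): if an address mask is computed from a 32-bit size_t and only then widened, the upper half of a 64-bit guest address is cleared. The Python sketch below emulates the two integer widths explicitly; the function names are hypothetical.

```python
HOST_SIZE_T_MASK = 0xFFFFFFFF             # 32-bit size_t on a 32-bit host
TARGET_ULONG_MASK = 0xFFFFFFFFFFFFFFFF    # 64-bit target_ulong for a 64-bit guest


def align_mask_32bit_size_t(size: int) -> int:
    """~(size - 1) evaluated in 32 bits, then zero-extended to 64 bits."""
    return ~(size - 1) & HOST_SIZE_T_MASK


def align_mask_target_ulong(size: int) -> int:
    """~(size - 1) evaluated after widening size to 64 bits."""
    return ~(size - 1) & TARGET_ULONG_MASK


if __name__ == "__main__":
    addr = 0xFFFF000012345678  # a guest address with high bits set
    print(hex(addr & align_mask_32bit_size_t(0x1000)))   # high bits lost
    print(hex(addr & align_mask_target_ulong(0x1000)))   # high bits kept
```

On a 64-bit host both computations agree, which is why this kind of bug only shows up when a 64-bit guest runs on a 32-bit host.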
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Other Tasks
===========
- Did some initial reading on RPMB
- Half day fire training
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {Qemu-devel} {PATCH v4 00/39} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190604203351.27778-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH v2} target/arm: Vectorize USHL and SSHL
Message-Id: <20190603232209.20704-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH resend} test-thread-pool: be more reliable
Message-Id: <20190530093417.23370-1-pbonzini(a)redhat.com>
* {PATCH 0/4} tests/docker: add podman support
Message-Id: <20190523234011.583-1-marcandre.lureau(a)redhat.com>
* {Qemu-devel} {PATCH 0/2} Implement PowerPC FPSCR flag Fraction Rounded
Message-Id: <20190525022008.24788-1-programmingkidx(a)gmail.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
--
Alex Bennée
[LLVM-122] BTI and PAC support
Committed the LLD work. Modulo bugs this should now be done.
[LLVM-542] LLVM/GCC code size investigation
- Revisited my Zephyr build with clang patches and updated so that it
works with trunk.
- Work out next steps of work.
- Work out how to build an embedded gcc toolchain using the linaro infrastructure.
[Misc]
Reported bug in gold whereupon it would generate v4t veneers for v8-a CPUs
Still waiting for TK-1 board to finish building clang so that it can
run the testsuite. Hopefully finished over the weekend.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ Got the VFP decodetree conversion patchset out for review
(42 patches, 8 files changed, 3024 insertions(+), 1476 deletions(-))
+ sent a patchset which does the (easy) first step in my plan
for converting QEMU's documentation to Sphinx; sadly all the other
steps are much trickier...
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
- noinit attribute: started work
* Infra
- Fixed Dejagnu configuration issue in ABE which prevented us from
using target variant specifications
- Investigated problems with ABE and failure to cross-build "recent"
glibc (trouble with C++ compiler detection). ABE patches under review.
- Reduced load on APM machines to avoid depending on them too much
== Next ==
GCC:
- handle feedback on FDPIC patches
- noinit attribute
- UBSAN/bare-metal: do more testing
Hi,
Food for thought for today's sync up. I've been writing QEMU plugins to
exercise the plugin system and see what sort of useful information you
can extract when you can control the instruction stream.
For example I now have a plugin that can break down instruction counts
for any given run, for example a kernel boot:
Instruction Classes:
Class: UDEF not counted
Class: SVE (68 hits)
Class: Reserved (0 hits)
Class: PCrel addr (4589078 hits)
Class: Add/Sub (imm,tags) (0 hits)
Class: Add/Sub (imm) (26832113 hits)
Class: Logical (imm) (74304974 hits)
Class: Move Wide (imm) (10933759 hits)
Class: Bitfield (71470957 hits)
Class: Extract (85655 hits)
Class: Data Proc Imm (0 hits)
Class: Cond Branch (imm) (37227632 hits)
Class: Exception Gen (6 hits)
Class: NOP not counted
Class: Hints (244825554 hits)
Class: Barriers (1668558 hits)
Class: PSTATE (202144 hits)
Class: System Insn (7132992 hits)
Class: System Reg (2268308 hits)
Class: Branch (reg) (6280976 hits)
Class: Branch (imm) (18347905 hits)
Class: Cmp & Branch (180167025 hits)
Class: Tst & Branch (4092972 hits)
Class: Branches (0 hits)
Class: AdvSimd ldstmult (0 hits)
Class: AdvSimd ldstmult++ (0 hits)
Class: AdvSimd ldst (0 hits)
Class: AdvSimd ldst++ (0 hits)
Class: ldst excl (160861365 hits)
Class: Prefetch (0 hits)
Class: Load Reg (lit) (12828544 hits)
Class: ldst noalloc pair (0 hits)
Class: ldst pair (60381349 hits)
Class: ldst reg (0 hits)
Class: Atomic ldst (0 hits)
Class: ldst reg (reg off) (0 hits)
Class: ldst reg (pac) (0 hits)
Class: ldst reg (imm) (119597941 hits)
Class: Loads & Stores (0 hits)
Class: Data Proc Reg (113586343 hits)
Class: Scalar FP (0 hits)
Class: Unclassified (0 hits)
You can break down each class to individual instructions. For example
the Hints are mostly:
Individual Instructions:
Instr: wfe (132400072 hits) (op=0xd503205f/ Hints)
Instr: sevl (66433640 hits) (op=0xd50320bf/ Hints)
Instr: yield (29619246 hits) (op=0xd503203f/ Hints)
Instr: wfi (2865 hits) (op=0xd503207f/ Hints)
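The opcodes in the hint breakdown follow the AArch64 HINT encoding, where the hint number sits in bits [11:5] of the base opcode 0xd503201f. A small Python sketch of how such a per-instruction count could be produced from a stream of 32-bit opcodes is shown below; it is a stand-alone illustration with hypothetical function names, not the plugin itself, and it only names the hints relevant here.

```python
from collections import Counter

HINT_MASK = 0xFFFFF01F   # fixed bits of the HINT encoding
HINT_BASE = 0xD503201F   # HINT #0, i.e. NOP
HINT_NAMES = {0: "nop", 1: "yield", 2: "wfe", 3: "wfi", 4: "sev", 5: "sevl"}


def decode_hint(insn: int):
    """Return the mnemonic for a hint instruction, or None for anything else."""
    if (insn & HINT_MASK) != HINT_BASE:
        return None
    imm = (insn >> 5) & 0x7F
    return HINT_NAMES.get(imm, f"hint #{imm}")


def count_hints(opcodes) -> Counter:
    """Count hint instructions by mnemonic in an iterable of 32-bit opcodes."""
    counts = Counter()
    for insn in opcodes:
        name = decode_hint(insn)
        if name is not None:
            counts[name] += 1
    return counts
```

Feeding it the four opcodes listed above gives wfe, sevl, yield and wfi respectively, matching the mnemonics in the breakdown.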
So I'm looking for a similar experiment that would be useful for the
memory sub-system. When I chatted to Maxim we thought maybe a simplified
cache line simulator might be useful. The aim wouldn't be to simulate
what a real cache might do but to be useful say for identifying regions
of code which might be susceptible to cache line bouncing. So as
compiler writers what sort of run time memory behaviour would you like
to track? What sort of information would be useful to extract with such
a tool?
I'm open to ideas ;-)
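To make the cache line idea a bit more concrete, here is a minimal Python sketch of the kind of bookkeeping such a plugin could do. It assumes the plugin can observe (cpu, address, is_store) for every memory access and that lines are 64 bytes; it does not model a real cache or coherency protocol, it only counts how often a line last written by one CPU is touched by another. The class and method names are hypothetical.

```python
from collections import defaultdict

LINE_SIZE = 64  # assumption: 64-byte cache lines


class LineBounceTracker:
    """Count accesses to lines whose most recent writer was a different CPU."""

    def __init__(self, line_size: int = LINE_SIZE):
        self.line_size = line_size
        self.last_writer = {}            # line index -> CPU that last wrote it
        self.bounces = defaultdict(int)  # line index -> number of bounces

    def access(self, cpu: int, addr: int, is_store: bool) -> None:
        line = addr // self.line_size
        owner = self.last_writer.get(line)
        if owner is not None and owner != cpu:
            # The line has to move away from the CPU that last wrote it.
            self.bounces[line] += 1
            # Treat the line as shared until somebody stores to it again.
            del self.last_writer[line]
        if is_store:
            self.last_writer[line] = cpu

    def hottest(self, n: int = 10):
        """Return up to n (line base address, bounce count) pairs, busiest first."""
        ranked = sorted(self.bounces.items(), key=lambda kv: kv[1], reverse=True)
        return [(line * self.line_size, count) for line, count in ranked[:n]]
```

Mapping the hottest line addresses back to the instructions that touched them would then point at the code regions worth a closer look.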
--
Alex Bennée
Four day week.
[VIRT-327 # Richard's upstream QEMU work ]
Reviewed s390 fp vector patch set.
Posted v16 rx. This seemed so close to being ready
last week, but now I don't know. I think I should
quit pushing it myself and let Yoshinori do more of
the lifting here.
Reviewed avr v20 patch set.
Reviewed Alex's testing patch set.
Submitted patches to constify upstream capstone
(500k from .data to .rodata).
r~
Short week (off Thursday/Friday)
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions / fixed some testcases
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
* Infra
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: do more testing
== This Week ==
* PR88837 (7/10)
- Addressed all upstream suggestions.
- Found a (hackish) way to test patch with qemu (multiple issues).
- Sorting thru "strange" testsuite fallout most of which seems
unrelated to my patch.
* PR88833 (2/10)
- Looking at fwprop pass
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks.
(Short week, three days.)
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ continuing with the conversion of the VFP decoder to 'decodetree'.
With some useful advice from RTH I have now got a big chunk of
it done, and it looks like this will provide:
- better places to put "UNDEF if CPU doesn't have double support" checks
- checks of "VFP enabled?" only after all UNDEF checks have happened
- cleanup of a lot of code that uses some TCG globals cpu_F0 and cpu_F1,
which is weird ancient style and overdue for a cleanup
- a VFP decoder which isn't a single thousand-line function with multiple
nested switch statements
thanks
-- PMM
3 day week.
[LLVM-122] BTI and PAC support in lld, llvm-readobj, llvm-objdump
- Now in upstream review. Most of the week spent writing and updating tests.
Some time reviewing some asm goto patches.
Planned absences:
Holiday Friday 31st May.
== Progress ==
* Out of office 22 & 24 May
* [GlobalISel] Refactor CallLowering [LLVM-568]
- In progress, likely going to take a while
- Found a minor bug in the lowering for AArch64 (I can get it to
crash on some edge case), not sure if it's worth fixing independently
since it gets fixed anyway with the refactoring that I have in
progress
- Trying to understand an AMDGPU failure
== Plan ==
* Out of office 29 May - 10 June
* More of the same
o LLVM
* 8.0.1-rc1 ARM and AArch64 Binaries uploaded.
* Buildbots: One fix committed upstream.
* Machine outliner:
- Fixed liveness issue.
- Preparing patch for re-submission
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Merged!
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Started dusting off and rebasing wip.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Started reviewing the kernel patch set for this feature.
[VIRT-327 # Richard's upstream QEMU work ]
PR for tcg gvec work.
PR for Sato-san's RX target.
Patch set to update capstone and enable s390x.
GSOC: Review v3 of Jan's enable risu for x86 patch set.
[Other]
Travel arrangements for Xilinx meeting in San Jose, June 13.
Will need to pick Peter's brain re m-profile before then.
r~
== This Week ==
* PR88833 (4/10)
- Started investigating the issue; it seems one of the code-movement
RTL passes, like cse2, does not remove identical register copies,
resulting in an extra register move.
* PR88837 (5/10)
- Patch almost approved offline by Richard, who suggested moving the
discussion upstream.
- Observed "strange" issue with return value vectors on qemu for
run-time tests for fixed-length vectors. Turned out to be due to a
mismatch in vector length between compile time and run time ;-)
- Trying to run SVE tests with qemu.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
[LLVM-122] BTI and PAC support in LLD
Implementation now working, have written BTI tests, next step is
finishing off PAC tests.
[Misc]
Helped out debugging an asm-goto problem on ARM targets.
Investigated a GNU ld LMA overlap when VMA and LMA got out of sync.
Helped out with CMSIS use of ld scripts when using a fast-model,
needed to get LMA == VMA for program header covering BSS.
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- synced up with Emilio, will take over branch and submit
- latest branch is [plugins/plugins-v3]
- will peel off simple clean-ups and tweaks next week
- then need to split up some more and better separate code
- exposed plugin_disas for the "howvec" instruction counter
- some [example] [output] while booting kernel
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[plugins/plugins-v3]
https://github.com/stsquad/qemu/tree/plugins/plugins-v3
[example] http://ix.io/1JXC
[output] http://ix.io/1JXl
GSoC Mentoring ([VIRT-348])
- planning for start of coding next week
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
Upstream Work ([VIRT-109])
==========================
- prepared [testing/next] for PR
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[testing/next] https://github.com/stsquad/qemu/tree/testing/next
Completed Reviews [3/3]
=======================
{RISU v3 00/11} Support for i386/x86_64 with vector extensions
Message-Id: <20190523204409.21068-1-jan.bobek(a)gmail.com>
{PATCH v10 00/20} gdbstub: Refactor command packets handler
Message-Id: <20190521095948.8204-1-arilou(a)gmail.com>
- CLOSING NOTE [2019-05-24 Fri 17:30]
awaiting re-spin with tags applied.
{RFC v2 00/38} Plugin support
Message-Id: <20181209193749.12277-1-cota(a)braap.org>
- CLOSING NOTE [2019-05-24 Fri 17:47]
taking over the tree
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {PATCH 0/5} tests/vm: Python 3, improve image caching, and misc
Message-Id: <20190329210804.22121-1-wainersm(a)redhat.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {RFC v4 0/7} Baby steps towards saner headers
Message-Id: <20190523081538.2291-1-armbru(a)redhat.com>
* {Qemu-arm} {PATCH v2 0/4} hw/arm/boot: handle large Images more gracefully
Message-Id: <20190516144733.32399-1-peter.maydell(a)linaro.org>
* {Qemu-devel} {PATCH v12 00/12} Add RX archtecture support
Message-Id: <20190514061458.125225-1-ysato(a)users.sourceforge.jp>
* {PATCH 00/13} target/arm/kvm: enable SVE in guests
Message-Id: <20190512083624.8916-1-drjones(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ sent patchset fixing a handful of simple GICv3 bugs
+ usual codereview work
+ sent out a sketch of how we can transition our documentation
from the current texinfo manual to a set of sphinx manuals
+ had a look at the practicalities of converting our hand-written
VFP decoder to 'decodetree' -- this may be the easiest way to
support FPU configs which only support single-precision, like Cortex-M33
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: Updated patch 03/21 with changes in the handling of -static
according to feedback. Pinged the whole series.
* GCC upstream validation:
- reported a couple of regressions
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
== This Week ==
* PR88837 (9/10)
- Tweaked patch to handle a few more special cases with suggestions from
Richard.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
o LLVM
* 8.0.1-rc1 Started Binaries build.
* Buildbots babysitting:
- Two fixes committed upstream
* Machine outliner:
- Liveness information is not accurate after FrameLowering,
investigation ongoing.
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Posted v7 and v8. I think this is now ready for merge,
but I said that last week as well. :-P
[VIRT-327 # Richard's upstream QEMU work ]
More gvec work, some of which applies to target/arm,
and some to tcg/aarch64/, but all of which is in support
of David's target/s390x work. Should be coming to a
close on that soon.
Posted v7 of my do_syscall split.
Reviewed v13 of the RX target, adjusted it slightly for
my tlb_fill changes. I think this is now ready to merge.
r~
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ finally managed to complete review of Damien's reset handling rework
+ rolled v2 of patchset to support booting large kernel images
+ sent a cleanup patchset to rename arm.h to boot.h
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ working on making the CPU model configurable without FPU or DSP,
so that we can correctly model the Musca-A and MPS2-AN521 boards as
not having FPU or DSP on CPU #0.
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: sent updated patch series (v5). Received feedback about -static
support vs dynamic-linker need. Discussing options.
* GCC upstream validation:
- reported a couple of regressions
- found a bug in qemu while testing v4.0.0, preparing a reproducer
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- cleanup
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)
== Progress ==
* [GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- While looking into this I found and fixed a bug in the generic
part of IRTranslator, which reduced the number of fallbacks on the ARM
test-suite by about 20%
- Currently working on lowering function calls etc for 64-bit types
* [GlobalISel] Refactor CallLowering [LLVM-568]
- The CallLowering interface really needs a cleanup before I
continue with LLVM-310
- This has been discussed upstream in the past and would benefit all
targets, so I'm going to give it a shot
* SVE2 code reviews
== Plan ==
* More of the same
* Out of office 29 May - 10 June
== Progress ==
* Out of office on Friday (sick)
* [GlobalISel] Better support for small types [LLVM-553]
- Committed upstream
* GlobalISel
- quickfix for a DBG_VALUE-related bug
- code reviews
* SVE code reviews
* Catching up on Connect / EuroLLVM
== Plan ==
* More of the same
* Out of office end of May - beginning of June
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Posted v4, v5, v6. I think this is now ready for merge.
[VIRT-327 # Richard's upstream QEMU work ]
Posted v3 of the CPUNegativeOffset patch set.
Posted v2, v3, and a pull request for the tlb_fill patch set.
Debugged one more fix for Sparc testthreads.
Reposted some long dormant linux-user fixes.
Started reviving the do_syscall split patch set,
since Laurent asked after it.
r~