== Progress ==
* Investigate running benchmarks in containers [TCWG-1513]
- Addressed review comments, merged changes to dockerfiles
- Fixed mcf on AArch64; also managed to run on armv7 but with some failures
* SVE IR fuzzer [LLVM-586]
- Didn't get to work much on it, but rebuilt it with debug info
* Uploaded LLVM 9.0.0-rc3
== Plan ==
* Investigate armv7 failures in docker
* Iterate on benchmarking patches
* Figure out why the fuzzer almost never introduces new code
* 2 vacation days left this month
[VIRT-327 # Richard's upstream QEMU work ]
Fix two ppc fp launchpad bugs.
Resurrect patches for openrisc v1.3
Once-over review of risc-v vector extension.
Posted v3 of a32 conversion to decodetree.
Started poking at neon conversion to decodetree,
as a prerequisite to a32 support for fp16.
Some cleanups to watchpoints. Generic stuff now queued to tcg-next.
DavidH is taking care of target/s390x updates, but there are some
changes wanted within target/arm SVE code.
r~
== Progress ==
* GCC:
- FDPIC: sent updated patches, already got feedback on them. almost OK
* GCC upstream validation:
- reported a couple of issues. Helped with testing.
- getting ready to add cortex-m33 validation with qemu
* Binutils:
- Non-contiguous memory regions support in the BFD linker: not started yet.
* misc:
- infra fixes / troubleshooting / reviews
- a bit of Jira
== Next ==
GCC:
- handle feedback on FDPIC patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== This Week ==
* GCC:
(i) PR86753 - Addressing upstream suggestions, pivoted to another approach.
(ii) PR91272 - Patch fixes issue but not correct approach, need to rework it.
(iii) PR78736 - Submitted patch upstream.
== Next Week ==
- Continue ongoing tasks.
== Progress ==
* Investigate running benchmarks in containers [TCWG-1513]
- Cleaned and uploaded scripts for comments
- mcf still hangs, but only when run with clang; investigating root cause
* SVE IR fuzzer [LLVM-586]
- Made some progress with the prototype but it still needs work
* Re-running 8.0.1 release on AArch64 since the archive on
llvm.org/releases seems to be broken
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Reorg ARMMMUIdx again; this time, do not overlap
EL1&0 and EL2&0 mmu_idx. This makes debugging a
bit easier.
Fix one more bug in EL2&0 selection. This was not
the last, because a nested kernel does not yet boot.
[VIRT-327 # Richard's upstream QEMU work ]
Another round of aa32 decodetree patches.
Another round of arm hflags patches.
Review of and some patches for cpu watchpoints.
r~
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ sent fix for LP:1840922 (incorrect handling of branches to
M-profile exception-return magic addresses in linux-user mode)
+ some more work on requirements for JIRA KVM related issues
+ started reviewing RTH's monster decodetree conversion patchset
(and made it nearly halfway through it)
thanks
-- PMM
== Progress ==
* GCC:
- FDPIC: handling feedback.
* GCC upstream validation:
- no issue to report this week.
* Binutils:
- Non-contiguous memory regions support in the BFD linker: not started yet.
* misc:
- infra fixes / troubleshooting
- qemu-4.1 confirmed OK for GCC validations
- trying validations for cortex-m33, found a few issues in GCC and a
small one in QEMU
- a bit of Jira
== Next ==
GCC:
- handle feedback on FDPIC patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
- Reorganised buildbots to remove some of the duplication between v7
and v8 AArch32 to free up more v7 hardware.
- Got final code-size information and wrote my slides for Linaro Connect
- Track down an AArch64 specific LLD problem due to a recent change,
now resolved.
- Wrote worst case thunk/veneer convergence patch to help move through
an iterative symbol assignment algorithm.
UK Bank Holiday Monday
== This Week ==
* GCC :
(i) PR83756: Addressing feedback from upstream.
(ii) PR90724: Committed to trunk.
(iii) PR91272: Started working on patch.
(iv) PR88839: Merged fix from sve-acle-branch to trunk.
* Validation
(i) Patch to separate build and test steps in tcwg_gnu.
(ii) SVE validation
== Next Week ==
- Continue ongoing tasks
== Progress ==
* Investigate running benchmarks in containers [TCWG-1513]
- Discussed the noise levels with the team and we decided to
continue with this
- mcf hangs, need to investigate the cause
- WIP: cleaning up the scripts so I can send a sensible patch for review
* IR SVE reviews [LLVM-545]
- Looked at some patches but didn't have much to add to what other people said
* SVE IR fuzzer [LLVM-586]
- Started prototyping
- The fuzzer relies a lot on inserting constants, and since we can't
produce SVE constants yet I'll have to dig in and get it to also use
function parameters
- Need to add operation descriptions for SVE
* Buildbot babysitting
- Reported an issue
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ unfortunately a late-breaking security-bug meant we needed an rc5, but
we put out the final 4.1.0 release on Thursday
+ code review:
- RTH's set of minor cleanups to the 32-bit arm codegen
+ sent out patches which fix our emulation of ATS instructions so
they cause exceptions when the spec says they should (a cleanup
of a change I'd half-implemented a year or so ago)
+ put together and sent the first target-arm pullreq for 4.2
* booked travel/hotel for KVM Forum; did some other prep like working
out a preliminary list of QEMU Summit invitees, and emailing people
whose GPG keys I need to sign to find out if they're going.
thanks
-- PMM
* Thursday off
== Progress ==
* GCC:
- FDPIC: handling feedback.
- noinit attribute: committed (ld patch too)
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- a few issues reported this week.
* Binutils:
- Non-contiguous memory regions support in the BFD linker: not started
yet.
* misc:
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- continued working through v4 comments on [next version branch]
- did some benchmarking to justify no fast path in translator_ld
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[next version branch]
https://github.com/stsquad/qemu/tree/plugins/plugins-v5
GSoC Mentoring ([VIRT-348])
- spent some time reviewing the current state and experimenting
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
Upstream Work ([VIRT-109])
==========================
- more work on [my docker fixup branch]
- now can run make docker-test-build on qemu-test
- posted {PATCH v3 00/13} softfloat updates (include tweaks, rm LIT64)
Message-Id: <20190813124946.25322-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[my docker fixup branch]
https://github.com/stsquad/qemu/tree/testing/docker-def-and-buster-fixes
Completed Reviews [6/6]
=======================
{Qemu-devel} {PATCH v5 00/10} Measure Tiny Code Generation Quality
Message-Id: <20190815021857.19526-1-vandersonmr2(a)gmail.com>
- CLOSING NOTE [2019-08-15 Thu 20:45]
Looks pretty good but some racy crashes need to be fixed.
{Qemu-devel} {PATCH v1 0/2} Integrating qemu to Linux Perf
Message-Id: <20190815023725.2659-1-vandersonmr2(a)gmail.com>
- CLOSING NOTE [2019-08-15 Thu 20:46]
Needs a v2 with compile fixes
Absences
========
- Moving house & office ~20th/21st Aug
Current Review Queue
====================
* {PATCH v3 00/34} target/arm: Implement ARMv8.1-VHE
Message-Id: <20190803184800.8221-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH v4 00/22} target/arm: Implement ARMv8.5-MemTag, system mode
Message-Id: <20190307170440.3113-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {RFC PATCH} Implement qemu_thread_yield for posix, use it in mttcg to handle EXCP_YIELD
Message-Id: <20190717054655.14104-1-npiggin(a)gmail.com>
* {PATCH v3 0/6} tests/docker: add podman support
Message-Id: <20190713143311.17620-1-marcandre.lureau(a)redhat.com>
* {PATCH 0/2} tests/acceptance: Add test of NeXTcube framebuffer using OCR
Message-Id: <20190629150056.9071-1-f4bug(a)amsat.org>
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
--
Alex Bennée
- Fixed LLD segfault when exceptions.
- Implement large code model TLSLD relocations in LLD.
- More thoughts on MOVK prel relocations in the ABI as there looks to
be a use case for them in HWASAN that isn't covered by the existing
relocations.
- Some thoughts on how TCWG LLVM Jira initiatives could be worded and
how we may want to word them in the next cycle.
- Initial thoughts written down on Connect presentation, need to flesh
out more this week.
- Upgraded machine to 18.04, almost managed to break it in the process
but luckily it was recoverable. Will likely have some teething
problems.
o LLVM:
* Machine Outliner:
- Improved stack fixup handling (remove candidates when it is not
beneficial)
- Fixing an issue related to R12 usage.
o Misc
* Various meetings and discussions.
[VIRT-327 # Richard's upstream QEMU work ]
Posted v4 of arm hflags reorg.
Split out 3 minor patch sets from the larger aa32 decodetree set.
Reviewed v6 of invert-endian tlb patch set.
Reviewed ajb's fpu header reorg.
Posted an RFC vs Andrew Jones' SVE-in-KVM patch set.
r~
QEMU Tooling ([VIRT-252])
=========================
- working through v4 comments on [next version branch]
[next version branch]
https://github.com/stsquad/qemu/tree/plugins/plugins-v5
Upstream Work ([VIRT-109])
==========================
- spent some more time looking at TCG EL2 behaviour
- there is something not working properly even without the VHE
patches
- need to do some tooling work to be able to properly debug HYP code
- posted {PATCH v1 0/7} softfloat header cleanups Message-Id:
<20190808164117.23348-1-alex.bennee(a)linaro.org>
- posted {PATCH v2 0/7} softfloat includes cleanup Message-Id:
<20190809091940.1223-1-alex.bennee(a)linaro.org>
- posted {PATCH v1 0/2} docker DEF_TARGET_LIST cleanup Message-Id:
<20190809155047.24526-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Completed Reviews [6/6]
=======================
{Qemu-arm} {PATCH 0/2} target/arm: Fix routing of singlestep exceptions
Message-Id: <20190805130952.4415-1-peter.maydell(a)linaro.org>
- CLOSING NOTE [2019-08-07 Wed 11:47]
Looks good, would be nice to integrate a better testcase
{PATCH} gdbstub: Fix handling of '!' packet with new infra
Message-Id: <20190805190901.14072-1-ramiro.polla(a)gmail.com>
- CLOSING NOTE [2019-08-07 Wed 11:59]
Queued to my tree
{PATCH 0/3} tests/tcg: disentangle makefiles
Message-Id: <20190730123759.21723-1-pbonzini(a)redhat.com>
{Qemu-devel} {PATCH v2 00/29} Tame a few "touch this, recompile the world" headers
Message-Id: <20190806151435.10740-1-armbru(a)redhat.com>
- CLOSING NOTE [2019-08-07 Wed 17:32]
Looks OK but breaks a lot of cross compiles. More subtle to fix.
{Qemu-devel} {PATCH untested for-4.2} memory: fix race between TCG and accesses to dirty bitmap
Message-Id: <20190729214717.6616-1-pbonzini(a)redhat.com>
{PATCH v3 00/29} Tame a few "touch this, recompile the world" headers
Message-Id: <87k1bmpn7y.fsf(a)dusky.pond.sub.org>
Current Review Queue
====================
* {PATCH v3 00/34} target/arm: Implement ARMv8.1-VHE
Message-Id: <20190803184800.8221-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH v4 00/22} target/arm: Implement ARMv8.5-MemTag, system mode
Message-Id: <20190307170440.3113-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {RFC PATCH} Implement qemu_thread_yield for posix, use it in mttcg to handle EXCP_YIELD
Message-Id: <20190717054655.14104-1-npiggin(a)gmail.com>
* {PATCH v3 0/6} tests/docker: add podman support
Message-Id: <20190713143311.17620-1-marcandre.lureau(a)redhat.com>
* {PATCH 0/2} tests/acceptance: Add test of NeXTcube framebuffer using OCR
Message-Id: <20190629150056.9071-1-f4bug(a)amsat.org>
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ we needed an rc4 for 4.1.0 (predictably), so herded the necessary
fixes into it
+ bug fixing:
- LP:1838913 : single-step exceptions were always being taken to EL1,
even if the guest configured debug exceptions to go to EL2 or EL3
- LP:1815423 : x86 TCG bug where we were giving the wrong result for
SSE float-to-int conversions that raised the 'invalid' exception
- LP:1796520 and LP:1839325 : started investigating some sh4 linux-user bugs
+ code review:
- patch to implement aspeed SD controller model
- Eric Auger's patchset of minor SMMUv3 fixes
- Aspeed GPIO controller model
- v3 of the GreenSocs reset-handling refactoring
- RTH's preliminary cleanup of handling of PC in arm decoder
thanks
-- PMM
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Posted a couple of revisions. Good feedback between this
and MemTag, both of which need to adjust the set of TLBs.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Posted v7.
[VIRT-344 # ARMv8.5-MemTag, Memory Tagging Extension ]
Rebased upon current VHE+BTI work. Updated from beta manuals
to the ARM ARM issue E.a manual.
[VIRT-327 # Richard's upstream QEMU work ]
Reviewed v1 x86 gen_sse rewrite.
Reviewed Alex's v4 plugin patchset.
r~
o LLVM:
* Bots babysitting
* Machine Outliner:
- Rebased branch on upstream
- Working on stack fixup limits test cases
o Misc
* Various meetings and discussions.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1.0 rc3 sent out. As usual, we've found some last minute
bugs and we'll need an rc4 next week...
+ patch review:
- series from Philippe making some cleanups to use object_initialize_child()
and friends
- Alex's patch "generate a custom MIDR for -cpu max"
- made a start on reviewing RTH's 32-bit arm decodetree conversion
+ bug fixing:
- LP:1838277 : we were taking exceptions caused by BRK instructions at
EL2 to EL1, rather than to EL2; sent a patch fixing this
- LP:1838475 : FPU register stacking in M-profile CPUs without the
Security Extensions was incorrectly failing an NSACR check and
taking a bogus exception: sent a patch to fix
+ wrote a patchset that converts the sparc target away from the
deprecated and broken do_unassigned_access hook to use the new-in-2017
do_transaction_failure hook instead; one step closer to being able
to remove the old hook entirely...
+ did the same for the mips target
thanks
-- PMM
== This Week ==
* PR86753
- Working on approach suggested by Richard.
* PR90724
- Pinged upstream.
* Validation
- Merged patch to jenkins-scripts to add testsuite comparison to tcwg_gnu.
- Submitted patch for separating build and test steps in tcwg_gnu.
== Next Week ==
- Continue ongoing tasks.
== Progress ==
* LLVM 9.0.0 rc1 binaries uploaded
- Ran into one cross-platform issue (x86, AArch64, ARM etc)
- Opened a bug for libfuzzer tests on AArch64
* Use ninja in release job [LLVM-536]
- Done
* Investigate running benchmarks in containers [TCWG-1513]
- Seems to work, except for mcf which errors out; will investigate next week
* IR SVE reviews [LLVM-545]
- Another round of feedback to D53137
== Plan ==
* Review D47770
* Upcoming vacation: 6 - 13 August
* Friday off
== Progress ==
* GCC:
- FDPIC: started taking feedback into account. Changes under test.
- noinit attribute: posted a new version
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- no new issue this week.
* Binutils:
- Non-contiguous memory regions support in the BFD linker: not started
yet. Received a few "warnings" as feedback.
* misc:
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== Holidays ==
Aug 2-11 (next week)
[PR42719] BasicAA UnitTest failure on AArch64 when compiled with GCC
- Patch submitted upstream
[PR42853] LLD support for TPREL_G0 relocations
- In theory a simple change, but llvm-mc seems to be deliberately
doing something strange with MOVZ encodings fixMOVZ and I haven't
worked out why it is needed.
Investigations with respect to inline assembler constraints.
- Some patch review comments for my ABE (Linaro GCC build system)
patch for GCC RM multilibs
- LLD review for smaller AArch64 images.
- Wrote a summary of what TCWG do for the upstream community, may turn
into a Linaro blogpost
- Raised GCC PR91299 to cover incorrect weak definition inlining in
GCC LTO in presence of non-weak definition.
On holiday Friday 2nd August, back in the office on Monday.
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Merged some fixes from Alex.
Starting to test kvm-inside-tcg.
[VIRT-327 # Richard's upstream QEMU work ]
Posted v1 of aa32 base isa conversion to decodetree.
Reviewed v5 of the sparc64 invert endian tlb bit patches.
r~
[ACTIVITY] On buildbot and Linux kernel CI duty
- One obscure alias analysis bug that only failed on GCC on AArch64
that suddenly started failing a unit test. Tracked down to an
uninitialised variable when API called directly via unit-test. Test has
been reverted, but I've put enough information in a PR for an AA
expert to either fix the test or initialise class member pointer to
nullptr.
- A long review of an LLD patch that changes page-alignment rules and
this has impact on the TLS address. Spent way too long trying to
reverse engineer why the formula used would work, needed to find the
corresponding loader code to find the other side.
- Assertion added to LLVM triggered a failure in Linux kernel build,
reproduced example and forwarded on
- Code size investigation Linaro Connect proposal accepted, now have
to write it.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc2 tagged
+ investigated and sent patches for a problem reported by Mark Rutland
where we still were sometimes putting the initrd on top of the kernel
in our built-in bootloader code
+ spent a couple of days tracking down the cause of LP:1696773, an
intermittent bug when running AArch64 Go programs under QEMU linux-user
mode. This turned out to be an incorrect implementation of 'sigaltstack'
as setting a process-wide signal stack rather than a per-thread one.
+ prompted by a patch from Greensocs fixing a vmstate migration bug
in the pl330 model, improved the vmstate macros to catch that category
of bug; this detected one other bug lurking in a different device.
thanks
-- PMM
== Progress ==
* LLVM 8.0.1 final binaries uploaded
* Use ninja in release job [LLVM-536]
- Patches ready, but waiting for 9.0.0-rc1 so I can test (was
supposed to come out this week but got postponed to Monday)
* Investigate running benchmarks in containers [TCWG-1513]
- Still having trouble, but working on it
* IR SVE reviews [LLVM-545]
- Posted some feedback to D53137
* A bit of fuzzing of GlobalISel CallLowering for AArch32
== Plan ==
* Upcoming vacation: 6 - 13 August
[LLVM-478] Clang and GCC code size comparison
- Looked at CMSIS-DSP, clang consistently produces smaller code than
GCC in contrast to Zephyr. Recorded some specific areas
- Tidied up scripts and build system patches to post for review
- Wrote up a readme for how to reproduce the results
[Misc]
- LLD reviews for reducing image-size using some VA/PA address alignment tricks.
- MC and LLD changes to add MOV[WNZ] relocations.
- Looked into why our container might be giving Resource unavailable
during ninja check-all
Some thoughts on ABI process.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc1 tagged
+ investigating various bugs:
- LP:1836501 (assert using vexpress-a15 board with KVM) : couldn't
reproduce; this reminded me I needed to do a debian upgrade on my
cubietruck devboard, though.
- LP:1830864 (assert using -cpu host,aarch64=off with KVM on a host
kernel older than 4.15): diagnosed problem and sent a patch to fix it
- sent a patch to fix a URL in our configure script which pointed
users at the binary downloads page of the project website when it
wanted the source downloads page
- sent a patch fixing a problem building elf2dmp when libcurl's header
files are in a non-standard location
thanks
-- PMM
== Progress ==
* LLVM 8.0.1 rc4 binaries uploaded
* Buildbot and kernel builds monitoring
- Investigated/reported/fixed a couple of issues
- Tried to reproduce a clang-native-arm-lnt-perf failure that's been
keeping the bot red since the 3rd of July, but it turns out to be very
tricky
* Investigate running benchmarks in containers [TCWG-1513]
- Got quite far with this, but got a bit stuck with a file that
cannot be accessed for reasons that are not clear to me yet
* IR SVE reviews [LLVM-545]
- Looking at Graham's rebased size queries patch
== Plan ==
* Upcoming vacation: 6 - 13 August
Good afternoon,
GDB has a valuable feature: process record and replay. GDB can record a
log of process execution and save it. This record can be loaded later and
used for debugging, which is called offline debugging. It offers the
advantage that you can catch the issue once and replay it as often as
needed to find the root cause and fix it. This is extremely valuable for
non-reproducible or hard-to-reproduce bugs. You can replay the record
either forwards or backwards, which is very convenient for observing and
analyzing the software.
To realize this functionality, GDB executes the software one assembly
instruction after another, recording relevant registers and memory
locations. This is a slow operation that can drastically change the
timing of process execution, and thus change the conditions that raise the
bug. To overcome this limitation, GDB can use available SoC IPs to
accelerate the operation. As of today, GDB has support for the "Intel
Processor Trace" and "Branch Trace Store" IPs on Intel processors.
ARM-based SoCs also have IPs that can be used to assist process record,
namely CoreSight trace sources (ETM, PTM, ...), trace links (funnels, ...)
and trace sinks (ETB, ETR, TPIU, ...). They are now supported in the Linux
kernel through corresponding drivers and the extension of perf. A library
for decoding ETM traces (OpenCSD) is also available. The way is now paved
to bring acceleration of process record for ARM-based SoCs to GDB.
I am re-sending the RFC and making it available as a basis for discussion on
implementing this feature. It is also attached as a text file.
B.R.
Zied Guermazi
Non intrusive execution recording
for GDB using ARM CoreSight
Status of this Memo
This memo provides information for Linaro coresight and toolchain
communities.
Distribution of this memo is unlimited.
Abstract
A method of realizing execution recording in GDB in a non-intrusive way.
This method is based on the use of CoreSight hardware tracing, available on
ARM Cortex devices.
Table of Contents
1 Introduction
2 State of the art
3 Use cases
3.1 Self hosted debug monitor
3.2 Remote debug monitor
3.3 External debugger
4 Implementation needs
4.1 Self hosted debug monitor
4.2 Remote Debug monitor
4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Solutions and alternatives
7.1 Scope definition
7.2 CoreSight infrastructure exposure to the user
7.3 Parameters needed for parsing traces
1. Introduction
CoreSight technology offers a toolset for tracing the execution of a
program on a CPU, as well as routing the traces to an external trace port
analyzer or storing them in a dedicated internal memory. Those traces do not
affect system performance, and can be used as a record of program
execution.
GDB offers reverse debugging by recording program execution and storing
it. GDB offers either full record or program flow (branch) record. Records
can be replayed later on for forwards or backwards debugging.
This request for comments is about realizing GDB record and replay
functionality using CoreSight technology. It presents typical use cases
and discusses different alternatives for realizing the above-mentioned feature.
2. State of the art
GDB currently supports two execution recording variants:
- full record: registers as well as memory are recorded for each
instruction. In this case GDB collects the registers as well as the involved
memory areas after each instruction. Currently this has no support for
hardware accelerators.
- branch record: only program flow is recorded. In this case GDB
collects the program execution flow. Currently branch record is implemented
either with or without hardware acceleration by using the Intel branch trace
store "bts" and Intel processor trace "pt" hardware accelerators on
supported CPUs.
3. Use cases
Programs running on ARM processors can be debugged in many
configurations. Three of them are selected in this RFC as a basis for
discussion:
3.1. Self hosted debug monitor
Those are systems where the debugger program runs on the same CPU as the
debugged program and monitors it. The user interacts with the debugging session
on the target host itself.
Linux GDB is an example of such systems. In such a system the following
setup is considered:
- Target: a process running on an ARM Cortex-A
- Debugger: GNU GDB via ptrace API (arm-linux-gnueabihf-gdb)
+-----------------------------------+
| Target |
| +------------+ |
| +------+ | Coresight | |
| | | | components:| |
| | GDB |<--------->| | |
| | | ^ | DWT, ETM, | |
| +------+ | | ITM, TPIU | |
| ^ | | TMC, ETB | |
| | | +------------+ |
+----|---------|--------------------+
| |
| |
arm-linux- |
gnueabihf- |
gdb |
debug: ptrace
trace: perf/CoreSight drivers
3.2. Remote debug monitor
Those are systems where the debugger program runs on the same CPU as the
debugged program and monitors it. The user interacts with the debugging session
remotely from a PC.
Linux GDB is an example of such systems. In such a system the following
setup is considered:
- Target: a process running on an ARM Cortex-A
- GDB server: GNU gdbserver (arm-linux-gnueabihf-gdbserver)
- GDB client: GNU GDB (arm-linux-gnueabihf-gdb)
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------+ +---------------------------------------+
| Host | | Target |
| | | +------------+ |
| +-----+ +------+ | | +------+ | Coresight | |
| | | | GDB | | | | GDB | | components:| |
| | UI |<--->| |<--->|<--->|<--->| |<--------->| | |
| | | ^ |Client| ^ | ^ | |Server| ^ | DWT, ETM, | |
| +-----+ | +------+ | | | | +------+ | | ITM, TPIU | |
| ^ | ^ | | | | ^ | | TMC, ETB | |
| | | | | | | | | | +------------+ |
+----|-----|-----|------|--+ | +--------|---------|--------------------+
| | | | | | |
| | | | | | |
Eclipse | arm-linux- | | arm-linux- |
| gnueabihf- | TCP/IP gnueabihf- |
| gdb | UART gdbserver |
GDB MI GDB remote debug: ptrace
protocol trace: perf/CoreSight drivers
3.3. External debugger
Those are systems where an external debugger is used. It accesses the
target using JTAG or SWD. The target is usually a bare-metal embedded system
or a system with an RTOS.
As an example, the following setup is considered:
- Target: firmware running on ARM Cortex-M.
- Debugger: external debug and trace device.
- GDB server: OpenOcd.
- GDB Client: arm-none-eabi-gdb.
- UI: Eclipse with the needed plugins; the MI interface is used.
+--------------------------------------+ +-------+ +-------------+
| Host | | dbggr | | Target |
| | | | | |
| +-----+ +------+ +------+ | | | | Coresight |
| | | | GDB | | GDB | | | Debug | | components: |
| | UI |<--->| |<--->| |<-->|<--->| + |<--->| |
| | | ^ |Client| ^ |Server| | ^ | Trace | ^ | DWT, ETM, |
| +-----+ | +------+ | +------+ | | | | | | ITM, TPIU |
| ^ | ^ | ^ | | | | | | |
| | | | | | | | | | | | |
+----|-----|-----|------|-----|--------+ | +-------+ | +-------------+
| | | | | | |
| | | | | | |
Eclipse | arm-none- | OpenOcd | |
| eabi-gdb | PyOcd | |
| | | |
GDB MI GDB remote Ethernet debug: JTAG/SWD
protocol USB trace: Serial/Parallel
4. Implementation needs
4.1 Self hosted debug monitor
GDB : arm-linux-gnueabihf-gdb
the interface defined in btrace.h for capturing and processing traces
has to be implemented for arm CoreSight.
needed actions:
- in btrace-common.h: add needed structures for capturing and
handling etm traces
- in linux-btrace.h:
- add btrace_tinfo_etm
- amend btrace_target_info
- in linux-btrace.c: change following functions to support etm
traces
- linux_enable_btrace
- linux_disable_btrace
- linux_read_btrace
- linux_btrace_conf
- in arm-linux-nat.c: add an API to
- configure btrace
- enable btrace
- disable btrace
- read btrace
- in btrace.c
- btrace_add_pc, btrace_fetch: have to be implemented for
CoreSight. This means using the OpenCSD library to parse ETM traces and then
reconstructing executed instructions accordingly (btrace_compute_ftrace_1)
- in record-btrace.c
- add command for showing record btrace etm options
- add command for starting tracing with CoreSight and its
handler (cmd_record_btrace_etm_start)
- adapt cmd_show_record_btrace_cpu
...
perf:
needed actions:
- make sure that perf can start/stop tracing a process with its
threads, collect etm traces and deliver them to the user
4.2 Remote Debug monitor
The changes described in 4.1 are needed. In addition, to support the remote
protocol, the following changes are needed:
GDB server: arm-linux-gnueabihf-gdbserver
needed actions:
- in linux-low
- linux_low_read_btrace: add support for etm traces formatting.
- linux_low_btrace_conf: add support for etm configuration
formatting.
GDB client: arm-linux-gnueabihf-gdb
needed actions:
- in remote.c
- adapt enable_btrace
- adapt disable_btrace
- in btrace.c
- parse_xml_btrace: update btrace.dtd [2] and related data
structures btrace_xxx
- parse_xml_btrace_conf: update btrace-conf.dtd [3] and related
data structures btrace_conf_xxx
- extend Remote protocol handling to support coresight etm traces
UI: eclipse
needed actions:
make sure that the plugin for recording execution and replaying it
copes well with arm-linux
Remote protocol needs to be extended by
-1- Adding Qbtrace:CoreSight (or etm) to start collecting etm traces
-2- Amending 'Branch Trace Format' xml specification to consider etm
traces transfer
-3- Amending 'Branch Trace Configuration Format' xml specification to
consider parameters needed for etm
4.3 External debugger
The changes described in 4.2 are needed. In addition, to support tracing
a remote target through an external debugger (bare-metal embedded system),
the following changes are needed:
GDB server: OpenOcd
needed actions:
- rework etm driver to make it up to date.
- add a driver for configuring trace interconnect IPs
- rework the driver for TPIU.
- integrate support for a Trace port analyzer.
- extend remote protocol implementation to support recording
Coresight infrastructure of the SoC is to be set in OpenOcd through
configuration files. Parameters that are not relevant for GDB are also
specified in configuration files (trace sink, trace protocol, port size,
trace synch frequency, cycle accurate tracing etc ...)
GDB client: arm-none-eabi-gdb
needed actions:
- extend Remote protocol to support coresight etm traces
- integrate etm trace parsing library
- interface the parser to record_btrace_target
Remote protocol needs, in addition to 4.2, to be extended by
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoC
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing
multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support
filtering traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value
and Qbtrace-conf:CoreSight:filter:end-address=value to
support filtering traces for given functions/blocks/lib
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value
and Qbtrace-conf:CoreSight:trigger:off-address=value to
support triggering tracing or stop tracing if a certain function/block/lib
is executed
alternatively some of configurations related to filtering and
triggering can be delegated to the GDB server.
UI: eclipse
test and verify that existing plugins cope well with GDB extensions
5. Remote protocol execution sequence
GDB and gdbserver communicate using the GDB remote protocol.
At the semantic level, a tracing session runs through the following sequence:
(1) GDB client queries gdb server support for branch trace
(2) GDB server answers with
- qXfer:btrace:read
- qXfer:btrace-conf:read
- Qbtrace:off
- Qbtrace:CoreSight
- Qbtrace-conf:CoreSight:xxx where xxx is the parameter name
(3) GDB client sends the command to start emitting and collecting
traces (Qbtrace:CoreSight)
(4) GDB server executes the command
(5) GDB client sends the command to stop emitting and collecting traces
(Qbtrace:off)
(6) GDB server executes the command
(7) GDB client sends command to get collected traces from trace sink
(qXfer:btrace:read:annex:offset,length)
(8) GDB server executes the command and sends back collected traces
(9) GDB client parses the traces and reconstructs target states
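The client side of steps (3), (5) and (7) can be sketched as follows. This is
only an illustration under stated assumptions: the $payload#checksum framing is
the standard GDB remote protocol framing, but Qbtrace:CoreSight is the packet
proposed in this RFC and does not exist in GDB today, and the helper names
(frame, tracing_session) as well as the default annex, offset and length are
made up for the example.

```python
from typing import List


def frame(payload: str) -> str:
    """Wrap a payload in remote protocol framing: $<payload>#<checksum>.

    The checksum is the sum of all payload bytes modulo 256, written as
    two lowercase hex digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def tracing_session(annex: str = "all", offset: int = 0,
                    length: int = 0x1000) -> List[str]:
    """Return the packets the client would send in steps (3), (5), (7)."""
    return [
        # Step (3): start tracing. Proposed by this RFC, not an existing packet.
        frame("Qbtrace:CoreSight"),
        # Step (5): stop tracing.
        frame("Qbtrace:off"),
        # Step (7): read the collected traces; offset and length are hex.
        frame(f"qXfer:btrace:read:{annex}:{offset:x},{length:x}"),
    ]


if __name__ == "__main__":
    for packet in tracing_session():
        print(packet)
```

Steps (4), (6) and (8) happen on the server and are not modelled here.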
6. Remote protocol extensions
The remote protocol needs to be extended with the following primitive to
support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)
The remote protocol can be extended with the following primitives to take
advantage of etm functionality:
- select the core to trace on in the case of a multicore system
GDB client sends command to select the core to trace
(Qbtrace-conf:CoreSight:core=value)
- set the trace id for the traces
GDB client sends command to set trace id
(Qbtrace-conf:CoreSight:id=value)
- select the context to trace
GDB client sends command to select the context to trace
(Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace
GDB client sends command to select the address range to trace
(Qbtrace-conf:CoreSight:filter:start-address=value)
(Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing
GDB client sends command to select the address to trigger tracing
(Qbtrace-conf:CoreSight:trigger:on-address=value)
(Qbtrace-conf:CoreSight:trigger:off-address=value)
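A minimal sketch of how a client could turn a trace configuration into the
proposed Qbtrace-conf packets. The packet names are the ones listed above; the
class name, field names and the choice to encode every value as hexadecimal
are assumptions made for this example, since the RFC only says "value".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CoreSightTraceConfig:
    """Optional parameters; unset fields produce no packet."""
    core: Optional[int] = None
    trace_id: Optional[int] = None
    context: Optional[int] = None
    start_address: Optional[int] = None
    end_address: Optional[int] = None
    trigger_on: Optional[int] = None
    trigger_off: Optional[int] = None

    def packets(self) -> List[str]:
        """Build one Qbtrace-conf payload per parameter that is set."""
        fields = [
            ("core", self.core),
            ("id", self.trace_id),
            ("filter:context", self.context),
            ("filter:start-address", self.start_address),
            ("filter:end-address", self.end_address),
            ("trigger:on-address", self.trigger_on),
            ("trigger:off-address", self.trigger_off),
        ]
        # Hex encoding of values is an assumption, not part of the proposal.
        return [
            f"Qbtrace-conf:CoreSight:{name}={value:x}"
            for name, value in fields
            if value is not None
        ]


if __name__ == "__main__":
    cfg = CoreSightTraceConfig(core=1, start_address=0x8000, end_address=0x80FF)
    for payload in cfg.packets():
        print(payload)
```

Keeping every parameter optional matches the text above: only Qbtrace:CoreSight
is required, the configuration packets are extensions.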
7. alternatives and discussions
7.1. Scope definition
Coresight ETM IP comes in many versions and many implementations.
Depending on its capabilities, it can trace instructions only, or
instructions and the involved data/data addresses. All ETM variants support
instruction tracing and can therefore be used for branch tracing.
7.2. CoreSight infrastructure exposure to the user
This is about assigning the responsibility for configuring the Coresight
infrastructure to generate and route traces. Two alternatives are possible:
- coresight infrastructure exposed to GDB client (and UI):
in this alternative the user or the UI is responsible for
configuring coresight IPs in the SoC, by accessing their registers
directly or via the coresight driver. Remote protocol is used to configure
the trace sink (ETB or TPA) to start/stop collecting traces
- coresight infrastructure is not exposed outside of gdbserver.
in this case high level commands can be provided by gdbserver
remote protocol to setup and configure coresight IPs in the SoC.
My recommendation is to extend remote protocol to provide high level
commands to setup and configure coresight IPs in the SoC, or to use a
different channel to pass configuration parameters to GDB server
7.3. Parameters needed for parsing traces
Some configuration parameters such as the etm version and trace id (content
of registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR) are needed for extracting
and parsing etm traces. Those parameters need to be exchanged between GDB
server and client. The following alternatives are possible:
- extend the remote protocol to get those params with explicit queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
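To illustrate the second alternative, here is a sketch of a btrace-conf style
XML reply that carries the register contents. The etm element and its
attributes are hypothetical and not part of any existing GDB DTD, the function
name is made up, and the register values in the usage example are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def etm_conf_xml(registers: Dict[str, int]) -> str:
    """Serialize ETM register contents into a btrace-conf style document.

    The <etm> element is a hypothetical extension used only for this sketch.
    """
    root = ET.Element("btrace-conf", {"version": "1.0"})
    # One attribute per register, value written as 32-bit hex.
    attributes = {name.lower(): f"0x{value:08x}" for name, value in registers.items()}
    ET.SubElement(root, "etm", attributes)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, not taken from real hardware.
    print(etm_conf_xml({"ETMCR": 0x1000, "ETMTRACEIDR": 0x10}))
```

The client would parse the attributes back before handing the trace buffer to
the decoder, so no extra round trip is needed compared to explicit queries.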
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Got to "kvm enabled with vhe" kernel message, but then
the kernel hangs there. Irritatingly works with the
kernel I built myself, but not a distro supplied kernel.
Need to track down the config difference so I can continue
using gdbstub.
[VIRT-344 # ARMv8.5-MemTag, Memory Tagging Extension ]
Regenerated an mte+linux-user branch for Google engineers
to use to develop llvm. This is code previously posted,
but my current branch stripped out linux-user for ease of
review of the system code.
[VIRT-327 # Richard's upstream QEMU work ]
Fixed mmap assert, signal handler method.
Fixed constant folding of extract2.
Fixed aarch64 host output of extract2.
Posted pull request for those.
Started reviewing the GSoC risugen patches, v3 for avx.
r~
== This Week ==
* PR90723
- Committed fix to trunk in r273466.
* PR90724
- Posted patch upstream, waiting for feedback.
* Validation
- SPEC2k6 with SVE seems to compile, but spotted a couple of infra issues.
- Sent abe patch to store sum files.
* Misc
- Meetings
== Progress ==
* GCC:
- FDPIC: received feedback on generic patches, will address after holidays
- noinit attribute: iterated on generic attribute patch, not approved yet
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- Non-contiguous memory regions support in the BFD linker: proposal
looks OK, will start implementation after holidays
* misc:
- infra fixes / troubleshooting
- reported several regressions in QEMU, promptly fixed by our awesome
team members
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== Holidays ==
July 13-27
Aug 2-11
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- posted {PATCH for semihosting-tests} semihosting tests: add v7m
tests Message-Id: <20190711135726.14191-1-alex.bennee(a)linaro.org>
- semihosting re-factor now in v4 branch
- cleaned up translator_ld stuff for arm
- posted {PATCH for 4.1?} includes: remove stale {smp|max}_cpus
externs Message-Id: <20190711130546.18578-1-alex.bennee(a)linaro.org>
- fixed up code needing smp/max_cpus
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[v4 branch] https://github.com/stsquad/qemu/tree/plugins/plugins-v4
GSoC Mentoring ([VIRT-348])
- starting to look quite workable
- looks like chunks of CONFIG_PROFILER can be made runtime
select-able
Upstream Work ([VIRT-109])
==========================
- more regression hunting for 4.1 release
- looked at bugs [1834496] and [1836078]
- posted {PATCH v2 for 4.1} target/arm: report ARMv8-A FP support
for AArch32 -cpu max Message-Id:
<20190711103737.10017-1-alex.bennee(a)linaro.org>
- rth and pm215 also posted various fixes
- ieee_6 test looks like a [fortran/gcc runtime issue]
- posted {PATCH for 4.1? v1 0/7} testing/next (docker, win-cross)
Message-Id: <20190712111849.9006-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[1834496] https://bugs.launchpad.net/bugs/1834496
[1836078] https://bugs.launchpad.net/bugs/1836078
[fortran/gcc runtime issue]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314
Completed Reviews [5/5]
=======================
{PATCH 0/5} tcg: Fix mmap_lock assertion failure, take 2
Message-Id: <87zhlned2x.fsf(a)zen.linaroharston>
{PATCH for-4.1} target/arm: Set VFP-related MVFR0 fields for arm926 and arm1026
Message-Id: <20190711121231.3601-1-peter.maydell(a)linaro.org>
{PATCH for-4.1 0/2} Compatibility fixes for nettle 2.7 vs 3.0 vs 3.5
Message-Id: <20190712101849.8993-3-berrange(a)redhat.com>
{PATCH v2 0/5} tests/docker: add podman support
Message-Id: <20190709194330.837-1-marcandre.lureau(a)redhat.com>
- CLOSING NOTE [2019-07-12 Fri 18:07]
Looks ok - need to get a podman system up for testing
{RISU PATCH v3 00/18} Support for generating x86 SIMD test images
Message-Id: <20190711223300.6061-5-jan.bobek(a)gmail.com>
Absences
========
- 18-19th July
Current Review Queue
====================
* {PATCH 0/2} tests/acceptance: Add test of NeXTcube framebuffer using OCR
Message-Id: <20190629150056.9071-1-f4bug(a)amsat.org>
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
* {PATCH 0/3} tests/acceptance: Add tests for the Leon3 board
Message-Id: <20190627115331.2373-1-f4bug(a)amsat.org>
* {PATCH 0/5} tests/acceptance: Add bFLT loader linux-user test
Message-Id: <20190625101524.13447-1-philmd(a)redhat.com>
* {PATCH v2 0/9} KVM: arm/arm64: vgic: ITS translation cache
Message-Id: <20190611170336.121706-1-marc.zyngier(a)arm.com>
* {Qemu-devel} {RFC PATCH 0/7} Proof of concept for Meson integration
Message-Id: <1560165301-39026-1-git-send-email-pbonzini(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ QEMU 4.1 rc0 sent out of the door
+ investigated and sent fix for a regression with arm926 and arm1026
emulation: we accidentally turned off VFP double-precision support
on these cores with the recent VFP refactoring
+ helped track down a booting failure Beata ran into on aarch64 hosts
to a regression in the TCG backend, which RTH has now sent a patch for
+ sent a cleanup for some dead code spotted by Coverity in the imx6ul SoC
thanks
-- PMM
== Progress ==
* Investigate running benchmarks in containers [TCWG-1513]
- Faffing about with our benchmarking scripts, not sure how to test
changes without disrupting our infrastructure
- Cooked up some viz scripts so I can easily look at the noise
levels in benchmark results with/without containers
* Started LLVM 8.0.1 rc4 build
- In progress on ARM, infrastructure issues on AArch64
== Plan ==
* Upcoming vacation: 6 - 13 August
[LLVM-583] LLVM Code Size reduction ideas from Zephyr and CMSIS
- Started a ticket to record areas of improvement where GCC does
better than LLVM.
- Upstream defaults to -mno-unaligned-access for clang which needs to
be corrected for.
- Much of the difference goes away when inlining is disabled, implying
that different inlining strategies could be the most significant
difference.
- Sent in Linaro Connect presentation submission to cover all of TCWG's
code-size improvement work.
Planned absences
- Rest of this week, back in the office on the 15th July
[Code size investigation]
Results (clang 2% larger than gcc) replicated on cortex-m0 and
cortex-m4 on Zephyr.
- Clang optimisation to use BLX rather than BL when same function
called multiple times is a pessimisation on Zephyr, especially on M0.
- GCC register allocation seems to result in fewer spills
TODO: Get an estimate of how much code-size difference is down to
different inlining decisions.
On CMSIS DSP cortex-m4f clang appears to be producing smaller code than
GCC; averages not measured yet.
[LLD]
- Quite a few upstream reviews, PRs and investigations surrounding them.
- Likely that LLD will be converting to the new variable naming convention.
- Received a request to add cortex-a8 erratum fix for Google Android team.
[Linaro Connect]
Registered and contacted travel.
Drafted a submission for presentation, will submit next week.
Planned Absences:
On holiday Wednesday, Thursday, Friday next week
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ catching up with email and code review -- managed to get some
series reviewed in time for softfreeze on Tuesday, notably the
'sbsa-ref' reference platform model that Hongbo Zhang was working on
+ a lot of release-herding, working through the huge pile of pull
requests that need merging
+ fixed a silly bug in recent VFP refactoring, spotted by coverity
+ fixed a memory leak that broke our CI sanitizer build (not a new
piece of code, but we currently only sanitize the x86-64 targets
and a recent change meant this old code is now used on x86-64 for
the ATI PCI display device model)
* Misc:
+ first KVM Forum Programme Committee meeting (and attendant
review of all the submitted abstracts; bumper crop this year)
thanks
-- PMM
== Progress ==
* GCC:
- FDPIC: No progress, still waiting for feedback
- noinit attribute: reviewers asked to make it a generic attribute,
rather than target-specific. New patch sent.
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- no progress this week
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- PR24709 (linker crash and assertion failure with CMSE). Use case not
considered worth the headache of correctly supporting CMSE+long-branch
stubs (tricky to get right). Replaced the linker crash with a user
error message.
- Non-contiguous memory regions support in the BFD linker: Received a
good summary of the consensus reached in 2017.
* misc:
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/linker support for non-contiguous memory regions
- GNU-583
- GCC upstream validation: Add a config for cortex-m33 (v8-m)
== Holidays ==
July 13-27
Aug 2-11
== This Week ==
* PR88833
- Fixed pending issues with x86 and committed fix to trunk
* PR90723
- Issue seems to be infinite recursion overflowing the stack, investigating.
* Misc
- Meetings
== Next Week ==
- PR90723
- Add testsuite comparison to tcwg_gnu
== Progress ==
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Still working around perf version mismatches, going to investigate
if we can use a newer version of perf to collect data
- Had a look at the assembly for sphinx from gcc-6, clang-3.9.1 and
clang-8.0.0, but going to wait for better perf before I rush to
conclusions
* IR SVE Reviews [ LLVM-545]
- Had a look at the clang patches, gave some feedback for one of
them; the other ones are very subtle and best left to the clang
maintainers
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Fixed a little ASAN snag, should be done for good now
* Trying to get up an ABI fuzzer for GlobalISel
== Plan ==
* Update benchmarking infrastructure (in support of LLVM-134)
* Deprioritize GlobalISel
* Upcoming vacation: 6 - 13 August
Short week; 3 days.
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Fixed 3 bugs:
* SVE length calculation,
* PNX bit while in EL2&0 regime,
* Interrupt routing w/ TGE bit.
A bare 5.2 kernel boots to root file system not found.
A 4.19 kernel hangs during boot somewhere.
Now doing a fedora30 install, which I believe has a 5.x kernel...
[VIRT-327 # Richard's upstream QEMU work ]
Patch review:
* GSoC x86 risugen,
* arm semihost cleanups.
r~
== This Week ==
* PR88833 (6/10)
- Patch approved by Richard.
- Regresses pr88152.C due to a possibly latent issue with combine.
* PR90722 (3/10)
- Investigating the issue
* Misc (1/10)
- Meetings
o LLVM:
* 8.0.1-rc3: Built ARM Binaries, AArch64 on-going.
* Machine Outliner:
- Uncovered issues in stack fixup handling, working on a fix.
o Misc
* Various meetings and discussions.
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Rebased on master. Now trying to remember the incantation that produced a
minimum number of insns before the kernel actually tries to use this. At
present things seem to be crashing before I even get that far, as if I've
misconfigured something.
[VIRT-327 # Richard's upstream QEMU work ]
Fixed a couple of bugs in my tcg/ppc host vector patch set.
Reviewed qemu kvm sve patches.
Reviewed target/ppc altivec optimization patches.
Reviewed GSoC risugen x86 patches.
Other misc upstream review.
r~
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- more work on [v4 branch] but cutting it too fine for 4.1
- reworking the memory tracing to track mmu_idx
- hope to have v4 posted on Monday if I can squash the bugs
- the plugin call isn't getting the full TCGMemopidx (maybe only
TCGmemop?)
- some interest on list from HW manufacturers
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[v4 branch] https://github.com/stsquad/qemu/tree/plugins/plugins-v4
GSoC Mentoring ([VIRT-348])
- reviewed {Qemu-devel} {PATCH v2 0/4} dumping hot TBs Message-Id:
<20190624055442.2973-1-vandersonmr2(a)gmail.com>
- worked up some [suggestions for HMP interface and refactoring]
- first evaluation period work
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
[suggestions for HMP interface and refactoring]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
Upstream Work ([VIRT-109])
==========================
- posted {PULL 00/19} testing/next (tests/vm, Travis and hyperv build
fix) Message-Id: <20190624134337.10532-1-alex.bennee(a)linaro.org>
- bit late, will wait until pm215 returns
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Other
=====
- booking flights for Connect
- submitted a talk for KVM Forum (plugins)
Completed Reviews [2/2]
=======================
{Qemu-devel} {PATCH v2 0/4} dumping hot TBs
Message-Id: <20190624055442.2973-1-vandersonmr2(a)gmail.com>
- CLOSING NOTE [2019-06-27 Thu 12:39]
Made a bunch of notes and [tweaks]
[tweaks]
https://github.com/stsquad/qemu/tree/review/hotblocks-v2-tweaks
{PATCH} Makefile: Rename the 'vm-test' target as 'vm-help'
Message-Id: <20190531064341.29730-1-philmd(a)redhat.com>
- CLOSING NOTE [2019-06-27 Thu 12:40]
Queued to my tree
Absences
========
- June 21st
Current Review Queue
====================
* {Qemu-devel} {PATCH 0/4} Introduce the microvm machine type
Message-Id: <20190628115349.60293-1-slp(a)redhat.com>
* {PATCH 0/3} tests/acceptance: Add tests for the Leon3 board
Message-Id: <20190627115331.2373-1-f4bug(a)amsat.org>
* {PATCH 0/5} tests/acceptance: Add bFLT loader linux-user test
Message-Id: <20190625101524.13447-1-philmd(a)redhat.com>
* {PATCH v2 0/9} KVM: arm/arm64: vgic: ITS translation cache
Message-Id: <20190611170336.121706-1-marc.zyngier(a)arm.com>
* {Qemu-devel} {RFC PATCH 0/7} Proof of concept for Meson integration
Message-Id: <1560165301-39026-1-git-send-email-pbonzini(a)redhat.com>
* {PATCH 00/59} KVM: arm64: ARMv8.3 Nested Virtualization support
Message-Id: <20190621093843.220980-1-marc.zyngier(a)arm.com>
--
Alex Bennée
== Progress ==
* GCC:
- FDPIC: No progress, still waiting for feedback
- noinit attribute: no feedback yet
* GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
- managed to reproduce similar error messages
* GCC upstream validation:
- reported a few regressions.
* Binutils:
- PR24709 (linker crash and assertion failure with CMSE). CMSE stubs
do not support long branches. Tried two approaches, but didn't find a
solution yet
- resurrected a thread about non-contiguous memory regions support in
the BFD linker
* misc:
- reported regressions in qemu, found with the GCC testsuite
- infra fixes / troubleshooting
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
- GNU-583
== Progress ==
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Committed upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- In progress
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Got some results with clang-3.9.1 and clang-8.0.0, trying to work
around a perf annotate issue so I can investigate
== Plan ==
* Continue with these
* Friday off
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: no feedback yet
* Binutils:
- started looking at PR24709 (linker crash and assertion failure with CMSE)
* misc:
- ABE: pushed fix to glibc-2.29+ builds, tcwg-backport job now works again
- forwarded GCC/Linaro bug #5314 to upstream; quickly fixed by richi
- looked at a couple of old Jira cards. Not sure how to resume work on
GNU-583 (Fix Linux kernel built for Thumb-2 with GCC using LTO)
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- binutils/PR24709
o 4 days week (1 training day)
o LLVM:
* 8.0.1-rc2: ARM and AArch64 binaries built and uploaded.
* Buildbots babysitting: couple of issues reported.
* Machine Outliner:
- Adding testcases before upstream submission.
o Misc
* Various meetings and discussions.
== Progress ==
* Out of office on Friday (bank holiday)
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Patches upstream
* [ARM GlobalISel] Add support for integers > 32 bits wide [LLVM-310]
- Started looking into call lowering for 64-bit types
* LLVM SPEC2k6 Performance Analysis [LLVM-134]
- Trying to reproduce results from Connect, hit a little snag with Jenkins
== Plan ==
* Continue with these
[LLVM-542] Build Zephyr with clang
- Spent quite a bit of time working out why a Clang-built Zephyr hello
world wouldn't boot, tracked down to a missing clobber list in an
inline assembly block
- Wrote some scripts to collect code size information on the samples.
Some initial figures on mostly cortex-m3 put llvm -Oz trunk about 2%
larger than arm-none-eabi-gcc (9.1.1) -Os with frame pointers
disabled. The samples are making very little use of the library
(newlib built with arm-none-eabi-gcc).
- Working out which samples will build with cortex-m0.
- Investigated the latest version of Bloaty McBloatface, a code size tool
from Google. Has some interesting features including an inline
detection feature that can map a portion of a function's code size to
inlined functions.
Misc:
LLD reviews and mailing list comments.
Planned Absences:
Likely to take some holiday around 13th July
Week ending 23 June is very short.
* Patch review
- target/ppc vsr cleanup
- target/tricore translator loop conversion
- target/arm vfp decodetree cleanup
- cortex-strings strrchr fix
- continuing on the plugin api
* Xilinx meeting
* Fix qemu assert for clyon
- Found two other bugs in the process.
* Debugging my own USHR/SSHR patch vs aa32.
r~
Progress: (very short week, 2 days)
* VIRT-65 [QEMU upstream maintainership]
+ found and sent patches for a handful of minor M-profile bugs
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ the final patches for this have now gone upstream and I have
marked the JIRA issue as resolved. There may still be minor
bug fixes but we can handle those under the usual 'upstream
maintainership' JIRA
thanks
-- PMM
Hello folks,
I got a few people asking me to do this in the last Connect, so I've
proposed a beginner session that explores gcc under the hood. The
tentative plan I have for the talk is:
1. A high level view of how the source code is laid out
2. Front end, middle end, backend. This includes a high level
introduction of GIMPLE and machine descriptions
3. A walkthrough of one or two simple programs and usage of diagnostic
flags like -fdump-*
Additional suggestions are most welcome. Also, I was thinking maybe
it would be good to have a llvm under the hood talk along similar
lines. Thoughts?
Siddhesh
[LLVM-571] Build GNU rmprofile toolchain with Linaro scripts (abe)
The existing build-system was only set up to build the A-profile bare
metal toolchain. Managed to find right combination of flags and
modifications to get a toolchain that zephyr can use.
[LLVM-158] Buildbot maintenance
An interesting failure introduced in LLD, but causing segfaults in
2-stage build, now fixed.
[LLVM-542] Zephyr code size investigation
- Rebased modifications to Zephyr
- Wrote script to build all the examples with GCC and Clang
- Fixed problems with modifications found by building all the examples
- Clang built helloworld no longer booting, need to investigate
- Found some areas for more investigation:
-- llvm-objcopy missing --gap-fill (used by one of the sample programs)
-- lld missing --print-memory-usage, while I'm using gcc for the main
link, zephyr build system seems to be feature testing using clang bare
metal default linker (lld)
-- clang always generates .note.GNU.stack, gcc embedded does not,
leading to lots of orphan section warnings. Probably best solved by
linker script modification.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ some work on getting Sphinx to generate manpages, with a conversion
of the qemu-ga manpage from texinfo as the demonstration case
+ set up to run pre-merge tests on our packet.net qemu-test machine
rather than the gcc compile farm one (as the latter is running an
ancient ubuntu whose python is now too old to build QEMU with)
+ sent some follow-on cleanup patches now that VFP decodetree is in master
* VIRT-268 [QEMU support for dual-core Cortex-M Musca board]
+ sent out patches which correctly make the Cortex-M33 (and the -M4)
implement single-precision-only floating point, so the double-precision
instructions UNDEF as they should
+ once those and the other on-list patches have made their way through
code review and into master this epic will be complete
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- noinit attribute: patch posted for upstream review
* misc:
- training 3 days
== Next ==
GCC:
- handle feedback on FDPIC and noinit patches
- UBSAN/bare-metal: do more testing
== Progress ==
* Catching up after holidays
* [GlobalISel] Refactor CallLowering [LLVM-568]
- Still in progress, but I fixed the AMDGPU failure
- Decided to go a bit deeper than initially intended, since it gets
really awkward otherwise
* IR SVE Reviews [ LLVM-545]
- New version of the patch was posted, had a quick look
== Plan ==
* Continue LLVM-568
* Start LLVM-134 (LLVM SPEC2k6 Performance Analysis Leftovers from BUD17)
* LLVM-545 (IR SVE Reviews) - Have a look at the new, revised patch
Hi all,
Considering the big progress achieved in the coresight drivers, perf, and
opencsd, the prerequisites for developing branch tracing in gdb for arm
processors, based on etm, are now available. Therefore I am publishing
this request for comments, and I am looking forward to your feedback on
this proposal.
Non intrusive execution recording for GDB using ARM CoreSight
Status of this Memo
This memo provides information for Linaro coresight and toolchain communities. Distribution of this memo is unlimited.
Abstract
A method of realizing execution recording in GDB in a non-intrusive way. This method is based on the use of CoreSight hardware tracing, available on ARM Cortex devices.
Table of Contents
1 Introduction
2 State of the art
3 Use cases
  3.1 Self hosted debug monitor
  3.2 Remote debug monitor
  3.3 External debugger
4 Implementation needs
  4.1 Self hosted debug monitor
  4.2 Remote Debug monitor
  4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Solutions and alternatives
  7.1 Scope definition
  7.2 CoreSight infrastructure exposure to the user
  7.3 Parameters needed for parsing traces
1. Introduction
CoreSight technology offers a toolset for tracing the execution of a
program on a CPU, as well as routing the traces to an external trace port
analyzer or storing them in a dedicated internal memory. Those traces do
not affect system performance, and can be used as a record of program
execution.
GDB offers reverse debugging by recording program execution and storing
it. GDB offers either full record or program flow (branch) record. Records
can be replayed later on for forward or backward debugging.
This request for comments is about realizing GDB record and replay
functionality using CoreSight technology. It presents typical use cases
and discusses different alternatives for realizing the above mentioned
feature.
2. State of the art
GDB currently supports two execution recording variants:
- full record: where registers as well as memory are recorded for each
  instruction. In this case GDB collects the registers as well as the
  involved memory area after each instruction. Currently this has no
  support for hardware accelerators.
- branch record: where only program flow is recorded. In this case GDB
  collects a list of linear execution sequences called blocks. Each branch
  terminates the previous block and starts a new one. Currently branch
  record is implemented either without hardware acceleration or using the
  Intel branch trace store "bts" and Intel processor trace "pt" hardware
  accelerators on supported cpus.
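The notion of blocks used by branch record can be sketched as follows. This is
a simplified illustration and not GDB's actual btrace data structures: the
function name and the (address, size) input format are assumptions, and a block
is simply closed whenever the next executed instruction does not directly
follow the previous one in memory.

```python
from typing import Iterable, List, Optional, Tuple


def split_into_blocks(trace: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Group executed instructions into linear blocks.

    trace yields (address, size) for each executed instruction, in order.
    Returns (first_address, last_address) for each block.
    """
    blocks: List[Tuple[int, int]] = []
    start: Optional[int] = None
    prev_addr = 0
    prev_size = 0
    for addr, size in trace:
        if start is None:
            start = addr
        elif addr != prev_addr + prev_size:
            # Execution did not fall through: a branch ended the block.
            blocks.append((start, prev_addr))
            start = addr
        prev_addr, prev_size = addr, size
    if start is not None:
        blocks.append((start, prev_addr))
    return blocks


if __name__ == "__main__":
    executed = [(0x100, 4), (0x104, 4), (0x200, 4), (0x204, 4), (0x108, 4)]
    for first, last in split_into_blocks(executed):
        print(f"block {first:#x}..{last:#x}")
```

With a hardware tracer the list of executed addresses is not captured directly
but reconstructed from the decoded trace packets and the program image.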
3. Use cases
Programs running on ARM processors can be debugged in many configurations.
Three of them are selected in this RFC as a base for discussion:
3.1. Self hosted debug monitor
Those are systems where the debugger program runs on the same cpu as the
debugged program and monitors it. The user interacts with the debugging
session on the target host itself. Linux gdb is an example of such systems.
In such a system the following setup is considered:
- Target: a process running on an ARM cortex A
- Debugger: gnu gdb via ptrace API (arm-linux-gnueabihf-gdb)
+-----------------------------------+
|              Target               |
|                    +------------+ |
| +------+           | Coresight  | |
| |      |           | components:| |
| | GDB  |<--------->|            | |
| |      |     ^     | DWT, ETM,  | |
| +------+     |     | ITM, TPIU  | |
|    ^         |     | TMC, ETB   | |
|    |         |     +------------+ |
+----|---------|--------------------+
     |         |
     |         |
 arm-linux-    |
 gnueabihf-    |
 gdb           |
        debug: ptrace
        trace: perf/CoreSight drivers
3.2. Remote debug monitor
Those are usually systems where the debugger program runs on the same cpu
as the debugged program and monitors it. The user interacts with the
debugging session remotely from a PC. Linux gdb is an example of such
systems. In such a system the following setup is considered:
- Target: a process running on an ARM cortex A
- Gdb server: gnu gdbserver (arm-linux-gnueabihf-gdbserver)
- Gdb client: gnu gdb (arm-linux-gnueabihf-gdb)
- UI: eclipse with needed plugins, MI interface is used.
+--------------------------+     +---------------------------------------+
|           Host           |     |                Target                 |
|                          |     |                        +------------+ |
| +-----+     +------+     |     |     +------+           | Coresight  | |
| |     |     | GDB  |     |     |     | GDB  |           | components:| |
| | UI  |<--->|      |<--->|<--->|<--->|      |<--------->|            | |
| |     |  ^  |Client|  ^  |  ^  |     |Server|     ^     | DWT, ETM,  | |
| +-----+  |  +------+  |  |  |  |     +------+     |     | ITM, TPIU  | |
|    ^     |     ^      |  |  |  |        ^         |     | TMC, ETB   | |
|    |     |     |      |  |  |  |        |         |     +------------+ |
+----|-----|-----|------|--+  |  +--------|---------|--------------------+
     |     |     |      |     |           |         |
     |     |     |      |     |           |         |
  Eclipse  | arm-linux- |     |      arm-linux-     |
           | gnueabihf- |  TCP/IP    gnueabihf-     |
           | gdb        |   UART     gdbserver      |
        GDB MI     GDB remote                 debug: ptrace
                    protocol                  trace: perf/CoreSight drivers
3.3. External debugger
Those are systems where an external debugger is used. It accesses the
target using JTAG or SWD. The target is usually a bare metal embedded
system or a system with an rtos. As an example, the following setup is
considered:
- Target: firmware running on ARM cortex M.
- Debugger: external debug and trace device.
- Gdb server: OpenOcd.
- Gdb Client: arm-none-eabi-gdb.
- UI: eclipse with needed plugins, MI interface is used.
+--------------------------------------+     +-------+     +-------------+
|                 Host                 |     | dbggr |     |   Target    |
|                                      |     |       |     |             |
| +-----+     +------+     +------+    |     |       |     |  Coresight  |
| |     |     | GDB  |     | GDB  |    |     | Debug |     | components: |
| | UI  |<--->|      |<--->|      |<-->|<--->|   +   |<--->|             |
| |     |  ^  |Client|  ^  |Server|    |  ^  | Trace |  ^  |  DWT, ETM,  |
| +-----+  |  +------+  |  +------+    |  |  |       |  |  |  ITM, TPIU  |
|    ^     |     ^      |     ^        |  |  |       |  |  |             |
|    |     |     |      |     |        |  |  |       |  |  |             |
+----|-----|-----|------|-----|--------+  |  +-------+  |  +-------------+
     |     |     |      |     |           |             |
     |     |     |      |     |           |             |
  Eclipse  | arm-none-  |  OpenOcd        |             |
           | eabi-gdb   |   PyOcd         |             |
           |            |                 |             |
        GDB MI     GDB remote         Ethernet    debug: JTAG/SWD
                    protocol             USB      trace: Serial/Parallel
4. Implementation needs
4.1 Self hosted debug monitor
gdb: arm-linux-gnueabihf-gdb
The interface defined in btrace.h for capturing and processing traces has
to be implemented for arm CoreSight.
needed actions:
- in btrace-common.h: add needed structures for capturing and handling
  etm traces
- in linux-btrace.h:
  - add btrace_tinfo_etm
  - amend btrace_target_info
- in linux-btrace.c: change the following functions to support etm traces
  - linux_enable_btrace
  - linux_disable_btrace
  - linux_read_btrace
  - linux_btrace_conf
- in arm-linux-nat.c: add an api to
  - configure btrace
  - enable btrace
  - disable btrace
  - read btrace
- in btrace.c
  - btrace_add_pc
    btrace_fetch has to be implemented for Coresight. This means using
    the opencsd library to parse etm traces and then reconstruct executed
    instructions accordingly (btrace_compute_ftrace_1)
- in record-btrace.c
  - add command for showing record btrace etm options
  - add command for starting tracing with CoreSight and its handler
    (cmd_record_btrace_etm_start)
  - adapt cmd_show_record_btrace_cpu ...
perf:
needed actions:
- make sure that perf can start/stop tracing a process with its threads,
  collect etm traces and deliver them to the user
4.2 Remote Debug monitor
The changes described in 4.1 are needed. In addition, to support the remote protocol, the following changes are needed.
gdb server: arm-linux-gnueabihf-gdbserver
Needed actions:
- in linux-low:
  - linux_low_read_btrace: add support for formatting ETM traces.
  - linux_low_btrace_conf: add support for formatting the ETM configuration.
gdb client: arm-linux-gnueabihf-gdb
Needed actions:
- in remote.c:
  - adapt enable_btrace
  - adapt disable_btrace
- in btrace.c:
  - parse_xml_btrace: update btrace.dtd [2] and the related data structures btrace_xxx
  - parse_xml_btrace_conf: update btrace-conf.dtd [3] and the related data structures btrace_conf_xxx
- extend the remote protocol handling to support CoreSight ETM traces
UI: Eclipse
Needed actions:
- make sure that the plugin for recording execution and replaying it copes well with arm-linux
The remote protocol needs to be extended by:
-1- Adding Qbtrace:CoreSight (or etm) to start collecting ETM traces
-2- Amending the 'Branch Trace Format' XML specification to cover the transfer of ETM traces
-3- Amending the 'Branch Trace Configuration Format' XML specification to cover the parameters needed for ETM
4.3 External debugger
The changes described in 4.2 are needed. In addition, to support tracing a remote target through an external debugger (bare-metal embedded system), the following changes are needed.
gdb server: OpenOCD
Needed actions:
- rework the ETM driver to bring it up to date.
- add a driver for configuring the trace interconnect IPs.
- rework the driver for the TPIU.
- integrate support for a trace port analyzer.
- extend the remote protocol implementation to support recording.
The CoreSight infrastructure of the SoC is to be set up in OpenOCD through configuration files. Parameters that are not relevant for gdb are also specified in configuration files (trace sink, trace protocol, port size, trace sync frequency, cycle-accurate tracing, etc.).
gdb client: arm-none-eabi-gdb
Needed actions:
- extend the remote protocol to support CoreSight ETM traces
- integrate an ETM trace parsing library
- interface the parser to record_btrace_target
In addition to 4.2, the remote protocol needs to be extended by:
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoCs
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support filtering traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value and Qbtrace-conf:CoreSight:filter:end-address=value to support filtering traces for given functions/blocks/libraries
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value and Qbtrace-conf:CoreSight:trigger:off-address=value to support starting or stopping tracing when a certain function/block/library is executed
Alternatively, some of the configuration related to filtering and triggering can be delegated to the GDB server.
UI: Eclipse
- test and verify that the existing plugins cope well with the gdb extensions
5. Remote protocol execution sequence
gdb and gdbserver communicate using the gdb remote protocol. On a semantic level, a tracing session runs through the following sequence:
(1) gdb client queries gdb server support for branch trace
(2) gdb server answers with
    - qXfer:btrace:read
    - qXfer:btrace-conf:read
    - Qbtrace:off
    - Qbtrace:CoreSight
    - Qbtrace-conf:CoreSight:xxx, where xxx is the parameter name
(3) gdb client sends the command to start emitting and collecting traces (Qbtrace:CoreSight)
(4) gdb server executes the command
(5) gdb client sends the command to stop emitting and collecting traces (Qbtrace:off)
(6) gdb server executes the command
(7) gdb client sends the command to get the collected traces from the trace sink (qXfer:btrace:read:annex:offset,length)
(8) gdb server executes the command and sends back the collected traces
(9) gdb client parses the traces and reconstructs the target states
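The sequence above can be sketched from the client side as follows. This is only an illustration: the $payload#checksum framing and the "name+" feature syntax of the qSupported reply are standard remote protocol behaviour, but Qbtrace:CoreSight is the packet proposed in this document (not something gdb implements), the annex "all" and the read length are assumed values, and the helper names are made up for the example.

```python
def frame(payload: str) -> str:
    """Wrap a payload in remote protocol framing: $<payload>#<checksum>.

    The checksum is the sum of the payload bytes modulo 256, as two hex
    digits. Escaping of '$', '#', '}' and '*' is omitted because none of
    the packets below contain those characters.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def parse_features(reply: str) -> dict:
    """Parse a qSupported reply such as 'PacketSize=3fff;Qbtrace:off+'."""
    features = {}
    for item in reply.split(";"):
        if "=" in item:
            name, value = item.split("=", 1)
            features[name] = value
        elif item and item[-1] in "+-?":
            features[item[:-1]] = item[-1]
    return features


def can_trace_coresight(reply: str) -> bool:
    """Steps (1)-(2): does the server advertise the proposed packets?"""
    features = parse_features(reply)
    needed = ("qXfer:btrace:read", "Qbtrace:off", "Qbtrace:CoreSight")
    return all(features.get(name) == "+" for name in needed)


def session_packets(length: int = 0x1000) -> list:
    """Steps (3), (5) and (7): start, stop, then read the trace."""
    return [
        frame("Qbtrace:CoreSight"),                  # proposed packet
        frame("Qbtrace:off"),
        frame(f"qXfer:btrace:read:all:0,{length:x}"),  # annex is assumed
    ]


if __name__ == "__main__":
    reply = "PacketSize=3fff;qXfer:btrace:read+;Qbtrace:off+;Qbtrace:CoreSight+"
    if can_trace_coresight(reply):
        for packet in session_packets():
            print(packet)
```

A server that does not list Qbtrace:CoreSight in its reply makes can_trace_coresight return False, so the client can fall back cleanly.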
6. Remote protocol extensions
The remote protocol needs to be extended with the following primitive to support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)
The remote protocol can be extended with the following primitives to take advantage of ETM functionality:
- select the core to trace in the case of a multicore system:
  gdb client sends a command to select the core to trace (Qbtrace-conf:CoreSight:core=value)
- set the trace ID for the traces:
  gdb client sends a command to set the trace ID (Qbtrace-conf:CoreSight:id=value)
- select the context to trace:
  gdb client sends a command to select the context to trace (Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace:
  gdb client sends commands to select the address range to trace (Qbtrace-conf:CoreSight:filter:start-address=value) (Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing:
  gdb client sends commands to select the addresses that trigger tracing (Qbtrace-conf:CoreSight:trigger:on-address=value) (Qbtrace-conf:CoreSight:trigger:off-address=value)
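As a rough illustration of the optional primitives listed above, the sketch below builds the payloads a client might send before starting a trace. The packet names are the ones proposed in this document; encoding the values as hexadecimal is an assumption (the proposal only says "value"), and the function and parameter names are hypothetical.

```python
def coresight_conf_payloads(core=None, trace_id=None, context=None,
                            addr_range=None, trigger=None):
    """Return the proposed Qbtrace-conf:CoreSight payloads, in order.

    Every argument is optional; only the options that are set produce a
    packet. Values are formatted as hex, which is an assumption.
    """
    prefix = "Qbtrace-conf:CoreSight:"
    payloads = []
    if core is not None:
        payloads.append(f"{prefix}core={core:x}")
    if trace_id is not None:
        payloads.append(f"{prefix}id={trace_id:x}")
    if context is not None:
        payloads.append(f"{prefix}filter:context={context:x}")
    if addr_range is not None:
        start, end = addr_range
        payloads.append(f"{prefix}filter:start-address={start:x}")
        payloads.append(f"{prefix}filter:end-address={end:x}")
    if trigger is not None:
        on_address, off_address = trigger
        payloads.append(f"{prefix}trigger:on-address={on_address:x}")
        payloads.append(f"{prefix}trigger:off-address={off_address:x}")
    return payloads


if __name__ == "__main__":
    # Trace core 1 with trace ID 0x10, restricted to one address range.
    for payload in coresight_conf_payloads(core=1, trace_id=0x10,
                                           addr_range=(0x8000, 0x80FF)):
        print(payload)
```

Keeping each option in its own packet means a server that rejects one of them (for example an ETM without context filtering) does not invalidate the rest of the configuration.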
7. Alternatives and discussions
7.1. Scope definition
The CoreSight ETM IP comes in many versions and implementations. Depending on its capabilities, it can trace instructions only, or instructions together with the data and data addresses involved. All ETM variants support instruction tracing and can therefore be used for branch tracing.
7.2. CoreSight infrastructure exposure to the user
This is about assigning the responsibility for configuring the CoreSight infrastructure to generate and route traces. Two alternatives are possible:
- CoreSight infrastructure exposed to the gdb client (and UI): in this alternative the user or the UI is responsible for configuring the CoreSight IPs in the SoC, by accessing their registers directly or via the CoreSight driver. The remote protocol is used to configure the trace sink (ETB or TPA) to start/stop collecting traces.
- CoreSight infrastructure not exposed outside of gdbserver: in this case high-level commands can be provided by the gdbserver remote protocol to set up and configure the CoreSight IPs in the SoC.
My recommendation is to extend the remote protocol to provide high-level commands to set up and configure the CoreSight IPs in the SoC, or to use a different channel to pass configuration parameters to the gdb server.
7.3. Parameters needed for parsing traces
Some configuration parameters, such as the ETM version, trace ID ... (content of registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR), are needed for extracting and parsing ETM traces. Those parameters need to be exchanged between gdb server and client. The following alternatives are possible:
- extend the remote protocol to get those parameters with explicit queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
Best Regards
Zied Guermazi
o LLVM
* Machine outliner:
- Rebased on upstream.
- Improved stack alignment handling
- More cleanup before submission
o Misc
* Various meetings and discussions.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Posted v6. Some review from Dave Martin; will need at least
one further revision, and to wait til the kernel patches land.
[VIRT-327 # Richard's upstream QEMU work ]
Fix a reported bug in pauth Auth results.
Fix a reported bug in vector variable shift.
Review Peter's vfp decodetree patch set.
Review of v18 of target/rx. Never-ending, it seems...
Review of some target/ppc vector patches.
Posted v4 of CPUNegativeOffsetState. This looks ready to pull.
r~
Upstream Work ([VIRT-109])
==========================
- spent ages tracking down 64-on-32 cputlb errors which led to:
- adding x86_64 support to TCG system tests
- {PATCH v1 0/4} softmmu de-macro fix with tests Message-Id:
<20190605162326.13896-1-alex.bennee(a)linaro.org>
- {PATCH} cputlb: cast size_t to target_ulong before using for
address masks Message-Id:
<20190606154310.15830-1-alex.bennee(a)linaro.org>
- took over maintainership of orphaned gdbstub
- posted {PULL 00/52} testing, gdbstub and cputlb fixes Message-Id:
<20190607090552.12434-1-alex.bennee(a)linaro.org>
- problems on hackbox led to {PATCH} tests/vm: favour the locally
built QEMU for bootstrapping Message-Id:
<20190607185337.14524-1-alex.bennee(a)linaro.org>
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
Other Tasks
===========
- Did some initial reading on RPMB
- Half day fire training
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {Qemu-devel} {PATCH v4 00/39} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190604203351.27778-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH v2} target/arm: Vectorize USHL and SSHL
Message-Id: <20190603232209.20704-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {PATCH resend} test-thread-pool: be more reliable
Message-Id: <20190530093417.23370-1-pbonzini(a)redhat.com>
* {PATCH 0/4} tests/docker: add podman support
Message-Id: <20190523234011.583-1-marcandre.lureau(a)redhat.com>
* {Qemu-devel} {PATCH 0/2} Implement PowerPC FPSCR flag Fraction Rounded
Message-Id: <20190525022008.24788-1-programmingkidx(a)gmail.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
--
Alex Bennée
[LLVM-122] BTI and PAC support
Committed the LLD work. Modulo bugs this should now be done.
[LLVM-542] LLVM/GCC code size investigation
- Revisited my Zephyr build with clang patches and updated it so that it
works with trunk.
- Work out next steps of work.
- Work out how to build an embedded gcc toolchain using the linaro infrastructure.
[Misc]
Reported a bug in gold where it would generate v4t veneers for v8-a CPUs
Still waiting for TK-1 board to finish building clang so that it can
run the testsuite. Hopefully finished over the weekend.
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ Got the VFP decodetree conversion patchset out for review
(42 patches, 8 files changed, 3024 insertions(+), 1476 deletions(-))
+ sent a patchset which does the (easy) first step in my plan
for converting QEMU's documentation to Sphinx; sadly all the other
steps are much trickier...
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions.
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
- noinit attribute: started work
* Infra
- Fixed Dejagnu configuration issue in ABE which prevented us from
using target variant specifications
- Investigated problems with ABE and failure to cross-build "recent"
glibc (trouble with C++ compiler detection). ABE patches under review.
- Reduced load on APM machines to avoid depending on them too much
== Next ==
GCC:
- handle feedback on FDPIC patches
- noinit attribute
- UBSAN/bare-metal: do more testing
Hi,
Food for thought for today's sync up. I've been writing QEMU plugins to
exercise the plugin system and see what sort of useful information you
can extract when you can control the instruction stream.
For example, I now have a plugin that can break down instruction counts
for any given run, such as a kernel boot:
Instruction Classes:
Class: UDEF not counted
Class: SVE (68 hits)
Class: Reserved (0 hits)
Class: PCrel addr (4589078 hits)
Class: Add/Sub (imm,tags) (0 hits)
Class: Add/Sub (imm) (26832113 hits)
Class: Logical (imm) (74304974 hits)
Class: Move Wide (imm) (10933759 hits)
Class: Bitfield (71470957 hits)
Class: Extract (85655 hits)
Class: Data Proc Imm (0 hits)
Class: Cond Branch (imm) (37227632 hits)
Class: Exception Gen (6 hits)
Class: NOP not counted
Class: Hints (244825554 hits)
Class: Barriers (1668558 hits)
Class: PSTATE (202144 hits)
Class: System Insn (7132992 hits)
Class: System Reg (2268308 hits)
Class: Branch (reg) (6280976 hits)
Class: Branch (imm) (18347905 hits)
Class: Cmp & Branch (180167025 hits)
Class: Tst & Branch (4092972 hits)
Class: Branches (0 hits)
Class: AdvSimd ldstmult (0 hits)
Class: AdvSimd ldstmult++ (0 hits)
Class: AdvSimd ldst (0 hits)
Class: AdvSimd ldst++ (0 hits)
Class: ldst excl (160861365 hits)
Class: Prefetch (0 hits)
Class: Load Reg (lit) (12828544 hits)
Class: ldst noalloc pair (0 hits)
Class: ldst pair (60381349 hits)
Class: ldst reg (0 hits)
Class: Atomic ldst (0 hits)
Class: ldst reg (reg off) (0 hits)
Class: ldst reg (pac) (0 hits)
Class: ldst reg (imm) (119597941 hits)
Class: Loads & Stores (0 hits)
Class: Data Proc Reg (113586343 hits)
Class: Scalar FP (0 hits)
Class: Unclassified (0 hits)
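A breakdown like the one above is easy to post-process. The sketch below is not part of the plugin; it assumes only the "Class: name (N hits)" line format shown in the listing, skips the "not counted" lines, and ranks classes by their share of the counted instructions. The function names are made up for the example.

```python
import re

# Matches e.g. "Class: Add/Sub (imm) (26832113 hits)"; the lazy name group
# lets class names contain their own parentheses.
HITS = re.compile(r"^Class:\s+(.*?)\s+\((\d+) hits\)\s*$")


def parse_classes(text):
    """Map class name -> hit count; lines without a hit count are skipped."""
    counts = {}
    for line in text.splitlines():
        match = HITS.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def top_classes(counts, n=3):
    """Return the n busiest classes as (name, hits, share of total)."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, hits, hits / total) for name, hits in ranked[:n]]


if __name__ == "__main__":
    sample = """\
Class: UDEF not counted
Class: SVE (68 hits)
Class: Add/Sub (imm) (26832113 hits)
Class: Hints (244825554 hits)
"""
    for name, hits, share in top_classes(parse_classes(sample)):
        print(f"{name:20} {hits:12d} {share:6.1%}")
```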
You can break down each class to individual instructions. For example
the Hints are mostly:
Individual Instructions:
Instr: wfe (132400072 hits) (op=0xd503205f/ Hints)
Instr: sevl (66433640 hits) (op=0xd50320bf/ Hints)
Instr: yield (29619246 hits) (op=0xd503203f/ Hints)
Instr: wfi (2865 hits) (op=0xd503207f/ Hints)
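The four opcodes above differ only in bits [11:5], which is where the A64 HINT encoding keeps its 7-bit immediate. As a small worked example (not taken from the plugin source, and with made-up helper names), the decode can be sketched as:

```python
# A64 HINT: bits [31:12] are 0xd5032, bits [4:0] are all ones, and
# bits [11:5] hold the 7-bit hint number (CRm:op2).
HINT_MASK = 0xFFFFF01F
HINT_BASE = 0xD503201F

HINT_NAMES = {0: "nop", 1: "yield", 2: "wfe", 3: "wfi", 4: "sev", 5: "sevl"}


def decode_hint(op):
    """Return the hint mnemonic for a 32-bit opcode, or None if not a HINT."""
    if (op & HINT_MASK) != HINT_BASE:
        return None
    imm = (op >> 5) & 0x7F
    return HINT_NAMES.get(imm, f"hint #{imm}")


if __name__ == "__main__":
    for op in (0xD503205F, 0xD50320BF, 0xD503203F, 0xD503207F):
        print(f"{op:#010x} -> {decode_hint(op)}")
```

Hint numbers that are not in the small table fall back to the generic "hint #imm" spelling, so newer hint-space instructions are still reported rather than dropped.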
So I'm looking for a similar experiment that would be useful for the
memory sub-system. When I chatted to Maxim we thought maybe a simplified
cache line simulator might be useful. The aim wouldn't be to simulate
what a real cache might do but to be useful say for identifying regions
of code which might be susceptible to cache line bouncing. So as
compiler writers what sort of run time memory behaviour would you like
to track? What sort of information would be useful to extract with such
a tool?
I'm open to ideas ;-)
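To make the cache line idea a bit more concrete, here is a deliberately naive sketch of what such a tool could compute. It is not a QEMU plugin and does not use the plugin API: it assumes a trace of (cpu, address, is_write, pc) tuples has already been collected, assumes 64-byte lines, and only counts writes that take a line away from a different CPU. All names are hypothetical.

```python
from collections import Counter

LINE_SIZE = 64  # bytes per cache line (assumed)


def count_bounces(accesses, line_size=LINE_SIZE):
    """Count, per PC, the writes that steal a line from another CPU.

    accesses: iterable of (cpu, address, is_write, pc) tuples in program
    order. Reads are ignored, which keeps the model simple but also means
    read sharing is not tracked at all.
    """
    last_writer = {}     # line number -> CPU that last wrote the line
    bounces = Counter()  # pc -> number of ownership changes it caused
    for cpu, address, is_write, pc in accesses:
        if not is_write:
            continue
        line = address // line_size
        previous = last_writer.get(line)
        if previous is not None and previous != cpu:
            bounces[pc] += 1
        last_writer[line] = cpu
    return bounces


if __name__ == "__main__":
    trace = [
        (0, 0x1000, True, 0x400),  # CPU 0 writes line 0x40
        (1, 0x1008, True, 0x500),  # CPU 1 writes the same line
        (0, 0x1010, True, 0x400),  # CPU 0 takes it back
        (0, 0x2000, True, 0x600),  # a different line, no contention
    ]
    for pc, n in count_bounces(trace).most_common():
        print(f"pc {pc:#x}: {n} bounce(s)")
```

Reporting by PC rather than by address is one possible choice; it points at the code responsible, which is probably what a compiler writer wants to see first.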
--
Alex Bennée
Four day week.
[VIRT-327 # Richard's upstream QEMU work ]
Reviewed s390 fp vector patch set.
Posted v16 rx. This seemed so close to being ready
last week, but now I don't know. I think I should
quit pushing it myself and let Yoshinori do more of
the lifting here.
Reviewed avr v20 patch set.
Reviewed Alex's testing patch set.
Submitted patches to constify upstream capstone
(500k from .data to .rodata).
r~
Short week (off Thursday/Friday)
== Progress ==
* FDPIC
- GCC: No progress, still waiting for feedback
* GCC upstream validation:
- reported a few regressions / fixed some testcases
* GCC:
- UBSAN/bare-metal: added sync primitives implementation for low-end
cores (eg cortex-m0) Seems OK
* Infra
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: do more testing
== This Week ==
* PR88837 (7/10)
- Addressed all upstream suggestions.
- Found a (hackish) way to test patch with qemu (multiple issues).
- Sorting thru "strange" testsuite fallout most of which seems
unrelated to my patch.
* PR88833 (2/10)
- Looking at fwprop pass
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks.
(Short week, three days.)
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ continuing with the conversion of the VFP decoder to 'decodetree'.
With some useful advice from RTH I have now got a big chunk of
it done, and it looks like this will provide:
- better places to put "UNDEF if CPU doesn't have double support" checks
- checks of "VFP enabled?" only after all UNDEF checks have happened
- cleanup of a lot of code that uses some TCG globals cpu_F0 and cpu_F1,
which is weird ancient style and overdue for a cleanup
- a VFP decoder which isn't a single thousand-line function with multiple
nested switch statements
thanks
-- PMM
3 day week.
[LLVM-122] BTI and PAC support in lld, llvm-readobj, llvm-objdump
- Now in upstream review. Most of the week spent writing and updating tests.
Some time spent reviewing asm goto patches.
Planned absences:
Holiday Friday 31st May.
== Progress ==
* Out of office 22 & 24 May
* [GlobalISel] Refactor CallLowering [LLVM-568]
- In progress, likely going to take a while
- Found a minor bug in the lowering for AArch64 (I can get it to
crash on some edge case), not sure if it's worth fixing independently
since it gets fixed anyway with the refactoring that I have in
progress
- Trying to understand an AMDGPU failure
== Plan ==
* Out of office 29 May - 10 June
* More of the same
o LLVM
* 8.0.1-rc1 ARM and AArch64 Binaries uploaded.
* Buildbots: One fix committed upstream.
* Machine outliner:
- Fixed liveness issue.
- Preparing patch for re-submission
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Merged!
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Started dusting off and rebasing wip.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Started reviewing the kernel patch set for this feature.
[VIRT-327 # Richard's upstream QEMU work ]
PR for tcg gvec work.
PR for Sato-san's RX target.
Patch set to update capstone and enable s390x.
GSOC: Review v3 of Jan's enable risu for x86 patch set.
[Other]
Travel arrangements for Xilinx meeting in San Jose, June 13.
Will need to pick Peter's brain re m-profile before then.
r~
== This Week ==
* PR88833 (4/10)
- Started investigating the issue; it seems that one of the code-movement
RTL passes, like cse2,
does not remove identical register copies, resulting in an extra register move.
* PR88837 (5/10)
- Patch almost approved offline by Richard, who suggested I move the
discussion upstream.
- Observed a "strange" issue with return value vectors on qemu for
run-time tests for fixed-length vectors. It turned out to be due to a
mismatch in vector length between compile time and run time ;-)
- Trying to run SVE tests with qemu.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
[LLVM-122] BTI and PAC support in LLD
Implementation now working, have written BTI tests, next step is
finishing off PAC tests.
[Misc]
Helped out debugging an asm-goto problem on ARM targets.
Investigated a GNU ld LMA overlap when VMA and LMA got out of sync.
Helped out with CMSIS use of ld scripts when using a fast-model,
needed to get LMA == VMA for program header covering BSS.
QEMU Tooling ([VIRT-252])
=========================
QEMU plugin support ([VIRT-280])
- synced up with Emilio, will take over branch and submit
- latest branch is [plugins/plugins-v3]
- will peel off simple clean-ups and tweaks next week
- then need to split up some more and better separate code
- exposed plugin_disas for the "howvec" instruction counter
- some [example] [output] while booting kernel
[VIRT-280] https://projects.linaro.org/browse/VIRT-280
[plugins/plugins-v3]
https://github.com/stsquad/qemu/tree/plugins/plugins-v3
[example] http://ix.io/1JXC
[output] http://ix.io/1JXl
GSoC Mentoring ([VIRT-348])
- planning for start of coding next week
[VIRT-348] https://projects.linaro.org/browse/VIRT-384
Upstream Work ([VIRT-109])
==========================
- prepared [testing/next] for PR
[VIRT-109] https://projects.linaro.org/browse/VIRT-109
[testing/next] https://github.com/stsquad/qemu/tree/testing/next
Completed Reviews [3/3]
=======================
{RISU v3 00/11} Support for i386/x86_64 with vector extensions
Message-Id: <20190523204409.21068-1-jan.bobek(a)gmail.com>
{PATCH v10 00/20} gdbstub: Refactor command packets handler
Message-Id: <20190521095948.8204-1-arilou(a)gmail.com>
- CLOSING NOTE [2019-05-24 Fri 17:30]
awaiting re-spin with tags applied.
{RFC v2 00/38} Plugin support
Message-Id: <20181209193749.12277-1-cota(a)braap.org>
- CLOSING NOTE [2019-05-24 Fri 17:47]
taking over the tree
Absences
========
- May 27th is a Bank Holiday
- May 31st working on train in the afternoon
Current Review Queue
====================
* {PATCH 0/5} tests/vm: Python 3, improve image caching, and misc
Message-Id: <20190329210804.22121-1-wainersm(a)redhat.com>
* {Qemu-devel} {PATCH for-4.1 v2 00/36} tcg: Move the softmmu tlb to CPUNegativeOffsetState
Message-Id: <20190328230404.12909-1-richard.henderson(a)linaro.org>
* {Qemu-devel} {RFC v4 0/7} Baby steps towards saner headers
Message-Id: <20190523081538.2291-1-armbru(a)redhat.com>
* {Qemu-arm} {PATCH v2 0/4} hw/arm/boot: handle large Images more gracefully
Message-Id: <20190516144733.32399-1-peter.maydell(a)linaro.org>
* {Qemu-devel} {PATCH v12 00/12} Add RX archtecture support
Message-Id: <20190514061458.125225-1-ysato(a)users.sourceforge.jp>
* {PATCH 00/13} target/arm/kvm: enable SVE in guests
Message-Id: <20190512083624.8916-1-drjones(a)redhat.com>
--
Alex Bennée
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ sent patchset fixing a handful of simple GICv3 bugs
+ usual codereview work
+ sent out a sketch of how we can transition our documentation
from the current texinfo manual to a set of sphinx manuals
+ had a look at the practicalities of converting our hand-written
VFP decoder to 'decodetree' -- this may be the easiest way to
support FPU configs which only support single-precision, like Cortex-M33
thanks
-- PMM
== Progress ==
* FDPIC
- GCC: Updated patch 03/21 with changes in the handling of -static
according to feedback. Pinged the whole series.
* GCC upstream validation:
- reported a couple of regressions
* Infra
- [stalled] working on adding binutils regression testing to round-robin jobs
- cleanup
- handling some problems with boards upgrades and crashes
== Next ==
FDPIC:
- GCC: handle feedback
UBSAN/bare-metal: look at how to make it easier to use on CPUs that
lack sync primitives (eg cortex-m0)