On Thu, May 5, 2016 at 12:45 PM, Marcin Juszkiewicz marcin.juszkiewicz@linaro.org wrote:
Recently my angry post on Google+ [1] got so many comments that it was clear it would be better to move the discussion to a mailing list.
As it is about boot loaders, and Linaro has engineers from most of the SoC vendor companies, I thought this list would be the best one.
It all started when I got a Pine64 board (based on the Allwinner A64 SoC) and hit the same issue as on several boards in the past: the boot loader is written to some random place on the SD card.
The days when people used Texas Instruments SoCs were great: the in-CPU boot loader knew how to read the MBR partition table and was able to load the 1st stage boot loader (called MLO) from it, as long as it was on a FAT filesystem.
The GPU used by the Raspberry Pi is able to read the MBR, find the 1st partition and read firmware files from there, as long as it is FAT.
Chromebooks have some SPI flash to keep boot loaders in and use GPT partitioning to find where to load the kernel (or another boot loader) from.
And then we have all those boards where vendors decided that SPI flash for boot loader is too expensive so it will be read from SD card instead. From any random place of course...
Then we have distributions. Instead of generating a bunch of images per board, they want to make one clean image which is able to handle as many boards as possible.
If there are UEFI machines on a list of supported ones then GPT partitioning will be used, boot loader will be stored in "EFI system area" and it boots. This is how AArch64 SBSA/SBBR machines work.
But there are also all those U-Boot (or fastboot/redboot/whateverboot) ones. They are usually handled by taking image from previous stage and adding boot loader(s) by separate script. And this is where "fun" starts...
GPT takes the first 17KB of the storage media, as it allows information about 128 partitions to be stored. Sure, no one is using that many on such devices, but the space is still reserved.
But most chips expect the boot loader(s) to be stored:
- right after MBR
- from 1KB
- from 8KB
- any other random place
So scripts start to be sets of magic written to handle all those SoCs...
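To make the collision concrete, here is a minimal sketch of the arithmetic, assuming 512-byte sectors and the default GPT layout (protective MBR at LBA0, header at LBA1, 128 entries of 128 bytes each). The function names and the OFFSETS table are made up for illustration; they are not taken from any existing tool.

```python
SECTOR = 512          # bytes per logical block (assumed)
GPT_ENTRIES = 128     # default number of partition entries
GPT_ENTRY_SIZE = 128  # bytes per partition entry


def gpt_reserved_bytes(sector=SECTOR):
    """Bytes used at the start of the media by a default GPT layout."""
    # Sectors needed for the partition entry array, rounded up.
    entry_sectors = (GPT_ENTRIES * GPT_ENTRY_SIZE + sector - 1) // sector
    # One sector for the protective MBR, one for the GPT header.
    return (1 + 1 + entry_sectors) * sector


def collides_with_gpt(loader_offset, sector=SECTOR):
    """True if a boot loader stored at loader_offset lands inside that area."""
    return loader_offset < gpt_reserved_bytes(sector)


# Boot loader locations from the list above (illustrative only).
OFFSETS = {"right after MBR": 512, "from 1KB": 1024, "from 8KB": 8192}

if __name__ == "__main__":
    print("GPT reserves", gpt_reserved_bytes(), "bytes")
    for name, offset in OFFSETS.items():
        print(name, "collides" if collides_with_gpt(offset) else "is clear")
```

With these assumptions the reserved area is 34 sectors, i.e. 17408 bytes, so every fixed offset in the list above falls inside it.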
The usual solution for existing SoCs is to add 1MB of SPI flash during the design phase of the device and store the boot loader(s) there. "But it is so expensive", someone would say, when it is in the 10-30 cent range...
To try and summarize, what you're asking for is to define the usage model for eMMC/SD when both the firmware* and OS are stored on the same media. Some argue that these things should always be on separate devices, but while the debate is interesting, it doesn't match the reality of how hardware is being built. In which case, the derived requirements are:
1) Co-exist with MBR partitioning
2) Co-exist with GPT partitioning
3) Be detectable --- partitioning tools must respect it
4) Be speced. Write it down so that tool and SoC developers can see it as a requirement
5) Be usable regardless of firmware type (UEFI, U-Boot, Little Kernel, etc)
6) Support some form of firmware non-volatile storage (variable storage)
It would be really nice if we could also have:
7) Support SoCs that hardcode boot code to specific locations (after-MBR, 1K, 8K, random)
   - May not be able to support all variants, but it is a worthy design goal.
Agreed?
* I'm ignoring eMMC's separate boot area because that solution has firmware and OS logically separated. Strong recommendation is for SoCs to boot from the boot area. Then normal GPT/MBR partitioning works just fine. The rest of this discussion only applies if the SoC cannot do that.
(For the following discussion, I refer to the UEFI spec because that is where GPT is defined, but the expectation is that anything described here can equally be used by non-UEFI platforms)
I've just read through the UEFI GPT spec, and here are the constraints:
- MBR must be at the start of LBA0 (0 - 0.5k)
- Primary GPT must be at the start of LBA1 (0.5k to 4k, but may collide with fw)
- It /seems/ like the GPT Header and GPT table can be separated by some blocks. The GPT header has a PartitionEntryLBA field which describes where the actual table of partitions starts.
- GPTHeader is only 92 bytes.
- It should be possible to have: GPTHeader @ start of LBA1 and GPTPartitionTable @ an LBA that doesn't conflict with firmware.
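As a rough illustration of the PartitionEntryLBA point, here is a sketch that decodes the 92-byte GPT header with Python's struct module and reports where the partition table starts. The field order follows the header layout in the UEFI spec as I read it; the names GptHeader and parse_gpt_header are made up for this example, and CRC checking is left out for brevity.

```python
import struct
from collections import namedtuple

# Little-endian GPT header fields: 8+4+4+4+4+8+8+8+8+16+8+4+4+4 = 92 bytes.
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")

GptHeader = namedtuple(
    "GptHeader",
    "signature revision header_size header_crc32 reserved "
    "my_lba alternate_lba first_usable_lba last_usable_lba "
    "disk_guid partition_entry_lba num_entries entry_size entries_crc32",
)


def parse_gpt_header(block):
    """Decode the GPT header found at the start of a block (normally LBA1)."""
    hdr = GptHeader._make(GPT_HEADER.unpack_from(block))
    if hdr.signature != b"EFI PART":
        raise ValueError("not a GPT header")
    return hdr


def partition_table_lba(block):
    """LBA at which the partition entry array starts, per the header."""
    return parse_gpt_header(block).partition_entry_lba
```

A tool that honours this field, rather than assuming the table always starts at LBA2, would find a table that has been moved out of the way of the firmware.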
I think we have everything we need to work around the location of the FW boot image without breaking the UEFI spec. The biggest problem is making sure partitioning tools don't go stomping over required firmware data and rendering systems unbootable. I *think* we can solve that problem by extending the MBR definition to block out a required region and then work around that. Tools can generically see the special region in the MBR and work around it accordingly.
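One way the "special region in the MBR" could look is an ordinary MBR partition entry with a dedicated type code that tools learn to treat as off limits. The sketch below is only an illustration of that idea: FW_TYPE is a placeholder value I picked for the example and is not assigned by any spec, the helper names are hypothetical, and the CHS fields are simply zeroed.

```python
import struct

# MBR partition entry: status, CHS start, type, CHS end, first LBA, sector count.
MBR_ENTRY = struct.Struct("<B3sB3sII")
MBR_TABLE_OFFSET = 446   # partition table offset inside the 512-byte MBR
GPT_PROTECTIVE = 0xEE    # protective-MBR type used with GPT
FW_TYPE = 0xF8           # HYPOTHETICAL firmware-region type, not in any spec


def build_mbr(entries):
    """Build an MBR from up to four (type, first_lba, sector_count) tuples."""
    if len(entries) > 4:
        raise ValueError("an MBR holds at most four entries")
    mbr = bytearray(512)
    for i, (ptype, first_lba, count) in enumerate(entries):
        MBR_ENTRY.pack_into(mbr, MBR_TABLE_OFFSET + 16 * i,
                            0, bytes(3), ptype, bytes(3), first_lba, count)
    mbr[510:512] = b"\x55\xaa"  # boot signature
    return bytes(mbr)


def firmware_regions(mbr):
    """Return (first_lba, sector_count) for every firmware-region entry."""
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR signature")
    regions = []
    for i in range(4):
        _, _, ptype, _, first_lba, count = MBR_ENTRY.unpack_from(
            mbr, MBR_TABLE_OFFSET + 16 * i)
        if ptype == FW_TYPE:
            regions.append((first_lba, count))
    return regions
```

A partitioning tool would call something like firmware_regions() before writing anything and keep its own structures out of the returned ranges.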
So, let me try to itemize the use cases:
a. SoC boots from address immediately after MBR.
   - We're hooped.
   - Completely incompatible with current GPT spec.
   - Options:
     - Use MBR partitioning
     - move GPT to LBA > 1
       - this creates a new question of how does the OS find the GPT?
   - I would prefer to treat this hardware as legacy and keep GPT at LBA1
b. SoC boots from fixed address > 1k and < 4k(?)
   - Place protective MBR at LBA0
     - describe firmware block in MBR + protective GPT region
     - Modify spec to define this new format
   - Place GPT Header at LBA1
   - Place GPT Partition Table > end of firmware LBA
   - Teach tools to understand & respect protective MBR when formatting disk
c. SoC boots from special MBR partition
   - place protective MBR at LBA0 (same as b.)
     - Describe FW partition in MBR
       - Perhaps mark as movable?
       - Vendor tools must place FW partition > 4k
     - Modify spec to define this new format (same format as b.)
   - Place GPT Header and partition table normally.
     - Tools should notice FW partition isn't in the way
     - Generate the GPT in the normal way
     - Either:
       - Add FW partition to the GPT table, or
       - set FirstUsableLBA to be immediately after the FW partition (safest?)
d. SoC boots from GPT partition.
   - Add new partition type that tools can recognize and refuse to delete unless the user forces it.
   - Define the new partition type in the UEFI spec, along with the required behaviour
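For use case b, the layout a tool would need to compute can be sketched as below. This is only a sketch under stated assumptions: 512-byte sectors, 128 entries of 128 bytes, and a firmware blob described by a byte offset and size. plan_gpt_layout is a hypothetical helper, not an existing tool's API, and whether every GPT consumer accepts a relocated table still needs checking against the spec.

```python
def plan_gpt_layout(fw_offset, fw_size, sector=512,
                    num_entries=128, entry_size=128):
    """Place the GPT partition table after a fixed-location firmware blob.

    fw_offset and fw_size are in bytes. Returns the LBAs a tool would
    write into the GPT header for use case b.
    """
    if fw_offset < 2 * sector:
        # Firmware sits on top of the MBR or the GPT header at LBA1:
        # that is use case a, which this scheme cannot handle.
        raise ValueError("firmware overlaps MBR or GPT header")
    # First LBA past the end of the firmware, rounded up.
    fw_end_lba = -(-(fw_offset + fw_size) // sector)
    # Sectors needed for the partition entry array, rounded up.
    table_sectors = -(-(num_entries * entry_size) // sector)
    return {
        "gpt_header_lba": 1,
        "partition_entry_lba": fw_end_lba,
        "first_usable_lba": fw_end_lba + table_sectors,
    }


if __name__ == "__main__":
    # Example: firmware stored from 8KB, 512KB long.
    print(plan_gpt_layout(8192, 512 * 1024))
```

For firmware at 8KB that is 512KB long, this puts the partition table at LBA 1040 and the first usable LBA at 1072, leaving the header at LBA1 where existing tools expect it.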
I think that covers it, but I haven't yet talked through what to do with firmware non-volatile storage. It could be handled as another special firmware volume, or it could be treated as appended to the fw image. Need a bit more thought here.
If this looks good, then I can write up a proposed change to the spec and everyone can go look at the impact on firmware, the kernel, and tools.
g.