Hi Laszlo,
On 5 February 2016 at 17:19, Laszlo Ersek <lersek@redhat.com> wrote:
On 02/05/16 17:35, Ryan Harkin wrote:
Hello all,
I'm having a problem that is platform specific, but perhaps more of a generic problem.
When ARM's Juno board boots, not all devices are connected. The first boot creates the boot variables and sets their order, meaning that we get the following list on the first attempt:
EFI Misc Device
EFI Misc Device 1
EFI Internal Shell
Intel BDS then attempts to boot from one of the devices and ends up in Shell. After exiting Shell, the Intel BDS console GUI comes up. Selecting the Boot Manager option shows more devices being connected and the list becomes longer:
EFI Misc Device
EFI Misc Device 1
EFI Internal Shell
EFI Hard Drive
EFI Network
Subsequent boots will never attempt to boot from Hard Drive or Network because Shell will always succeed. That is not good.
Leif has a patch in his working tree that solves this problem [1] by making the platform call BdsLibConnectAll() at init time. So now the boot order created on first boot looks sane:
EFI Misc Device
EFI Misc Device 1
EFI Hard Drive
EFI Network
EFI Internal Shell
However, when the board is booting, the "EFI Network" option fails to boot the first time, so the board drops back to Shell again:
Warning: LAN9118 Driver in stopped state
Link timeout in auto-negotiation.
Lan9118: Auto Negociation not supported.
EhcExecTransfer: transfer failed with 2
EhcControlTransfer: error - Device Error, transfer - 2
Buffer: EFI Hard Drive
Booting EFI Misc Device
Booting EFI Misc Device 1
Booting EFI Hard Drive
Booting EFI Network
Warning: LAN9118 Driver not initialized
Link timeout in auto-negotiation.
Lan9118: Auto Negociation not supported.
Booting EFI Internal Shell
Exiting Shell drops the user back to the Intel BDS UI. Selecting "Continue" then succeeds in booting from the EFI Network:
Booting EFI Misc Device
Booting EFI Misc Device 1
Booting EFI Hard Drive
Booting EFI Network
..MnpFreeTxBuf: Duplicated recycle report from SNP.
MnpFreeTxBuf: Duplicated recycle report from SNP.
[snip repeated SNP errors]
If I duplicate the call to BdsLibConnectAll() [2], then boot works as expected. On first boot, the boot order is created correctly and EFI Network pulls down a file and boots it.
I'm assuming that the 2nd call is connecting things that didn't connect the first time. And from that, I suspect/guess that perhaps they didn't connect due to either ordering or timing.
Is there a recommended way to set the order things are connected? Is it even possible to specify dependencies or order? And if so, how do we work out what the order should be?
I cannot give a coherent answer, just a few thoughts.
(1) I think BdsLibConnectAll() actually succeeds the first time as well. All devices are enumerated, all drivers are connected, aren't they? The boot order is a separate question.
Yes, you're right, they are all connected because they all appear in the boot list.
(2) The network, the NIC, or the NIC driver are more probable suspects. If I see right, you always have a misc / misc1 / hd / network sequence of attempts, it's just that on the first few occasions, the network fails. ("Link timeout in auto-negotiation".)
Correct.
(3) I think repeated BdsLibConnectAll() calls may only give more time to the NIC to bring itself into working shape. What if you keep only one BdsLibConnectAll(), and replace the second BdsLibConnectAll() with a sizeable gBS->Stall()?
Eureka! I replaced the 2nd BdsLibConnectAll() with "gBS->Stall(500000);" (0.5 seconds) and this also works every time.
So time to negociate (sic) would seem like the culprit. I suppose a 2nd BdsLibConnectAll() buys the NIC some time.
I'm left wondering if the "Boot EFI Network" option should actually be waiting for negotiation, however. I'm sure it's common on first boot that the network needs a little time to negotiate. I'll look into that. Perhaps there is a setting or an override to tell it to be patient?
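One way to picture "waiting for negotiation" instead of a fixed 0.5 second stall is a bounded polling loop: check the link, stall briefly, and give up after a timeout. The following is only a minimal, self-contained simulation of that idea, not LAN9118 driver code. SimNic, sim_stall(), sim_link_up() and wait_for_link() are hypothetical names; sim_stall() merely stands in for gBS->Stall(), and the link-ready time is an assumed value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the NIC: the link counts as "up" once the
   simulated clock reaches link_ready_us. */
typedef struct {
  uint64_t now_us;         /* simulated clock, in microseconds */
  uint64_t link_ready_us;  /* assumed time at which negotiation completes */
} SimNic;

/* Stand-in for gBS->Stall(): only advances the simulated clock. */
static void sim_stall(SimNic *nic, uint64_t microseconds) {
  nic->now_us += microseconds;
}

static bool sim_link_up(const SimNic *nic) {
  return nic->now_us >= nic->link_ready_us;
}

/* Poll the link every poll_us microseconds until it is up or until
   timeout_us microseconds have been spent waiting.
   Returns true if the link came up in time. */
static bool wait_for_link(SimNic *nic, uint64_t timeout_us, uint64_t poll_us) {
  uint64_t waited = 0;

  if (poll_us == 0) {
    return sim_link_up(nic);  /* avoid spinning forever without progress */
  }
  while (!sim_link_up(nic)) {
    if (waited >= timeout_us) {
      return false;           /* give up: negotiation took too long */
    }
    sim_stall(nic, poll_us);
    waited += poll_us;
  }
  return true;
}
```

Compared with a fixed stall, a loop like this returns as soon as the link is ready and still bounds the worst-case delay. Whether the real network stack or driver offers such a wait is a separate question that this sketch does not answer.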
(4) What the boot order should be can be influenced by the platform BDS lib, in the PlatformBdsPolicyBehavior() function.
Namely, the BdsEntry() function in "IntelFrameworkModulePkg/Universal/BdsDxe/BdsEntry.c" initializes the "BootOptionList" variable to an empty list. Then it calls PlatformBdsPolicyBehavior(), which takes "BootOptionList" as an input/output parameter -- if it wishes, it can populate it.
In ArmVirtPkg and in OvmfPkg, we perform the following steps in PlatformBdsPolicyBehavior():
(a) connect the console(s)
(b) BdsLibConnectAll()
(c) BdsLibEnumerateAllBootOption (BootOptionList) -- this relies on the presence of all devices, from the previous step. This function (in "IntelFrameworkModulePkg/Library/GenericBdsLib/BdsBoot.c") has extensive documentation in its leading comment.
It will enumerate everything sensible (modifying BootOrder as well I think), and output a BootOptionList that contains all the possible boot options, in a sane order. Sanity means, if I remember correctly, that all options that existed previously and were referenced by BootOrder, retain their positions at the front of the list, and any new auto-detected boot options are tacked to the end.
(d) SetBootOrderFromQemu (BootOptionList) -- this is the really platform specific part for massaging the boot order. We read through BootOptionList -- we don't modify it --, do various calculations, and then rewrite the BootOrder variable. Importantly, all Boot#### variables that become *unreferenced* by BootOrder as a result of this, must be deleted (otherwise they constitute a leak). Again, BootOptionList is not modified.
(e) BdsLibBuildOptionFromVar (BootOptionList, L"BootOrder") -- it rebuilds BootOptionList from the new BootOrder contents. (We are again in PlatformBdsPolicyBehavior(), where BootOptionList counts as input/output.)
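The bookkeeping in step (d), finding the Boot#### variables that the rewritten BootOrder no longer references, can be sketched in plain C. This is a minimal illustration on arrays of option numbers, not the code from SetBootOrderFromQemu(); order_contains() and find_unreferenced() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if value occurs in order[0..count-1]. */
static bool order_contains(const uint16_t *order, size_t count, uint16_t value) {
  for (size_t i = 0; i < count; i++) {
    if (order[i] == value) {
      return true;
    }
  }
  return false;
}

/* Collects every option number that appears in old_order but not in
   new_order, preserving old_order's sequence. These are the Boot####
   variables that would become unreferenced and should be deleted.
   stale must have room for old_count entries. Returns the number of
   entries written to stale. */
static size_t find_unreferenced(const uint16_t *old_order, size_t old_count,
                                const uint16_t *new_order, size_t new_count,
                                uint16_t *stale) {
  size_t stale_count = 0;

  for (size_t i = 0; i < old_count; i++) {
    if (!order_contains(new_order, new_count, old_order[i])) {
      stale[stale_count++] = old_order[i];
    }
  }
  return stale_count;
}
```

For example, with an old BootOrder of 0000, 0001, 0002, 0003 and a new BootOrder of 0002, 0000, the stale set is 0001 and 0003. In firmware, each stale Boot#### variable would then be deleted, which for UEFI variables means calling SetVariable() with a data size of zero.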
On a physical platform, I think you just go with (b) and (c), and then let the user customize the boot order. Next time you boot, (c) will respect that.
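The ordering property described for step (c), where options already in BootOrder keep their positions and newly detected ones are appended, can be modelled as a simple merge. This sketch models only that property as recalled above; it is not the BdsLibEnumerateAllBootOption() implementation, and already_listed() and merge_boot_order() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if value occurs in list[0..count-1]. */
static bool already_listed(const uint16_t *list, size_t count, uint16_t value) {
  for (size_t i = 0; i < count; i++) {
    if (list[i] == value) {
      return true;
    }
  }
  return false;
}

/* Builds the merged order in out: first every entry of existing, in its
   original sequence, then every detected entry that is not yet present.
   out must have room for existing_count + detected_count entries.
   Returns the number of entries written to out. */
static size_t merge_boot_order(const uint16_t *existing, size_t existing_count,
                               const uint16_t *detected, size_t detected_count,
                               uint16_t *out) {
  size_t out_count = 0;

  /* Entries the user (or an earlier boot) already ordered stay in front. */
  for (size_t i = 0; i < existing_count; i++) {
    out[out_count++] = existing[i];
  }
  /* Newly detected options are tacked on to the end. */
  for (size_t i = 0; i < detected_count; i++) {
    if (!already_listed(out, out_count, detected[i])) {
      out[out_count++] = detected[i];
    }
  }
  return out_count;
}
```

With an existing order of 0003, 0000 and detected options 0000, 0001, 0003, 0004, the merged order is 0003, 0000, 0001, 0004, so a user's customization survives the next boot.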
Excellent answer, thanks. It looks like (c) is exactly the thing I'm looking for. For example, make HDD boot before USB. That sort of thing.
I'm quite happy that once the default boot order has been set that it stays that way unless the user changes it. I don't (think I) want to customise the boot order after the initial boot.
There are further possibilities; there is a "boot mode" HOB with which your low-level platform code can control your BDS policy, in order to speed up things. See BdsLibGetBootMode() and the macros in "MdePkg/Include/Pi/PiBootMode.h". Those macros are documented in one of the PI spec volumes.
For example, I think BOOT_ASSUMING_NO_CONFIGURATION_CHANGES is meant to be very fast (no need to connect all devices to all drivers), but such a HOB must be produced by your own PEI phase somehow -- you must know for example that the chassis was never opened while the machine was off.
FWIW, OVMF only uses BOOT_WITH_FULL_CONFIGURATION, and BOOT_ON_S3_RESUME, and these two are differentiated in OVMF's PEI phase by reading a CMOS register.
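As a rough illustration of how a BDS policy might branch on the boot mode, here is a minimal standalone sketch. The two macro values are reproduced from "MdePkg/Include/Pi/PiBootMode.h" as best recalled and should be checked against your tree. BootMode and should_connect_all() are hypothetical names, and the policy itself is an assumption for illustration, not what any existing platform does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as recalled from MdePkg/Include/Pi/PiBootMode.h; verify them
   against the header before relying on them. */
#define BOOT_WITH_FULL_CONFIGURATION            0x00
#define BOOT_ASSUMING_NO_CONFIGURATION_CHANGES  0x02

/* Stand-in for EFI_BOOT_MODE, which is a 32-bit unsigned type. */
typedef uint32_t BootMode;

/* Hypothetical policy: skip the expensive "connect everything" step only
   when the PEI phase has promised that nothing changed since last boot. */
static bool should_connect_all(BootMode mode) {
  switch (mode) {
    case BOOT_ASSUMING_NO_CONFIGURATION_CHANGES:
      return false;  /* fast path: reuse the stored boot options */
    case BOOT_WITH_FULL_CONFIGURATION:
    default:
      return true;   /* safe default: enumerate and connect everything */
  }
}
```

In a real platform BDS, the mode would come from BdsLibGetBootMode() and a true result would lead to the BdsLibConnectAll() call.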
Anyway, I think what you need is:
- call BdsLibConnectAll() exactly once
- give that NIC more time (?)
- if you'd like to regenerate, *at the end* of BootOrder, all possible boot options that the user may have deleted (or that have become available through newly installed hardware), call BdsLibEnumerateAllBootOption() too.
Yes, that sounds about right. I have concerns about the negotiation timing, but the boot order hacking sounds like what I'm looking for.
Thanks again, Ryan.
Laszlo
Regards, Ryan.
[1] https://git.linaro.org/uefi/linaro-edk2.git/commitdiff/bfbd0ef1a182e1baa120f...
[2] https://git.linaro.org/landing-teams/working/arm/edk2.git/commitdiff/25320ba...