Hi -
With this patch, we've been seeing a small number of machines in our fleet boot up but fail to register a SCSI device:
[ 6.290992] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
It usually goes away upon another reboot. I don't have a reliable reproducer except for rebooting some servers repeatedly on 6.1.132.
I added a couple of printks around the various cases where scsi_alloc_sdev fails, as there are 3 allocation sites, and also pulled in f7d77dfc91 ("mm/percpu.c: print error message too if atomic alloc failed"). That isolated it to a failed percpu allocation:
[ 5.431189] percpu: allocation failed, size=4 align=4 atomic=1, atomic alloc failed, no space left
[ 5.440383] sbitmap_init_node: init_alloc_hint failed.
[ 5.440383] scsi_realloc_sdev_budget_map: sbitmap_init_node failed with -12
Which kind of makes sense, as __alloc_percpu_gfp says:
If @gfp doesn't contain %GFP_KERNEL, the allocation doesn't block and can be called from any context but is a lot more likely to fail.
Reverting this patch in our environment made the initial SCSI scan reliably work, and we no longer see issues with the SCSI drive disappearing.