Re: alloc failed, but?

28 Jun 2012


      On 06/28/12 11:27, the mail apparently from Tom Gall included:
...
Hi All,
I'm stressing a system with apachebench. As one scales up work on a
system obviously there's always a point where the wheels fall off, the
engine explodes or something else exciting happens. But as Han Solo
would say ... "hold together baby....", I'd like to eke out as much as
I can. (If you're really interested, here's what I'm up to :
http://fullshovel.wordpress.com/  start with part 1)
In this case with apachebench, I'm getting the following allocation
errors in the kernel and need a little help deciphering. It sure looks
like there's plenty of space to swap out however if I have this right,
we're getting so much network traffic that the kernel gets inundated
and it OOMs in the network stack.
I did later try setting sysctl -w vm.min_free_kbytes=32768  but that
didn't really seem to help.
The much more complete dmesg dump is located at
http://people.linaro.org/~tgall/dmesg-dump.txt
...
[127100.245117] swapper/0: page allocation failure: order:3, mode:0x20
...
[127100.245666] [<80100f14>] (__alloc_pages_nodemask+0x678/0x7a4) from
[<80695270>] (kmem_getpages.isra.35+0x3c/0xc0)
[127100.245666] [<80695270>] (kmem_getpages.isra.35+0x3c/0xc0) from
[<80695380>] (cache_grow.constprop.37+0x8c/0x1fc)
[127100.245666] [<80695380>] (cache_grow.constprop.37+0x8c/0x1fc) from
[<8069570c>] (cache_alloc_refill+0x21c/0x274)
[127100.245819] [<8069570c>] (cache_alloc_refill+0x21c/0x274) from
[<80132dac>] (__kmalloc_track_caller+0xac/0x1b0)
[127100.245910] [<80132dac>] (__kmalloc_track_caller+0xac/0x1b0) from
[<8057a37c>] (__alloc_skb+0x60/0xfc)
[127100.245971] [<8057a37c>] (__alloc_skb+0x60/0xfc) from [<8057a874>]
(__netdev_alloc_skb+0x2c/0x54)
[127100.245971] [<8057a874>] (__netdev_alloc_skb+0x2c/0x54) from
[<8049dbb8>] (rx_submit+0x2c/0x1d4)
[127100.245971] [<8049dbb8>] (rx_submit+0x2c/0x1d4) from [<8049e1c0>]
(rx_complete+0x1a4/0x1b8)
[127100.245971] [<8049e1c0>] (rx_complete+0x1a4/0x1b8) from
[<804a5f38>] (usb_hcd_giveback_urb+0xb0/0xfc)
[127100.246246] [<804a5f38>] (usb_hcd_giveback_urb+0xb0/0xfc) from
[<804b887c>] (ehci_urb_done+0xb8/0xc4)
[127100.246246] [<804b887c>] (ehci_urb_done+0xb8/0xc4) from
[<804bb240>] (qh_completions+0xc8/0x49c)
Just some not directly useful extra info...
I noticed these yesterday in dmesg as well while adding the 32K
min_free_kbytes in tilt-3.4 as a hack.  It seems to be part of some
syndrome with smsc driver and network memory allocation that's in 
mainline and not Panda-specific.  Yesterday I saw in Google the same 
problems plaguing Raspberry Pi folks.
When I tried to stress the Panda a week or so ago by cloning gcc with
a plan to compile it, it in fact lost sanity during the download with
a storm of these kevent lost messages, hence the 32K hack being added.
I also remember the same problem with kevents being dropped getting
looked at about a year ago without any solid result.  It will be
interesting to see if anyone understands and can explain what the
underlying issue is.
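As a rough aid for reading the failure line quoted above (order:3,
mode:0x20), here is a minimal Python sketch that decodes it.  Two things
in it are assumptions rather than anything taken from this thread: the
4 KiB page size, and the GFP bit values, which are the low bits as I
recall them from include/linux/gfp.h in 3.x-era kernels and which differ
in other kernel versions.  The helper name decode_failure is made up for
illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical for 32-bit ARM

# Assumption: low GFP bits as in include/linux/gfp.h of 3.x-era kernels.
# Other kernel versions number these flags differently.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}
GFP_WAIT = 0x10  # without this bit the allocator may not sleep

FAILURE_RE = re.compile(
    r"page allocation failure: order:(\d+), mode:(0x[0-9a-fA-F]+)"
)


def decode_failure(line):
    """Hypothetical helper: decode one 'page allocation failure' line."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group(1))
    mode = int(m.group(2), 16)
    return {
        "order": order,
        # an order-N request needs 2**N physically contiguous pages
        "contiguous_bytes": PAGE_SIZE << order,
        "flags": [name for bit, name in sorted(GFP_BITS.items()) if mode & bit],
        "may_sleep": bool(mode & GFP_WAIT),
    }


line = "[127100.245117] swapper/0: page allocation failure: order:3, mode:0x20"
info = decode_failure(line)
print(info)
```

Under those assumptions the line decodes to a request for 8 contiguous
pages (32 KiB) with only __GFP_HIGH set and no __GFP_WAIT, i.e. an
atomic allocation that cannot sleep to reclaim or swap.  If that reading
is right, it would be consistent with the failure showing up even
though there is plenty of swap available.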
-Andy
-- 
Andy Green | TI Landing Team Leader
Linaro.org │ Open source software for ARM SoCs | Follow Linaro
http://facebook.com/pages/Linaro/155974581091106  - 
http://twitter.com/#%21/linaroorg - http://linaro.org/linaro-blog
