Hello,
Eric, sorry for the delay.
On (20/12/11 10:03), Eric Biggers wrote:
[..]
[ 1598.658233] __rwsem_down_read_failed_common+0x186/0x201
[ 1598.658235] call_rwsem_down_read_failed+0x14/0x30
[ 1598.658238] down_read+0x2e/0x45
[ 1598.658240] rmap_walk_file+0x73/0x1ce
[ 1598.658242] page_referenced+0x10d/0x154
[ 1598.658247] shrink_active_list+0x1d4/0x475
[ 1598.658250] shrink_node+0x27e/0x661
[ 1598.658254] try_to_free_pages+0x425/0x7ec
[ 1598.658258] __alloc_pages_nodemask+0x80b/0x1514
[ 1598.658279] __do_page_cache_readahead+0xd4/0x1a9
[ 1598.658282] filemap_fault+0x346/0x573
[ 1598.658287] ext4_filemap_fault+0x31/0x44
Could you provide some more information about what is causing these actual lockups for you? Are there more stack traces?
I think I have some leads and, just like you said, this does not appear to be ext4-related.
A likely root cause of the lockups I'm observing is that kswapd and virtio_balloon take these two locks in reverse order for THP pages:
	down_write(mapping->i_mmap_rwsem) --> page->PG_locked

vs

	page->PG_locked --> down_read(mapping->i_mmap_rwsem)
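
To make the inversion concrete, here is a minimal userspace sketch (not kernel code) of a lockdep-style order check. All names in it (lock_class, order_graph, record_order, demo_inversion) are hypothetical and exist only for this illustration. It also assumes, as a simplification, that down_write() and down_read() on i_mmap_rwsem count as the same lock class, since a reader can block behind a writer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two locks above. */
enum lock_class { I_MMAP_RWSEM, PG_LOCKED, NR_LOCK_CLASSES };

/* before[a][b] is true once "a held while taking b" has been observed. */
struct order_graph {
	bool before[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns true if the reverse order was already recorded,
 * i.e. the two paths can deadlock against each other (ABBA).
 */
bool record_order(struct order_graph *g, enum lock_class outer,
		  enum lock_class inner)
{
	assert(outer < NR_LOCK_CLASSES && inner < NR_LOCK_CLASSES);

	bool inverted = g->before[inner][outer];

	g->before[outer][inner] = true;
	return inverted;
}

/* Replays the two orderings from this email; returns true on inversion. */
bool demo_inversion(void)
{
	struct order_graph g;
	bool first, second;

	memset(&g, 0, sizeof(g));

	/* down_write(mapping->i_mmap_rwsem) --> page->PG_locked */
	first = record_order(&g, I_MMAP_RWSEM, PG_LOCKED);

	/* page->PG_locked --> down_read(mapping->i_mmap_rwsem) */
	second = record_order(&g, PG_LOCKED, I_MMAP_RWSEM);

	return !first && second;
}
```

The first ordering is recorded without complaint; the second one is flagged because the opposite edge already exists. This only checks pairs of locks, which is enough for the two-lock case here; a real checker would have to follow longer chains as well.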
-ss