On Thu, 23 Sep 2010, Catalin Marinas wrote:
On Thu, 2010-09-23 at 16:15 +0100, Arnd Bergmann wrote:
On Thursday 23 September 2010, Nicolas Pitre wrote:
This highmem topic comes from the fact that highmem will be needed in the period of time between now and LPAE where we have boards with lots of memory but we can't address it all without highmem (unless we want to revisit the 3g/1g split, but I personally think not).
Note that LPAE does require highmem to be useful. The only way highmem could be avoided is to move to a 64-bit architecture.
Right, I'd even say LPAE can only make things worse because people will stick even more memory into their systems, most of which then becomes highmem.
If you really need so much memory, it's more efficient to have LPAE + highmem than a swap device. The problem is that if the OS doesn't need that much memory but it is available, Linux still tries to allocate from highmem first. What could help is a different zone fallback mechanism that tries to allocate from lowmem up to a certain threshold.
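To make the proposal above concrete, here is a toy model of the two policies: highmem first, versus lowmem first up to a threshold. It is only a sketch and not kernel code; the function name pick_zone, its parameters and the page counts are all hypothetical.

```python
# Toy model of a zone fallback policy. This is NOT kernel code; all
# names and numbers are made up for illustration.

def pick_zone(free_low, free_high, low_reserve, prefer_low):
    """Return "low", "high" or None for a single-page user allocation.

    free_low / free_high: free pages in each zone.
    low_reserve: lowmem pages held back for kernel-only allocations
                 (this plays the role of the "threshold").
    prefer_low: True for lowmem-first-up-to-threshold,
                False for highmem-first.
    """
    low_ok = free_low > low_reserve   # lowmem usable only above the reserve
    high_ok = free_high > 0
    if prefer_low:
        order = [("low", low_ok), ("high", high_ok)]
    else:
        order = [("high", high_ok), ("low", low_ok)]
    # Walk the fallback list and take the first zone that can satisfy us.
    for zone, ok in order:
        if ok:
            return zone
    return None


print(pick_zone(100, 100, 50, prefer_low=True))
print(pick_zone(100, 100, 50, prefer_low=False))
```

The only difference between the two policies is the order of the fallback list; the reserve protects lowmem in both cases.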
Beware the subtlety here. The kernel will target highmem first for user space allocations, as this is in most cases memory that the kernel won't have to touch. Typically you get user memory populated with application code and data through DMA and the kernel doesn't have to kmap() those pages. Even swapping user space pages doesn't require that the kernel see the content of those pages. But that works out _only_ if IO is performed through DMA, and that DMA can be done on the full physical address range. As soon as you need to bounce data into lowmem you start to lose.
Also, when highmem is involved, the proportion of lowmem pages to highmem pages quickly becomes small (more than 3 times as many highmem pages as lowmem pages when there is 4G of RAM), and lowmem pages become a scarce resource. It is normal in that case to favor highmem page allocations as much as possible.
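The arithmetic behind that ratio can be sketched as follows. The 1 GiB of kernel address space comes from the 3g/1g split mentioned earlier in the thread; the 256 MiB vmalloc reserve is an assumed example value, not a figure from this discussion, and the function name is hypothetical.

```python
# Back-of-the-envelope lowmem/highmem split for a 32-bit kernel.
# All sizes are in MiB. The vmalloc reserve is an ASSUMED example value.

def split_memory(total_ram, kernel_space=1024, vmalloc_reserve=256):
    """Return (lowmem, highmem) for the given amount of RAM."""
    # Lowmem is whatever fits in the kernel's direct mapping.
    lowmem = min(total_ram, kernel_space - vmalloc_reserve)
    # Everything else can only be reached through temporary mappings.
    highmem = total_ram - lowmem
    return lowmem, highmem


low, high = split_memory(4096)
print(f"lowmem={low} MiB highmem={high} MiB ratio={high / low:.2f}")
```

Even with no reserve at all, 4G of RAM gives 3072 MiB of highmem against 1024 MiB of lowmem, a ratio of exactly 3. Any real reserve shrinks lowmem further, which is where "more than 3 times" comes from.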
Another option would be to use the highmem for hosting a swap via some form of ramdisk or slram/phram.
This is useless when highmem is allocated to user space. It is better to simply allocate user space pages in highmem directly and do nothing more than switch page tables on context switch.
Nicolas