From: Andi Kleen The vmap vmalloc rework in 2.5 had a unintended side effect. vmalloc uses kmalloc now to allocate an array with a list of pages. kmalloc has a 128K maximum. This limits the vmalloc maximum size to 64MB on a 64bit system with 4K pages. That limit causes problems with other subsystems, e.g. iptables relies on allocating large vmallocs for its rule sets. This is a bug IMHO - on 64bit platforms there shouldn't be such a low limit on the vmalloc size. And even on 32bit it's too small for custom kernels with enlarged vmalloc area. Another problem is that this makes vmalloc unreliable. After the system has been running for some time it is unlikely that kmalloc will be able to allocate >order 2 pages due to memory fragmentation. This patch takes the easy way out for fixing this by just allocating this array with vmalloc when it is larger than a page. While more complicated and intrusive solutions would be possible they didn't use vmalloc recursively they didn't seem it worth to handle this very infrequent case. Please note that the vmalloc recursion is strictly bounded because each nested allocation will generate a much smaller stack frame. Also the kernel stack can handle even a few recursion steps easily because vmalloc has only a small stack frame. Signed-off-by: Andi Kleen Signed-off-by: Andrew Morton --- 25-akpm/mm/vmalloc.c | 12 ++++++++++-- 1 files changed, 10 insertions(+), 2 deletions(-) diff -puN mm/vmalloc.c~fix-small-vmalloc-per-allocation-limit mm/vmalloc.c --- 25/mm/vmalloc.c~fix-small-vmalloc-per-allocation-limit 2005-02-08 02:47:28.000000000 -0800 +++ 25-akpm/mm/vmalloc.c 2005-02-08 02:47:28.000000000 -0800 @@ -378,7 +378,10 @@ void __vunmap(void *addr, int deallocate __free_page(area->pages[i]); } - kfree(area->pages); + if (area->nr_pages > PAGE_SIZE/sizeof(struct page *)) + vfree(area->pages); + else + kfree(area->pages); } kfree(area); @@ -482,7 +485,12 @@ void *__vmalloc(unsigned long size, int array_size = (nr_pages * sizeof(struct page *)); area->nr_pages = nr_pages; - area->pages = pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM)); + /* Please note that the recursion is strictly bounded. */ + if (array_size > PAGE_SIZE) + pages = __vmalloc(array_size, gfp_mask, PAGE_KERNEL); + else + pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM)); + area->pages = pages; if (!area->pages) { remove_vm_area(area->addr); kfree(area); _