[PATCH v3 1/3] mm: Introduce arch_reserved_kernel_pages()

Srikar Dronamraju srikar at linux.vnet.ibm.com
Mon Aug 29 23:06:48 AEST 2016

Currently, arch-specific code can reserve memory blocks, but
alloc_large_system_hash() may not take them into account when sizing
the hashes. This can make a hash bigger than required and leave no
memory available for other purposes. This is specifically true for

One approach to solving this problem would be to walk through the memblock
regions, calculate the available memory, and base the size of the system
hash on that figure.
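
As a rough illustration of this first approach, the userspace sketch below
sums a list of memory regions, subtracts a list of reserved regions, and
reports the remainder in pages. It is not the kernel's memblock API:
`struct region`, `region_pages()`, `available_pages()` and
`SKETCH_PAGE_SHIFT` are hypothetical names, and the sketch assumes 4 KiB
pages and reserved regions that lie inside memory without overlapping.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a memblock region: base address and size in bytes. */
struct region {
	unsigned long long base;
	unsigned long long size;
};

/* Sum the sizes of all regions and convert the total to whole pages. */
static unsigned long region_pages(const struct region *r, size_t n)
{
	unsigned long long bytes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bytes += r[i].size;

	return (unsigned long)(bytes >> SKETCH_PAGE_SHIFT);
}

/* Pages left over once the reserved regions are subtracted from memory. */
static unsigned long available_pages(const struct region *mem, size_t nmem,
				     const struct region *rsv, size_t nrsv)
{
	unsigned long total = region_pages(mem, nmem);
	unsigned long reserved = region_pages(rsv, nrsv);

	/* Clamp at zero so an oversized reservation cannot wrap around. */
	return reserved < total ? total - reserved : 0;
}
```

A hash sized from the result of available_pages() would then scale with the
memory that is actually usable rather than with the total.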

The other approach would be to depend on the architecture to provide the
number of pages that are reserved. This change provides hooks to allow
the architecture to provide the required info.
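
The hook idea can be sketched in userspace C as follows. This is only an
illustration under stated assumptions, not the kernel code: the names
`sketch_arch_reserved_pages()`, `sketch_hash_base_pages()` and the guard
macro `SKETCH_HAVE_ARCH_RESERVED_PAGES` are hypothetical, and the clamp at
zero is an addition of the sketch that the patch itself does not perform.

```c
/*
 * Generic fallback, compiled only when the "architecture" has not
 * announced its own version by defining the (hypothetical) guard macro.
 */
#ifndef SKETCH_HAVE_ARCH_RESERVED_PAGES
static unsigned long sketch_arch_reserved_pages(void)
{
	/* By default nothing is reserved behind the allocator's back. */
	return 0;
}
#endif

/*
 * Page count on which hash sizing would be based: the kernel page count
 * minus whatever the architecture reports as reserved.
 */
static unsigned long sketch_hash_base_pages(unsigned long nr_kernel_pages)
{
	unsigned long reserved = sketch_arch_reserved_pages();

	/* Assumption of this sketch: clamp instead of wrapping below zero. */
	return reserved < nr_kernel_pages ? nr_kernel_pages - reserved : 0;
}
```

With the fallback in place the sizing input is unchanged, so only an
architecture that supplies its own hook sees smaller hashes.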

Cc: linux-mm at kvack.org
Cc: Mel Gorman <mgorman at techsingularity.net>
Cc: Vlastimil Babka <vbabka at suse.cz>
Cc: Michal Hocko <mhocko at kernel.org>
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: linuxppc-dev at lists.ozlabs.org
Cc: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini at linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen at intel.com>
Cc: Balbir Singh <bsingharora at gmail.com>
Suggested-by: Mel Gorman <mgorman at techsingularity.net>
Signed-off-by: Srikar Dronamraju <srikar at linux.vnet.ibm.com>
---
 include/linux/mm.h |  3 +++
 mm/page_alloc.c    | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 08ed53e..7e91cd8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1924,6 +1924,9 @@ extern void show_mem(unsigned int flags);
 extern long si_mem_available(void);
 extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
+#ifdef __HAVE_ARCH_RESERVED_KERNEL_PAGES
+extern unsigned long arch_reserved_kernel_pages(void);
+#endif
 
 extern __printf(3, 4)
 void warn_alloc_failed(gfp_t gfp_mask, unsigned int order,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3fbe73a..9d91706 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6976,6 +6976,17 @@ static int __init set_hashdist(char *str)
 __setup("hashdist=", set_hashdist);
 #endif
 
+#ifndef __HAVE_ARCH_RESERVED_KERNEL_PAGES
+/*
+ * Returns the number of pages that arch has reserved but
+ * is not known to alloc_large_system_hash().
+ */
+static unsigned long __init arch_reserved_kernel_pages(void)
+{
+	return 0;
+}
+#endif
+
 /*
  * allocate a large system hash table from bootmem
  * - it is assumed that the hash table must contain an exact power-of-2
@@ -7000,6 +7011,7 @@ void *__init alloc_large_system_hash(const char *tablename,
 	if (!numentries) {
 		/* round applicable memory size up to nearest megabyte */
 		numentries = nr_kernel_pages;
+		numentries -= arch_reserved_kernel_pages();
 
 		/* It isn't necessary when PAGE_SIZE >= 1MB */
 		if (PAGE_SHIFT < 20)

