Skip to content

Commit dc90860

Browse files
Mel Gormantorvalds
authored andcommitted
mm: compaction: check for overlapping nodes during isolation for migration
When isolating pages for migration, migration starts at the start of a zone while the free scanner starts at the end of the zone. Migration avoids entering a new zone by never going beyond the free scanned. Unfortunately, in very rare cases nodes can overlap. When this happens, migration isolates pages without the LRU lock held, corrupting lists which will trigger errors in reclaim or during page free such as in the following oops BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff810f795c>] free_pcppages_bulk+0xcc/0x450 PGD 1dda554067 PUD 1e1cb58067 PMD 0 Oops: 0000 [#1] SMP CPU 37 Pid: 17088, comm: memcg_process_s Tainted: G X RIP: free_pcppages_bulk+0xcc/0x450 Process memcg_process_s (pid: 17088, threadinfo ffff881c2926e000, task ffff881c2926c0c0) Call Trace: free_hot_cold_page+0x17e/0x1f0 __pagevec_free+0x90/0xb0 release_pages+0x22a/0x260 pagevec_lru_move_fn+0xf3/0x110 putback_lru_page+0x66/0xe0 unmap_and_move+0x156/0x180 migrate_pages+0x9e/0x1b0 compact_zone+0x1f3/0x2f0 compact_zone_order+0xa2/0xe0 try_to_compact_pages+0xdf/0x110 __alloc_pages_direct_compact+0xee/0x1c0 __alloc_pages_slowpath+0x370/0x830 __alloc_pages_nodemask+0x1b1/0x1c0 alloc_pages_vma+0x9b/0x160 do_huge_pmd_anonymous_page+0x160/0x270 do_page_fault+0x207/0x4c0 page_fault+0x25/0x30 The "X" in the taint flag means that external modules were loaded but but is unrelated to the bug triggering. The real problem was because the PFN layout looks like this Zone PFN ranges: DMA 0x00000010 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x01e80000 Movable zone start PFN for each node early_node_map[14] active PFN ranges 0: 0x00000010 -> 0x0000009b 0: 0x00000100 -> 0x0007a1ec 0: 0x0007a354 -> 0x0007a379 0: 0x0007f7ff -> 0x0007f800 0: 0x00100000 -> 0x00680000 1: 0x00680000 -> 0x00e80000 0: 0x00e80000 -> 0x01080000 1: 0x01080000 -> 0x01280000 0: 0x01280000 -> 0x01480000 1: 0x01480000 -> 0x01680000 0: 0x01680000 -> 0x01880000 1: 0x01880000 -> 0x01a80000 0: 0x01a80000 -> 0x01c80000 1: 0x01c80000 -> 0x01e80000 The fix is straight-forward. isolate_migratepages() has to make a similar check to isolate_freepage to ensure that it never isolates pages from a zone it does not hold the LRU lock for. This was discovered in a 3.0-based kernel but it affects 3.1.x, 3.2.x and current mainline. Signed-off-by: Mel Gorman <[email protected]> Acked-by: Michal Nazarewicz <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
1 parent 1ecd3c7 commit dc90860

File tree

1 file changed

+10
-1
lines changed

1 file changed

+10
-1
lines changed

mm/compaction.c

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -330,8 +330,17 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
330330
continue;
331331
nr_scanned++;
332332

333-
/* Get the page and skip if free */
333+
/*
334+
* Get the page and ensure the page is within the same zone.
335+
* See the comment in isolate_freepages about overlapping
336+
* nodes. It is deliberate that the new zone lock is not taken
337+
* as memory compaction should not move pages between nodes.
338+
*/
334339
page = pfn_to_page(low_pfn);
340+
if (page_zone(page) != zone)
341+
continue;
342+
343+
/* Skip if free */
335344
if (PageBuddy(page))
336345
continue;
337346

0 commit comments

Comments
 (0)