Thanks a bucnh for the review!
+
+ if (PageOffline(page) && PageOfflineSkippable(page))
+ continue;
+
Some comment like "Skippable PageOffline() pages are not migratable but are
skipped during memory offline" might help understand the change.
I can add a comment like for the other cases.
Some thoughts after reading the related code:
offline_pages() is a little convoluted, since it has two steps to make
sure a range of memory can be offlined:
1. start_isolate_page_range() checks unmovable (maybe not migratable
is more precise) pages using has_unmovable_pages(), but leaves unmovable
PageOffline() page handling to the driver;
Right, it's best-effort. For ZONE_MOVABLE we're skipping the checks
completely.
I was wondering for a longer time that -- with the isolate flag being a
separate bit soon :) -- we could simply isolate the whole range, and
then fail if we stumble over an unmovable page during migration. That
is, get rid of has_unmovable_pages() entirely. Un-doing the isolation
would then properly preserve the migratetype -- no harm done?
Certainly worth a look. What do you think about that?
2. scan_movable_pages() and do_migrate_range() migrate pages and handle
different types of PageOffline() pages.
Right, migrate what we can, skip these special once, and bail out on any
others (at least in this patch, patch #2 restores the pre-virtio-mem
behavior).
It might make the logic cleaner if start_isolate_page_range() can
have a callback to allow driver to handle PageOffline() pages.
We have to identify them repeadetly, so start_isolate_page_range() would
not be sufficient. Also, callbacks are rather tricky for the case where
we cannot really stabilize the page.
Hmm, Skippable PageOffline() pages are virtio-mem specific, I wonder
why offline_pages() takes care of them. Shouldn't virtio-mem driver
handle them?
> I also realize that the two steps in offline_pages()> are very
similar to alloc_contig_range() and virtio-mem is using
alloc_contig_range(). I wonder if the two steps in offline_pages()
can be abstracted out and shared with virtio-mem.Yes, offline_pages()> operates at memory section granularity, different from the granularity,
pageblock size, of alloc_contig_range() used in virtio-mem, but
both are trying to check unmovable pages and migrate movable pages.
Not sure I get completely what you mean. virtio-mem really wants to use
alloc_contig_range() to allocate ranges it wants to unplug, because it
will fail fast and allows for smaller ranges.
offline_pages() is primarily also about offlining the memory section,
which virtio-mem must do in order to remove the Linux memory block.
folio = page_folio(page);
if (!folio_try_get(folio))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a6fe1e9b95941..325b51c905076 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7023,12 +7023,12 @@ unsigned long __offline_isolated_pages(unsigned long start_pfn,
continue;
}
/*
- * At this point all remaining PageOffline() pages have a
- * reference count of 0 and can simply be skipped.
+ * At this point all remaining PageOffline() pages must be
+ * "skippable" and have exactly one reference.
*/
if (PageOffline(page)) {
- BUG_ON(page_count(page));
- BUG_ON(PageBuddy(page));
+ WARN_ON_ONCE(!PageOfflineSkippable(page));
+ WARN_ON_ONCE(page_count(page) != 1);
already_offline++;
pfn++;
continue;
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index b2fc5266e3d26..debd898b794ea 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -121,16 +121,11 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
continue;
/*
- * We treat all PageOffline() pages as movable when offlining
- * to give drivers a chance to decrement their reference count
- * in MEM_GOING_OFFLINE in order to indicate that these pages
- * can be offlined as there are no direct references anymore.
- * For actually unmovable PageOffline() where the driver does
- * not support this, we will fail later when trying to actually
- * move these pages that still have a reference count > 0.
- * (false negatives in this function only)
+ * PageOffline() pages that are marked as "skippable"
+ * are treated like movable pages for memory offlining.
*/
- if ((flags & MEMORY_OFFLINE) && PageOffline(page))
+ if ((flags & MEMORY_OFFLINE) && PageOffline(page) &&
+ PageOfflineSkippable(page))
continue;
With this change, we are no longer give non-virtio-mem driver a chance
to decrease PageOffline(page) refcount? Or virtio-mem is the only driver
doing this?
The only in-tree driver making use of this so far, yes.
There is one tweak I'll have to perform in the virtio-mem driver to
cover one corner case: when force-unloading the virtio-mem driver, we
have to make sure that memory blocks with fake-offline pages cannot get
offlined (re-onlining would be bad). I'll fix that up.
--
Cheers,
David / dhildenb