[PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

Dan Williams dan.j.williams at intel.com
Thu Mar 21 07:57:25 AEDT 2019


On Wed, Mar 20, 2019 at 8:34 AM Dan Williams <dan.j.williams at intel.com> wrote:
>
> On Wed, Mar 20, 2019 at 1:09 AM Aneesh Kumar K.V
> <aneesh.kumar at linux.ibm.com> wrote:
> >
> > Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> writes:
> >
> > > Dan Williams <dan.j.williams at intel.com> writes:
> > >
> > >>
> > >>> Now what will be page size used for mapping vmemmap?
> > >>
> > >> That's up to the architecture's vmemmap_populate() implementation.
> > >>
> > >>> Architectures
> > >>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> > >>> device-dax with struct page in the device will have pfn reserve area aligned
> > >>> to PAGE_SIZE with the above example? We can't map that using
> > >>> PMD_SIZE page size?
> > >>
> > >> IIUC, that's a different alignment. Currently that's handled by
> > >> padding the reservation area up to a section (128MB on x86) boundary,
> > >> but I'm working on patches to allow sub-section sized ranges to be
> > >> mapped.
> > >
> > > I am missing something w.r.t. the code. The code below aligns that using nd_pfn->align:
> > >
> > >       if (nd_pfn->mode == PFN_MODE_PMEM) {
> > >               unsigned long memmap_size;
> > >
> > >               /*
> > >                * vmemmap_populate_hugepages() allocates the memmap array in
> > >                * HPAGE_SIZE chunks.
> > >                */
> > >               memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
> > >               offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
> > >                               nd_pfn->align) - start;
> > >       }
> > >
> > > IIUC that is finding the offset where to put vmemmap start. And that has
> > > to be aligned to the page size with which we may end up mapping vmemmap
> > > area right?
>
> Right, that's the physical offset of where the vmemmap ends, and the
> memory to be mapped begins.
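
To make the arithmetic concrete, here is a userspace sketch of the
calculation quoted above. It is not the kernel code: align_up() and
pfn_data_offset() are names made up for illustration, hpage_size stands
in for HPAGE_SIZE, and the 64 is the struct page size hard-coded in the
quoted snippet.

```c
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_8K 0x2000ULL
#define SZ_2M 0x200000ULL

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical userspace mirror of the quoted PFN_MODE_PMEM branch:
 * returns the offset from 'start' at which the memmap reservation ends
 * and the memory to be mapped begins.
 */
static uint64_t pfn_data_offset(uint64_t start, uint64_t npfns,
				uint64_t hpage_size, uint64_t label_reserve,
				uint64_t align)
{
	/* The memmap array is sized in hpage_size chunks. */
	uint64_t memmap_size = align_up(64 * npfns, hpage_size);

	return align_up(start + SZ_8K + memmap_size + label_reserve, align)
		- start;
}
```

With start = 0, no label reserve, and npfns = 262144 (1GB of 4K pages),
the memmap is 16MB. An align of SZ_2M puts the data offset at 0x1200000,
while an align of SZ_4K puts it at 0x1002000, which is not a 2MB
boundary.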
>
> > > Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> > > is to compute how many pfns we should map for this pfn dev right?
> > >
> >
> > Also I guess those 4K assumptions there are wrong?
>
> Yes, I think to support non-4K-PAGE_SIZE systems the 'pfn' metadata
> needs to be revved and the PAGE_SIZE needs to be recorded in the
> info-block.

How often does a system change page size? Is it fixed, or does the
environment change it from one boot to the next? I'm thinking through
what to do when the PAGE_SIZE recorded in the info-block
does not match the current system page size. The simplest option is to
just fail the device and require it to be reconfigured. Is that
acceptable?
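
As a rough sketch of that "just fail the device" option, something like
the following. None of this exists today: the struct, the page_size
field, and pfn_check_page_size() are hypothetical names, and treating a
recorded value of 0 as a legacy 4K info-block is an assumption of this
sketch, not a settled format decision.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, cut-down info-block: only the field under discussion. */
struct pfn_sb_sketch {
	uint32_t page_size;	/* PAGE_SIZE recorded at creation time */
};

/*
 * Return 0 when the recorded page size matches the running system,
 * -EOPNOTSUPP otherwise so the caller can fail the device.
 *
 * Assumption: a recorded value of 0 means the info-block predates the
 * new field and was written under the implicit 4K assumption.
 */
static int pfn_check_page_size(const struct pfn_sb_sketch *sb,
			       uint32_t sys_page_size)
{
	uint32_t recorded = sb->page_size ? sb->page_size : 4096;

	if (recorded != sys_page_size)
		return -EOPNOTSUPP;
	return 0;
}
```

Under this sketch a namespace created on a 64K system would fail the
check when booted with 4K pages, and would need to be reconfigured
before it could be used again.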

