[PATCH v2 1/4] mm: Add optional close() to struct vm_special_mapping

Linus Torvalds torvalds at linux-foundation.org
Tue Aug 20 05:29:34 AEST 2024


On Mon, 19 Aug 2024 at 11:53, Nathan Chancellor <nathan at kernel.org> wrote:
>
>
> Modules linked in:
> Pid: 24, comm: mount Not tainted 6.11.0-rc4-next-20240819
> RIP: 0033:0x68006f6c
> RSP: 000000006c8bfc68  EFLAGS: 00010206
> RAX: 0000000068006f6c RBX: 0000000068a0aa18 RCX: 00000000600d8b09
> RDX: 0000000000000000 RSI: 0000000068a0aa18 RDI: 0000000068805120
> RBP: 000000006c8bfc70 R08: 0000000000000001 R09: 0000000068ae0308
> R10: 000000000000000e R11: ffffffffffffffff R12: 0000000000000001
> R13: 0000000068a0aa18 R14: 0000000000000015 R15: 0000000068944a88
> Kernel panic - not syncing: Segfault with no mm
> CPU: 0 UID: 0 PID: 24 Comm: mount Not tainted 6.11.0-rc4-next-20240819 #1
> Stack:
>  600caeff 6c8bfc90 600d8b2a 68944a80
>  00000047 6c8bfda0 600cbfd9 6c8bfd50
>  68944ad0 68944a88 7f7ffff000 7f7fffffff
> Call Trace:
>  [<600caeff>] ? special_mapping_close+0x16/0x19

Hmm. No "Code:" line? Did you just edit that out, or maybe UML doesn't
print one out?

Anyway, for me that special_mapping_close() disassembles to


 <+0>:  mov    %rdi,%rsi
 <+3>:  mov    0x78(%rdi),%rdi
 <+7>:  mov    0x20(%rdi),%rax
 <+11>: test   %rax,%rax
 <+14>: je     0x600caa11 <special_mapping_close+24>
 <+16>: push   %rbp
 <+17>: mov    %rsp,%rbp
 <+20>: call   *%rax
 <+22>: pop    %rbp
 <+23>: ret
 <+24>: ret

which may just match yours, because special_mapping_close+0x16 is
obviously that +22, and it's the return point for that call.

And your %rax value does match that invalid %rip value of 0x68006f6c.

So it does look like it's jumping off to la-la-land, and the problem is the code

        const struct vm_special_mapping *sm = vma->vm_private_data;

        if (sm->close)
                sm->close(sm, vma);

where presumably 'vm_private_data' isn't a "struct vm_special_mapping *" at all.

And I think I see the problem.

When we have that 'legacy_special_mapping_vmops', then the
vm_private_data field actually points to 'pages'.

So the 'legacy_special_mapping_vmops' can *only* contain the '.fault'
handler, not the other handlers.

IOW, does something like this fix it?

  --- a/mm/mmap.c
  +++ b/mm/mmap.c
  @@ -2095,7 +2095,6 @@ static const struct vm_operations_struct
special_mapping_vmops = {
   };

   static const struct vm_operations_struct legacy_special_mapping_vmops = {
  -       .close = special_mapping_close,
          .fault = special_mapping_fault,
   };

and honestly, we should have different 'fault' functions instead of
having the same fault function and then making the function
dynamically see which vm_operations_struct it was. But that's a
separate issue.

And oh Christ, the difference between "_install_special_mapping()"
(which installs the modern style special mapping) and
"install_special_mapping()" (which installs the legacy special
mapping) is truly horrendously disgusting.

And yes, UML uses that legacy crap, which explains why it happens on
UML but not on sane architectures.

As does csky, hexagon, and nios2.

We should get rid of the legacy case entirely, and remove that insane
difference between _install_special_mapping() and
install_special_mapping().

But in the meantime, the one-liner fix *should* fix it. The four
architectures that uses the legacy crud don't care about the close
function anyway.

What a horrid thing.

                Linus


More information about the Linuxppc-dev mailing list