Huge page support for PowerPC 32 bit and WIMG flexibility
Kumar Gala
galak at kernel.crashing.org
Thu Feb 1 09:29:25 EST 2007
On Jan 31, 2007, at 4:01 PM, Ilya Lipovsky wrote:
> Hi,
>
> I am not experienced in kernel development, so please be patient.
>
> After exploring the latest (2.6.19.2) sources it appears that there
> is no huge page support for the 32 bit powerpc platform. I deduced
> it by starting from 0x300 in head_32.S and comparing notes with
> head_64.S. It appears that the only sensible path for hashing in a
> huge page (on 64bit ppc) is via:
>
> 0x300: data_access -> do_hash_page -> hash_page -> hash_huge_page
>
> Unfortunately, on the 32bit, all paths that do anything useful end
> up in create_hpte() found in hash_low_32.S. I noticed someone on
> this mailing list claiming huge page support for IBM 44x core… Is
> it possible to make it general enough to encompass ppc32 in general?
>
> Another issue I have is the absence of control over hardware
> specific attributes of memory such as WIMG. More concretely, I am
> interested in having the ability to allocate off the heap in such a
> way so as to explicitly set the M (coherency) bit off
> (independently of SMP or non-SMP mode). This is needed because some
> multicore PowerPC platforms (e.g. 745x) perform an extra address
> broadcast to guarantee cache coherency per each store miss on a
> cacheline. This degrades performance for store-bound programs.
>
> I understand that hashing pages as non-cache-coherent makes data
> contained therein a potential victim to cache coherency paradoxes.
> Nevertheless, since I am working on a high-performance library, I am
> prepared to shift coherency guarantees to the library, which is
> supposed to be the one managing the data flow between memory and CPU
> caches intelligently.
>
> So, I have 2 main questions:
>
> 1) What’s so special about ppc32 that it didn’t get the
> matching feature of huge page support that ppc64 has? Who is
> responsible/willing to fix it?
The ppc32 HW doesn't support the same MMU features that ppc64 does.
There's a possibility for something like hugetlbfs support using BATs,
but the normal MMU path doesn't have any HW capable of doing large
pages.
> 2) Is it appropriate to provide a syscall mechanism (parallel
> to sys_brk, sys_mmap, and sys_shmget) to add WIMG settings?
You can do some of this via mmap today. I think opening the file with
O_SYNC is what you need (well, at least for mmap'ing /dev/mem).
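A rough sketch of what that looks like from user space. The helper name
map_sync is made up for illustration, and whether the kernel actually
turns O_SYNC into different WIMG bits for the mapping depends on the
architecture and the driver behind the file, so treat that part as an
assumption to verify on your platform. For /dev/mem you need root, and
the offset is the page-aligned physical address you want.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an existing API): open `path` with O_SYNC and
 * map `len` bytes starting at `offset`. Returns MAP_FAILED on any error. */
void *map_sync(const char *path, off_t offset, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return MAP_FAILED;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);
    return p;
}
```

Something like map_sync("/dev/mem", phys_addr, length) would be the call
for the /dev/mem case; unmap with munmap() when done.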
> Overall, the vision here is to be able (from user-side, on
> powerpc32) to call:
>
> shmid = shmget(2, LENGTH, SHM_HUGETLB | IPC_CREAT | SHM_R | SHM_W |
> POWERPC_NONCOHERENT);
>
> shmaddr = shmat(shmid, ADDR, SHMAT_FLAGS);
>
> And get a segment mapped with wimg=0bxx0x (actually, I assume all
> x’s are 0). This would be very nice!
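For comparison, a sketch of the same sequence using only flags that
exist today. POWERPC_NONCOHERENT above is a proposed flag, not a real
one, so it is left out here. The helper name attach_segment is
illustrative, and the SHM_HUGETLB request will simply fail on a kernel
or platform without huge page support (which, per the above, includes
ppc32 right now).

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000  /* Linux value, in case the libc headers hide it */
#endif

/* Hypothetical helper: create a private segment of `len` bytes, optionally
 * asking for huge pages, and attach it. Returns NULL on failure. */
void *attach_segment(size_t len, int want_huge)
{
    int flags = IPC_CREAT | 0600;  /* owner read/write */
    if (want_huge)
        flags |= SHM_HUGETLB;

    int shmid = shmget(IPC_PRIVATE, len, flags);
    if (shmid < 0)
        return NULL;

    void *addr = shmat(shmid, NULL, 0);

    /* Mark for removal now; the segment goes away after the last detach. */
    shmctl(shmid, IPC_RMID, NULL);

    return addr == (void *)-1 ? NULL : addr;
}
```

With want_huge set, len should be a multiple of the huge page size;
detach with shmdt() when finished.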
>
> Thank you,
>
> -Ilya
>
> P.S. As a side note, it is pretty difficult to read kernel sources
> (especially assembly ones) because of the lack of comments for
> people who are not in the kernel hacker “circle.” For example, what
> in the whole world is “paca??”
"paca" has to do with the IBM HV (hypervisor) interface.
- k