powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

Arnd Bergmann arnd at arndb.de
Sun Aug 7 07:13:17 AEST 2016


On Saturday, August 6, 2016 2:17:16 PM CEST Nicholas Piggin wrote:
> On Fri, 05 Aug 2016 21:16:00 +0200
> Arnd Bergmann <arnd at arndb.de> wrote:
> 
> > On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > > > 
> > > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > > index 0ec807d69f18..7a3ad269fa23 100644
> > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > @@ -433,7 +433,7 @@
> > > >   * during second ld run in second ld pass when generating System.map */
> > > >  #define TEXT_TEXT                                                    \
> > > >               ALIGN_FUNCTION();                                       \
> > > > -             *(.text.hot .text .text.fixup .text.unlikely)           \
> > > > +             *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
> > > >               *(.ref.text)                                            \
> > > >       MEM_KEEP(init.text)                                             \
> > > >       MEM_KEEP(exit.text)                                             \
> > > > 
> > > > 
> > > > It also got much faster again, the link time for an allyesconfig
> > > > kernel is now 18 minutes instead of 10 hours, but it's still
> > > > much worse than the 2 minutes I had earlier or the four minutes
> > > > with the previous patch.  
> > > 
> > > Are you using the patches I just sent?  
> > 
> > Not yet, I was still busy with the older version, and trying to
> > figure out exactly what went wrong in ld.bfd. FWIW, I first tried
> > to see if the hash tables were just too small, but as it turned
> > out that was not the problem. When I tried to change the default
> > hash table sizes, making them bigger only made things slower.
> > 
> > I also found the --hash-size=xxx option, which has a significant
> > impact on runtime speed. Interestingly again, using sizes less
> > than the default made things faster in practice. If we can
> > work out the optimum size for the kernel build, that might
> > shave a few minutes off the total build time.
> > 
> > > Either way, you also need
> > > to do the same for data and bss sections as you are using
> > > -fdata-sections too.  
> > 
> > Right.
> > 
> > > I've found virtually no build time regression on powerpc or x86
> > > when those are taken care of properly (x86 numbers I sent are typo,
> > > it's not 5m20, it's 5m02).  
> > 
> > Interesting. I wonder if it's got something to do with the
> > generation of the branch trampolines on ARM, as we have a lot
> > of them on an allyesconfig.
> 
> Powerpc generates quite a few branch trampolines as well, so
> I'm not sure if that would be the issue. Can you get a profile
> of the link?


CPU: AMD64 family15h, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
1212556  63.6990  ld-new                   bfd_hash_lookup
416050   21.8563  ld-new                   bfd_hash_hash
64861     3.4073  no-vmlinux               /no-vmlinux
59038     3.1014  ld-new                   bfd_hash_traverse
13873     0.7288  ld-new                   bfd_get_next_section_by_name
9880      0.5190  ld-new                   strrevcmp

I've manually marked bfd_hash_hash as __attribute__((noinline))
to see it separately from bfd_hash_lookup.

The vast majority of these calls seem to come from _bfd_elf_strtab_add
and from bfd_get_section_by_name/bfd_get_next_section_by_name.

While I first thought the hash tables were too slow, investigating
further showed that most of the hash tables are really small
(and appropriately sized); we just do a lot of lookups on them.
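
To make that concrete, here is a minimal sketch in Python of a chained
string hash table that counts the string comparisons done by lookups.
It is not BFD's code: the ChainedTable class, the hash function, the
section names and the sizes are all assumptions picked for
illustration. It only shows that with a fixed, adequately sized table
the total work scales with the number of lookups.

```python
class ChainedTable:
    """Toy chained hash table keyed by strings (not BFD's implementation)."""

    def __init__(self, nbuckets):
        self.buckets = [[] for _ in range(nbuckets)]
        self.compares = 0  # string comparisons performed by lookups

    def _hash(self, key):
        # Simple multiplicative string hash; an assumption, not bfd_hash_hash.
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) & 0xFFFFFFFF
        return h % len(self.buckets)

    def lookup(self, key, create=False):
        # Walk the bucket's chain, comparing strings until a match is found.
        chain = self.buckets[self._hash(key)]
        for entry in chain:
            self.compares += 1
            if entry == key:
                return entry
        if create:
            chain.append(key)
            return key
        return None


def run(nbuckets, nsections, passes):
    """Insert nsections names, then look each one up `passes` times.

    Returns the number of string comparisons done by the lookup passes only.
    """
    table = ChainedTable(nbuckets)
    # Hypothetical per-function section names, as -ffunction-sections creates.
    names = [".text.func%d" % i for i in range(nsections)]
    for name in names:
        table.lookup(name, create=True)
    table.compares = 0  # ignore the cost of the initial inserts
    for _ in range(passes):
        for name in names:
            table.lookup(name)
    return table.compares


if __name__ == "__main__":
    for passes in (1, 10, 100):
        print(passes, run(1024, 500, passes))
```

The table contents do not change between passes, so every pass costs
the same and the total is simply passes times the per-pass cost. Making
the table bigger cannot reduce the number of lookups, only the chain
length per lookup.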

> Are you linking with archives? Do your input archives have a
> symbol index built?

Yes to the archives, and I don't know about the symbol index. I've
moved on to your new patches now, will see how that goes.

> > Is the 5m20 the total build time for the kernel, the time for
> > rebuilding after a trivial change, or the time to call 'ld.bfd'
> > once?
> 
> 5m02 was the total time for x86 defconfig. With the powerpc
> allyesconfig build, the final link:
> 
> $ time ld -EL -m elf64lppc -pie --emit-relocs --build-id --gc-sections -X -o vmlinux -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive built-in.o .tmp_kallsyms2.o
> 
> real	0m15.556s
> user	0m13.288s
> sys	0m2.240s
> 
> $ ls -lh vmlinux
> -rwxrwxr-x 1 npiggin npiggin 279M Aug  6 14:02 vmlinux
> 
> Without -pie --emit-relocs it's 11.8s and 150M but I'm using
> emit-relocs for a post-link step.

Interesting, that does sound more like an ARM-specific bug in ld
then.

> > Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
> > works and is really fast, or it crashes, depending on the
> > configuration. I also don't think it supports big-endian ARM
> > (which is what allyesconfig ends up using).
> 
> ld.bfd on both. Gold crashed on powerpc and I didn't try it on x86.

Ok.

	Arnd


