[Development] how to reduce the relocation <-- Use static qt libraries

song.7.liu at nokia.com song.7.liu at nokia.com
Sun Jul 29 12:12:50 CEST 2012


> After changing to _protected_ visibility, that kind of relocation is reduced, but I still don't know why more R_ARM_RELATIVE relocations are introduced.

To answer my own question: that is because the load address of the module needs to be added to obtain the actual address of each virtual function.
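
To illustrate the point, here is a minimal sketch of what a loader conceptually does for a relative relocation: it adds the module's load address to a word that already holds a link-time address. This is a simplified model and not actual dynamic linker code; `applyRelative`, the image layout and the base address in the usage note are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of REL-style relative relocations (such as R_ARM_RELATIVE):
// each listed slot already holds a link-time address, and the loader only
// adds the load base to it. No symbol lookup is involved.
void applyRelative(std::vector<std::uint32_t>& image,
                   const std::vector<std::size_t>& slots,
                   std::uint32_t loadBase)
{
    for (std::size_t slot : slots) {
        assert(slot < image.size()); // slot must lie inside the image
        image[slot] += loadBase;
    }
}
```

For example, a vtable slot holding the link-time value 0x00311e4b becomes 0x40311e4b if the module is loaded at the (hypothetical) base 0x40000000.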

So for Qt 5, should we change the visibility of all exported symbols to _protected_? Or is there still some existing use case that needs _default_ visibility?

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia.com at qt-project.org [mailto:development-bounces+song.7.liu=nokia.com at qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 6:02 PM
To: thiago.macieira at intel.com; development at qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

I think I now understand that R_ARM_ABS32 is about *referencing* the address of a function.
For a C++ class with virtual functions, the virtual table cannot know the actual address of a virtual function that has _default_ visibility, so an R_ARM_ABS32 relocation is needed.
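
A minimal sketch of the kind of class discussed here, assuming GCC or Clang on an ELF target. `DEMO_EXPORT`, `Dispatcher` and `describe` are hypothetical names for this example only; Qt's real export macros are different. The vtable of the class stores the address of the out-of-line virtual function, and that slot is what the loader has to relocate when the class lives in a shared library.

```cpp
#include <cassert>
#include <string>

// Hypothetical export macro for this sketch only: "protected" visibility
// where the toolchain supports it, nothing otherwise.
#if defined(__GNUC__) && defined(__ELF__)
#  define DEMO_EXPORT __attribute__((visibility("protected")))
#else
#  define DEMO_EXPORT
#endif

class DEMO_EXPORT Dispatcher
{
public:
    virtual ~Dispatcher() {}
    // Out-of-line virtual: its address is stored in Dispatcher's vtable,
    // and that vtable slot is what gets relocated at load time.
    virtual std::string name() const;
};

std::string Dispatcher::name() const { return "Dispatcher"; }

// Calls through the vtable, so the relocated slot is actually used.
std::string describe(const Dispatcher* d)
{
    assert(d != nullptr);
    return d->name();
}
```

Building such a class into a shared object once with "default" and once with "protected" visibility, and comparing the `readelf -r` output, is one way to observe the difference described in this thread.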

After changing to _protected_ visibility, that kind of relocation is reduced, but I still don't know why more R_ARM_RELATIVE relocations are introduced.

If anything is wrong, please correct me ;-)

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia.com at qt-project.org [mailto:development-bounces+song.7.liu=nokia.com at qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 4:13 PM
To: thiago.macieira at intel.com; development at qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

Hi,

I want to share some results about the relocations performed during loading (with RTLD_LAZY).

Relocation counts for a single .so (libqt5), without the optimization:
    R_ARM_GLOB_DAT: 1585
    R_ARM_RELATIVE: 9823
    R_ARM_ABS32: 19489
    R_ARM_JUMP_SLOT: 16998

Relocation counts for a single .so (libqt5), with the optimization:
    R_ARM_GLOB_DAT: 1578
    R_ARM_RELATIVE: 28227
    R_ARM_ABS32: 435
    R_ARM_JUMP_SLOT: 290

The only optimization done here is changing the visibility of exported symbols from "default" to "protected", thanks to Thiago's blog ;).
So:

- the R_ARM_JUMP_SLOT relocations are reduced significantly,
   but they are only resolved at run time (because of RTLD_LAZY), so they are irrelevant to loading performance.

- the R_ARM_RELATIVE relocations have increased, but this type of relocation is very fast.

- for loading time, the actual bottleneck is the R_ARM_ABS32 relocations, which are now reduced by roughly 98% (from 19489 to 435)!

Finally the overall loading time is reduced from ~10-20s to ~1s...
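
The counts listed above can be checked with a little arithmetic. The helper below is only an illustration (`percentChange` is not from the original measurements); it computes the relative change for one relocation type from its count before and after.

```cpp
#include <cassert>

// Percentage change from `before` to `after`; a negative result is a reduction.
double percentChange(long before, long after)
{
    assert(before != 0); // avoid division by zero
    return 100.0 * (after - before) / before;
}
```

Applied to the counts above, this gives roughly -97.8% for R_ARM_ABS32 (19489 to 435), -98.3% for R_ARM_JUMP_SLOT (16998 to 290) and +187% for R_ARM_RELATIVE (9823 to 28227). The drop in R_ARM_ABS32 (19054 entries) is close to the rise in R_ARM_RELATIVE (18404 entries), which is consistent with most of the former turning into the latter.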

But I still have some questions about the R_ARM_ABS32 relocation.

It seems that if a function is virtual (with "default" visibility), it is added to .rel.dyn with the R_ARM_ABS32 type, for example:
007b0124  0011a802 R_ARM_ABS32            00311e4b   _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE
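
The second column of such an entry is the ELF32 `r_info` word, which packs the symbol index and the relocation type. Below is a small sketch of how it can be decoded, using the entry quoted above. `RelInfo`, `decodeInfo`, `armRelocName` and `checkQuotedEntry` are names made up for this example, and only the four relocation types mentioned in this thread are mapped.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct RelInfo {
    std::uint32_t symbolIndex; // index into the dynamic symbol table
    std::uint8_t type;         // relocation type code
};

// ELF32 packs r_info as (symbol index << 8) | type.
RelInfo decodeInfo(std::uint32_t info)
{
    RelInfo r;
    r.symbolIndex = info >> 8;
    r.type = static_cast<std::uint8_t>(info & 0xffu);
    return r;
}

// Names for the ARM relocation types that appear in this thread.
std::string armRelocName(std::uint8_t type)
{
    switch (type) {
    case 2:  return "R_ARM_ABS32";
    case 21: return "R_ARM_GLOB_DAT";
    case 22: return "R_ARM_JUMP_SLOT";
    case 23: return "R_ARM_RELATIVE";
    default: return "other";
    }
}

// Worked example: the r_info value 0x0011a802 from the entry above.
void checkQuotedEntry()
{
    const RelInfo r = decodeInfo(0x0011a802u);
    assert(armRelocName(r.type) == "R_ARM_ABS32");
    assert(r.symbolIndex == 0x11a8u);
}
```

The low byte 0x02 matches the R_ARM_ABS32 type printed in the entry, and the remaining bits (0x11a8) select the dynamic symbol whose name is shown in the last column.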

Could someone help with the questions below?
1. Why does a virtual function with "default" visibility need a relocation even if it is implemented inside the library?
2. When it is changed to "protected" visibility, I guess it is optimized to add a GOT.PLT entry as an R_ARM_RELATIVE relocation; is that true?

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia.com at qt-project.org [mailto:development-bounces+song.7.liu=nokia.com at qt-project.org] On Behalf Of ext Thiago Macieira
Sent: Tuesday, July 24, 2012 10:29 PM
To: development at qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

On Tuesday, 24 July 2012 13.22.25, song.7.liu at nokia.com wrote:
> Yes, the loading bottleneck is now the local relocations rather than
> the inter-library ones.
> 
> So what we want to do is reduce the number of local relocations.
> 
> Based on my understanding, these local relocations are caused by
> "symbol inter-positioning".

That's not exactly the case. Some types of relocations do permit symbol interpositioning. But some types of code require relocations even if they're not interposable.

In my listing, all the "local" relocations are non-interposable.

More information:
http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
http://www.macieira.org/blog/2012/01/update-and-benchmark-on-the-dynamic-library-proposals/

> And from gcc option -Bsymbolic:
> "
> When creating a shared library, bind references to global symbols to 
> the definition within the shared library, if any. Normally, it is 
> possible for a program linked against a shared library to override the 
> definition within the shared library. This option is only meaningful 
> on ELF platforms which support shared libraries. "
> 
> But for my case, it's not needed to override the definition within the 
> libqt5.so.

Yes, it is.

But you didn't realise that your code requires relocations even if the symbols can't be overridden.

In order to do that, you need a fully position *dependent* code that can't be moved. Executables on Linux are like that, but all libraries are movable in memory, even those compiled without -fPIC.

Since you're not running Linux, check if your OS supports that. Note that you'll need to know the exact load address at build time and that it must match the loaded address for the ROM if you want to do XIP.

> So, besides the prelink solution, I think the compiler (I mean
> armlink) should provide the ability to disable this symbol 
> inter-positioning, just like the -Bsymbolic in gcc.
> 
> Does anyone have idea from the compiler point of view ?

Sorry, you're barking up the wrong tree.

Your only option to reduce the number of relocations is to prelink to the exact load address. There are two ways of doing that:

1) the ELF prelinker, which prelinks all relocations to a given address, but does still allow relocating if the shared object is loaded at a different address. The code is PIC, so XIP should work just fine.

2) compile without PIC and prelink at a specific address at link time, which means that the code must be loaded there or it will fail to run. This is the Windows DLL model.

> 
> Also, I see that Qt uses "-Bsymbolic-functions" for some
> optimization; is that a similar way to reduce relocations?

Yes. Read my blogs for more detail.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
_______________________________________________
Development mailing list
Development at qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


