[Development] As Qt contemplates its future..

Mon Apr 17 01:39:26 CEST 2017

On Apr 15, 2017, at 19:14, Randall O'Reilly <randy.oreilly at Colorado.EDU> wrote:

> On Apr 16, 2017, at 1:28 AM, Thiago Macieira <thiago.macieira at intel.com> wrote:
>> 
>> On Friday, April 14, 2017, at 22:38:25 PDT, Randall O'Reilly 
>> wrote:
>>> One of the major innovations in Go is that it avoids all of those problems.
>>> You only ever write things once, in one place (no .h vs. .cpp), and, like
>>> an interpreted language, the only distribution mechanism *is the source
>>> itself*.  There is no such thing as binary compatibility.
>> 
>> Because there's no such thing as binary distribution in the first place. That 
>> means you cannot provide a component without the source. If we insisted on all 
>> Qt users simply recompiling every time that Qt changed, then we could apply 
>> the same to C++ and only retain source compatibility. That is, after all, what 
>> Boost does.
> 
> That’s why the Go folks worry so much about super-fast compile times..

I like that.  I like how easy it is to import libraries of source code from GitHub without downloading them ahead of time, let alone having them installed as dependencies.

But how much optimizing can such a compiler afford to do, if speed is the main priority?

And static linking is both a blessing and a curse.

>> By the way, is it even possible to distribute a binary application?
> 
> Yes, the final product of the compilation process is a (fat) static binary.

Well, that gets me thinking about the memory cost.  If you run KDE, or any other big collection of independent Qt programs, at the same time and they are all statically linked, each one may be smaller than the whole set of Qt dynamic libs you would otherwise load, but the Qt code they have in common is duplicated in every binary.  If you link dynamically instead, Qt takes up a decent-sized chunk of memory once, and each program that links against it can be small if it doesn’t have much unique code of its own.  So the more Qt programs you run at once, the more sense the dynamic libs make, even if they contain some functions that none of those programs use.
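To put rough numbers on that trade-off, here is a minimal sketch in Python.  The function names and all of the sizes are made up for illustration, and the model is deliberately crude: it assumes every static binary pulls in the same-sized subset of Qt, and it ignores demand paging, which in practice means a dynamic lib is not necessarily fully resident.

```python
import math

def static_total(n_programs, unique_mb, linked_subset_mb):
    """Total size when every program carries its own copy of the Qt code it uses."""
    return n_programs * (unique_mb + linked_subset_mb)

def dynamic_total(n_programs, unique_mb, shared_lib_mb):
    """Total size when one copy of the whole shared library serves every program."""
    return shared_lib_mb + n_programs * unique_mb

def break_even(shared_lib_mb, linked_subset_mb):
    """Smallest number of programs at which dynamic linking is no larger than static."""
    return math.ceil(shared_lib_mb / linked_subset_mb)

# Hypothetical sizes: 2 MB of unique code per program, 15 MB of Qt linked
# into each static binary, 60 MB for the full set of Qt dynamic libs.
for n in (1, 4, 10):
    print(n, static_total(n, 2, 15), dynamic_total(n, 2, 60))
print("break-even at", break_even(60, 15), "programs")
```

With these invented figures the two approaches cost the same at four programs, and every additional program after that favors the dynamic libs.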

Now you will say you’ve got so many gigs of RAM that you don’t care.  Well, there is still the cost of cache misses.  It impacts performance when your OS switches to a different process and the processor has to fetch the same code from a different memory address just because it’s a different copy.  And you’re still neglecting the smaller embedded systems which don’t have as much RAM.  And there’s the general inelegance of letting software expand to fill its container, even when it is doing the same old jobs we knew how to implement much more efficiently 20 years ago.

I don’t think I want too many fat static binaries on my system until somebody figures out how to deduplicate blocks of code, either at the granularity of some arbitrary block size or, better yet, by having the runtime linker (or maybe the kernel?) break up programs and dynamic libs into individual functions, hash the machine code of each function, and deduplicate on that basis.  (I wrote up QTBUG-59619 for discussion about that recently.)  Does anybody know of any research being done on that?
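As a toy illustration of the per-function idea, the sketch below hashes each function's bytes and keeps one copy per distinct hash.  Everything here is hypothetical: `dedup_functions`, the binary names and the byte strings are invented, and no existing linker or kernel is claimed to work this way.  It also assumes that identical functions produce identical bytes, which real machine code often does not once absolute addresses and relocations are baked in.

```python
import hashlib

def dedup_functions(binaries):
    """binaries maps a binary name to {function name: machine-code bytes}.

    Returns (store, tables): store holds one copy of each distinct function
    body keyed by its SHA-256 digest, and tables maps each binary's function
    names to the digest of the body they should share.
    """
    store = {}
    tables = {}
    for bin_name, functions in binaries.items():
        table = {}
        for fn_name, code in functions.items():
            digest = hashlib.sha256(code).hexdigest()
            store.setdefault(digest, code)  # keep only the first copy seen
            table[fn_name] = digest
        tables[bin_name] = table
    return store, tables

# Two made-up static binaries that both contain the same library function.
demo = {
    "app_a": {"lib_append": b"\x55\x48\x89\xe5\xc3", "main": b"\x01\xc3"},
    "app_b": {"lib_append": b"\x55\x48\x89\xe5\xc3", "main": b"\x02\xc3"},
}
store, tables = dedup_functions(demo)
print(len(store), "distinct function bodies for",
      sum(len(f) for f in demo.values()), "functions")
```

The shared `lib_append` body is stored once and referenced from both tables; only the two different `main` bodies stay separate.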

It is relevant both on disk and in memory.  The top priority is for collections of programs to share common functions, but it’s easier if the deduplication has already been done on the SSD, so that less work is needed at load time.  So one answer is to use today’s deduplicating filesystems like btrfs, XFS or ZFS, because the kernel already knows not to load the same disk block twice into two memory blocks… yeah, I already thought of that, of course.  I do use ZFS, but the usual advice is that you probably can’t afford to use deduplication: it takes a lot of RAM just to run the filesystem if you turn that feature on.  (So I haven’t tried it yet, because I read that.)  And it works at the block level, where a block may be 4K or more.  When you statically link two different programs with the same static library, I don’t think it’s very likely that the library’s code will just happen to be aligned to the disk block size in both binaries, is it?  There would have to be some effort to make that happen.  Maybe a new kind of static linking that prioritizes alignment and grouping of “text” blocks into most-likely-reusable sections?
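The alignment worry can be simulated in a few lines of Python.  This is only a sketch under stated assumptions: the "binaries" are pseudo-random byte strings rather than real executables, the 4096-byte block size is just an example, and `pseudo_bytes`, `shared_blocks` and `pad_to_block` are hypothetical helpers, not part of any filesystem or linker.

```python
import hashlib

BLOCK = 4096  # assumed dedup block size

def pseudo_bytes(label, size):
    """Deterministic, non-repeating filler standing in for compiled code."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{label}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])

def block_hashes(data):
    """Hash every BLOCK-sized chunk, as a block-level deduplicator would."""
    return {hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)}

def shared_blocks(a, b):
    return len(block_hashes(a) & block_hashes(b))

def pad_to_block(data):
    """Zero-pad so that whatever follows starts on a block boundary."""
    return data + b"\0" * (-len(data) % BLOCK)

lib = pseudo_bytes("lib", 8 * BLOCK)   # the same static library in both
code_a = pseudo_bytes("a", 1000)       # program-specific code before it
code_b = pseudo_bytes("b", 3000)

unaligned = shared_blocks(code_a + lib, code_b + lib)
aligned = shared_blocks(pad_to_block(code_a) + lib, pad_to_block(code_b) + lib)
print("shared blocks, unaligned:", unaligned)
print("shared blocks, aligned:  ", aligned)
```

When the library lands at different offsets in the two files, no block matches even though 32K of content is identical; once both copies start on a block boundary, all eight library blocks become shareable.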

Only if something like that were done would it make sense, IMO, to use static binaries universally.  For now they only make sense as a workaround for DLL hell on Windows (because most software there is proprietary anyway, and the package management is so primitive that you can’t just ask the system to upgrade all your software, including Qt, and rely on binary compatibility to keep it all working), or in cases where you know you only need to run one Qt program on the system, or when the system doesn’t support dynamic libs very well (the mobile platforms, for example - and it’s their loss that they made it like that, as well as a big pain for us).  They save the developer some trouble, but at quite an expense for the user.