[Qt5-feedback] The #include directives for Qt Essentials and Qt Add-on modules

Fri Jul 1 02:06:54 CEST 2011

>
> charley spaketh:
> // ARE YOU SURE YOU WANT TO DO THIS?
> #include <some/path/SomeFile.hpp>
>
> ...is really pretty dangerous.  It's a general nightmare for build systems
> and configuration management, since the "INCLUDE_PATH" is no
> longer "in-charge":
>

Craig respondeth:

> WARNING: I have to very much disagree with this assertion and all your
> reasons below. ;)
>
>
> (1) Many more directories are considered beyond "INCLUDE_PATH"
> (it explodes subdirectories off EVERY directory in your "INCLUDE_PATH")
>
> Whether you have the path to the header in the source file or as compiler
> flags (eg -Isome/path), it will result in the same thing. The compiler will
> end up searching the same sets of files/directories! You gain nothing by
> moving the path from the source to the compiler flags. Not sure if that's
> what you are advocating, but let's at least make this point clear to start
> with.
>

The assertion is that if you explicitly list ten paths in your INCLUDE_PATH,
then ten paths are searched.  However, if you "#include<mydir/MyFile.hpp>",
then those ten paths will be searched, *plus* a search for (or
"consideration of") the "mydir" subdirectory in each of those ten paths.
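To make that concrete, here is a minimal sketch of the lookup as I understand it. It is a simplified model, not any particular compiler's algorithm, and "candidatePaths" is a hypothetical name of my own. It lists the paths that would be probed for one angle-bracket include, one per entry in the INCLUDE_PATH.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified model (an assumption, not a real preprocessor): each search
// directory yields exactly one candidate, formed by joining the directory
// with the name exactly as written in the #include directive. Caching,
// system directories and quote-form rules are ignored here.
std::vector<std::string> candidatePaths(const std::vector<std::string>& includeDirs,
                                        const std::string& includeName) {
    assert(!includeName.empty());
    std::vector<std::string> candidates;
    candidates.reserve(includeDirs.size());
    for (const std::string& dir : includeDirs) {
        // With "mydir/MyFile.hpp" this probes a "mydir" subdirectory under
        // every search directory, whether or not one was meant to be there.
        candidates.push_back(dir + "/" + includeName);
    }
    return candidates;
}
```

With ten directories in the list and "mydir/MyFile.hpp" as the name, the result has ten entries, each reaching into a "mydir" subdirectory of its search directory.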

> (2) It slows the build, especially searching every path for subdirs, even
> though you may not want those directories in your "INCLUDE_PATH"
> to be searched for subdirs.
>
> Seriously? Can you provide measurements for that? I'd be *very* surprised
> if you could measure a detectable difference between having the paths
> specified in the source files or as INCLUDE_PATHS specified by compiler
> flags. If you are talking about the difference between using a directory
> structure as opposed to a flat structure with many files in the same
> directory, again I don't see how you would get much of a measurable
> difference. I'm happy to be shown evidence to the contrary though.
>

Granted, it *could* be as simple as "path math" (in memory, not disk hits)
to search for the file in each directory; but that's still "path math" that
would not have otherwise occurred.  (It would depend on the build system,
and how clever it is with its stat() calls, such as individual file hits
versus full-dir-globs.)

I'd have to benchmark, and I concede I don't have a strong argument here.
Rather, the basis for my assertion is to have a fixed number of "discrete"
paths in the INCLUDE_PATH, and not an increasing number of
"dir-search-faults", that fail to show a directory (and which sometimes show
an accidentally implied directory).

The problem was worse with "high-likelihood collisions" on commonly named
directories, as in "#include<inc/MyFile.hpp>", so collisions will be less
likely with QtAddOn names.  But, for very large systems (many thousands of
files), the extra tool processing *might* be significant (I don't have
metrics).

But yes, in the case of "common dirname collisions", paths would be searched
that didn't even have source code in them (they should never have been
conceived to be part of the INCLUDE_PATH).

> (3) Your build is no longer documented.  "INCLUDE_PATH" no longer
> reflects the directories from which files are included.  Accidental
> location of files can violate the build target, and you would NEVER
> KNOW (e.g., you can't reproduce it on different machines, or on the
> same machine on different days, unless you *always* started
> with a "virgin build".)
>
> I'd argue the opposite. The source files document exactly where you expect
> the files to be. If you are using this arrangement, you are typically saying
> that you expect to use a relatively limited number of INCLUDE_PATH search
> locations. These INCLUDE_PATH locations would define the base points of the
> search and the paths you use in your source files specify the directory
> structure.

The contrast is whether the paths are explicit in the INCLUDE_PATH (implied
in the source), or the opposite (implicit in the INCLUDE_PATH, and
specified-relative in the source).

When doing "#include<mydir/MyFile>", the source file specifies a relative
location from some external list (the INCLUDE_PATH list), and I argue it
doesn't have that authority (the relative organization of external source
files is outside a given source file's context).  It seems a little funny to
me that a source file can innocently add that path to one of its #include<>
statements, and WHAMMO, your entire build now considers subdirs below EVERY
path found in your INCLUDE_PATH.  (Especially since headers include other
headers, where each file has a different "origin" location, and which now
trigger subdirs for consideration below each of their origins, even in the
case where their origin itself (such as a project directory) is *not* in the
INCLUDE_PATH).

True, INCLUDE_PATH can merely specify the "base" of various hierarchies.  My
argument is that these hierarchies sometimes "overlap" with accidental
collisions, because "subdirs are now implied in the source".  This implies:

(1) It's more work to re-organize your source code *within* your project
(you must touch the #include<> in the source itself)

(2) Users do not know what names they should avoid using to avoid
collisions.

> This becomes particularly important when you are writing libraries and you
> expect others to use your headers. You don't want them to have to know all
> the different directories that need to be searched and therefore added to
> their build system too. Rather, your library would be better off specifying
> a single INCLUDE_PATH as a base and then your own headers do the rest. Much
> simpler and more robust for people using your library. Also consider that
> putting the paths in the source makes it a simpler task if you want to
> switch build systems and/or mix different build systems (not everyone has
> the good fortune of being in complete control over all people who use their
> code and over what else they might use with it!).
>

I agree users should not need to add a bunch of paths to a project-specific
directory structure (but I don't mind a small number of specific paths).
So, I like the "export/include" directory idea, where a flat set of files
are made available to users, and these files reach into the project
directories for the actually-deployed implementation files (same as Qt does
now with "#include<QWidget>").

> As for the comments regarding accidental location of files, if you are
> choosing appropriately unique directory names (as proposed by the
> Qt5ProductDefinition wiki page), then this should not be an issue. The
> feared cases of different behaviour on different machines, etc. would only
> arise if someone is mixing two different installations of the headers, and
> if they are doing that, then they already have to be careful with the way
> they specify their INCLUDE_PATH and the addition of a directory structure in
> the source files will not make that any harder than it already is. Think of
> converting every slash into a . or underscore and you see what I mean.
> Whether it is a path or a long-and-hopefully-unique file name makes no
> difference.
>

I pretty much agree here -- collisions under the scheme Qt5 intends should
really be unlikely, assuming a non-crufty workspace (e.g., without multiple
versions of the same project in accidental locations).  However, it is
*possible* for a developer to unpack something locally someplace (like
another version of the add-on that should *not* be part of the build, and
which is *not* in the INCLUDE_PATH), and it may be used (depending on where
it was unpacked).  That's the "undocumented" thing I worry about.

And, this worry is only because we've suffered it:  Many developers install
different versions of different libraries in different places (sometimes
forgetting cruft), and messy workspaces may introduce all kinds of cruft and
collisions when supporting branched development (e.g., different versions of
deployed products).  I find it handy to trust absolutely the INCLUDE_PATH,
which I can check into version control (and I can't do that with the
workspace cruft -- it's whatever mess the developer had, where adding a
couple files results in a broken build, and the developer doesn't know why).

> (4) These "auto-include-subdirs" would be sprinkled in the source,
> which means developers must not violate a "virgin build system"
> directory structure, but the user will never know if they *did*
> violate that structure.  For example, "#include<abc/SomeFile.hpp>"
> means the user should be VERY careful about EVER creating
> a directory called "abc", especially if it is accidentally in a directory
> that may accidentally be placed into an "INCLUDE_PATH".  Thus,
> each "some/path" creates global namespace collision problems
> that are NOT remedied merely through which version of the module
> you placed in your "INCLUDE_PATH".
>
> How is this different from the user having to be careful not to create a
> header with the same name which appears somewhere earlier in the
> INCLUDE_PATH? The directory names are here acting just like part of a
> monolithic file name. There really is no difference when it comes to
> avoiding clashes.
>

True.  However, the INCLUDE_PATH is discrete and unambiguous, and checked
into version control.

And, the INCLUDE_PATH doesn't have the other problem:  If "MyCollision.hpp"
exists in two directories in the INCLUDE_PATH, the earlier one wins (maybe
in error), but at least the list is discrete.  Yet, if I must
manually-search for "MyCollision.hpp" in subdirectories that are *implied*
based on "dirnames" in *source code* that is *not even my source code*
relative to paths in my INCLUDE_PATH to resolve "bad-build-errors", then
I'll have to blow my brains out.

Since this has happened to me before, I have very few brains left.

In short, when INCLUDE_PATH dominates, the messiest-workspace-on-Earth will
still build the correct target, efficiently.
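For what it's worth, the manual search I'm complaining about can be sketched in a few lines. This is only an illustration under my own assumptions: "findAllMatches" and the "fileExists" callback are hypothetical names, and the lookup is the same simplified "join directory and name" model, not what any given compiler really does.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Return every location where includeName resolves, in INCLUDE_PATH order.
// fileExists is a caller-supplied check (hypothetical hook), so the sketch
// can run against a real disk or against a fake file list.
std::vector<std::string> findAllMatches(
        const std::vector<std::string>& includeDirs,
        const std::string& includeName,
        const std::function<bool(const std::string&)>& fileExists) {
    assert(static_cast<bool>(fileExists));
    std::vector<std::string> matches;
    for (const std::string& dir : includeDirs) {
        const std::string candidate = dir + "/" + includeName;
        if (fileExists(candidate)) {
            matches.push_back(candidate);
        }
    }
    return matches;
}
```

The first entry is the one that would win under this model.  More than one entry means something later in the list is being shadowed, and the relative part of each hit came from the source, not from the INCLUDE_PATH.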

An alternative would be for the build to print out an *exact* list of every
header included by every file that comprises the build target (which includes
the *final* results of the relatively-located file), and we could check that
into version control as documentation for what was included in the target.
This makes that diagnostic easier after-the-fact.  But, it would be nice if
I only had to worry about the collisions within INCLUDE_PATH (since it's not
possible to document all the relative paths in a library I just installed).
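As a rough sketch of that diagnostic: GCC and Clang can already emit make-style dependency rules (the -M family of options), so one approach is to normalize such a rule into a sorted list that could be checked in and diffed.  The function name "parseDependencyRule" is hypothetical, and the parser assumes a single rule, no escaped spaces in paths, and no drive-letter colons.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Parse one make-style rule of the form "target: dep dep \<newline> dep"
// and return its prerequisites, sorted and de-duplicated.
// Simplifying assumptions: one rule only, whitespace-separated paths,
// and the first ':' ends the target name.
std::vector<std::string> parseDependencyRule(const std::string& rule) {
    std::vector<std::string> deps;
    const std::string::size_type colon = rule.find(':');
    if (colon == std::string::npos) {
        return deps;  // not a rule; nothing to report
    }
    std::istringstream in(rule.substr(colon + 1));
    std::string token;
    while (in >> token) {
        if (token == "\\") {
            continue;  // line-continuation marker, not a file
        }
        deps.push_back(token);
    }
    std::sort(deps.begin(), deps.end());
    deps.erase(std::unique(deps.begin(), deps.end()), deps.end());
    assert(std::is_sorted(deps.begin(), deps.end()));
    return deps;
}
```

Sorting keeps the checked-in list stable from build to build, so a header that suddenly resolves from a different directory shows up as a one-line diff.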

> Users are not really used to seeing directory names in QtAddOn.Foo form. It
> looks too much like a file with the file extension "Foo", especially if the
> user is simply doing a directory listing at a command line. There's nothing
> wrong with a name like QtAddOn.Foo, but as explained above, it isn't buying
> you anything that QtAddOn/Foo would not provide, but the latter would look
> and feel more natural to most people.
>

That's a good point.  It took me a bit to decide I liked
"#include<QWidget>", but I've decided that it was a good deployment decision
(to solve the cross-platform file extension-convention-and-case problem).

So, "QtAddOn.MyAddOnWidget" may be a little too crude.  I suppose we could
just pick some convention like "QtAddOn_MyAddOnWidget", but I realize it's
the same issue (it's a new convention, and looks weird, and I'm not sure I
like it either).

Also, I'll concede "#include<QtAddOn/MyAddOnWidget>" has precedent.  I'm not
partial to it, and I'll make it work.  But, for the reasons above I get
nervous with source-code-organization-stuff being placed into source code,
and not the build system.  It means the file "knows" its place relative to
other files, or "knows" the desired target, relative to some external "root"
(which it *cannot* know).  Since that was expressed in the source file
itself, odds are the source file is wrong (because the organization of
source code will change, and always does, to accommodate new source
structure/migration/need).

I merely voted.  I didn't expect to win the vote.  ;-))

--charley