[Qt5-feedback] The #include directives for Qt Essentials and Qt Add-on modules

BRM bm_witness at yahoo.com
Fri Jul 1 09:46:08 CEST 2011


>From: Charley Bay <charleyb123 at gmail.com>
>>
>>To: Craig.Scott at csiro.au
>>charley spaketh:
>>>// ARE YOU SURE YOU WANT TO DO THIS?
>>>#include <some/path/SomeFile.hpp>
>>>
>>>...is really pretty dangerous.  It's a general nightmare for build systems
>>>and configuration management, since the "INCLUDE_PATH" is no
>>>longer "in-charge":
>>>
>>>
>Craig respondeth: 
>
>WARNING: I have to very much disagree with this assertion and all your reasons 
>below. ;)

I have to quite agree with Craig here.

>>(1) Many more directories are considered beyond "INCLUDE_PATH"
>>(it explodes subdirectories off EVERY directory in your "INCLUDE_PATH")
>>
>>Whether you have the path to the header in the source file or as compiler flags 
>>(eg -Isome/path), it will result in the same thing. The compiler will end up 
>>searching the same sets of files/directories! You gain nothing by moving the 
>>path from the source to the compiler flags. Not sure if that's what you are 
>>advocating, but let's at least make this point clear to start with.
>>
>
>The assertion is that if you explicitly list ten paths in your INCLUDE_PATH, 
>then ten paths are searched.  However, if you "#include<mydir/MyFile.hpp>", then 
>those ten paths will be searched, *plus* a search for (or "consideration of") 
>the "mydir" subdirectory in each of those ten paths.

Not necessarily - the same 10 paths, and any sub-paths below them, are equally 
open to search in either case.
However, I would hope compilers are smart enough to only look down a sub-tree 
when an #include actually names it.
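
To make the "ten paths" point concrete, here is a minimal sketch of the lookup as I understand it for GCC-style compilers: each include directory contributes exactly one candidate file name, whether or not the #include names a subdirectory. The helper candidate_paths and the directory names are hypothetical, made up for illustration; a real preprocessor also handles system directories, caching and so on.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: list the file names a preprocessor would probe for
// `#include <name>`, given the configured include directories.
// Each directory yields exactly one candidate: directory + "/" + name.
// No directory is ever scanned for subdirectories.
std::vector<std::string> candidate_paths(const std::vector<std::string>& include_dirs,
                                         const std::string& name) {
    std::vector<std::string> candidates;
    candidates.reserve(include_dirs.size());
    for (const std::string& dir : include_dirs) {
        candidates.push_back(dir + "/" + name);
    }
    return candidates;
}
```

With three directories, candidate_paths(dirs, "MyFile.hpp") and candidate_paths(dirs, "mydir/MyFile.hpp") both return three candidates; the subdirectory only changes what each candidate looks like, not how many there are.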

>
>(2) It slows the build, especially searching every path for subdirs, even
>>though you may not want those directories in your "INCLUDE_PATH"
>>to be searched for subdirs.
>>
>>Seriously? Can you provide measurements for that? I'd be *very* surprised if you 
>>could measure a detectable difference between having the paths specified in the 
>>source files or as INCLUDE_PATHS specified by compiler flags. If you are talking 
>>about the difference between using a directory structure as opposed to a flat 
>>structure with many files in the same directory, again I don't see how you would 
>>get much of a measurable difference. I'm happy to be shown evidence to the 
>>contrary though.
>>
> 
>Granted, it *could* be as simple as "path math" (in memory, not disk hits) to 
>search for the file in each directory; but that's still "path math" that would 
>not have otherwise occurred.  (It should depend on the build system, and how 
>clever it was with its stat() calls, such as individual file hits, or 
>full-dir-globs.)
>
>I'd have to benchmark, and I concede I don't have a strong argument here.  
>Rather, the basis for my assertion is to have a fixed number of "discrete" paths 
>in the INCLUDE_PATH, and not an increasing number of "dir-search-faults", that 
>fail to show a directory (and which sometimes show an accidentally implied 
>directory).

I would expect that you would find zero difference. Why? Because large projects 
like the Linux Kernel make extensive use of the #include 
<subdir/.../somefile.xxx> form.
Seriously - have you looked at /usr/include on a *nix system? It is full of 
namespacing this way, and it doesn't seem to affect much at all.
I'm pretty sure Microsoft is using it too.

>The problem was worse with "high likelihood collisions" like commonly named 
>directories like "#include<inc/MyFile.hpp>", so collisions will be less likely 
>with QtAddOn names.  But, for very large systems (many thousands of files), the 
>extra tool processing (?might) be significant (I don't have metrics).

Again, take a look at the size of the Linux Kernel - 260+MB of raw source files 
and header files, millions of lines of code, thousands of files.

>But yes, in the case of "common dirname collisions", paths would be searched 
>that didn't even have source code in them (they should never have been conceived 
>to be part of the INCLUDE_PATH).

There is a far greater issue of collisions when you just do #include <QString> 
versus #include <Qt/QString.h>.
The latter limits you to a specific file under a specified subdir in any of the 
directories in INCLUDE_PATH.
The former matches any file of that name in any of the directories in 
INCLUDE_PATH, but not in any of their subdirectories.
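
A small sketch of that collision, assuming the usual first-match-wins lookup. The "file system" here is just an in-memory set of names, and every path in it (the /opt/other and /opt/qt trees, Qt/QString.h) is made up for illustration rather than taken from any real Qt install.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the file system: the files that exist.
using FileSet = std::set<std::string>;

// First-match lookup: try dir + "/" + name for each include directory in
// order and return the first one that exists, or "" if none does.
std::string resolve_include(const std::vector<std::string>& include_dirs,
                            const FileSet& files,
                            const std::string& name) {
    for (const std::string& dir : include_dirs) {
        const std::string candidate = dir + "/" + name;
        if (files.count(candidate) != 0) {
            return candidate;
        }
    }
    return std::string();
}
```

If /opt/other/include comes first in the list and happens to contain an unrelated file called QString, the bare name resolves to that file, while Qt/QString.h still resolves to the copy under /opt/qt/include, because the unrelated tree has no Qt subdirectory to match.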

>(3) Your build is no longer documented.  "INCLUDE_PATH" no longer
>>reflects the directories from which files are included.  Accidental
>>location of files can violate the build target, and you would NEVER
>>KNOW (e.g., you can't reproduce it on different machines, or on the
>>same machine on different days, unless you *always* started
>>with a "virgin build".)
>>
>>I'd argue the opposite. The source files document exactly where you expect the 
>>files to be. If you are using this arrangement, you are typically saying that 
>>you expect to use a relatively limited number of INCLUDE_PATH search locations. 
>>These INCLUDE_PATH locations would define the base points of the search and the 
>>paths you use in your source files specify the directory structure. 
>>
> 
>The contrast is whether the paths are explicit in the INCLUDE_PATH (implied in 
>the source), or the opposite (implicit in the INCLUDE_PATH, and 
>specified-relative in the source).

The location specified as #include <somedir/somefile> or #include 
<somedir/somefile.xxx> is always relative to any of the directories in 
INCLUDE_PATH.
As Craig noted, it makes it very clean and really helps with namespacing. Not 
only does it prevent conflicts with others who are using your project, but it 
also can prevent conflicts within your own project as you can now name header 
files appropriately and even re-use the header file name as long as it is in a 
different path.

And a project that did this would be doing so via this specification in all 
header files - no header file in those directories would be allowed to do 
#include "someotherparallelfile".
Of course, you also have the difference between #include "" and #include <> - 
#include "" telling the compiler to look next to the including file first, 
before falling back to the INCLUDE_PATH search.
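
For what it's worth, the difference between the two forms can be sketched as a search order. This follows the order documented for GCC and is deliberately simplified (no -iquote or system directories); the standard leaves the details implementation-defined, so other compilers may differ. The names IncludeForm and search_order are hypothetical.

```cpp
#include <string>
#include <vector>

enum class IncludeForm { Quoted, Angled };

// Directories searched, in order, under simplified GCC-style rules:
// the quoted form looks in the including file's own directory first,
// then both forms walk the configured include directories.
std::vector<std::string> search_order(IncludeForm form,
                                      const std::string& including_file_dir,
                                      const std::vector<std::string>& include_dirs) {
    std::vector<std::string> order;
    if (form == IncludeForm::Quoted) {
        order.push_back(including_file_dir);
    }
    order.insert(order.end(), include_dirs.begin(), include_dirs.end());
    return order;
}
```

So the quoted form adds exactly one extra directory to the front of the list, and nothing else about the search changes.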

>When doing "#include<mydir/MyFile>", the source file specifies a relative 
>location from some external list (the INCLUDE_PATH list), and I argue it 
>doesn't have that authority (the relative organization of external source files 
>is outside a given source file's context).  It seems a little funny to me that a 
>source file can innocently add that path to one of its #include<> statements, 
>and WHAMMO, your entire build now considers subdirs below EVERY path found in 
>your INCLUDE_PATH.  (Especially since headers include other headers, where each 
>file has a different "origin" location, and which now trigger subdirs for 
>consideration below each of their origins, even in the case where their origin 
>itself (such as a project directory) is *not* in the INCLUDE_PATH).

Please, please get more accurate information. As that kind of convention is used 
all over - from Boost, to Linux Kernel, to many many other projects - I have a 
very hard time with your assertion.
Again, #include <somefile.h> does only the following:
(a) searches in the direct paths specified by INCLUDE_PATH - not any 
subdirectory of any of them.
(b) with the #include "somefile.h" form, additionally searches the local path 
where the including file (source or header) is located (e.g. current 
directory); the exact rules are compiler-specific.

#include <somedir/somefile.h> follows those same rules, only now the name being 
looked up carries its own subdirectory, so the match is more regulated.

>True, INCLUDE_PATH can merely specify the "base" of various hierarchies.  My 
>argument is that these hierarchies sometimes "overlap" with accidental 
>collisions, because "subdirs are now implied in the source".  This implies:

The specified sub-dir can be easily regulated by controlling your INCLUDE_PATH 
and the INCLUDE_PATH environment variable, which you seem to be overlooking.
The INCLUDE_PATH environment variable specifies the primary headers required for 
the system you are building on; this is in addition to whatever is specified on 
the command-line to the compiler, and to truly control the build you have to 
control both.

>(1) It's more work to re-organize your source code *within* your project (you 
>must touch the #include<> in the source itself)
>
>(2) Users do not know what names they should avoid using to avoid collisions.
>
>
>This becomes particularly important when you are writing libraries and you 
>expect others to use your headers. You don't want them to have to know all the 
>different directories that need to be searched and therefore added to their 
>build system too. Rather, your library would be better off specifying a single 
>INCLUDE_PATH as a base and then your own headers do the rest. Much simpler and 
>more robust for people using your library. Also consider that putting the paths 
>in the source makes it a simpler task if you want to switch build systems and/or 
>mix different build systems (not everyone has the good fortune of being in 
>complete control over all people who use their code and over what else they 
>might use with it!).
>
>
>I agree users should not need to add a bunch of paths to a project-specific 
>directory structure (but I don't mind a small number of specific paths).  So, I 
>like the "export/include" directory idea, where a flat set of files are made 
>available to users, and these files reach into the project directories for the 
>actually-deployed implementation files (same as Qt does now with 
>"#include<QWidget>"). 

It doesn't simply help users. It helps compilers and IDEs too.

One code-base I occasionally work on does exactly what you do - everything does 
simple #include <somefile.xxx>, and then a ton of specific directories are added 
to the INCLUDE_PATH.
Visual Studio often cannot find the files when going through the source code 
and trying to open the various headers.
However, another code-base I work on does the #include <dir/file.xxx>, and 
Visual Studio never has a problem finding the files - and there only have to be 
a couple of directories added to INCLUDE_PATH.

>As for the comments regarding accidental location of files, if you are choosing 
>appropriately unique directory names (as proposed by the Qt5ProductDefinition 
>wiki page), then this should not be an issue. The feared cases of different 
>behaviour on different machines, etc. would only arise if someone is mixing two 
>different installations of the headers, and if they are doing that, then they 
>already have to be careful with the way they specify their INCLUDE_PATH and the 
>addition of a directory structure in the source files will not make that any 
>harder than it already is. Think of converting every slash into a . or 
>underscore and you see what I mean. Whether it is a path or a 
>long-and-hopefully-unique file name makes no difference.
>
>
>I pretty much agree here -- collisions as Qt5 is intending should really be 
>unlikely, assuming a non-crufty workspace (e.g., with multiple versions of the 
>same project in accidental locations).  However, it is *possible* for a 
>developer to unpack something locally someplace (like another version of the 
>add-on that should *not* be part of the build, and which is *not* in the 
>INCLUDE_PATH), and it may be used (depending on where it was unpacked).  That's 
>the "undocumented" thing I worry about.

If they are doing that, then they need to manage the issue themselves.
As noted above, I find your assertion hard to accept given how many projects 
already DO do this - especially high-profile big projects.
The larger the project, or the more widely used it is, the more likely it is to 
have at least one level specified - e.g. Boost - #include <boost/somefile.xxx>.
It just makes it easier all around.

Maybe this was an issue way way back when compilers were a lot stupider, but 
not likely today.
So please provide some evidence to the contrary.

>And, this worry is only because we've suffered it:  Many developers install 
>different versions of different libraries in different places (sometimes 
>forgetting cruft), and messy workspaces may introduce all kinds of cruft and 
>collisions when supporting branched development (e.g., different versions of 
>deployed products).  I find it handy to trust absolutely the INCLUDE_PATH, which 
>I can check into version control (and I can't do that with the workspace cruft 
>-- it's whatever mess the developer had, where adding a couple files results in 
>a broken build, and the developer doesn't know why).

You can control the version by controlling where the include files are located.
For example, install the header files to 
/usr/local/<some project>/<version>/include, and only add that include path to 
the projects as necessary.
Many projects already do that kind of thing - whether under /usr/local or /usr 
directory, or even /opt.

In the case of Windows, there is no central location for storing header files. 
So you have to manage it already - a project will get installed to 
C:\someproject\include or C:\someproject2\include, and you have to add them 
accordingly.


>(4) These "auto-include-subdirs" would be sprinkled in the source,
>>which means developers must not violate a "virgin build system"
>>directory structure, but the user will never know if they *did*
>>violate that structure.  For example, "#include<abc/SomeFile.hpp>"
>>means the user should be VERY careful about EVER creating
>>a directory called "abc", especially if it is accidentally in a directory
>>that may accidentally be placed into an "INCLUDE_PATH".  Thus,
>>each "some/path" creates global namespace collision problems
>>that are NOT remedied merely through which version of the module
>>you placed in your "INCLUDE_PATH".
>>
>>How is this different from the user having to be careful not to create a header 
>>with the same name which appears somewhere earlier in the INCLUDE_PATH? The 
>>directory names are here acting just like part of a monolithic file name. There 
>>really is no difference when it comes to avoiding clashes.
>>
>
>True.  However, the INCLUDE_PATH is discrete and un-ambiguous, and checked into 
>version control.

As noted above, you forget about the INCLUDE_PATH environment variable, which, 
unless your project explicitly clears it, cannot be checked into version control 
so easily.

>And, the INCLUDE_PATH doesn't have the other problem:  If "MyCollision.hpp" 
>exists in two directories in the INCLUDE_PATH, the earlier one wins (maybe in 
>error), but at least the list is discrete.  Yet, if I must manually-search for 
>"MyCollision.hpp" in subdirectories that are *implied* based on "dirnames" in 
>*source code* that is *not even my source code* relative to paths in my 
>INCLUDE_PATH to resolve "bad-build-errors", then I'll have to blow my brains 
>out.

As noted above, I can show you problems in IDEs with finding the file _at all_ 
(forget about whether it is the right one or not) using the #include 
"mycollision.hpp" method.
But they have no problem finding _the right_ file with #include 
<dir/somefile.hpp>.

As to which file is used in the case of a collision - yes, the file that is in 
the earliest specified path in INCLUDE_PATH is utilized.
That doesn't change when doing #include <dir/somefile.hpp>, and as noted above, 
it doesn't add 'dir' to the search for everything.

>Since this has happened to me before, I have very few brains left.
>
>In short, when INCLUDE_PATH dominates, the messiest-workspace-on-Earth will 
>still build the correct target, efficiently.

Yes, it may build. But does the IDE work? The one code-base I mentioned above 
builds just fine - but the IDEs cannot find anything.

>An alternative is if the build would print out an *exact* list of every header 
>included by every file that comprise the build target (which includes the 
>*final* results of the relatively-located file), and we could check that into 
>version control as documentation for what was included in the target.  This 
>makes that diagnostic easier after-the-fact.  But, it would be nice if I only 
>had to worry about the collisions within INCLUDE_PATH (since it's not possible 
>to document all the relative paths in a library I just installed).

I don't know about g++ off hand, but I do know VC++ has an option to do that.
It's very handy when trying to track down errors at times.
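
For the record, the VC++ option is /showIncludes, and g++ has -H, which prints each header used, prefixed with one dot per level of #include nesting (e.g. ".. /usr/include/stdio.h"). A minimal sketch of turning one such line into a (depth, path) pair, so the list could be saved alongside the build; the function name parse_header_line is hypothetical, and lines that don't match the dots-then-space shape are simply reported with depth 0.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Parse one line of `g++ -H` output such as ".. /usr/include/stdio.h".
// Returns {depth, path}, where depth is the number of leading dots.
// Depth 0 means the line is not a header line (for example a diagnostic).
std::pair<std::size_t, std::string> parse_header_line(const std::string& line) {
    std::size_t depth = 0;
    while (depth < line.size() && line[depth] == '.') {
        ++depth;
    }
    // A header line is one or more dots, a single space, then the path.
    if (depth == 0 || depth >= line.size() || line[depth] != ' ') {
        return {0, std::string()};
    }
    return {depth, line.substr(depth + 1)};
}
```

Feeding every line of the -H output through this gives the exact resolved path of each header, which is the after-the-fact diagnostic being asked for.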
 
>
>Users are not really used to seeing directory names in QtAddOn.Foo form. It 
>looks too much like a file with the file extension "Foo", especially if the user 
>is simply doing a directory listing at a command line. There's nothing wrong 
>with a name like QtAddOn.Foo, but as explained above, it isn't buying you 
>anything that QtAddOn/Foo would not provide, but the latter would look and feel 
>more natural to most people.
>
>
>That's a good point.  It took me a bit to decide I liked "#include<QWidget>", 
>but I've decided that it was a good deployment decision (to solve the 
>cross-platform file extension-convention-and-case problem).

It doesn't really solve the cross-platform file extension-convention-and-case 
problem.
It just does what C++ does - e.g. #include <string> - it still has to match a 
header file.
In the case of Qt, they've added a header file QWidget that then goes on to 
#include "qtgui/qwidget.h". (or is it ../qtgui/qwidget.h? - something like 
that.)
Check out the files. It's not like Qt isn't already doing what they are 
suggesting - just making it more explicit.
Of course, you can already do that too - I typically use the #include 
<module/header> convention already.

>So, "QtAddOn.MyAddOnWidget" may be a little too crude.  I suppose we could just 
>pick some convention like "QtAddOn_MyAddOnWidget", but I realize it's the same 
>issue (it's a new convention, and looks weird, and I'm not sure I like it 
>either).

And it doesn't solve the collision issue.

>Also, I'll concede "#include<QtAddOn/MyAddOnWidget>" has precedent.  I'm not 
>partial to it, and I'll make it work.  But, for the reasons above I get nervous 
>with source-code-organization-stuff being placed into source code, and not the 
>build system.  It means the file "knows" its place relative to other files, or 
>"knows" the desired target, relative to some external "root" (which it *cannot* 
>know).  Since that was expressed in the source file itself, odds-are the source 
>file is wrong (because organization of source code will, and always does, change 
>to accommodate new source structure/migration/need).

I think you need to look more at what compilers actually do. I'm not sure your 
concerns are founded with today's compilers - maybe with those of 20 or 30 years 
ago, though even then it would be hard to find issues, as they would have had to 
be a lot more conservative in what they were doing due to limited memory, etc. - 
but certainly not with today's.


>I merely voted.  I didn't expect to win the vote.  ;-))

Hearing concerns is always a good thing, if for no other reason than to provide 
the information to alleviate them at worst and at best address the issue should 
there really be one.

$0.02

Ben

