[Interest] How to get QtConcurrent to do what I want?

Mon Jan 31 00:06:50 CET 2022

I'm hitting a design issue with the way I'm using the QtConcurrent module to do some image processing, and I'm wondering if someone can give me some pointers.

At a high level, the software needs to do some processing on every pixel of an image. The processing can mostly be done in parallel, so I've created the following:

  1.  Tile class - responsible for doing the processing on a small subset of the original image
     *   Has a constructor that takes a Position and Size. From those parameters, the Tile knows what subset of the original image it is going to process
     *   Has a process() function which will do the work on those assigned pixels
  2.  TileManager class - responsible for managing the Tile objects
     *   Contains a for-loop that creates each Tile object, assigns it a unique Position, and adds it to the QVector<Tile> vector
     *   Has a processTile(Tile& t) function which calls t.process() to tell a given Tile to begin its work
     *   Calls QtConcurrent::map(tiles, processTile) to process each tile

So far this works well, but while timing different parts of the codebase, I discovered that a large portion of the time is spent allocating the QVector<Tile> vector (the first bullet under item 2 above) before I get to the concurrent processing call. The reason seems obvious to me: I need to ensure that each tile is created with a unique assignment, and as far as I can see, that needs to happen in a single thread. If I could instead hand off the Tile creation to the parallel processing step, I might be able to improve the overall performance, but I don't see a way to do that within the QtConcurrent framework.

How can I go about creating Tile objects in parallel AND ensure that each of them gets a unique Position assignment? I could easily move the Tile allocation into processTile(), but if I do that, I don't see a way to make the unique Position assignment, because I don't see how a given call to processTile() would know where it is in the overall parallelization sequence and therefore which Position to assign to the Tile it creates. If I were using something like CUDA, I could use blockIdx and threadIdx to do that, but as far as I can see, those concepts don't exist (or at least aren't exposed) in QtConcurrent.
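
To make the missing piece concrete, here is a minimal sketch in plain C++ of what a blockIdx-style scheme could look like: each call receives a flat tile index and derives its own Position from it, so no shared state is needed to keep assignments unique. The Position, Size and Tile definitions, the helper names makeTile(), tileCount() and processTileAt(), and the pixel-count placeholder inside process() are all hypothetical stand-ins, not my real code. The Qt wiring is not shown; the assumption is that a container of indices 0..N-1 could be handed to QtConcurrent::map() or QtConcurrent::mapped() in place of the pre-built QVector<Tile>.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the Position, Size and Tile types described above.
struct Position { int x; int y; };
struct Size { int w; int h; };

struct Tile {
    Position pos;
    Size size;
    Tile(Position p, Size s) : pos(p), size(s) {}
    // Placeholder for the real per-pixel work: reports how many pixels the tile covers.
    long process() const { return static_cast<long>(size.w) * size.h; }
};

// Number of tiles needed to cover the image, rounding up at the right and bottom edges.
int tileCount(int imageW, int imageH, int tileW, int tileH) {
    assert(tileW > 0 && tileH > 0);
    const int tilesPerRow = (imageW + tileW - 1) / tileW;
    const int tilesPerCol = (imageH + tileH - 1) / tileH;
    return tilesPerRow * tilesPerCol;
}

// Derive a unique tile rectangle from a flat index (row-major order),
// clipping the last column and last row to the image bounds.
Tile makeTile(int index, int imageW, int imageH, int tileW, int tileH) {
    assert(tileW > 0 && tileH > 0);
    const int tilesPerRow = (imageW + tileW - 1) / tileW;
    const int x = (index % tilesPerRow) * tileW;
    const int y = (index / tilesPerRow) * tileH;
    return Tile(Position{x, y},
                Size{std::min(tileW, imageW - x), std::min(tileH, imageH - y)});
}

// Create and process one tile. The call depends only on its arguments,
// so calls for different indices are independent of each other.
long processTileAt(int index, int imageW, int imageH, int tileW, int tileH) {
    return makeTile(index, imageW, imageH, tileW, tileH).process();
}
```

With this shape, the only thing built up front is a list of integers, which should be much cheaper than constructing every Tile, and the Tile itself is created inside the parallel step. Whether that actually helps would need to be measured against the real Tile constructor.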

Any thoughts?