Sunday, May 20, 2018

One of my favourite ways of multi-threading on Windows

Having blogged about a few bugs in multi-threading libraries recently, I want to show an easy and convenient alternative. It's minimalistic but it works.

On Windows, the I/O completion port offers a way to use a thread pool for your multi-threaded application. Designed primarily for efficient processing of asynchronous I/O, it supports files, named pipes, sockets and device control.

In addition to that, you can post your own packets to the port. Quoting from the documentation:
"The PostQueuedCompletionStatus function allows an application to queue its own special-purpose completion packets to the I/O completion port without starting an asynchronous I/O operation."
An example of this is shown in the worker demo project (which uses no async I/O at all).
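To give a rough idea of the pattern, here is a minimal sketch of my own (it is not the worker demo project). It assumes a Delphi version with unit scope names (XE2 or later), where the completion key is declared as ULONG_PTR; older versions declare it differently. The names WorkerCount, ItemCount, ShutdownKey, TWorker and Total are made up for this illustration, and using key 0 as a shutdown signal is just one possible convention.

```delphi
program IocpWorkerSketch;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils, System.Classes, System.SyncObjs;

const
  WorkerCount = 4;  // hypothetical pool size
  ItemCount   = 10; // hypothetical number of work items
  ShutdownKey = 0;  // hypothetical convention: key 0 tells a worker to exit

type
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Port: THandle;    // the completion port shared by all workers
  Total: Integer;   // result accumulated by the workers

procedure TWorker.Execute;
var
  Bytes: DWORD;
  Key: ULONG_PTR;
  Ovl: POverlapped;
begin
  // Block until a packet arrives; leave on failure or on the shutdown key.
  while GetQueuedCompletionStatus(Port, Bytes, Key, Ovl, INFINITE) do
  begin
    if Key = ShutdownKey then
      Break;
    // The "work": add the item number carried in the completion key.
    TInterlocked.Add(Total, Integer(Key));
  end;
end;

var
  Workers: array[0..WorkerCount - 1] of TWorker;
  I: Integer;
begin
  // No file handle is associated, so the port serves purely as a work queue.
  Port := CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0, 0, 0);
  if Port = 0 then
    RaiseLastOSError;
  try
    for I := Low(Workers) to High(Workers) do
      Workers[I] := TWorker.Create(False);

    // Queue the work items as custom packets; no async I/O is involved.
    for I := 1 to ItemCount do
      if not PostQueuedCompletionStatus(Port, 0, ULONG_PTR(I), nil) then
        RaiseLastOSError;

    // One shutdown packet per worker, queued behind the work items.
    for I := Low(Workers) to High(Workers) do
      if not PostQueuedCompletionStatus(Port, 0, ShutdownKey, nil) then
        RaiseLastOSError;

    for I := Low(Workers) to High(Workers) do
    begin
      Workers[I].WaitFor;
      Workers[I].Free;
    end;
    Writeln('Total of processed items: ', Total);
  finally
    CloseHandle(Port);
  end;
end.
```

In real code the completion key or the overlapped pointer would more likely carry a pointer to a work item record or object; a plain integer just keeps the sketch short.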

Saturday, April 28, 2018

Another Bug

So you've put in the time and effort to refactor your code and data for parallel execution and are eager to see some parallel action. Unfortunately, you might be disappointed: in some cases your tasks get serialized, that is, performed sequentially in a single thread. With the overhead you've just introduced, your code probably performs a little worse than before.

If you thought TParallel.Join was a nasty bug, things can apparently get even worse. Reading further in Primož Gabrijelčič's book Delphi High Performance, Chapter 7, "Exploring Parallel Practices":


"There's a nasty bug in the System.Threading code that was introduced in Delphi 10.2 Tokyo. I certainly hope that it will be fixed in the next release, as it makes the Parallel Programming Library hard to use. It sometimes causes new threads not to be created when you start a task. That forces your tasks to execute one by one, not in parallel."

(Interestingly, I'm able to reproduce it reliably in XE7, too.)

The ParallelTasks sample project demonstrates the issue.
If you run the "Check primes 2" code with four tasks first, then two tasks, and finally one task, you'll get the expected result:


However, running the code with one task first, then two tasks, and finally four tasks will give you this:


The author offers two different workarounds:
- insert a little delay after each call to TTask.Run (in the sample code, uncomment the Sleep(1) call in btnCheckPrimes2Click method), or
- create your own thread pool and limit the minimum number of running threads in it (in the sample code, see btnCustomThreadPoolClick method).
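For orientation, here is a minimal sketch of my own that combines both ideas; it is not the book's sample code. TaskCount, RunDemo and the Sleep(500) stand-in for real work are assumptions made for the illustration. It uses TThreadPool.SetMinWorkerThreads and the TTask.Run overload that takes a pool, both from System.Threading.

```delphi
program TaskWorkaroundSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs, System.Diagnostics,
  System.Threading;

const
  TaskCount = 4; // hypothetical number of parallel tasks

procedure RunDemo;
var
  Pool: TThreadPool;
  Tasks: array of ITask;
  Done: Integer;
  I: Integer;
  Watch: TStopwatch;
begin
  Done := 0;
  Pool := TThreadPool.Create;
  try
    // Second workaround: a private pool that keeps enough threads around.
    if not Pool.SetMinWorkerThreads(TaskCount) then
      Writeln('Could not set the minimum number of worker threads');

    Watch := TStopwatch.StartNew;
    SetLength(Tasks, TaskCount);
    for I := 0 to High(Tasks) do
    begin
      Tasks[I] := TTask.Run(
        procedure
        begin
          Sleep(500); // stand-in for real work
          TInterlocked.Increment(Done);
        end,
        Pool);
      // First workaround (alternative): a short pause after each Run call.
      // Sleep(1);
    end;

    TTask.WaitForAll(Tasks);
    Tasks := nil; // release the task references before freeing the pool

    Writeln(Format('%d tasks finished in %d ms',
      [Done, Watch.ElapsedMilliseconds]));
  finally
    Pool.Free;
  end;
end;

begin
  RunDemo;
end.
```

If the tasks really run in parallel, the elapsed time should stay close to the duration of a single task rather than TaskCount times that.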

Sunday, April 22, 2018

Don't lose time with a known Delphi bug affecting TParallel.Join

Writing multi-threaded code is hard and takes a lot of time. That's why it's especially annoying to waste time on bugs like this.

Reading (and enjoying) Primož Gabrijelčič's book Delphi High Performance, I came across this paragraph in Chapter 7, "Exploring Parallel Practices":
"There's not much to say about Join, except that in current Delphi it doesn't work correctly. A bug in the 10.1 Berlin and 10.2 Tokyo implementations causes Join to not start enough threads. For example, if you pass in two tasks, it will only create one thread and execute tasks one after another. If you pass in three tasks, it will create two threads and execute two tasks in one and one in another."
The code accompanying the book is available on GitHub: PacktPublishing/Delphi-High-Performance

Related:
G+ discussion
TParallel.Join does not create enough threads (Embarcadero's Quality Portal RSP-19557)

The author offers a simple workaround by starting a dummy task (which does nothing) first; you can see an example here.
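One way to read that workaround, as a sketch of my own rather than the author's code: put a do-nothing procedure first in the list passed to TParallel.Join, so that the "missing" thread is the one that would have run the dummy. The names NullTask, Task1, Task2 and RunDemo, as well as the Sleep(500) stand-ins, are assumptions made for this illustration.

```delphi
program JoinWorkaroundSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Diagnostics, System.Threading;

procedure RunDemo;
var
  NullTask, Task1, Task2: TProc;
  Watch: TStopwatch;
begin
  // The dummy task: it does nothing and only pads the list.
  NullTask :=
    procedure
    begin
    end;

  Task1 :=
    procedure
    begin
      Sleep(500); // stand-in for real work
    end;

  Task2 :=
    procedure
    begin
      Sleep(500); // stand-in for real work
    end;

  Watch := TStopwatch.StartNew;
  // Dummy first, then the real tasks; wait until all of them are done.
  TParallel.Join([NullTask, Task1, Task2]).Wait;
  Writeln(Format('Join finished in %d ms', [Watch.ElapsedMilliseconds]));
end;

begin
  RunDemo;
end.
```

Comparing the elapsed time with and without NullTask in the list is a quick way to check whether the Delphi version at hand is affected.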

Some time ago (using Delphi XE7), I wrote a library with my own TThreadPool class that encapsulates the Windows I/O completion port (IOCP). The result turned out well; it was rock-solid. Reading about bugs like this in the latest-and-greatest Delphi version makes me glad I wrote my own implementation from scratch, which saved me a lot of time hunting for bugs in Delphi's runtime library (in addition to my own).