Threads

Threads & Processes

Everything we have talked about in a processor so far has been within a single core.

Each core is effectively an independent processor with its own execution units, registers, L1/L2 cache, etc.

Your computer's processor probably has four or more cores.

Threads & Processes

When your program runs, that's a process. A process has some memory allocated to it (including your code in the static part of its memory).

Your computer can be running multiple processes concurrently. The operating system deals with allocating resources to processes (memory, processor time, etc).

Each process has one or more threads…

Threads

A thread is a point where code is running within a process. A thread has its own registers and stack, but all threads within a process share the same memory.

Most notably, each thread has its own instruction pointer: they can all be executing at different places within the program.

Threads

For kernel threads (also called OS-level threads), the operating system takes care of scheduling and making sure they all get processor time. It's in charge of running processes/threads on cores, letting them take turns with other threads, etc.

User threads (also called userspace threads or green threads) run on top of kernel threads, in the simplest case all within a single one, and are managed by a runtime environment provided by the language/library rather than by the OS.

Threads

Something I can't say often enough: threads don't have to be hard. I want to use several threads in my code when doing a lot of computation.

I paid for those cores. I want to use them. Remember that each core is probably hyperthreaded: it can run two threads at once, getting somewhere between one and two full cores' worth of work done.

If you want to take advantage of multiple cores, you need either multiple processes or multiple threads.
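To find out how many threads the hardware can actually run at once, the standard library offers a hint. This is a minimal sketch: worker_count and its fallback value are names I made up for illustration, and std::thread::hardware_concurrency() is allowed to return 0 when it can't tell.

```cpp
#include <thread>

// Number of worker threads to use: the hardware's hint, or a fallback
// when the implementation cannot determine it (the call may return 0).
// On a hyperthreaded machine the hint usually counts logical cores.
unsigned worker_count(unsigned fallback = 4) {
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

The result is only a hint: treat it as a starting point for how many threads to create, not a guarantee about the speedup.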

Threads

In general, threaded code is difficult to write correctly (e.g. in CMPT 300). It's hard to test and debug. Interactions between threads can be unpredictable because the OS can pause or resume any of them at any point.

But, you don't have to do the hardest thing every time you use threads.

Threads

Thread safety is hard because a thread could be paused when it's in the middle of updating a data structure. Suppose you have multiple threads doing this on the same collection:

unsigned len = collection.size();
collection[len] = new_data;
collection.set_size(len + 1);

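Thread #1 might be paused after the first line; thread #2 might then run the whole fragment; when #1 resumes, it still holds the old len, so it overwrites the value #2 just wrote and sets the size to the same value #2 did. One element is silently lost.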

Threads

So don't do that.

Threads are easy if you don't share any data between them. Threads are easy if you share data between them and don't modify it.

Multiple threads are hard if you are sharing data structures between them and modifying them. As soon as you start doing that, things are tricky.

Threads

Multiple threads are easy if you can guarantee that for each value/​object/​data structure/​whatever either:

  1. It is available to only one thread.
  2. It will not be modified by any thread.

If that's true, go for it: use threads all you want.

If you need to modify something that's shared between threads, that's when you have to be careful.
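One common way of "being careful" is to guard the shared structure with a mutex so only one thread can be inside the update at a time. This is a sketch under my own assumptions: SharedList and fill_concurrently are hypothetical names, and a std::vector stands in for the collection from the earlier fragment.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared collection: the mutex guards every access to `items`.
struct SharedList {
    std::mutex m;
    std::vector<int> items;

    void append(int value) {
        std::lock_guard<std::mutex> lock(m);  // unlocked when `lock` goes out of scope
        items.push_back(value);
    }
};

// Start `n_threads` threads that each append `per_thread` values,
// then return how many values ended up in the list.
std::size_t fill_concurrently(unsigned n_threads, int per_thread) {
    SharedList list;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        threads.emplace_back([&list, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                list.append(i);
            }
        });
    }
    for (auto& th : threads) {
        th.join();  // after join, it is safe to read `items` without the lock
    }
    return list.items.size();
}
```

With the lock in place no append is lost, but the threads now take turns on that section, so heavy contention can eat the speedup you were after.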

Threads

One additional case where you can easily share something changeable:

  3. It's a thread-safe channel whose whole job is to send data safely between threads: Go's channels; Rust's std::sync::mpsc; Python's queue.Queue (multiprocessing.Queue is the equivalent between processes); Java's BlockingQueue.
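To give a feel for what such a channel does inside, here is a minimal sketch built from a mutex and a condition variable. The Channel class and sum_through_channel are hypothetical names of mine, not a standard library facility, and this version is unbounded with no way to close it.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical minimal channel: an unbounded queue guarded by a mutex.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(value));
        }
        ready_.notify_one();  // wake one waiting receiver
    }

    // Blocks until a value is available, then removes and returns it.
    T receive() {
        std::unique_lock<std::mutex> lock(m_);
        ready_.wait(lock, [this] { return !q_.empty(); });
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::condition_variable ready_;
    std::queue<T> q_;
};

// A producer thread sends 1..n; the calling thread receives and sums them.
int sum_through_channel(int n) {
    Channel<int> ch;
    std::thread producer([&ch, n] {
        for (int i = 1; i <= n; ++i) {
            ch.send(i);
        }
    });
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += ch.receive();
    }
    producer.join();
    return total;
}
```

All the locking lives inside the channel, so the code that uses it never touches a shared data structure directly.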

Threads in C++

There are several tools in the C++ standard library that make it fairly straightforward to work with threads (but not a thread-safe channel, as far as I can see).

For example, std::thread. Its constructor takes a function and its arguments. Then that function is executed in a separate thread. e.g. this simple function:

void say_hello(int id) {
    std::cout << "Hello from thread " << id << '\n';
}

Threads in C++

We can start three of those in threads like this:

auto t1 = std::thread(say_hello, 1);
auto t2 = std::thread(say_hello, 2);
auto t3 = std::thread(say_hello, 3);

And then wait for them to finish:

t1.join();
t2.join();
t3.join();

Threads in C++

What I got from a run of that code:

Hello from thread Hello from thread 1
Hello from thread 3
2

Remember: threads can start/stop at any moment, including between two << operations. The output stream is shared by all three threads and each of them modifies it, so this breaks our rules.
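One way to keep each line intact is to treat the stream as the shared, modified thing it is and lock around the write. This is a sketch with hypothetical names (say_hello_locked, greet_all); it writes to an in-memory stream so the result can be inspected, but the same idea applies to std::cout.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Each thread writes one complete line to `out` while holding `m`.
void say_hello_locked(std::ostream& out, std::mutex& m, int id) {
    std::lock_guard<std::mutex> lock(m);
    out << "Hello from thread " << id << '\n';
}

// Run `n` greeting threads against an in-memory stream and return the text.
std::string greet_all(int n) {
    std::ostringstream out;
    std::mutex m;
    std::vector<std::thread> threads;
    for (int id = 1; id <= n; ++id) {
        threads.emplace_back([&out, &m, id] { say_hello_locked(out, m, id); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return out.str();
}
```

The lines can still come out in any order, since the lock only guarantees that each one is written whole.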

Threads in C++

Let's try something else with std::async (added in C++11). It lets you call a function asynchronously, typically in a separate thread, and get its return value.

That will be easy to work with. If our function gets no input besides its arguments and does nothing besides calculate and return a result, it will be easy to meet the requirement that nothing is both shared and changing.

Threads in C++

In other words, a pure function. Here's one:

int do_work(int a, int b) {
    return a + b;
}

Note: this function is much too small to sensibly call in a separate thread. Creating and destroying the thread will take many times more work than the addition, but it is good enough for an example.

Threads in C++

Now we can call it with async and get back a std::future. Basically, a result that will be ready some time in the future.

std::future<int> f1 = std::async(do_work, 5, 6);
std::future<int> f2 = std::async(do_work, 7, 8);

And maybe do some other stuff in the main thread, but eventually wait for their results and use them:

int total = f1.get() + f2.get();
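Put together as one piece, the fragments above might look like the sketch below. The run_both wrapper is my own name, and I've passed std::launch::async explicitly: with the default policy the implementation is allowed to defer the call until get() instead of starting a thread.

```cpp
#include <future>

int do_work(int a, int b) {
    return a + b;
}

// Launch two calls asynchronously and combine their results.
int run_both() {
    std::future<int> f1 = std::async(std::launch::async, do_work, 5, 6);
    std::future<int> f2 = std::async(std::launch::async, do_work, 7, 8);
    // ... other work could happen here on the calling thread ...
    return f1.get() + f2.get();  // get() blocks until each result is ready
}
```

Note that get() can only be called once per future; after that the future no longer holds a result.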

Threads in C++

That's how easy threads are, as long as you're not sharing.

Threads in C++

We can easily break our array sum into two halves and do each in a different thread:

float array_sum_threaded(float* array, uint64_t length) {
    uint64_t half = length / 2;
    auto f1 = std::async(array_sum, array, half);
    auto f2 = std::async(array_sum, array+half, length-half);
    return f1.get() + f2.get();
}

Here, array is shared but not modified so we're safe. The local variables in array_sum are not shared: each thread has its own stack, so its own local variables.
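The same idea extends from two halves to any number of pieces. This is a sketch, not the code I timed: array_sum here is a stand-in with an assumed signature, and array_sum_chunked and its chunks parameter are hypothetical.

```cpp
#include <cstdint>
#include <future>
#include <vector>

// Stand-in for the sequential array_sum (assumed signature).
float array_sum(const float* array, std::uint64_t length) {
    float total = 0.0f;
    for (std::uint64_t i = 0; i < length; ++i) {
        total += array[i];
    }
    return total;
}

// Split the array into `chunks` nearly equal pieces and sum each in its own task.
float array_sum_chunked(const float* array, std::uint64_t length, unsigned chunks) {
    if (chunks == 0) {
        chunks = 1;
    }
    std::vector<std::future<float>> parts;
    std::uint64_t start = 0;
    for (unsigned c = 0; c < chunks; ++c) {
        // Spread the remainder so that chunk sizes differ by at most one.
        std::uint64_t size = length / chunks + (c < length % chunks ? 1 : 0);
        parts.push_back(std::async(std::launch::async, array_sum, array + start, size));
        start += size;
    }
    float total = 0.0f;
    for (auto& p : parts) {
        total += p.get();
    }
    return total;
}
```

Because floating point addition is not associative, the chunked total can differ slightly from the sequential one when the values are not exactly representable sums.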

Threads in C++

The speedup was about 1.5×, not the 2× I was hoping for.

Maybe speed was limited by memory bandwidth. Maybe something else.

Threads in C++

The message: using threads isn't as easy as not using them. But, you can use them in a way that's not that hard.

The hard part is actually deciding when the amount of work you have to do is big enough to be worth doing in another thread.
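One simple way to act on that decision is a size cutoff: below it, stay on the calling thread. This is only a sketch. The names and the cutoff value are hypothetical, and the right number depends on the machine and the work, so measure before trusting it.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical cutoff: below this many elements, don't bother with a thread.
constexpr std::size_t kMinParallelLength = 100000;

double sum_range(const double* first, const double* last) {
    return std::accumulate(first, last, 0.0);
}

double sum_maybe_threaded(const std::vector<double>& values) {
    const double* first = values.data();
    const double* last = first + values.size();
    if (values.size() < kMinParallelLength) {
        return sum_range(first, last);  // too small to pay for a thread
    }
    const double* mid = first + values.size() / 2;
    auto upper = std::async(std::launch::async, sum_range, mid, last);
    double lower = sum_range(first, mid);  // calling thread does the other half
    return lower + upper.get();
}
```

Here the calling thread does half the work itself instead of sitting idle while it waits, so only one extra thread is created.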

Use the cores you paid for.