A concurrent program is one where several parts of the program are running together (possibly taking turns, or possibly on different cores).
A parallel program is a concurrent program where different processes or (kernel) threads allow different parts of the code to run at the same time on multiple cores.
e.g. generators in Python and lazy evaluation in Haskell make our functions concurrent but not parallel.
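For instance, a minimal Python sketch: the generator's body and the caller's loop take turns executing (concurrency), but only one of them ever runs at a time (no parallelism).

def numbers():
    print('generating 1')
    yield 1
    print('generating 2')
    yield 2

# The generator body and the loop body interleave:
# 'generating 1', 'got 1', 'generating 2', 'got 2'.
for n in numbers():
    print('got', n)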
In the modern world, almost every computing device you'll encounter has multiple cores. If your code isn't parallel, it isn't fast.
Let's hope our programming languages can help us write concurrent code, and then get that concurrent code running in parallel.
In Haskell, we saw par, pseq, and parMap to help us get things running concurrently. The compiler and runtime environment let that code run in parallel using multiple threads.
The pure functional style made concurrency easy, but lazy evaluation took it away unless we did some work.
See the concurrent Fibonacci example.
Process: an instance of a running program.
A single process can have several points where it is executing concurrently: each is a thread.
Multiple processes are scheduled to run in parallel by the operating system. If we want them to work together, they need to communicate somehow: sockets, pipes, shared files, shared memory, etc.
The difficulty: inter-process communication is essentially as hard as network programming. Processes might stop communicating or fail completely. You need to handle that.
Multiple threads share their process' memory. If one thread modifies the data while another is reading it, bad things are going to happen.
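A sketch of the danger in Python (whether the lost updates actually appear depends on the interpreter version and timing, so treat this as illustrative):

import threading

counter = 0

def increment_many():
    global counter
    for _ in range(100000):
        counter += 1  # read-modify-write: not atomic

threads = [threading.Thread(target=increment_many) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# May print less than 400000: updates can be lost in the race.
print(counter)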
Code (for a module or data structure) is thread safe if it can be used by multiple concurrent threads without anything going wrong. Most languages provide some thread-safe data structures, but you must be very careful to identify them.
e.g. some Java collections are thread-safe; the Python queue module provides a thread-safe queue.
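A sketch of the queue module in use: multiple threads can put and get without any extra locking, because the Queue does its own internally.

import queue
import threading

q = queue.Queue()

def worker():
    while True:
        item = q.get()  # safe to call from many threads
        print('processing', item)
        q.task_done()

threading.Thread(target=worker, daemon=True).start()
for i in range(5):
    q.put(i)
q.join()  # wait until every queued item has been processed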
Thread safety is hard in general. See CMPT 300.
e.g. the Python and Ruby interpreters (standard implementations) are not thread safe, so they have a global interpreter lock that lets only one Python/Ruby thread run at a time (per process).
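A sketch of the consequence in CPython (exact timings vary by machine, but the two threads won't run their loops in parallel):

import threading
import time

def burn():
    n = 0
    for _ in range(10_000_000):
        n += 1

start = time.perf_counter()
t1 = threading.Thread(target=burn)
t2 = threading.Thread(target=burn)
t1.start(); t2.start()
t1.join(); t2.join()
# Under the GIL, this takes about as long as calling burn() twice
# sequentially: two cores are available but only one is used.
print(time.perf_counter() - start)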
But if you can avoid sharing any data structures, thread safety is easy.
Pure functions come up again: they can safely be called many times in parallel because they never share anything.
Or sharing immutable data structures should be safe.
Maybe the immutability is enforced by the language, or maybe it's a class designed to be immutable (or immutable in certain circumstances), or maybe you just design your code not to mutate the data while multiple threads are using it.
e.g. in Rust, the compiler ensures that data shared between threads is immutable (unless it's explicitly protected by a lock or similar).
If you must share and mutate data structures, you need to do it safely by locking/communicating with mutexes or semaphores.
Like manual memory management: it's error-prone and hard to get right 100% of the time. My advice: avoid it if possible.
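If you do have to share and mutate, a mutex is the usual tool. In Python, for example, a threading.Lock around the critical section fixes the lost-update race from the counter sketch above:

import threading

counter = 0
lock = threading.Lock()

def increment_many():
    global counter
    for _ in range(100000):
        with lock:  # only one thread at a time executes this block
            counter += 1

threads = [threading.Thread(target=increment_many) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # now reliably 400000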
There are two flavours of threads that a programming language might make available to you…
A kernel thread (or OS-level thread) is created and scheduled by the operating system. Each kernel thread has its own stack, program counter, and registers. Each process has one or more kernel threads that share its memory.
A user thread (sometimes called a green thread) is managed by the language's runtime environment (VM or similar). The environment can let different concurrent parts of the code take turns executing, but not run in parallel.
User threads are much faster to create and manage: you can make thousands of user threads in most environments, but thousands of simultaneous kernel threads might be a problem.
But user threads don't take advantage of multiple cores. Some hybrid of the two, like M:N threading, might be most efficient.
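As a sense of the scale difference, a sketch using Python's asyncio (discussed more below): its tasks are user-level, so tens of thousands are cheap, while the same number of kernel threads would likely exhaust system resources.

import asyncio

async def tiny_task(i):
    await asyncio.sleep(1)
    return i

async def main():
    # 10000 concurrent user-level tasks: no problem.
    tasks = [asyncio.create_task(tiny_task(i)) for i in range(10000)]
    results = await asyncio.gather(*tasks)
    print(len(results))

asyncio.run(main())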
We want threads to avoid sharing (mutable) memory. But, we still need threads to communicate with each other to get work done.
One possible answer will be to let them communicate by sending messages on channels.
Newer languages have incorporated more safe concurrency features into their design.
Rust: explicit ownership of references makes it formal who has access to which objects. References can be either shared and immutable, or mutable and unique.
Go: Threads should coordinate by communicating on thread-safe channels. Don't communicate by sharing memory, share memory by communicating.
Erlang: an Erlang process is a user thread that cannot share memory with others. All communication is by messages.
Many languages have started to support concurrency with asynchronous programming.
The idea: calling an async function doesn't return the function result, it starts the function and returns some kind of object representing the in-progress calculation. Your code can get on with other stuff and await the result when it's actually needed.
A quick example in Python:
import asyncio

async def test1():
    print('one')
    s = asyncio.sleep(1)
    print(s)
    print('two')
    await s
    print('three')

asyncio.run(test1())
Note that from non-async code, you can't just call an async function: you have to run it with an async tool (here, asyncio.run).
The output, with a pause between two and three:
one
<coroutine object sleep at 0x7f1fafb0f450>
two
three
Or a slightly more realistic example: we start an async task before we need its result, do some other stuff while it works, then await the result.
async def test2():
    task = calculate_things(10)
    print(f'task == {task}')
    # Do other stuff while calculate_things is working.
    # Then...
    res = await task
    print(f'res == {res}')
task == <coroutine object calculate_things at 0x7f1fafb0f450>
res == 20
Async gives us a relatively easy way to get things happening concurrently.
Is the async task run in a user thread, a kernel thread, or something else? It depends on the language. Async is more commonly used for the "get a slow I/O task started" case than the "use all my CPU cores to compute" case, so that's likely how the tools are optimized.
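A sketch of the I/O case in Python: several slow operations started together overlap their waiting, so the total time is about one second, not three. (The asyncio.sleep here stands in for a slow network request.)

import asyncio
import time

async def fake_io(name):
    await asyncio.sleep(1)  # stands in for a slow I/O operation
    return name

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(fake_io('a'), fake_io('b'), fake_io('c'))
    print(results, time.perf_counter() - start)  # about 1 second

asyncio.run(main())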
It might be worth looking at a tutorial in your favourite language: it seems like an increasingly popular technique.
There are higher-level tools a language can provide to make concurrency easier.
In Haskell, parMap let us work almost like map, but in parallel, which was nice.
One possibility: thread pools, where a collection of threads (or processes in some implementations) is created, along with a work queue that feeds work into them.
This avoids the overhead of frequently creating/destroying threads, and gives the programmer an easy API to work with.
For example, if we have a Scala class (or several different classes) that implements Runnable (as in Java)…
class SomeWork(n: Int) extends Runnable {
  // ...
}
Then we can have a pool of threads that executes whatever .run() methods we queue: [full code]
val pool = Executors.newFixedThreadPool(8)
1 to 100 foreach { x =>
  pool.execute(new SomeWork(x))
}
pool.shutdown()
In Python, the multiprocessing module provides process pools, with higher-level functions to do work in parallel:
import multiprocessing

# long_computation is some CPU-heavy function defined elsewhere.
pool = multiprocessing.Pool()
values = range(100)
result = pool.map(long_computation, values)
print(result)
Example: Parallel Computation with Pools.
Go makes it extremely easy to do a function call in a separate thread (goroutine) with the go keyword.
func Count(ident string, n int) {
    for i := 1; i <= n; i++ {
        fmt.Printf("%s%d ", ident, i)
        time.Sleep(50 * time.Microsecond)
    }
}

func testGoroutines() {
    go Count("A", 5)
    go Count("B", 4)
}
Go uses hybrid (M:N) threading, so creating a goroutine is fairly inexpensive. If you have work where creating a user-space thread is reasonable overhead, doing everything in parallel is very easy.
// Pseudocode: assuming HttpListen returns a channel of requests.
for request := range HttpListen(80) {
    go send_response(request)
}
To achieve the "threads should coordinate by communicating" goal, Go provides a thread-safe channel type that can be used to coordinate:
func addBytesChannel(a, b byte, result chan byte) {
    result <- a + b
}

func ChannelExample() {
    result := make(chan byte)
    go addBytesChannel(3, 4, result)
    res := <-result
    fmt.Println(res)
}
The lesson: concurrency can be hard, but it doesn't have to be.
Look for help from your language/libraries. Look for the easy cases where you can work concurrently without having to share resources.