- Programs are parallelizable if you can identify independent
tasks.
- To make programs scalable, you need to chunk the work (a chunking sketch follows this list).
- Parallel programming often triggers a redesign; we use different
patterns.
- Doing work in parallel does not always give a speed-up.
- It is often non-trivial to understand performance.
- Memory is just as important as speed.
- Measuring is knowing.
- Always profile your code to see which parallelization method works best (see the timing sketch after this list).
- Vectorized algorithms are both a blessing and a curse.
- Numba can help you speed up code (a Numba sketch follows this list).
- If we want the most efficient parallelism on a single machine, we
need to circumvent the GIL.
- If your code releases the GIL, threading will be more efficient than
multiprocessing (see the threading sketch after this list).
- If your code does not release the GIL, some of your code is still in
Python, and you’re wasting precious compute time!
- We can change the strategy by which a computation is evaluated.
- Nothing is computed until we run `compute()`.
- By using delayed evaluation, Dask knows which jobs can be run in
parallel.
- Call `compute` only once at the end of your program to get the best results (see the Dask sketch after this list).
- Use abstractions to keep programs manageable.
- Actually making code faster is not always straightforward.
- Easy one-liners can get you 80% of the way.
- Writing clean, modular code often makes it easier to parallelize later on.
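
A minimal chunking sketch, assuming a hypothetical `work` function: `multiprocessing.Pool.map` accepts a `chunksize` argument, which is one way to hand each worker a batch of tasks instead of a single item at a time.

```python
from multiprocessing import Pool

def work(x):
    # placeholder for any independent, CPU-bound task
    return x * x

if __name__ == "__main__":
    items = range(100_000)
    with Pool(processes=4) as pool:
        # chunksize=1000 hands each worker a batch of items,
        # reducing inter-process communication overhead
        results = pool.map(work, items, chunksize=1000)
    print(sum(results))
```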
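One way to measure, sketched with the standard-library `timeit` module; the two functions compared here are illustrative placeholders. The NumPy variant also shows the vectorization trade-off: it is fast, but it allocates the whole array in memory at once.

```python
import timeit

import numpy as np

def python_sum(n):
    # pure-Python loop
    return sum(range(n))

def numpy_sum(n):
    # vectorized equivalent; allocates the full array in memory
    return np.arange(n).sum()

# run each variant several times and report the total wall time
for f in (python_sum, numpy_sum):
    t = timeit.timeit(lambda: f(1_000_000), number=10)
    print(f"{f.__name__}: {t:.3f} s for 10 runs")
```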
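A minimal Numba sketch on a toy function: the `@numba.njit` decorator compiles the function the first time it is called, so later calls run as machine code.

```python
import numba

@numba.njit
def sum_of_squares(n):
    # plain Python loop, compiled to machine code by Numba
    total = 0
    for i in range(n):
        total += i * i
    return total

sum_of_squares(10)              # first call triggers JIT compilation
print(sum_of_squares(1_000_000))  # subsequent calls run fast
```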
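A sketch of the GIL point, assuming NumPy is available: BLAS-backed matrix multiplication releases the GIL, so threads can run it on several cores at once, while the pure-Python variant holds the GIL and the threads merely take turns.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

rng = np.random.default_rng(42)
matrices = [rng.random((1000, 1000)) for _ in range(8)]

def gil_releasing(m):
    # BLAS-backed matrix multiplication releases the GIL,
    # so threads can genuinely overlap on multiple cores
    return m @ m

def gil_holding(m):
    # pure-Python bytecode holds the GIL; threads only interleave
    return sum(m[0, i] * 2 for i in range(1000))

with ThreadPoolExecutor(max_workers=4) as pool:
    products = list(pool.map(gil_releasing, matrices))
    partial_sums = list(pool.map(gil_holding, matrices))
```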
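A sketch of delayed evaluation with `dask.delayed`; the `load` and `process` functions are hypothetical stand-ins. Building the graph runs nothing, and the single `compute()` at the end lets Dask schedule the independent branches in parallel.

```python
import dask

@dask.delayed
def load(i):
    # stand-in for reading one chunk of data
    return list(range(i * 1000, (i + 1) * 1000))

@dask.delayed
def process(chunk):
    # stand-in for per-chunk work
    return sum(x * x for x in chunk)

# building the graph executes nothing yet
partials = [process(load(i)) for i in range(8)]
total = dask.delayed(sum)(partials)

# one compute() at the end lets Dask run the
# independent load/process branches in parallel
print(total.compute())
```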