5 Matching Annotations
  1. Dec 2023
    1. In sync code, you might use a thread pool and imap_unordered():

      ```
      import multiprocessing.dummy

      # do_stuff and things_to_do stand for your own function and iterable
      pool = multiprocessing.dummy.Pool(2)

      for result in pool.imap_unordered(do_stuff, things_to_do):
          print(result)
      ```

      Here, concurrency is limited by the fixed number of threads.
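To see that limit in action, here is a minimal sketch. The `do_stuff` body, the `things_to_do` range, and the `running`/`peak` counters are assumptions made up for illustration; `time.sleep` stands in for real blocking work.

```python
import multiprocessing.dummy
import threading
import time

lock = threading.Lock()
running = 0  # tasks executing right now
peak = 0     # highest value `running` ever reached


def do_stuff(thing):
    """Hypothetical task: record how many tasks overlap, then 'work'."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)  # stand-in for blocking I/O
    with lock:
        running -= 1
    return thing * 2


things_to_do = range(6)

# Pool(2) creates two worker threads, so at most two tasks overlap.
with multiprocessing.dummy.Pool(2) as pool:
    # Results arrive in completion order, so sort them for a stable view.
    results = sorted(pool.imap_unordered(do_stuff, things_to_do))

print(results)
print("peak concurrency:", peak)
```

However many items `things_to_do` holds, `peak` should never exceed the pool size.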

    1. Tips
      • if __name__ == "__main__" is important for multiprocessing because it will spawn a new Python process that imports the module. You don't want this module to spawn a new Python that imports the module that will spawn a new Python...
      • If the function to submit to the executor has complicated arguments to be passed to it, use a lambda or functools.partial.
      • max_workers=1 is a very nice way to get a poor man’s task queue.
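The last two tips can be sketched as follows. This is only an illustration: `resize`, its parameters, and the file names are hypothetical, and a thread pool is used to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def resize(name, width, height, keep_ratio=True):
    """Hypothetical task that takes several arguments."""
    return f"{name}: {width}x{height} keep_ratio={keep_ratio}"


# Freeze the arguments that never change, so only `name` varies per task.
make_thumbnail = partial(resize, width=128, height=128, keep_ratio=False)

with ThreadPoolExecutor(max_workers=4) as executor:
    thumbnails = list(executor.map(make_thumbnail, ["a.png", "b.png"]))

# max_workers=1: submitted tasks wait in line and run one at a time,
# in submission order, which is what a basic task queue does.
log = []
with ThreadPoolExecutor(max_workers=1) as queue:
    for n in range(5):
        queue.submit(log.append, n)

print(thumbnails)
print(log)  # [0, 1, 2, 3, 4]
```

With a process pool, prefer `functools.partial` over a lambda, since lambdas cannot be pickled and sent to another process.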
    2. Process pools are good for:
      • When you don't need to share data between tasks.
      • When you are CPU bound.
      • When you don't have too many tasks to run at the same time.
      • When you need true parallelism and want to exercise your juicy cores.
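A minimal sketch of a workload that fits this list: CPU-bound, independent tasks, and only a handful of them. The `count_primes_below` function and the limits are made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes_below(limit):
    """Hypothetical CPU-bound task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def main():
    limits = [10_000, 20_000, 30_000, 40_000]
    # Calls are spread over worker processes. Only the argument and the
    # return value cross the process boundary (pickled); no state is shared.
    with ProcessPoolExecutor() as executor:
        counts = executor.map(count_primes_below, limits)
        for limit, count in zip(limits, counts):
            print(f"primes below {limit}: {count}")


# The guard keeps worker processes from re-running main() when they
# import this module.
if __name__ == "__main__":
    main()
```

By default, `ProcessPoolExecutor()` sizes the pool from the number of available CPUs, which is usually what you want for CPU-bound work.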
    3. Thread pools are good for:
      • Tasks (network, file, etc.) that need fewer than 10_000 I/O interactions per second. The number is higher than you would expect, because threads are surprisingly cheap nowadays, and you can spawn a lot of them without bloating memory too much. The limit is more the price of context switching. This is not a scientific number, it's a general direction that you should challenge by measuring your own particular case.
      • When you need to share data between the tasks.
      • When you are not CPU bound.
      • When you are OK with executing tasks a bit slower to ensure you are not blocking any of them (e.g. a user interface and a long calculation).
      • When you are CPU bound, but the CPU calculations are delegating to a C extension that releases the GIL, such as numpy. Free parallelism on the cheap, yeah!

      E.g.: a web scraper, a GUI to zip files, a development server, sending emails without blocking web page rendering, etc.
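As a rough illustration of the I/O-bound case, the sketch below compares running the same blocking tasks one after another and in a thread pool. `fetch` and the URLs are placeholders, and `time.sleep` stands in for a network call; measure your real workload before drawing conclusions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Hypothetical I/O-bound task; time.sleep stands in for a network call."""
    time.sleep(0.1)
    return f"fetched {url}"


urls = [f"https://example.com/page/{i}" for i in range(8)]  # placeholder URLs

# One task after another: the waits add up.
start = time.perf_counter()
sequential = [fetch(url) for url in urls]
sequential_time = time.perf_counter() - start

# In a thread pool: the waits overlap, because a sleeping (or I/O-blocked)
# thread releases the GIL.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    threaded = list(executor.map(fetch, urls))
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```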

    4. The Python standard library comes with a beautiful abstraction for them that I see too few people use: the pool executors.
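For reference, here is a minimal sketch of the executor interface from `concurrent.futures`. The `square` task is hypothetical; `submit()`, `map()`, `Future.result()` and `as_completed()` are the standard-library calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(n):
    """Hypothetical task."""
    return n * n


with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() schedules one call and returns a Future immediately.
    future = executor.submit(square, 7)
    single = future.result()  # blocks until the call has finished

    # as_completed() yields futures as they finish, in no particular order.
    futures = {executor.submit(square, n): n for n in range(5)}
    by_input = {futures[f]: f.result() for f in as_completed(futures)}

    # map() yields results in the order of the inputs.
    ordered = list(executor.map(square, range(5)))

print(single)
print(by_input)
print(ordered)
```

`ProcessPoolExecutor` exposes the same methods, so switching between threads and processes is mostly a matter of changing the class name, provided the function and its arguments can be pickled.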