Python moves to remove the GIL and boost concurrency
After much debate, the Python Steering Council intends to approve a proposal, PEP 703, “Making the Global Interpreter Lock Optional in CPython.”
This proposal is the culmination of many attempts over the years to remove Python’s Global Interpreter Lock, or GIL. Removing the GIL removes a major obstacle to multi-threading, making Python a truly multi-core language and significantly improving its performance for workloads that benefit from parallelism.
With this proposal, first-class support for multithreading and concurrency in Python is a step closer to becoming reality.
Why remove Python’s GIL?
Python’s memory management system keeps track of object usage by maintaining counts of the number of references to each object. When the reference count for an object falls to zero, the object is slated for removal.
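Reference counting can be observed directly in CPython. The short sketch below uses the standard library's sys.getrefcount to show the count rising when a new reference is created and falling again when it is dropped. The function name refcount_delta is made up for this illustration, and the absolute numbers getrefcount reports are an implementation detail, so only the differences are compared.

```python
import sys


def refcount_delta():
    """Return how the reference count changes as one reference is added, then dropped."""
    obj = object()                      # a fresh object nothing else refers to
    before = sys.getrefcount(obj)       # absolute value is implementation-specific

    holder = [obj]                      # the list now holds one more reference
    after_add = sys.getrefcount(obj)

    del holder                          # destroying the list releases that reference
    after_drop = sys.getrefcount(obj)

    return after_add - before, after_drop - before


grew, net = refcount_delta()
print("change after adding a reference:", grew)
print("net change after dropping it:", net)
```

When the last reference to an object disappears, CPython can reclaim it immediately, which is why these counts must never be corrupted by two threads updating them at once.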
Because Python was created at a time when multi-processor systems were a rarity, and multi-core processors were non-existent, this reference count mechanism isn’t thread-safe. Instead, Python achieves thread safety by allowing only one thread to execute Python bytecode at a time, so no two threads can modify an object’s reference count simultaneously. This is the purpose of the GIL.
Many projects over the years have attempted to remove the GIL. They did enable multithreaded programs to run faster, but at the cost of degrading the performance of single-threaded programs. Given the vast majority of Python applications are single-threaded, this was a poor trade-off. Although refinements to the GIL have improved its handling of multithreaded apps, it’s still a serious bottleneck.
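To see the bottleneck in practice, here is a minimal sketch that runs the same CPU-bound task first sequentially and then across several threads. The workload (a naive prime counter) and the helper names are invented for this example. On a conventional GIL build, the threaded run would typically take about as long as the sequential one, because only one thread executes Python code at a time. Actual timings depend on the machine and the interpreter build, so none are shown here.

```python
import threading
import time


def count_primes(limit):
    """Naive CPU-bound work: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    """Process every workload one after another on the calling thread."""
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    """Process every workload on its own thread and collect the results."""
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [
        threading.Thread(target=worker, args=(i, limit))
        for i, limit in enumerate(limits)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


limits = [20_000] * 4
sequential_result, sequential_time = timed(run_sequential, limits)
threaded_result, threaded_time = timed(run_threaded, limits)

print(f"sequential: {sequential_time:.3f}s")
print(f"threaded:   {threaded_time:.3f}s")
```

Both runs produce identical results. Only the elapsed time differs, and only an interpreter that can run threads in parallel has a chance of making the threaded version meaningfully faster.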
Python’s core developers finally decided to remove the GIL from CPython, but only if it could be done without slowing down single-threaded programs.
How a GIL-free Python will work
The current proposals for a no-GIL edition of Python use a mix of techniques to make reference counting thread-safe, and leave the speed of single-threaded programs untouched or impact it only minimally.
- Biased reference counting. Counts for objects accessed by only a single thread would be handled differently (and more quickly) than counts for objects accessed by multiple threads. Since most objects are accessed by only one thread, the impact on single-threaded programs is minimized.
- Immortalization. Some objects, like None, never need to be deallocated, so their reference counts do not need to be tracked.
- Thread-safe memory allocation. A new memory allocation system for CPython objects will make it easier to trace objects in the garbage collector, and to allocate memory in a thread-safe way.
- Deferred reference counting. Reference counts for some objects, like top-level functions in a module, can be safely deferred. This saves both time and resources.
- A revised garbage collector. The CPython garbage collector cleans up cyclical object references, where two or more objects hold references to each other. The no-GIL build makes many changes to the garbage collector, such as removing the “generations” system for tracking objects.
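The idea behind biased reference counting, the first technique in the list above, can be illustrated with a toy model in pure Python. This is not how CPython implements it (the real mechanism lives in C and uses atomic operations). The class name BiasedRefCount and its fields are invented for this sketch, and a threading.Lock stands in for atomic updates. The thread that creates the object updates a plain counter with no synchronization, while every other thread goes through the slower, synchronized counter.

```python
import threading


class BiasedRefCount:
    """Toy model of biased reference counting (illustrative only)."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local = 1                      # owner's count, touched only by the owner
        self.shared = 0                     # count contributed by all other threads
        self._lock = threading.Lock()       # stand-in for atomic operations

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                 # fast path: no synchronization needed
        else:
            with self._lock:                # slow path: synchronized update
                self.shared += 1

    def decref(self):
        if threading.get_ident() == self.owner:
            self.local -= 1
        else:
            with self._lock:
                self.shared -= 1

    def total(self):
        with self._lock:
            return self.local + self.shared


rc = BiasedRefCount()

# The owning thread takes three extra references through the fast path.
for _ in range(3):
    rc.incref()


def borrow_and_release(counter, times):
    """Another thread takes `times` references and gives back all but one."""
    for _ in range(times):
        counter.incref()
    for _ in range(times - 1):
        counter.decref()


threads = [
    threading.Thread(target=borrow_and_release, args=(rc, 1000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("local:", rc.local, "shared:", rc.shared, "total:", rc.total())
```

In a single-threaded program, every update takes the fast path, which is why this approach keeps the cost for single-threaded code low.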
How a GIL-free Python will be phased in
Implementing PEP 703 is a long-term project that will take place in multiple stages over several years. During this time, the no-GIL version of CPython will go from being optional, to being supported, to finally being the standard version of CPython.
To accomplish this, CPython’s developers will add an experimental “no-GIL” build mode to CPython, so that one can compile a version of CPython with or without the GIL. Eventually, the no-GIL build will become the default.
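With two build modes in circulation, code may need to find out which one it is running on. The sketch below shows one way to check, assuming a CPython release that ships the experimental build (3.13 or later). It relies on the Py_GIL_DISABLED configuration variable and the sys._is_gil_enabled() function. The two helper function names are made up for this example, and on older interpreters the code simply reports that the GIL is active.

```python
import sys
import sysconfig


def build_supports_free_threading():
    """True if this interpreter was compiled in the no-GIL build mode."""
    # The variable is 1 on no-GIL builds and 0 or missing (None) otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_is_active():
    """True if the GIL is currently in effect for this process."""
    # sys._is_gil_enabled() only exists on newer CPython versions;
    # when it is absent, the interpreter always runs with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


print("no-GIL build:", build_supports_free_threading())
print("GIL active:  ", gil_is_active())
```

The two checks are kept separate because a no-GIL build can still run with the GIL switched back on, for example when it loads an extension module that has not declared itself safe to run without it.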
Here is how the plan to remove the GIL from CPython is set to unfold.
Step 1: No-GIL CPython is optional
The first incarnations of a no-GIL CPython will be experimental, for both CPython developers and the larger Python community. This experimental phase has several goals:
- Get the rest of the Python community involved. Any major change to Python needs buy-in from the wider Python community. The experimental builds give Python users a safe way to test their code, and to see how both non-threaded and threaded code will behave.
- Give Python distributions the option, not the requirement, to ship a GIL-less Python. Python distributions like Conda or WinPython need to guarantee compatibility with stock CPython. During the transition phase, they could provide the option to install the regular or GIL-less version of CPython. This would allow Conda or WinPython users to pick the version most compatible with their needs.
- Determine whether the no-GIL project is worthwhile. If the community tries out the GIL-less builds at scale and is unhappy with the results, the core CPython developers reserve the right to back out. Having dual builds means a heavier maintenance burden in the short term, but provides an escape hatch if the no-GIL project proves unworthy.
Step 2: No-GIL CPython is supported
The next stage will be to offer the no-GIL build as a supported alternative build for CPython. A user would have the choice of installing either the no-GIL or GIL build, with either one being a formally supported version of CPython that receives bug fixes, security patches, and updates.
One big goal of this stage is to set a target date for making no-GIL the default. This will likely happen on the same timeline as the deprecation and removal of other Python features—at least two or three versions, meaning at least two or three years.
Step 3: No-GIL CPython is the default
The final stage would be to make the no-GIL version of CPython the default build, and to remove all GIL-related code from CPython. “We don’t want to wait too long with this,” wrote Thomas Wouters, CPython core developer, “because having two common build modes may be a heavy burden on the community (as, for example, it can double test resources and debugging scenarios), but we can’t rush it either. We think it may take as much as five years to get to this stage.”
The biggest challenges to removing the GIL
The biggest challenges in this plan aren’t only technical, although the technical challenges are daunting. What looms even larger is how to bring the rest of the Python ecosystem into line with these changes, and how to make sure a GIL-less Python doesn’t create more problems than it solves.
According to Wouters, “… any changes in third-party code needed to accommodate no-GIL builds should just work in with-GIL builds (although backward compatibility with older Python versions will still need to be addressed).”
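As a rough illustration of what such a change might look like, consider code that updates shared state from several threads. Guarding the update with an explicit lock, instead of counting on the GIL to serialize things, behaves the same way on either build. The Counter class and the hammer function are hypothetical names used only for this sketch.

```python
import threading


class Counter:
    """A counter that is safe to update from many threads on any build."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write is protected explicitly, so correctness
        # does not depend on whether a GIL is present.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, times):
    """Increment the shared counter `times` times."""
    for _ in range(times):
        counter.increment()


counter = Counter()
workers = [
    threading.Thread(target=hammer, args=(counter, 10_000))
    for _ in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(counter.value)
```

Nothing here is specific to a no-GIL interpreter, which is the point: the same code runs unchanged on a with-GIL build.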
The other big challenge, as mentioned above, is “to bring along the rest of the Python community,” said Wouters, “… and make sure the changes we want to make, and the changes we want them to make, are palatable.
“Before we commit to switching entirely to the no-GIL build, we need to see community support for it,” Wouters said. “We can’t just flip the default and expect the community to figure out what work they need to do to support it.”
The Python community experienced huge growing pains when transitioning from Python 2 to Python 3, so any big changes like removing the GIL would have to be thoroughly backwards compatible. As Wouters put it, “We do not want another Python 3 situation.”
Beyond the perils and challenges lies a great reward: A Python that finally supports the parallelism that programmers have come to expect in the 21st century.