Skip to content

intern_common should use _Py_LOCK_DONT_DETACH #149162

@colesbury

Description

@colesbury

Bug report

PyUnicode_InternFromString and various other interning functions call intern_common, which locks an interpreter-wide lock in the free threading build. We should use _Py_LOCK_DONT_DETACH when locking that lock to avoid potential deadlocks in certain extensions.

PyTorch, TensorFlow, and possibly other extensions use C++11 one time initialization via function local statics to intern string. For example:

void PyNode::compiled_args(CompiledNodeArgs& args) const {
  static PyObject* method_name = PyUnicode_InternFromString("_compiled_autograd_key");
  ...
}

https://github.com/pytorch/pytorch/blob/8804b122355d1aa1cc6bfffd1516f3df39f96c5f/torch/csrc/autograd/python_function.cpp#L369-L370

C++ static locals implement thread-safe one time initialization, similar to std::call_once or pthread_once. That means that a (hidden) C++ lock is held during the call to PyUnicode_InternFromString. If we detach, during PyUnicode_InternFromString, we risk a lock ordering deadlock.

Here's a complicated scenario that could deadlock:

  • Thread 1: requests a stop the world pause
  • Thread 2: holds the interned mutex (causes lock contention)
  • Thread 3: calls static PyObject* method_name = PyUnicode_InternFromString(...). Blocks on interned mutex and detaches for stop the world pause
  • Thread 4: calls static PyObject* method_name = PyUnicode_InternFromString(...) and blocks on C++ one time initialization lock without detaching.

So there's a deadlock where Thread 4 - (waiting on) -> Thread 3 - (waiting on) -> Thread 1 - (waiting on) -> Thread 4.

This isn't a problem in GIL-enabled builds because thread 3 won't block during PyUnicode_InternFromString.

_Py_LOCK_DONT_DETACH prevents this problem because PyUnicode_InternFromString() will no longer block on a stop the world if there's contention on the interned mutex.

Note that if callers used something like pybind11's gil_safe_call_once_and_store that would also avoid the deadlock.

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    3.14bugs and security fixes3.15new features, bugs and security fixesinterpreter-core(Objects, Python, Grammar, and Parser dirs)topic-free-threadingtype-bugAn unexpected behavior, bug, or error

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions