I'm using this issue to keep track of things that we need to do to make PostgreSQL multi-threaded. To be updated as we go.
Discussion on pgsql-hackers: https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
PG Wiki page: https://wiki.postgresql.org/wiki/Multithreading (mostly from pgconf.dev 2024)
Prior art: Konstantin's old branch: https://github.com/postgrespro/postgresql.pthreads
Some very preliminary hacking on: https://github.com/hlinnaka/postgres/tree/threading. I used an approach similar to Konstantin's for labeling all global variables.
see also: https://github.com/cmu-db/peloton/wiki/Postgres-Modifications
TODOs:
Global variables
We have a lot of global and static variables:
$ objdump -t bin/postgres | grep -e ".data" -e ".bss" | grep -v "data.rel.ro" | wc -l
1666
Some of them are pointers to shared memory structures and can stay as
they are. But many of them are per-connection state. The most
straightforward conversion for those is to turn them into thread-local
variables, like Konstantin did in [0].
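As a rough illustration of that conversion, here is a minimal sketch assuming a C11 compiler (`_Thread_local`) and pthreads. The variable `my_query_depth` and both functions are hypothetical names invented for the example, not actual PostgreSQL identifiers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Before the conversion this would be: static int my_query_depth = 0; */
/* After: every thread gets its own copy. The name is hypothetical. */
static _Thread_local int my_query_depth = 0;

/* Thread body: bump this thread's private copy and report its value. */
static void *
bump_depth(void *arg)
{
    int *result = arg;

    assert(result != NULL);
    my_query_depth += 1;
    *result = my_query_depth;
    return NULL;
}

/*
 * Run two threads, one after the other. Each starts with its own copy
 * initialized to 0, so each reports 1. Returns the sum of the reported
 * values, or -1 if a thread could not be created or joined.
 */
int
run_two_sessions(void)
{
    pthread_t   thread;
    int         results[2] = {0, 0};

    for (int i = 0; i < 2; i++)
    {
        if (pthread_create(&thread, NULL, bump_depth, &results[i]) != 0)
            return -1;
        if (pthread_join(thread, NULL) != 0)
            return -1;
    }
    return results[0] + results[1];
}
```

With a plain `static int`, the second thread would see the first thread's increment and report 2 instead of 1.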
It might be good to have some kind of a Session context struct that we
pass everywhere, or maybe have a single thread-local variable to hold
it. Many of the global variables would become fields in the Session. But
that's future work.
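To make the two options concrete, here is a small sketch. The `Session` layout, its field names and the three functions are all assumptions made up for illustration; nothing here reflects an agreed design.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection state; the fields are illustrative only. */
typedef struct Session
{
    int     session_id;
    int     xact_nesting_level;     /* would formerly be a global */
    char    application_name[64];   /* would formerly be a global */
} Session;

/* Option 2: a single thread-local pointer to the current session. */
static _Thread_local Session *CurrentSession = NULL;

/* Bind a session to the calling thread. */
void
session_attach(Session *session)
{
    CurrentSession = session;
}

/* Option 1: the session is passed explicitly to whoever needs it. */
int
session_enter_subxact(Session *session)
{
    assert(session != NULL);
    return ++session->xact_nesting_level;
}

/* Option 2 in use: the callee finds the session through the thread-local. */
int
current_enter_subxact(void)
{
    return session_enter_subxact(CurrentSession);
}
```

Passing the struct explicitly touches every call site, while the single thread-local keeps existing function signatures unchanged.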
Extensions
A lot of extensions also contain global variables or other things that
break in a multi-threaded environment. We need a way to label extensions
that support multi-threading. And in the future, also extensions that
require a multi-threaded server.
Let's add flags to the control file to mark if the extension is
thread-safe and/or process-safe. If you try to load an extension that's
not compatible with the server's mode, throw an error.
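The load-time check could look roughly like the sketch below. The flag names, the `ServerMode` enum and `extension_is_loadable()` are hypothetical; how the flags would be spelled in the control file is still open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how the server was started. */
typedef enum ServerMode
{
    SERVER_MODE_PROCESS,
    SERVER_MODE_THREAD
} ServerMode;

/* Hypothetical: flags parsed from the extension's control file. */
typedef struct ExtensionFlags
{
    bool    thread_safe;    /* usable in a multi-threaded server */
    bool    process_safe;   /* usable in a multi-process server */
} ExtensionFlags;

/* Return true if the extension may be loaded in the given server mode. */
bool
extension_is_loadable(const ExtensionFlags *flags, ServerMode mode)
{
    assert(flags != NULL);
    if (mode == SERVER_MODE_THREAD)
        return flags->thread_safe;
    return flags->process_safe;
}
```

An extension that requires a multi-threaded server would set `thread_safe` only, so the same check covers that future case. The caller would raise the error when the function returns false.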
We might need new functions in addition to _PG_init, called at connection
startup and shutdown. And the background worker API probably needs some changes.
Exposed PIDs
We expose backend process PIDs to users in a few places.
pg_stat_activity.pid and pg_terminate_backend(), for example. They need
to be replaced, or we can assign a fake PID to each connection when
running in multi-threaded mode.
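For the fake-PID option, one simple possibility is a process-wide atomic counter, sketched below with C11 `<stdatomic.h>`. The starting value and the function name are assumptions, and the sketch ignores wraparound and any need to avoid colliding with real OS PIDs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical starting value for fake PIDs. */
#define FIRST_FAKE_PID 100000

static atomic_int next_fake_pid = FIRST_FAKE_PID;

/*
 * Hand out a unique fake PID for a new connection. Safe to call from
 * several threads at once because the increment is atomic.
 */
int
assign_fake_pid(void)
{
    int pid = atomic_fetch_add(&next_fake_pid, 1);

    assert(pid > 0);            /* wraparound is not handled here */
    return pid;
}
```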
Signals
We use signals for communication between backends. SIGURG in latches,
and SIGUSR1 in procsignal, for example. Those primitives need to be
rewritten with some other signalling mechanism in multi-threaded mode.
In principle, it's possible to set per-thread signal handlers, and send
a signal to a particular thread (pthread_kill), but I think it's better
to just rewrite them.
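As one possible shape for such a rewrite, here is a sketch of a latch-like primitive built on a pthread mutex and condition variable instead of SIGURG. The `ThreadLatch` type and its functions are hypothetical. A real replacement would also have to wake up on socket readiness, which a condition variable alone does not cover.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-process latch; not the existing Latch API. */
typedef struct ThreadLatch
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            is_set;
} ThreadLatch;

void
latch_init(ThreadLatch *latch)
{
    assert(latch != NULL);
    pthread_mutex_init(&latch->mutex, NULL);
    pthread_cond_init(&latch->cond, NULL);
    latch->is_set = false;
}

/* Wake the waiter, if any. Replaces sending a signal to the other backend. */
void
latch_set(ThreadLatch *latch)
{
    pthread_mutex_lock(&latch->mutex);
    latch->is_set = true;
    pthread_cond_signal(&latch->cond);
    pthread_mutex_unlock(&latch->mutex);
}

/* Block until the latch is set, then reset it before returning. */
void
latch_wait_and_reset(ThreadLatch *latch)
{
    pthread_mutex_lock(&latch->mutex);
    while (!latch->is_set)
        pthread_cond_wait(&latch->cond, &latch->mutex);
    latch->is_set = false;
    pthread_mutex_unlock(&latch->mutex);
}
```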
We also document that you can send SIGINT, SIGTERM or SIGHUP to an
individual backend process. I think we need to deprecate that, and maybe
come up with some convenient replacement. E.g. send a message with
backend ID to a unix domain socket, and a new pg_kill executable to send
those messages.
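A sketch of what the message exchange might look like follows. The message layout, the `KillAction` values and both functions are invented for illustration. The actions mirror what SIGINT (cancel query), SIGTERM (terminate) and SIGHUP (reload configuration) mean to a backend today. The code assumes an already-connected stream socket and a sender and receiver on the same machine, so no byte-order conversion is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical actions, one per signal we document today. */
typedef enum KillAction
{
    KILL_CANCEL_QUERY = 1,      /* today: SIGINT */
    KILL_TERMINATE = 2,         /* today: SIGTERM */
    KILL_RELOAD_CONFIG = 3      /* today: SIGHUP */
} KillAction;

/* Hypothetical fixed-size wire format. */
typedef struct KillRequest
{
    uint32_t    backend_id;
    uint32_t    action;
} KillRequest;

/* What a pg_kill client would do. Returns 0 on success, -1 on error. */
int
send_kill_request(int sock, uint32_t backend_id, KillAction action)
{
    KillRequest req = {backend_id, (uint32_t) action};

    return write(sock, &req, sizeof(req)) == (ssize_t) sizeof(req) ? 0 : -1;
}

/* What the server side would do. A short read is treated as an error. */
int
recv_kill_request(int sock, KillRequest *req)
{
    assert(req != NULL);
    return read(sock, req, sizeof(*req)) == (ssize_t) sizeof(*req) ? 0 : -1;
}
```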
Restart on crash
If a backend process crashes, postmaster terminates all other backends
and restarts the system. That's hard (impossible?) to do safely if
everything runs in one process. We can continue to have a separate
postmaster process that just monitors the main process and restarts it
on crash.
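The monitoring loop could be as small as the sketch below, which uses plain `fork()` and `waitpid()`. `supervise()` and the `crash_once()` demo workload are hypothetical names, and the sketch leaves out everything a real postmaster would also do, such as cleaning up shared memory before the restart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run server_main in a child process. If the child is killed by a signal
 * or exits with a non-zero status, start it again, up to max_restarts
 * times. Returns the number of restarts performed, or -1 if fork() or
 * waitpid() fails.
 */
int
supervise(int (*server_main) (int attempt), int max_restarts)
{
    int restarts = 0;

    assert(server_main != NULL);
    for (;;)
    {
        int     status;
        pid_t   pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(server_main(restarts));   /* child: the "main process" */

        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;                /* clean shutdown */
        if (restarts >= max_restarts)
            return restarts;                /* give up */
        restarts++;
    }
}

/* Demo workload: crash on the first attempt, shut down cleanly afterwards. */
int
crash_once(int attempt)
{
    if (attempt == 0)
        abort();
    return 0;
}
```

Calling `supervise(crash_once, 3)` restarts the child once after the simulated crash and then returns when the second run exits cleanly.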
Thread-safe libraries
Need to switch to thread-safe versions of library functions, e.g.
uselocale() instead of setlocale().
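For example, a function that needs "C" locale formatting can switch only its own thread's locale, as in this sketch. It assumes POSIX.1-2008 `newlocale()`/`uselocale()`/`freelocale()` are available from `<locale.h>` (some platforms declare them in `<xlocale.h>` instead). The function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a double with two decimals using the "C" locale, without
 * changing the locale seen by any other thread. Returns the number of
 * characters snprintf() would have written, or -1 on failure.
 */
int
format_in_c_locale(char *buf, size_t len, double value)
{
    locale_t    c_locale;
    locale_t    old_locale;
    int         n;

    assert(buf != NULL && len > 0);

    c_locale = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
    if (c_locale == (locale_t) 0)
        return -1;

    old_locale = uselocale(c_locale);   /* affects the calling thread only */
    n = snprintf(buf, len, "%.2f", value);
    uselocale(old_locale);              /* restore before freeing */
    freelocale(c_locale);
    return n;
}
```

With `setlocale()`, the same switch would change the locale for every thread in the process.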
The Python interpreter has a Global Interpreter Lock (GIL). It's not possible
to create two completely independent Python interpreters in the same
process; there will be some lock contention on the GIL. Fortunately, the
Python community just accepted https://peps.python.org/pep-0684/. That's
exactly what we need: it makes it possible for separate interpreters to
have their own GILs. It's not clear to me if that's in Python 3.12
already, or under development for some future version, but by the time
we make the switch in Postgres, there probably will be a solution in
CPython.
At a quick glance, I think Perl and Tcl are fine: you can have multiple
interpreters in one process. Need to check any other libraries we use.