Add support for PSK to remote CIB and deprecate insecure methods by clumens · Pull Request #4051 · ClusterLabs/pacemaker

clumens · 2026-02-09T15:50:36Z

No description provided.

clumens · 2026-02-09T17:03:48Z

CI failure is unrelated to this PR:

[2026-02-09T16:06:35.420Z] logging.c:111:14: error: case value not in enumerated type 'GLogLevelFlags' [-Werror,-Wswitch]
[2026-02-09T16:06:35.420Z]   111 |         case G_LOG_LEVEL_MASK:
[2026-02-09T16:06:35.420Z]       |              ^
[2026-02-09T16:06:35.420Z] 1 error generated.

Let's see if #4052 fixes it.

clumens · 2026-02-10T15:18:26Z

Rebased on main to pick up that logging fix so we can get the green check.

daemons/execd/remoted_tls.c

include/crm/common/tls_internal.h

include/crm/common/options_internal.h

daemons/based/based_remote.c

lib/common/tls.c

lib/cib/cib_remote.c

nrwahl2

I left some substantial thoughts in #4051 (comment). If we want to reconsider the approach, we can do that either now or later. Feel free to merge this and get the basic feature in place, if you would like.

clumens · 2026-03-27T18:21:45Z

@nrwahl2 The patches here that are identical to the first version are:

Refactor: libcrmcommon: Make client PSK initialization more obvious.
Refactor: libcrmcommon: pcmk__tls_client_add_psk_key should take a username. (commit message is slightly different)
Test: cts-scheduler: Remove remote-clear-port from test XML files.
Refactor: libcib: Use pcmk__xe_set_props in cib_tls_signon.

include/crm/common/tls_internal.h

lib/common/tls.c

lib/cib/cib_remote.c

doc/sphinx/Pacemaker_Administration/configuring.rst

nrwahl2 · 2026-04-09T08:43:32Z

doc/sphinx/Pacemaker_Administration/configuring.rst

+   As of Pacemaker 3.0.2, even with the use of PSK authentication, a username
+   and password is additionally required to login to the CIB itself.  Due to
+   the use of the ``CIB_user`` environment variable, the username in the PSK
+   credentials file and the CIB username must match.


"Due to the use of the CIB_user environment variable" reads oddly to me. The username in the PSK credentials file must match the username in the CIB_user environment variable... right?

It's late and I'm at the end of my review. With that said, now I'm feeling confused. So the credentials file is indexed by CIB_user, which refers to a user on the cluster (server) node. The username on the client is irrelevant, except that the credentials file on the client must belong to the client effective user and be readable by that user only. Is that right?

So for example, let's say we're admin on the client node. The ~/cib-credentials file must belong to admin. But CIB_user, which is unset, will default to CRM_DAEMON_USER. The remote client is created based on CIB_user. That's what private->user gets set to. So the cib-credentials file on the server will need an entry for CRM_DAEMON_USER, to match the client-side credentials file belonging to admin.

But the example server-side credentials file above has a line for admin, even though CIB_user is CRM_DAEMON_USER. Is that example wrong?

It's late and I'm at the end of my review. With that said, now I'm feeling confused.

I don't think this explanation is going to make you feel much better, but here goes.

So the credentials file is indexed by CIB_user, which refers to a user on the cluster (server) node. The username on the client is irrelevant, except that the credentials file on the client must belong to the client effective user and be readable by that user only. Is that right?

That's correct.

Okay, there's two layers of authentication here (because of course there are, and it would be great to simplify this, but we love backwards compatibility).

First, there's the gnutls layer. On the client, the username comes from $CIB_user and the key comes from $CIB_key_file. The username can be anything you want for gnutls. As you pointed out, if $CIB_user is empty, we default to CRM_DAEMON_USER. On the server, gnutls gets the username and key, and looks for that pair in /etc/pacemaker/cib-credentials. If found, then we move on to...

Second, there's logging into the CIB daemon. On the client, the username comes from $CIB_user and the password either comes from $CIB_passwd or you get prompted for it. Here again, we default to CRM_DAEMON_USER. On the server, authentication is handled by cib_remote_auth -> authenticate_user.

There are comments scattered around in the source, the docs, and I think in the upstream task manager that while you should be able to use whatever name you want for authenticating to the CIB daemon, only CRM_DAEMON_USER works. I don't feel like debugging this. For instance, from Pacemaker_Administration/configuring.rst:

Only the Pacemaker daemon user (|CRM_DAEMON_USER|) may be used as ``CIB_user``.

And then as you can see, using $CIB_user at both authentication levels is what forces you to use whatever username you want for the gnutls layer as long as it's CRM_DAEMON_USER. This also kind of defeats the whole point of what we were going for here. I suppose the fix is yet another environment variable for the gnutls layer, but all the best names are taken.

Only the Pacemaker daemon user (|CRM_DAEMON_USER|) may be used as CIB_user.

This is specifically for the certificates section. I'm still processing the rest of your comment.

Reading the rest of your comment and reading back over the code, I don't see where CIB_user has to be CRM_DAEMON_USER, except in the case of certificate-based auth.

I have not built your code and tested its remote CIB PSK feature. But it seems like we'd need CIB_user to be set to admin on the client, since that's what we put in the cib-credentials file on the server in your example.

The CIB_user (admin in this example) does have to be a member of CRM_DAEMON_GROUP, however.

Actually... even for certificate-based auth, I don't think CIB_user has to be CRM_DAEMON_USER. It looks like the gnutls layer ignores the user in that case; it cares about the user only if we're using PSK. So the user matters only for the based login layer (i.e., CIB_user must be a member of CRM_DAEMON_GROUP and has to log in via PAM).

I haven't looked at this as closely, so I'm only about 80-90% confident.

In daemons/based/based_remote.c:

/* Get the authenticated user name (PAM modules can map the original name to * something else). Since the CIB manager runs as the daemon user (not * root), that is the only user that can be successfully authenticated. */

I haven't looked into this mapping stuff, so maybe there's a major step I'm missing. But trying as a different user:

client$ CIB_port=9898 CIB_server=rhel9-scratch-1 CIB_encrypted=true CIB_key_file=creds CIB_user=chris tools/cibadmin -Q -VVVVVVVV ... (crm_exit) info: Exiting cibadmin | with status 76 (CRM_EX_PROTOCOL: Protocol violated)

server# cat /var/log/pacemaker/pacemaker.log ... Apr 10 10:00:40.115 rhel9-scratch-1 pacemaker-based [3961] (cib_remote_listen@based_remote.c:612) info: Encrypted connection from 192.168.122.1 pending authentication for client 576d578a-a399-44b3-b48d-63c82c2a67c8 Apr 10 10:00:43.042 rhel9-scratch-1 pacemaker-based [3961] (authenticate_user@based_remote.c:243) notice: Access for remote user chris denied: Authentication failure

Okay, so more specifically, it seems that the issue is unix_chkpwd(). When not running as root, unix_chkpwd() can only authenticate the current user (which in this case is CRM_DAEMON_USER). It will return "Authentication failure" for other users.

Here are the options that come to mind without giving root privileges to CRM_DAEMON_USER, in order to support an arbitrary CIB_user -- which seems very desirable, especially given our PSK approach.

Get rid of PAM authentication to the CIB manager, at least for PSK and probably for certificate authentication. The deprecated anonymous auth would still need PAM.. I think certificates or PSK should be secure enough without the need for password-based authentication through PAM. But security best practices are not my forte, so we would want to consult other resources/people.

Put an IPC endpoint in pacemakerd that will authenticate a user, since pacemakerd runs as root. I'm not sure how to send the password securely though, or how to prompt for it from pacemakerd if that's possible.

The latter seems too convoluted, and PAM seems redundant now, so I lean toward the former. There's also something called SASL that I haven't really looked into.

https://www.reddit.com/r/linuxadmin/comments/9ih2mj/pam_authentication_of_nonroot_user_how/

https://stackoverflow.com/questions/966827/easiest-way-to-authenticate-users-in-linux-unix-w-o-root-permissions

I think we're going to have to address that in a separate PR, hopefully soon. For now, with this new PSK authentication, it at least works as well as what we currently have and if it takes a release or two to get around to dealing with the inability to give an arbitrary user, I think I can live with that.

Even before this problem, I've been leaning towards wanting to get rid of the separate PAM-based authentication step.

doc/sphinx/Pacemaker_Administration/configuring.rst

To me, pcmk__tls_add_psk_key is not especially obviously named. This function should only be called on TLS clients - it's not some general purpose PSK key management function. It has no purpose on the server side of a connection. Therefore, rename it and update the doxygen to be clearer. Ref T961

…ername. Pre-shared keys in gnutls are tied to a specific username. See the man page for gnutls_psk_set_client_credentials or https://www.gnutls.org/manual/html_node/PSK-credentials.html for more details. Right now, we only support PSK for remote node authentication, and we only support a single username for that. We'll be adding PSK support to remote CIB authentication in a later patch, and that will allow specifying other usernames. Ref T961

gnutls expects that for PSK authentication, there will be a password file somewhere with the format "username:key". The server tells gnutls where that password file is, and then when a client connects, it gives the server a username and key. gnutls reads that file and looks for a match. execd doesn't do that. For Pacemaker Remote node connections, every client always has the same username (DEFAULT_REMOTE_USERNAME, or "lrmd") and there's only one key. That's because the key is just a single file that's referenced in /etc/sysconfig/pacemaker. In order for us to make gnutls work the way we want, we have to register a callback function. This is probably okay because in the remote node world, the cluster is the client and the remote node is the server. It makes some sense that each remote node would have the same username. This callback function will get called every time a client attempts to connect. It will load the key from disk (which, because we only support a single key, it's probably been cached - I think this is why we even have the caching layer for keys in the first place) into a variable, and then gnutls will check its validity. We additionally try to load the key from disk outside of gnutls's control as a way to log an error if it's not present, and I suppose as a way to have it cached for the first connection. Note one behavioral change: previously, our callback function wasn't even checking the username. This is somewhat pointless since the username will always be the same, but it still seems worth doing as a best practice. Because all of the above only applies to execd (PSK support for remote CIB operations will not work this way), just get rid of pcmk__tls_add_psk_callback and inline its contents in execd. Ref T961

This is the server side of things. The way this works is that the server has a file where each line is "username:key". If this file is present and useable, PSK authentication will be supported. If it's just present but not useable, we assume the admin made a mistake in configuration and error out. Otherwise, we'll fall back to anonymous authentication for now. gnutls will handle reading this file, looking for a user match, and checking the key. This patch is made more complicated by refusing to use a credentials file with the wrong ownerships or permissions. It's annoying, but that's the way a lot of security software works and I think it's probably a best practice. Note that the location of the credentials file is not configurable. Ref T961

This function can be used to add two kinds of PSK keys: (1) Binary (or raw, as gnutls calls them) keys. This is just a binary blob of data and is what Pacemaker Remote keys tend to be. Various documents online describe using dd to create the authentication key for remote nodes, and I think pcs does something similar. (2) Text files with name:key rows. Here, the key is a string of hex characters. This is what remote CIB administration will use. gnutls needs to be told what format the key is in when it's being loaded. I feel like this is opening us up to some pretty annoying bugs, but I don't see a way around it if we're to support both kinds of things we want to do. In particular, I'm worried that someone will be using a text-based Pacemaker Remote key. Ref T961

This is for the exact same reasons as pcmk__tls_client_add_psk_key. Hex keys won't work with gnutls unless the trailing newline is stripped from the read string. Ref T961

This is the client side of things. For the client, we need to get the username and key from somewhere. I've chosen to use $CIB_user and to overload $CIB_key_file (which is already in use for X509 cert support) for this purpose. Overloading the latter should be fine - there's nowhere we check just that environment variable to decide on cert support, and you can't use both certs and PSK at the same time. Given those two environment variables, all we have to do is load the key and pass it to gnutls which takes care of the rest. Again, this patch is made more complicated by refusing to use a credentials file with the wrong ownerships or permissions Ref T961

Ref T961

…ey_location. Ref T961

Ref T961

No test output was affected. Ref T961

This CIB property is deprecated in favor of remote-tls-port which is more secure and has been supported since at least 2014. For now, it will still be recognized but will log a warning. Also mark it as deprecated in the docs and remove references to it being something that you can use. Ref T961

This is an insecure authentication method where we use an encrypted TLS channel for communication, but there's no authentication on the channel beyond that. Well, that's not completely true - for remote CIB operations, you do still need a username and password. Instead, people should be using X509 certificates or PSK. A future release will remove support for anonymous authentication and require use of one of those other mechanisms. Fixes T961

That function is already set up to return more than just -1 (see the bottom of the function, for instance). In the TLS setup code, we should replace the -1 return codes with more definite errors. These return codes are already errno values, not pacemaker return codes, so we're okay to do this.

The block of code under "if (private->encrypted)" is getting pretty complicated now, so it makes sense to make it into its own function. Differences from the previous code: * There's now a done label which allows us to short circuit out of the GNUTLS_CRD_PSK block on error. * The above allows code under it to no longer test rc and can therefore be un-indented. * Calling cib_tls_close is handled by the caller instead of in the error handling blocks.

clumens requested a review from nrwahl2 February 9, 2026 15:50

clumens force-pushed the remote-cib-psk branch from 600865c to 834fb83 Compare February 10, 2026 15:18

clumens mentioned this pull request Feb 10, 2026

Med: daemons: Fix user/group checking in based. #4055

Merged

clumens force-pushed the remote-cib-psk branch from 834fb83 to 5eedc5f Compare March 4, 2026 17:59

clumens force-pushed the remote-cib-psk branch 2 times, most recently from 37ff190 to 947f634 Compare March 17, 2026 18:30

nrwahl2 added the review: in progress PRs that are currently being reviewed label Mar 23, 2026

nrwahl2 reviewed Mar 24, 2026

View reviewed changes

nrwahl2 removed the review: in progress PRs that are currently being reviewed label Mar 24, 2026

clumens force-pushed the remote-cib-psk branch from 947f634 to 10edb33 Compare March 24, 2026 16:24

nrwahl2 approved these changes Mar 24, 2026

View reviewed changes

clumens force-pushed the remote-cib-psk branch from 10edb33 to b2150a9 Compare March 27, 2026 15:42

nrwahl2 requested changes Apr 9, 2026

View reviewed changes

clumens added 14 commits April 10, 2026 16:22

Refactor: libcrmcommon: Add a raw param to pcmk__load_key.

39f3f80

This is for the exact same reasons as pcmk__tls_client_add_psk_key. Hex keys won't work with gnutls unless the trailing newline is stripped from the read string. Ref T961

Doc: sysconfig: Remove a redundant sentence from PCMK_authkey_location.

ea67ff9

Ref T961

Doc: Pacemaker Explained: Remove a redundant sentence from PCMK_authk…

b053922

…ey_location. Ref T961

Doc: Pacemaker Administration: Document remote CIB admin with PSK.

43d8d68

Ref T961

Test: cts-scheduler: Remove remote-clear-port from test XML files.

75182c8

No test output was affected. Ref T961

Refactor: libcib: Use pcmk__xe_set_props in cib_tls_signon.

eee2748

clumens added 3 commits April 13, 2026 10:45

Refactor: libcib: Clean up includes in cib_remote.c

253f1ef

clumens force-pushed the remote-cib-psk branch from b2150a9 to 6e81f53 Compare April 13, 2026 15:22

Conversation

clumens commented Feb 9, 2026

Uh oh!

clumens commented Feb 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clumens commented Feb 10, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nrwahl2 left a comment

Choose a reason for hiding this comment

Uh oh!

clumens commented Mar 27, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nrwahl2 Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

clumens Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

nrwahl2 Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nrwahl2 Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nrwahl2 Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

clumens Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

nrwahl2 Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

clumens Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

clumens commented Feb 9, 2026 •

edited

Loading

nrwahl2 Apr 9, 2026 •

edited

Loading

nrwahl2 Apr 9, 2026 •

edited

Loading