Make SSH use gpg-agent
ssh-agent
OpenSSH is the de facto "standard" for SSH clients and servers on Linux and macOS. (This is apparently true for newer versions of Windows as well.)
The ssh program normally uses a program called ssh-agent to hold SSH secret keys in memory. The ssh-agent program actually performs the signing operations necessary to authenticate SSH connections, without ssh needing to know the actual secret key.
On most systems, ssh-agent is started as part of each user's login process. When it starts, it creates a "Unix Domain Socket". The full pathname of this socket is stored in the SSH_AUTH_SOCK environment variable, which is inherited by other processes within the user's login session.
Unix Domain Sockets work the same way that network sockets do, but ...
- Unix sockets can only be used to communicate with other processes on the same machine.
- Instead of each endpoint being an IP address and port number, the endpoint is a filename on the local filesystem.
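To make the "filename instead of IP address and port" idea concrete, here is a minimal Python sketch that sends a message through a Unix socket and reads the reply. The socket filename (demo.sock) and the function names are made up for illustration; nothing here is specific to ssh-agent.

```python
import os
import socket
import tempfile
import threading


def echo_once(server: socket.socket) -> None:
    """Accept one connection and send back whatever the client sent."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def unix_socket_round_trip(message: bytes) -> bytes:
    """Send `message` through a Unix socket and return the echoed reply."""
    with tempfile.TemporaryDirectory() as tmp:
        # The endpoint is a path on the local filesystem (hypothetical name).
        path = os.path.join(tmp, "demo.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)  # connect to a filename, not host:port
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
        server.close()
        return reply


if __name__ == "__main__":
    print(unix_socket_round_trip(b"hello"))
```

Apart from the address family (AF_UNIX) and the path passed to bind() and connect(), the calls are the same ones a TCP client and server would use.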
Programs like ssh, scp, and sftp use the SSH_AUTH_SOCK environment variable to find the agent. If this variable doesn't exist, ssh will not be able to use an agent, and will only be able to authenticate using passwords or secret key files stored on the local disk.
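The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how OpenSSH implements it (a real client simply tries to connect to the socket); the function name find_agent_socket is hypothetical.

```python
import os
import stat
from typing import Mapping, Optional


def find_agent_socket(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the agent socket path from SSH_AUTH_SOCK, or None if unusable."""
    path = env.get("SSH_AUTH_SOCK")
    if not path:
        return None  # variable missing or empty: no agent available
    try:
        mode = os.stat(path).st_mode  # follows symbolic links
    except OSError:
        return None  # the path does not exist or cannot be read
    # Only accept the path if it really is a Unix socket.
    return path if stat.S_ISSOCK(mode) else None


if __name__ == "__main__":
    print(find_agent_socket() or "no agent socket found")
```

Because os.stat() follows symbolic links, a link that points at a socket is accepted just like the socket itself.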
The protocol (or "language") that SSH clients use when talking to ssh-agent is fairly simple, although it doesn't seem to be widely documented. The best thing I've been able to find, every time I've looked for it, is an IETF draft document which "expired" in 2020. That doesn't make it any less valid; it just means that the document hadn't been updated for six months (which is probably a good thing, since it means the document didn't need to be updated).
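To give a feel for how simple the protocol is, here is a Python sketch of the message framing as I understand it from that draft: every message is a 4-byte big-endian length, followed by a one-byte message type and a type-specific payload. The two message numbers are the ones the draft assigns to "list the keys you hold" and its reply; the function names are my own.

```python
import struct
from typing import List, Tuple

SSH_AGENTC_REQUEST_IDENTITIES = 11  # client -> agent: list your keys
SSH_AGENT_IDENTITIES_ANSWER = 12    # agent -> client: here they are


def frame(msg_type: int, payload: bytes = b"") -> bytes:
    """Wrap a message type and payload in the length-prefixed framing."""
    body = bytes([msg_type]) + payload
    return struct.pack(">I", len(body)) + body


def read_string(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read one length-prefixed string; return it and the next offset."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length], start + length


def parse_identities_answer(body: bytes) -> List[Tuple[bytes, str]]:
    """Parse an answer body (without its length prefix) into (key blob, comment) pairs."""
    if body[0] != SSH_AGENT_IDENTITIES_ANSWER:
        raise ValueError("unexpected message type %d" % body[0])
    (count,) = struct.unpack_from(">I", body, 1)
    offset = 5
    keys = []
    for _ in range(count):
        blob, offset = read_string(body, offset)
        comment, offset = read_string(body, offset)
        keys.append((blob, comment.decode("utf-8", "replace")))
    return keys
```

A client would write frame(SSH_AGENTC_REQUEST_IDENTITIES) to the socket named by SSH_AUTH_SOCK, read the 4-byte length, read that many bytes, and hand them to parse_identities_answer().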
macOS
macOS 10.5 and later set things up to start an ssh-agent process as part of each user's login process. The underlying mechanics are different for different macOS versions, and the filename of the Unix socket is randomly generated, but the result is that every process running as part of the user's login session will inherit an SSH_AUTH_SOCK environment variable pointing to that Unix socket.
With macOS 10.15 and later, the SIP (System Integrity Protection) mechanism makes it difficult (and in later versions, impossible) to make macOS not start ssh-agent automatically.
Linux
Most Linux distributions do something similar, especially if the login session involves a GUI desktop environment. If it doesn't happen automatically, it's usually fairly simple to edit your "login scripts" (such as a .bashrc file) to either start an agent, or find an existing agent process, and export the SSH_AUTH_SOCK environment variable for you.
Note that I haven't needed to mess with this stuff in at least ten years, and I don't honestly remember any details about it.
gpg-agent
GnuPG has a program called gpg-agent which performs the same kind of in-memory caching, but for PGP keys.
The gpg-agent program can be configured to open a Unix socket and speak the ssh-agent protocol. If you do this, gpg-agent will be able to perform the same signing operations that ssh-agent does, using any of the following:
- SSH secret key files (such as "id_rsa") from disk.
- PGP authentication subkeys from your keyring.
- PGP authentication subkeys stored on a smartcard, such as a YubiKey.
So what we want to do is make all SSH clients talk to gpg-agent instead of ssh-agent. SSH clients use the SSH_AUTH_SOCK environment variable to find the agent, so ...
If we make the SSH_AUTH_SOCK environment variable point to the Unix socket that gpg-agent opens, when an SSH client tries to talk to ssh-agent, it will actually be talking to gpg-agent.
Ultimately, we need to make the SSH_AUTH_SOCK variable point to the Unix socket file that gpg-agent creates.
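As a rough sketch of that idea, the Python below asks GnuPG where the socket is and builds an environment in which SSH_AUTH_SOCK points at it. It assumes GnuPG 2.1 or later (for `gpgconf --list-dirs agent-ssh-socket`) and that gpg-agent has SSH support turned on, typically with an `enable-ssh-support` line in gpg-agent.conf. The function names are hypothetical.

```python
import os
import subprocess
from typing import Dict, Mapping


def gpg_agent_ssh_socket() -> str:
    """Ask gpgconf where gpg-agent's SSH socket lives (needs GnuPG installed)."""
    result = subprocess.run(
        ["gpgconf", "--list-dirs", "agent-ssh-socket"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def env_with_agent(socket_path: str,
                   base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a copy of `base` with SSH_AUTH_SOCK pointing at `socket_path`."""
    env = dict(base)
    env["SSH_AUTH_SOCK"] = socket_path
    return env


# Example use (not run here, because it needs gpgconf and ssh-add):
#   env = env_with_agent(gpg_agent_ssh_socket())
#   subprocess.run(["ssh-add", "-L"], env=env)  # list the keys gpg-agent offers
```

This only affects child processes started with the modified environment. Making the change stick for a whole login session is the harder part, which is what the rest of this page is about.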
macOS
Back in 2018, I figured out how to stop macOS from starting the ssh-agent process, and how to make the login process set the SSH_AUTH_SOCK environment variable point to the socket created by gpg-agent. This worked for a while, but then SIP came along (and later APFS with its immutable filesystems) and that approach didn't work anymore.
Then I found this article, which explains how to "do it the other way around". Instead of trying to change what macOS does, we can replace the Unix socket file with a symbolic link, pointing to the Unix socket where gpg-agent is listening for connections from SSH clients.
This is so much simpler than what I had originally come up with.
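The core of the symbolic link trick can be sketched in Python as follows. This is my own minimal illustration, not the method from the linked article; the function name is hypothetical, and it assumes you already know both paths (the one in SSH_AUTH_SOCK and the one gpg-agent listens on).

```python
import os


def replace_with_symlink(link_path: str, target: str) -> None:
    """Make `link_path` a symbolic link to `target`, replacing what was there."""
    if os.path.lexists(link_path):
        # Removes the original socket file, or a stale link from an earlier run.
        os.remove(link_path)
    os.symlink(target, link_path)


# Example use (paths are placeholders, not run here):
#   replace_with_symlink(os.environ["SSH_AUTH_SOCK"], "/path/to/gpg-agent/ssh/socket")
```

Since macOS generates a new random socket filename for each login session, something like this would presumably have to run once per login.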