pre-proposal: CurveZMQ #75

Open
minrk opened this issue Sep 21, 2021 · 1 comment

minrk commented Sep 21, 2021

ZeroMQ has a transport-level encryption and authentication protocol called CurveZMQ.

I've just landed support for CurveZMQ in IPython Parallel, and I think it's worth talking about in Jupyter.

The gist of the most basic implementation:

  • server sockets set a public and secret key, issued e.g. via zmq.curve_keypair() from pyzmq, and set CURVE_SERVER=1 to enable auth (it makes the most sense for the sockets that bind to be the servers, but this is not strictly a requirement).
  • client sockets use the server's public key as a server key (CURVE_SERVERKEY) - anyone with this key can connect to the server and send/receive messages
  • the client's own private/public keys can be any key pair and are used exclusively for encryption; in this basic setup they have no role in authentication
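
A minimal synchronous sketch of the three points above, assuming a pyzmq build with Curve support. The function name and the REQ/REP pairing are illustrative choices, not part of the proposal. The client uses a freshly issued keypair that the server has never seen, and connects knowing only the server's public key:

```python
import zmq


def curve_roundtrip(timeout_ms=5000):
    """Send one request/reply over a Curve-secured REQ/REP pair."""
    server_public, server_secret = zmq.curve_keypair()
    # The client's keypair is unrelated to the server's: it is only used for encryption.
    client_public, client_secret = zmq.curve_keypair()

    ctx = zmq.Context()
    try:
        server = ctx.socket(zmq.REP)
        server.CURVE_SECRETKEY = server_secret
        server.CURVE_PUBLICKEY = server_public
        server.CURVE_SERVER = True
        port = server.bind_to_random_port("tcp://127.0.0.1")

        client = ctx.socket(zmq.REQ)
        client.CURVE_SECRETKEY = client_secret
        client.CURVE_PUBLICKEY = client_public
        # Knowing the server's public key is what allows the client to connect.
        client.CURVE_SERVERKEY = server_public
        client.connect(f"tcp://127.0.0.1:{port}")

        client.send(b"ping")
        if not server.poll(timeout_ms):
            raise TimeoutError("server never received the request")
        request = server.recv()
        server.send(b"pong")
        if not client.poll(timeout_ms):
            raise TimeoutError("client never received the reply")
        reply = client.recv()
        return request, reply
    finally:
        ctx.destroy(linger=0)


if __name__ == "__main__":
    print(curve_roundtrip())
```

No ZAP handler is installed here, which is why the server accepts a client key it has never seen.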

In Jupyter, we already have a key distribution mechanism, which is the HMAC message-signing key in connection files. We can use the same key distribution for Curve keys. In the context of Jupyter, it's a little weird, because it's usually the client (KernelManager) that sets the credentials, which means the client issues the kernel's private key as well, and needs to pass the private key to the kernel. This being the case, the absolute simplest version is to use the same private/public keypair for both ends.

Sketch:

  • KernelManager issues a public/private keypair with zmq.curve_keypair()
  • store private_key and public_key in connection file
  • kernel sockets set CURVE_SECRETKEY, CURVE_PUBLICKEY, CURVE_SERVER=1
  • client sockets set the same CURVE_SECRETKEY, CURVE_PUBLICKEY, and use the public key in CURVE_SERVERKEY (alternative: client sockets issue new private/public keys, but if the private key is already in the connection file, I see no benefit to issuing more keypairs)
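
The sketch above could look roughly like this in code. This is only an illustration: the helper names are hypothetical, the connection-file field names public_key and private_key are taken from the sketch rather than from any settled spec, and the socket argument is assumed to be a pyzmq socket (any object accepting these attributes works for the demonstration).

```python
import json


def write_connection_file(path, info, public_key, private_key):
    """Write connection info plus the Curve keypair as JSON.

    Keys from zmq.curve_keypair() are Z85-encoded ASCII bytes,
    so they can be stored as ordinary JSON strings.
    """
    info = dict(info)
    info["public_key"] = public_key.decode("ascii")
    info["private_key"] = private_key.decode("ascii")
    with open(path, "w") as f:
        json.dump(info, f, indent=2)
    return info


def load_curve_keys(path):
    """Read the keypair back as bytes, ready to be set on sockets."""
    with open(path) as f:
        info = json.load(f)
    return info["public_key"].encode("ascii"), info["private_key"].encode("ascii")


def configure_kernel_socket(sock, public_key, private_key):
    """Kernel side: every socket acts as a Curve server."""
    sock.CURVE_SECRETKEY = private_key
    sock.CURVE_PUBLICKEY = public_key
    sock.CURVE_SERVER = True


def configure_client_socket(sock, public_key, private_key):
    """Client side: reuse the same keypair, and use the public key as the server key."""
    sock.CURVE_SECRETKEY = private_key
    sock.CURVE_PUBLICKEY = public_key
    sock.CURVE_SERVERKEY = public_key
```

In this sketch a KernelManager would call zmq.curve_keypair() once, pass the result to write_connection_file, and both ends would call load_curve_keys before creating their sockets.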

Here's an example of an authenticated socket pair in pyzmq:

curve socket example
import asyncio
import zmq
import zmq.asyncio as zaio

async def main():
    public, private = zmq.curve_keypair()
    ctx = zaio.Context()
    server = ctx.socket(zmq.ROUTER)
    # server socket is a 'curve server'
    server.CURVE_SECRETKEY = private
    server.CURVE_PUBLICKEY = public
    server.CURVE_SERVER = True

    url = "tcp://127.0.0.1:5555"
    server.bind(url)

    no_auth_client = ctx.socket(zmq.DEALER)

    auth_client = ctx.socket(zmq.DEALER)
    auth_client.CURVE_SECRETKEY = private
    auth_client.CURVE_PUBLICKEY = public
    auth_client.CURVE_SERVERKEY = public  # knowing the server's public key is what lets this client connect

    auth_client.connect(url)
    no_auth_client.connect(url)


    for i in range(5):
        # messages from 'auth_client' will be received
        asyncio.ensure_future(auth_client.send(b'auth'))
        # messages from 'no_auth_client' will never be delivered
        asyncio.ensure_future(no_auth_client.send(b'noauth'))
        msg = await server.recv_multipart()
        print("Received", msg)

    ctx.destroy(linger=0)

if __name__ == "__main__":
    asyncio.run(main())

Benefits

  • transport-level encryption, which was not previously available. CurveZMQ provides forward secrecy, meaning access to the keys is not sufficient to later decrypt captured traffic. Access to the keys is, however, sufficient for a live man-in-the-middle attack.
  • connection-level authentication removes need to check message signatures (the less security-related code we implement ourselves, the better!), and is vastly simpler to implement for both kernels and clients
  • connection-level authentication prevents non-authenticated clients with access to ports from monitoring IOPub, which is currently only preventable when using ipc:// transport and file permissions
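
To illustrate the IOPub point, here is a rough sketch in which a plain PUB/SUB pair stands in for the IOPub channel (an assumption for the demo; the function name and message content are made up). It requires a pyzmq build with Curve support. A subscriber holding the keys receives messages, while one that merely knows the port does not:

```python
import time

import zmq


def iopub_demo(duration=1.0):
    """Publish for `duration` seconds and count what each subscriber receives."""
    public, secret = zmq.curve_keypair()
    ctx = zmq.Context()
    try:
        pub = ctx.socket(zmq.PUB)
        pub.CURVE_SECRETKEY = secret
        pub.CURVE_PUBLICKEY = public
        pub.CURVE_SERVER = True
        port = pub.bind_to_random_port("tcp://127.0.0.1")
        url = f"tcp://127.0.0.1:{port}"

        # Subscriber with the keys, as a client reading the connection file would have.
        auth_sub = ctx.socket(zmq.SUB)
        auth_sub.CURVE_SECRETKEY = secret
        auth_sub.CURVE_PUBLICKEY = public
        auth_sub.CURVE_SERVERKEY = public

        # Subscriber that only knows the port.
        plain_sub = ctx.socket(zmq.SUB)

        for sub in (auth_sub, plain_sub):
            sub.setsockopt(zmq.SUBSCRIBE, b"")
            sub.connect(url)

        received = {"auth": 0, "plain": 0}
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            # Keep publishing: PUB drops messages until a subscriber has joined.
            pub.send(b"status: busy")
            if auth_sub.poll(20):
                auth_sub.recv()
                received["auth"] += 1
            if plain_sub.poll(20):
                plain_sub.recv()
                received["plain"] += 1
        return received
    finally:
        ctx.destroy(linger=0)


if __name__ == "__main__":
    print(iopub_demo())
```

The unauthenticated subscriber never completes the handshake, so nothing is delivered to it.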

Caveats

  • Severe crash issues in libzmq regarding thread safety on systems without getrandom, closed as "wontfix: use getrandom": threadsafety issue with curve_keypair, libsodium, randombytes_close zeromq/libzmq#4241. I think this is a huge mistake, and I have backported the opt-in patches on libzmq to both pyzmq's bundled libzmq and conda-forge libzmq, but this may still turn out to be a major problem.
  • Requires implementation by kernels, and capability advertisement in kernelspecs; always a high hurdle
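
As one possible shape for the capability advertisement, a client could check the kernelspec before deciding whether to issue Curve keys. The "supported_features" list under the kernelspec's "metadata" is a purely hypothetical convention used here for illustration; nothing in this proposal defines where or how the capability would be declared.

```python
def kernel_supports_curve(kernelspec):
    """Return True if a kernelspec dict advertises Curve support.

    Looks for "curve" in metadata["supported_features"], a hypothetical field.
    """
    metadata = kernelspec.get("metadata") or {}
    return "curve" in metadata.get("supported_features", [])


def choose_security(kernelspec):
    """Fall back to the existing HMAC message signing for kernels without Curve."""
    return "curve" if kernel_supports_curve(kernelspec) else "hmac"
```

A kernel that does not advertise the capability keeps working exactly as it does today, which is the main reason for making the check explicit.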

Alternatives:

  • broader support for zmq auth, including PLAIN (username:password, authentication but not encryption), GSSAPI, etc.
  • full support for ZAP (zmq auth protocol), which is a much bigger burden on kernels, but would enable more sophisticated access control

Both of these would require more generic exposure of general zmq options, whereas the proposal as it stands only requires sharing a single string (or two strings) for the key pair, under the exact same model we already have, and setting only three socket options (same values on all sockets), a fairly minor change in practice. Plus, all changes are in socket creation, nowhere else in the protocol implementation.

minrk commented Sep 21, 2021

We can avoid passing the private key in the connection file by specifying that it will be in an environment variable, e.g. $JUPYTER_CURVE_SECRETKEY. In that case, clients should assume that they will need to issue their own private/public keypair. This doesn't really make any difference to clients, but it means the connection file alone is no longer enough info for a man-in-the-middle attack. It's also no longer enough info, on its own, for kernels to set up their sockets.
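
A small sketch of this variant, assuming the environment variable name suggested above and a hypothetical public_key field in the connection file. The launcher puts the private key into the kernel's environment, and the kernel combines it with the public key read from the connection file:

```python
import os

ENV_VAR = "JUPYTER_CURVE_SECRETKEY"  # name suggested above, not a settled spec


def launch_env(private_key, base_env=None):
    """Launcher side: copy the environment and add the kernel's private key."""
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = private_key.decode("ascii")
    return env


def kernel_curve_keys(connection_info, environ=None):
    """Kernel side: public key from the connection file, private key from the environment."""
    environ = os.environ if environ is None else environ
    private_key = environ.get(ENV_VAR)
    if private_key is None:
        raise RuntimeError(f"{ENV_VAR} is not set; cannot configure Curve sockets")
    public_key = connection_info["public_key"]  # hypothetical field name
    return public_key.encode("ascii"), private_key.encode("ascii")
```

The returned environment would be passed to whatever starts the kernel process, e.g. as the env argument of subprocess.Popen. Clients reading the connection file would see only the public key and issue their own keypair.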
