Skip to main content

Securing ElasticSearch / Kafka clusters with SSL

By default, there is no encryption, authentication, or ACLs configured in Elasticsearch/Kafka. Any client can communicate to ES nodes / Kafka brokers via the PLAINTEXT port. It is critical that access via this port is restricted to trusted clients only. Network segmentation and/or authorization ACLs can be used to restrict access to trusted IPs in such cases. If neither is used, the cluster is wide open and can be accessed by anyone.

While non-secured clusters are supported, as are a mix of authenticated, unauthenticated, encrypted and non-encrypted clients, it is recommended to secure the components in your cluster.

Secure Sockets Layer (SSL) is the predecessor of Transport Layer Security (TLS), and SSL has been deprecated since June 2015. However, generally people use the term SSL instead of TLS in configuration and code.

SSL can be configured for encryption or authentication. You may configure just SSL encryption (by default SSL encryption includes certificate authentication of the server) and independently choose a separate mechanism for client authentication, e.g. SSL, SASL, etc. Note that SSL encryption, technically speaking, already enables 1-way authentication in which the client authenticates the server certificate. So when referring to SSL authentication, it is really referring to 2-way authentication in which the server also authenticates the client certificate.

SSL Keys and Certificates

SSL uses private-key/certificates pairs which are used during the SSL handshake process.

  • Each server node needs its own private-key/certificate pair, and the client uses the certificate to authenticate the server node
  • Each logical client needs a private-key/certificate pair if client authentication is enabled, and the server node uses the certificate to authenticate the client

Each server node and logical client can be configured with a truststore, which is used to determine which certificates (broker or logical client identities) to trust (authenticate). The truststore can be configured in many ways, consider the following two examples:

  1. the truststore contains one or many certificates: the server node or logical client will trust any certificate listed in the truststore
  2. the truststore contains a Certificate Authority (CA): the server node or logical client will trust any certificate that was signed by the CA in the truststore

Using the CA (2) is more convenient, because adding a new server node or client doesn’t require a change to the truststore. The CA case (2) is outlined in this diagram:

However, with the CA case (2), blocking authentication for individual server nodes or clients that were previously trusted via this mechanism (certificate revocation is typically done via Certificate Revocation Lists or the Online Certificate Status Protocol) is not conveniently supported, so one would have to rely on authorization to block access. In contrast, with case (1), blocking authentication would be achieved by removing the server node or client’s certificate from the truststore.

Creating SSL Keys and Certificates

Each machine in the cluster has a public-private key pair, and a certificate to identify the machine. The certificate, however, is unsigned, which means that an attacker can create such a certificate to pretend to be any machine.

Therefore, it is important to prevent forged certificates by signing them for each machine in the cluster. A certificate authority (CA) is responsible for signing certificates, and the cryptography guarantees that a signed certificate is computationally difficult to forge. Thus, as long as the CA is a genuine and trusted authority, the clients have high assurance that they are connecting to the authentic machines.

The keystore stores each machine’s own identity. The truststore stores all the certificates that the machine should trust. Importing a certificate into one’s truststore also means trusting all certificates that are signed by that certificate. This attribute is called the chain of trust, and it is particularly useful when deploying SSL on a large Kafka cluster. You can sign all certificates in the cluster with a single CA, and have all machines share the same truststore that trusts the CA. That way all machines can authenticate all other machines.

To deploy SSL, the general steps are:
  • Generate the keys and certificates
  • Create your own Certificate Authority (CA)
  • Sign the certificate

Comments

Popular posts from this blog

MPlayer subtitle font problem in Windows

While playing a video with subtitles in mplayer, I was getting the following problem: New_Face failed. Maybe the font path is wrong. Please supply the text font file (~/.mplayer/subfont.ttf). Solution is as follows: Right click on "My Computer". Select "Properties". Go to "Advanced" tab. Click on "Environment Variables". Delete "HOME" variable from User / System variables.

wget and curl behind corporate proxy throws certificate is not trusted or certificate doesn't have a known issuer

If you try to run wget or curl in Ununtu/Debian behind corporate proxy, you might receive errors like: ERROR: The certificate of 'apertium.projectjj.com' is not trusted. ERROR: The certificate of 'apertium.projectjj.com' doesn't have a known issuer. wget https://apertium.projectjj.com/apt/apertium-packaging.public.gpg ERROR: cannot verify apertium.projectjj.com's certificate, issued by 'emailAddress=proxyteam@corporate.proxy.com,CN=diassl.corporate.proxy.com,OU=Division UK,O=Group name,L=Company,ST=GB,C=UK': Unable to locally verify the issuer's authority. To connect to apertium.projectjj.com insecurely, use `--no-check-certificate'. To solution is to install your company's CA certificate in Ubuntu. In Windows, open the first part of URL in your web browser. e.g. open https://apertium.projectjj.com in web browser. If you inspect the certifcate, you will see the same CN (diassl.corporate.proxy.com), as reported by the error above ...

Kafka performance tuning

Performance Tuning of Kafka is critical when your cluster grow in size. Below are few points to consider to improve Kafka performance: Consumer group ID : Never use same exact consumer group ID for dozens of machines consuming from different topics. All of those commits will end up on the same exact partition of __consumer_offsets , hence the same broker, and this might in turn cause performance problems. Choose the consumer group ID to group_id+topic_name . Skewed : A broker is skewed if its number of partitions is greater that the average of partitions per broker on the given topic. Example: 2 brokers share 4 partitions, if one of them has 3 partitions, it is skewed (3 > 2). Try to make sure that none of the brokers is skewed. Spread : Brokers spread is the percentage of brokers in the cluster that has partitions for the given topic. Example: 3 brokers share a topic that has 2 partitions, so 66% of the brokers have partitions for this topic. Try to achieve 100% broker spread...