In the 0.9.0.0 release, the Kafka community added a number of features that, used either separately or together, increase security in a Kafka cluster. These features are considered to be of beta quality. The following security measures are currently supported:
- Authentication of connections to brokers from clients (producers and consumers), other brokers and tools, using either SSL or SASL (Kerberos)
- Authentication of connections from brokers to ZooKeeper
- Encryption of data transferred between brokers and clients, between brokers, or between brokers and tools using SSL (Note that there is a performance degradation when SSL is enabled, the magnitude of which depends on the CPU type and the JVM implementation.)
- Authorization of read / write operations by clients
- Authorization is pluggable and integration with external authorization services is supported
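As a rough illustration of the client side of these features, the sketch below builds the settings a producer or consumer might use for an SSL connection or a SASL (Kerberos) connection. The configuration keys are standard Kafka client settings. The helper function names, file paths, and passwords are hypothetical placeholders, not part of Kafka.

```python
# Minimal sketch: assemble Kafka client security settings as plain key/value pairs.
# Helper names, paths, and passwords are illustrative assumptions.


def ssl_client_properties(truststore_path, truststore_password,
                          keystore_path=None, keystore_password=None):
    """Settings for an SSL-encrypted connection to a broker."""
    props = {
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }
    # A keystore is only needed when the broker requires client authentication.
    if keystore_path is not None:
        props["ssl.keystore.location"] = keystore_path
        props["ssl.keystore.password"] = keystore_password
    return props


def sasl_client_properties(service_name="kafka"):
    """Settings for a Kerberos-authenticated connection without encryption."""
    return {
        "security.protocol": "SASL_PLAINTEXT",
        "sasl.kerberos.service.name": service_name,
    }


def to_properties_text(props):
    """Render settings in the key=value form used by .properties files."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))


if __name__ == "__main__":
    # Hypothetical truststore path and password.
    print(to_properties_text(
        ssl_client_properties("/path/to/client.truststore.jks", "changeit")))
```

The rendered text can be saved as a properties file and passed to a client or command-line tool. Keeping the SSL and SASL settings in separate helpers mirrors the point above that the features can be used separately or together.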
Kafka was originally developed at LinkedIn in 2010. It began as an open system to encourage adoption: developers could easily create new data streams, add data to the pipeline, and read data as it was created. It succeeded brilliantly at encouraging developers to build new data applications, improved the reliability of systems and applications, and helped LinkedIn scale its logging and data infrastructure.
Unfortunately, as Kafka usage grew at LinkedIn (and at other sites), we discovered problems with a totally open system. Developers might inadvertently cause production problems when creating new Kafka streams, engineers might change the configuration of critical systems, and employees might get access to sensitive data. As Kafka has been adopted by larger enterprises with more complex security requirements, we have had to rethink our architecture.
In this course, we will explain how we have secured Apache Kafka: the threats that Kafka security mitigates, the changes that we made to Kafka to enable security, and the steps required to secure an existing Kafka cluster.
Specifically, we will cover:
- New security features in Kafka 0.9
- The risks and threats with a distributed data streaming system
- Common issues with deploying a secure Kafka system
- The access control model for Kafka
- Configuring authentication, access control, and encryption
- Using a secure Kafka cluster with other secure (and insecure) systems
- Testing, monitoring and tuning a secure Kafka cluster
- Future work in Kafka security