Merged
Changes from 2 commits
5 changes: 5 additions & 0 deletions molecule/zookeeper-digest-rhel/molecule.yml
@@ -120,6 +120,11 @@ provisioner:
all:
  scenario_name: zookeeper-digest-rhel

  # Enable ansible_become to simulate customer production environment
  ansible_become: true
  ansible_become_method: sudo
  ansible_become_user: root

  zookeeper_quorum_authentication_type: digest
  zookeeper_client_authentication_type: digest
  sasl_protocol: plain
25 changes: 25 additions & 0 deletions molecule/zookeeper-digest-rhel/verify.yml
@@ -37,3 +37,28 @@
file_path: /etc/schema-registry/schema-registry.properties
property: kafkastore.security.protocol
expected_value: SASL_PLAINTEXT

- name: Verify - ZooKeeper chroot creation with digest authentication
  hosts: kafka_broker
  gather_facts: false
  tasks:
    - name: Import Variables
      import_role:
        name: variables

    - name: Verify chroot exists in ZooKeeper with authentication
      shell: >
        {% if kafka_broker_final_properties['zookeeper.set.acl']|default('false')|lower == 'true' %}KAFKA_OPTS='-Djava.security.auth.login.config={{kafka_broker.jaas_file}}'{% endif %} \
        {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{zookeeper_client_port}} \
        ls /
      register: zk_root_listing
      run_once: true
      changed_when: false
      failed_when: false
Copilot AI Sep 21, 2025

This duplicates the same complex conditional logic from the main task. Consider extracting this authentication logic into a variable or using the environment parameter to avoid code duplication.

Suggested change
    - name: Verify chroot exists in ZooKeeper with authentication
      shell: >
        {% if kafka_broker_final_properties['zookeeper.set.acl']|default('false')|lower == 'true' %}KAFKA_OPTS='-Djava.security.auth.login.config={{kafka_broker.jaas_file}}'{% endif %} \
        {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{zookeeper_client_port}} \
        ls /
      register: zk_root_listing
      run_once: true
      changed_when: false
      failed_when: false

    - name: Set KAFKA_OPTS for ZooKeeper shell if needed
      set_fact:
        zk_kafka_opts: >-
          {% if kafka_broker_final_properties['zookeeper.set.acl']|default('false')|lower == 'true' %}
          -Djava.security.auth.login.config={{ kafka_broker.jaas_file }}
          {% else %}
          ""
          {% endif %}
    - name: Verify chroot exists in ZooKeeper with authentication
      shell: >
        {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{zookeeper_client_port}} \
        ls /
      register: zk_root_listing
      run_once: true
      changed_when: false
      failed_when: false
      environment:
        KAFKA_OPTS: "{{ zk_kafka_opts }}"
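
One caveat on the suggested change above: in the `{% else %}` branch, the literal `""` sits outside any Jinja expression, so it appears to render as two quote characters rather than an empty string, and whitespace from the folded block may also end up in the fact. A minimal sketch of an alternative follows, assuming the same variables as the diff (`kafka_broker_final_properties`, `kafka_broker.jaas_file`, `binary_base_path`, `zookeeper_client_port`). The task-level names `zk_acl_enabled` and `zk_kafka_opts` are illustrative, not part of the role, and this has not been run against a cluster.

```yaml
- name: Verify chroot exists in ZooKeeper with authentication
  shell: >
    {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{ zookeeper_client_port }}
    ls /
  vars:
    # Illustrative helper: true only when the broker sets ACLs on its znodes.
    zk_acl_enabled: "{{ kafka_broker_final_properties['zookeeper.set.acl'] | default('false') | lower == 'true' }}"
    # Inline conditional: either the JAAS option or a genuinely empty string.
    zk_kafka_opts: "{{ ('-Djava.security.auth.login.config=' ~ kafka_broker.jaas_file) if zk_acl_enabled | bool else '' }}"
  environment:
    KAFKA_OPTS: "{{ zk_kafka_opts }}"
  register: zk_root_listing
  run_once: true
  changed_when: false
  failed_when: false
```

Note that setting `KAFKA_OPTS` through `environment` replaces any value the task would otherwise inherit, even when the string is empty, which differs slightly from the inline prefix used in the main task.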

    - name: Assert chroot creation succeeded
      assert:
        that:
          - zk_root_listing.rc == 0
          - "zookeeper_chroot.lstrip('/') in zk_root_listing.stdout"
        fail_msg: ZOOKEEPER CHROOT CREATION FAILED
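
The assertion above lists `/` and does a substring match on the chroot with its leading slash stripped, which fits a chroot directly under the root. For a nested chroot such as the `kafkadev/01` case discussed in this review, `ls /` would only show the first path segment. A hedged sketch of a variant that lists the parent path instead is shown here, assuming the same variables as the diff and Ansible's built-in `dirname` and `basename` filters. The register name `zk_parent_listing` is hypothetical, and the tasks have not been run against a cluster.

```yaml
- name: List the parent znode of the chroot
  shell: >
    {% if kafka_broker_final_properties['zookeeper.set.acl']|default('false')|lower == 'true' %}KAFKA_OPTS='-Djava.security.auth.login.config={{ kafka_broker.jaas_file }}'{% endif %}
    {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{ zookeeper_client_port }}
    ls {{ zookeeper_chroot | dirname }}
  register: zk_parent_listing  # hypothetical name
  run_once: true
  changed_when: false
  failed_when: false

- name: Assert the last path segment of the chroot exists under its parent
  assert:
    that:
      - zk_parent_listing.rc == 0
      # Still a substring match, so short segment names could match unrelated output.
      - "(zookeeper_chroot | basename) in zk_parent_listing.stdout"
    fail_msg: ZOOKEEPER CHROOT CREATION FAILED
```

For a root-level chroot such as `/kafka`, `dirname` yields `/` and `basename` yields `kafka`, so the check reduces to the original one. A chroot with a trailing slash would give an empty `basename` and would need to be normalized first.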
1 change: 1 addition & 0 deletions roles/kafka_broker/tasks/main.yml
@@ -432,6 +432,7 @@
# Only runs with zookeeper
- name: Create Zookeeper chroot
  shell: >
    {% if kafka_broker_final_properties['zookeeper.set.acl']|default('false')|lower == 'true' %}KAFKA_OPTS='-Djava.security.auth.login.config={{kafka_broker.jaas_file}}'{% endif %} \
Member

Why is this not caught in the molecule tests?

Member

Is this issue also present when ZooKeeper client authentication is Kerberos? There, too, we define the zookeeper.set.acl property in Kafka.

Member

Can the zookeeper-shell command also fail in the tasks Get Kafka Cluster ID from Zookeeper or Get Controller Broker ID? There, too, we have not provided the JAAS file as an argument.

Member Author

Why is this not caught in the molecule tests?

In the related incident, the customer's znode was nested:
[image: getACL output for the customer's znode]

According to the getACL output above, only zkuser has write (w) permission on that particular znode. In the Ansible playbook, the znode is in most cases created directly under the root, so the command that creates it does not fail in the molecule tests. In the customer's case, however, the znode is nested. For this kind of scenario, the auth configs need to be passed.

Is this issue also present when ZooKeeper client authentication is Kerberos? There, too, we define the zookeeper.set.acl property in Kafka.

Yes. If a nested znode with similar permissions is used, we will see this issue, since zookeeper.set.acl = true.

Can the zookeeper-shell command also fail in the tasks Get Kafka Cluster ID from Zookeeper or Get Controller Broker ID? There, too, we have not provided the JAAS file as an argument.

No, these tasks will not fail.

Member

Let's add a nested znode to one or a few of our test cases as well.

Member Author

Adding a molecule test for a nested znode is not straightforward. With the way our playbooks work, /kafkadev must already exist before kafkadev/01 can be created, so the playbook would have to run twice. This was caught during an upgrade scenario, and the case can exist in pre-existing clusters. That is why we tested this manually.

Member

  1. This can probably be handled using prepare.yml.
  2. Shouldn't this be handled in the playbook code so that it doesn't fail for customers?

Member

After further discussion, it now seems fine not to add additional tests for this, since the fix is meant to smooth some upgrades for users. Handling of nested directory paths is not natively supported by cp-ansible, and since ZooKeeper is deprecated from 8.0 onwards, it doesn't seem worth the effort to add that kind of support.

    {{ binary_base_path }}/bin/zookeeper-shell {{ hostvars[groups['zookeeper'][0]] | confluent.platform.resolve_hostname }}:{{zookeeper_client_port}} \
    {% if zookeeper_ssl_enabled|bool %}-zk-tls-config-file {{ kafka_broker.zookeeper_tls_client_config_file if kafka_broker_secrets_protection_enabled else kafka_broker.config_file }}{% endif %} \
    create {{zookeeper_chroot}} ""