Install Kafka using Ansible, monitor using Prometheus and Grafana

Kafka 12 Jul 2020 Siva Nadesan

In this article, we will see how to install Confluent Kafka using Ansible and monitor its metrics using Prometheus and Grafana. The code used in this article can be found on GitHub.

Create Ansible playbook and install Confluent Platform

  • Download and install Ansible for your platform on the client machine. See here for instructions on how to install Ansible.

  • Download the Ansible playbook for Confluent platform from GitHub

  • Create a copy of hosts_example.yml as hosts_lab.yml and make changes to update host names specific to your environment.

  • Compare hosts_lab.yml against hosts_example.yml to review your changes. Pay attention to the changes related to JMX and Prometheus, such as jmxexporter_enabled and ksql_custom_java_args.

Compare Command: reset; sdiff -WBs -w $COLUMNS hosts_example.yml hosts_lab.yml > /tmp/compare.output.temp; sed -i '/^[[:space:]]*$/d;s/[[:space:]]*$//' /tmp/compare.output.temp

  • Edit roles\confluent.common\tasks\main.yml and add the following tasks before the set_fact task. main.yml may not be the best place for these tasks, but it works for this demo.
- name: Create UDF directory
  file:
    path: "{{ ksql_udf_path }}"
    state: directory
    mode: '0777'

- name: Create javatmp directory
  file:
    path: "{{ ksql_javatmp_path }}"
    state: directory
    mode: '0777'

- name: Create ksql state directory
  file:
    path: "{{ ksql_state_path }}"
    state: directory
    mode: '0777'

- name: Create rocksdbtmp directory
  file:
    path: "{{ ksql_rocksdbtmp_path }}"
    state: directory
    mode: '0777'

- name: Create kafka data directory
  file:
    path: "{{ ksql_kafka_data_path }}"
    state: directory
    mode: '0777'
  • Now we are ready to run the playbook. If you are running it on a server that already has Confluent Platform installed, run the commands below to remove the old installation and start fresh.
# Get list of installed packages and remove it
sudo apt list --installed confluent* | cut -d, -f1 | xargs sudo apt --yes --purge remove

# Stop confluent services
sudo systemctl stop confluent*

# Disable confluent services
sudo systemctl disable confluent*

# Remove systemd directories
sudo rm -Rf /etc/systemd/system/confluent*

# Clean systemctl
sudo systemctl daemon-reload
sudo systemctl reset-failed

# Remove all old directories
sudo rm -Rf /etc/schema-registry/
sudo rm -Rf /etc/kafka/
sudo rm -Rf /etc/confluent-rebalancer/
sudo rm -Rf /etc/confluent-kafka-mqtt/
sudo rm -Rf /etc/confluent-control-center/
sudo rm -Rf /etc/ksql/

sudo rm -Rf /var/log/kafka/
sudo rm -Rf /var/log/confluent/
sudo rm -Rf /var/lib/confluent/
sudo rm -Rf /var/lib/kafka/
sudo rm -Rf /var/lib/kafka-streams/
sudo rm -Rf /var/lib/zookeeper/

sudo rm -Rf /tmp/control-center-logs/
sudo rm -Rf /usr/share/confluent-hub-components/

sudo rm -Rf /opt/confluent/javatmp/
sudo rm -Rf /opt/confluent/kafka/
sudo rm -Rf /opt/confluent/rocksdbtmp/
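
Since the cleanup above deletes data irreversibly, it may be worth previewing which of those directories actually exist before removing them. The following is a minimal sketch; list_existing_dirs is a hypothetical helper name, not part of the Confluent tooling.

```shell
# Print each path argument that currently exists as a directory, one per line.
# list_existing_dirs is a hypothetical helper for this article only.
list_existing_dirs() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
}

# Example (not executed here): preview a few of the directories removed above.
# list_existing_dirs /etc/kafka /var/lib/kafka /var/lib/zookeeper /opt/confluent/kafka
```

Only the directories that are printed would be affected by the corresponding rm commands.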
  • Create ssh key
ssh-keygen -t rsa
  • Validate the public key by copying it to authorized_keys in same machine and then issuing ssh <target-host-name>
cat /home/<user-name>/.ssh/ > ~/.ssh/authorized_keys
ssh <target-host-name>
  • Copy the ssh key from client host with Ansible to target host on which we need to install Confluent.
cat /home/<user-name>/.ssh/ | ssh <user-name>@<target-host-name> 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
  • Update /etc/sudoers in target host
Defaults:<user-name> !requiretty  
<user-name> ALL=(ALL) NOPASSWD: ALL
  • Execute the playbook by running
ansible-playbook -i hosts_lab.yml all.yml

Here are some common errors and their solutions when running this playbook

Error: fatal: [entechlog-vm-01]: FAILED! => {"msg": "Missing sudo password"}
Solution: ansible-playbook --ask-become-pass -i hosts_lab.yml all.yml

Error: Jul 11 21:34:04 entechlog-vm-01 schema-registry-start[17870]: Caused by: Failed to write Noop record to kafka store.
Solution: Make sure to remove all old directories or start the services with a new service id

Error: Jul 11 23:55:12 entechlog-vm-01 kafka-server-start[5411]: Caused by: Address already in use
Solution: Update ksql_custom_java_args to use a different port for Prometheus. Running Docker and Jenkins on the same machine may also cause the Address already in use error
  • The Confluent components should be up and running now. You can also see the Prometheus metrics at http://<target-host-name>:29000/
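
As a quick sanity check of the metrics endpoint, the samples it exposes can be counted from the command line. This is a small sketch, assuming the endpoint returns the standard Prometheus text exposition format; count_metric_samples is a hypothetical helper name.

```shell
# Count sample lines in Prometheus text exposition format read from stdin.
# Comment lines (starting with '#', e.g. HELP/TYPE) and blank lines are skipped.
# count_metric_samples is a hypothetical helper for this article only.
count_metric_samples() {
  awk '!/^#/ && NF { n++ } END { print n + 0 }'
}

# Example (not executed here):
# curl -s "http://<target-host-name>:29000/" | count_metric_samples
```

A count of zero would suggest the JMX exporter is reachable but not exposing any metrics yet.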

Install and Configure Prometheus

  • Create Prometheus system user and group
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
  • Create data and config directories for Prometheus
sudo mkdir /var/lib/prometheus
for i in rules rules.d files_sd; do sudo mkdir -p /etc/prometheus/${i}; done
  • Download and Install Prometheus
sudo apt update
sudo apt -y install wget curl vim
mkdir -p /tmp/prometheus && cd /tmp/prometheus
curl -s | grep browser_download_url | grep linux-amd64 | cut -d '"' -f 4 | wget -qi -
tar xvf prometheus*.tar.gz
cd prometheus*/
sudo mv prometheus promtool /usr/local/bin/
  • Check installed version
prometheus --version
promtool --version
  • Create Prometheus configuration file
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
sudo mv consoles/ console_libraries/ /etc/prometheus/
  • Update Prometheus configuration file with scrape_configs for ksqlDB
sudo nano /etc/prometheus/prometheus.yml
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'ksqldb'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['entechlog-vm-01:29000']

If you want to locate the prometheus.yml later on, use the command ps -ef | grep prom | grep yml
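
If several hosts need the same kind of scrape job, the entry can be generated rather than typed by hand. The sketch below assumes one target per job; render_scrape_job is a hypothetical helper name, and its output is meant to be pasted under scrape_configs in prometheus.yml.

```shell
# Print a scrape_configs entry for one job with a single host:port target.
# render_scrape_job is a hypothetical helper for this article only.
render_scrape_job() {
  job_name=$1
  target=$2
  printf "  - job_name: '%s'\n" "$job_name"
  printf "    static_configs:\n"
  printf "      - targets: ['%s']\n" "$target"
}

# Example (not executed here):
# render_scrape_job ksqldb entechlog-vm-01:29000
#
# After editing the file, promtool (installed earlier) can validate it:
# promtool check config /etc/prometheus/prometheus.yml
```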

  • Create Prometheus systemd service unit file

ExecReload=/bin/kill -HUP \$MAINPID
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address= \


  • Update directory permissions
for i in rules rules.d files_sd; do sudo chown -R prometheus:prometheus /etc/prometheus/${i}; done
for i in rules rules.d files_sd; do sudo chmod -R 775 /etc/prometheus/${i}; done
sudo chown -R prometheus:prometheus /var/lib/prometheus/
  • Reload systemd daemon and start the service
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
sudo systemctl status prometheus
  • Prometheus listens on port 9090. Validate by navigating to http://<target-host-name>:9090/
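
The same validation can be done without a browser by calling the /-/healthy endpoint that Prometheus exposes. This is a minimal sketch, assuming curl is installed (it was installed in an earlier step); prometheus_healthy is a hypothetical helper name.

```shell
# Return success if Prometheus answers on its /-/healthy endpoint.
# The host argument is required; the port defaults to 9090.
# prometheus_healthy is a hypothetical helper for this article only.
prometheus_healthy() {
  host=$1
  port=${2:-9090}
  curl -fsS --max-time 5 "http://${host}:${port}/-/healthy" >/dev/null
}

# Example (not executed here):
# prometheus_healthy "<target-host-name>" && echo "Prometheus is up"
```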

Install and Configure Grafana

  • Install and configure Grafana. See here for the instructions.

  • Grafana listens on port 3000. Validate by navigating to http://<target-host-name>:3000/. The default user name and password are admin/admin.

Create Grafana Dashboard

  • Add a new data source and name it Prometheus-ksqlDB.

    • Set URL to http://<target-host-name>:9090/
    • Set HTTP Method to GET
    • Save and test the data source
  • Create a new dashboard and add the required panels and metrics.

  • Once you have added all the required metrics, you can visualize them in the dashboard.
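
The data source step above could also be scripted against Grafana's HTTP API instead of using the UI. The sketch below only builds the JSON payload; grafana_datasource_payload is a hypothetical helper name, and the field values mirror the settings listed above, assuming a default Grafana install with the admin/admin credentials.

```shell
# Print a JSON payload describing a Prometheus data source for Grafana.
# Arguments: data source name, Prometheus URL. Values are assumed not to
# contain characters that would need JSON escaping.
# grafana_datasource_payload is a hypothetical helper for this article only.
grafana_datasource_payload() {
  name=$1
  url=$2
  printf '{"name":"%s","type":"prometheus","access":"proxy","url":"%s","jsonData":{"httpMethod":"GET"}}\n' "$name" "$url"
}

# Example (not executed here), posting the payload to /api/datasources:
# grafana_datasource_payload Prometheus-ksqlDB "http://<target-host-name>:9090/" |
#   curl -s -u admin:admin -H 'Content-Type: application/json' \
#        -X POST "http://<target-host-name>:3000/api/datasources" -d @-
```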


About The Authors
Siva Nadesan

Siva Nadesan is a Principal Data Engineer. His passion includes data and blogging about technologies. He is also the creator and maintainer of