

Securing ElasticSearch / Kafka clusters with SSL

Recent posts

Logstash throws error while installing plugins

While trying to install a Logstash plugin, I got the following error:
$ /work/logstash/logstash-5.5.2/bin/logstash-plugin install logstash-input-cloudwatch
WARNING: A maven settings file already exist at ~/.m2/settings.xml, please review the content to make sure it include your proxies configuration.
Validating logstash-input-cloudwatch
Installing logstash-input-cloudwatch
Error Bundler::InstallError, retrying 1/10
An error occurred while installing logstash-core (5.5.2), and Bundler cannot continue.
Make sure that `gem install logstash-core -v '5.5.2'` succeeds before bundling.
Error Bundler::InstallError, retrying 2/10
An error occurred while installing logstash-core (5.5.2), and Bundler cannot continue.
Make sure that `gem install logstash-core -v '5.5.2'` succeeds before bundling.
Here are the things I did to make it work:
Created a Maven ~/.m2/settings.xml file
<?xml version="1.0" encoding="UTF-8"?> <settings xmlns="http://maven.apache.…

Read and parse CSV containing Key-value pairs using Akka Streams

Let's say we want to read and parse a CSV file containing key-value pairs.
We will use Alpakka's CSV parser for this.

A snippet of the file (src/main/resources/CountryNicCurrencyKeyValueMap.csv), which maps country NIC codes to currency codes with a pipe (|) as the field delimiter:
The code is as follows:
import
import java.nio.charset.StandardCharsets
import
import
import
import{FileIO, Flow, Sink}
import akka.util.ByteString
import scala.collection.immutable
import scala.concurrent.{ExecutionContext, _}
import scala.concurrent.duration._
implicit val system: ActorSystem = ActorSystem("TestApplication")
implicit val ma…
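Setting the streaming machinery aside, the parsing step itself is small. The following is a minimal sketch in plain Python (not the Akka Streams code from this post) that reads pipe-delimited key-value lines into a map. The function name parse_key_value_csv and the sample rows are made up for illustration; the real file contents are not reproduced here.

```python
import csv
import io


def parse_key_value_csv(text, delimiter="|"):
    """Parse delimiter-separated 'key|value' lines into a dict.

    Blank lines and rows that do not have exactly two fields are skipped.
    """
    mapping = {}
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed rows
        key, value = (field.strip() for field in row)
        mapping[key] = value
    return mapping


# Hypothetical sample rows, not taken from the real CSV file.
sample = "XX|AAA\nYY|BBB\n\nmalformed-row\n"
print(parse_key_value_csv(sample))
```

Skipping malformed rows silently is a choice made for this sketch; a real pipeline may prefer to log or fail on them.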

Kafka performance tuning

Performance tuning of Kafka becomes critical as your cluster grows in size. Below are a few points to consider to improve Kafka performance:
Consumer group ID: Never use the exact same consumer group ID for dozens of machines consuming from different topics. All of those commits end up on the same partition of __consumer_offsets, and hence on the same broker, which can in turn cause performance problems. Set the consumer group ID to group_id+topic_name instead.
Skewed: A broker is skewed if its number of partitions is greater than the average number of partitions per broker for the given topic. Example: 2 brokers share 4 partitions; if one of them has 3 partitions, it is skewed (3 > 2). Try to make sure that none of the brokers is skewed.
Spread: Broker spread is the percentage of brokers in the cluster that have partitions for the given topic. Example: 3 brokers share a topic that has 2 partitions, so 66% of the brokers have partitions for this topic. Try to achieve 100% broker spread.
Leader skew…
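The consumer group, skew, and spread points above can be sketched in a few lines of Python. This is an illustration, not Kafka code. The partition formula (absolute value of the Java string hash of the group ID, modulo the partition count of __consumer_offsets, 50 by default) is my understanding of how the broker picks the partition and should be treated as an assumption. The helper names and the underscore separator in the group ID are hypothetical.

```python
def java_string_hash(s):
    """Java's String.hashCode() as a signed 32-bit int (assumes BMP characters)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition(group_id, num_partitions=50):
    """Partition of __consumer_offsets assumed to store commits for group_id.

    Assumption: abs(hashCode(group_id)) % num_partitions, with 50 partitions.
    """
    h = java_string_hash(group_id)
    h = 0 if h == -0x80000000 else abs(h)
    return h % num_partitions


def consumer_group_id(base_group_id, topic_name):
    """Build a per-topic group ID (the separator is an arbitrary choice)."""
    return f"{base_group_id}_{topic_name}"


def skewed_brokers(partitions_per_broker):
    """Return brokers holding more partitions than the per-broker average."""
    average = sum(partitions_per_broker.values()) / len(partitions_per_broker)
    return sorted(b for b, n in partitions_per_broker.items() if n > average)


def broker_spread(partitions_per_broker):
    """Percentage of brokers that hold at least one partition of the topic."""
    with_partitions = sum(1 for n in partitions_per_broker.values() if n > 0)
    return 100.0 * with_partitions / len(partitions_per_broker)


# The two examples from the text; broker names are placeholders.
print(skewed_brokers({"broker-1": 3, "broker-2": 1}))                      # ['broker-1']
print(int(broker_spread({"broker-1": 1, "broker-2": 1, "broker-3": 0})))  # 66
```

Because the partition depends only on the group ID, every consumer sharing one ID commits to the same partition regardless of topic, which is why per-topic group IDs help distribute the commit load.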

Migrating ElasticSearch 2.x to ElasticSearch 5.x

In my previous blog post, I described how to install and configure an ElasticSearch 5.x cluster.
In this blog post, we will look at how to migrate data.
Consult this table to verify that rolling upgrades are supported for your version of Elasticsearch.

Full cluster upgrade (2.x to 5.x)
Going from 2.x to 5.x requires a full cluster upgrade and restart.

Install the Elasticsearch Migration Helper on the old cluster. This plugin helps you check whether you can upgrade directly to the next major version of Elasticsearch, or whether you need to make changes to your data and cluster first.
cd /work/elk/elasticsearch-2.4.3/
curl -O -L
./bin/plugin install file:///work/elk/elasticsearch-2.4.3/
Start the old ElasticSearch:
./bin/elasticsearch &
Browse elasticsearch-migration
Click on "Cluster Checkup" > "Run checks now". Check all the suggest…

ElasticSearch max file descriptors too low error

ElasticSearch 5.x requires at least 65536 max file descriptors and at least 262144 max virtual memory areas (vm.max_map_count).
It fails its bootstrap checks on start-up if either value is lower.
ERROR: bootstrap checks failed
max file descriptors [16384] for elasticsearch process is too low, increase to at least [65536]
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
Check current values using:
$ cat /proc/sys/fs/file-max
16384
$ cat /proc/sys/vm/max_map_count
65530
$ ulimit -Hn
16384
$ ulimit -Sn
4096
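The same comparison can be scripted. The following is a rough sketch in Python, not ElasticSearch's own check: it compares a file descriptor limit and vm.max_map_count against the two minimums from the error message. It reads the hard open-file limit, on the assumption that this is the value ElasticSearch reported (the 16384 in the error matches ulimit -Hn, not ulimit -Sn). The function names are made up for this example, and read_current_limits only works on Linux.

```python
import resource

MIN_NOFILE = 65536          # minimum max file descriptors
MIN_MAX_MAP_COUNT = 262144  # minimum vm.max_map_count


def check_limits(nofile, max_map_count):
    """Return one message per limit that is below the required minimum."""
    problems = []
    if nofile < MIN_NOFILE:
        problems.append(
            f"max file descriptors [{nofile}] is too low, "
            f"increase to at least [{MIN_NOFILE}]"
        )
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(
            f"vm.max_map_count [{max_map_count}] is too low, "
            f"increase to at least [{MIN_MAX_MAP_COUNT}]"
        )
    return problems


def read_current_limits():
    """Read this process's hard open-file limit and vm.max_map_count (Linux only)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        hard = float("inf")  # unlimited always passes the check
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read().strip())
    return hard, max_map_count


# Values reported in this post; use check_limits(*read_current_limits())
# to test the machine the script runs on instead.
for problem in check_limits(16384, 65530):
    print(problem)
```

Run it as the user that starts ElasticSearch, since limits from /etc/security/limits.d apply per user.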
To fix this, add or change the following settings:
Recommended: Add a new file 99-elastic.conf under /etc/security/limits.d with the following settings:
elasticsearch - nofile 800000
elasticsearch - nproc 16384
defaultusername - nofile 800000
defaultusername - nproc 16384
Also edit /etc/sysctl.conf with the following settings, because vm.max_map_count is a kernel parameter that limits.d cannot set (run sysctl -p afterwards to apply them):
fs.file-max = 800000
vm.max_map_count=300000

ElasticSearch Curator

Curator is a tool from Elastic to help manage your ElasticSearch cluster.
For certain logs/data, we use one ElasticSearch index per year/month/day and might keep a rolling 7-day window of history.
This means that every day we need to create, back up, and delete some indices.
Curator helps make this process automated and repeatable.
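To make the rolling window concrete, here is a small Python sketch of the selection logic: given a list of daily index names, it picks the ones that have fallen out of a 7-day window. This is not how Curator is implemented (Curator expresses this through filters in its action file). The "logs-" prefix, the %Y.%m.%d date suffix, and the function name are assumptions made for the example.

```python
from datetime import date, datetime, timedelta


def indices_to_delete(index_names, today, keep_days=7,
                      prefix="logs-", date_format="%Y.%m.%d"):
    """Return daily indices whose date falls outside the rolling window.

    The window keeps today and the previous keep_days - 1 days.
    Names that do not match prefix + date are never selected.
    """
    cutoff = today - timedelta(days=keep_days - 1)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            index_day = datetime.strptime(name[len(prefix):], date_format).date()
        except ValueError:
            continue  # not a dated index, leave it alone
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


# Hypothetical index names for illustration.
names = ["logs-2017.09.10", "logs-2017.09.04", "logs-2017.09.03",
         "logs-2017.08.01", ".kibana"]
print(indices_to_delete(names, today=date(2017, 9, 10)))
# ['logs-2017.08.01', 'logs-2017.09.03']
```

Passing today in explicitly keeps the function deterministic, which makes a dry run easy to reproduce.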

Installation
Curator is written in Python, so you will need pip to install it:
pip install elasticsearch-curator
curator --config ./curator_cluster_config.yml curator_actions.yml --dry-run
Configuration
Create a file curator_cluster_config.yml with the following contents:
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
client:
  hosts:
    - ""
  port: 9200
  url_prefix:
  use_ssl: True
  # The certificate file is the CA certificate used to sign all ES node certificates.
  # Use same CA certificate to generate and sign the certificate running curator (specif…