Aggregation-based comparison in Watch conditions and hard-coded variables

Watch: compare vs. script condition

If a watch condition needs to compare against aggregation buckets, then instead of this:
"condition": {
  "compare": {
    "ctx.payload.aggregations.agg1.buckets.0.doc_count": {
      "gte": 5
    }
  }
}

prefer a script condition like the following. It checks that the bucket list is not empty before reading the first bucket, which avoids "index out of bounds" exceptions:
"condition": {
  "script": {
    "source": "return (ctx.payload.aggregations.agg1.buckets.size() > 0 && ctx.payload.aggregations.agg1.buckets.0.doc_count >= params.threshold)",
    "lang": "painless",
    "params": {
      "threshold": 5
    }
  }
}

Hard-coded variables in scripts

  • Elasticsearch throws a circuit_breaking_exception when it sees more than 15 new dynamic scripts within a minute.
  • This error is sometimes also thrown when the master node goes down abruptly.
  • The limit exists because Elasticsearch has to compile every new (unique) script it sees, and compilation is CPU-intensive.
  • You can change the limit dynamically by setting script.max_compilations_rate to a larger value; its default value is 15/minute.
  • Compiled scripts are then cached, but the cache is bounded: by default it holds up to 100 scripts, and each script can be at most 65,535 bytes. You can configure the size of the cache with the script.cache.max_size setting and raise the per-script size limit with the script.max_size_in_bytes setting, but if scripts are really large then a native script engine should be considered.
  • By default, cached scripts do not have a time-based expiration, but you can change this behaviour by using the script.cache.expire setting.
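
As a rough illustration of the dynamic change mentioned above, the request body below could be sent to the cluster settings API (PUT _cluster/settings). The value 30/1m is only an assumed example, not a recommendation, and "persistent" could be swapped for "transient". The default rate and the way this setting is scoped vary between Elasticsearch versions, so check the documentation for the release you are running before applying it.

```json
{
  "persistent": {
    "script.max_compilations_rate": "30/1m"
  }
}
```

Raising the rate only hides the symptom; if watches keep generating new unique scripts, the parameterization described in the next section is the more durable fix.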

Solution

  • To Elasticsearch, a script with a hard-coded threshold of 5 is a different script from one with a threshold of 10, so each one is compiled separately. If the changing value is passed as a parameter instead, a single script is compiled and kept in the cache, with the parameter values injected at run time.
  • Just to be clear, hard-coded variables are different from hard-coded constants. A value that never changes from one script to the next is a constant and does not need to be parameterized.
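
To make this concrete, here is a sketch of the condition for a second, hypothetical watch that needs a threshold of 10. Its source string is character-for-character the same as in the script condition shown earlier (which used a threshold of 5); only the value under params differs. Assuming the aggregation is again named agg1, both watches should then share one compiled script.

```json
{
  "condition": {
    "script": {
      "source": "return (ctx.payload.aggregations.agg1.buckets.size() > 0 && ctx.payload.aggregations.agg1.buckets.0.doc_count >= params.threshold)",
      "lang": "painless",
      "params": {
        "threshold": 10
      }
    }
  }
}
```

Had the 10 been written directly into the source in place of params.threshold, the source text would differ from the threshold-5 version and would count as a new script to compile.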
