Curator is a tool from Elastic to help manage your Elasticsearch cluster.
For certain logs/data, we use one Elasticsearch index per year/month/day and might keep a rolling 7-day window of history.
This means that every day we need to create, back up, and delete some indices.
Curator helps make this process automated and repeatable.
Installation
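Curator is written in Python, so you will need pip to install it:
pip install elasticsearch-curator
Once installed, you run it with a cluster configuration file and an action file (both described below):
curator --config ./curator_cluster_config.yml curator_actions.yml --dry-run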
Configuration
Create a file curator_cluster_config.yml with the following contents:
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
client:
  hosts:
    - "es_coordinating_01.singhaiuklimited.com"
  port: 9200
  url_prefix:
  use_ssl: True
  # The certificate file is the CA certificate used to sign all ES node certificates.
  # Use same CA certificate to generate and sign the certificate running curator (specified in properties client_cert and client_key)
  certificate: '/work/elk/elasticsearch-6.3.2/config/x-pack/certificate-bundle/ca/ca.crt'
  client_cert: '/work/elk/elasticsearch-6.3.2/config/x-pack/certificate-bundle/myhostname/myhostname.crt'
  client_key: '/work/elk/elasticsearch-6.3.2/config/x-pack/certificate-bundle/myhostname/myhostname.key'
  ssl_no_validate: False
  # Username password to connect to ES using basic auth
  http_auth: "username:password"
  timeout: 30
  master_only: False
logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
A non-SSL cluster configuration is much simpler:
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
client:
  hosts:
    - "es_coordinating_01.singhaiuklimited.com"
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth: "username:password"
logging:
  loglevel: WARNING
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
Now we need to define an action, i.e. what Curator will do. There are many actions to choose from; check the documentation for more information.
Removing time-series indices
Elasticsearch is a great choice for storing time-series data for a number of reasons. Index templates automatically create indices, and aliases allow you to search seamlessly across many indices. However, Elasticsearch doesn’t provide automatic removal of data.
As an example, we will delete all .watcher-history-* and .monitoring-* indices that are older than 3 days. We will use Delete Indices as the action.
Our indices are named with a YYYY.MM.DD suffix, so we have to tell Curator about our date format and which indices to remove.
Below is the sample action file delete3DaysOldUselessIndices.yml, which will delete the watcher and monitoring indices that are older than 3 days:
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
actions:
  1:
    action: delete_indices
    description: >-
      "Delete indices older than 3 days (based on index name), for .watcher-history-
      or .monitoring-es-6- or .monitoring-kibana-6- or .monitoring-logstash-6-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly."
    options:
      timeout_override: 300
      continue_if_exception: True
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^\.(monitoring-es-6-|monitoring-kibana-6-|monitoring-logstash-6-|watcher-history-6-).*$'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 3
To run this action, simply use the command:
curator --config ./curator_cluster_config.yml ./delete3DaysOldUselessIndices.yml --dry-run
2017-10-04 12:15:38,544 INFO Preparing Action ID: 1, "delete_indices"
2017-10-04 12:15:38,900 INFO Trying Action ID: 1, "delete_indices": "Delete indices older than 1 day (based on index name), for .watcher-history- or .monitoring-es-6- or .monitoring-kibana-6- or .monitoring-logstash-6- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly."
2017-10-04 12:15:39,351 INFO DRY-RUN MODE. No changes will be made.
2017-10-04 12:15:39,351 INFO (CLOSED) indices may be shown that may not be acted on by action "delete_indices".
2017-10-04 12:15:39,351 INFO Action ID: 1, "delete_indices" completed.
2017-10-04 12:15:39,352 INFO Job completed.
The --dry-run mode will not actually delete any index. It can be used to test the output of the action before running it for real.
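To make the two filters in the action file more concrete, here is a minimal Python sketch of the selection they describe: keep only names that match the regex, read the date out of the name using the '%Y.%m.%d' timestring, and select the index if that date is more than unit_count days in the past. This is only an illustration of the idea, not Curator's own implementation. The function name select_old_indices and the DATE_IN_NAME helper are made up for this example, and Curator's exact handling of the boundary day may differ.

```python
import re
from datetime import datetime, timedelta

# Same regex and timestring as in the action file.
PATTERN = re.compile(
    r'^\.(monitoring-es-6-|monitoring-kibana-6-|monitoring-logstash-6-|watcher-history-6-).*$'
)
TIMESTRING = '%Y.%m.%d'
# Hypothetical helper: locates the YYYY.MM.DD part inside an index name.
DATE_IN_NAME = re.compile(r'\d{4}\.\d{2}\.\d{2}')


def select_old_indices(names, now, unit_count=3):
    """Return names matching PATTERN whose date in the name is older than unit_count days."""
    cutoff = now - timedelta(days=unit_count)
    selected = []
    for name in names:
        if not PATTERN.match(name):
            continue  # pattern filter: not one of our prefixes
        found = DATE_IN_NAME.search(name)
        if found is None:
            continue  # no date in the name, so its age cannot be determined
        index_date = datetime.strptime(found.group(0), TIMESTRING)  # midnight of that day
        if index_date < cutoff:
            selected.append(name)  # age filter: older than the cutoff
    return selected


if __name__ == '__main__':
    sample = [
        '.monitoring-es-6-2017.10.03',
        '.monitoring-es-6-2017.09.30',
        '.watcher-history-6-2017.09.28',
        'logs-2017.09.01',
    ]
    for index in select_old_indices(sample, datetime(2017, 10, 4, 12, 0)):
        print('would delete', index)
```

Note that logs-2017.09.01 is never selected, however old it is, because it does not match the pattern filter. Both filters have to agree before an index is acted on.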
Managing snapshots
Another task that Curator helps us automate is taking Elasticsearch snapshots.
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
actions:
  1:
    action: snapshot
    description: >-
      Snapshot selected indices to 'repository' with the snapshot name or name
      pattern in 'name'. Use all other options as assigned
    options:
      repository: <repository name>
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
    - filtertype: ...
This will create a snapshot of the selected indices (all of them, in our case) with a name such as curator-20170928193030, which works fine for our use case (of course you can customize the name and date format).
You’ll also want to remove snapshots after a certain time; otherwise snapshot performance will degrade dramatically as the number of snapshots grows.
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
actions:
  1:
    action: delete_snapshots
    description: "Delete selected snapshots from 'repository'"
    options:
      repository: <repository name>
      retry_interval: 120
      retry_count: 3
    filters:
    - filtertype: ...
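The filters in the action file above are left elided, so the retention rule is up to you. As a rough illustration of one possible rule, the Python sketch below picks snapshots whose default curator-%Y%m%d%H%M%S name carries a timestamp older than a given number of days. The function name snapshots_to_delete and the keep_days parameter are hypothetical, introduced only for this example. It is a sketch of the idea, not what Curator does internally.

```python
from datetime import datetime, timedelta

SNAPSHOT_PREFIX = 'curator-'
SNAPSHOT_TIMESTRING = '%Y%m%d%H%M%S'  # the default 'curator-%Y%m%d%H%M%S' naming


def snapshots_to_delete(snapshot_names, now, keep_days):
    """Return snapshot names whose embedded timestamp is older than keep_days.

    keep_days is an assumed retention period; choose a value that fits your needs.
    """
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in snapshot_names:
        if not name.startswith(SNAPSHOT_PREFIX):
            continue  # not created with the default naming scheme
        stamp = name[len(SNAPSHOT_PREFIX):]
        try:
            taken_at = datetime.strptime(stamp, SNAPSHOT_TIMESTRING)
        except ValueError:
            continue  # the name does not carry a parsable timestamp
        if taken_at < cutoff:
            expired.append(name)
    return expired


if __name__ == '__main__':
    names = ['curator-20170928193030', 'curator-20171003010000', 'manual-backup']
    print(snapshots_to_delete(names, datetime(2017, 10, 4, 12, 0, 0), keep_days=3))
```

Snapshots that do not follow the default naming scheme are skipped rather than deleted, which is the safer default when a name cannot be interpreted.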
Changing Index Settings
For indices that aren’t being actively written to, you can make them read-only. With date-based indices, only the current index is being written to, so it is safe to make older indices read-only.
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
actions:
  1:
    action: index_settings
    description: >-
      Set Monitoring and watcher indices older than 1 day to be read only (block writes)
    options:
      disable_action: False
      index_settings:
        index:
          blocks:
            write: True
      ignore_unavailable: False
      preserve_existing: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^\.(monitoring-es-6-|monitoring-kibana-6-|monitoring-logstash-6-|watcher-history-6-).*$'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 1
To run this action, simply use the command:
curator --config ./curator_cluster_config.yml readOnly1DayOldUselessIndices.yml --dry-run
2017-10-03 16:39:10,237 INFO Preparing Action ID: 1, "index_settings"
2017-10-03 16:39:10,602 INFO Trying Action ID: 1, "index_settings": Set Monitoring ES indices older than 1 day to be read only (block writes)
2017-10-03 16:39:11,075 INFO DRY-RUN MODE. No changes will be made.
2017-10-03 16:39:11,075 INFO (CLOSED) indices may be shown that may not be acted on by action "indexsettings".
2017-10-03 16:39:11,075 INFO DRY-RUN: indexsettings: .monitoring-es-6-2017.10.02 with arguments: {'index': {'blocks': {'write': True}}}
2017-10-03 16:39:11,075 INFO DRY-RUN: indexsettings: .monitoring-kibana-6-2017.10.02 with arguments: {'index': {'blocks': {'write': True}}}
2017-10-03 16:39:11,075 INFO Action ID: 1, "index_settings" completed.
2017-10-03 16:39:11,075 INFO Job completed.
Shrinking static indices
For indices that aren’t being actively written to, you can shrink them to reduce the number of primary shards, merging the shards/segments that represent the index’s data on disk. Shrinking an index is a similar concept to defragmenting your hard drive. Indices can only be shrunk if they satisfy the following requirements:
- The source index must be marked as read-only
- A (primary or replica) copy of every shard in the index must be relocated to the same node
- The cluster health must be green
- The target index must not exist
- The source index must have more primary shards than the target index.
- The number of primary shards in the target index must be a factor of the number of primary shards in the source index.
- The index must not contain more than 2,147,483,519 documents in total across all shards that will be shrunk into a single shard on the target index, as this is the maximum number of docs that can fit into a single shard.
- The node handling the shrink process must have sufficient free disk space to accommodate a second copy of the existing index.
When an index is being written to, the segment merge process happens automatically, so you don’t want to explicitly call shrink on an active index. With date-based indices, only the current index is being written to, so it is safe to shrink older indices.
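The purely numeric requirements in the list above (shard counts and the per-shard document limit) can be checked before attempting a shrink. The Python sketch below shows one way to do that. The function name shrink_problems and its inputs are hypothetical. It assumes you already know the document count of each primary shard, and it does not talk to the cluster, so the read-only, health, placement, and disk-space requirements are out of scope here.

```python
MAX_DOCS_PER_SHARD = 2_147_483_519  # limit quoted in the requirements list


def shrink_problems(source_shard_docs, target_shards):
    """Return a list of reasons why a shrink would be rejected (empty if none found).

    source_shard_docs: document count of each primary shard of the source index.
    target_shards: number of primary shards wanted in the target index.
    """
    source_shards = len(source_shard_docs)
    if target_shards < 1:
        return ['target index needs at least one primary shard']

    problems = []
    if target_shards >= source_shards:
        problems.append('source index must have more primary shards than the target')
    if source_shards % target_shards != 0:
        problems.append('target shard count must be a factor of the source shard count')
        return problems  # the grouping below is undefined without a whole factor

    # Each target shard receives this many source shards.
    per_target = source_shards // target_shards
    # Conservative bound: assume the largest source shards end up together.
    worst_case = sum(sorted(source_shard_docs, reverse=True)[:per_target])
    if worst_case > MAX_DOCS_PER_SHARD:
        problems.append('too many documents could end up in a single target shard')
    return problems


if __name__ == '__main__':
    print(shrink_problems([100] * 6, 1))  # six small shards into one
    print(shrink_problems([100] * 6, 4))  # 4 is not a factor of 6
```

The document check is deliberately pessimistic: it adds up the largest source shards rather than guessing which ones are merged together, so it may flag a shrink that would in fact succeed, but it should not miss one that would fail on this limit.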
The action file below shrinks the monitoring and watcher-history indices older than 1 day, marks the shrunk indices read-only, and adds them to aliases:
---
# Remember, leave a key empty if there is no value. None will be a string, not a Python "NoneType"
actions:
  1:
    action: shrink
    description: >-
      Shrink monitoring and watcher-history indices older than 1 day on the node
      with the most available space. Delete source index after successful shrink,
      then reroute the shrunk index with the provided parameters.
    options:
      ignore_empty_list: True
      # The shrinking will take place on the node identified by shrink_node,
      # unless DETERMINISTIC is specified, in which case Curator will evaluate
      # all of the nodes to determine which one has the most free space.
      # If multiple indices are identified for shrinking by the filter block,
      # and DETERMINISTIC is specified, the node selection process will be
      # repeated for each successive index, preventing all of the space being
      # consumed on a single node.
      shrink_node: DETERMINISTIC
      node_filters:
        # If you have a small cluster with only master/data nodes, you must set
        # permit_masters to True in order to select one of those nodes as a
        # potential shrink_node.
        permit_masters: False
        # exclude_nodes: ['some_named_node']
      # The resulting index will have number_of_shards primary shards, and
      # number_of_replicas replica shards
      number_of_shards: 1
      number_of_replicas: 1
      # Name of target index will be shrink_prefix + the source index name + shrink_suffix
      shrink_prefix:
      shrink_suffix: '-shrink'
      # By default, Curator will delete the source index after a successful shrink.
      # This can be disabled by setting delete_after to False.
      # If the source index is not deleted after a successful shrink, Curator will
      # remove the read-only setting and the shard allocation routing applied to the
      # source index to put it on the shrink node.
      # Curator will wait for the shards to stop rerouting before continuing.
      delete_after: True
      # The post_allocation option applies to the target index after the shrink is complete.
      # If set, this shard allocation routing will be applied (after a successful shrink) and
      # Curator will wait for all shards to stop rerouting before continuing.
      # post_allocation:
      #   allocation_type: include
      #   Following will allocate shards to nodes that have "node_tag" attribute with value "cold"
      #   key: node_tag
      #   value: cold
      wait_for_active_shards: 1
      # The only extra_settings which are acceptable are settings and aliases.
      # Please note that in the example above, while best_compression is being
      # applied to the new index, it will not take effect until new writes are
      # made to the index, such as when force-merging the shard to a single segment.
      extra_settings:
        settings:
          index.codec: best_compression
      wait_for_completion: True
      wait_interval: 9
      max_wait: -1
    filters:
    - filtertype: pattern
      kind: regex
      value: '^\.(monitoring-es-6-|monitoring-kibana-6-|monitoring-logstash-6-|watcher-history-6-).*$'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 1
  2:
    action: index_settings
    description: >-
      Set monitoring and watcher-history indices older than 1 day to be read only (block writes)
    options:
      disable_action: False
      index_settings:
        index:
          blocks:
            write: True
      ignore_unavailable: False
      preserve_existing: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^\.(monitoring-es-6-|monitoring-kibana-6-|monitoring-logstash-6-|watcher-history-6-).*$'
    - filtertype: pattern
      kind: suffix
      value: -shrink
  3:
    action: alias
    description: "Add/Remove selected indices to or from the .monitoring-es alias"
    options:
      name: monitoring-es
    add:
      filters:
      - filtertype: pattern
        kind: regex
        value: '^\.monitoring-es.*$'
      - filtertype: pattern
        kind: suffix
        value: -shrink
  4:
    action: alias
    description: "Add/Remove selected indices to or from the .monitoring-kibana alias"
    options:
      name: monitoring-kibana
    add:
      filters:
      - filtertype: pattern
        kind: regex
        value: '^\.monitoring-kibana.*$'
      - filtertype: pattern
        kind: suffix
        value: -shrink
  5:
    action: alias
    description: "Add/Remove selected indices to or from the .monitoring-logstash alias"
    options:
      name: monitoring-logstash
    add:
      filters:
      - filtertype: pattern
        kind: regex
        value: '^\.monitoring-logstash.*$'
      - filtertype: pattern
        kind: suffix
        value: -shrink
  6:
    action: alias
    description: "Add/Remove selected indices to or from the .watcher-history alias"
    options:
      name: watcher-history
    add:
      filters:
      - filtertype: pattern
        kind: regex
        value: '^\.watcher-history.*$'
      - filtertype: pattern
        kind: suffix
        value: -shrink
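Two naming conventions tie the six actions above together: the target index is named shrink_prefix + source index name + shrink_suffix, and actions 2 to 6 then pick up only indices that end in -shrink. The short Python sketch below traces a source index name through both steps. The function names shrunk_index_name and alias_for are made up for this illustration. The regexes and alias names are copied from the action file, and nothing here calls Curator or Elasticsearch.

```python
import re

# (regex, alias) pairs taken from alias actions 3 to 6 in the action file.
ALIAS_RULES = [
    (re.compile(r'^\.monitoring-es.*$'), 'monitoring-es'),
    (re.compile(r'^\.monitoring-kibana.*$'), 'monitoring-kibana'),
    (re.compile(r'^\.monitoring-logstash.*$'), 'monitoring-logstash'),
    (re.compile(r'^\.watcher-history.*$'), 'watcher-history'),
]


def shrunk_index_name(source_name, shrink_prefix='', shrink_suffix='-shrink'):
    """Target index name: shrink_prefix + source index name + shrink_suffix."""
    return f'{shrink_prefix}{source_name}{shrink_suffix}'


def alias_for(index_name, shrink_suffix='-shrink'):
    """Return the alias an index would be added to, or None if no alias action selects it."""
    if not index_name.endswith(shrink_suffix):
        return None  # the suffix filter excludes indices that were not shrunk
    for pattern, alias in ALIAS_RULES:
        if pattern.match(index_name):
            return alias
    return None


if __name__ == '__main__':
    source = '.monitoring-es-6-2017.10.02'
    target = shrunk_index_name(source)
    print(target, '->', alias_for(target))
```

Because the date stays in the middle of the shrunk name, name-based age filters with the '%Y.%m.%d' timestring can still be applied to the shrunk indices later, for example when deleting them.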
Scheduling Curator Jobs
If you want to schedule it with cron, you can do so using crontab -e:
0 6 * * * curator --config ./curator_cluster_config.yml ./delete3DaysOldUselessIndices.yml
The above entry will clean up the indices older than 3 days every day at 6 AM. Cron does not run jobs from your current directory, so in practice use absolute paths for the curator executable and both files.