

Showing posts from July, 2016

ElasticSearch pipeline bucket selector aggregation

ElasticSearch supports bucket selection on the buckets produced by an aggregation.
It works as a pipeline: a first aggregation generates the buckets, and a bucket selector then filters out the buckets that do not satisfy a condition.
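
As a rough illustration of the pipeline idea, here is a minimal sketch in Python that only builds a search body; it does not talk to a cluster. The aggregation names 'by_report' and 'enough_versions', the helper function, and the threshold are made up for this example. The script uses the 'params.' form of recent ElasticSearch versions; older releases may expect a different script syntax.

```python
import json


def build_bucket_selector_query(min_versions):
    """Build a search body that keeps only buckets with enough documents.

    Hypothetical helper for illustration; the aggregation names are assumptions.
    """
    return {
        "size": 0,  # we only care about the aggregation result, not the hits
        "aggs": {
            # Step 1: the parent aggregation generates one bucket per reportId.
            "by_report": {
                "terms": {"field": "reportId"},
                "aggs": {
                    # Step 2: the bucket selector drops buckets whose
                    # document count is below the threshold.
                    "enough_versions": {
                        "bucket_selector": {
                            # "_count" is the built-in path for a bucket's doc count.
                            "buckets_path": {"versions": "_count"},
                            "script": "params.versions >= {}".format(min_versions),
                        }
                    }
                },
            }
        },
    }


query = build_bucket_selector_query(2)
print(json.dumps(query, indent=2))
```

The selector is nested inside the aggregation whose buckets it filters, which is why it appears under the 'aggs' of 'by_report' rather than at the top level.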

We have an ElasticSearch index 'daily_reports', where each row (document) represents a particular version of a report.
When a report is created, a new row is inserted into the index with a new 'reportId' field value and a 'publishDate' field holding the UNIX timestamp.
Each report/row has several other fields representing properties of the report, e.g., 'title', 'activity', 'reportStatus', 'reportLevel', etc.
When the report is edited/deleted, a new row is inserted into the index with the same 'reportId' but a different '_id', 'publishDate', 'reportLevel', etc.
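
To make the versioning model concrete, here is a small in-memory sketch in Python. It does not use ElasticSearch at all; the list 'index' stands in for 'daily_reports', and the helper 'new_row', the '_id' format, the timestamps, the title and the 'Review' level are all invented for illustration.

```python
import itertools

# Hypothetical generator of unique document ids (stands in for ElasticSearch's _id).
_id_counter = itertools.count(1)


def new_row(report_id, publish_date, **fields):
    """Create one row, i.e. one version of a report, with a fresh _id."""
    row = {
        "_id": "doc-{}".format(next(_id_counter)),
        "reportId": report_id,
        "publishDate": publish_date,  # UNIX timestamp
    }
    row.update(fields)
    return row


index = []  # stands in for the 'daily_reports' index

# Creating a report inserts a row with a new reportId.
index.append(new_row(101, 1467331200, title="Site visit",
                     reportStatus=1, reportLevel="Monitoring"))

# Editing the report never updates the old row; it inserts another row
# with the same reportId but a new _id and publishDate.
index.append(new_row(101, 1467417600, title="Site visit",
                     reportStatus=1, reportLevel="Review"))

for row in index:
    print(row["_id"], row["reportId"], row["publishDate"], row["reportLevel"])
```

Because old rows are never overwritten, the index holds the full history of every report, and any query for the current state has to pick one row per 'reportId'.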

Now, if a user wants to get the latest version of each report matching a particular filter criterion (reportLevel = Monitoring AND reportStatus = 1), we can get …