Oath Inc., the parent company of Yahoo and AOL, recently announced that it has open sourced Bullet, a real-time query engine for large and complex data sources.
Bullet differs from other data query engines. It is a forward-looking query system with no persistence layer, which keeps it lightweight. Because a query is submitted first and then runs against the data that arrives afterward, there is no need to repeat the same query again and again.
From the documentation:
Bullet:
- Is a real-time query engine that lets you run queries on very large data streams
- Does not use a persistence layer. This makes it light-weight, cheap and fast
- Is a look-forward query system. Queries are submitted first and they operate on data that arrive after the query is submitted
- Supports rich queries for filtering and getting Raw data, Counting Distincts, Distincts, Grouping (Sum, Count, Min, Max, Avg), Distributions, and Top K
- Is multi-tenant and can scale for more queries and/or for more data
- Provides a UI and Web Service that are also pluggable for a full end-to-end solution to your querying needs
- Has an implementation on Storm currently. There are plans to implement it on other Stream Processors.
- Is pluggable. Any data source that can be read from Storm can be converted into a standard data container letting you query that data. Data is typed.
- Is used at scale and in production at Yahoo, running 500+ queries simultaneously on 200,000 rps (records per second), and has been tested up to 2,000,000 rps
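To make the look-forward idea from the list above concrete, here is a minimal Python sketch. It is not Bullet's API: the `StreamEngine` and `LookForwardQuery` names, the record fields, and the single filter-plus-group-count query type are all assumptions made up for illustration. It only shows the general pattern of registering a query first, feeding it records as they arrive, and never storing the records themselves.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LookForwardQuery:
    """Hypothetical query: a filter plus a running count per group key."""
    predicate: Callable[[dict], bool]
    group_by: str
    counts: Counter = field(default_factory=Counter)

    def consume(self, record: dict) -> None:
        # Update the running aggregate; the record itself is not kept.
        if self.predicate(record):
            self.counts[record[self.group_by]] += 1


class StreamEngine:
    """Toy engine with no persistence layer (illustrative only)."""

    def __init__(self) -> None:
        self.active: List[LookForwardQuery] = []

    def submit(self, query: LookForwardQuery) -> LookForwardQuery:
        # A query only sees records that arrive after this call.
        self.active.append(query)
        return query

    def on_record(self, record: dict) -> None:
        # Each incoming record is offered to every active query, then dropped.
        for query in self.active:
            query.consume(record)


engine = StreamEngine()

# This record arrives before any query exists, so no query ever sees it.
engine.on_record({"page": "home", "country": "US"})

q = engine.submit(LookForwardQuery(lambda r: r["country"] == "US", "page"))
engine.on_record({"page": "home", "country": "US"})
engine.on_record({"page": "mail", "country": "US"})
engine.on_record({"page": "home", "country": "DE"})  # filtered out
engine.on_record({"page": "home", "country": "US"})

top = q.counts.most_common(1)  # a tiny "Top K" with K = 1
print(top)
```

Because nothing is stored, memory use in this sketch depends on the number of active queries and the size of their aggregates rather than on the volume of data that has flowed past, which is the trade-off the "no persistence layer" point describes.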
Bullet is now available on GitHub.