JethroData is an index-based SQL engine for Hadoop. It lets you run interactive ad-hoc queries, live dashboards and reports 1,000x faster. With Jethro, you get the scalability of Hadoop and the performance of an analytical database in one system. Jethro works by automatically indexing data as it is written into Hadoop. Queries use the indexes to access only the data they need instead of performing a full scan of the entire dataset, which increases speed and dramatically reduces the use of computing resources.
Simple and Easy
- Implementing Jethro is simple and quick. It is added to any standard Hadoop distribution.
- You can select the tables that you want indexed.
- With storage scale-out and elastic query nodes, Jethro can support the loading of millions of events per minute, fast queries of TB-size tables, and hundreds of concurrent users.
Fully-indexed data
- Jethro’s index structure and algorithms were designed for Big Data.
- Indexes speed up queries in almost all cases. Query elements are answered by an index, sometimes without accessing the data at all. In addition, the index identifies the rows that satisfy the query and limits the actual data read to those rows alone.
- Jethro’s indexes are built in the background and are stored on HDFS.
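To make the idea of index-driven queries concrete, here is a minimal sketch in Python. It is not Jethro's actual index structure or algorithm, which this page does not describe; it assumes a simple in-memory inverted index (value to set of row ids), and the table, column names and the `build_index` helper are all hypothetical. It shows a count answered from the index alone and a sum that reads only the matching rows.

```python
from collections import defaultdict

# Hypothetical toy table: each row is a dict; the list position is its row id.
rows = [
    {"country": "US", "status": "paid", "amount": 120},
    {"country": "DE", "status": "open", "amount": 75},
    {"country": "US", "status": "open", "amount": 40},
    {"country": "FR", "status": "paid", "amount": 310},
    {"country": "US", "status": "paid", "amount": 55},
]


def build_index(table, column):
    """Map each distinct value in `column` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(table):
        index[row[column]].add(row_id)
    return index


# Illustrative assumption: one simple inverted index per filtered column.
country_idx = build_index(rows, "country")
status_idx = build_index(rows, "status")

# WHERE country = 'US' AND status = 'paid': intersect two index entries.
matching_ids = country_idx["US"] & status_idx["paid"]

# COUNT(*) is answered from the index alone; no row is read.
match_count = len(matching_ids)

# SUM(amount) reads only the matching rows, not the whole table.
total_amount = sum(rows[i]["amount"] for i in sorted(matching_ids))
```

In this toy version the filter touches two index entries and two of the five rows. The benefit of the same approach grows with table size, since a full scan would read every row regardless of how few match.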
Standard SQL and BI Integration
- Jethro enables standard SQL access for query, data manipulation and definition, with JDBC and ODBC support.
- Jethro’s support for complex SQL joins is unparalleled in its ability to perform joins on multiple large tables without any restrictions.
Hadoop Distributed File System (HDFS)
- Jethro saves its indexes in HDFS and is compatible with the leading Hadoop distributions.
- Hadoop cluster administration is the same as with any HDFS cluster and requires no additional tools or knowledge.
- Relying on HDFS provides Jethro with scalable storage on commodity hardware and high availability.
Query live data
- Once data is received, the loader processes it in memory. Based on optimization parameters, data will be committed to disk, typically every few seconds or even faster. Once data is committed, it becomes fully accessible to queries.
- Using Jethro you can analyze raw granular data almost immediately.
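The buffer-then-commit behavior described above can be sketched roughly as follows. This is an illustrative toy, not Jethro's loader: the `BufferedLoader` class, its `commit_interval_s` parameter and the injectable `clock` are assumptions made for the example, and a plain Python list stands in for data committed to disk.

```python
import time


class BufferedLoader:
    """Toy loader: rows wait in memory until a commit makes them queryable."""

    def __init__(self, commit_interval_s=2.0, clock=time.monotonic):
        self.commit_interval_s = commit_interval_s  # hypothetical tuning parameter
        self._clock = clock
        self._pending = []    # received but not yet committed
        self._committed = []  # stands in for data persisted to disk
        self._last_commit = clock()

    def receive(self, row):
        """Buffer a row; commit if the interval has elapsed since the last commit."""
        self._pending.append(row)
        if self._clock() - self._last_commit >= self.commit_interval_s:
            self.commit()

    def commit(self):
        """Move all buffered rows to the committed store in one step."""
        self._committed.extend(self._pending)
        self._pending.clear()
        self._last_commit = self._clock()

    def query(self):
        """Queries see committed rows only."""
        return list(self._committed)


# Usage with a fake clock so the example is deterministic.
now = [0.0]
loader = BufferedLoader(commit_interval_s=2.0, clock=lambda: now[0])

loader.receive({"event": "click"})
visible_before = loader.query()  # nothing committed yet

now[0] = 2.5  # more than one commit interval has passed
loader.receive({"event": "view"})
visible_after = loader.query()   # both rows are now visible
```

A shorter interval makes new data visible sooner at the cost of more frequent, smaller commits; that trade-off is the kind of thing the optimization parameters mentioned above would control.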
Highly efficient use of system resources
- Using Jethro results in a 90% reduction in I/O and a 50% reduction in computing nodes and power, achieved through compression and selective processing.
- At the same time, Jethro’s approach doesn’t increase network traffic. Network usage is similar to that of other methods of using Hadoop as a data store.
Our technology is unique. To learn more about it, go to our technology page.
JethroData is in private beta. If you’d like to be one of our first users and see Jethro for yourself, contact us here.
