
Friday, 12 February 2016 10:49

Redis accelerates Spark SQL by over 100 times


A new open-source connector package from Redis Labs enables speed improvements of up to 135 times over HDFS and 45 times over Tachyon

Redis Labs, the home of Redis, has integrated Redis with Spark SQL. Benchmarks using time-series data show that running Spark with Redis as the data store results in processing that is 135 times faster than Spark using HDFS, and 45 times faster than Spark using Tachyon as an off-heap data store or Spark storing the data on-heap.

As part of the release, Redis Labs has launched the open-source Spark-Redis connector package, a library for writing to and reading from a Redis cluster. It gives Spark access to all of Redis' data structures (string, hash, list, set, sorted set, bitmap and HyperLogLog) as RDDs. The package also ensures close alignment between the Spark and Redis clusters, reducing network overhead and keeping processing times low.

“Big data is coming of age and customers are demanding that big data insights are extracted in real-time,” said Yiftach Shoolman, co-founder and CTO of Redis Labs. “This is where Redis Labs fills the gap by delivering both the right performance and optimized distributed memory infrastructure to accelerate Spark. Our goal is to make Redis the de-facto data store for any Spark deployment.”

Planned enhancements include applying the combination of Spark and Redis to other popular use cases, such as graph computation and machine learning.

“Apache Spark is becoming a default in-memory engine for high-performance data integration and analytics,” said Matt Aslett, research director, data platforms and analytics at 451 Research. “The combination of Redis and Spark should enable high-performance, real-time analytics with extremely large and variable datasets.”
