The duo, together with other open source contributors, has built a new open source, Apache-licensed, REST-based Spark service called Livy, which is still in early alpha development.
Livy allows applications to easily interface with Spark, with its client API simplifying the process of submitting jobs and systematically retrieving results. At its core, Livy is a REST server for submitting, running, and managing Spark jobs and contexts. Clients can consume Spark like a multi-tenant service, where each customer is given the ability to customize some parts of the application.
“Microsoft is focused on simplifying big data and advanced analytics to make technologies like Apache Hadoop and Spark available for everybody,” said Tiffany Wissner, senior director of Data Platform Marketing at Microsoft.
“The collaboration on Project Livy was able to make interacting with Spark easier for developers through a REST web service and able to make Spark enterprise-ready as a robust back-end for running interactive notebooks.”
Key benefits of Livy include:
- Reduced Friction in Spark Consumption – Each client of Spark need not go through a Spark installation or configuration process to get started. Only a lightweight client that talks to an HTTP endpoint is needed.
- Enabling Third-Party Applications to Use Spark – Applications can build with REST-based client APIs in Java, Scala and Python for fine-grained Spark job submission, result retrieval and management of SparkContexts (the Scala and Python client APIs are under development). Spark can be invoked by applications written in diverse frameworks like Django for Python, Play for Scala or Java. Moreover, because it is REST-based, with a little work, you can also leverage Livy from applications written in languages like Node.js or Go.
- Enabling New Architectures – Livy makes it easy to integrate Spark into service-oriented or microservices-based architectures, which primarily interact through REST.
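To make the "lightweight client that talks to an HTTP endpoint" idea concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not Livy's official client: the server address, the `/sessions` and `/statements` paths, the JSON fields, and the session and statement IDs are all assumed for the example, and the alpha API may differ.

```python
import json
import urllib.request

# Assumption: a Livy server is reachable here; host and port are illustrative.
LIVY_URL = "http://localhost:8998"


def build_request(path, payload=None):
    """Build (but do not send) an HTTP request for a Livy-style REST endpoint."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        LIVY_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        # POST when there is a JSON body to submit, GET when only reading state.
        method="POST" if payload is not None else "GET",
    )


def send(request):
    """Send a prepared request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed endpoint shapes: create a session, submit code to it, poll for the result.
# Session id 0 and statement id 0 are hypothetical placeholders.
create_session = build_request("/sessions", {"kind": "spark"})
run_code = build_request("/sessions/0/statements", {"code": "sc.parallelize(1 to 10).sum()"})
fetch_result = build_request("/sessions/0/statements/0")
```

With a server running, `send(create_session)` would submit the first request. Nothing here requires a local Spark installation, and the same three HTTP calls could be issued from Node.js, Go, or any other language with an HTTP client.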
“Spark gives you fast big data processing with a general purpose flexible API. We see a natural tendency among our customers and partners to want to leverage Spark’s capabilities from client applications that can easily interface with Spark, and Livy makes that possible,” said Anand Iyer, senior product manager at Cloudera.
“Livy will open Spark to new use cases, and we are hoping it attracts a community of developers that will not only build applications on top of Livy, but also contribute to it, help shape its API and enhance its functionality.”