Tom Phelan

Complex Stateful Applications on Kubernetes: KubeDirector version 0.2

September 9, 2019

Last summer, I wrote here about our BlueK8s initiative and a new open source project for deploying and managing complex stateful scale-out applications on Kubernetes: KubeDirector. KubeDirector enables data scientists familiar with data-intensive distributed applications such as Hadoop, Spark, Cassandra, TensorFlow, Caffe2, etc. to easily run these applications on Kubernetes.

In my blog post on the Kubernetes site in the fall, I introduced version 0.1 of KubeDirector and described how it works. Since then, we’ve seen a lot of interest in KubeDirector from the community, and we’re very excited about the progress so far. The BlueData team behind this effort is now part of HPE, and the KubeDirector project continues to move full steam ahead.

To that end, we just pushed out the next release and our first public update of KubeDirector: version 0.2. You can check out the full details on our GitHub site here: https://github.com/bluek8s/kubedirector/releases/tag/v0.2.0

Some of the highlights of what’s new in version 0.2 of KubeDirector include:

  • A fully deployable Cloudera 5.14.2 image is now available in the catalog of example applications
  • Cluster launch performance has been enhanced through additional work on launch parallelization
  • The “configcli” tool used in application setup is now included in the “nodeprep” directory
  • We’ve made additional improvements to the Makefile support and functionality:
    • KubeDirector can now be built and deployed on Ubuntu systems
    • “make deploy” now waits for deployment to succeed before returning
    • “make teardown” now waits for teardown to finish before returning
  • KubeDirector actions are now recorded as Kubernetes events and can be viewed with the standard “kubectl describe” command
  • KubeDirector has been tested on the following Kubernetes platforms:
    • DigitalOcean Kubernetes (DOK)
    • Google Kubernetes Engine (GKE)
    • Amazon Elastic Container Service for Kubernetes (EKS)
    • Kubernetes version 1.13.2 on CentOS kernels
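
The Makefile targets and the event reporting listed above can be strung together into a simple workflow. The following is a minimal sketch, not part of the KubeDirector project: the helper names, the cluster name `spark-instance`, and the resource name `kubedirectorcluster` passed to “kubectl describe” are assumptions made for illustration, and the sketch assumes a virtual cluster has already been created between the deploy and describe steps.

```python
import shlex
import subprocess


def kubedirector_workflow(cluster_name, resource="kubedirectorcluster"):
    """Return the commands as argument lists, in the order they would be run.

    `resource` is an assumed resource name for KubeDirector virtual clusters;
    check `kubectl api-resources` on your own cluster before relying on it.
    """
    return [
        # Per the release notes, waits for the deployment to succeed.
        ["make", "deploy"],
        # KubeDirector actions appear as Kubernetes events in this output.
        ["kubectl", "describe", resource, cluster_name],
        # Per the release notes, waits for the teardown to finish.
        ["make", "teardown"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$ " + shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "spark-instance" is a hypothetical cluster name.
    run(kubedirector_workflow("spark-instance"))
```

By default the sketch only prints the commands. Passing `dry_run=False` would execute them, which requires running from a checkout of the KubeDirector repository with `kubectl` configured for the target cluster.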

See below for a screenshot of KubeDirector v0.2 running four pods of a Spark cluster:

[Screenshot: KubeDirector v0.2 running four pods of a Spark cluster]

One of those pods is a Jupyter notebook, as shown below:

[Screenshot: Jupyter notebook running in one of the Spark cluster pods]

Join the Community

We’re working towards the next version of KubeDirector (and the broader BlueK8s initiative), and we’d welcome your help as developers, contributors, and adopters. Follow @BlueK8s on Twitter to get involved.
