
HPE Ezmeral Data Fabric

You can store, manage and access your data from edge to core to cloud at any scale or speed that you need. You can build data structures that span your enterprise using the data fabric to handle data storage and motion. Your current systems can access data in the fabric, and the same bits can be processed by cloud native applications.

Tutorials           Data Fabric Blogs           Free Training

Learn from the Experts

What is HPE Ezmeral Data Fabric?

Machine Learning with Symbolic Data

How to size a data fabric system

Practical Erasure Coding in a Data Fabric

Tutorials

"Music Catalog" Tutorial: REST and GraphQL

The Music Catalog application explains the key Ezmeral Data Fabric Database features and shows how to use them to build a complete Web application. Here are the steps to develop, build, and run the application:

  1. Introduction
  2. MapR Music Architecture
  3. Setup your environment
  4. Import the Data Set
  5. Discover MapR Database Shell and Apache Drill
  6. Work with MapR Database and Java
  7. Add Indexes
  8. Create a REST API
  9. Deploy to Wildfly
  10. Build the Web Application with Angular
  11. Work with JSON Arrays
  12. Change Data Capture
  13. Add Full Text Search to the Application
  14. Build a Recommendation Engine

The source code of the Music Catalog application is available in this GitHub Repository. The Music Catalog application is also implemented with a GraphQL endpoint instead of REST; that application code is available in this GitHub Repository. You can find information about this implementation in the project readme file.

"Smart Home" IoT Tutorial

The Smart Home tutorial is designed to walk the developer through the process of developing an event processing system, from defining business requirements to deploying and testing the system. The system is built on top of the MapR Converged Data Platform, and you will become familiar with:

  • Ezmeral Data Fabric Event Store for Apache Kafka
  • Apache Spark
  • Ezmeral Data Fabric Database (JSON and OpenTSDB)

The following tutorial walks you through the steps to build the application:

  1. Introduction
  2. Motivation
  3. Smart Home Architecture
  4. Setup your environment
  5. Deployment
  6. Data visualization with Grafana
  7. Run the application in a Docker Container

The source code of the Smart Home application is available in this GitHub Repository.

Ezmeral Data Fabric for Predictive Maintenance

This project is intended to show how to build Predictive Maintenance applications on Ezmeral Data Fabric. Predictive Maintenance applications place high demands on data streaming, time-series data storage, and machine learning. Therefore, this project focuses on data ingest with Ezmeral Data Fabric Event Store, time-series data storage with Ezmeral Data Fabric Database and OpenTSDB, and feature engineering with Ezmeral Data Fabric Database and Apache Spark. The source code of the Predictive Maintenance application is available in this GitHub Repository. See the project readme file for more information about this sample application.
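As a rough illustration of the feature engineering step mentioned above, the sketch below computes a rolling mean over a series of sensor readings. It is plain Python rather than Apache Spark, and the function name, window size, and sample values are assumptions made for illustration; they are not taken from the project's source code.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the mean of the last `window` readings at each position.

    Early positions use however many readings are available so far.
    """
    buf = deque(maxlen=window)
    total = 0.0
    means = []
    for value in values:
        if len(buf) == window:
            # The oldest reading is about to be evicted from the window.
            total -= buf[0]
        buf.append(value)
        total += value
        means.append(total / len(buf))
    return means


# Hypothetical sensor readings, for example vibration samples.
readings = [1.0, 2.0, 3.0, 4.0]
features = rolling_mean(readings, window=2)
print(features)
```

Smoothed values like these are one common kind of input feature for a failure-prediction model; in a real deployment the same calculation would typically run as a windowed aggregation over the stored time-series data.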

Customer 360 View

Customer 360 applications require the ability to access data lakes containing structured and unstructured data, integrate data sets, and run operational and analytical workloads simultaneously. MapR enables applications to glean customer intelligence through machine learning that relates to customer personality, sentiment, propensity to buy, and likelihood to churn. This application focuses on showing how the following three tenets of customer 360 applications can be achieved on Ezmeral Data Fabric:

  1. Big Data storage of structured and semi-structured data in files, tables, and streams
  2. SQL-based data integration of disparate datasets
  3. Predictive analytics through machine learning insights

The source code of the Customer 360 View application is available in this GitHub Repository.

Application for Processing Stock Market Trade Data

This project provides an engine for processing real-time streams of trading data from stock exchanges. The application consists of the following components:

  • A Producer microservice that streams trades using the NYSE TAQ format
    • The data source is the Daily Trades dataset described here
    • The schema for our data is detailed in Table 6, "Daily Trades File Data Fields", on page 26 of Daily TAQ Client Specification (from December 1st, 2013)
  • A multi-threaded Consumer microservice that indexes the trades by receiver and sender
  • Example Spark code for querying the indexed streams at interactive speeds, enabling Spark SQL queries
  • Example code for persisting the streaming data to Ezmeral Data Fabric Database
  • Performance tests for benchmarking different configurations
  • A supplementary python script to enhance the above TAQ dataset with "level 2" bid and ask data at a user-defined rate

The source code of the Application for Processing Stock Market Trade Data is available in this GitHub Repository.
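The idea of indexing trades by receiver and sender can be sketched in a few lines of plain Python. This is a simplified, in-memory illustration only: the record fields (`id`, `sender`, `receivers`, `price`, `volume`) and the function name are hypothetical, and the actual application works with streams rather than Python dictionaries.

```python
from collections import defaultdict


def index_trades(trades):
    """Group trade records by sender and by each receiver."""
    by_sender = defaultdict(list)
    by_receiver = defaultdict(list)
    for trade in trades:
        by_sender[trade["sender"]].append(trade)
        # A trade may have several receivers; index it under each one.
        for receiver in trade["receivers"]:
            by_receiver[receiver].append(trade)
    return dict(by_sender), dict(by_receiver)


# Hypothetical sample records; the field names are assumptions.
sample_trades = [
    {"id": 1, "sender": "A", "receivers": ["B", "C"], "price": 10.5, "volume": 100},
    {"id": 2, "sender": "B", "receivers": ["A"], "price": 10.6, "volume": 50},
    {"id": 3, "sender": "A", "receivers": ["C"], "price": 10.4, "volume": 200},
]

by_sender, by_receiver = index_trades(sample_trades)
print([t["id"] for t in by_sender["A"]])
print([t["id"] for t in by_receiver["C"]])
```

Keeping one index per participant means a query such as "all trades sent by A" reads a single list instead of scanning the full trade history.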

Free On-Demand Training

Educate yourself for free with online courses that teach you how to build applications and administer the HPE Ezmeral Data Fabric. These lecture and lab courses are part of the HPE Ezmeral Learn On-Demand academy.

The developer series of courses includes content on basic and advanced programming with Apache Spark as well as information about how to develop applications using some of the unique capabilities of the HPE Ezmeral Data Fabric such as the integrated JSON-oriented document database.

The admin series covers a range of topics, from preparing and testing a bare metal cluster to installing a data fabric and running it on a day-to-day basis. Hands-on labs help you make sure you have the necessary skills by the time you need to install a production system.

This new and expanding series covers the basics of containers and Kubernetes through to up-to-date methods for building stateful applications to run in a containerized world using a data fabric.

Any questions on Ezmeral Data Fabric?

Join the HPEDEV Slack Workspace and start a discussion in our #ezmeral-data-fabric channel.

Not a Slack user? You can also ask your questions in our Ezmeral Forum.

Related Blogs

  • Carol McDonald (Mar 31, 2021): Streaming ML pipeline for Sentiment Analysis using Apache APIs: Kafka, Spark and Drill - Part 2
  • Martijn Kieboom (Mar 31, 2021): Kubernetes Tutorial part 2 of 3: How to Install and Deploy Applications at Scale on K8s
  • HPE DEV (Mar 9, 2021): Boost Your Analytics Factory into Hyperdrive
  • Carol McDonald (Mar 9, 2021): How Big Data is Reducing Costs and Improving Outcomes in Health Care
  • Philippe de Cuzey (Mar 9, 2021): From Pig to Spark: An Easy Journey to Spark for Apache Pig Developers
  • Nicolas Perez (Mar 9, 2021): Spark Streaming and Twitter Sentiment Analysis
  • Carol McDonald (Feb 19, 2021): Spark Streaming with HBase
  • Tugdual Grall (Feb 19, 2021): Getting Started with MapR Event Store
  • Carol McDonald (Feb 19, 2021): Real-Time Streaming Data Pipelines with Apache APIs: Kafka, Spark Streaming, and HBase
  • Carol McDonald (Feb 19, 2021): How to Get Started with Spark Streaming and MapR Event Store Using the Kafka API
  • Nicolas Perez (Feb 19, 2021): MapR Database Spark Connector with Secondary Indexes Support
  • Jim Scott (Feb 13, 2021): Using Python with Apache Spark
  • Tugdual Grall (Feb 13, 2021): Getting Started with MapR Client Container
  • Rachel Silver (Feb 5, 2021): Event-Driven Microservices on the MapR Data Platform
  • Rachel Silver (Feb 5, 2021): Kubernetized Machine Learning and AI Using KubeFlow
  • Rachel Silver (Feb 5, 2021): End-to-End Machine Learning Using Containerization
  • Suzy Visvanathan (Feb 5, 2021): Containers, Kubernetes, and MapR: The Time is Now
  • Mathieu Dumoulin (Feb 5, 2021): Kafka REST Proxy - Performance Tuning for MapR Event Store
  • Nicolas Perez (Feb 5, 2021): A Functional Approach to Logging in Apache Spark
  • Ranjit Lingaiah (Feb 5, 2021): How to Use Secondary Indexes in Spark With Open JSON Application Interface (OJAI)
  • Tugdual Grall (Feb 5, 2021): Setting Up Spark Dynamic Allocation on MapR
  • Nicolas Perez (Jan 29, 2021): How to Integrate Custom Data Sources Into Apache Spark
  • Suzy Visvanathan (Jan 29, 2021): Containers vs. VMs: A 5-Minute Guide to Understanding Their Differences
  • Ankur Desai (Jan 29, 2021): Kafka Connect and Kafka REST API on MapR: Streaming Just Became a Whole Lot Easier!
  • Ankur Desai (Jan 29, 2021): Real-Time Event Streaming: What Are Your Options?
  • Will Ochandarena (Jan 29, 2021): Scaling with Kafka – Common Challenges Solved
  • Dale Rensing (Jan 28, 2021): Exploring Data Fabric and Containers in HPE DEVs new Munch & Learn monthly gatherings
  • Carol McDonald (Jan 22, 2021): Top Trends: Machine Learning, Microservices, Containers, Kubernetes, Cloud to Edge. What are they and how do they fit together?
  • Kirk Borne (Jan 22, 2021): Association Rule Mining – Not Your Typical Data Science Algorithm
  • Carol McDonald (Jan 22, 2021): An Inside Look at the Components of a Recommendation Engine
  • Ellen Friedman (Jan 15, 2021): Making AI a Reality
  • Carol McDonald (Jan 14, 2021): Streaming Data Pipeline to Transform, Store and Explore Healthcare Dataset With Apache Kafka API, Apache Spark, Apache Drill, JSON and MapR Database
  • Ian Downard (Jan 14, 2021): Predicting Forest Fires with Spark Machine Learning
  • Nicolas Perez (Jan 14, 2021): Spark Custom Streaming Sources
  • Ronald Van Loon (Jan 14, 2021): Journey Science in Telecom: Take Customer Experience to the Next Level
  • Michele Nemschoff (Jan 7, 2021): Architecting the World’s Largest Biometric Identity System: The Aadhaar Experience
  • Nicolas Perez (Jan 7, 2021): Apache Spark as a Distributed SQL Engine
  • Michael Farnbach (Dec 16, 2020): Best Practices on Migrating from a Data Warehouse to a Big Data Platform
  • Jim Scott (Dec 16, 2020): Cloud vs. On-Premises – What Are the Best Options for Deploying Microservices with Containers?
  • Nicolas Perez (Dec 16, 2020): Spark Data Source API: Extending Our Spark SQL Query Engine
  • Carol McDonald (Dec 16, 2020): Analyzing Flight Delays with Apache Spark GraphFrames and MapR Database
  • Nicolas Perez (Dec 11, 2020): Apache Spark Packages, from XML to JSON
  • Saira Kennedy (Dec 9, 2020): Types of Machine Learning – Part #2 in the Intro to AI/ML Series
  • Carol McDonald (Nov 25, 2020): Apache Spark Machine Learning Tutorial
  • Carol McDonald (Nov 25, 2020): Demystifying AI, Machine Learning and Deep Learning
  • Suzy Visvanathan (Nov 25, 2020): Best Practices for Migrating Your Apps to Containers and Kubernetes
  • Nitin Bandugula (Nov 25, 2020): The 5-Minute Guide to Understanding the Significance of Apache Spark
  • Carol McDonald (Nov 19, 2020): Event Driven Microservices Architecture Patterns and Examples
  • Carol McDonald (Nov 18, 2020): How Spark Runs Your Applications
  • Carol McDonald (Nov 18, 2020): Real Time Credit Card Fraud Detection with Apache Spark and Event Streaming
  • Saira Kennedy (Nov 12, 2020): Artificial Intelligence and Machine Learning: What Are They and Why Are They Important?
  • Mathieu Dumoulin (Nov 12, 2020): Performance Tuning of an Apache Kafka/Spark Streaming System - Telecom Case Study
  • Carol McDonald (Nov 11, 2020): Kubernetes, Kafka Event Sourcing Architecture Patterns and Use Case Examples
  • Ian Downard (Nov 11, 2020): Kafka vs. MapR Event Store: Why MapR?
  • Mathieu Dumoulin (Nov 5, 2020): Configure Jupyter Notebook for Spark 2.1.0 and Python
  • Mathieu Dumoulin (Nov 5, 2020): Performance Tuning of an Apache Kafka/Spark Streaming System
  • Karen Whipple (Nov 5, 2020): Data Fabric: The Future of Data Management
  • Carol McDonald (Nov 5, 2020): Big Data Opportunities for Telecommunications
  • Suresh Ollala (Nov 3, 2020): MapR, Kubernetes, Spark and Drill: A Two-Part Guide to Accelerating Your AI and Analytics Workloads
  • Mathieu Dumoulin (Nov 3, 2020): Better Complex Event Processing at Scale Using a Microservices-based Streaming Architecture (Part 1)
  • Jimit Shah (Nov 2, 2020): Provisioning Secure Access Controls in MapR Database
  • Martijn Kieboom (Nov 2, 2020): Kubernetes Tutorial: How to Install and Deploy Applications at Scale on K8s - Part 1 of 3
  • Carol McDonald (Oct 28, 2020): Streaming Machine learning pipeline for Sentiment Analysis using Apache APIs: Kafka, Spark and Drill - Part 1
  • Carol McDonald (Oct 21, 2020): Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 1)
  • Terry He (Oct 15, 2020): How to Use a Table Load Tool to Batch Puts into HBase/MapR Database
  • Ian Downard (Sep 25, 2020): How to Persist Kafka Data as JSON in NoSQL Storage Using MapR Event Store and MapR Database
  • Prashant Rathi (Sep 19, 2020): How to Build Stanzas Using MapR Installer for Easy and Efficient Provisioning
  • Magnus Pierre (Sep 18, 2020): CRUD with the New Golang Client for MapR Database
  • Suzy Visvanathan (Sep 16, 2020): Containers: Best Practices for Running in Production
  • Carol McDonald (Aug 19, 2020): Datasets, DataFrames, and Spark SQL for Processing of Tabular Data
  • Nicolas Perez (Aug 19, 2020): How to Log in Apache Spark
  • Suzanne Ferry (Jul 10, 2020): Kubernetes Application Containers: Managing Containers and Cluster Resources
  • Carol McDonald (Jul 8, 2020): Tips and Best Practices to Take Advantage of Spark 2.x
  • Carol McDonald (Jul 8, 2020): Data Modeling Guidelines for NoSQL JSON Document Databases
  • Carol McDonald (Jul 3, 2020): Spark 101: What Is It, What It Does, and Why It Matters
  • Prashant Sachdeva (Jun 17, 2020): HPE achieves gold for large-scale enterprise Kubernetes deployments
