HPE Ezmeral Data Fabric

Store, manage, and access your data from edge to core to cloud, at whatever scale or speed you need. Use the data fabric to handle data storage and motion, and build data structures that span your enterprise. Your existing systems can access data in the fabric, and cloud-native applications can process the same bits.

Tutorials           Data Fabric Blogs           Free Training

Learn from the Experts

What is HPE Ezmeral Data Fabric?

How to size a data fabric system

Practical Erasure Coding in a Data Fabric

Tutorials

"Music Catalog" Tutorial: REST and GraphQL

The Music Catalog application explains the key Ezmeral Data Fabric Database features and shows how to use them to build a complete Web application. Here are the steps to develop, build, and run the application:

  1. Introduction
  2. MapR Music Architecture
  3. Setup your environment
  4. Import the Data Set
  5. Discover MapR Database Shell and Apache Drill
  6. Work with MapR Database and Java
  7. Add Indexes
  8. Create a REST API
  9. Deploy to Wildfly
  10. Build the Web Application with Angular
  11. Work with JSON Arrays
  12. Change Data Capture
  13. Add Full Text Search to the Application
  14. Build a Recommendation Engine

The source code of the Music Catalog application is available in this GitHub Repository. The Music Catalog application is also implemented with a GraphQL endpoint instead of REST; that version of the application code is available in this GitHub Repository. You can find more information about this implementation in the project readme file.

"Smart Home" IoT Tutorial

The Smart Home tutorial is designed to walk the developer through the process of building an event processing system, from defining business requirements to deploying and testing the system. The system is built on top of the MapR Converged Data Platform, and the tutorial will familiarize you with:

  • Ezmeral Data Fabric Event Store for Apache Kafka
  • Apache Spark
  • Ezmeral Data Fabric Database (JSON and OpenTSDB)

The following tutorial walks you through the steps to build the application:

  1. Introduction
  2. Motivation
  3. Smart Home Architecture
  4. Setup your environment
  5. Deployment
  6. Data visualization with Grafana
  7. Run the application in a Docker Container

The source code of the Smart Home application is available in this GitHub Repository.

Ezmeral Data Fabric for Predictive Maintenance

This project shows how to build Predictive Maintenance applications on Ezmeral Data Fabric. Predictive Maintenance applications place high demands on data streaming, time-series data storage, and machine learning. This project therefore focuses on data ingest with Ezmeral Data Fabric Event Store, time-series data storage with Ezmeral Data Fabric Database and OpenTSDB, and feature engineering with Ezmeral Data Fabric Database and Apache Spark. The source code of the Predictive Maintenance application is available in this GitHub Repository. See the project readme for more information about this sample application.
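To give a rough idea of what feature engineering on time-series sensor data can look like, here is a minimal sketch of a rolling-mean feature. It is plain Python, not the Spark-based code from the project; the function name `rolling_mean`, the window size, and the sample readings are all hypothetical and chosen only for illustration.

```python
from collections import deque


def rolling_mean(readings, window):
    """Return the mean of the most recent `window` readings at each step.

    Hypothetical helper for illustration; it is not part of the
    Predictive Maintenance project.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    recent = deque(maxlen=window)  # holds at most `window` latest readings
    total = 0.0
    features = []
    for value in readings:
        if len(recent) == window:
            # The oldest reading is about to be evicted by append().
            total -= recent[0]
        recent.append(value)
        total += value
        features.append(total / len(recent))
    return features


if __name__ == "__main__":
    # Made-up vibration readings from a single sensor.
    print(rolling_mean([1, 2, 3, 4], window=3))
```

A smoothed value like this is one kind of derived feature that could be fed to a model alongside the raw readings; in a cluster setting the same idea would typically be expressed with a windowed aggregation rather than a Python loop.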

Customer 360 View

Customer 360 applications require the ability to access data lakes containing structured and unstructured data, integrate data sets, and run operational and analytical workloads simultaneously. MapR enables applications to glean customer intelligence through machine learning that relates to customer personality, sentiment, propensity to buy, and likelihood to churn. This application shows how the following three tenets of Customer 360 applications can be achieved on Ezmeral Data Fabric:

  1. Big Data storage of structured and semi-structured data in files, tables, and streams
  2. SQL-based data integration of disparate datasets
  3. Predictive analytics through machine learning insights

The source code of the Customer 360 View application is available in this GitHub Repository.

Application for Processing Stock Market Trade Data

This project provides an engine for processing real-time streams of trading data from stock exchanges. The application consists of the following components:

  • A Producer microservice that streams trades using the NYSE TAQ format

    • The data source is the Daily Trades dataset described here
    • The schema for our data is detailed in Table 6, "Daily Trades File Data Fields", on page 26 of Daily TAQ Client Specification (from December 1st, 2013)
  • A multi-threaded Consumer microservice that indexes the trades by receiver and sender
  • Example Spark code for querying the indexed streams at interactive speeds, enabling Spark SQL queries
  • Example code for persisting the streaming data to Ezmeral Data Fabric Database
  • Performance tests for benchmarking different configurations
  • A supplementary Python script to enhance the above TAQ dataset with "level 2" bid and ask data at a user-defined rate

The source code of the Application for Processing Stock Market Trade Data is available in this GitHub Repository.
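The idea behind the Consumer microservice, indexing each trade by both its sender and its receiver, can be sketched in a few lines. This is a minimal in-memory illustration only: the `Trade` fields and the `index_trades` function are hypothetical, they do not follow the NYSE TAQ schema, and plain dictionaries stand in for the streams the real application writes to.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    # Hypothetical fields for illustration; not the NYSE TAQ layout.
    sender: str
    receiver: str
    symbol: str
    price: float
    volume: int


def index_trades(trades):
    """Group trades twice: once by sender and once by receiver."""
    by_sender = defaultdict(list)
    by_receiver = defaultdict(list)
    for trade in trades:
        by_sender[trade.sender].append(trade)
        by_receiver[trade.receiver].append(trade)
    # Return plain dicts so missing keys raise KeyError instead of
    # silently creating empty lists.
    return dict(by_sender), dict(by_receiver)


if __name__ == "__main__":
    # Made-up sample trades.
    sample = [
        Trade("alice", "bob", "HPE", 15.10, 100),
        Trade("alice", "carol", "HPE", 15.12, 50),
        Trade("bob", "alice", "IBM", 130.00, 10),
    ]
    by_sender, by_receiver = index_trades(sample)
    print(len(by_sender["alice"]), len(by_receiver["alice"]))
```

Keeping two indexes means a query such as "all trades sent by a given participant" or "all trades received by a given participant" becomes a single lookup, at the cost of storing each trade reference twice.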

Free On-Demand Training

Learn for free with online courses that teach you how to build applications and administer the HPE Ezmeral Data Fabric. Visit HPE Ezmeral Learn On-Demand to enroll.

  • Artificial Intelligence and Machine Learning. Newer course series covering the basics of data science, machine learning, and AI, with step-by-step instructions on managing successful machine learning projects.
  • Apache Spark. This course series offers an overview of Apache Spark 2.x, the Spark execution model, and some advanced topics on developing data pipeline apps using Spark streaming, Spark SQL, GraphFrame, and MLlib.
  • Data Fabric Cluster Administration. Learn everything from preparing and testing a bare-metal cluster, to installing a data fabric, to running it on a day-to-day basis.
  • Kubernetes and Stateful Applications. Covers the basics of containers and Kubernetes, and methods for building stateful applications to run in a containerized world using a data fabric.

Workshops-on-Demand

Take advantage of our free, Jupyter Notebook-based Workshops-on-Demand, available in the Hack Shack. These technical workshops provide an in-depth, hands-on learning experience where you can interact with and learn from the experts. Designed to fit your schedule, these workshops are available 24/7, any time, from anywhere. HPE Ezmeral Data Fabric workshops are available today.

Any questions on Ezmeral Data Fabric?

Join the HPEDEV Slack Workspace and start a discussion in our #ezmeral-data-fabric channel.

Not a Slack user? You can also ask your questions in our Ezmeral Forum.

Related Blogs

  • File, objects, databases and streams – Oh my! (Sridhar Reddy, Mar 31, 2022)
  • Real-time Smart City Traffic Monitoring Using Microservices-based Streaming Architecture (Part 2) (Mathieu Dumoulin, Jan 27, 2022)
  • Getting Started with Spark on MapR Sandbox (MapR Tutorials, Dec 14, 2021)
  • Using Apache Spark DataFrames for Processing of Tabular Data (Carol McDonald, Dec 13, 2021)
  • How fine-grained data placement helps optimize application performance (Ellen Friedman, Oct 22, 2021)
  • Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 3) (Carol McDonald, Oct 11, 2021)
  • Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 2) (Carol McDonald, Oct 11, 2021)
  • Accessing HPE Ezmeral Data Fabric Object Storage from Spring Boot S3 Micro Service deployed in K3s cluster (Kiran Kumar Mavatoor, Sep 13, 2021)
  • Data Analytics with PySpark using HPE Ezmeral Container Platform (Cenz Wong, Sep 7, 2021)
  • Using Docker Wrong: My Journey to a Better Container (John Omernik, Jul 13, 2021)
  • Comparing "To Kill a Mockingbird" to its Sequel with Apache Spark (Joseph Blue, Jul 13, 2021)
  • Getting Started with DataTaps in Kubernetes Pods (Cenz Wong, Jul 6, 2021)
  • On-Premise Adventures: How to build an Apache Spark lab on Kubernetes (Don Wake, Jun 15, 2021)
  • Similar Document Search using Apache Spark with TF-IDF (Prasad Singathi, Maikel Pereira, Jun 15, 2021)
  • Streaming ML pipeline for Sentiment Analysis using Apache APIs: Kafka, Spark and Drill - Part 2 (Carol McDonald, Mar 31, 2021)
  • Kubernetes Tutorial part 2 of 3: How to Install and Deploy Applications at Scale on K8s (Martijn Kieboom, Mar 31, 2021)
  • Boost Your Analytics Factory into Hyperdrive (HPE DEV, Mar 9, 2021)
  • How Big Data is Reducing Costs and Improving Outcomes in Health Care (Carol McDonald, Mar 9, 2021)
  • From Pig to Spark: An Easy Journey to Spark for Apache Pig Developers (Philippe de Cuzey, Mar 9, 2021)
  • Spark Streaming and Twitter Sentiment Analysis (Nicolas Perez, Mar 9, 2021)
  • Spark Streaming with HBase (Carol McDonald, Feb 19, 2021)
  • Getting Started with MapR Event Store (Tugdual Grall, Feb 19, 2021)
  • Real-Time Streaming Data Pipelines with Apache APIs: Kafka, Spark Streaming, and HBase (Carol McDonald, Feb 19, 2021)
  • How to Get Started with Spark Streaming and MapR Event Store Using the Kafka API (Carol McDonald, Feb 19, 2021)
  • MapR Database Spark Connector with Secondary Indexes Support (Nicolas Perez, Feb 19, 2021)
  • Using Python with Apache Spark (Jim Scott, Feb 13, 2021)
  • Getting Started with MapR Client Container (Tugdual Grall, Feb 13, 2021)
  • Event-Driven Microservices on the MapR Data Platform (Rachel Silver, Feb 5, 2021)
  • Kubernetized Machine Learning and AI Using KubeFlow (Rachel Silver, Feb 5, 2021)
  • End-to-End Machine Learning Using Containerization (Rachel Silver, Feb 5, 2021)
  • Containers, Kubernetes, and MapR: The Time is Now (Suzy Visvanathan, Feb 5, 2021)
  • Kafka REST Proxy - Performance Tuning for MapR Event Store (Mathieu Dumoulin, Feb 5, 2021)
  • A Functional Approach to Logging in Apache Spark (Nicolas Perez, Feb 5, 2021)
  • How to Use Secondary Indexes in Spark With Open JSON Application Interface (OJAI) (Ranjit Lingaiah, Feb 5, 2021)
  • Setting Up Spark Dynamic Allocation on MapR (Tugdual Grall, Feb 5, 2021)
  • How to Integrate Custom Data Sources Into Apache Spark (Nicolas Perez, Jan 29, 2021)
  • Containers vs. VMs: A 5-Minute Guide to Understanding Their Differences (Suzy Visvanathan, Jan 29, 2021)
  • Kafka Connect and Kafka REST API on MapR: Streaming Just Became a Whole Lot Easier! (Ankur Desai, Jan 29, 2021)
  • Real-Time Event Streaming: What Are Your Options? (Ankur Desai, Jan 29, 2021)
  • Scaling with Kafka – Common Challenges Solved (Will Ochandarena, Jan 29, 2021)
  • Exploring Data Fabric and Containers in HPE DEVs new Munch & Learn monthly gatherings (Dale Rensing, Jan 28, 2021)
  • Top Trends: Machine Learning, Microservices, Containers, Kubernetes, Cloud to Edge. What are they and how do they fit together? (Carol McDonald, Jan 22, 2021)
  • Association Rule Mining – Not Your Typical Data Science Algorithm (Kirk Borne, Jan 22, 2021)
  • An Inside Look at the Components of a Recommendation Engine (Carol McDonald, Jan 22, 2021)
  • Making AI a Reality (Ellen Friedman, Jan 15, 2021)
  • Streaming Data Pipeline to Transform, Store and Explore Healthcare Dataset With Apache Kafka API, Apache Spark, Apache Drill, JSON and MapR Database (Carol McDonald, Jan 14, 2021)
  • Predicting Forest Fires with Spark Machine Learning (Ian Downard, Jan 14, 2021)
  • Spark Custom Streaming Sources (Nicolas Perez, Jan 14, 2021)
  • Journey Science in Telecom: Take Customer Experience to the Next Level (Ronald Van Loon, Jan 14, 2021)
  • Architecting the World’s Largest Biometric Identity System: The Aadhaar Experience (Michele Nemschoff, Jan 7, 2021)
  • Apache Spark as a Distributed SQL Engine (Nicolas Perez, Jan 7, 2021)
  • Best Practices on Migrating from a Data Warehouse to a Big Data Platform (Michael Farnbach, Dec 16, 2020)
  • Cloud vs. On-Premises – What Are the Best Options for Deploying Microservices with Containers? (Jim Scott, Dec 16, 2020)
  • Spark Data Source API: Extending Our Spark SQL Query Engine (Nicolas Perez, Dec 16, 2020)
  • Analyzing Flight Delays with Apache Spark GraphFrames and MapR Database (Carol McDonald, Dec 16, 2020)
  • Apache Spark Packages, from XML to JSON (Nicolas Perez, Dec 11, 2020)
  • Types of Machine Learning – Part #2 in the Intro to AI/ML Series (Saira Kennedy, Dec 9, 2020)
  • Apache Spark Machine Learning Tutorial (Carol McDonald, Nov 25, 2020)
  • Demystifying AI, Machine Learning and Deep Learning (Carol McDonald, Nov 25, 2020)
  • Best Practices for Migrating Your Apps to Containers and Kubernetes (Suzy Visvanathan, Nov 25, 2020)
  • The 5-Minute Guide to Understanding the Significance of Apache Spark (Nitin Bandugula, Nov 25, 2020)
  • Event Driven Microservices Architecture Patterns and Examples (Carol McDonald, Nov 19, 2020)
  • How Spark Runs Your Applications (Carol McDonald, Nov 18, 2020)
  • Real Time Credit Card Fraud Detection with Apache Spark and Event Streaming (Carol McDonald, Nov 18, 2020)
  • Artificial Intelligence and Machine Learning: What Are They and Why Are They Important? (Saira Kennedy, Nov 12, 2020)
  • Performance Tuning of an Apache Kafka/Spark Streaming System - Telecom Case Study (Mathieu Dumoulin, Nov 12, 2020)
  • Kubernetes, Kafka Event Sourcing Architecture Patterns and Use Case Examples (Carol McDonald, Nov 11, 2020)
  • Kafka vs. MapR Event Store: Why MapR? (Ian Downard, Nov 11, 2020)
  • Configure Jupyter Notebook for Spark 2.1.0 and Python (Mathieu Dumoulin, Nov 5, 2020)
  • Performance Tuning of an Apache Kafka/Spark Streaming System (Mathieu Dumoulin, Nov 5, 2020)
  • Data Fabric: The Future of Data Management (Karen Whipple, Nov 5, 2020)
  • Big Data Opportunities for Telecommunications (Carol McDonald, Nov 5, 2020)
  • MapR, Kubernetes, Spark and Drill: A Two-Part Guide to Accelerating Your AI and Analytics Workloads (Suresh Ollala, Nov 3, 2020)
  • Better Complex Event Processing at Scale Using a Microservices-based Streaming Architecture (Part 1) (Mathieu Dumoulin, Nov 3, 2020)
  • Provisioning Secure Access Controls in MapR Database (Jimit Shah, Nov 2, 2020)
  • Kubernetes Tutorial: How to Install and Deploy Applications at Scale on K8s - Part 1 of 3 (Martijn Kieboom, Nov 2, 2020)
  • Streaming Machine learning pipeline for Sentiment Analysis using Apache APIs: Kafka, Spark and Drill - Part 1 (Carol McDonald, Oct 28, 2020)
  • Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 1) (Carol McDonald, Oct 21, 2020)
  • How to Use a Table Load Tool to Batch Puts into HBase/MapR Database (Terry He, Oct 15, 2020)
  • How to Persist Kafka Data as JSON in NoSQL Storage Using MapR Event Store and MapR Database (Ian Downard, Sep 25, 2020)
  • How to Build Stanzas Using MapR Installer for Easy and Efficient Provisioning (Prashant Rathi, Sep 19, 2020)
  • CRUD with the New Golang Client for MapR Database (Magnus Pierre, Sep 18, 2020)
  • Containers: Best Practices for Running in Production (Suzy Visvanathan, Sep 16, 2020)
  • Datasets, DataFrames, and Spark SQL for Processing of Tabular Data (Carol McDonald, Aug 19, 2020)
  • How to Log in Apache Spark (Nicolas Perez, Aug 19, 2020)
  • Kubernetes Application Containers: Managing Containers and Cluster Resources (Suzanne Ferry, Jul 10, 2020)
  • Tips and Best Practices to Take Advantage of Spark 2.x (Carol McDonald, Jul 8, 2020)
  • Data Modeling Guidelines for NoSQL JSON Document Databases (Carol McDonald, Jul 8, 2020)
  • Spark 101: What Is It, What It Does, and Why It Matters (Carol McDonald, Jul 3, 2020)
  • HPE achieves gold for large-scale enterprise Kubernetes deployments (Prashant Sachdeva, Jun 17, 2020)
