Data/ML Engineer

It’s your job to ensure the right data powers the right applications at the right time and in the right place.

As workloads grow in number and variety, how can you easily and reliably address every aspect of data logistics and processing that can make or break a data-intensive project, including analytics and AI/machine learning?

On this page, we provide content to help you meet these challenges. You will find a rotating selection of foundational material, ideas to inspire you, and practical tips on key issues to improve efficiency and performance. You’ll also learn what Hewlett Packard Enterprise (HPE) offers.

The roles of the Data/ML Engineer and Data Scientist can overlap. You may also find content of interest to you on the Data Scientist page. Content on this page changes as new material becomes available or new topics arise, so check back regularly.



Get Inspired

A sampler of new ideas related to data/ML engineering:

Learn how industry innovation may affect your job.

Building a Foundation

A unifying data infrastructure that handles data logistics and application containerization is key to data science projects.

Simplify operations and workflows with the right data fabric, and orchestrate containerized applications with open-source Kubernetes.



Addressing Key Concerns

What can I do to lower the entry barriers to developing new AI/ML/data science projects?

Who should be included on the team to ensure the success of the project?

How do I handle data movement?

What makes it easier to deal with edge computing in large-scale systems?

How do I ensure data trust and security?

How are others doing this?

Check out these real-world case studies



Skill Up

Munch & Learn technology talk

Monthly meetups where you can hear from experts on the newest technologies. Catch up on any you may have missed and register for upcoming talks.



Workshops-on-Demand

Free, in-depth, hands-on workshops that let you explore the details of a technology by interacting with it. Designed to fit your schedule, these workshops are available 24/7 from anywhere.



Documentation
The HPE Ezmeral Data Fabric platform page offers documentation and API information along with informative videos and tutorials. Additional documentation can be found here.


Engage
Ping us with your comments, questions, and requests for information.

Blog articles and tutorials

Abhishek Kumar Agarwal

Streamline and optimize ML workflows with HPE Ezmeral Unified Analytics

Sep 27, 2023
Isha Ghodgaonkar

End-to-end, easy-to-use pipeline for training a model on Medical Image Data using HPE Machine Learning Development Environment

Jun 16, 2023
Andrew Mendez

Production-ready object detection model training workflow with HPE Machine Learning Development Environment

Jun 16, 2023
Thirukkannan M

ML Ops – Deploying an ML model in HPE GreenLake Platform ML Ops service

Aug 8, 2022
Sweta Katkoria

How to Set Up an Automation Pipeline to View Historical Trend Data of Clusters with HPE GreenLake for Private Cloud Enterprise

Jun 9, 2022
Denis Choukroun

Deep Learning Model Training – A First-Time User’s Experience with Determined – Part 2

May 3, 2022
Denis Choukroun

Deep Learning Model Training – A First-Time User’s Experience with Determined – Part 1

Apr 14, 2022
Srikanth Venkata Seshu

Highlighting key features of HPE Ezmeral Runtime Enterprise Release 5.4

Mar 31, 2022
Neil Conway and Alex Putnam

Writing Deep Learning Tools for all Data Scientists, Not Just Unicorns

Feb 11, 2022
Dale Rensing

HPE Developer launches its Munch & Learn technical talks

Jan 27, 2022
Cenz Wong

Getting Started with DataTaps in Kubernetes Pods

Jul 6, 2021
Don Wake

On-Premise Adventures: How to build an Apache Spark lab on Kubernetes

Jun 15, 2021
Carol McDonald

Real-Time Streaming Data Pipelines with Apache APIs: Kafka, Spark Streaming, and HBase

Feb 19, 2021
Ranjit Lingaiah

How to Use Secondary Indexes in Spark With Open JSON Application Interface (OJAI)

Feb 5, 2021
Tugdual Grall

Setting Up Spark Dynamic Allocation on MapR

Feb 5, 2021
Will Ochandarena

Scaling with Kafka – Common Challenges Solved

Jan 29, 2021
Carol McDonald

Streaming Data Pipeline to Transform, Store and Explore Healthcare Dataset With Apache Kafka API, Apache Spark, Apache Drill, JSON and MapR Database

Jan 14, 2021
Michael Farnbach

Best Practices on Migrating from a Data Warehouse to a Big Data Platform

Dec 16, 2020
Nicolas Perez

Spark Data Source API: Extending Our Spark SQL Query Engine

Dec 16, 2020
Carol McDonald

Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 1)

Oct 21, 2020
Terry He

How to Use a Table Load Tool to Batch Puts into HBase/MapR Database

Oct 15, 2020
Ian Downard

How to Persist Kafka Data as JSON in NoSQL Storage Using MapR Event Store and MapR Database

Sep 25, 2020
Magnus Pierre

CRUD with the New Golang Client for MapR Database

Sep 18, 2020
Carol McDonald

Datasets, DataFrames, and Spark SQL for Processing of Tabular Data

Aug 19, 2020
Carol McDonald

Tips and Best Practices to Take Advantage of Spark 2.x

Jul 8, 2020
Carol McDonald

Data Modeling Guidelines for NoSQL JSON Document Databases

Jul 8, 2020
