
Data Scientist

Data science is an intriguing field, and recognition of its potential is rapidly expanding. But there are challenges. You need access to the right data and the flexibility to use the tools of your choice. The pipelines for data preparation and for model and application deployment need to be reliable and efficient. And you need to increase the likelihood that stakeholders and IT managers will green-light new projects.

On this page, we provide a range of content – for everyone from advanced data scientists to those just getting started – to help you meet these challenges. You will find a rotating selection of foundational material, ideas to inspire you, and practical tips on key issues that make your data science projects easier to build and more likely to succeed. You’ll also learn what Hewlett Packard Enterprise (HPE) offers.

The roles of the Data/ML Engineer and Data Scientist can overlap. You may also find content of interest to you on the Data/ML Engineer page. Content on this page changes as new material becomes available or new topics arise, so check back regularly.



Get Inspired

A sampler of new ideas

Learn how innovation may affect your job.

Building a Foundation

Optimize data access

The right data infrastructure gives you direct access to data through a wide range of APIs, so you can work with the tools you choose.

Working together

Domain expertise helps frame questions, identify useful data, and take action on insights.

Containerization of applications

The open source Kubernetes framework orchestrates containers.



Addressing Key Concerns

How do I find and get access to the right data?

What can I do to lower the entry barriers to developing new AI/ML/data science projects?

How do I optimize data logistics and preparation efforts to keep them from overwhelming the data science project?

What makes it easier to deal with edge computing in large-scale systems?

How are others doing this?

Check out these real-world case studies



Skill Up

Munch & Learn technology talk

Monthly meetups where you can hear from experts on the newest technologies. Catch up on any you may have missed and register for upcoming talks.



Workshops-on-Demand

Free, in-depth, hands-on workshops that allow you to explore details of a technology by interacting with it. Designed to fit your schedule, these workshops are available 24/7 – from anywhere at any time.



Documentation
The HPE Ezmeral Data Fabric platform page offers documentation and API information along with informative videos and tutorials. Additional documentation can be found here.


Engage
Ping us with your comments, questions, and requests for information.

Blog articles and tutorials

Nelson Luís Dias

7 Questions for Nelson Luís Dias: Atmospheric Turbulence in Chapel

Oct 15, 2024
Abhishek Kumar Agarwal

Streamline and optimize ML workflows with HPE Ezmeral Unified Analytics

Sep 27, 2023
Denis Choukroun

Deep Learning Model Training – A First-Time User’s Experience with Determined – Part 2

May 3, 2022
Denis Choukroun

Deep Learning Model Training – A First-Time User’s Experience with Determined – Part 1

Apr 14, 2022
Srikanth Venkata Seshu

Highlighting key features of HPE Ezmeral Runtime Enterprise Release 5.4

Mar 31, 2022
Neil Conway and Alex Putnam

Writing Deep Learning Tools for all Data Scientists, Not Just Unicorns

Feb 11, 2022
Don Wake

On-Premise Adventures: How to build an Apache Spark lab on Kubernetes

Jun 15, 2021
Dale Rensing

Exploring Data Fabric and Containers in HPE DEV’s new Munch & Learn monthly gatherings

Jan 28, 2021
Kirk Borne

Association Rule Mining – Not Your Typical Data Science Algorithm

Jan 22, 2021
Carol McDonald

Streaming Data Pipeline to Transform, Store and Explore Healthcare Dataset With Apache Kafka API, Apache Spark, Apache Drill, JSON and MapR Database

Jan 14, 2021
Nicolas Perez

Spark Data Source API: Extending Our Spark SQL Query Engine

Dec 16, 2020
Carol McDonald

Fast data processing pipeline for predicting flight delays using Apache APIs: Kafka, Spark Streaming and Machine Learning (part 1)

Oct 21, 2020
Carol McDonald

Datasets, DataFrames, and Spark SQL for Processing of Tabular Data

Aug 19, 2020
Carol McDonald

Tips and Best Practices to Take Advantage of Spark 2.x

Jul 8, 2020

HPE Developer Newsletter

Stay in the loop.

Sign up for the HPE Developer Newsletter or visit the Newsletter Archive to see past content.
