One of the benefits of having a data lake is that the same data can be consumed by multi-tenant groups—an efficient way to share a persistent Amazon EMR cluster. The same business data can be safely used for many different analytics and data processing needs. In this builders session, we discuss steps to make an Amazon EMR cluster multi-tenant for analytics, best practices for a multi-tenant cluster, and solving common challenges. We also address security and governance aspects of a multi-tenant Amazon EMR cluster.
7. Silo scenario . . .
Wrangle/manipulate data
Explore/analyze data
Run machine learning models
Processed data
Query results
Deploy
Data engineer
Data analyst
Data scientist
EMR cluster
EMR cluster
EMR cluster
Data engineer bucket
Data analyst bucket
Data scientist bucket