First Things First…What’s a Data Lake?
If you’re not already familiar with the term, a “data lake” is generally defined as an expansive collection of data that’s held in its original format until needed. Data lakes are repositories of raw data, collected over time and intended to grow continually. Any data that’s potentially useful for analysis is collected, from both inside and outside your organization, usually as soon as it’s generated. This helps ensure that the data is available for transformation and analysis when needed. As central repositories, data lakes can answer business questions…including questions you haven’t thought of yet.
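To make the "original format until needed" idea concrete, here is a minimal sketch that uses a local folder as a stand-in for a data lake. Everything in it is an assumption for illustration: the `land_raw` and `read_orders` helpers, the `webshop` and `sensors` source names, and the sample records are hypothetical and not part of any Azure API. Files are written exactly as received, and structure is applied only when an analysis reads them.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_dir: Path, source: str, name: str, payload: bytes) -> Path:
    """Store a payload exactly as received, grouped by its source system."""
    target = lake_dir / source / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)  # no parsing or cleanup at ingestion time
    return target


def read_orders(lake_dir: Path) -> list[dict]:
    """Apply structure only at analysis time (often called schema on read)."""
    orders = []
    for path in sorted((lake_dir / "webshop").glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # Keep only the fields this analysis needs; extra fields stay in the raw file.
        orders.append({"id": record["id"], "total": float(record["total"])})
    return orders


def demo() -> float:
    """Land a few hypothetical files, then total the order values on read."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        land_raw(lake, "webshop", "order-1.json", b'{"id": 1, "total": "19.50"}')
        land_raw(lake, "webshop", "order-2.json", b'{"id": 2, "total": "5.25", "coupon": "X"}')
        land_raw(lake, "sensors", "reading.csv", b"ts,temp\n1,21.5\n")  # unused for now
        return sum(order["total"] for order in read_orders(lake))
```

The sensor file is never parsed in this sketch. It simply sits in the lake in its raw form, ready for a question nobody has asked yet.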
Azure Data Lake
Azure Data Lake is actually a pair of services. The first, Azure Data Lake Storage, is a repository that provides high-performance access to virtually unlimited amounts of data, with an optional hierarchical namespace, making that data available for analysis. The second, Azure Data Lake Analytics, enables batch analysis of that data through an easy-to-use, on-demand, job-based, consumption-priced analysis engine.
We’ll now take a closer look at these two services and where they fit into your cloud ecosystem.