Azure simplifies cloud analytics

The modern business landscape is ruled by data, with analytics and AI now essential for driving transformation. Customers have benefited tremendously from the performance, flexibility, and low cost offered by Azure for analytics and AI workloads. Microsoft have introduced new capabilities in Azure that make it easier to build, deliver, and manage powerful analytics and AI solutions.

Firstly, Microsoft have announced the preview of Azure Data Lake Storage Gen2, the only cloud-scale data lake designed specifically for mission-critical analytics and AI workloads. Azure Data Lake Storage Gen2 combines the scalability and cost benefits of object storage with the reliability and performance of Hadoop file system capabilities.

Secondly, Microsoft have announced the general availability of new capabilities in Azure Data Factory. Integrating data from multiple sources, then validating, enriching, and transforming it for insights, is now dramatically simpler.

This evolution of the Microsoft analytics portfolio makes it easier for customers to integrate disparate data sources, then store and process large amounts of data economically to accelerate their digital transformation.

Taking Azure Data Lake Storage to the next level

Analytics solutions such as Hadoop were designed on the assumption that they run on scale-out file systems. Other cloud providers shoehorn these solutions in using a combination of client-side file system emulation and feature-deficient object stores, resulting in poor performance and inconsistent reliability, and ultimately forcing compromise.

Azure Data Lake Storage Gen2 offers a no-compromise data lake. It unifies the core capabilities of the first generation of Azure Data Lake with a Hadoop-compatible file system endpoint that is now directly integrated into Azure Blob Storage. This enhancement combines the scale and cost benefits of object storage with the reliability and performance typically associated only with on-premises file systems. The new file system includes a full hierarchical namespace that makes files and folders first-class citizens, which translates to faster, more reliable analytic job execution.
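To illustrate why a hierarchical namespace can speed up analytic jobs, the following is a minimal toy sketch, not Azure's implementation. It contrasts renaming a folder in a flat key-value object store, where a "folder" is only a key prefix, with renaming it in a tree-structured namespace. The class and method names (FlatStore, HierarchicalStore, rename_prefix, rename_dir) are hypothetical and exist only for this illustration. Hadoop-style jobs commonly write output to a temporary folder and rename it on completion, which is where the difference tends to matter.

```python
class FlatStore:
    """Toy flat object store: each object is a full key; folders are only prefixes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def rename_prefix(self, old, new):
        """Emulate a folder rename by rewriting every key under the prefix.

        Returns the number of per-object operations performed.
        """
        moved = [k for k in self.objects if k.startswith(old + "/")]
        for k in moved:
            self.objects[new + k[len(old):]] = self.objects.pop(k)
        return len(moved)


class HierarchicalStore:
    """Toy hierarchical namespace: folders are real nodes (nested dicts)."""

    def __init__(self):
        self.root = {}

    def _walk(self, parts, create=False):
        node = self.root
        for part in parts:
            node = node.setdefault(part, {}) if create else node[part]
        return node

    def put(self, path, data):
        *dirs, name = path.split("/")
        self._walk(dirs, create=True)[name] = data

    def get(self, path):
        *dirs, name = path.split("/")
        return self._walk(dirs)[name]

    def rename_dir(self, old, new):
        """Re-link one folder node under its new parent.

        Returns the number of namespace operations performed (always 1 here).
        """
        *old_parent, old_name = old.split("/")
        *new_parent, new_name = new.split("/")
        subtree = self._walk(old_parent).pop(old_name)
        self._walk(new_parent, create=True)[new_name] = subtree
        return 1


if __name__ == "__main__":
    flat, hier = FlatStore(), HierarchicalStore()
    for i in range(1000):
        flat.put(f"tmp/job1/part-{i}", i)
        hier.put(f"tmp/job1/part-{i}", i)
    print("flat store operations:", flat.rename_prefix("tmp/job1", "out/job1"))
    print("hierarchical operations:", hier.rename_dir("tmp/job1", "out/job1"))
```

In this model the flat store's cost grows with the number of objects under the folder, and a failure partway through would leave the rename half done, while the hierarchical store performs a single re-link regardless of folder size.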

Azure Data Lake Storage Gen2 also includes limitless storage, ensuring capacity to meet the needs of even the largest, most complex workloads. In addition, Azure Data Lake Storage Gen2 will deliver native integration with Azure Active Directory and support POSIX-compliant access control lists (ACLs) to enable granular permission assignments on files and folders.
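As a rough illustration of what granular, POSIX-style ACLs on a file or folder look like, here is a simplified sketch of the general POSIX ACL evaluation order: owner first, then named users, then groups, then everyone else. It is an assumption-laden model rather than the service's actual logic. The Acl dataclass, the is_allowed function, and the user and group names are all hypothetical, and the way Azure Data Lake Storage Gen2 evaluates ACLs may differ in detail.

```python
from dataclasses import dataclass, field

# Permission bits, as in POSIX mode strings (r=4, w=2, x=1).
R, W, X = 4, 2, 1


@dataclass
class Acl:
    """Simplified POSIX-style ACL for one file or folder (hypothetical model)."""
    owner: str
    group: str
    owner_perms: int
    group_perms: int
    other_perms: int
    named_users: dict = field(default_factory=dict)   # user name -> permission bits
    named_groups: dict = field(default_factory=dict)  # group name -> permission bits
    mask: int = R | W | X  # upper bound for named-user and group entries


def is_allowed(acl, user, groups, requested):
    """Return True if `user`, a member of `groups`, holds all `requested` bits."""
    # 1. The owning user is checked first; the mask does not apply to the owner.
    if user == acl.owner:
        return (acl.owner_perms & requested) == requested
    # 2. A named-user entry comes next, limited by the mask.
    if user in acl.named_users:
        return (acl.named_users[user] & acl.mask & requested) == requested
    # 3. Owning-group and named-group entries: any single match may grant access.
    candidates = []
    if acl.group in groups:
        candidates.append(acl.group_perms)
    candidates += [p for g, p in acl.named_groups.items() if g in groups]
    if candidates:
        return any((p & acl.mask & requested) == requested for p in candidates)
    # 4. Everyone else falls through to the "other" entry.
    return (acl.other_perms & requested) == requested


if __name__ == "__main__":
    acl = Acl(owner="alice", group="analysts",
              owner_perms=R | W | X, group_perms=R | X, other_perms=0,
              named_users={"bob": R | W}, named_groups={"etl": R | W | X},
              mask=R | X)
    print("bob may read:", is_allowed(acl, "bob", [], R))
    print("bob may write:", is_allowed(acl, "bob", [], W))
```

In the usage example, the named user has a read and write entry, but the mask limits the effective permissions to read and execute, so the write request is denied.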

As Azure Data Lake Storage Gen2 is fully integrated with Blob Storage, customers can access data through the new file system-oriented APIs or through the object store APIs from Blob Storage. Customers also get all the benefits of Azure Blob Storage, including encryption at rest, object-level tiering, and lifecycle policies, as well as high availability and disaster recovery capabilities such as zone-redundant storage (ZRS) and geo-redundant storage (GRS). All of this will come at a lower cost and a lower overall total cost of ownership (TCO) for customers’ analytics projects! Azure Data Lake Storage Gen2 is the most comprehensive data lake available anywhere. At general availability, Azure Data Lake Storage Gen2 will be available in all Azure regions.
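The two access paths described above can be pictured as two addresses for the same stored data. The sketch below builds both forms for one file, assuming the commonly documented endpoint patterns: the abfss scheme on dfs.core.windows.net used by the Hadoop file system driver, and an HTTPS URL on blob.core.windows.net for the object store API. The helper name adls_uris and the account, file system, and path values are hypothetical placeholders, and endpoint details should be checked against current Azure documentation.

```python
def adls_uris(account, filesystem, path):
    """Build the file system style and object store style addresses for one file.

    All arguments are illustrative placeholders, not real resources.
    """
    path = path.lstrip("/")
    return {
        # Hadoop-compatible file system endpoint (ABFS driver, TLS variant).
        "filesystem": f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path}",
        # Blob Storage object endpoint; the file system maps to a container.
        "blob": f"https://{account}.blob.core.windows.net/{filesystem}/{path}",
    }


if __name__ == "__main__":
    for kind, uri in adls_uris("exampleaccount", "analytics", "/raw/events.csv").items():
        print(kind, uri)
```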

To enable a seamless experience with leading open source providers of Hadoop and Spark analytics engines, Microsoft are working closely with partners to make Azure Data Lake Storage Gen2 the most optimized data lake solution for customers.

“As a key partner, Cloudera has been working very closely with Microsoft since our integration of CDH with the first generation of Azure Data Lake. We are confident that Azure Data Lake Storage Gen2 will provide a superior experience for our CDH customers, specifically from a performance and stability perspective. We are very excited to announce our commitment in providing comprehensive platform support for Azure Data Lake Storage Gen2.”

– Vikram Makhija, General Manager for Cloud, Cloudera

