Introduction
The power of an AI or analytics use case comes from the underlying data platform. Without the right data foundation, an organization cannot unlock the true potential of its data or replicate a use case efficiently across teams. This raises the question of choosing the best data platform. To answer it, let’s dive into the basic functionalities of a foundational data platform and how Microsoft Fabric can be one of the best data analytics solutions.
Microsoft Fabric
A data platform must offer the following key functionalities:
- Data integration
- Storage
- Processing
- Governance
- Reporting
‘Fabric’, a SaaS offering from Microsoft, is a unified data analytics solution that delivers all the functionalities listed above, covering everything from the data lake through data engineering, data integration, and data science in one place. Microsoft Fabric combines proven Azure PaaS services such as Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Azure Machine Learning into a unified SaaS-like experience that enables collaboration between data teams. This experience helps not only IT teams but also business groups take advantage of a modern data platform. Along with unifying these data services, Microsoft Fabric introduced new concepts such as OneLake, OneLake Shortcuts, V-Order, and Direct Lake in Power BI. Let’s look at each of them.
OneLake
OneLake can be thought of as OneDrive for data. It is a unified logical data lake built on Azure Data Lake Storage Gen2, created to give customers a single data lake for the entire organization and a single copy of data that multiple analytical engines can use. It is fully managed by Microsoft and designed as a single place to store and manage all your enterprise analytical data. Like Azure Active Directory and OneDrive, OneLake is automatically provisioned for each Azure tenant and is aligned with its Azure Active Directory. The main advantages for customers are the prevention of data duplication across different domains and the elimination of isolated data lakes for each use case. Data in OneLake is stored in OneLake workspaces, and each data object within a workspace is called an item. Each OneLake workspace can be treated as a single storage container, and tenant admins can control and govern the access policies of each workspace. OneLake supports any file format, whether the data is structured or unstructured.
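Because OneLake sits behind a single endpoint, every item can be addressed with one URI convention, much like a OneDrive path. The sketch below builds such a path; the workspace and item names are hypothetical, and it assumes the documented `onelake.dfs.fabric.microsoft.com` endpoint:

```python
def onelake_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build an abfss:// URI for a file or folder inside a OneLake item.

    OneLake exposes the whole tenant through a single ADLS Gen2-compatible
    endpoint, so existing ADLS tooling can read these paths directly.
    """
    base = f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/{item}.{item_type}"
    return f"{base}/{path}" if path else base

# Hypothetical workspace "Sales" containing a lakehouse item "Orders"
print(onelake_uri("Sales", "Orders", "Lakehouse", "Files/raw/orders.parquet"))
# abfss://Sales@onelake.dfs.fabric.microsoft.com/Orders.Lakehouse/Files/raw/orders.parquet
```

The same convention applies to every item type in a workspace, which is what lets multiple engines share one copy of the data.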
OneLake Shortcuts
OneLake Shortcuts allow users to share data in OneLake across different applications without unnecessarily moving or duplicating it. A shortcut is a reference to data stored either internally in OneLake or externally, for example in Azure Data Lake Storage Gen2 or Amazon S3 buckets. When teams work independently in separate OneLake workspaces, shortcuts are helpful for combining data spread across multiple locations. Their main benefit is that, wherever the referenced data actually resides, it appears as if it were stored in local files and folders.
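Shortcuts can be created through the Fabric portal or its REST API. The sketch below only assembles a request body for an ADLS Gen2 target; the field names follow the public Fabric REST API for shortcuts, but the account URL, shortcut name, and connection ID are hypothetical, so treat the shape as an assumption to verify against the API reference:

```python
import json

def adls_shortcut_payload(name: str, location: str, subpath: str,
                          connection_id: str) -> dict:
    """Build the JSON body for creating an ADLS Gen2 shortcut under a
    lakehouse's Files folder. Only this reference is stored in OneLake;
    no data is copied."""
    return {
        "path": "Files",
        "name": name,
        "target": {
            "adlsGen2": {
                "location": location,
                "subpath": subpath,
                "connectionId": connection_id,
            }
        },
    }

payload = adls_shortcut_payload(
    "external_sales",                              # hypothetical shortcut name
    "https://mystorageacct.dfs.core.windows.net",  # hypothetical storage account
    "/container/sales",
    "00000000-0000-0000-0000-000000000000",        # placeholder connection ID
)
print(json.dumps(payload, indent=2))
# A client would POST this body to the workspace item's shortcuts endpoint.
```

Once created, the shortcut shows up as an ordinary folder in the lakehouse, which is exactly the "looks like local files" behavior described above.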
V-Order
V-Order is like a superpower for your data files, especially when working with Microsoft Fabric compute engines such as Power BI, SQL, and Spark. It is a smart optimization technique for parquet files that makes reading data super-fast. When Power BI or SQL uses Microsoft’s Verti-Scan technology on V-Ordered parquet files, your data is accessible almost as quickly as if it were already in your computer’s memory. Even engines without Verti-Scan, such as Spark, still get a boost – around 10% faster reads on average, and sometimes up to 50%. V-Order achieves this by rearranging and compressing the contents of your parquet files. This special sorting, row group distribution, dictionary encoding, and compression mean your compute engines need less network, disk, and CPU power to read the data. That is not just good for speed; it is also cost-efficient. V-Order follows the open-source parquet format, so the files remain compatible with all parquet engines. It also works efficiently alongside Delta table features such as Z-Order, and you can control V-Order on a table and its partitions using table properties and optimization commands.
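As a concrete sketch of those "table properties and optimization commands": in Fabric Spark, existing files are rewritten with V-Order via an `OPTIMIZE ... VORDER` statement, optionally combined with Z-Order. The helper below just builds that SQL string; the table and column names are hypothetical, and running the result assumes a Fabric Spark session:

```python
def vorder_optimize_sql(table, zorder_cols=None):
    """Build the Spark SQL OPTIMIZE command that rewrites a Delta table's
    parquet files with V-Order, optionally Z-Ordering by given columns."""
    cmd = f"OPTIMIZE {table}"
    if zorder_cols:
        cmd += " ZORDER BY (" + ", ".join(zorder_cols) + ")"
    return cmd + " VORDER"

print(vorder_optimize_sql("sales_orders"))
# OPTIMIZE sales_orders VORDER
print(vorder_optimize_sql("sales_orders", ["customer_id"]))
# OPTIMIZE sales_orders ZORDER BY (customer_id) VORDER

# In a Fabric Spark notebook (where `spark` is provided) you would run, e.g.:
#   spark.conf.set("spark.sql.parquet.vorder.enabled", "true")  # session-level writes
#   spark.sql(vorder_optimize_sql("sales_orders"))
```

Session-level and per-table settings control how *new* files are written, while `OPTIMIZE ... VORDER` retrofits files that already exist.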
Direct Lake
In the ever-evolving landscape of Microsoft Fabric and Power BI, a significant stride has been made with the introduction of Direct Lake mode. This is not just another feature; it marks an essential evolution in Power BI, promising greater efficiency and a better user experience. Before delving into Direct Lake, let us understand the traditional connection modes in Power BI:
- Import mode: This mode pulls data into Power BI’s internal memory, allowing quick performance and utilization of the full spectrum of Power BI capabilities. However, its size limitation makes it less suitable for large datasets.
- DirectQuery mode: This mode queries data sources directly, making it ideal for massive datasets. The disadvantages are slower performance and limitations on Power Query transformations.
Now, Direct Lake mode emerges as a game-changer. It is a new data connection feature exclusive to Microsoft Fabric that lets Power BI read Delta parquet files directly from OneLake. This mode enables users to access vast amounts of data without traditional limitations, providing real-time, large-scale analytics without the typical performance bottlenecks. By combining the strengths of both Import mode and DirectQuery mode, Direct Lake opens up new possibilities for handling vast datasets in real time.
Conclusion
With Fabric, users can focus on creating without the burden of managing the underlying infrastructure, benefiting from seamless integration, shared experiences, and enhanced governance across all analytics components. It represents a user-friendly, end-to-end solution that streamlines the complexities of enterprise analytics.
Drop a query if you have any questions regarding Microsoft Fabric and we will get back to you quickly.
FAQs
1. Is Direct Lake Mode accessible on the Power BI desktop?
ANS: – Direct Lake mode is not accessible directly from Power BI Desktop. It requires creating a new Power BI dataset in the Power BI Cloud experience.
2. What are the limitations of Direct Lake Mode?
ANS: – The limitations of Direct Lake mode in the web version include the immediate application of changes without a rollback option and the inability to create calculated columns and tables.
WRITTEN BY Yaswanth Tippa
Yaswanth Tippa is working as a Research Associate - Data and AIoT at CloudThat. He is a highly passionate and self-motivated individual with experience in data engineering and cloud computing with substantial expertise in building solutions for complex business problems involving large-scale data warehousing and reporting.