Introduction to Building Scalable Machine Learning Models
Machine learning is an ever-evolving field. In the early days, the focus was on implementing historic algorithms using computing power that was previously unavailable. Today, ML has expanded across multiple areas of interest: scaling models with ever-increasing quantities of data, packaging models so that they fit into an MLOps framework, tracking and logging models, making them explainable, and squeezing the best performance out of them. These aspects are rarely covered all at once. This pre-con takes all these key topics and delivers them on the Databricks platform, one of the leading tools in the industry for data manipulation, machine learning and distributed computing. It utilises a powerful Spark engine under the hood whilst presenting an easy-to-use bespoke notebook for the front end.
This session draws on years of experience from machine learning practitioners to deliver a condensed, concise course that helps fast-track any machine learning novice towards mastery.
The pre-con will cover building real-world models in Azure, further techniques for robust model building, the latest craze in ML known as feature stores, advanced model tracking and key considerations when scoring models. The pre-con will contain talks, coupled with follow-along labs giving an element of hand-holding with some more tricky tasks thrown in. The sessions will also explore real-world use cases that are rife in the business world.
The Data Cave (Dijlezaal) Mon 09:00 - 17:00
Power BI Source Control & CI/CD
This training day takes you from zero to hero with respect to using source control and CI/CD with Power BI projects.
Learn directly from Mathias who has many years of experience with defining, implementing and evolving Power BI source control practices and tools. His open-source tool pbi-tools has pioneered source control capabilities for Power BI, and has directly and indirectly influenced many investments and releases in the Power BI ecosystem by Microsoft.
With many new tools and features available in 2023, this session is your opportunity to learn, hands-on, how to get started with source control, which tools are available, and what it takes to implement them successfully for your team/project.
The session covers source control basics and explains the use of git, Visual Studio Code and Azure DevOps in conjunction with Power BI. All the latest Power BI source control and collaboration features are covered, including Power BI Deployment Pipelines as well as advanced CI/CD pipelines via external tools.
1. Hands-on source control tutorial
2. Setting up CI/CD in your organization
3. Advanced workflows and branching strategies
The Vault (Scala) Mon 09:00 - 17:00
Power BI Report Design and Development workshop
This one-day collaborative workshop will cover the various Power BI design stages learned from years of consulting and working with some of the largest enterprises. The workshop, divided into five modules, enables you to bring your end user's data to life with Power BI.
1 The Basics of Data Visualization
In this first module, you will get an introduction to data visualization and discover why telling the right story to the right audience is so important.
2 Aligning people and data
This module will teach you how to conduct workshops and gather requirements from stakeholders and end users. Then you will learn techniques to explore data and identify meaningful patterns. In the hands-on workshops, we will use sample templates to ask the right questions and convert captured requirements to use cases. By the end of this section, you will be able to determine the 5 W's and How (who, what, when, where, why, and how) to create an optimal Power BI model.
3 Use cases and data patterns to drive visualizations
In this section, you will learn concepts and techniques that will assist you in identifying the right visual based on data and use cases. You will understand the most used visual categories like Key Metrics, charts, Maps etc. and then learn how to use them in Power BI to create stand-out visuals.
4 How to find the right visuals
In this section, you will learn tips and techniques to identify the right visual for the right audience and their use cases. The section also includes an exercise to find the right visuals based on your previously identified use cases.
5 Putting the story together
In this module, we will tie it all together using my top 10 techniques to create a stunning Power BI report. We will cover everything, from a blank canvas to an actionable report, and ensure it is accessible to all users. We will end the day with a final workshop to create reports based on different audiences. A ready-to-use Power BI model, personas and use cases will be provided.
Cloud Fortress (Alcazar) Mon 09:00 - 17:00
Analytics superhero in a day with Microsoft Fabric
Are you ready to become an analytics superhero?
Hesitate no more! Microsoft Fabric is out and building an analytics solution cannot get any easier! If you believe that the best way to learn is to learn by example, join us!
We will build a new greenfield analytical solution using the new all-in-one Microsoft Fabric, which covers everything from data movement to data science, Real-Time Analytics, and business intelligence. You will enjoy a highly integrated, end-to-end, and easy-to-use product that is designed to simplify your analytics needs.
We will start off the day with an overview of Microsoft Fabric, set up the environment and explain the scenario for the day.
An analytics solution is of no use without data. So, we will show how data is stored in Fabric, and how you can integrate existing data or ingest data from different sources. Here we will create a landing zone and integrate existing data without making another copy!
Once data is in Microsoft Fabric OneLake, we will process it efficiently while demonstrating best practices. Data engineers are used to Spark so it will be our tool of choice in creating different layers while tackling all the things you need to know. During this part, you will learn how to incrementally load data and make sure your cleansing is done right and in an effective and dynamic way! Rest assured, SQL folks are not forgotten! We will show how you can access the same data using SQL!
Of course, there is no use in loading data without a serving layer for your business needs. We will show you how you can use Lakehouse & Warehouse in Microsoft Fabric, focusing on performance!
Finally, we will show you how to build a Power BI report using the new consumption method, called Direct Lake, which reads data directly from the lake and skips data engines.
If you do not know where to start, don’t worry, during this day we will explain every step and how Fabric helps you get the job done. Also, we will share the best practices seen at most customers!
After this day you will have a clear understanding of how Fabric works and what components you must use for your own top-notch analytical solution!
Fabric is brand new, so instead of throwing terminology here, we are inviting you to join us and become the analytics superhero!
The Great Hall (Auditorium) Mon 09:00 - 17:00
SQL Server Performance Tuning and Optimization - FULL DAY PRE CON
Do your users complain about slow reports? Are your database servers overwhelmed during times of high usage? Every SQL Server environment can benefit from performance tuning, whether your environment has one server or thousands. In this full-day session you will learn how to identify problems using a wide variety of tools and scripts and how to implement best practices across your environment. Additionally, you will learn how to begin reading execution plans and how to tune queries to improve your performance within SQL Server. You will walk away with a list of items to evaluate in your environments and ways to resolve common issues. This session will guide you through real-life performance problems which can be solved by best practices and practical skills. Taught on a level anyone can understand, this session will focus on Microsoft SQL Server 2016 and forward.
You will also learn about maintenance activities and how they affect your server’s overall performance, and how to identify when your infrastructure is affecting your performance. Lastly, we will cover the newest performance enhancements coming with the latest release, SQL Server 2022. You’ll leave this demo-filled session better prepared to tackle many issues that can plague SQL Server performance along with the knowledge of how to resolve them.
Hall of Data (Herten Aas) Mon 09:00 - 17:00
So much hybrid SQL goodness
Over the last years, Microsoft announced and released multiple services under their Azure Arc umbrella.
But what's in it for you? Join me on a journey through the different offerings in Azure Arc and how they can benefit you as a data professional!
While there are countless offerings, we'll mostly focus on those with a clear data focus: Azure Arc-enabled SQL Server, Azure Arc-enabled Data Services and Azure Arc-enabled Machine Learning.
But Arc isn't the only hybrid puzzle piece: SQL Server 2022 also comes with a bunch of new hybrid features like Synapse Link or MI Link, so we'll take a look at those, too!
You'll walk away from this demo-loaded day with a good understanding of what Microsoft's hybrid offerings are and how they can benefit you, as well as with pointers to more deep-dive materials on the specific offerings!
The Watchtower (Begijnenzolder) Mon 09:00 - 17:00
Synapse SQL for Microsoft Fabric
Learn how Microsoft completely rebuilt Synapse SQL, moving from having both the Serverless and Dedicated Pools to a single engine.
Get a behind-the-scenes look at the architecture with showcases on scalability, performance, VertiPaq/V-Order and much more.
Welcome to the story about the Witch, the Squirrel, and the Wardrobe.
The Great Hall (Auditorium) Tue 09:00 - 09:30
SQL Server Partitioning from Zero to Hero
Ever since the incident, you’ve been wanting to learn more about database partitioning in SQL Server. This session walks you through what partitioning really does, and what it doesn’t do. We’ll take a look at some of the quirks of partitioning, as well as how to use it to boost performance, enable high(er) availability, and even build lightning-fast data retention policies.
The Data Cave (Dijlezaal) Tue 09:45 - 10:45
Real-world CI/CD for SQL Server using Azure DevOps deep dive
In this expert session we dive deeper into CI/CD for SQL Server using Azure DevOps, focusing on the Azure Pipelines service within Azure DevOps.
Topics covered include:
• Set up optimal settings for Azure Pipeline Agents for your environment
• Decide whether to use Classic Editor or YAML pipelines
• Customize YAML pipelines for various CI/CD scenarios
• Work with secrets
• Configure manual and automated approvals to different environments
Even though this session focuses on SQL Server you can use a lot of the concepts in this session for other Microsoft Data Platform services.
At the end of this session you will walk away with a better understanding of how to do SQL Server deployments in your work environment using Azure Pipelines.
The Vault (Scala) Tue 09:45 - 10:45
Power BI CI/CD Options in 2023
As Power BI is being adopted by more and more businesses, there is a growing number of technical data management teams finding themselves with a desire to implement DevOps/DataOps principles and practices as part of their delivery process. CI/CD pipelines are an integral part of those systems, and for a long time Power BI was lacking suitable CI/CD capabilities. This situation has changed significantly in 2023, and several viable solutions are now available, both from Microsoft as well as from external tools.
This session provides an annotated overview of the Power BI CI/CD landscape in 2023. Looking at datasets, reports and other workspace artifacts, attendees will learn which options are available and what it takes to implement them. Advantages and disadvantages are explained as well as licensing requirements.
Mathias is uniquely qualified to present this topic: His open-source tool, pbi-tools, pioneered source control and CI/CD capabilities for Power BI. Earlier in 2023, Mathias contributed the new TMDL declarative data modeling language to Power BI, specifically to enhance the ability to manage Power BI projects with DevOps tooling and enable enterprise-level collaboration scenarios. For anyone getting started with CI/CD for their Power BI projects as well as existing practitioners interested in updating their technical understanding of today's capabilities, this session is a must.
Cloud Fortress (Alcazar) Tue 09:45 - 10:45
Microsoft Fabric, Lakehouses and Power BI: A guide for BI developers
Microsoft Fabric is here, and it fundamentally changes our data landscape. It introduces a Power BI-like SaaS model for the data platform. With a few clicks, you can now ingest data into the cloud by running pipelines, create a Lakehouse to make that data accessible and run Python, Spark, SQL and DAX on top of it. What does this all mean for Power BI? There are a lot of architectural changes, and you might wonder if you now need to start refactoring your Power BI solutions.
In this session we get you ready to decide whether you want to start using the new Fabric workloads for your Power BI solutions. First, we introduce what Microsoft Fabric is, then we give an overview of the lakehouse and how the data is stored in Parquet files under the covers. Next, we will see how this data is exposed to Power BI through Direct Lake and how this differs from DirectQuery or Import. We will close off with demos where we solve some common Power BI patterns like incremental refresh and transactions on top of data getting loaded into the Fabric lakehouse.
The Great Hall (Auditorium) Tue 09:45 - 10:45
Navigating the Jungle of ML: Using MLOps in Azure to Track and Tame Data and Model Drift
In the wild world of machine learning, it's essential to stay on top of model and data drift to keep your models running smoothly. In this presentation, we'll show how to use Azure Machine Learning to do MLOps and how you can track and tame these sneaky beasts. We'll explore the concepts and importance of MLOps, the challenges it presents, and the best practices. Through a series of real-world examples and hands-on demonstrations, we'll show you how to use Azure Machine Learning to identify and overcome drift, so your models can continue to thrive in the jungle of machine learning.
Hall of Data (Herten Aas) Tue 09:45 - 10:45
Speaking Human - Learn to Ask Better Questions to Solve the Right Problems
Asking questions is easy. Anyone can do it, and, in fact, it is done every day. Asking good questions is harder. It takes some thinking about the problem before the question comes to mind, but it can be done.
Asking really good questions is very, very hard. It's so hard that the really good questions are rare. Why is that? It might be because we're asking the wrong questions.
We think that the technical community tends to jump straight to trying to solve the problem as stated instead of considering if the problem identified is even the actual problem in the first place! By taking a step back and considering the bigger picture instead of focusing solely on the stated problem, we can not only find other solutions, but we might even find better questions to ask. In essence: look further. Ask better questions.
As a journalist, Linda used to be all about open-ended questions in order to let the subject frame the problem. As a solutions architect, Alexander was all about trying to avoid detailed technical solutions to a people problem. Together we found better ways to find the answers that mattered.
We want to help you figure out how to ask better questions - and subsequently how to become a better problem solver.
We will discuss why words matter, when you should not always listen to what is said, and how your choice of questions will always give you the answers you deserve.
The Watchtower (Begijnenzolder) Tue 09:45 - 10:45
From Frustration to Fun: Mastering Power BI Errors with a Smile
Errors are often the most valuable learning moments. This session is about the most common errors in Power BI Desktop and the Power BI Service that every Power BI report developer will encounter at some point, and how to fix them. I will demonstrate each error, explain the error message (which often doesn’t provide sufficient guidance), why the error is thrown and what needs to be done to fix it, including debugging strategies. By unleashing the Power of Errors, I am transforming setbacks into valuable insights!
The Hive (Verloren Zoon) Tue 09:45 - 10:45
Azure Arc-enabled SQL MI – More than just another kind of SQL Server
Join me, as we explore the capabilities of Azure Arc-enabled SQL Managed Instances.
We will start with a quick introduction of Azure Arc-enabled Data Services: What are they and what do you need to deploy them?
After that, we’ll focus on Azure Arc-enabled SQL Managed Instances and how they provide much more than just another SQL Server deployment, from evergreen versions through managed updates, peace of mind through managed backups and restores, flexibility through their pay-as-you-go model and more!
You will leave not only with a solid understanding of their capabilities but also the scripts and tools to deploy and manage Azure Arc-enabled SQL Managed Instances today!
The Data Cave (Dijlezaal) Tue 11:00 - 12:00
Parallelism in Microsoft SQL Server
SQL Server can execute queries in parallel which can sometimes improve the performance of your queries. However, parallelism has its own challenges; therefore, we will look at the concept of parallel executions to understand when it makes sense to run a query on more than one thread. We deal with topics such as CXPacket, threads, workers, execution context, branches, MAXDOP, Cost threshold for parallelism, NUMA nodes and the iterators that SQL Server can implement in the execution plan to enable parallelism.
The Vault (Scala) Tue 11:00 - 12:00
Diving into the depths of OneLake!
By now you’ve probably heard lots about Microsoft Fabric, the latest and greatest SaaSified analytics solution from Microsoft. In a nutshell, Fabric enables you to work on a single copy of data using your favourite analytical engine. Whether you want to write good old SQL queries, do advanced data engineering using Spark notebooks or you just want to create a report in Power BI, Fabric has you covered. This is all possible through OneLake, the next chapter in the data lake storyline.
I hear you thinking “how does that work/perform?”, “does it integrate with other technologies?”, “what are some of the gotchas?” ... In this advanced session we’ll answer all these questions and instantly transform you into a OneLake expert! We will dive deep into the technology it’s built on, giving you a technical breakdown of the key features whilst also addressing some of the things you need to be aware of before plunging in!
Specifically, we will cover:
- Brief overview of the evolution of data lakes
- Distinctive features that set OneLake apart, focusing on:
  - file format
- Hands-on demo illustrating its usability and performance
- Current limitations and unsolved feature mysteries
- How to start working with OneLake
The session is designed for data engineers and data analysts who have a fundamental knowledge of Azure data services and want to get a deeper technical understanding of Microsoft Fabric’s OneLake.
Cloud Fortress (Alcazar) Tue 11:00 - 12:00
Power BI Hidden Gems
Power BI has many capabilities for business analytics, not just for reporting but also for managing your data and scaling your items. In this demo-heavy session, you will learn about hidden items that most folks aren’t aware of or just don’t think about. You will learn how these items work and understand how you can leverage them in your solution.
The Great Hall (Auditorium) Tue 11:00 - 12:00
Simple and cheap BI solution for smaller organizations – Case story
When you are a smaller organization, doing analytics in the cloud might look unpredictable and expensive. All the vendor material is aimed at larger organizations with the complexity and cost that comes with that segment.
What if you just want a “simple” data warehouse and some reports on top of that? What if you have only one major source system and a few other smaller sources?
Well, there is hope for you. Microsoft offers a lot of options when it comes to analytics in Azure. The problem for smaller organizations is to find the right solution which does not break the bank and does not take months to implement.
In this session we will talk about a pattern implemented for several smaller organizations in the private and public sector. We will use a smaller retail organization as our example and talk about the competencies needed and what technologies were chosen and why.
We will start by describing the organization, what challenges they faced and the competencies they had. Then we will see how the pattern fits and of course describe the pattern in detail.
Spoiler alert: It's not rocket science. It's a classic modern data warehouse with a data lake and Power BI on top. What is interesting is why this pattern works for organizations of that size and what makes it cost-effective and, above all, simple to maintain.
The audience will come away with a clear understanding of why this pattern fits smaller organizations and why it's cost-effective and simple to maintain.
Hall of Data (Herten Aas) Tue 11:00 - 12:00
Visualizing Data for Non-Data Experts: Making Reports Accessible to All
Creating reports that effectively communicate data insights to non-data experts can be tough! While you as a developer know all the ins and outs of Power BI and the data in the report, your audience may have just gotten started or be less familiar with interpreting data.
The charts, graphs and tables you provide with the intention to inform, engage and trigger the audience may even have a completely different effect: they confuse and overwhelm your audience.
In this session, we'll explore how to visualize data in a way that is accessible to everyone, regardless of their level of expertise. Attendees will learn how to identify their audience's needs, select the appropriate visualizations for their data, and present their findings in a clear and concise manner.
The session starts with discussing the importance of storytelling and how it can be used to create engaging reports that resonate with the target audience. Then it covers the key principles of effective data visualization, including:
* Choosing the right chart or graph for the data
* Simplifying complex data to make it easily digestible
* Incorporating visual cues to highlight important information
* Designing for accessibility, including colorblindness and other disabilities
Throughout the session, I provide practical examples and demonstrations of how to create effective data visualizations using Power BI.
Attendees will leave the session with a solid understanding of how to create reports that are accessible, engaging, and informative for everyone in their organization.
The Watchtower (Begijnenzolder) Tue 11:00 - 12:00
Elevating your PySpark development in Databricks: Towards data engineering excellence
As data engineering projects tackle complex and ever-growing amounts of data, many organizations turn to Databricks as their platform of choice for Spark workloads.
Databricks offers a dozen possibilities to develop your ETL pipelines, like notebooks, Python scripts, IDE support, and more. This sometimes makes it challenging to select the most suitable approach for a specific scenario or team composition.
This session will cover the different options that are available, discuss their pros and cons, and explain when to use which approach. Some software engineering best practices will also be covered, along with how they can easily be included in data projects.
By the end of the session, the audience will know some low-hanging fruit that offers key advantages for structuring projects and facilitates collaboration within them.
The Hive (Verloren Zoon) Tue 11:00 - 12:00
Have you experienced performance problems caused by contention in TempDB? Have you ever wondered why your TempDB is suddenly 3 TB? In this session, you will learn about all the various components of SQL Server that use TempDB. Whether it be AlwaysOn Availability Groups, Read Committed Snapshot version stores, spills, or simply temporary tables, learn how to identify what SQL Server or your applications are doing in TempDB. Once you understand all the ways SQL Server uses this critical resource, and how to properly configure it, you'll be better prepared for your workloads, whether they run on an Azure VM, a physical server, or a container.
The Data Cave (Dijlezaal) Tue 13:00 - 14:00
Time Series for relational people
Relational databases have been around for 40 years and they are a great choice for most data management needs. On the other hand, telemetry data has some properties that make it not exactly a great fit for a relational database. The last few years have seen the rise of time series databases, specialized for data that has a time attribute.
Join me to see what a time series database is, how it works and how you can use it in your projects. I will demonstrate how this technology enables new possibilities and overcomes some limitations of relational databases. Are you working with IoT telemetry data or performance metrics? This is the session for you!
The Vault (Scala) Tue 13:00 - 14:00
Raiders of the Lost ADX
Often we need to find a way to uncover the hidden insights buried deep within the data jungles of the company.
This can be a challenging task, but myths talk about the legendary Azure Data Explorer (ADX). Many claim it can handle even the most massive and complex data sets, as it's a brand new time series database in Azure.
If you can handle the adventure, join our two explorers when they set out on a journey to discover the vast capabilities of ADX.
We'll navigate through the setup and configuration of ADX like Indiana Jones did through ancient ruins. With whip-like queries we'll uncover the powers of the Kusto Query Language (KQL) to solve our data puzzles.
As we delve deeper into this data exploration, we'll find out that ADX is much more than a simple tool; it's like having a trusty sidekick by our side, helping us to navigate the treacherous waters of data analysis.
With much adventure, we'll also unearth the link between ADX and Azure Cosmos DB.
At the end of this ride, the audience will realize that with ADX on their side, they too can become daring data explorers. They will know when to use ADX and what its strengths and weaknesses are.
Cloud Fortress (Alcazar) Tue 13:00 - 14:00
Context transition in DAX
One of the primary reasons why some DAX expressions perform poorly is that they perform (too) many context transitions. But even worse, sometimes context transition (or the lack of context transition) causes unwanted results for your DAX expressions.
Context transitions occur in DAX when you need a filter context while you were in a row context. It can happen explicitly as well as implicitly.
This session dives into these context transitions. After a brief recap on row versus filter context, we will explore the different scenarios in which context transitions occur, and look into the factors that can make this process more expensive to execute.
This session is intended for participants who already have a basic understanding of row and filter context in DAX.
The Great Hall (Auditorium) Tue 13:00 - 14:00
Azure Networking infrastructure: is it really that boring?
Welcome to the wonders of Azure networking. Feeling excited yet? I'm guessing you're not. But don't write it off too soon, because after this talk you'll finally understand why we need to consider a proper networking architecture for our Azure resources. Despite being perceived as boring, Azure networking infrastructure actually has more value than meets the eye. In this talk, we will embark on a journey to demystify Virtual Networks (VNets), private and public endpoints, and many more. We'll keep it practical, assess pros, cons and risks and make sure you'll finally see it is not thàt boring after all.
Hall of Data (Herten Aas) Tue 13:00 - 14:00
Are you being influenced? Data privacy and the ethical challenges of data
We live in an exciting time where data is everywhere and available to everyone. We’ve seen the power of data put to use to increase safety, for instance by assisting car drivers to brake when someone suddenly crosses the road. Another way data is put to use, is by helping us broaden our horizon by suggesting what movie or band we should check out based on our viewing and listening behavior. Many of these data-driven innovations are meant to have a positive impact on our daily lives.
However, data, and the algorithms that use the data to drive innovations, are also used for less transparent or noble goals. From building psychological models on individuals–based on the things we “like” on Facebook–to influence our behavior, to deciding whether or not we should be hired for that new job we applied for.
Where exactly do we, people who work with data every day, draw the line between the ethical and unethical use of data? Unfortunately, the answer to that question is not as easy as you would think.
In this session, we are going to look at how data and algorithms change the way we interact with each other, buy products, and even influence who we will vote for in an election. By showing you how your data is used to influence your behavior, I hope you will be more empowered to answer the aforementioned question!
The Watchtower (Begijnenzolder) Tue 13:00 - 14:00
Avoid data visualization pitfalls and gain the trust of your end users
From scales, to axes, to colors, to subtitles, to graph type, to clutter and more: the development choices you make when creating a Power BI report determine how the data will be perceived and interpreted.
In fact, building a report is building trust with your end users.
Find out which pitfalls are easy to fall into in Power BI and, above all, discover how to get it right.
The Hive (Verloren Zoon) Tue 13:00 - 14:00
Accelerated Database Recovery - A Deep Dive Behind the Magic
We have all heard the stories. Horror story after horror story from database administrators all over the world, waiting hours or sometimes days for a rollback operation to complete. DBAs hoping beyond hope that the rollback finishes before someone else gets the notion to reboot the server.
With the release of SQL Server 2019, a new talisman was conjured that will save us all from hearing about future horror stories: Accelerated Database Recovery, also known as "ADR". This new mystical feature changes the way transaction rollback performs and will undoubtedly prove invaluable in your arsenal of magic tricks!
In this session, we'll show you the secret behind it!
You will -
• Learn about new components of the transaction log
• Discover the magic behind ADR
• Determine when you wouldn't want to implement it
Don't be the protagonist in a rollback horror story! After all, the time you save might be your own!
The Data Cave (Dijlezaal) Tue 14:15 - 15:15
Testing in DataOps, how to maximize your impact
In practice, DataOps is not as common for data & analytics as DevOps is for software engineering. For the latter, Development and Operations are jointly responsible for developing a system, deploying it and maintaining it, with the aim of delivering faster, being more agile and creating maximum business value. This is where DataOps is the same as DevOps: the objective is similar. But how we do this differs considerably.
Having the right data in the right place at the right time with the right quality is becoming increasingly important for supporting business decisions, optimizing, automating and powering AI models. Just like with software development, you want to deliver new functionality with premium quality much faster. You don’t want to make new data, new insights and new AI models available to the user only once a month, but as soon as they are ready for deployment. That is what DataOps can achieve in theory. But in practice one faces serious challenges that make it a lot more difficult to implement the DataOps process in an organization. For example, how to deal with development sandboxes and representative test data across systems.
In this session Vincent Goris will show what DataOps is and that it is not just DevOps for data. They will discuss the unique challenges, solutions for these challenges and their lessons learned.
- How does DataOps relate to DevOps/Database DevOps and what are the differences?
- What variables to consider when testing
- How to start testing your data pipelines
- A roadmap to implement DataOps in your organizationThe Vault (Scala)Tue 14:15 - 15:15
Migrating an on-prem SSIS/SSAS ETL to Synapse? What are the dangers, costs and lessons learned?
Migrating an on-premises SSIS/SSAS ETL project with 100+ tables to Azure Synapse may look like a big hurdle. However, with the right preparation and a list of best practices it becomes more manageable. In this session I want to share how we approached this migration, where we struggled and what lessons we learned. When migrating to Azure, cost is always an important aspect. I’ll also explain how we kept the costs under control and what the main cost drivers are.Cloud Fortress (Alcazar)Tue 14:15 - 15:15
Data modeling for experts with Power BI
Have you been working with Power BI for a while? Then it is time to take the next step with your data modeling skills!
You might have heard about Power BI composite models. But in order to implement them successfully, it is important to understand the different storage modes that Power BI has. This is key to a successful implementation and to understanding the possible dangers of the storage mode behaviors.
Moreover, let's also talk calculation groups! You might have been creating numerous calculations in the past for year to date, quarter to date and many other common patterns. With calculation groups you can limit the number of redundant measures by defining common expression parts in a group that can be applied on top of multiple measures.
And finally, we will combine the two: dealing with calculation groups in composite models, because there are some key elements you have to keep in mind. This combination can potentially lead to wrong results if you do not understand the behavior of the remote and local engines and, with that, how to deal with wholesale and non-wholesale queries.
During this demo-rich session I will take you on a tour through all the above-mentioned features and cases, and explain them one by one. After this session you will be able to:
- Understand and explain different types of storage modes in Power BI
- Combine DirectQuery and import storage modes in a composite model
- Successfully implement calculation groups
- Understand calculation groups in composite models and their behaviorThe Great Hall (Auditorium)Tue 14:15 - 15:15
How to apply the European AI Act when implementing Large Language Models like ChatGPT on Azure
In this session we will explain the basics of the European AI Act. We will elaborate on the key aspects that organizations must consider in the four steps of the AI process flow (Design, Development, Evaluation, Operation) as defined within the Conformity Assessment Protocol for AI Systems by Floridi et al., 2022 (CAP AI). We will focus on Large Language Model applications (based on solutions like ChatGPT) for a variety of use cases across several domains such as employment, education, healthcare, government or legal. In addition, we will discuss insights from our experience implementing the role of the AI translator within the innovation cycle and lessons learned from the ongoing development of a Model Risk Management Framework (MRM) for AI systems. Collaboration among interdisciplinary stakeholders, including legal, technical, business, and user experience teams, will become crucial for the success of AI tools. This collaboration is especially important for aligning with the European AI Act requirements and identifying the potential risks and benefits of the tools.
Floridi, L., Holweg, M., Taddeo, M., Amaya Silva, J., Mökander, J., & Wen, Y. (2022). CapAI-A Procedure for Conducting Conformity Assessment of AI Systems in Line with the EU Artificial Intelligence Act. Downloaded from: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4064091
Hall of Data (Herten Aas)Tue 14:15 - 15:15
Lessons learned: Governance and Adoption for Power BI
Rolling out Power BI within an organization is not an easy job. There are usually two strategies: decide on a governance and adoption strategy first and roll out Power BI accordingly, or make Power BI widely available and set up governance and adoption afterwards. Especially in the latter scenario, setting up a proper governance and adoption strategy might become a challenge. Users might resist new guidelines as they are used to their own way of working and feel neglected without any guidance on how to use Power BI. On top of that, there might be widely used solutions in place that violate company regulations, leading to more resistance to changing and rearranging the way of working.
This session focuses on the (re)governance and (re)adoption of Power BI within an organization in which Power BI is already being (widely) used by users with limited governance and adoption. As there are multiple paths to success, we will focus on a few core concepts to take into account when walking one of these paths. These concepts include, but are not limited to:
- Practices: Clear and transparent guidance and control on what actions are permitted, why and how.
- Content ownership: Managing and owning the content in Power BI.
- Enablement: Empowering users to leverage Power BI for data-driven decisions.
- Help and Support: Establishing a support system with training, various levels of support and community.
In this session we will combine these theoretical concepts with the lessons we learned while implementing them in various projects.The Watchtower (Begijnenzolder)Tue 14:15 - 15:15
Graph Literacy – The best proof of data are graphs
This session is about 'why' people should care about being graph literate, knowing when to choose which graph, for which audience, and for which purpose. In a world where data is exploding and companies are buying expensive ETL tools, data platforms, and consultants, it is a pity the data visualization part is often forgotten. It is necessary to know the psychological 'why' behind being able to communicate data effectively. In the session we dig deeper into the three main aspects of what it takes to become graph literate and how you can embrace this new culture inside your organisation.The Hive (Verloren Zoon)Tue 14:15 - 15:15
The Combined Power of Microsoft Fabric for Data Engineer, Data Analyst and Data Governance Manager
Are you ready for an illuminating exploration of Microsoft Fabric? In this session, we bring together three data professionals to showcase how the power of collaboration and a unified data platform can amplify your organization's data-driven success. Before Fabric, each data professional would typically work with their own tools and capabilities, each with their own strengths, weaknesses, and experiences. With Fabric, this is no longer necessarily the case. In this joint session, three data professionals will share their unique perspectives on how different roles can thrive within the Fabric ecosystem.
The Data Engineer: Meet "Emma", the visionary Data Engineer with an eye for transforming raw data into actionable insights. She will delve into the realm of Notebooks and Spark Jobs, illustrating how she harnesses the capabilities of Fabric to architect data pipelines, drive automation, and ensure data integrity within the Lakehouse. Discover how Emma empowers her team by seamlessly integrating various data sources, all residing in OneLake.
The Data Analyst: Introducing "Jenna", the meticulous Data Analyst who excels in the art of harmonizing dataflows and nurturing the vital relationship between data and BI. Witness her unveil the secrets of Dataflows, showcasing the power of low-code solutions and the synergy between Power Query and Fabric. Explore how Jenna's orchestration of data ensures accurate, cleansed, and harmonized insights across the entire organization.
The Data Governance Manager: Last but not least, join us on a journey with "Mia", the brilliant Data Governance Manager who ensures that data remains a strategic asset within the organization. With her keen eye for compliance and data integrity, Mia will guide us through the realm of data governance within Microsoft Fabric. Explore how Mia establishes robust data governance frameworks, enforces data privacy regulations, and oversees metadata management to drive responsible and ethical data practices. Witness her expertise in balancing data access controls and democratizing data insights while maintaining regulatory compliance and safeguarding sensitive information.
Together, Emma, Jenna and Mia will demonstrate how their unique roles intertwine within Microsoft Fabric, all driven by a shared vision and powered by the comprehensive capabilities of OneLake. Prepare to be inspired as we uncover the synergistic magic that happens when diverse perspectives converge to unlock the true potential of data management.The Data Cave (Dijlezaal)Tue 15:30 - 16:30
From Cloud APIs to Your Screen: A Demo of Retrieving, Storing, and Serving Data in Azure
This session started with a question on Reddit: “How do I get a list of VMs in Azure?”. I started down the path of building a PowerShell script, and it eventually became a small application. In this demo-filled session you will learn about the following technologies:
• Azure Automation
• Azure Blob Storage
• Azure Active Directory Managed Identities
• Azure Logic Apps
• Azure SQL Database
• Azure Web Apps
You will learn how to take the basic idea of a PowerShell script, persist its output, and then operationalize and secure the application. You will also learn about the challenges encountered while building the app, and how I worked to resolve those challenges.The Vault (Scala)Tue 15:30 - 16:30
Accelerate Your Data Lakehouse: Unleashing the Power of Databricks Delta Lake!
Join this engaging session as we delve into the critical role of query performance in the world of Data Lakehouses. While cloud-based data lake storage offers cost-efficiency, it often falls short in terms of performance. Fortunately, Databricks has developed a groundbreaking solution: Delta Lake. To highlight the relevancy of this topic, solutions like Fabric have also recognized the immense value of Delta Lake and implemented it within their framework.
In this inclusive session, we will explore the mechanics of working with tables in the Delta format. Through captivating demos, you'll witness firsthand the seamless integration and functionality of Delta Lake. We'll delve into the hidden intricacies and demonstrate the need for optimization, empowering you to unlock the full potential of Delta Lake.
But optimization is the key to truly unleashing the power of Delta Lake. We'll showcase impactful optimization techniques through practical demos, allowing you to supercharge your Data Lakehouse. From intelligent partitioning to efficient data indexing, these demos will provide valuable insights to maximize query performance and accelerate your data analytics processes.
Join us on this collective journey to witness the mechanics and optimization of Delta Lake in action. Discover the art of optimization and unlock the true potential of your data-driven endeavors. Get ready to elevate your data analytics game as we embark on a hands-on exploration of the power of Delta Lake!Cloud Fortress (Alcazar)Tue 15:30 - 16:30
My Top 10 Power BI Tips, Tricks and Resources
Want to learn some excellent Power BI tips and tricks to make your Power BI reports sleek? In this session, I will talk about some of my favourite Power BI tips and tricks, gathered while working with different clients and open data projects. The session will range from simple tricks like visual settings to advanced functionality like commentary.
The Great Hall (Auditorium)Tue 15:30 - 16:30
Double up the Power: combining Power BI with PowerPoint for strong storytelling
For a few months now, it has been possible to embed your Power BI dashboard in a PowerPoint presentation. But this does not make it a datastory just yet...
Some of the 'rules of thumb' for building presentations and building dashboards clash with each other. So how can you use Power BI in your presentation?
We want our presentations to have an effect on our audience. Your audience needs to do something, stop doing something or never forget something. In order to create such a change in your audience, people need more than just the 'dry' numbers. Their emotions should be triggered.
In this session we'll look at the requirements for both your Power BI dashboard and your presentation to truly build your datastory, consisting of both a strong narrative AND strong visuals that trigger emotion. To instigate action we need more than the technical ability to include Power BI in PowerPoint. We need storytelling for our audience to make a change.Hall of Data (Herten Aas)Tue 15:30 - 16:30
The ComColors model for colourful and efficient teams
The quality of the relationships between co-workers within a company is the engine of performance, and therefore the main concern of companies and workplaces.
The fundamental approach of the ComColors model is to bring every individual of a team to accept themselves as they are in order to build on their strengths and talents.
Knowing our personality type as well as understanding others has a strong impact on our self-development and our relationships with our environment. It naturally leads us to understand our inner conflicts, the ones between what we think should be and what we can’t help but be (our personality type for example). Getting to know our personality type actually gives us the real permission to allow ourselves to be who we are.
Knowing your personality type allows you to identify:
• How you perceive the world and how you communicate
• What motivates your decisions
• What puts you under pressure
• The environment that places you in the best conditions for success
• The role you play in a team
The ComColors model can be used for both personal development and team management purposes.The Watchtower (Begijnenzolder)Tue 15:30 - 16:30
From Warehouse to LakeHouse: A serverless approach in Synapse Analytics
Today when we speak about a modern cloud data warehouse, it's all about "LakeHouses", combining key benefits of data lakes and data warehouses.
How can clients easily migrate from a traditional data warehouse following the Kimball methodology towards a modern lakehouse? And how can all of this be done in a serverless environment?
Furthermore, we speak about the benefits, pitfalls and differences with Databricks and what you can expect from the new Fabric environment.
During this session we will focus on the end-to-end process (demo) including logging and reporting on the delta lake.The Hive (Verloren Zoon)Tue 15:30 - 16:30
Code, deploy and maintain your Azure (data) Infrastructure with confidence
Have you been deploying your Azure databases and all connected resources through the portal?
Are you fed up with clicking, weird resource naming and, most of all, with having to deal with changes manually?
If you are working in Azure and you have anything to do with data and the infrastructure, this session is for you!
Azure Infrastructure as Code offers a plethora of possibilities, but the first time I checked it out, all I saw were Azure Resource Manager (ARM) templates. Hard to read, harder to write. They gave me headaches. It seems I wasn't the only one with that problem, because there are excellent tools to help you out! My favourite, and the one I'm using in this session is Terraform.
Now why is this presenter talking about this? I've deployed a number of customer environments with this language. Whenever there's a security update, like a new policy for example, I can deploy this to all customers in minutes. I only have to code this once and can easily deliver it many times, saving them time and money: resources we can spend in other areas like ETL, ELT etc.
During the session, I'll demonstrate the basics of a data deployment, following the spirit of the Microsoft Well-Architected Framework. I'll show you my way of working, the structure and the end result. There is no need to try and photograph what's happening on screen: all the scripts will be available after the session.The Data Cave (Dijlezaal)Tue 16:45 - 17:45
Avoid Data Silos! Best Practices for Implementing Shared Datasets
Fragmented data adds unnecessary cost and complexity within an organization! Within the Power BI platform these are called "data silos", where the ecosystem has nearly as many unique datasets as it does reports. This puts a burden on the developers who maintain these datasets, and on the servers when the datasets refresh. It can even cause discrepancies in the logic between datasets, resulting in different numbers! Thankfully this can be solved by leveraging the shared datasets feature in Power BI. In this session you will learn how to: properly configure a dataset for enhanced user experiences, enable row-level security (RLS) on a dataset to protect sensitive data, publish a dataset for optimal sharing and distribution, promote or certify a dataset for increased exposure, and view usage metrics to determine where the dataset is being used, and by whom.The Vault (Scala)Tue 16:45 - 17:45
Building Better Data Models with Tabular Editor's Best Practice Analyzer
Tabular Editor (TE) is a widely used tool for Power BI and Analysis Services data model development. Whether you're using the free or commercial version, its built-in Best Practice Analyzer provides a simple and effective way to validate your data models against established best practices. This session is designed to help you get the most out of the Best Practice Analyzer and ensure that your data models are optimized for performance, scalability, and maintainability.
You'll learn how to utilize existing rule sets published by TE, the Power BI Customer Advisory Team, and other third parties, and how to customize these rules to meet your specific requirements. You'll discover how to integrate the Best Practice Analyzer into your development process, including during validation steps in CI/CD pipelines, to catch potential issues early and improve your data models over time.
In addition, you'll learn how to create your own custom rule sets for naming conventions, code conventions, and other best practices specific to your organization. By the end of this session, you'll have a solid understanding of how to use the Best Practice Analyzer to improve the quality and performance of your data models and ensure that they meet the needs of your organization.
Note: To get the most from this session, attendees should have prior experience with Tabular Editor and a basic understanding of Analysis Services Tabular, Azure Analysis Services, or Power BI data models.Cloud Fortress (Alcazar)Tue 16:45 - 17:45
Behind the Hype - Architecture Trends in Data
As an industry, we data folks love a buzzword. We see huge swings between centralisation and decentralisation, we debate ferociously between warehousing methodologies from decades past and yet we're fairly susceptible to the latest and greatest thing taking the industry by storm. So what's the latest buzz?
In this session, seasoned data engineer and YouTube grumbler Simon Whiteley takes us on a journey through the current industry trends and buzzwords, carving through the hype to get at the underlying ideals. Which is going to last and which is a sales gimmick? Which bandwagon might actually take you in the right strategic direction? Do you know your Meshes from your Fabrics? Why is ETL suddenly 'Reverse'? This session aims to answer these questions with a heady mix of opinion & optimism!The Great Hall (Auditorium)Tue 16:45 - 17:45
Writing boardgames with T-SQL for SQL Server and Azure SQL
Transact-SQL has been around for many years, and most of us have been using it as a tool, a language to manipulate data. But what if we step away from these tasks and look at the fun side of T-SQL?
Can we use T-SQL to create a game, a drawing and gameplay? This session will explore a couple of possibilities for using T-SQL for fun. It is also a great way to explore and learn the capabilities of the language.Hall of Data (Herten Aas)Tue 16:45 - 17:45
Knee-Deep In Tech Live @ DataMinds Connect
We're recording an episode of Knee-Deep in Tech live at DataMinds Connect, and we want you to be a part of it! The episode is roughly 45 minutes, recorded in front of a live audience. The audience will get to be part of the episode in several ways, and the hosts will take questions that will be discussed in the episode.The Watchtower (Begijnenzolder)Tue 16:45 - 17:45
Breaking the language barrier in Power BI
In today's global business landscape, it has become crucial for companies worldwide to provide their data reports and dashboards in multiple languages. Whether it's multinational corporations with divisions spread across different countries or companies operating in regions where multiple languages are prevalent, the demand for multilingual support is undeniable.
Unfortunately, Power BI has so far fallen short in providing an out-of-the-box solution for creating multilingual dashboards, while other vendors offer one. But worry not, as there are solutions to overcome this challenge.
This session will present a live demonstration of what we believe to be the best solution to this problem. Join us now to empower yourself to break through the language barrier in Power BI and transform your reporting and dashboards into truly multilingual masterpieces.The Hive (Verloren Zoon)Tue 16:45 - 17:45