In this episode Andrew and Danny talk about modern data warehouses – what they are, how they differ from old-fashioned data warehouses, and why you might want one.
What is a modern data warehouse?
- A modern data warehouse is a database and analytics technology that helps you process big data using the full scale of the cloud.
- We are talking about products such as Azure Synapse Analytics and Snowflake.
How do they differ from older generations of data warehouses?
- Older generations of data warehouse were built on traditional relational database technology, and were therefore constrained by the performance of individual machines.
- The next generation introduced MPP – Massively Parallel Processing – which let data warehouses scale out by spreading work across many machines.
- Modern data warehouses separate compute from storage. Compute is the expensive part, storage is cheap (relatively). You can scale compute up and down as demand requires, whilst benefitting from massive-scale cloud storage.
- Modern cloud data warehouses are as much a system for managing cloud resources as they are a single product.
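To make the compute versus storage point concrete, here is a minimal sketch of the arithmetic. The prices, node counts and usage pattern are hypothetical figures chosen for illustration, not real rates for Azure Synapse Analytics or Snowflake, and `Pricing` and `monthly_cost` are made-up names.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical prices, for illustration only."""
    compute_per_node_hour: float
    storage_per_tb_month: float


def monthly_cost(pricing: Pricing, node_hours: float, stored_tb: float) -> float:
    """Compute is billed per node-hour used; storage is billed per TB held."""
    compute = pricing.compute_per_node_hour * node_hours
    storage = pricing.storage_per_tb_month * stored_tb
    return compute + storage


HOURS_PER_MONTH = 730  # approximate hours in a month
pricing = Pricing(compute_per_node_hour=2.0, storage_per_tb_month=20.0)

# Compute tied to storage: 8 nodes have to run all month.
always_on = monthly_cost(pricing, node_hours=8 * HOURS_PER_MONTH, stored_tb=50)

# Compute separated from storage: 8 nodes run 4 hours a day on 22 working
# days and are paused the rest of the time. The 50 TB stays in storage.
on_demand = monthly_cost(pricing, node_hours=8 * 4 * 22, stored_tb=50)

print(f"Always on: {always_on:,.0f}")
print(f"On demand: {on_demand:,.0f}")
```

With these assumed numbers the storage bill is identical in both cases; only the compute bill changes, which is where the saving comes from.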
Why might you want a modern data warehouse?
- Increasingly, we see companies adopting SaaS products for their major line-of-business systems.
- This is great for many businesses – lower operating and support costs, functionality that develops over time, and industry-specific solutions that fit the business.
- The downside is that this creates data silos – isolated pockets of data within the SaaS apps.
- You need to own your data, not rent it. Build a data warehouse and ingest the data from your SaaS apps so that you have a single view of the data your organisation relies on.
- This single view of data allows you to combine and crunch datasets together, which is simply not possible when your data lives in separate SaaS apps.
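As a rough illustration of that single view, the sketch below loads extracts from two imaginary SaaS apps (a CRM and a billing system) into one database and answers a question neither app could answer alone. An in-memory SQLite database stands in for the warehouse here, and the table names, columns and sample rows are all invented for the example.

```python
import sqlite3

# Hypothetical extracts from two separate SaaS apps.
crm_customers = [
    (1, "Acme Ltd", "Retail"),
    (2, "Globex", "Manufacturing"),
    (3, "Initech", "Retail"),
]
billing_invoices = [
    (1, 1200.0),
    (1, 800.0),
    (2, 500.0),
    (3, 300.0),
]


def build_warehouse(customers, invoices):
    """Ingest both extracts into one database (SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, sector TEXT)"
    )
    conn.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    conn.executemany("INSERT INTO invoices VALUES (?, ?)", invoices)
    conn.commit()
    return conn


def revenue_by_sector(conn):
    """Join CRM and billing data: total invoiced amount per sector."""
    rows = conn.execute(
        "SELECT c.sector, SUM(i.amount) "
        "FROM customers AS c "
        "JOIN invoices AS i ON i.customer_id = c.customer_id "
        "GROUP BY c.sector "
        "ORDER BY c.sector"
    ).fetchall()
    return dict(rows)


warehouse = build_warehouse(crm_customers, billing_invoices)
result = revenue_by_sector(warehouse)
print(result)
```

The sector lives only in the CRM data and the invoice amounts only in the billing data, so the join is only possible once both sit in the same place.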