It’s easy to be turned off when you hear talk of the “data warehouse”. For a long time, the term has been associated with defining the reports you want and then gathering the data you need to populate the reports, often via data intakes in nightly builds.
But this would often lead to discarding a lot of the data gathered – which doesn’t sound smart or resourceful. No surprise, then, that talk of data warehouses wouldn’t exactly set pulses racing.
But the modern data warehouse is a different beast, much more in tune with current data best practices.
Today’s data warehouse is geared to allow you to gather all the data you might want and then to crunch it down afterwards in many ways. Rather than populating a static report, data warehouses now give you the flexibility to analyse what you want, and that offers more potential for useful business insight.
What problems do data warehouses solve?
There are three big issues that data warehouses address:
1. Data silos – where your data can’t deliver maximum value
Data that’s self-contained and separate from your other systems is of limited value. Yet many everyday business tools are now delivered as a service, each holding its own data:
- CRMs, e.g. Salesforce, HubSpot
- ERPs, e.g. Dynamics 365, SAP HANA
- Accounting/financial software, e.g. Oracle NetSuite
- Staffing/HR software, e.g. Rotacloud
Even telephony (VoIP) is now being delivered as a service.
Is there anything wrong with these services? No, of course not. They have millions of users for a good reason. But they do lead to the creation of data silos, and that limits how useful your data is to your business.
The value of such data multiplies when it breaks out of silos and connects with your other data sets and the rest of your systems. And modern data warehouses are the ideal tool to enable such interconnectedness within your business. (Some integration intelligence is needed to make this work optimally – that’s what we specialise in.)
Data warehouses also have a related advantage over data lakes: they bring the right data sets together, ready for analysis.
A data lake is a storage service designed for huge scale, where files of any type can be organised hierarchically, and where each file can be saved with metadata, making identification simpler. While a data lake is good for storing data, it isn’t built for analysis of that information.
On the other hand, a data warehouse is more like a curated version of a data lake. It can take input from a data lake, bringing together different data sets for analysis.
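To make "bringing together different data sets" concrete, here is a minimal sketch that joins records from two separate systems. It assumes a CRM export and an accounting export that share a customer ID, and it uses an in-memory SQLite database purely as a stand-in for a real data warehouse. The table names, column names and sample rows are all hypothetical.

```python
import sqlite3


def revenue_by_customer(crm_rows, invoice_rows):
    """Join CRM contacts with accounting invoices and total revenue per customer."""
    # In-memory SQLite is only a stand-in for a real warehouse (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE invoices ("
        "invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO crm_customers VALUES (?, ?)", crm_rows)
    conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", invoice_rows)

    # LEFT JOIN keeps customers that have no invoices yet (total of 0).
    rows = conn.execute(
        """
        SELECT c.name, COALESCE(SUM(i.amount), 0) AS total
        FROM crm_customers AS c
        LEFT JOIN invoices AS i ON i.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY total DESC, c.name
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data from two otherwise separate systems.
crm = [(1, "Acme Ltd"), (2, "Globex"), (3, "Initech")]
invoices = [(101, 1, 250.0), (102, 1, 100.0), (103, 2, 75.5)]

result = revenue_by_customer(crm, invoices)
for name, total in result:
    print(f"{name}: {total:.2f}")
```

Neither system could answer "which customers bring in the most revenue?" on its own. Once both data sets sit side by side, the question becomes a single query.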
2. Data ownership – where your data isn’t really yours
While it might be convenient to use remote data stores for storage in your apps, this means that – at least for SaaS solutions – you’re reliant on third-party systems to protect one of the most valuable assets your business has.
As shareholders and the public at large become more informed about the ownership, privacy and safety of data, there’s a growing expectation for businesses to be above reproach when it comes to how their data is handled. There’s also the risk of losing data whenever you change apps, as you inevitably will.
A modern data warehouse allows you to remove these risks from your operation and take full ownership of your data.
3. Data processing – where your data is expensive to process
Even where it’s cheap to store large amounts of data, it can be expensive to process that information, because the computational power needed to manipulate a big database is significant – and that comes at a price.
Modern data warehouses such as Azure Synapse Analytics and Snowflake separate storage (cheap) from compute power (expensive), so that you can store large datasets at low cost and then pay only for the processing required as needed.
By running that processing only on the relevant subsets of data, modern data warehouses significantly reduce its cost.
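A rough calculation shows why separating storage from compute matters. The sketch below compares a month of always-on compute with a month of on-demand compute over the same stored data. Every price and usage figure in it is a hypothetical placeholder chosen for easy arithmetic, not a real Azure Synapse Analytics or Snowflake rate.

```python
def monthly_cost(storage_tb, compute_hours, price_per_tb, price_per_compute_hour):
    """Total monthly cost when storage and compute are billed separately."""
    storage_cost = storage_tb * price_per_tb
    compute_cost = compute_hours * price_per_compute_hour
    return storage_cost + compute_cost


# Hypothetical prices and usage, for illustration only.
PRICE_PER_TB = 20.0            # per TB stored per month
PRICE_PER_COMPUTE_HOUR = 4.0   # per hour of compute
HOURS_IN_MONTH = 730           # approximate hours in a month
STORAGE_TB = 10

# Compute left running all month versus switched on only when queries run.
always_on = monthly_cost(STORAGE_TB, HOURS_IN_MONTH, PRICE_PER_TB, PRICE_PER_COMPUTE_HOUR)
on_demand = monthly_cost(STORAGE_TB, 40, PRICE_PER_TB, PRICE_PER_COMPUTE_HOUR)

print(f"Always-on compute: {always_on:.2f}")
print(f"On-demand compute: {on_demand:.2f}")
```

Under these assumed figures the storage bill is identical in both cases, so the whole difference comes from how many hours of compute are used.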
How it works in practice
Here’s our step-by-step approach to a data warehouse project, designed to deliver insight to your business:
- We start with a simple data set to prove the concept, using a set of data that’s currently hard to get to.
- We create pipelines to extract the data from the source system. (We might load this into a data lake as a way of staging it.)
- We pull data into the data warehouse, checking and cleaning it as we do.
- We build analytics and dashboards so you get a single, accessible place to view the data.
- We work our way around your software systems, adding new data sets into the data warehouse until you have a complete picture of your business.
- We build out the analytics to provide rich visualisations of your data, making it easier for you to understand.
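The extract, clean, load and summarise steps above can be sketched in miniature as follows. This is an illustrative outline under stated assumptions, not our production tooling: the CSV text stands in for an export from a source system, an in-memory SQLite database stands in for the data warehouse, and every function, table and column name is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system, including some bad rows.
RAW_EXPORT = """order_id,customer,amount
1,Acme Ltd,120.50
2,  globex ,80
3,Acme Ltd,not-a-number
4,,15.00
"""


def extract(raw_text):
    """Extract: read rows from the source export (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Check and clean: drop rows with a missing customer or invalid amount."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # amount is not numeric
        if not customer:
            continue  # customer name is missing
        cleaned.append((int(row["order_id"]), customer.title(), amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def summarise(conn):
    """Analytics: the kind of aggregate a dashboard might display."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(warehouse, clean(extract(RAW_EXPORT)))
summary = summarise(warehouse)
print(summary)
```

In a real project each stage would usually be a separate, scheduled pipeline step. The staging area might be a data lake rather than a string in memory.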
What this all means for your business
Our work on data warehouse projects turns a large amount of data that you’d struggle to analyse into useful information that you can consume via easy-to-understand dashboards. Having the right data in context means you can react more quickly than others in your field.
And at this point, you’re really set up to take advantage of data mining and AI. These technological leaps can give you serious competitive advantages, all of which stem from connecting up and analysing your data the right way to begin with.