Snowflake Implementation

BI Infrastructure Project

Client is an online property investment portal and provides analytics for and on behalf of agents, vendors and buyers. The company needed to have fast and responsive analytics that could use multiple data sources, both structured and unstructured data, to enrich their reporting.

Client wants to become more data-driven and ensure that internal and external stakeholders can access clean, accurate reporting.

Client has about 15 years of unexploited data. Because there was no mechanism for maintaining a history of change in the data store it was not easy to analyse patterns over time. Chartio had been useful to client to provide a limited snapshot of MI reports but they were seeking to extend this capability. Client has a Cloud-first approach to infrastructure. A key element of the work that Client was doing with the data project is to de-risk the environment by reducing the single point of failure both in terms of infrastructure and individual resources. The Client also had serious performance issues that was impacting the amount of reporting and analysis they could perform.

Client needed a BI infrastructure that would be available 24/7 to the management team and to a wider community within the business.

The Project

KETL has provided consultancy, design and development resources to build, from the ground up, a data analytics reporting layer using Snowflake. The work has involved defining and creating a mix of delta (where possible) and full extracts from source systems, import routines in Snowflake, SQL code to generate full history of change and auditing, creation of views to remove complexity from the Chartio reports and allowing internally defined measures to be used across multiple platforms.

KETL also created a data dictionary and a data map (including sources and derivations). The work to design and create the reporting database was mainly done using views having instantiated the Operational Data Store (ODS). This allows changes to be made to the model at very short notice or to add new models with no risk to the underlying data.

The technologies

  • Snowflake, SQL.
  • The historic loading of data from MySQL and Postgres.

The Outcome

With the complex logic removed from their BI tool they can use different tools but always ensure the same results. The creation of the Data Dictionary has helped the business focus on their needs and has brought consistency across all departments.

By providing a more accurate view of daily performance business decisions can be made earlier. Existing BI tools became more efficient and so the time to insight was improved significantly. Now additional feeds can be added into the EDW and more advanced analytic and machine learning methods can be employed to cut down on fraud, to model pricing optimisation and improve customer segmentation.