Activity Schema at coalesce 2021

Ahmed presented the activity schema at dbt's coalesce 2021 conference, and explained why it's a better alternative to a star schema.

Activity Schema at coalesce 2021

Ahmed was invited to give a talk at dbt's Coalesce this year. He spoke about activity schema, the data model that powers Narrator's data platform.

So what's an activity schema anyway?

The fundamental idea behind the activity schema is that a single table can model any data required for analysis.

One table is all you need

This means all data is in the same, consistent structure - a series of events (called activities) done by an entity (called a customer) over time.

Any source data in a data warehouse can be easily modeled into this format with some simple SQL. Because the format is consistent, it requires a lot less upfront work and though than a star schema approach.

How is an activity schema better than a star schema?

The special structure has a few very cool attributes

  1. no need for foreign keys or explicit joins - any model can be joined to any other using relationships in time
  2. automated analyses - a consistent structure allows software tools to automatically generate queries and analyses that can work on anyone's data

The diagram below shows how an activity schema is used: one table, queried using relationships in time, powers all the queries you need to build out your metrics, dashboards, etc.

An activity schema table can power all your queries

The overall claim is that adopting an activity schema in your data warehouse makes data modeling and analysis substantially faster compared to dimensional modeling with a star schema.

The entire topic is somewhat complex, but Ahmed's talk does a great job of diving into how the activity schema came about and how it can be used. He also shows that it can actually answer any possible data question.

Coalesce Video

Check out the full video for yourself!


Check us out on the Data Engineering Podcast

Find it on the podcast page or stream it below