Using Data Warehouse tables in Experiments
Contents
If you have event-like data such as purchases, subscriptions, and usage records in a Data Warehouse table, you can use it directly as a metric in your experiment.
This enables you to measure how PostHog Feature Flags impact outcomes tracked in your external systems. For example, you can measure how a checkout redesign affects completed orders stored in your Stripe or Shopify data.
Setting up a Data Warehouse metric
When adding a metric to your experiment, select the Data Warehouse tables category and pick your table.
Configure the following fields:
Field Description Timestamp field The column in your Data Warehouse table that contains the timestamp of each row Data Warehouse join key The column in your Data Warehouse table that identifies which user each row belongs to (e.g., user_id,email)Events join key The field on PostHog events to match against the Data Warehouse join key (usually distinct_idorproperties.$user_id)


The Data Warehouse join key and Events join key together determine how PostHog links your Data Warehouse rows to PostHog events.
For example, if your Data Warehouse table has a user_id column and PostHog events include properties.$user_id, PostHog matches rows where both values are equal. This links your external data (orders, subscriptions) to the users in your experiment.
How join keys work
The join key system bridges PostHog events with your external data:
- Events join key extracts an identifier from PostHog events (e.g.,
properties.$user_id→"user_123") - Data Warehouse join key queries your table using that identifier (e.g.,
WHERE user_id = "user_123") - PostHog attributes the Data Warehouse rows to the experiment variant you were exposed to
Common join key patterns
| Scenario | Events join key | Data Warehouse join key |
|---|---|---|
| User ID in properties | properties.$user_id | user_id |
| Email as identifier | properties.$email | email |
| Using distinct_id directly | distinct_id | posthog_distinct_id |
| Session-based | properties.$session_id | session_id |
Supported metric types
| Metric type | Supported | Notes |
|---|---|---|
| Mean | Yes | Count, sum, average, min, max aggregations |
| Ratio | Yes | Both numerator and denominator can be Data Warehouse tables |
| Retention | Yes | Both start and completion events can be Data Warehouse tables |
| Funnel | Yes | With limitations (see below) |
Using Data Warehouse tables in funnels
You can include Data Warehouse tables as steps in funnel metrics. For example, measuring a funnel from viewing a product page:
Requirements for Data Warehouse funnel steps
All Data Warehouse steps in a funnel must use the same events join key. This is a technical requirement of how PostHog queries mixed sources.
Valid funnel – All steps use properties.$user_id:
Invalid funnel – Different events join keys:
The Data Warehouse join keys can be different (e.g., one table has user_id, another has customer_id), but the events join key must match.
Limitations
Data Warehouse funnel steps have the following limitations:
Maximum 3 Data Warehouse steps per funnel – This prevents expensive queries. If you need more steps, create a materialized view combining multiple tables.
Maximum 2 distinct Data Warehouse tables – This reduces query complexity. Create a view joining multiple tables if needed.
Same events join key required – All Data Warehouse steps must extract your identifier from the same PostHog event property.
These limits ensure queries remain performant while enabling cross-system funnel analysis.
How PostHog processes Data Warehouse metrics
Understanding the technical implementation helps debug issues and optimize performance.
Query execution
PostHog uses different query patterns depending on the metric type:
Mean, ratio, and retention metrics:
- PostHog joins your Data Warehouse table to the experiment's exposure data
- Applies the conversion window to filter rows occurring after exposure
- Aggregates values per variant
Funnel metrics with Data Warehouse steps:
- PostHog uses a
UNION ALLpattern to combine PostHog events with Data Warehouse tables - Each source (events + each Data Warehouse table) produces a separate sub-query
- Results are combined and processed by the funnel algorithm
- This enables mixing PostHog events with external data in a single funnel
Temporal filtering
Like PostHog events, Data Warehouse rows are filtered by:
- Exposure timing – Only rows occurring after your first exposure are counted
- Conversion window – Only rows within the metric's conversion window are included
- Experiment date range – Only rows within the experiment's start and end dates are counted
This ensures Data Warehouse metrics follow the same exposure logic as PostHog events.
Troubleshooting
No data appearing in results
Problem: Your Data Warehouse metric shows zero users or no data.
Solution: Verify that:
- The events join key extracts a value that exists on PostHog events (check using Data Management to see event properties)
- The Data Warehouse join key column exists in your table and contains matching values
- Your Data Warehouse table has rows with timestamps within the experiment date range
- The timestamp field is a valid datetime/timestamp column (not a string)
Join keys not matching
Problem: Users are exposed to your experiment but their Data Warehouse rows aren't being attributed.
Solution: Check that:
- Values are formatted identically (e.g., both use lowercase emails or both have the same prefix)
- The events join key extracts the correct property (use Data Management to inspect event structure)
- Your Data Warehouse table actually contains the identifier (query your table directly to verify)
Funnel validation errors
Problem: You see an error when adding Data Warehouse steps to a funnel.
Common errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| "events_join_key is required" | Missing configuration | Add the events join key field to your Data Warehouse step |
| "Step X uses properties.$user_id but Step Y uses properties.$email" | Inconsistent join keys | Use the same events join key for all Data Warehouse steps |
| "Maximum 3 Data Warehouse steps exceeded" | Too many steps | Create a materialized view combining tables, or remove steps |
| "Maximum 2 distinct tables exceeded" | Too many tables | Create a view joining the tables |
Best practices
Use consistent identifiers – If possible, use the same identifier across all your systems (e.g., always use email or always use
user_id). This makes join keys simpler.Create materialized views for complex queries – If you need data from multiple tables or complex transformations, create a materialized view in your Data Warehouse first. This improves query performance and simplifies setup.
Test join keys before running Experiments – Use Data Management to verify your events have the properties you're joining on, and query your Data Warehouse to confirm matching values exist.
Monitor query performance – Data Warehouse queries can be slower than PostHog event queries. Start with smaller date ranges to test performance before running long Experiments.
Document your join keys – Keep a record of which join keys you use for different tables. This helps when creating new metrics or troubleshooting issues.
Example: Measuring revenue impact
Here's a complete example of measuring how a pricing page redesign affects subscription revenue:
Setup
Data Warehouse table: stripe_subscriptions
- Columns:
subscription_id,customer_id,created_at,plan_amount
PostHog events: Include properties.$user_id matching Stripe's customer_id
Metric configuration
- Create a new mean metric
- Select Data Warehouse tables →
stripe_subscriptions - Configure join keys:
- Timestamp field:
created_at - Data Warehouse join key:
customer_id - Events join key:
properties.$user_id
- Timestamp field:
- Set math to Sum and math property to
plan_amount
Result
PostHog measures total subscription revenue per variant, attributing each Stripe subscription to the variant you saw when you visited your pricing page.