After Hightouch runs identity resolution (IDR) on your graph, Hightouch outputs tables into your warehouse so they can be used for any business need. These are found in the hightouch_planner
schema (or the override name of the internal schema for the source you're using.)
Your marketing team can use them for paid campaigns, your sales team for deduplicating outreach efforts, or data scientists for fraud dection.
The output of IDR can also helpful in constructing models or parent models in Hightouch.
Hightouch creates three output tables:
- A RESOLVED table (
<output_prefix>_resolved
) that labels each row from your input tables with aHT_ID
representing a profile. This table tells you all the input rows associated with a profile by itsHT_ID
. TheSOURCE
is the slug of an input model. ThePRIMARY_KEY
is the primary key value from the input model. TheLATEST_TIMESTAMP
is the timestamp of the row from the input model.
HT_ID | SOURCE | PRIMARY_KEY | LATEST_TIMESTAMP |
---|---|---|---|
ht1 | earl-s-ecommerce-events | 1 | 2024-08-01 12:00:00.000 |
ht1 | earl-s-ecommerce-events | 2 | 2024-08-01 12:05:00.000 |
ht1 | earl-s-ecommerce-events | 3 | 2024-08-01 12:10:00.000 |
ht1 | earl-s-ecommerce-events | 4 | 2024-08-01 12:15:00.000 |
ht5 | earl-s-ecommerce-events | 5 | 2024-08-01 12:20:00.000 |
ht5 | earl-s-ecommerce-events | 6 | 2024-08-01 12:25:00.000 |
ht5 | earl-s-ecommerce-events | 7 | 2024-08-01 12:30:00.000 |
ht5 | earl-s-ecommerce-events | 8 | 2024-08-01 12:35:00.000 |
ht9 | earl-s-ecommerce-events | 9 | 2024-08-01 12:40:00.000 |
ht9 | earl-s-ecommerce-events | 10 | 2024-08-01 12:45:00.000 |
- A RESOLVED_IDENTIFIERS table (
<output_prefix>_resolved_identities
) includes all the unique identifiers associated with each profile. Each row represents a unique identifiers associated with a profile.COUNT
represents the number of times the identifier was seen in an input row, including updated rows if the same row gets seen multiple times due to updates.IDENTIFIER
is the type of identifier.VALUE
is the identifier value for the profile.
HT_ID | IDENTIFIER | VALUE | INITIAL_ROW_PK | INITIAL_ROW_SOURCE | FIRST_TIMESTAMP | LAST_TIMESTAMP | COUNT |
---|---|---|---|---|---|---|---|
ht1 | user_id | user_003 | 4 | earl-s-ecommerce-events | 2024-08-01 12:15:00.000 | 2024-08-01 12:15:00.000 | 1 |
ht1 | anonymous_id | anon_125 | 1 | earl-s-ecommerce-events | 2024-08-01 12:00:00.000 | 2024-08-01 12:15:00.000 | 4 |
ht5 | user_id | user_004 | 8 | earl-s-ecommerce-events | 2024-08-01 12:35:00.000 | 2024-08-01 12:35:00.000 | 1 |
ht5 | anonymous_id | anon_126 | 5 | earl-s-ecommerce-events | 2024-08-01 12:20:00.000 | 2024-08-01 12:35:00.000 | 4 |
ht9 | anonymous_id | anon_127 | 9 | earl-s-ecommerce-events | 2024-08-01 12:40:00.000 | 2024-08-01 12:45:00.000 | 2 |
-
A UNRESOLVED table (
<output_prefix>_unresolved
) which lists all the input rows we couldn't process because they primary key of the rows were not unique within the model it came from. -
GOLDEN_RECORD table (
<output_prefix>_golden_records
):
(Optional) Only appears if you enable Golden Record in your IDR project's settings. Golden Record flattens all possible values of each identifier into a single "best" or "primary" value. See our (Golden Record)[/identity-resolution/golden-record] doc for more info.- Each row corresponds to a single
HT_ID
(one identity). - You can define survivorship rules to pick the most recent, most frequent, or highest priority source value for each identifier.
- Each row corresponds to a single
State tables
In addition to the output tables, Hightouch generates and maintains state tables in the hightouch_planner
schema for each identity graph. These tables are typically prefixed with IDR_XXX...
where XXX represents a series of alphanumeric characters. These state tables are required for incremental runs and maintaining the consistency of your identity graphs. We may also store a couple of backup tables for each identity graph, prefixed by IDR_BACKUP_...
in the hightouch_planner
schema.
Example: Finding the most frequent or recent email
If you are using Customer Studio, you can leverage the output of IDR in your parent models.
In this example, we'll use a hypothetical users
table and output table from IDR, accounts_activity
.
There are 3 steps:
- Create the parent model
- Create the related model
- Configure the Traits
Let's take a look:
1. Parent model
The parent model SQL is as follows:
SELECT users.*, idr.HT_ID
FROM HIGHTOUCH_PLANNER.accounts_activity_resolved AS idr
JOIN USERS AS users ON idr.USER_ID = users.USER_ID;
The base table is the users
table and the generated Hightouch ID is joined in so it can be used later.
2. Related model
The next step is to create a 1:many related model with resolved table and join it with the parent model on the Hightouch ID.
select * from HIGHTOUCH_PLANNER.accounts_activity_resolved
3. The Traits
Each row in users
may have multiple emails in the IDR output table. Traits can be used to quickly aggregate which email to use for a campaign.
For example, you may want to create a trait that grabs the most frequent email detected in the IDR output table.
Alternatively, you can order the IDR table by timestamp and grab the most recent email detected in the IDR output table.