Step 4: Validating your migration

After implementing Hightouch Events alongside your existing system, it's crucial to validate that your new setup captures data correctly and consistently and can power your existing use cases.

There are two main steps in validation:

  1. Verify that event data is flowing from Hightouch Events with the correct properties and data types.
  2. Check that the data from Hightouch Events is comparable to the data from your existing provider in volume and in values, such as user IDs and event properties.

Verifying your setup

  1. Check event reception:
    • Confirm that events are being received by both your current system and Hightouch Events.
    • Use the Hightouch debugger to view incoming events in real time.
  2. Verify event structure:
    • Ensure all expected properties are present in the Hightouch events.
    • Check that data types are correct (for example, numbers aren't being sent as strings).
  3. Test all event types:
    • Manually trigger each type of event (page views, user identifications, custom events) in your application.
    • Verify that they appear correctly in both systems.
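
The structure checks above can also be automated against a sample of exported events. The following is a minimal sketch, assuming each event's properties are available as a Python dict. The `EXPECTED_TYPES` mapping, its property names, and the `find_structure_problems` helper are hypothetical and are not part of any Hightouch SDK.

```python
# Hypothetical tracking plan: property name -> allowed Python type(s).
# Replace these entries with the properties your own events carry.
NUMBER = (int, float)
EXPECTED_TYPES = {
    "order_id": str,
    "total": NUMBER,
    "item_count": int,
}


def _type_name(expected):
    """Render a type or tuple of types as readable text."""
    allowed = expected if isinstance(expected, tuple) else (expected,)
    return " or ".join(t.__name__ for t in allowed)


def find_structure_problems(properties, expected_types=EXPECTED_TYPES):
    """Return a list of problems found in one event's properties dict."""
    problems = []
    for name, expected in expected_types.items():
        if name not in properties:
            problems.append(f"missing property: {name}")
            continue
        value = properties[name]
        allowed = expected if isinstance(expected, tuple) else (expected,)
        # bool is a subclass of int in Python, so only accept it when
        # bool is explicitly allowed.
        type_ok = isinstance(value, allowed) and (
            not isinstance(value, bool) or bool in allowed
        )
        if not type_ok:
            problems.append(
                f"wrong type for {name}: expected {_type_name(expected)}, "
                f"got {type(value).__name__}"
            )
    return problems


# A number sent as a string is reported, and so is a missing property.
print(find_structure_problems({"order_id": "A1", "total": "19.99"}))
```

An empty list means the event carried every expected property with an allowed type.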

Checking data quality

Compare the data between your current platform and Hightouch Events to make sure that Hightouch Events is instrumented correctly and sending the data you expect. We recommend checking both event volume and values.

If you're keeping your data model unchanged during the migration, or making only minor changes, validating data values will be more straightforward. Significant changes to the data model during migration will make validation more complex and time-consuming.

Compare event volume

Let's assume that you're migrating from Segment to Hightouch Events, with data flowing from Segment into tables that follow Segment's schema and from Hightouch Events into tables that follow Hightouch's schema. We'll also assume that events sent during the migration have a migrationId assigned through an analytics wrapper function, described in Step 2.

You can use the following query, adjusted to your warehouse and setup, to compare the count of identify events over the last 7 days. Expect roughly the same count in both systems, with some variation. The reasons for these differences are covered later in this step of the guide.

WITH segment_identifies AS (
  SELECT DATE(timestamp) AS event_date, migrationId, COUNT(*) AS segment_count
  FROM SEGMENT.identifies
  WHERE timestamp >= DATEADD(day, -7, CURRENT_DATE())
  GROUP BY DATE(timestamp), migrationId
),
hightouch_identifies AS (
  SELECT DATE(timestamp) AS event_date, migrationId, COUNT(*) AS hightouch_count
  FROM HIGHTOUCH.identifies
  WHERE timestamp >= DATEADD(day, -7, CURRENT_DATE())
  GROUP BY DATE(timestamp), migrationId
)
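SELECT
  COALESCE(s.event_date, h.event_date) AS event_date,
  -- Take the ID from whichever side has the row, so Hightouch-only rows are labeled too
  COALESCE(s.migrationId, h.migrationId) AS migrationId,
  -- Treat a missing side as a count of 0 so the difference is never NULL
  COALESCE(s.segment_count, 0) AS segment_count,
  COALESCE(h.hightouch_count, 0) AS hightouch_count,
  COALESCE(s.segment_count, 0) - COALESCE(h.hightouch_count, 0) AS count_difference,
  CASE
    WHEN COALESCE(s.segment_count, 0) = COALESCE(h.hightouch_count, 0) THEN 'Match'
    ELSE 'Mismatch'
  END AS comparison_result
FROM segment_identifies s
FULL OUTER JOIN hightouch_identifies h ON s.event_date = h.event_date AND s.migrationId = h.migrationId
WHERE s.migrationId IS NOT NULL OR h.migrationId IS NOT NULL
ORDER BY event_date, migrationId;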
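
Because small discrepancies are normal, an exact Match/Mismatch flag can be noisy when applied to daily totals. One option is to compare counts against a relative tolerance instead. The sketch below assumes you have exported the two counts from the query. The function names and the 2% default threshold are illustrative assumptions, not Hightouch recommendations.

```python
def relative_difference(segment_count, hightouch_count):
    """Return |a - b| divided by the larger count (0.0 when both are zero)."""
    larger = max(segment_count, hightouch_count)
    if larger == 0:
        return 0.0
    return abs(segment_count - hightouch_count) / larger


def classify_volume(segment_count, hightouch_count, tolerance=0.02):
    """Label a pair of counts using an assumed relative tolerance."""
    diff = relative_difference(segment_count, hightouch_count)
    return "within tolerance" if diff <= tolerance else "investigate"


# Hypothetical daily totals, for illustration only.
for seg, ht in [(1000, 990), (1000, 900)]:
    print(seg, ht, classify_volume(seg, ht))
```

Pick a tolerance that reflects how much variation your own traffic typically shows between the two systems.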

Compare property values

We also need to validate that Hightouch Events is collecting the same values as your prior provider.

The query below selects a set of properties relevant to identify calls and compares them across the two platforms over a 7-day period.

You can modify this query to examine different properties or to use a different or narrower time window.

WITH segment_data AS (
  SELECT
    id AS segment_id,
    migrationId,
    anonymous_id,
    user_id,
    timestamp,
    email,
    name
  FROM SEGMENT.identifies
  WHERE timestamp >= DATEADD(day, -7, CURRENT_DATE())
),
hightouch_data AS (
  SELECT
    id AS hightouch_id,
    migrationId,
    anonymous_id,
    user_id,
    timestamp,
    email,
    name
  FROM HIGHTOUCH.identifies
  WHERE timestamp >= DATEADD(day, -7, CURRENT_DATE())
)
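SELECT
  COALESCE(s.migrationId, h.migrationId) AS migrationId,
  COALESCE(s.user_id, h.user_id) AS user_id,
  s.segment_id,
  h.hightouch_id,
  s.anonymous_id AS segment_anonymous_id,
  h.anonymous_id AS hightouch_anonymous_id,
  s.timestamp AS segment_timestamp,
  h.timestamp AS hightouch_timestamp,
  -- IS [NOT] DISTINCT FROM is NULL-safe: two NULLs match, NULL and a value do not.
  -- It is available in Snowflake and PostgreSQL; adjust for your warehouse if needed.
  CASE WHEN s.email IS NOT DISTINCT FROM h.email THEN 'Match' ELSE 'Mismatch' END AS email_comparison,
  CASE WHEN s.name IS NOT DISTINCT FROM h.name THEN 'Match' ELSE 'Mismatch' END AS name_comparison,
  s.email AS segment_email,
  h.email AS hightouch_email,
  s.name AS segment_name,
  h.name AS hightouch_name
FROM segment_data s
FULL OUTER JOIN hightouch_data h ON s.migrationId = h.migrationId
WHERE
  s.email IS DISTINCT FROM h.email
  OR s.name IS DISTINCT FROM h.name
  OR s.anonymous_id IS DISTINCT FROM h.anonymous_id
  -- Rows that have no counterpart in the other system
  OR s.migrationId IS NULL
  OR h.migrationId IS NULL
-- Fall back to the Hightouch timestamp for rows missing from Segment
ORDER BY COALESCE(s.timestamp, h.timestamp) DESC
LIMIT 100;  -- Limit to 100 rows for a manageable sample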
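
If you prefer to inspect a sample outside the warehouse, the same comparison can be sketched in Python. This example assumes rows from each system have been exported as dicts with a non-null, string `migrationId`. The `diff_records` helper and the `COMPARED_FIELDS` tuple are hypothetical names used only for illustration.

```python
COMPARED_FIELDS = ("email", "name", "anonymous_id")


def diff_records(segment_rows, hightouch_rows, fields=COMPARED_FIELDS):
    """Compare rows keyed by migrationId.

    Returns {migrationId: [differences]}; IDs whose rows agree are omitted.
    """
    seg = {row["migrationId"]: row for row in segment_rows}
    ht = {row["migrationId"]: row for row in hightouch_rows}
    report = {}
    for mid in sorted(seg.keys() | ht.keys()):
        if mid not in seg:
            report[mid] = ["missing in Segment"]
        elif mid not in ht:
            report[mid] = ["missing in Hightouch"]
        else:
            # .get() treats an absent field like None, so a field missing
            # on both sides counts as a match.
            changed = [f for f in fields if seg[mid].get(f) != ht[mid].get(f)]
            if changed:
                report[mid] = changed
    return report


# Hypothetical sample rows, for illustration only.
segment = [
    {"migrationId": "a", "email": "x@example.com", "name": "X", "anonymous_id": "1"},
    {"migrationId": "b", "email": "y@example.com", "name": "Y", "anonymous_id": "2"},
]
hightouch = [
    {"migrationId": "a", "email": "x@example.com", "name": "X", "anonymous_id": "1"},
    {"migrationId": "b", "email": "y@example.com", "name": "Why", "anonymous_id": "2"},
    {"migrationId": "c", "email": "z@example.com", "name": "Z", "anonymous_id": "3"},
]
print(diff_records(segment, hightouch))
```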

What differences to expect between your old provider and Hightouch Events

Event volume should be approximately the same, but some discrepancies between systems are normal. Deployment rollout, ad blockers, and network errors can all cause events to appear in one tool but not the other. Common sources of difference include:

  1. Timing differences: Events may be processed in slightly different orders or with small time variations.
  2. Dropped events: Network issues might cause events to be lost in one system but not the other.
  3. Duplicate events: Some events might be sent twice in edge cases (for example, page reloads).
  4. Blocking behavior: Other providers might not block events as strictly as Hightouch does when running type checks.
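
To gauge how much of a volume gap comes from duplicates, you can count repeated IDs within a single system. This is a minimal sketch under one assumption: that the wrapper from Step 2 assigns one migrationId per event call, so an ID appearing more than once in the same table suggests a duplicate delivery. The `find_duplicates` helper is a hypothetical name.

```python
from collections import Counter


def find_duplicates(migration_ids):
    """Return {migrationId: occurrences} for IDs seen more than once."""
    counts = Counter(migration_ids)
    return {mid: n for mid, n in counts.items() if n > 1}


# Hypothetical IDs exported from one system's identifies table.
print(find_duplicates(["a", "b", "a", "c", "a"]))
```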

Hightouch Events should collect the same values as your previous event collection provider, though there may be minor variations in automatically collected fields.

In the next section, we'll explore how to unify your historical data with new Hightouch data.


Last updated: Oct 16, 2024
