# Real-Time ID Stitching Overview

## Terminology

Data identifiers are categorized by Processing Layer (how and when they are collected) and ID Type (whether they contain Personally Identifiable Information).

### Processing Layers

| Term | Definition | Characteristics and Use Cases |
|  --- | --- | --- |
| **Real-Time Layer** | Processes data as streaming events arrive in real time. | Captured instantaneously at the point of interaction (website, app). Essential for immediate personalization, session tracking, and real-time decision-making. |
| **Batch Layer** | Processes data through scheduled unification workflows. | Collected, processed, and ingested at scheduled intervals or in large groups. Often from offline sources (POS, CRM) or historical data. Critical for building persistent customer profiles, historical records, and backend analytics. |


### ID Types

| Term | Definition | Key Function and Examples |
|  --- | --- | --- |
| **Known IDs (PII)** | Personally Identifiable Information; can directly or indirectly identify a specific individual. | Crucial for **addressability** (proactive reach via email, SMS) and **Cross-Channel Matching** (deterministic unification across devices/platforms). Examples: Email address, phone number, customer ID, full name. |
| **Unknown IDs (Non-PII)** | Anonymous identifiers; do not contain PII. | Backbone of site analytics, behavioral segmentation, and programmatic advertising. Essential for tracking user behavior and devices while maintaining anonymity until a Known ID is provided. Examples: Cookie ID, device ID, anonymous session token. |


**Important Note:** The distinction between Known and Unknown IDs is purely about the *type* of identifier. Both Known IDs and Unknown IDs can flow through either the Real-Time Layer or the Batch Layer for processing.

## Capability Overview

Real-Time 2.0 performs ID stitching in real time. As multiple events flow through the real-time layer with matching IDs across different columns, the system immediately unifies those identifiers into a single profile. That real-time profile is then enriched with IDs and attributes copied from your batch Parent Segment, so the real-time layer is always operating on the same unified customer view as your batch activations.

This means you are not just doing real-time enrichment with pre-unified profiles. You are doing real-time unification of identities as events stream in.

## User Flow

*Figure: Real-Time ID Stitching User Flow*

## How It Works

### Real-Time Layer

The real-time layer receives streaming events containing both known and unknown IDs and stitches them together instantly when matching identifiers are detected.

**Example:**


```
Event 1: { email: "user@example.com", cookie_id: "abc123" }
Event 2: { cookie_id: "abc123", device_id: "xyz789" }
Event 3: { email: "user@example.com", session_id: "session456" }
```

The real-time layer recognizes all three events belong to the same customer and stitches them together immediately.
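
The grouping in this example can be reproduced with a small union-find (disjoint-set) structure. This is only an illustrative sketch, not a description of how Real-Time 2.0 is implemented: the `IdentityGraph` class and its methods are hypothetical names, and the graph lives in memory rather than in a streaming system.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy in-memory identity graph built on union-find (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def _find(self, node):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(node, node)
        # Walk up to the root, shortening the path as we go.
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def ingest(self, event):
        """Link every (column, value) pair in one event into the same profile."""
        nodes = [(col, val) for col, val in event.items() if val is not None]
        if not nodes:
            return
        root = self._find(nodes[0])
        for other in nodes[1:]:
            self.parent[self._find(other)] = root

    def profiles(self):
        """Return the stitched profiles as sets of (column, value) pairs."""
        groups = defaultdict(set)
        for node in list(self.parent):
            groups[self._find(node)].add(node)
        return list(groups.values())


graph = IdentityGraph()
graph.ingest({"email": "user@example.com", "cookie_id": "abc123"})
graph.ingest({"cookie_id": "abc123", "device_id": "xyz789"})
graph.ingest({"email": "user@example.com", "session_id": "session456"})
stitched = graph.profiles()
```

After the third event, `stitched` holds a single profile containing the email, cookie ID, device ID, and session ID, because each event shares at least one identifier with an earlier one.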

**Key Points:**

- ID stitching happens with sub-second latency as events arrive
- Works across any combination of known and unknown IDs
- No waiting for a scheduled batch workflow to complete
- Once stitched, the unified identity is available for all future events


### Batch Layer Enrichment

The batch layer processes historical event data through scheduled TD Workflows and serves as the source of truth for historical customer profiles. Batch profile attributes (loyalty tier, lifetime value, purchase history, etc.) are synced to a shared data store that the real-time layer reads on demand.

**Key Points:**

- Processes historical event data on a schedule (hourly, daily, or custom)
- Builds comprehensive customer profiles with aggregated attributes
- Performs ID unification across all historical data
- Creates the necessary data set for upload to the real-time layer


**ID Stitching via Batch Workflow:**
Once you have a unified batch identity graph (a consolidated, deduplicated set of customer profiles across all batch data, often called a "golden record" layer), you're ready to add the workflow that promotes those IDs into the real-time layer. Your Treasure Data CSM will supply this lightweight TD Workflow, which is designed to plug cleanly into your existing end-to-end orchestration.

### How They Work Together

The real-time layer reads batch profile attributes on demand from the shared data store that the batch layer syncs to. As a result, the real-time layer benefits from:

1. **Its own real-time stitching** — identities unified instantly from live events
2. **Batch profile enrichment** — historical attributes (tier, lifetime value, purchase history) from the batch layer
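
As a rough sketch of how the two sources combine, the snippet below treats the shared data store as a plain key-value lookup. Everything here is assumed for illustration: `BATCH_STORE`, `enrich_profile`, the `profile_key` field, and the sample attribute values are hypothetical, and the rule that real-time values win on conflict is a choice made for this example rather than documented platform behavior.

```python
# Hypothetical stand-in for the shared data store populated by the batch sync.
BATCH_STORE = {
    "profile-001": {"loyalty_tier": "gold", "lifetime_value": 1250.0},
}


def enrich_profile(rt_profile, store=BATCH_STORE):
    """Return a copy of the real-time profile merged with batch attributes.

    Profiles with no batch record are returned unchanged (as a copy).
    """
    batch_attrs = store.get(rt_profile.get("profile_key"), {})
    # Assumption: real-time values take precedence when both sides set a field.
    return {**batch_attrs, **rt_profile}


known = enrich_profile({"profile_key": "profile-001", "cookie_id": "abc123"})
unknown = enrich_profile({"profile_key": "profile-999", "cookie_id": "zzz000"})
```

Here `known` carries both the live `cookie_id` and the batch `loyalty_tier`, while `unknown` keeps only its real-time fields until a later batch sync supplies a matching record.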


## Setting Up RT ID Stitching

### Choosing the RT Profile Key

The batch unification ID — the canonical, persistent identifier assigned during batch ID unification — is the strongest candidate for the RT profile key. By design it is the logical primary key for batch profiles, meaning it satisfies all three requirements a profile key must meet:

- **Mandatory** — every batch profile has one
- **Unique** — it identifies exactly one profile
- **Persistent** — when set by persistent ID Unification with correctly managed input time values, the batch unification ID remains stable over time, making it suitable for use in Journeys
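
Before committing to a profile key, it can help to check the first two requirements against an export of your batch profiles. The helper below is a minimal sketch under assumptions: `check_profile_key` and the `unification_id` column name are hypothetical, and the profiles are represented as a list of dictionaries. Persistence cannot be verified from a single snapshot, so it is not covered here.

```python
from collections import Counter


def check_profile_key(profiles, key):
    """Report whether `key` is mandatory and unique across batch profiles."""
    values = [profile.get(key) for profile in profiles]
    # A key is missing when it is absent, None, or an empty string.
    missing = sum(1 for value in values if value in (None, ""))
    counts = Counter(value for value in values if value not in (None, ""))
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return {
        "mandatory": missing == 0,
        "unique": not duplicates,
        "missing": missing,
        "duplicates": duplicates,
    }


sample_profiles = [
    {"unification_id": "u-1", "email": "a@example.com"},
    {"unification_id": "u-2", "email": "b@example.com"},
    {"unification_id": "u-2", "email": "c@example.com"},
    {"email": "d@example.com"},
]
report = check_profile_key(sample_profiles, "unification_id")
```

For this sample, `report` flags one missing value and one duplicated value (`u-2`), so the column would not qualify as a profile key until both problems are resolved upstream.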


**Note:** The batch unification ID is used to look up and retrieve batch profile attributes in the real-time layer. It is **not** used as a stitching key for incoming real-time event data. Live event stitching is performed using the RT stitching keys you configure (such as email, cookie ID, or device ID).

## Use Cases

### When Real-Time 2.0 Platform Adds Value

**Scenario 1: Personalizing for Previously Anonymous Visitors**

An initially anonymous visitor arrives at the website. The real-time engine matches their unknown ID (`cookie_id`) with a corresponding ID (`device_id`) in the batch layer. This batch layer ID is then linked to a known customer ID (`email`). This allows for immediate personalization of the visitor's experience based on their established loyalty status.
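
The chain in this scenario (cookie ID to device ID to email) can be pictured as a walk over linked identifiers. The sketch below uses a breadth-first search over a hand-built adjacency map. It is an assumption-laden illustration: `resolve_known_id`, the `links` map, and the sample values are hypothetical, and the source does not state how the real-time engine performs this match internally.

```python
from collections import deque


def resolve_known_id(start, links, known_columns=("email",)):
    """Breadth-first search from one identifier to the nearest known ID.

    Identifiers are (column, value) pairs; returns None if no known ID is linked.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node[0] in known_columns:
            return node
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return None


# Hypothetical links: the cookie/device pair comes from stitching,
# the device/email pair from IDs copied out of the batch layer.
links = {
    ("cookie_id", "abc123"): [("device_id", "xyz789")],
    ("device_id", "xyz789"): [("cookie_id", "abc123"), ("email", "user@example.com")],
    ("email", "user@example.com"): [("device_id", "xyz789")],
}
match = resolve_known_id(("cookie_id", "abc123"), links)
no_match = resolve_known_id(("cookie_id", "never-seen"), links)
```

In this example `match` is the email identifier, which could then be used to retrieve the visitor's loyalty status, while `no_match` is `None` because the unseen cookie has no links yet.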

**Scenario 2: Cross-Device Recognition**

When a customer engages with a brand across multiple channels, first using the mobile app (identified by `device_id` and `email`) and later visiting the website (identified by `cookie_id` and `email`), the real-time system automatically unifies these three identifiers based on the shared email address. This stitching builds an omni-channel identity graph in real time as events arrive throughout the day, allowing consistent, personalized experiences across both the app and the web without relying on the batch processing layer.

**Scenario 3: Loyalty Tier Calculation**

A customer completes a purchase at the in-store Point-of-Sale (POS) system, which sends an event to the real-time engine. At the same time, the customer signs up for the loyalty program, and a welcome email is sent. On receiving the POS and sign-up data, the real-time engine calculates the customer's loyalty tier. Although the customer has only just enrolled, their purchase history is immediately recognized as meeting the criteria for the Platinum tier, so they are upgraded from Bronze right away. This ensures the customer receives the high-value service appropriate for a Platinum member.

### Benefits Over Batch-Only Unification

| Aspect | Batch-Only | Real-Time 2.0 |
|  --- | --- | --- |
| **Identity unification speed** | Hours (next batch run) | Seconds |
| **Anonymous to known transition** | Delayed until next batch | Immediate |
| **Cross-device stitching** | Delayed | Real-time |
| **Personalization on first visit** | Limited to data from previous batch runs | Available as soon as customer provides a matching ID |
| **Activation trigger** | After batch completes | As it happens |