Where Should You Deduplicate: HubSpot or Salesforce?

Duplicate records are one of the most common sources of broken syncs, missed follow-ups, and unreliable pipeline data in a HubSpot–Salesforce integration. The question RevOps teams consistently ask is: which platform should own deduplication? The answer depends on where the record sits in the buyer journey, specifically whether it has already been synced to Salesforce, and getting the sequencing wrong creates far more damage than the duplicates themselves.

Why Does Deduplication Become Complicated Once HubSpot and Salesforce Are Integrated?

HubSpot’s native merging capability is restricted for records that are already synced to Salesforce. Once a contact or lead has a Salesforce ID attached to it, HubSpot treats that record as “owned” by Salesforce. Attempting to merge directly in HubSpot — without first resolving the duplicate in Salesforce — risks creating what practitioners call a “ghost record”: a record that still exists in Salesforce but has lost its HubSpot counterpart, or vice versa. The integration then detects the mismatch and, depending on sync settings, either recreates the deleted record or throws a sync error. Neither outcome is acceptable in a live revenue operation.

The underlying issue is that HubSpot and Salesforce have different data models. Salesforce structures records as Leads, Contacts, and Accounts with explicit hierarchy and ownership rules. HubSpot treats every person as a Contact associated with a Company. When a merge happens on one side without a corresponding action on the other, the two systems fall out of alignment, and realigning them manually is expensive and error-prone.

Should You Use Salesforce or HubSpot as the Master System for Deduplication?

Salesforce is the correct master system for deduplication once a record has been synced. The reason is architectural: when you merge two records in Salesforce, the integration recognises the surviving “master” record and maintains the Salesforce ID link. The losing record is archived or deleted, and HubSpot detects this change through the sync, triggering a corresponding merge or deletion on its side — depending on how your sync lifecycle settings are configured.

This sequence preserves data integrity in both directions. Salesforce’s deduplication logic is also better suited to complex Account and Contact hierarchies, particularly in enterprise or multi-entity accounts where the same legal entity may appear under slightly different names across different data sources.

The practical workflow is:

  1. Identify the duplicate in Salesforce using duplicate rules or a third-party deduplication tool
  2. Confirm which record is the master (typically the one with the most complete data and the longest activity history)
  3. Merge in Salesforce, assigning the master record
  4. Allow the integration sync to propagate the change to HubSpot
  5. Verify in HubSpot that the losing record has been removed or merged correctly
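
Step 2 of the workflow above can be sketched in a few lines. This is a minimal illustration, not Salesforce's own merge logic: the record layout, the field names, and the functions `completeness`, `choose_master`, and `merge_into_master` are all hypothetical. It assumes "most complete" means the most non-empty fields and that ties go to the record with the earliest recorded activity.

```python
from datetime import date


def completeness(record: dict) -> int:
    """Count the fields that hold a non-empty value."""
    return sum(1 for value in record["fields"].values() if value not in (None, ""))


def choose_master(duplicates: list[dict]) -> dict:
    """Prefer the most complete record; break ties by the earliest activity date."""
    return min(duplicates, key=lambda r: (-completeness(r), r["first_activity"]))


def merge_into_master(duplicates: list[dict]) -> dict:
    """Keep the master's ID and fill its empty fields from the losing records."""
    master = choose_master(duplicates)
    merged = dict(master["fields"])
    for record in duplicates:
        if record is master:
            continue
        for name, value in record["fields"].items():
            if merged.get(name) in (None, "") and value not in (None, ""):
                merged[name] = value
    return {
        "id": master["id"],
        "first_activity": min(r["first_activity"] for r in duplicates),
        "fields": merged,
    }


# Hypothetical duplicate pair: equal completeness, so the older record wins.
older = {"id": "003A", "first_activity": date(2021, 3, 1),
         "fields": {"email": "ana@example.com", "phone": "", "title": "CFO"}}
newer = {"id": "003B", "first_activity": date(2023, 6, 12),
         "fields": {"email": "ana@example.com", "phone": "+49 30 1234", "title": ""}}

survivor = merge_into_master([older, newer])
print(survivor["id"], survivor["fields"])
```

The surviving record keeps the master's ID, which is the property the integration relies on to maintain the link to HubSpot.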

Do not initiate a merge in HubSpot for synced records. Do not delete a record in HubSpot without first merging or archiving in Salesforce. If you delete in HubSpot first, the integration will typically recreate the deleted record on the next sync cycle, because Salesforce still holds a reference to it.

What About Contacts That Have Not Yet Been Synced to Salesforce?

Pre-MQL data — contacts captured through marketing forms, website activity, or email campaigns that have not yet met the threshold to be pushed to Salesforce — is often not relevant for SDRs and BDRs working in Salesforce. However, this pre-sync data must still be kept clean in HubSpot. If duplicates accumulate in HubSpot before a contact is ever synced, those duplicates may both qualify and sync simultaneously, immediately creating duplicates in Salesforce.

Deduplication before sync is therefore a separate, mandatory step. HubSpot’s native deduplication tool handles email-based matching for contacts at this pre-sync stage. For higher volumes or more complex matching logic — such as contacts with different email addresses but the same phone number or company — a third-party tool is required. The principle is: clean records in HubSpot before they ever reach the sync threshold, and clean synced records in Salesforce after that point.
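
The difference between email-based matching and the broader matching described above can be illustrated with a short sketch. It does not reproduce HubSpot's or any vendor's matching logic: the contact layout and the helper names are hypothetical, and it assumes exact matches on a normalised email or on the digits of a phone number.

```python
import re
from collections import defaultdict


def normalise_email(email: str) -> str:
    """Lower-case and trim so that case and stray spaces do not hide a match."""
    return email.strip().lower()


def normalise_phone(phone: str) -> str:
    """Keep digits only, so formatting differences are ignored."""
    return re.sub(r"\D", "", phone)


def duplicate_groups(contacts: list[dict], key_func) -> list[list[int]]:
    """Group contact IDs that share the same non-empty key."""
    groups = defaultdict(list)
    for contact in contacts:
        key = key_func(contact)
        if key:  # skip contacts with no value for this key
            groups[key].append(contact["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical pre-sync contacts.
contacts = [
    {"id": 1, "email": "Ana@Example.com", "phone": "+49 30 1234"},
    {"id": 2, "email": "ana@example.com ", "phone": ""},
    {"id": 3, "email": "a.silva@example.org", "phone": "+49 (30) 1234"},
]

by_email = duplicate_groups(contacts, lambda c: normalise_email(c["email"]))
by_phone = duplicate_groups(contacts, lambda c: normalise_phone(c["phone"]))
print(by_email)  # contacts 1 and 2 share an email
print(by_phone)  # contacts 1 and 3 share a phone number
```

Contact 3 would be missed by email matching alone, which is the gap a second matching key is meant to close.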

How Should You Configure Sync Lifecycle Settings to Prevent Duplicate Recreation?

Before beginning any deduplication work, check which of two behaviours your HubSpot–Salesforce sync is configured for: “Always create a Salesforce Lead or Contact,” or “Match by Email.” The “Match by Email” setting is the safer choice for most organisations because it prevents the integration from creating a new Salesforce record if a contact with the same email address already exists. Without this setting, a merge or deletion on the HubSpot side can lead the integration to treat the affected contact as new and push a fresh, duplicate entry back into Salesforce on the next sync cycle.

This configuration detail is frequently overlooked during initial integration setup and is a leading cause of deduplication efforts failing to hold. Across CRM implementations at Cremanski & Company, misconfigured sync lifecycle settings account for a significant proportion of recurring duplicate problems in active HubSpot–Salesforce environments.
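
To make the difference between the two behaviours concrete, here is a toy model of the decision the sync makes for each contact. It is a sketch under stated assumptions, not the integration's actual implementation: `sync_contact`, the `match_by_email` flag, and the `SF-001` style IDs are invented for illustration.

```python
def sync_contact(hubspot_contact: dict, salesforce_records: list[dict],
                 match_by_email: bool = True) -> tuple[str, str]:
    """Link to an existing Salesforce record by email, or create a new one."""
    email = hubspot_contact["email"].strip().lower()
    if match_by_email:
        for record in salesforce_records:
            if record["email"].strip().lower() == email:
                return "linked", record["id"]
    # No match (or matching disabled): a new record is created.
    new_id = f"SF-{len(salesforce_records) + 1:03d}"
    salesforce_records.append({"id": new_id, "email": email})
    return "created", new_id


salesforce = [{"id": "SF-001", "email": "ana@example.com"}]
contact = {"email": "Ana@example.com"}

print(sync_contact(contact, salesforce, match_by_email=True))
print(sync_contact(contact, salesforce, match_by_email=False))
print(len(salesforce))
```

With matching enabled the contact attaches to the existing record. With matching disabled the same contact produces a second Salesforce record, which is the duplicate recreation described above.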

What Is the Right Governance Rule for Deduplication?

The principle that guides reliable deduplication in a dual-platform environment is: clean the source, and the stream will follow. In practice, this means:

  • Pre-sync: HubSpot is the source. Deduplicate contacts before they reach the sync threshold.
  • Post-sync: Salesforce is the source. All merges and deletions must be initiated there.
  • Never reverse this sequence for synced records.

This rule eliminates the majority of ghost records, broken syncs, and recreated duplicates that RevOps teams encounter. It also makes the deduplication process auditable: every merge has a clear origin point and a predictable downstream effect.
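
The governance rule is simple enough to encode as a guard in any internal tooling that triggers merges. The following is a minimal sketch: the `salesforce_id` field name and both function names are hypothetical, and it assumes that a group containing at least one synced record counts as post-sync.

```python
def dedup_platform(records: list[dict]) -> str:
    """Return the platform where a merge of these records should start."""
    # "salesforce_id" is a hypothetical name for the stored Salesforce ID.
    # Assumption: one synced record is enough to make the whole group post-sync.
    if any(record.get("salesforce_id") for record in records):
        return "salesforce"
    return "hubspot"


def check_merge_request(requested_platform: str, records: list[dict]) -> None:
    """Raise if a merge is requested on the wrong side of the integration."""
    expected = dedup_platform(records)
    if requested_platform != expected:
        raise ValueError(f"Merge must start in {expected}, not {requested_platform}")


pre_sync = [{"email": "ana@example.com"}, {"email": "ana@example.com"}]
post_sync = [{"email": "ben@example.com", "salesforce_id": "003XYZ"},
             {"email": "ben@example.com"}]

print(dedup_platform(pre_sync))
print(dedup_platform(post_sync))
check_merge_request("hubspot", pre_sync)  # allowed, no exception
```

A guard like this also supports the audit trail mentioned above, because every rejected request records which side the merge should have started on.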

Frequently Asked Questions

What happens if I merge two contacts in HubSpot that are already synced to Salesforce?

HubSpot will attempt to notify Salesforce of the merge, but the behaviour depends on your sync configuration. In many setups, HubSpot cannot complete the merge for synced records and will return an error. In others, the merge may succeed in HubSpot but leave an orphaned record in Salesforce. The safest practice is always to merge in Salesforce first for any record with a Salesforce ID.

Can I use HubSpot’s native deduplication tool for contacts that are already synced?

HubSpot’s native deduplication tool is designed primarily for contacts that have not yet been synced. For synced contacts, it typically restricts the merge action and directs you to Salesforce. Use it confidently for pre-MQL, pre-sync contacts. For post-sync records, use Salesforce’s duplicate management tools or a third-party AppExchange tool.

What is a “ghost record” in a HubSpot–Salesforce integration?

A ghost record is a record that exists in one platform but has lost its corresponding record in the other due to an unsynchronised deletion or merge. Ghost records cause sync errors, missing activity data, and broken automation triggers. They are typically created when records are deleted or merged on one side of the integration without a corresponding action on the other side.
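
One way to look for ghost records is to compare the Salesforce IDs stored on HubSpot contacts with the IDs that actually exist in Salesforce. The sketch below assumes both lists have already been exported. The `salesforce_id` field name and `find_ghost_records` are hypothetical, and a Salesforce record with no HubSpot counterpart may simply be one that was never meant to sync, so the output is a list of candidates to review, not a list of confirmed ghosts.

```python
def find_ghost_records(hubspot_contacts: list[dict], salesforce_ids: set[str]) -> dict:
    """Compare linked IDs on both sides and report the mismatches."""
    linked = {c["salesforce_id"] for c in hubspot_contacts if c.get("salesforce_id")}
    return {
        # HubSpot points at a Salesforce record that no longer exists.
        "missing_in_salesforce": sorted(linked - salesforce_ids),
        # Salesforce holds a record that no HubSpot contact is linked to.
        "missing_in_hubspot": sorted(salesforce_ids - linked),
    }


hubspot_contacts = [
    {"email": "ana@example.com", "salesforce_id": "003A"},
    {"email": "ben@example.com", "salesforce_id": "003B"},
    {"email": "cara@example.com"},  # pre-sync contact, ignored
]
salesforce_ids = {"003A", "003C"}

print(find_ghost_records(hubspot_contacts, salesforce_ids))
```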

Which Salesforce tool should I use for high-volume deduplication?

Salesforce’s standard merge function is suitable for low-volume, manual cleanup. For datasets in the thousands, a third-party AppExchange deduplication tool is necessary. These tools offer fuzzy matching logic (matching on partial names, similar email domains, or phonetic variations), mass-processing capability, and protection rules that allow you to preserve specific records from automated merges.
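
As a rough illustration of fuzzy matching with protection rules, the sketch below scores account-name similarity using Python's standard-library `difflib`. It is not how any particular AppExchange tool works: the 0.85 threshold, the `protected_ids` parameter, and the record layout are assumptions made for the example, and commercial tools typically add phonetic and domain-based matching on top.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def candidate_pairs(accounts: list[dict], threshold: float = 0.85,
                    protected_ids: frozenset = frozenset()) -> list[tuple]:
    """Return likely duplicate pairs, skipping any pair with a protected record."""
    pairs = []
    for left, right in combinations(accounts, 2):
        if left["id"] in protected_ids or right["id"] in protected_ids:
            continue  # protection rule: never propose a merge for these records
        score = similarity(left["name"], right["name"])
        if score >= threshold:
            pairs.append((left["id"], right["id"], round(score, 2)))
    return pairs


accounts = [
    {"id": "A1", "name": "Acme GmbH"},
    {"id": "A2", "name": "ACME GmbH."},
    {"id": "A3", "name": "Globex Corp"},
]

print(candidate_pairs(accounts))
print(candidate_pairs(accounts, protected_ids=frozenset({"A2"})))
```

Comparing every pair scales quadratically, so for datasets in the thousands a real tool would first narrow the candidates, for example by email domain or postcode, before scoring them.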

How do I prevent duplicates from re-entering after a deduplication exercise?

Prevention requires three controls working in parallel: (1) Salesforce duplicate rules set to “Alert” or “Block” on Contact and Lead creation, (2) HubSpot sync configured to “Match by Email” rather than “Always Create,” and (3) a documented data entry standard for any manual record creation. Without all three, duplicates will re-accumulate within weeks of a cleanup exercise.
