Data clean rooms: A beginner’s guide

Here's how enterprises are using data clean rooms to improve the return on their customer data investments.

With the deprecation of third-party cookies looming large and data privacy and compliance taking center stage, enterprises have had to adapt their customer data strategies substantially.

Many are exploring various approaches, with data clean rooms (DCRs) emerging as a prominent option.

What are data clean rooms?

A data clean room is a collaborative environment where two or sometimes more participants (brands, publishers, advertisers, groups within a company, or other entities) come together to share and/or combine their respective first-party data.

This enables them to glean insights from each other’s first-party data under strict controls. Ultimately, every participant derives additional insights from the collective pool of data. 

Data clean rooms provide a neutral place for multiple participants to share and collaborate.

Each participant provides its own customer data to the data clean room. The DCR then typically uses matching algorithms to find records that refer to the same customer across participants’ data, and supplements those matched profiles with attributes that were not available to each participant originally.
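The matching step can be pictured with a minimal Python sketch. It assumes, purely for illustration, that both participants key their records by email address and that matching happens on a salted SHA-256 hash of the normalized address. Real products use their own identity-resolution methods, and every name below (`hash_email`, `pseudonymize`, `match_and_enrich`, the salt and the sample records) is hypothetical.

```python
import hashlib

SHARED_SALT = "demo-salt"  # hypothetical value agreed between participants


def hash_email(email: str, salt: str = SHARED_SALT) -> str:
    """Normalize an email address and return a salted SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def pseudonymize(records: list[dict]) -> dict[str, dict]:
    """Key each record by the hashed email and drop the raw email field."""
    return {
        hash_email(r["email"]): {k: v for k, v in r.items() if k != "email"}
        for r in records
    }


def match_and_enrich(a: dict[str, dict], b: dict[str, dict]) -> dict[str, dict]:
    """Keep only keys present on both sides and merge their attributes."""
    return {key: {**a[key], **b[key]} for key in a.keys() & b.keys()}


# Illustrative data only: one customer appears in both datasets.
brand = pseudonymize([
    {"email": "Ana@example.com", "age_group": "25-34"},
    {"email": "bo@example.com", "age_group": "35-44"},
])
partner = pseudonymize([
    {"email": " ana@example.com", "last_purchase": "2024-01-15"},
    {"email": "cy@example.com", "last_purchase": "2024-02-02"},
])

matched = match_and_enrich(brand, partner)
print(len(matched))  # number of customers found on both sides
```

Only the customer present in both datasets ends up in `matched`, and that profile now carries attributes from both sides. Hashing here is pseudonymization rather than full anonymization, which is one reason the contractual and governance controls still matter.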

In short, each participant now has access to more data than it had before and can perform a wider range of analytics and segmentation. As a result, its marketers can run more targeted, and therefore more efficient and effective, activations, for example in paid media.

All this enhancement and collaboration happens in a privacy-friendly, neutral environment, with a contract governing what each participant can and can’t do with additional data. The controls also govern how data ingestion happens, what kind of matching rules are applied and how the data gets activated. Security, governance, audit trails, encryption and anonymization all play an important role here. 

In general, the primary objective is dataset enhancement and collaboration. The enriched data, if any, is not intended to be exported back to the participants or their original source systems. The contract between the parties decides what each participant can do with the combined data within the confines of the data clean room.

Data privacy and compliance take precedence; all personally identifiable information (PII) is encrypted and masked. None of the parties have access to PII. 
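One common way to keep outputs from pointing at individuals is to return only aggregates and suppress groups that are too small. The article does not say how any particular clean room enforces this, so treat the following Python sketch as an assumption-laden illustration: the threshold value and the function name `aggregate_with_threshold` are hypothetical.

```python
from collections import Counter

MIN_GROUP_SIZE = 3  # hypothetical policy value set by the room's rules


def aggregate_with_threshold(rows, key, min_size=MIN_GROUP_SIZE):
    """Count rows per group and drop any group smaller than min_size."""
    counts = Counter(row[key] for row in rows)
    return {group: n for group, n in counts.items() if n >= min_size}


# Illustrative matched rows; no identifiers are included in the output.
rows = [
    {"age_group": "25-34"},
    {"age_group": "25-34"},
    {"age_group": "25-34"},
    {"age_group": "35-44"},
]

report = aggregate_with_threshold(rows, "age_group")
print(report)  # {'25-34': 3}
```

The single customer in the 35-44 group is left out of the report because a group of one could reveal that person's attributes.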

Data exchanges and data markets may also provide additional data enrichment capabilities in the context of a DCR. However, these are separate and distinct services, although some players here will also provide their own DCR capabilities.

Data clean rooms: An example

The figure below shows how a fast-moving consumer goods (FMCG) firm might collaborate with one of its large retailers.

A simplified example of how an FMCG and a retailer can combine their customer data.

This particular FMCG’s customer data consists mainly of demographic data (e.g., age group, location) and some user preferences (e.g., their favorite ice cream). This data could have come from many different venues, but for this case, let’s assume that the customers provided this information as part of their registration on the FMCG’s community site.

In this case, the firm doesn’t have any transaction information because they do not sell directly to consumers in this segment. Their large retail partners, however, do possess transaction data, including purchase date, the amount spent, items bought and so forth. The retail partner also has results from campaigns across different social media channels. 

When these two decide to collaborate in a secure data clean room, both partners can benefit from this combined data. Since they now have access to additional attributes, they can perform more sophisticated segmentation and gain new insights. This would not be possible for either of the partners without this collaboration. 

For example, consider a paid media use case where traditionally, there was a high dependence on third-party data. Collaborating in a DCR allows both partners to leverage matching for better targeting. And since the partners now have access to campaign results data, they can find out if the same person is being targeted across multiple channels and decide if they want to minimize that overlap.
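The FMCG and retailer scenario can be sketched in a few lines of Python. Everything here is invented for illustration: the pseudonymous IDs (`u1`, `u2`, ...), field names, channel names and amounts are assumptions, and the sketch skips the matching and privacy controls a real clean room would apply.

```python
from collections import defaultdict

# FMCG side: demographics and preferences, keyed by a hypothetical matched ID.
fmcg_profiles = {
    "u1": {"age_group": "25-34", "favorite_ice_cream": "vanilla"},
    "u2": {"age_group": "35-44", "favorite_ice_cream": "chocolate"},
    "u3": {"age_group": "25-34", "favorite_ice_cream": "chocolate"},
}

# Retailer side: transactions and campaign exposures.
retailer_transactions = [
    {"id": "u1", "amount": 12.50},
    {"id": "u1", "amount": 7.50},
    {"id": "u3", "amount": 20.00},
    {"id": "u4", "amount": 5.00},  # not in the FMCG's data, so unmatched
]
retailer_campaigns = [
    {"id": "u1", "channel": "social_a"},
    {"id": "u1", "channel": "social_b"},
    {"id": "u3", "channel": "social_a"},
]

# New insight 1: spend per stated preference, which neither side had alone.
spend_by_flavor = defaultdict(float)
for tx in retailer_transactions:
    profile = fmcg_profiles.get(tx["id"])
    if profile is not None:
        spend_by_flavor[profile["favorite_ice_cream"]] += tx["amount"]

# New insight 2: customers targeted on more than one channel.
channels_by_customer = defaultdict(set)
for exposure in retailer_campaigns:
    channels_by_customer[exposure["id"]].add(exposure["channel"])
overlap = sorted(cid for cid, ch in channels_by_customer.items() if len(ch) > 1)

print(dict(spend_by_flavor))
print(len(overlap), "customer(s) reached on multiple channels")
```

The first result combines the FMCG's preference data with the retailer's transactions. The second uses the retailer's campaign results to flag cross-channel overlap that the partners may want to reduce.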

Key challenges of data clean rooms

As with any technology, not everything is rosy in DCR land. Some key challenges are:

  • Establishing and agreeing on data sharing scope. Data clean rooms are intended to be neutral, but often, the rules are set by whoever owns the room.
  • Governance and monitoring in an age of compliance.
  • Finding the right partners amenable to the same DCR. 
  • A data clean room doesn’t solve all privacy and data sharing issues and you will almost always need to employ it in conjunction with other tools and technologies. 
  • Finally, there are technical challenges: integrating the DCR with the rest of your stack and getting data management and matching configurations right.

Types of data clean rooms

A data clean room can provide several different services, such as data storage, identity matching, security, encryption, enrichment, data ingestion and so on. Consequently, the marketplace for data clean rooms is diverse, with vendors providing varying combinations of these services.

To complicate things, different players will partner ad hoc to provide a more comprehensive offering. Finally, several of them provide a vertical or domain-specific offering, which could be handy.

We group these players into five categories:

  • Specialized data clean rooms
  • Data warehouses/data lakes
  • Walled gardens and media companies
  • Data onboarding vendors
  • Customer data platforms

Each category brings different capabilities. Nevertheless, whoever owns the data clean room has a major say in governance.

Specialized data clean rooms

You can find numerous specialized vendors for whom clean rooms are the primary focus area. As independent players, they may provide a wide range of capabilities, including data enrichment via their data partners (in addition to your partners) and activation capabilities.

However, most are relatively small firms with limited market presence. Therefore, your potential partners will be less likely to use the same vendor. Getting the right partners to collaborate on the same platform might take some negotiation.

Data warehouses (DWH) / data lakes

Leading DWH/data lake vendors (e.g., Snowflake, Google, AWS and Databricks) all sell an optional data clean room offering. In some cases, though, what they offer is a toolkit, and you or some other firm will actually need to build the data clean room using SQL, table joins, rules, stored procedures, etc. These providers typically extend their offerings via a third-party marketplace with supplementary partner tools.

This route may be useful when you and your partner already use the same platform, in which case you may not need to move data physically. But be prepared to have a lot more dependence on SQL and programming rather than visual interfaces.
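To give a feel for the SQL-heavy route, here is a rough sketch that uses Python's built-in `sqlite3` module as a stand-in for a shared warehouse. The table names, columns, sample rows and the minimum group size of 2 are all invented for illustration and do not reflect any vendor's clean room API.

```python
import sqlite3

# In-memory database standing in for a platform both partners already use.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE brand_customers (hashed_id TEXT PRIMARY KEY, age_group TEXT);
CREATE TABLE retailer_sales (hashed_id TEXT, amount REAL);
INSERT INTO brand_customers VALUES
    ('h1', '25-34'), ('h2', '25-34'), ('h3', '35-44');
INSERT INTO retailer_sales VALUES
    ('h1', 10.0), ('h2', 30.0), ('h3', 15.0), ('h9', 99.0);
""")

# Join on the hashed ID and return only aggregates; the HAVING clause acts
# as a simple rule that suppresses groups below a minimum size.
query = """
SELECT b.age_group,
       COUNT(DISTINCT b.hashed_id) AS customers,
       SUM(s.amount) AS total_spend
FROM brand_customers AS b
JOIN retailer_sales AS s ON s.hashed_id = b.hashed_id
GROUP BY b.age_group
HAVING COUNT(DISTINCT b.hashed_id) >= ?
ORDER BY b.age_group
"""
rows = conn.execute(query, (2,)).fetchall()
conn.close()

print(rows)  # [('25-34', 2, 40.0)]
```

In a toolkit-style offering, rules like the join key and the minimum group size are the kind of thing you would have to define and enforce yourself, which is where the extra dependence on SQL and programming comes from.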

Walled gardens and media companies

Walled gardens are the oldest form of data clean rooms and pre-date the term. Google, Meta and Amazon lead this pack. You ingest your customer data in these walled gardens and match it against massive amounts of advertising data (e.g., ad exposure data or who saw which ad, etc.) that Google et al. have accumulated from their ad networks. 

For Google and Amazon, this is an optional add-on to their DWH offerings. Although the clean room is still built on the DWH (e.g., BigQuery for Google), you are limited to that walled garden’s advertising data as the partner data.

Other than these walled gardens, some large media companies also provide a data clean room offering. As with the big players, these offerings are limited to the companies’ own media destinations.

Under the covers, though, you may find some familiar technology. Disney’s data clean room is a collaboration with specialized DCR vendors Habu and InfoSum, along with Snowflake. Similarly, NBCUniversal’s Audience Insights Hub works in partnership with Snowflake.

Data onboarding vendors

Several data onboarding vendors now provide a data clean room. These vendors typically offer useful additional capabilities, such as identity resolution and access to their data marketplace, where you can leverage data from their network and not just from your partner.

This alternative can be useful for matching data sets across partners and enriching first-party data with second- and third-party data. However, their activation capabilities might be limited.

Bonus category: Customer data platforms (CDPs)

Surprisingly, only a few CDPs, like Adobe and BlueConic, provide private DCR capabilities for their licensees. However, this also means that your partner must be using the same CDP, so the network effects remain limited. The key benefit is that your first-party data remains in your CDP without you having to move it elsewhere.

DCRs power targeted data activation strategies 

Data clean rooms are rapidly emerging as a key mechanism to improve the return on your customer data investments. You have several options, but note some key points as you pick and choose:

  • The options above overlap considerably. Vendors across these categories often partner with each other to offer their respective data clean rooms. So, as an example, you could use a data clean room operated by Snowflake itself, along with a vendor from its marketplace, or you could get another vendor’s offering that is built on Snowflake. The two may be similar but are offered by different vendors.
  • Unlike other integration platforms, you and your partner must make data available in the same data clean room. This may limit your collaboration choices.


As a result, it is not uncommon for an enterprise to use multiple data clean room offerings across a panoply of use cases and partner profiles. The savvy enterprise will keep its options open here.


Opinions expressed in this article are those of the guest author and not necessarily MarTech. Staff authors are listed here.


About the author

Apoorv Durga
Contributor
Apoorv Durga is Vice-President, Research & Advisory at analyst firm Real Story Group, where he covers CDPs, ecommerce, Web CMS, and technologies. He is a two-decade veteran in the marketing technology space.
