Implementing a multi-cloud CDP in Google Cloud Platform by Kevin Daly and Juan Manuel Pozo

Our purpose is to analyse the basic objectives and value propositions of any Customer Data Platform (CDP), encouraging discussion with participants and sharing our own experience. To that end, we will present a production use case of a multi-cloud Customer Data Platform. We will begin with a discussion of the reasons for separating a CDP into two domains: the domain of personally identifiable data and the domain of anonymised data. We will then delve into the specific production use case, examining the value propositions for the end customer, for the business and from an operational point of view. Through these points, we are convinced that the audience will clearly see the business and technical drivers for designing and building the CDP neither entirely in Salesforce nor entirely in GCP.
To illustrate the presentation with a real case, we will take the discussion in a more technical direction and share the experience of Making Science in building a custom CDP with a cloud-first design and development approach. Among the points we will highlight are:
• A review of the GCP and Salesforce services that the solution used.
  o A review of the GCP and Python-based technology stack and development design to continuously ingest signals and events from over 38 data sources.
• The management of the bi-directional exchange of signals with the client’s website.
• The selection of serverless GCP technologies for ingesting signals from the customer’s website while protecting the system from malicious external actors.
• The design approach to protect the solution from duplicate signal transmissions from streaming sources.
• The no-harassment approach to continuous batch event processing.
• The design approach to protect the solution from duplicate batch transmissions.
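As a rough illustration of what protecting against duplicate signal transmissions can involve, the following is a minimal sketch and not the implementation used in the project. It assumes signals arrive as JSON-like dictionaries and that a duplicate is a signal with identical content received within a time window; the class name `SignalDeduplicator`, the in-memory store and the one-hour default window are all hypothetical choices made for the example.

```python
import hashlib
import json
import time


class SignalDeduplicator:
    """Drops signals whose content fingerprint was already seen within a time window.

    Hypothetical sketch: a production system would keep fingerprints in a
    shared store rather than in process memory.
    """

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._seen = {}  # fingerprint -> time the signal was first accepted

    @staticmethod
    def fingerprint(signal):
        # Canonical JSON so that key order does not change the hash.
        payload = json.dumps(signal, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def accept(self, signal):
        """Return True if the signal is new, False if it is a duplicate."""
        now = self.clock()
        # Evict fingerprints older than the window so memory stays bounded.
        expired = [k for k, t in self._seen.items() if now - t > self.ttl_seconds]
        for k in expired:
            del self._seen[k]
        key = self.fingerprint(signal)
        if key in self._seen:
            return False
        self._seen[key] = now
        return True
```

The same idea applies to batch sources by fingerprinting a whole file or batch manifest instead of a single signal.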
We will walk through our design considerations with respect to signal/event publishing to CDP processes and external machine learning enrichment systems.
• A persistent keyless data store at the core of the CDP, giving the most up-to-date view of the client, either anonymised or de-anonymised depending on the domain.
• The bi-directional anonymisation/de-anonymisation gateway between Salesforce and GCP.
The gateway had to support sending anonymised data to the marketing analytics domain within GCP and support receiving custom engagement requests from the marketing analytics domain to the customer analytics domain. We will examine in detail the GCP technologies used to determine which attributes of a given data flow/feed needed to be anonymised.
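To make the gateway idea concrete, here is a minimal sketch under stated assumptions, not the gateway built in the project. It swaps in keyed tokenisation (HMAC-SHA256), which strictly speaking is pseudonymisation rather than full anonymisation, and keeps a token-to-value map on the personally identifiable side so that engagement requests can be resolved back. The field list `PII_FIELDS`, the class name and the in-memory vault are hypothetical.

```python
import hashlib
import hmac

# Hypothetical list of attributes treated as personally identifiable.
PII_FIELDS = {"email", "phone"}


class PseudonymisationGateway:
    """Replaces PII attributes with keyed tokens and can reverse the mapping."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        # token -> original value; assumed to live only in the PII domain.
        self._vault = {}

    def _token(self, field, value):
        message = f"{field}:{value}".encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def anonymise(self, record):
        """Return a copy of the record that is safe to send to the analytics domain."""
        out = {}
        for field, value in record.items():
            if field in PII_FIELDS:
                token = self._token(field, str(value))
                self._vault[token] = value
                out[field] = token
            else:
                out[field] = value
        return out

    def deanonymise(self, record):
        """Resolve tokens in a record coming back from the analytics domain."""
        out = {}
        for field, value in record.items():
            if field in PII_FIELDS:
                # Unknown tokens are passed through unchanged.
                out[field] = self._vault.get(value, value)
            else:
                out[field] = value
        return out
```

Because the token is deterministic for a given key, the same customer maps to the same token across feeds, which lets the analytics domain join data without ever seeing the underlying value.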
• We will show how signal enrichment was supported from both the personally identifiable data domain and the anonymised CDP analytics domain.
• The additional design and development steps we took to ensure GDPR compliance by leveraging features within GCP, and our implementation to ensure traceability of consent on a per-customer basis.
• The design and development approach and technologies used to deliver ‘human readable’ analytics, even as the CDP’s customer-centric data warehouse continually changes.
• We will review the selection of our GCP serverless data warehouse and take a look at the design approaches applied to ensure efficient, consistent and governed access to data.
To conclude, we will highlight the key learnings from the multi-cloud, multi-domain Customer Data Platform implementation and share Making Science’s design approach to preparing a CDP for deployment in a cloud-agnostic manner.