The Proven Power of DataSapien Edge Architecture

DataSapien offers Personal Marketing Technology to brands. Its edge architecture is focused on providing better customer outcomes. It comprises a bundle of components that are embedded inside existing customer apps as a Software Development Kit (SDK).

These components include 1) On-Edge Personal Data Storage and 2) On-Edge Personal Intelligence technology. The in-app SDK is controlled and managed by 3) the DataSapien Orchestrator, which clients can deploy with their choice of cloud service provider (Azure, AWS, Oracle) or on their own premises.

Together, these components enable customers to participate with brands, privately, using their own data, to get their jobs done. The result is far better outcomes for customers and for the brands that they choose to engage with.

Benefits of DataSapien Edge Architecture:

Privacy is 100% Guaranteed with Zero-Shared Data
Personal Experiences: Brands can offer more tailored experiences based on unique user insights due to the volume and variety of verified data now available to generate insights, signals and solutions
Enhanced Data Security: Data remains on the user’s device, minimizing the risk of breaches.
Low Latency as processing of data and insight generation happen as close as possible to the customer
Offline access as processing can still happen locally if/when the device loses internet connection
Reduced cost of centralised compute, as processing shifts to the customer devices
Regulatory Compliance: Ensures adherence to data privacy laws such as GDPR and CCPA.
Scalable Solutions: Easily adaptable to various business sizes and industries.
Eco-Friendly: Reduced need for massive data centres lowers environmental impact.

This post provides a high-level overview of the technical components.

1. Personal Data Gathering and Storage Components

The secure, private and confidential storage of personal data is at the core of the DataSapien edge architecture. The DataSapien SDK contains several different storage elements that perform different storage functions. We call the personal life data that is gathered and stored, MeData.

Data Vault

The core repository of the user’s MeData. By default, all data held in the DataSapien DataVault is private Zero-Shared Data unless explicitly shared as consented Zero-Party Data.

The MeData stored here may come from a number of sources:

Customer Data: from the brand such as:
- purchase history
- purchase frequency
- customer service support interaction history
- loyalty history etc.

This is ideal seed data ‘given’ by the brand to serve as the initial population of the DataVault, creating value and solving the ‘cold start’ data problem.

Life Data: Data gathered through simple user journeys driven by the customer’s Job To Be Done (JTBD) and desired Outcomes. Once again, this MeData may be gathered from a variety of sources:
- Device Native data such as location data, accelerometer data, app data, device details, screen usage etc.
- Explicit Questions and Answers. These enable the customer to self-populate their Data Vault with data such as preferences, beliefs, attitudes, desires etc., and to generate psychometrics (see Personal Intelligence below).
- Data from third-party APIs: External data gathered into the app to unlock customer Jobs. There are thousands of potential data sources available, covering Social Media, Health, Bank, Car, Home data etc. These connections are managed by the DataSapien API Gateway and Orchestrator (more on this below).
Browser Data: The SDK has an optional browser extension module that the customer can deploy into their browser to enable the gathering of web browser data. In addition to gathering data, the browser extension can be used as a private, omnichannel communication path, connecting directly from the app to enable, for example, real-time personal marketing messages based on browser domain.
Verifiable Credentials: The Wallet (see below) holds third-party verifiable credentials as defined by the related standards. These may also be stored in the DataVault along with their metadata, which makes them available to the Personal Intelligence stack for multivariate data processing.
Insights and signals: The Personal Intelligence stack (see below) creates insights and signals which may be shared with the Brand as Zero-Party Data. Copies may also be stored in the customer’s DataVault to make them available for further processing. This data network effect is an important advantage of edge data processing.

The data is multi-dimensional, recording the data item plus time and location where applicable. For example, the DataVault may privately store that the individual plays tennis at X location and Y time.
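As an illustration only, a multi-dimensional record of this kind could be modelled as below. This is a minimal sketch, not the DataSapien SDK: the `MeDataRecord` class, its field names, the `shareable` helper and the sample coordinates are all hypothetical. It shows a data item carrying optional time and location, and a sharing flag that defaults to private.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class MeDataRecord:
    """One hypothetical DataVault entry: a data item plus optional time and place."""
    key: str                                         # e.g. "activity"
    value: Any                                       # e.g. "tennis"
    recorded_at: Optional[datetime] = None           # when it applies, if known
    location: Optional[Tuple[float, float]] = None   # (latitude, longitude), if known
    shared: bool = False                             # private (Zero-Shared) unless consented


def shareable(records: List[MeDataRecord]) -> List[MeDataRecord]:
    """Return only the records the customer has explicitly consented to share."""
    return [record for record in records if record.shared]


# "Plays tennis at X location and Y time", kept private by default.
tennis = MeDataRecord(
    key="activity",
    value="tennis",
    recorded_at=datetime(2024, 6, 1, 18, 0, tzinfo=timezone.utc),
    location=(51.4340, -0.2140),  # made-up coordinates
)
# A derived value the customer has chosen to share as Zero-Party Data.
age_group = MeDataRecord(key="age_group", value="20-30", shared=True)

print([record.key for record in shareable([tennis, age_group])])
```

Defaulting `shared` to `False` mirrors the principle that everything in the DataVault stays private unless the customer explicitly consents to share it.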

Data Wallet

The Data Wallet stores and exchanges Verifiable Credentials (VCs). These may come from formal providers of Verifiable Credentials (primarily government entities) and may then be shared with requestors of Verifiable Credentials, including as a Self-Sovereign exchange, i.e. providing proofs without revealing the sensitive underlying data.

The Data Wallet is designed using Open Wallet Foundation (OWF) standards. These are the same standards used by Google, Microsoft and Visa. It is capable of interacting with eIDAS and Mobile Driving License (mDL) providers of credentials, covering both European and USA standards.

A copy of the verifiable credentials data is also stored in the DataVault so that Personal Intelligence can derive extrapolated and inferred data from it. For example, a verified date of birth (e.g. 15/01/2000) is processed to provide a verified age (24 years at the time of writing), a verified age group (20-30 years at the time of writing) and a verified age threshold (over 18/21). These may then be shared as Zero-Party Data without sharing the sensitive verified data. They are also stored in the Data Wallet as inferred verified credentials.
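The date-of-birth example can be sketched in a few lines. This is an illustrative sketch rather than the SDK's implementation: the function names, the ten-year banding rule and the reference date of 1 July 2024 (standing in for "the time of writing") are assumptions.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Whole years completed between date_of_birth and today."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def derive_age_claims(date_of_birth: date, today: date) -> dict:
    """Derive shareable claims so the date of birth itself never has to leave the device."""
    age = age_on(date_of_birth, today)
    decade = (age // 10) * 10  # assumed banding: 20-30, 30-40, ...
    return {
        "age": age,
        "age_group": f"{decade}-{decade + 10}",
        "over_18": age >= 18,
        "over_21": age >= 21,
    }


# Assumed reference date; any day from 15/01/2024 to 14/01/2025 gives the same result.
claims = derive_age_claims(date(2000, 1, 15), today=date(2024, 7, 1))
print(claims)
```

Only the derived values, for instance `over_18`, would need to be shared, while the verified date of birth stays in the wallet.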

Personal Cloud

Private and secure data backup, plus multi-device data synchronisation, are provided by utilising the customer’s own iOS or Android personal cloud. Data is encrypted within the personal cloud. If a device is lost, the owner can connect a new device to their personal cloud to restore and synchronise their data. This approach maintains the integrity of the DataSapien edge architecture.

2. Personal AI and Algorithms

With the DataSapien edge architecture SDK, personal data held on the edge (on the device) is also processed on the customer’s edge device, utilising the DataSapien Personal Intelligence stack. This is similar to the Apple Intelligence approach of moving intelligence out to devices, and with DataSapien, Brands can tap into similar technology. In a Brand and Customer context, DataSapien Personal Intelligence enables Brands to create personal customer journeys that prioritise and serve the jobs that customers hire the brand to get done. Brands optimise the delivery of customer outcomes that align with their Brand Promise.

The DataSapien Personal Intelligence architecture comprises three layers:

2.1 Deterministic Intelligence

A sophisticated rules engine which runs deterministic ‘if-this-then-that’ actions locally on the data in the DataVault. The rules are defined in The Orchestrator (see below). This creates logical decision flows, surfaces insights, generates signals and conducts customer journeys on the customer’s edge device. This is ideal for activities such as the local generation of Next Best Actions (NBA) for the customer.
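A deterministic rule set of this kind can be pictured with the sketch below. It is a simplified illustration, not the DataSapien rules engine: the `Rule` class, the `evaluate` function, the vault keys and the action names are hypothetical, and a real deployment would receive its rules from the Orchestrator rather than defining them in code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[Dict[str, Any]], bool]  # the "if this"
    action: str                                  # the "then that"


def evaluate(rules: List[Rule], vault: Dict[str, Any]) -> List[str]:
    """Return the action of every rule whose condition holds, in rule order."""
    return [rule.action for rule in rules if rule.condition(vault)]


# Hypothetical rules, standing in for definitions pushed down to the device.
rules = [
    Rule("lapsed_buyer",
         lambda v: v.get("days_since_last_purchase", 0) > 60,
         "offer_welcome_back_coupon"),
    Rule("tennis_player",
         lambda v: "tennis" in v.get("activities", []),
         "suggest_racket_restringing"),
    Rule("loyalty_gold",
         lambda v: v.get("loyalty_tier") == "gold",
         "invite_to_preview_sale"),
]

# Hypothetical local data; it is read on the device and never leaves it.
vault = {
    "days_since_last_purchase": 75,
    "activities": ["tennis", "running"],
    "loyalty_tier": "silver",
}

print(evaluate(rules, vault))
```

Because the same data always produces the same actions, the output of such rules is straightforward to audit and test.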

Deterministic logic can also be used to process rating scales in the app. These might include psychometric insights or customer segmentations, for example, framed as “your shopper personality type”.

Privacy Enhancing Technologies (PETs) such as Homomorphic Encryption will extend these capabilities to interrogate and process data while it remains encrypted, utilising the key strengths of the edge architecture approach.

2.2 Probabilistic Intelligence

Machine Learning models can be deployed into the local SDK to process data and make probabilistic inferences and suggestions on the edge. These models are trained centrally and are deployed locally on-device, using the Orchestrator platform (see below).

A simple example might be deploying local recommendation models. These privately process a combination of data sources (e.g. age, location, interests, apps-on-device, browsing data and psychometrics) to provide Personal Intelligence which informs the serving of contextual coupons on local devices.
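To make the idea concrete, the sketch below scores candidate coupons on the device with a small logistic model. Everything in it is assumed for illustration: the weights stand in for parameters trained centrally and delivered to the device, and the feature names, coupon names and functions are hypothetical rather than part of the SDK.

```python
import math
from typing import Dict

# Hypothetical weights, standing in for a centrally trained model shipped to the device.
WEIGHTS = {"bias": -1.0, "interest_sport": 1.5, "near_store": 1.0, "age_20_30": 0.5}


def redemption_probability(features: Dict[str, float],
                           weights: Dict[str, float] = WEIGHTS) -> float:
    """Logistic score: the estimated chance that the customer redeems a coupon."""
    z = weights["bias"] + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def pick_coupon(candidates: Dict[str, Dict[str, float]]) -> str:
    """Return the candidate coupon with the highest local score."""
    return max(candidates, key=lambda name: redemption_probability(candidates[name]))


# Features built on the device from local data; they are never uploaded.
candidates = {
    "tennis_balls_10_off": {"interest_sport": 1.0, "near_store": 1.0, "age_20_30": 1.0},
    "garden_tools_10_off": {"near_store": 1.0, "age_20_30": 1.0},
}

print(pick_coupon(candidates))
```

Only the chosen coupon needs to be rendered in the app; the features used to choose it can stay private.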

We have conceptualised methods to train models on-the-edge directly between devices utilising Privacy Enhancing Technology (PET) advances such as Secure Multi-Party Computation (SMPC). This is on our lab and development roadmaps.

2.3 Generative Intelligence

Running Large Language Models (LLMs) over the sum of a customer’s Life Data holds enormous promise. There are a growing number of powerful open-source ‘Small Language Models’ capable of being deployed on smartphones. These can be grounded in the local data held on the edge, using techniques like Retrieval-Augmented Generation over the DataVault, to create highly personal marketing.
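The retrieval step of Retrieval-Augmented Generation can be sketched as follows. This is a toy illustration under stated assumptions: it ranks local notes by simple word overlap (cosine similarity over word counts) instead of the learned embeddings a production system would typically use, it stops at building the prompt without calling any language model, and the function names and sample notes are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> Counter:
    """Lower-cased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, notes: List[str], k: int = 2) -> List[str]:
    """Return the k local notes most similar to the query."""
    q = tokens(query)
    return sorted(notes, key=lambda n: cosine(q, tokens(n)), reverse=True)[:k]


def build_prompt(query: str, notes: List[str], k: int = 2) -> str:
    """Prepend the retrieved notes to the request that would go to an on-device model."""
    context = "\n".join(f"- {note}" for note in retrieve(query, notes, k))
    return f"Context from the customer's own data:\n{context}\n\nRequest: {query}"


# Hypothetical entries standing in for data held in the DataVault.
notes = [
    "plays tennis on saturday mornings",
    "allergic to peanuts",
    "bought running shoes in march",
]

print(build_prompt("suggest a tennis offer for saturday", notes, k=1))
```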

However, for the moment at least, there are significant risks in deploying LLM technology into a customer’s native device. A useful diagram from Gartner highlights where GenAI is and is not useful.

In addition to the biases inherent in most foundation models, LLMs are creative by nature and therefore prone to hallucination. In some situations, such as content creation or movie suggestions, this presents a relatively low risk, but in others this trait is highly dangerous and may leave the brand with major liabilities. For example, today the technology may generate incorrect nutrition advice for a person with severe food allergies, or provide the wrong flight departure time to a travelling customer.

Consequently, the utility and readiness of Large and Small Language Models that run on-the-edge (on the device), with data also held on-the-edge, are being investigated and tested in the DataSapien Lab.

The Holistic Personal Intelligence Stack: Combining Three Tiers

We are excited by the results of our early experimentation that combines Generative AI with deterministic and probabilistic intelligence to provide optimal outcomes. We look forward to providing more details on this as our learning progresses.

3. Data Orchestration & Administration

The Data Orchestrator is the control surface for the DataSapien edge architecture.

Brands define the data types that can be gathered and stored on the device, and the combinations of on-device Personal Intelligence deployed to process it.

The Orchestrator elements include a logical and easy-to-use web dashboard to define data schemas and model parameters.

The API Gateway provides centralised control and security for the third-party API connections that enable customers to gather data. The Question & Answer (Q&A) functionality of the Orchestrator draws on our extensive experience in building Consumer Insights platforms and is on par with the leading market research SaaS platforms. It also incorporates advanced features such as image sharing with object recognition.

Included in the Orchestrator is Platform Administration, which provides administrator controls such as Identity and Access Management, ‘locking’ of types of sensitive Zero-Shared Data, data science training environments and integration with other elements of the brand’s customer data environment, such as Customer Relationship Management (CRM) systems and Customer Data Platforms (CDPs).

Conclusion

DataSapien Technology enables brands to harness the power of Zero-Shared Data, ensuring user privacy and enhancing personal customer experiences.

Insights generated on the device can be selectively and transparently shared with brands as Zero-Party Data, fostering trust and supporting compliance with stringent privacy regulations.

Designed with a focus on seamless integration and user empowerment, the DataSapien platform offers a robust SDK and control surface that simplifies the implementation process for brands. This platform not only enhances the accuracy and relevance of consumer insights but also promotes transparency and user control over their personal data. The result is a significant uplift in a comprehensive array of customer metrics: Average Revenue Per User (ARPU), Net Promoter Score (NPS), Customer Retention Rate (CRR), Customer Satisfaction Score (CSAT), Customer Lifetime Value (CLV) etc.

With its innovative edge architecture approach, DataSapien is set to redefine data privacy standards and revolutionize the way brands engage with their customers, offering a win-win scenario where both parties benefit from a more secure and personalized data exchange.

For more technical information on the DataSapien edge architecture, check out the DataSapien Developer Portal.

If you have any comments or questions, please do get in touch. We’d love to hear from you.