Event data, also called interaction data or behavioural data, is essential for discovering how users and customers interact with digital products and services.
These can be anything from SaaS platforms to websites or smartphone/tablet apps, or software of pretty much any kind.
Utilising customer data is one of the primary methods that businesses and organisations can use to become more data-driven as it provides an empirical description of how users interact with a brand or business and its products.
Today, customer data tends to live inside customer data platforms, customer data infrastructure and their related tools.
There’s a reason why the customer data platform category is expected to grow from $2.4 billion in 2020 to $10.3 billion by 2025: these platforms give businesses rapid access to customer data, which they can use to optimise products, campaigns and more.
Event and Entity Data
Event data can be created whenever an entity interacts with a product or service. For example, clicks, scrolls and swipes are all navigational events and provide clues as to how users actually use a product or service, whereas purchases, subscriptions and other events enable teams to analyse the sales funnel or customer journey.
Entity data adds another level of meaning to event data, although events can be created with or without users. The presence of entity data (or users) enables data practitioners to understand who is doing what.
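As a rough illustration of an event that may or may not carry an entity, here is a minimal Python sketch. The Event class, its field names and the sample values are all assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Event:
    """One interaction with a product; every field name here is illustrative."""
    name: str                      # e.g. "click", "scroll", "purchase"
    timestamp: datetime
    user_id: Optional[str] = None  # None when no signed-in user is known
    properties: dict = field(default_factory=dict)


def is_attributed(event: Event) -> bool:
    """True when the event can be tied to a known entity."""
    return event.user_id is not None


events = [
    # An anonymous navigational event: still useful, but nobody to attach it to
    Event("scroll", datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc)),
    # A purchase made by a signed-in user
    Event("purchase", datetime(2024, 1, 5, 9, 2, tzinfo=timezone.utc),
          user_id="u_123", properties={"sku": "A-1"}),
]

attributed = [e.name for e in events if is_attributed(e)]
```

Only the second event can answer a "who did this" question, which is the gap that entity data fills.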
The primary entity that data teams will possess in most commercial contexts is probably user_id, which will be included in any product that requires users to sign in prior to use.
This is the chief identifier of who is in the session – in the driver’s seat. Once user ids and other identifiers combine with event data, you can answer questions such as:
- How many users use X feature?
- How long did it take for X user to find X feature?
- What happens before X user quits the software?
- How do loyal or long-term customers use the product?
- How many events occurred before activation?
- Which users are more likely to churn?
Questions such as the last one, ‘which users are more likely to churn’, can be broken down further, e.g. with an analysis of the events churners have in common and how these correlate with their entity attributes, such as age, occupation, location, etc.
These kinds of investigations allow product teams to create targeted emails, in-app messages, push notifications, etc, which are aimed at certain segments of customers.
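A couple of the questions above can be sketched in a few lines of Python. This is only a toy example: the event names, the user attributes and the choice to treat a cancel_subscription event as churn are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical event rows: (user_id, event_name)
events = [
    ("u1", "export_report"), ("u1", "login"),
    ("u2", "login"), ("u2", "cancel_subscription"),
    ("u3", "export_report"), ("u3", "cancel_subscription"),
]

# Hypothetical entity attributes, keyed by user_id
users = {
    "u1": {"location": "UK", "age": 34},
    "u2": {"location": "UK", "age": 22},
    "u3": {"location": "DE", "age": 24},
}


def users_with_event(rows, event_name):
    """Return the set of user ids that triggered event_name at least once."""
    return {uid for uid, name in rows if name == event_name}


# "How many users use X feature?"
feature_users = users_with_event(events, "export_report")

# Treat a cancellation event as churn (an assumption), then break churners
# down by an entity attribute.
churners = users_with_event(events, "cancel_subscription")
churn_by_location = defaultdict(int)
for uid in churners:
    churn_by_location[users[uid]["location"]] += 1
```

The join between events and users is just a dictionary lookup on user_id here; in a warehouse it would be a join on the same key.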
One Event Can Link to Multiple Entities
It’s worth bearing in mind that whilst the user is often the primary entity, accounts and products can also be associated with events. An account with an eCommerce store, subscription product retailer, ride-hailing/taxi app, etc, is likely to be used by a single user.
This makes linking events to that single user fairly straightforward.
However, for a SaaS product, say, there might be multiple users operating under one organisation. This means associating events with entities at both user and organisation level, i.e. linking both user_id and organisation_id to the same events. See the example below.
So, if user Joe Bloggs creates a new project inside some project management software used by ACME INC, an organisation with 20 users, the same action can be read in two ways:
- Joe Bloggs created a project under his user ID
- ACME INC created a new project under its organisation
Both entities matter here; linking only one of the two to the events will leave either the user-level or the organisation-level view incomplete.
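The Joe Bloggs example could be recorded roughly as follows. This is a sketch under assumed field names (name, user_id, organisation_id); the extra users and the second organisation are made up to give the counts something to work with.

```python
from collections import Counter

# Each event carries both identifiers, so it can be rolled up either way.
events = [
    {"name": "project_created", "user_id": "joe.bloggs", "organisation_id": "acme-inc"},
    {"name": "project_created", "user_id": "jane.doe", "organisation_id": "acme-inc"},
    {"name": "project_created", "user_id": "sam.lee", "organisation_id": "globex"},
]


def count_by(rows, key, event_name):
    """Count occurrences of event_name grouped by the given entity key."""
    return Counter(r[key] for r in rows if r["name"] == event_name)


projects_per_user = count_by(events, "user_id", "project_created")
projects_per_org = count_by(events, "organisation_id", "project_created")
```

If organisation_id were missing from these rows, the organisation-level count could not be produced without a separate lookup from users to organisations.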
Events that pertain to organisation-level control, e.g. starting a SaaS subscription, should be associated with both the organisation and all its users. These events (e.g. events that affect all employees) do not take place at the standard user level. So, the main account holder will likely require different marketing and product comms than employees.
At the same time, other events should definitely not be restricted to organisation level only. For example, if a product team is designing some in-app help messages for some SaaS software then these will need to trigger based on the behaviour of the user Joe Bloggs.
One User Links To Multiple Accounts
Another scenario is when a user is linked to multiple accounts, e.g. a freelance marketer who works with multiple brands. This is also possible for users connecting to project management software like Asana, Trello, ClickUp, Notion, etc, where one user is linked to multiple accounts and organisations.
In this scenario, it’s important to be able to delineate which organisation the user is working for in any given session. This provides a description of what is going on at the account level, and not user level, which is somewhat the opposite of the above.
This is more of a data integrity point than anything else. It’s important for event data to be structured in the right way to avoid the possibility of false positives and other incorrect analyses. Isolating events at organisation/account and user levels ensures that event data is structured properly.
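One way to keep these levels separate is to stamp every event with the organisation that is active in the session, and to reject events for organisations the user does not belong to. The sketch below assumes a simple in-memory memberships mapping and a hypothetical record_event helper; neither comes from a specific tool.

```python
from collections import defaultdict

# Hypothetical mapping: one user can belong to several organisations
memberships = {"freelancer_1": {"brand_a", "brand_b"}}


def record_event(log, user_id, organisation_id, name):
    """Append an event tagged with the organisation active in this session."""
    if organisation_id not in memberships.get(user_id, set()):
        raise ValueError(f"{user_id} is not a member of {organisation_id}")
    log.append({"user_id": user_id,
                "organisation_id": organisation_id,
                "name": name})


log = []
record_event(log, "freelancer_1", "brand_a", "campaign_created")
record_event(log, "freelancer_1", "brand_b", "campaign_created")
record_event(log, "freelancer_1", "brand_b", "report_viewed")

# Account-level view: the same user's activity, split by organisation
events_per_org = defaultdict(list)
for event in log:
    events_per_org[event["organisation_id"]].append(event["name"])
```

Without the organisation_id on each row, all three events would be attributed to the freelancer alone, and the activity of brand_a and brand_b could not be told apart.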
Many to Many Relationships
When it comes to customer data, the vast majority of the relationships between different entities (e.g. customers and their purchases) are many to many relationships. This simply means that one customer can buy many products and one product can be bought by many customers, so the database or table has to support many to many relationships.
Many to many relationships may be stored in a single self-linked table, which is easiest when the entities are the same class (e.g. customers and their relationships to purchases within the same product category, or different membership tiers).
Alternatively, many to many relationships may be stored in different tables for different classes of entities, which are connected by junction tables. These are also called associative tables. The junction table maps or relates two or more tables together by referencing the primary keys of each data table.
In the case of the self-linked table, you’d only be able to store multiple linear relationships between customers and their purchases. But, if there are different characteristics to those purchases, or you want to relate customers to different product classes (e.g. products, subscriptions and digital downloads), or you want to relate customers and their purchases to other data attributes, e.g. the device type they used to order, or promo/coupon IDs, then you’ll need a junction table.
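A junction table can be sketched with Python's built-in sqlite3 module. The table names, column names and sample rows below are assumptions made for this example, and the composite primary key is a simplification that allows each customer to buy a given product only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, title TEXT, product_class TEXT);

-- Junction (associative) table: references the primary keys of both tables
-- and carries attributes that belong to the purchase itself.
CREATE TABLE purchases (
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    product_id  INTEGER NOT NULL REFERENCES products(product_id),
    device_type TEXT,
    promo_code  TEXT,
    PRIMARY KEY (customer_id, product_id)
);

INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO products  VALUES (10, 'Desk lamp', 'product'),
                             (11, 'Pro plan', 'subscription');
INSERT INTO purchases VALUES (1, 10, 'mobile', NULL),
                             (1, 11, 'desktop', 'WELCOME10'),
                             (2, 10, 'desktop', NULL);
""")

# Resolve the many to many relationship by joining through the junction table
rows = conn.execute("""
    SELECT c.name, p.title, pu.device_type
    FROM purchases pu
    JOIN customers c ON c.customer_id = pu.customer_id
    JOIN products  p ON p.product_id  = pu.product_id
    ORDER BY c.name, p.title
""").fetchall()
```

Here one customer has bought two products and one product has been bought by two customers, while device_type and promo_code live on the purchase row rather than on either entity.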
Entities and events may also link hierarchically. A data hierarchy involves parent-child relationships which are organised in an overall tree structure and will assist in developing data models and visualising the problem space.
Products may also be hierarchically linked through their various categories, e.g. the ‘home’ section of a shop might include various subsections ‘garden’, ‘furniture’, etc.
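A parent-child hierarchy like this can be represented with a simple child-to-parent mapping. The sketch below uses the ‘home’, ‘garden’ and ‘furniture’ categories from the text; the ‘sofas’ subcategory and the two helper functions are hypothetical additions for illustration.

```python
# Each category points to its parent; the root has no parent.
parent = {
    "home": None,
    "garden": "home",
    "furniture": "home",
    "sofas": "furniture",  # hypothetical deeper level
}


def path_to_root(category):
    """Return the chain of categories from the root down to `category`."""
    path = []
    while category is not None:
        path.append(category)
        category = parent[category]
    return list(reversed(path))


def descendants(category):
    """Return every category nested under `category`, depth first."""
    found = []
    for child, par in parent.items():
        if par == category:
            found.append(child)
            found.extend(descendants(child))
    return found


sofa_path = path_to_root("sofas")
home_children = descendants("home")
```

Walking up the tree gives the full category path for a product, while walking down collects everything that should be included when reporting on a parent category.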
An order or purchase will be hierarchically linked to the order channel (e.g. smartphone app or website), order date, shipping time, promo code used (if applicable), device type, add-ons, order note, etc.
Customer data is also typically hierarchical. This assists in segmentation, enabling the selection of customers that share one attribute in the hierarchy tree but not the others.
A good example of a data hierarchy is Google Ads, which has a simple top-down structure for organising campaigns and their various components – see below.
Customer Entity Data Contains Multiple Identifiers
Entity data goes far beyond the user_id. It’s composed of different forms of personally identifiable information (PII), such as names, addresses, emails, age, gender, phone numbers, etc. Security and regulatory compliance (e.g. GDPR) are essential when dealing with PII.
PII data will also need to be categorised properly. Correlating events with different segments of users is one of the most powerful ways to harness event and entity data together.
Some examples of identifiers include:
- Names, email and phone numbers
- Demographic information: gender, age and location
- Persona, sector or industry
- Personal and brand preferences, genres, product categories, etc
- Buying history
- Product usage stats, e.g. usage time, time windows, apps used, features used, etc
Specifying these properly in your data tracking plan is essential. You need to know what entity data you’re collecting, how you’re handling it and what you’re using it for.
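As a loose sketch of what that might look like, a tracking plan can be written down as a small schema that records each field's category and purpose. Everything below is an assumption for illustration, including the field names, the direct_identifier flag and the for_analytics helper; how fields are actually classified should come from your own compliance review.

```python
# Illustrative tracking plan: what is collected, how it is classed, and why
TRACKING_PLAN = {
    "user_id":  {"category": "identifier",  "direct_identifier": False,
                 "purpose": "join events to entities"},
    "email":    {"category": "contact",     "direct_identifier": True,
                 "purpose": "transactional messages"},
    "phone":    {"category": "contact",     "direct_identifier": True,
                 "purpose": "account recovery"},
    "location": {"category": "demographic", "direct_identifier": False,
                 "purpose": "segmentation"},
    "industry": {"category": "persona",     "direct_identifier": False,
                 "purpose": "segmentation"},
}


def validate(record):
    """Reject any field that the tracking plan does not describe."""
    unknown = set(record) - set(TRACKING_PLAN)
    if unknown:
        raise ValueError(f"fields not in tracking plan: {sorted(unknown)}")


def for_analytics(record):
    """Return a copy of the record without directly identifying fields."""
    validate(record)
    return {key: value for key, value in record.items()
            if not TRACKING_PLAN[key]["direct_identifier"]}


raw = {"user_id": "u_123", "email": "ana@example.com",
       "location": "UK", "industry": "retail"}
safe = for_analytics(raw)
```

Keeping the purpose next to each field makes it easier to answer what is being collected and why when the plan is reviewed.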
Summary: How Do Entities Link To Event Data
Merely considering how entities and events intersect can lead to ideas. Want to know why large numbers of your under-25 users churn on the 15th of each month? Puzzled over why carts are abandoned when someone adds X product? Need to find out which users are the biggest spenders and what they buy? Then collecting both event and entity data can help.
Once you’ve got a smooth-flowing modern data stack in place, thinking creatively about these kinds of problems suddenly becomes a lot easier. You can then develop tracking plans to harness your customer data and utilise it for both analytics and the design of targeted campaigns aimed at different customer segments.
What is entity data?
An entity, in data science terms, is a single object, usually a person, product or organisation. Entities have attributes that describe them, e.g. a timestamp, a phone number, an address or some other value. These values are broken down into various data types, which are another important concept here.
What is event data?
Event data describes a process, state or some other input. This encompasses a huge range of possibilities from the events that occur on a website when someone scrolls and clicks through it to the events that occur in a digital climate model. Event data can be extremely detailed, providing data scientists with a rich blow-by-blow account of what is going on in a particular system.
What is customer data?
Customer data includes both entities and events. The entities here are usually people or organisations/businesses and the events describe how they interact with products, software, etc.