Data Correlation Best Practices: Correlation Does Not Necessarily Equal Causation

Rebecca Haly | 05.06.2020

There is no doubt that we see cause and effect in marketing; after all, marketers want consumers to take action as a result of their marketing efforts. However, there are dangers to be aware of when correlating data, especially when you’re seeking to understand causality. In this post, we will explore the pitfalls and best practices of correlating marketing data, and the steps marketers can take to avoid claiming inaccurate causality.


First, let’s define these terms:

Correlation: a statistical measure (expressed as a number) that describes the size and direction of a relationship between two or more variables.

In other words, the relationship between data sets.

Causation: an indication that one event is the result of the occurrence of the other event; i.e. there is a causal relationship between the two events.

Or simply put, the impact of one thing on another.
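To make the “expressed as a number” part concrete, here is a minimal sketch (in Python, using made-up daily figures rather than anything from the Datorama platform) of how a Pearson correlation coefficient summarises the relationship between two metrics:

```python
# A minimal sketch, not Datorama functionality: computing a Pearson
# correlation coefficient between two hypothetical marketing metrics.
import numpy as np
from scipy import stats

# Hypothetical daily figures, for illustration only
media_spend = np.array([1200, 1500, 900, 1800, 2100, 1600, 1300])   # dollars
ecommerce_sales = np.array([340, 410, 280, 450, 530, 420, 360])     # orders

r, p_value = stats.pearsonr(media_spend, ecommerce_sales)
print(f"correlation r = {r:.2f}, p-value = {p_value:.3f}")

# r describes the size and direction of the relationship (-1 to +1).
# Even a strong r says nothing, by itself, about which metric (if either)
# is driving the other; that is the causation question.
```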


As such, we must start with correlation and then determine whether it qualifies as causation.

Just like a year 8 science experiment, data analysis should be led by the hypothesis you are trying to prove or disprove. For example:

  • COVID-19 is having a positive impact on eCommerce sales for our shoe department.
  • Climate change is having a negative impact on the number of car sales we’ve made in the last 10 years.
  • The Launch campaign created an overall uplift in brand sentiment, awareness and sales for 1H.

Any data can be correlated to show a relationship; however, that does not necessarily mean that one variable has had an impact on the other. It has never been more important for marketers to have an accurate view of the impact of their messages, campaigns and media spend. The more insight they gain, the greater the impact marketers can have on both their customers and their business.

Here are some key tips to help you avoid common mistakes when trying to show correlation or causality in marketing data sets:

1. Avoid inflammatory statements

One of the greatest traps when describing observations in correlated data is using inflammatory or sensationalist language to prop up a tenuous relationship and imply causation. Avoid this at all costs; your executives will see straight through it.

Instead, Avinash Kaushik, Digital Marketing Evangelist at Google, implores us to be skeptical and to pose further questions that demonstrate our rigour. Ask yourself:

  • What if we cut the data differently?
  • What if we removed particular channels?
  • Are there too many variable changes to accurately see a correlation or causation?
  • What assumptions are we making?
  • Was this intentionally designed (i.e. a planned campaign) or a byproduct of other activity?

2. Choose the right graph for your analysis

Within the Datorama platform, you can choose from over 100 widgets to effectively visualise your data. Through smart recommendations, you will receive suggestions for the appropriate widget to use, according to the data set you’re analysing. You can also create your own custom widgets to visualise data in your own unique way.

Some suggested graphs to start your correlation analysis include:

Time Series Graphs: Graphs that compare multiple metrics over a period of time. These are great for looking at trends and seasonality in data.

Distribution Graphs: Graphs that can easily demonstrate whether there is a correlation. These are great for seeing distribution against a mean.

Relationship Graphs: Graphs that show a relationship between 2 or more variables. A bubble chart is best used for showing a relationship between 3 variables.
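If you want to prototype these views outside the platform, the sketch below is illustrative only: it uses made-up weekly figures and Python’s matplotlib to draw a simple time series graph alongside a relationship (scatter) graph.

```python
# Illustrative sketch only: a time series graph and a relationship graph
# drawn with matplotlib, using made-up weekly campaign data.
import matplotlib.pyplot as plt

weeks = list(range(1, 9))
impressions = [120, 150, 170, 160, 210, 230, 220, 260]  # thousands
sales = [30, 36, 41, 38, 52, 55, 54, 63]                # thousands of dollars

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Time series graph: multiple metrics over time, good for trends and seasonality
ax1.plot(weeks, impressions, label="Impressions (k)")
ax1.plot(weeks, sales, label="Sales ($k)")
ax1.set_xlabel("Week")
ax1.set_title("Time series")
ax1.legend()

# Relationship graph: one metric against another, good for spotting correlation
ax2.scatter(impressions, sales)
ax2.set_xlabel("Impressions (k)")
ax2.set_ylabel("Sales ($k)")
ax2.set_title("Relationship")

plt.tight_layout()
plt.show()
```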

When sharing your findings, show only the simplest graphs needed to demonstrate your points. Your homework is important, but it doesn’t belong in the executive summary.

3. Use a robust data set for more informed analysis

When analysing a campaign’s impact on brand KPIs or sales, it’s tempting to leverage an outlier as the backbone of your impact analysis. Consistent correlation across the data set provides a far more robust basis for your analysis.

When interrogating an outlier, analyse its integrity through rigorous questioning. Is there a strategic cause-and-effect relationship here, or simply an increase in budget? Call out your assumptions.
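As a hedged illustration of why outliers deserve this scrutiny, the short Python sketch below (with invented spend and sales figures) shows how a single extreme week can manufacture an apparently strong correlation out of otherwise flat data:

```python
# Illustrative only: one outlier can create an apparently strong correlation
# where the rest of the data shows almost none.
import numpy as np
from scipy import stats

spend = np.array([1000, 1100, 1050, 1200, 1150, 1080, 1120])
sales = np.array([200, 190, 215, 205, 198, 212, 201])

r_without, _ = stats.pearsonr(spend, sales)

# Add one outlier week, e.g. a big budget spike that coincided with a promotion
spend_with_outlier = np.append(spend, 5000)
sales_with_outlier = np.append(sales, 900)
r_with, _ = stats.pearsonr(spend_with_outlier, sales_with_outlier)

print(f"r without outlier = {r_without:.2f}")  # weak
print(f"r with outlier    = {r_with:.2f}")     # looks very strong
```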

4. Use reputable data sources

When using external data sources to prove or disprove your hypothesis, make sure they are reputable. Partial data sets and non-reputable sources are dangerous data points to correlate with your rigorous marketing data. Used as reference or contextual data points, and described as such, they may bring further rigour to your findings; however, they should be called out accordingly. Otherwise you risk introducing data bias.

5. Limit variables for testing

Many marketers are adjusting their strategies today to account for the new ‘normal’: they are in the test-and-learn phase and are constantly optimising across many tactics and channels. As a result, it can be difficult to isolate the impact of each campaign activity on hypotheses or KPIs.

To better understand the correlation effect, a designed test-and-learn campaign is a great place to start: limit the variables being tested and analyse the isolated results.
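As a rough sketch of what such a readout might look like (hypothetical conversion figures and Python/SciPy, not a Datorama feature), the example below compares a test group against a control group where only one variable, the creative, has changed:

```python
# Illustrative only: a simple test-and-learn readout where a single variable
# (a new creative) changes and everything else is held constant.
import numpy as np
from scipy import stats

# Hypothetical daily conversions for matched regions over the same period
control_group = np.array([52, 48, 55, 50, 53, 49, 51, 54, 50, 52])  # existing creative
test_group = np.array([58, 61, 57, 63, 59, 60, 62, 58, 61, 60])     # new creative

t_stat, p_value = stats.ttest_ind(test_group, control_group)
lift = test_group.mean() / control_group.mean() - 1

print(f"observed lift = {lift:.1%}, p-value = {p_value:.4f}")

# Because only one variable changed, a significant lift here is far easier
# to attribute than a lift observed while many tactics changed at once.
```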

6. Always start with a hypothesis

Think of correlation analysis as an experiment. Knowing your hypothesis from the start will help you choose the right visuals to support your experiment. Oftentimes marketers dive into data without knowing what they want to get out of it, which can lead to an incoherent story.

7. Short-term and long-term measurement

Brands aren’t built overnight. Advertising may or may not have an immediate impact on your brand KPIs. Understanding how your consumers digest the media you put in front of them will help you come up with better ways of analysing your data. A short-term and long-term measurement framework is useful here.
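One lightweight way to sketch such a framework (illustrative only, using a simulated brand KPI in Python/pandas) is to track the same metric over a short and a long rolling window:

```python
# Illustrative only: tracking one brand KPI on a short and a long rolling window.
import numpy as np
import pandas as pd

# Simulated daily brand-search volume: a slow upward trend plus noise
dates = pd.date_range("2020-01-01", periods=365, freq="D")
rng = np.random.default_rng(42)
brand_searches = pd.Series(
    1000 + np.linspace(0, 150, 365) + rng.normal(0, 40, 365),
    index=dates,
)

short_term = brand_searches.rolling(window=7).mean()   # reacts to individual bursts of activity
long_term = brand_searches.rolling(window=90).mean()   # shows whether the brand is building over time

print(short_term.tail(3).round(1))
print(long_term.tail(3).round(1))
```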

The key to analysing correlation and causality is to understand the difference between the two. Start your analysis with correlation, and let your questions guide your findings. May your assumptions always be called out, and your graphs be simple yet effective.

Go forth and correlate.


This post was authored by Rebecca Haly, Director of Success Managers for ASEAN, GCR & India at Datorama. You’ll continue to hear from our Datorama experts from across the globe. Stay tuned and check back at datorama.com/blog
