Skip to content

TV Commercials Problem


You own a national restaurant chain called Applewasps. To increase sales, you decide to launch a multi-regional television marketing campaign.

At the end of the campaign you have a table of commercials indicating when and where each commercial aired, and a table of sales indicating when and where customers generated sales.

import numpy as np
import pandas as pd

generator = np.random.default_rng(5555)
regions = ['north', 'south', 'east', 'west']

commercials = pd.DataFrame({
    'commercial_id': range(10),
    'region': generator.choice(regions, size=10),
    'date_time': pd.to_datetime('2020-01-01') + pd.to_timedelta(generator.integers(240, size=10), unit='h')
})

sales = pd.DataFrame({
    'sale_id': range(10),
    'region': generator.choice(regions, size=10),
    'date_time': pd.to_datetime('2020-01-01') + pd.to_timedelta(generator.integers(240, size=10), unit='h'),
    'revenue': np.round(generator.normal(loc=20, scale=5, size=10), 2)
})

print(commercials)
#    commercial_id region           date_time
# 0              0   west 2020-01-10 12:00:00
# 1              1  north 2020-01-10 16:00:00
# 2              2  south 2020-01-09 01:00:00
# 3              3   east 2020-01-10 19:00:00
# 4              4  south 2020-01-08 22:00:00
# 5              5   east 2020-01-03 02:00:00
# 6              6  south 2020-01-07 15:00:00
# 7              7   west 2020-01-05 22:00:00
# 8              8   east 2020-01-03 04:00:00
# 9              9   west 2020-01-05 04:00:00

print(sales)
#    sale_id region           date_time  revenue
# 0        0   west 2020-01-05 08:00:00    20.14
# 1        1   east 2020-01-08 22:00:00    22.98
# 2        2  south 2020-01-07 21:00:00    22.98
# 3        3   west 2020-01-05 17:00:00    16.82
# 4        4   west 2020-01-02 12:00:00    20.47
# 5        5   east 2020-01-10 09:00:00    26.93
# 6        6  north 2020-01-08 19:00:00    20.25
# 7        7  south 2020-01-01 08:00:00    23.38
# 8        8  south 2020-01-01 17:00:00    25.74
# 9        9  south 2020-01-10 22:00:00    22.28

In order to analyze the performance of each commercial, map each sale to the commercial that aired prior to the sale, in the same region.


Try with Google Colab See the solution