TV Commercials Problem¶

You own a national restaurant chain called Applewasps. To increase sales, you decide to launch a multi-regional television marketing campaign.

At the end of the campaign you have a table of commercials indicating when and where each commercial aired, and a table of sales indicating when and where customers generated sales.

import numpy as np
import pandas as pd

generator = np.random.default_rng(5555)
regions = ['north', 'south', 'east', 'west']

commercials = pd.DataFrame({
'commercial_id': range(10),
'region': generator.choice(regions, size=10),
'date_time': pd.to_datetime('2020-01-01') + pd.to_timedelta(generator.integers(240, size=10), unit='h')
})

sales = pd.DataFrame({
'sale_id': range(10),
'region': generator.choice(regions, size=10),
'date_time': pd.to_datetime('2020-01-01') + pd.to_timedelta(generator.integers(240, size=10), unit='h'),
'revenue': np.round(generator.normal(loc=20, scale=5, size=10), 2)
})

print(commercials)
#    commercial_id region           date_time
# 0              0   west 2020-01-10 12:00:00
# 1              1  north 2020-01-10 16:00:00
# 2              2  south 2020-01-09 01:00:00
# 3              3   east 2020-01-10 19:00:00
# 4              4  south 2020-01-08 22:00:00
# 5              5   east 2020-01-03 02:00:00
# 6              6  south 2020-01-07 15:00:00
# 7              7   west 2020-01-05 22:00:00
# 8              8   east 2020-01-03 04:00:00
# 9              9   west 2020-01-05 04:00:00

print(sales)
#    sale_id region           date_time  revenue
# 0        0   west 2020-01-05 08:00:00    20.14
# 1        1   east 2020-01-08 22:00:00    22.98
# 2        2  south 2020-01-07 21:00:00    22.98
# 3        3   west 2020-01-05 17:00:00    16.82
# 4        4   west 2020-01-02 12:00:00    20.47
# 5        5   east 2020-01-10 09:00:00    26.93
# 6        6  north 2020-01-08 19:00:00    20.25
# 7        7  south 2020-01-01 08:00:00    23.38
# 8        8  south 2020-01-01 17:00:00    25.74
# 9        9  south 2020-01-10 22:00:00    22.28


In order to analyze the performance of each commercial, map each sale to the commercial that aired prior to the sale, in the same region.