Skip to content

Dog Costumes Problem


You own an ecommerce store that sells costumes for dogs 🐶. You'd like to analyze your web traffic by measuring the count of unique visitors amongst your web sessions data. Additionally, you want the ability to filter by date, so you can answer questions like

How many visitors visited my site?

and

How many visitors visited my site on February 20th, 2022?

You come up with the following count_visitors() function that you place inside measures.py.

measures.py
def count_visitors(sessions, date=None):
    """
    Count the number of unique visitors

    :param sessions: list of sessions were each session has a visitor_id and date
    :param date: optional filter by date
    :return: number of unique visitors
    """

    if date is None:
        visitors = {x.visitor_id for x in sessions}
    else:
        visitors = {x.visitor_id for x in sessions if x.date == date}

    return len(visitors)

To test it, you create test_measures.py as follows.

test_measures.py
from datetime import date
from types import SimpleNamespace
import time

from measures import count_visitors

def load_sessions():
    """Super expensive function that loads the sessions data"""

    print("loading sessions data.. hold on a sec")
    time.sleep(5)

    sessions = [
        SimpleNamespace(visitor_id=372, date=date.fromisoformat('2022-02-02')),
        SimpleNamespace(visitor_id=925, date=date.fromisoformat('2022-02-02')),
        SimpleNamespace(visitor_id=123, date=date.fromisoformat('2022-02-04')),
        SimpleNamespace(visitor_id=925, date=date.fromisoformat('2022-02-10')),
        SimpleNamespace(visitor_id=372, date=date.fromisoformat('2022-02-10')),
        SimpleNamespace(visitor_id=123, date=date.fromisoformat('2022-02-10')),
        SimpleNamespace(visitor_id=925, date=date.fromisoformat('2022-02-11')),
        SimpleNamespace(visitor_id=128, date=date.fromisoformat('2022-02-15')),
        SimpleNamespace(visitor_id=925, date=date.fromisoformat('2022-02-17')),
        SimpleNamespace(visitor_id=372, date=date.fromisoformat('2022-02-17'))
    ]

    return sessions


def test_count_visitors():
    """Confirm that count_visitors works as expected"""

    sessions = load_sessions()
    assert count_visitors(sessions) == 4


def test_count_visitors_on_date():
    """Confirm that count_visitors works as expected when a date filter is used"""

    sessions = load_sessions()
    assert count_visitors(sessions, date=date.fromisoformat('2022-02-17')) == 2

You have a problem.. Both of your test functions call the load_sessions() function in order to load the sessions data, but this data is really expensive (slow) to fetch. (In theory, load_sessions() might connect to a database and run an expensive query.) See if you can cut your tests runtime in half 😉.

Run with messages

Notice the print() statement inside the load_sessions() function. To view the output of this print statement in the console, run the tests with pytest -s as opposed to simply pytest. The-s flag tells pytest not to capture the result of standard out as it normally does.

pytest output: pytest_without_s-flag

pytest -s output: pytest_with_s-flag

Directory Structure

dog_costumes/
  measures.py
  test_measures.py