Fair Teams Solution
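import numpy as np
import pandas as pd

# coaches and players are assumed to be already defined as string Series
# Shuffle both Series (frac=1 samples 100% of the values, without replacement)
coaches = coaches.sample(frac=1, random_state=2357)
players = players.sample(frac=1, random_state=7532)
# Repeat the shuffled coaches enough times to cover every player, then trim
repeats = np.ceil(len(players)/len(coaches)).astype('int64')
coaches_repeated = pd.concat([coaches] * repeats).head(len(players))
# Label each player with a coach by using the coach names as the index
result = players.copy()
result.index = pd.Index(coaches_repeated, name='coach')
print(result)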
# coach
# Stephen Julian
# Scott Joshua
# Aaron Elizabeth
# Joshua Asher
# Peter Oliver
# Donald William
# Stephen Wyatt
# Scott Isaiah
# Aaron Ethan
# Joshua Madison
# Peter Leo
# Donald Ryan
# Stephen Scarlett
# Scott Mia
# Aaron Connor
# Joshua James
# Peter Emily
# Donald Hannah
# Stephen Layla
# Scott Isabella
# dtype: string
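A quick way to confirm that the assignment is fair is to count how many players each coach received. The sketch below does not rerun the solution; it rebuilds the coach labels from the printed result above (one shuffled cycle of six coaches, repeated and trimmed to 20 players). The list literal and the names cycle, coach_labels, team_sizes, and size_gap are illustrative assumptions, not part of the original solution.

```python
import pandas as pd

# Coach order copied from the printed result above (one shuffled cycle).
cycle = ["Stephen", "Scott", "Aaron", "Joshua", "Peter", "Donald"]
n_players = 20

# Repeat the cycle and trim to one label per player, as the solution does.
coach_labels = pd.Index((cycle * 4)[:n_players], name="coach")

# Count players per coach; fair teams differ in size by at most one.
team_sizes = coach_labels.value_counts()
size_gap = int(team_sizes.max() - team_sizes.min())
print(team_sizes.to_dict())
print(size_gap)  # 1
```

With the real result Series in hand, the same check would be result.index.value_counts(). Here Stephen and Scott each get four players and the other coaches get three.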
Explanation
- Randomly shuffle the coaches and players Series using the Series.sample() method with frac=1, which means "randomly sample 100% of the Series values". By default, sampling happens without replacement, so the effect is a random shuffle of each Series.

# randomly shuffle each series
coaches = coaches.sample(frac=1, random_state=2357)
players = players.sample(frac=1, random_state=7532)

print(coaches)
# 5 Stephen
# 4 Scott
# 0 Aaron
# 2 Joshua
# 3 Peter
# 1 Donald
# dtype: string

print(players)
# 10 Julian
# 9 Joshua
# 2 Elizabeth
# 0 Asher
# 15 Oliver
# 18 William
# 19 Wyatt
# 7 Isaiah
# 4 Ethan
# 13 Madison
# 12 Leo
# 16 Ryan
# 17 Scarlett
# 14 Mia
# 1 Connor
# 8 James
# 3 Emily
# 5 Hannah
# 11 Layla
# 6 Isabella
# dtype: string
- Repeat the coaches Series until it's at least as long as the players Series. With help from NumPy's ceil() function, we find that we must repeat the coaches Series four times, since 20 players / 6 coaches ≈ 3.33, which rounds up to 4.

repeats = np.ceil(len(players)/len(coaches)).astype('int64')
print(repeats) # 4
We can repeat the Series by wrapping it inside a list and using Python's repetition operator, *.

[coaches] * repeats
Note
This makes a list with four elements, each of which is a reference to the same coaches Series. The data isn't copied at this point.

Then we can use the concat() function to combine the list of Series into a single Series object.

pd.concat([coaches] * repeats)
# 5 Stephen
# 4 Scott
# 0 Aaron
# 2 Joshua
# 3 Peter
# 1 Donald
# 5 Stephen
# 4 Scott
# ... .....
# 3 Peter
# 1 Donald
# dtype: string
This new Series has 24 elements, slightly more than the 20 elements in our players Series, so we chain it with .head(len(players)) to pick out the first 20 values.

coaches_repeated = pd.concat([coaches] * repeats).head(len(players))
print(coaches_repeated)
# 5 Stephen
# 4 Scott
# 0 Aaron
# 2 Joshua
# 3 Peter
# 1 Donald
# 5 Stephen
# 4 Scott
# 0 Aaron
# 2 Joshua
# 3 Peter
# 1 Donald
# 5 Stephen
# 4 Scott
# 0 Aaron
# 2 Joshua
# 3 Peter
# 1 Donald
# 5 Stephen
# 4 Scott
# dtype: string
- Lastly, we copy the players Series into a new Series called result and then set its index equal to the coaches_repeated values we generated in the previous step.

result = players.copy()
result.index = pd.Index(coaches_repeated, name='coach')
print(result)
# coach
# Stephen Julian
# Scott Joshua
# Aaron Elizabeth
# Joshua Asher
# Peter Oliver
# Donald William
# Stephen Wyatt
# Scott Isaiah
# Aaron Ethan
# Joshua Madison
# Peter Leo
# Donald Ryan
# Stephen Scarlett
# Scott Mia
# Aaron Connor
# Joshua James
# Peter Emily
# Donald Hannah
# Stephen Layla
# Scott Isabella
# dtype: string
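The three steps can be collected into one function. The sketch below is self-contained, under a few stated assumptions: the input Series are reconstructed from the indexes and values printed above, both Series are assumed to be non-empty, and the function name assign_coaches and its parameters coach_seed and player_seed are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd


def assign_coaches(coaches, players, coach_seed=None, player_seed=None):
    """Return players indexed by coach, with team sizes differing by at most one.

    Hypothetical helper; assumes both Series are non-empty.
    """
    # Step 1: shuffle both Series (frac=1 samples every value, without replacement).
    coaches = coaches.sample(frac=1, random_state=coach_seed)
    players = players.sample(frac=1, random_state=player_seed)

    # Step 2: repeat the shuffled coaches to cover every player, then trim.
    repeats = int(np.ceil(len(players) / len(coaches)))
    coaches_repeated = pd.concat([coaches] * repeats).head(len(players))

    # Step 3: use the coach names as the index of the shuffled players.
    result = players.copy()
    result.index = pd.Index(coaches_repeated, name="coach")
    return result


# Input data reconstructed from the printed output above (ordered by original index).
coaches = pd.Series(
    ["Aaron", "Donald", "Joshua", "Peter", "Scott", "Stephen"], dtype="string"
)
players = pd.Series(
    ["Asher", "Connor", "Elizabeth", "Emily", "Ethan", "Hannah", "Isabella",
     "Isaiah", "James", "Joshua", "Julian", "Layla", "Leo", "Madison", "Mia",
     "Oliver", "Ryan", "Scarlett", "William", "Wyatt"],
    dtype="string",
)

result = assign_coaches(coaches, players, coach_seed=2357, player_seed=7532)
print(result)
```

Because the function shuffles copies returned by sample(), the caller's coaches and players Series are left unchanged. Passing different seeds, or None, produces a different but equally balanced assignment.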