# Mine Field Solution¶

import numpy as np
from scipy import sparse

# Describe the field dimensions
dims = (10000, 10000)
Ncells = dims * dims

# Build a random number generator
rng = np.random.default_rng(2357)

# Randomly determine the total number of mines
Nmines = rng.binomial(n=Ncells, p=0.1, size=1)

# Randomly determine each mine's position
position_ids = rng.choice(a=Ncells, size=Nmines, replace=False)
positions = np.unravel_index(position_ids, shape=dims)

# Build a compressed sparse row matrix with the constructor:
# csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])
minefield = sparse.csr_matrix((np.ones(shape=Nmines), positions), shape=dims)

minefield
# <10000x10000 sparse matrix of type '<class 'numpy.float64'>'
#   with 9995505 stored elements in Compressed Sparse Row format>

print(minefield)
# (0, 2)    1.0
# (0, 16)   1.0
# (0, 22)   1.0
# : :
# (9999, 9940)  1.0
# (9999, 9945)  1.0
# (9999, 9948)  1.0


### Explanation¶

Our end goal is to build a compressed sparse row matrix.

Why Compressed Sparse Row?

We use compressed sparse row (csr) instead of compressed sparse column (csc) because the problem notes that we expect to fetch the matrix's rows frequently. Remember - csr matrices have fast row retrieval; csc matrices have fast column retrieval.

1. First we generate random mines, each with probability 0.001.

NumPy Random Generator

Here we make use of NumPy's default random generator. If you're unfamiliar with it, see here.

Your first instinct might be to generate a 10,000 x 10,000 array where each element is randomly determined via the Bernoulli distribution,

$$Pr(X=1)=p$$
$$Pr(X=0)=1-p$$

where $$p=0.002$$.

In NumPy you could achieve this using random.Generator.choice().

# Build a random number generator
rng = np.random.default_rng(2357)

# Build a 4x4 array of 0/1 where P(1) = 0.25
rng.choice([0,1], size=(4,4), replace=True, p=[0.25, 0.75])
# array([[1, 1, 1, 1],
#        [0, 1, 1, 1],
#        [1, 1, 1, 0],
#        [0, 0, 1, 1]])


For small arrays this is fine, but this technique does not scale well to large matrices. In our case, it would require running 100M Bernoulli trials!

To get around this problem, we randomly determine the total number of mines, then randomly determine their positions.

1. Randomly determine the total number of mines.

We can randomly determine the total number of mines using the Binomial distribution. Fortunately NumPy's random Generator has a binomial() method that makes this easy.

# Describe the field dimensions
dims = (10000, 10000)
Ncells = dims * dims

# Build a random number generator
rng = np.random.default_rng(2357)

# Randomly determine the total number of mines
Nmines = rng.binomial(n=Ncells, p=0.1, size=1)

print(Nmines)
# 9995505

2. Randomly determine each mine's position.

Your first instinct might be to randomly choose row and column indices in the range [0, 10,000), but

• if you sample with replacement you risk giving two mines the same position.
• if you sample without replacement, you'll unfairly determine mine positions. (No two mines will ever lie in the same row or column.)

To get around this issue, imagine assigning a unique sequential id to each cell in the grid in row major order. For example, a 4x4 grid would get ids like this.

[[ 0  1  2  3]
[ 4  5  6  7]
[ 8  9 10 11]
[12 13 14 15]]


Then we can randomly sample from all ids without replacement to determine each mine's location.

position_ids = rng.choice(a=Ncells, size=Nmines, replace=False)

print(position_ids)
# [73040916 95752365 94186347 ... 38138282 37314072 34738633]


We can use basic math to determine each position_id's corresponding row and column index.

row_idxs = np.floor(position_ids/dims).astype('int64')
col_idxs = position_ids % dims


..but NumPy's unravel_index() function does this for us.

positions = np.unravel_index(position_ids, shape=dims) # (1)!

print(positions)
# (array([7304, 9575, 9418, ..., 3813, 3731, 3473]), array([ 916, 2365, 6347, ..., 8282, 4072, 8633]))

1. unravel_index() returns a tuple of (row indices, col indices).

print(positions) # array of row indices
# [7304 9575 9418 ... 3813 3731 3473]

print(positions) # array of column indices
# [ 916 2365 6347 ... 8282 4072 8633]

2. Next, we build the compressed sparse row matrix.

scipy's csr_matrix has numerous constructors. The one we want looks like

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])

where

• data is an array containing the non-zero values we want to include in the matrix
• (row_ind, col_ind) is a tuple of (row indices, col indices)
• shape is the shape we desire

Alas,

minefield = sparse.csr_matrix((np.ones(shape=Nmines), positions), shape=dims)

minefield
# <10000x10000 sparse matrix of type '<class 'numpy.float64'>'
#   with 9995505 stored elements in Compressed Sparse Row format>

print(minefield)
# (0, 2)    1.0
# (0, 16)   1.0
# (0, 22)   1.0
# : :
# (9999, 9940)  1.0
# (9999, 9945)  1.0
# (9999, 9948)  1.0