# Intermediate

## Broadcasting

When we add a scalar to a 1-d array like this, the scalar gets added to each element of the array.

```
import numpy as np

np.array([1, 2, 3]) + 0.5
# [1.5, 2.5, 3.5]
```

Conceptually, NumPy expands the scalar into a 3-element array and then adds the two arrays element-wise. (NumPy doesn't *actually* build the expanded array because that would be horribly inefficient, but the result is the same as if it had.)
This is an example of *broadcasting*.
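As a rough illustration of that mental model, the sketch below uses `np.broadcast_to` to build the "expanded" scalar explicitly. The variable names are chosen here for illustration only. The zero stride shows that the expanded array is a view that reuses a single value rather than three copies.

```python
import numpy as np

a = np.array([1, 2, 3])

# Explicitly "stretch" the scalar to the shape of `a`.
# broadcast_to returns a read-only view; no data is copied.
expanded = np.broadcast_to(0.5, a.shape)

print(expanded.shape)    # (3,)
print(expanded.strides)  # (0,)  every element points at the same value

# Adding the expanded view gives the same result as adding the scalar
print(np.array_equal(a + 0.5, a + expanded))  # True
```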

### Compatibility

Not every pair of arrays is compatible for broadcasting. Suppose we want to add two arrays, `A` and `B`.

- Moving backwards from the last dimension of each array, we check whether the paired dimensions are "compatible".
- Two dimensions are compatible if they are equal or if either of them is 1.
- If every pair of dimensions is compatible, the arrays are compatible. If one array has fewer dimensions than the other, its missing leading dimensions are treated as length 1.
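The rule above can be sketched as a small function. `shapes_are_compatible` is a hypothetical helper written for this illustration, not part of NumPy. The last line uses NumPy's own `np.broadcast_shapes` (available in NumPy 1.20 and later), which returns the resulting shape or raises `ValueError`.

```python
import numpy as np

def shapes_are_compatible(shape_a, shape_b):
    """Return True if two shapes can be broadcast together (illustrative helper)."""
    # Walk backwards from the last dimension. zip() stops at the shorter
    # shape, so leading dimensions present in only one array are skipped.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(shapes_are_compatible((3, 4), (3, 1)))     # True
print(shapes_are_compatible((4, 4), (2, 1)))     # False
print(shapes_are_compatible((3, 1, 4), (2, 1)))  # True

# NumPy's built-in check returns the broadcast shape
print(np.broadcast_shapes((3, 1, 4), (2, 1)))    # (3, 2, 4)
```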

### Examples

#### Example 1

```
np.random.seed(1234)
A = np.random.randint(low = 1, high = 10, size = (3, 4))
B = np.random.randint(low = 1, high = 10, size = (3, 1))
print(A)
# [[4 7 6 5]
# [9 2 8 7]
# [9 1 6 1]]
print(B)
# [[7]
# [3]
# [1]]
```

`A + B` = ???

*Compatibility*

```
A.shape # (3, 4)
B.shape # (3, 1)
#          ^  ^
#       compatible
```

Here, `A` is a 3x4 array and `B` is a 3x1 array. We start by comparing the last dimension of each array.

- Since the last dimension of `A` is length 4 and the last dimension of `B` is length 1, NumPy can expand `B` by making 4 copies of it along its second axis. So, these dimensions are compatible.
- Now we have to compare the first dimension of `A` and `B`. Since they're both length 3, they're compatible.

The only thing left for NumPy is to carry out whatever operation we wanted on two equivalently sized 3x4 arrays. (Remember, NumPy doesn't actually expand `B` like this because it'd be horribly inefficient.)
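To answer the `A + B` question concretely, here is a small sketch. The arrays are typed in by hand from the printed values in this example instead of relying on the random seed. `B_expanded` is an illustrative name for the explicitly copied version of `B`.

```python
import numpy as np

# Values copied from the printed output of Example 1
A = np.array([[4, 7, 6, 5],
              [9, 2, 8, 7],
              [9, 1, 6, 1]])
B = np.array([[7], [3], [1]])

print(A + B)
# [[11 14 13 12]
#  [12  5 11 10]
#  [10  2  7  2]]

# The mental model: copy B four times along its second axis, then add
B_expanded = np.repeat(B, repeats=4, axis=1)
print(B_expanded.shape)                        # (3, 4)
print(np.array_equal(A + B, A + B_expanded))   # True
```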

#### Example 2

```
np.random.seed(4321)
A = np.random.randint(low = 1, high = 10, size = (4, 4))
B = np.random.randint(low = 1, high = 10, size = (2, 1))
print(A)
# [[3 9 3 2]
# [8 6 3 5]
# [7 1 9 7]
# [6 4 2 2]]
print(B)
# [[7]
# [2]]
```

`A + B` = ???

*Compatibility*

```
A.shape # (4, 4)
B.shape # (2, 1)
#          ^  ^
#     not compatible
```

Here, `A` is a 4x4 array and `B` is a 2x1 array.

- The last dimension of `A` is length 4 and the last dimension of `B` is length 1, so these dimensions are compatible. We can temporarily transform `B` by making 4 copies of it along its 2nd axis.
- Now we compare the 1st dimension of each array. `A` is length 4 and `B` is length 2, so the lengths are not equal and neither of them is 1. There isn't an obvious way to expand `B` into a 4x4 array to match `A` or vice versa, so **these arrays are not compatible**.
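In practice, NumPy signals this incompatibility by raising a `ValueError`. The sketch below uses zero-filled arrays with the same shapes as in this example, since only the shapes matter here. The `broadcast_failed` flag is an illustrative name.

```python
import numpy as np

# Only the shapes matter for broadcasting, so zeros are enough
A = np.zeros((4, 4))
B = np.zeros((2, 1))

broadcast_failed = False
try:
    A + B
except ValueError as err:
    broadcast_failed = True
    print("Cannot broadcast:", err)

print(broadcast_failed)  # True
```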

#### Example 3

```
np.random.seed(1111)
A = np.random.randint(low = 1, high = 10, size = (3, 1, 4))
B = np.random.randint(low = 1, high = 10, size = (2, 1))
print(A)
# [[[8 6 2 3]]
# [[5 9 7 5]]
# [[9 7 3 7]]]
print(B)
# [[9]
# [4]]
```

`A + B` = ???

*Compatibility*

```
A.shape # (3, 1, 4)
B.shape # (   2, 1)
#          ^  ^  ^
#          compatible
```

Here, `A` is a 3x1x4 array and `B` is a 2x1 array.

- We start by comparing the last dimension of each array. In this case, `A` is length 4 and `B` is length 1, so we can expand `B` into a 2x4 array, making these dimensions compatible.
- Next, we compare the 2nd to last dimension of each array. In this case, `A` is length 1 and `B` is length 2. This time, we expand `A`, copying it twice along its second axis to match `B`.
- At this point, we're out of `B` dimensions, so we know `A` and `B` are compatible. To complete our mental model of how math between these arrays would work, we can imagine copying `B` 3 times along a newly added first dimension.
- We're left with two transformed arrays, each with shape 3x2x4, which we can easily add.
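The walk-through above can be checked with a short sketch. The arrays are typed in from the printed values in this example, and `np.broadcast_arrays` is used to expose the two "transformed" arrays as views. The names `A_view`, `B_view`, and `C` are illustrative.

```python
import numpy as np

# Values copied from the printed output of Example 3
A = np.array([[[8, 6, 2, 3]],
              [[5, 9, 7, 5]],
              [[9, 7, 3, 7]]])   # shape (3, 1, 4)
B = np.array([[9], [4]])         # shape (2, 1)

# Views of A and B stretched to the common shape (no data is copied)
A_view, B_view = np.broadcast_arrays(A, B)
print(A_view.shape, B_view.shape)  # (3, 2, 4) (3, 2, 4)

C = A + B
print(C.shape)  # (3, 2, 4)
print(C[0])
# [[17 15 11 12]
#  [12 10  6  7]]
```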

## newaxis

`np.newaxis` allows us to promote the dimensionality of an array by giving it a new axis.

For example, suppose you have 1-d arrays `A` and `B`,

```
A = np.array([3, 11, 4, 5])
B = np.array([5, 0, 3])
```

and your goal is to build a difference matrix where element \((i,j)\) gives \(A_i - B_j\). In other words, your goal is to subtract each element of `B` from each element of `A`.

If you do `A - B`, you'll get an error because the arrays don't have compatible shapes, and even if they were the same size, NumPy would just do *element-wise* subtraction.

However, if `A` were a 4x1 array and `B` were a 1x3 array, NumPy would broadcast the arrays so that `A - B` produces the difference matrix we desire.

We can convert `A` from a `(4,)` array into a `(4,1)` array via

```
A[:, np.newaxis]
# array([[ 3],
# [11],
# [ 4],
# [ 5]])
```

Similarly, we can convert `B` from a `(3,)` array into a `(1,3)` array via

```
B[np.newaxis, :]
# array([[5, 0, 3]])
```

Then we can calculate the difference matrix as

```
A[:, np.newaxis] - B[np.newaxis, :]
# array([[-2, 3, 0],
# [ 6, 11, 8],
# [-1, 4, 1],
# [ 0, 5, 2]])
```

We can further simplify this to `A[:, np.newaxis] - B` since broadcasting rules will make `B` compatible with `A[:, np.newaxis]`.

## Note

`newaxis` is just an alias for `None`, so `A[:, np.newaxis] - B` is equivalent to `A[:, None] - B`.
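As a cross-check, the sketch below compares the `newaxis` approach with `np.subtract.outer`, a ufunc method that applies subtraction to every pair of elements. Using it here is an addition for illustration and not something the text above relies on.

```python
import numpy as np

A = np.array([3, 11, 4, 5])
B = np.array([5, 0, 3])

# (4, 1) minus (3,) broadcasts to a (4, 3) difference matrix
diff = A[:, None] - B
print(diff.shape)  # (4, 3)

# np.subtract.outer computes A[i] - B[j] for every pair (i, j)
outer = np.subtract.outer(A, B)
print(np.array_equal(diff, outer))  # True

# np.newaxis really is None
print(np.newaxis is None)  # True
```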

## reshape()

You can use the `reshape()` function to change the shape of an array.

For example,

```
foo = np.arange(start=1, stop=9)
# [1 2 3 4 5 6 7 8]
```

We can reshape `foo` into a 2x4 array using either `np.reshape()`

```
np.reshape(foo, (2, 4))  # shape passed positionally; the newshape keyword is deprecated in recent NumPy releases
# array([[1, 2, 3, 4],
# [5, 6, 7, 8]])
```

or the `reshape()` method of the array object.

```
foo.reshape(2,4)
# array([[1, 2, 3, 4],
# [5, 6, 7, 8]])
```

These methods implement the same logic, just with slightly different interfaces.

## Info

With `foo.reshape()`, we can pass in the new dimensions individually instead of as a tuple, but this comes at the expense of not being able to specify the `newshape` keyword.
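A quick sketch comparing the interfaces. The shape is passed positionally to `np.reshape()` so the snippet does not depend on a keyword name. The last part shows, as an illustration, that the new shape must hold the same number of elements as the original. The variable names are chosen for this example.

```python
import numpy as np

foo = np.arange(start=1, stop=9)  # 8 elements

via_function = np.reshape(foo, (2, 4))   # function, shape as a tuple
via_method_tuple = foo.reshape((2, 4))   # method, shape as a tuple
via_method_args = foo.reshape(2, 4)      # method, dimensions passed individually

print(np.array_equal(via_function, via_method_tuple))  # True
print(np.array_equal(via_function, via_method_args))   # True

# 8 elements cannot fill a 3x3 array, so this raises ValueError
reshape_failed = False
try:
    foo.reshape(3, 3)
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```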

### Array Transpose

You can also transpose an array using `np.transpose()` or the `.T` attribute of an array object.

```
bar = np.array([[1,2,3,4], [5,6,7,8]])
print(bar)
# [[1 2 3 4]
# [5 6 7 8]]
print(bar.T)
# [[1 5]
# [2 6]
# [3 7]
# [4 8]]
```

## Boolean Indexing

With boolean indexing, you can subset an array `A` using another array `B` of boolean values.

### Examples

#### Example 1

Suppose we have a 3x3 array, `foo`

```
foo = np.array([
[3, 9, 7],
[2, 0, 3],
[3, 3, 1]
])
```

and we set `mask = foo == 3`

```
mask = foo == 3
print(mask)
# [[ True False False]
# [False False True]
# [ True True False]]
```

We can use `mask` to select the elements of `foo` that are equal to 3.

```
print(foo[mask])
# [3 3 3 3]
```

Furthermore, we can use `mask` to convert 3s in `foo` to 0s.

```
foo[mask] = 0
print(foo)
# [[0 9 7]
# [2 0 0]
# [0 0 1]]
```

#### Example 2

Just like integer arrays, we can use 1-d boolean arrays to pick out specific rows or columns of a 2-d array.

Consider this 3x3 array, `foo`

```
foo = np.array([
[3, 9, 7],
[2, 0, 3],
[3, 3, 1]
])
```

and these 1-d, length 3 boolean arrays `r13` and `c23`.

```
r13 = np.array([True, False, True])
c23 = np.array([False, True, True])
```

We can use `r13` to select rows 1 and 3 from `foo`.

```
print(foo[r13])
# [[3 9 7]
# [3 3 1]]
```

We can use `c23` to select columns 2 and 3 from `foo`.

```
print(foo[:, c23])
# [[9 7]
#  [0 3]
#  [3 1]]
```

Observe what happens when we index `foo` with both `r13` and `c23`.

```
print(foo[r13, c23])
# [9 1]
```

NumPy treats boolean indices like integer indices, where the integers used are the indices of the True elements. In other words, NumPy treats the boolean index array `[True, False, True]` just like the integer index array `[0, 2]`, and it treats the boolean index array `[False, True, True]` just like the integer index array `[1, 2]`.

So, `foo[r13, c23]` is equivalent to `foo[[0, 2], [1, 2]]`. Recall that when you combine row and column index arrays in this way, NumPy uses corresponding indices from each index array to select elements from the target array. In this case, it selects elements `(0,1)` and `(2,2)`.
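The boolean-to-integer correspondence can be made visible with `np.flatnonzero`, which returns the positions of the True entries. If the goal is instead the full block of rows 1 and 3 crossed with columns 2 and 3, `np.ix_` is one option. Its use here is an addition for illustration, and `block` is an illustrative name.

```python
import numpy as np

foo = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])
r13 = np.array([True, False, True])
c23 = np.array([False, True, True])

# Positions of the True entries: the integer indices NumPy effectively uses
print(np.flatnonzero(r13))  # [0 2]
print(np.flatnonzero(c23))  # [1 2]

# Pairwise selection: elements (0,1) and (2,2)
print(foo[r13, c23])          # [9 1]
print(foo[[0, 2], [1, 2]])    # [9 1]

# Cross product of the selected rows and columns
block = foo[np.ix_(r13, c23)]
print(block)
# [[9 7]
#  [3 1]]
```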

### Logical Operators

Logical operators let us combine boolean arrays. They include the "bitwise-and" operator `&`, the "bitwise-or" operator `|`, and the "bitwise-xor" operator `^`.

```
b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])
b1 & b2 # [False, False, False, True], and
b1 | b2 # [False, True, True, True], or
b1 ^ b2 # [False, True, True, False], xor
```
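These operators are most often used to combine comparisons into a single mask. One detail worth illustrating: in Python, `&` binds more tightly than `>` or `<`, so each comparison needs its own parentheses. The array `values` and the thresholds below are made up for this sketch.

```python
import numpy as np

values = np.array([1, 5, 8, 12])

# Parentheses are required around each comparison
in_range = (values > 2) & (values < 10)
print(in_range)          # [False  True  True False]
print(values[in_range])  # [5 8]

# np.logical_and is the function form of the same operation
same = np.logical_and(values > 2, values < 10)
print(np.array_equal(in_range, same))  # True
```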

#### Boolean Negation

We can negate a boolean array by preceding it with a tilde `~`.

```
~np.array([False, True])
# array([ True, False])
```

## NaN

You can use `NaN` to represent missing or invalid values. **`NaN` is a floating point constant** that NumPy reserves and treats specially.

For example, consider this array called `bot` which contains two missing values.

```
bot = np.ones(shape = (3, 4))
bot[[0, 2], [1, 2]] = np.nan
print(bot)
# [[ 1. nan 1. 1.]
# [ 1. 1. 1. 1.]
# [ 1. 1. nan 1.]]
```

If you want to identify which elements of `bot` are `NaN`, you might be inclined to try `bot == np.nan`, but the result may surprise you.

```
print(bot == np.nan)
# [[False False False False]
# [False False False False]
# [False False False False]]
```

`NaN` follows the IEEE 754 floating point standard, under which `nan == nan` returns False, but `nan != nan` returns True.

```
np.nan == np.nan # False
np.nan != np.nan # True
```

This is because equivalence between missing or invalid values is not well-defined.

In order to see which elements of an array are `NaN`, you can use NumPy's `isnan()` function.

```
np.isnan(bot)
# array([[False, True, False, False],
# [False, False, False, False],
# [False, False, True, False]])
```
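The boolean array returned by `isnan()` can be used like any other mask. The sketch below counts the missing values and sums the remaining ones. `np.nansum`, which skips `NaN` values, is shown as an alternative and is an addition to what the text covers.

```python
import numpy as np

bot = np.ones(shape=(3, 4))
bot[[0, 2], [1, 2]] = np.nan

nan_mask = np.isnan(bot)

# True counts as 1, so summing the mask counts the NaN values
print(nan_mask.sum())         # 2

# A plain sum is "poisoned" by NaN
print(np.isnan(bot.sum()))    # True

# Sum only the valid elements, either with a negated mask or with nansum
print(bot[~nan_mask].sum())   # 10.0
print(np.nansum(bot))         # 10.0
```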

## Caution

`NaN` is a special floating point constant, so it can only exist inside an array of floats. If you try inserting `NaN` into an array of integers, booleans, or strings, you'll get an error or unexpected behavior.
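A minimal sketch of the integer case: assigning `NaN` into an integer array raises a `ValueError`, while converting the array to floats first works. The names `ints`, `floats`, and `assignment_failed` are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])

assignment_failed = False
try:
    ints[0] = np.nan  # NaN has no integer representation
except ValueError as err:
    assignment_failed = True
    print("Assignment failed:", err)

# Convert to a float array first, then insert NaN
floats = ints.astype(float)
floats[0] = np.nan
print(np.isnan(floats))  # [ True False False]
```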

## Infinity

Like `NaN`, NumPy reserves floating point constants for infinity and negative infinity that behave specially.

If you want to insert these values directly, you can use `np.inf` and `-np.inf`. (The older alias `np.NINF` was removed in NumPy 2.0.)

```
np.array([np.inf, -np.inf])
# array([ inf, -inf])
```

More commonly, these values occur when you divide by 0.

```
np.array([-1, 1])/0
# array([-inf, inf])
```
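Dividing by zero also triggers a `RuntimeWarning`. The sketch below silences it locally with `np.errstate` and then inspects the result with `np.isinf`, `np.isposinf`, and `np.isfinite`. These helpers are additions for illustration.

```python
import numpy as np

# Suppress the divide-by-zero RuntimeWarning inside this block only
with np.errstate(divide="ignore"):
    result = np.array([-1.0, 1.0]) / 0

print(np.isinf(result))     # [ True  True]
print(np.isposinf(result))  # [False  True]
print(np.isfinite(result))  # [False False]

# Infinity compares greater than the largest finite float
print(np.inf > np.finfo(float).max)  # True
```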

## random

You can use NumPy's random module to shuffle arrays, sample values from arrays, and draw values from a host of probability distributions.

### Generators

Since NumPy version 1.17.0, it is recommended to use a *Generator* to produce random values rather than use the random module directly.

In most cases, the default random number generator is sufficient.

*Initialize default_rng without a seed*

```
rng = np.random.default_rng()
```

*Initialize default_rng with a seed*

```
rng = np.random.default_rng(12345)
```
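Seeding matters when results need to be reproducible. As a small sketch, two generators created with the same seed produce the same sequence of draws. The names `rng_a` and `rng_b` and the chosen range are illustrative.

```python
import numpy as np

# Two independent generators initialized with the same seed
rng_a = np.random.default_rng(12345)
rng_b = np.random.default_rng(12345)

draws_a = rng_a.integers(low=0, high=100, size=5)
draws_b = rng_b.integers(low=0, high=100, size=5)

# Same seed, same sequence
print(np.array_equal(draws_a, draws_b))  # True
```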

### Examples

#### Sample integers in range with replacement

Draw three integers from the range 1 to 6, *with* replacement.

```
generator = np.random.default_rng(seed=123)
generator.integers(low=1, high=7, size=3)
# array([1, 5, 4])
```

#### Sample integers in range without replacement

Draw three integers from the range 0 to 9, *without* replacement.

```
generator = np.random.default_rng(seed=123)
generator.choice(a=10, size=3, replace=False)
# array([5, 6, 0])
```

#### Randomly permute the rows of a 2-d array

Randomly shuffle the rows of this 5x2 array, `foo`

```
foo = np.array([
[1, 2],
[3, 4],
[5, 6],
[7, 8],
[9, 10]
])
generator = np.random.default_rng(seed=123)
generator.permutation(foo, axis=0)
# array([[ 9, 10],
# [ 1, 2],
# [ 5, 6],
# [ 7, 8],
# [ 3, 4]])
```

*See random.Generator.permutation*
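For comparison, `Generator.shuffle()` reorders an array in place, whereas `permutation()` returns a shuffled copy and leaves the input alone. The sketch below illustrates the difference without relying on any particular random ordering. `original` and `shuffled_copy` are illustrative names.

```python
import numpy as np

foo = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
original = foo.copy()
generator = np.random.default_rng(seed=123)

# permutation() returns a new array; foo is untouched
shuffled_copy = generator.permutation(foo, axis=0)
print(np.array_equal(foo, original))  # True

# shuffle() reorders foo itself and returns None
result = generator.shuffle(foo, axis=0)
print(result)  # None

# Either way, rows stay intact; only their order changes
rows_before = sorted(map(tuple, original.tolist()))
rows_after = sorted(map(tuple, foo.tolist()))
print(rows_before == rows_after)  # True
```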

#### Random sample from uniform distribution

Randomly sample four values between 1 and 2, then output as a 2x2 array.

```
generator = np.random.default_rng(seed=123)
generator.uniform(low=1.0, high=2.0, size=(2, 2))
# array([[1.68235186, 1.05382102],
# [1.22035987, 1.18437181]])
```

#### Random sample from normal distribution

Randomly sample two values from a standard normal distribution, then output as a length-2 1-d array.

```
generator = np.random.default_rng(seed=123)
generator.normal(loc=0.0, scale=1.0, size=2)
# array([-0.98912135, -0.36778665])
```

#### Random sample from binomial distribution

Randomly sample six values from a binomial distribution with n=10 and p=0.25, then output as a 3x2 array.

```
generator = np.random.default_rng(seed=123)
generator.binomial(n=10, p=0.25, size=(3, 2))
# array([[3, 0],
# [1, 1],
# [1, 4]])
```