Python:Data Analytics and Visualization
上QQ阅读APP看书,第一时间看更新

NumPy random numbers

An important part of any simulation is the ability to generate random numbers. For this purpose, NumPy provides various routines in the submodule random. It uses a particular algorithm, called the Mersenne Twister, to generate pseudorandom numbers.

First, we need to define a seed that makes the random numbers predictable. When the value is reset, the same numbers will appear every time. If we do not assign the seed, NumPy automatically selects a random seed value based on the system's random number generator device or on the clock:

>>> np.random.seed(20)

An array of random numbers in the [0.0, 1.0] interval can be generated as follows:

>>> np.random.rand(5)
array([0.5881308, 0.89771373, 0.89153073, 0.81583748, 
 0.03588959])
>>> np.random.rand(5)
array([0.69175758, 0.37868094, 0.51851095, 0.65795147, 
 0.19385022])

>>> np.random.seed(20) # reset seed number
>>> np.random.rand(5)
array([0.5881308, 0.89771373, 0.89153073, 0.81583748, 
 0.03588959])

If we want to generate random integers in the half-open interval [min, max], we can user the randint(min, max, length) function:

>>> np.random.randint(10, 20, 5)
array([17, 12, 10, 16, 18])

NumPy also provides for many other distributions, including the Beta, bionomial, chi-square, Dirichlet, exponential, F, Gamma, geometric, or Gumbel.

The following table will list some distribution functions and give examples for generating random numbers:

We can also use the random number generation to shuffle items in a list. Sometimes this is useful when we want to sort a list in a random order:

>>> a = np.arange(10)
>>> np.random.shuffle(a)
>>> a
array([7, 6, 3, 1, 4, 2, 5, 0, 9, 8])

The following figure shows two distributions, binomial and poisson , side by side with various parameters (the visualization was created with matplotlib, which will be covered in Chapter 4, Data Visualization):