NumPy data analysis exercises
1. Create a one-dimensional array of the numbers 0 to 9
import numpy as np
arr = np.arange(10)
arr
# > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
2. Create a 3x3 NumPy array in which every element is True
np.full((3, 3), True, dtype=bool)
# > array([[ True, True, True],
# > [ True, True, True],
# > [ True, True, True]], dtype=bool)
# Alternate method:
np.ones((3,3), dtype=bool)
3. Convert a 1D array into a 2D array with 2 rows
arr = np.arange(10)
arr.reshape(2, -1) # Setting to -1 automatically decides the number of cols
# > array([[0, 1, 2, 3, 4],
# > [5, 6, 7, 8, 9]])
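The role of -1 in reshape can be illustrated with a minimal sketch. The variable names two_rows, five_cols and reshape_failed are illustrative only and not part of the exercise.

```python
import numpy as np

arr = np.arange(10)

# -1 may stand in for one dimension; NumPy infers it from the total size.
two_rows = arr.reshape(2, -1)    # 10 elements / 2 rows -> 5 columns
five_cols = arr.reshape(-1, 5)   # 10 elements / 5 columns -> 2 rows
print(two_rows.shape, five_cols.shape)

# When the size is not divisible by the fixed dimension, reshape raises ValueError.
try:
    arr.reshape(3, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```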
4. Stack arrays a and b vertically
a = np.arange(10).reshape(2,-1)
b = np.repeat(1, 10).reshape(2,-1)
# Answers
# Method 1:
np.concatenate([a, b], axis=0)
# Method 2:
np.vstack([a, b])
# Method 3:
np.r_[a, b]
# > array([[0, 1, 2, 3, 4],
# > [5, 6, 7, 8, 9],
# > [1, 1, 1, 1, 1],
# > [1, 1, 1, 1, 1]])
5. Stack arrays a and b horizontally
a = np.arange(10).reshape(2,-1)
b = np.repeat(1, 10).reshape(2,-1)
# Answers
# Method 1:
np.concatenate([a, b], axis=1)
# Method 2:
np.hstack([a, b])
# Method 3:
np.c_[a, b]
# > array([[0, 1, 2, 3, 4, 1, 1, 1, 1, 1],
# > [5, 6, 7, 8, 9, 1, 1, 1, 1, 1]])
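As a quick check, the sketch below confirms that the three horizontal stacking methods agree, and shows what happens when the row counts differ. The array c and the flag names are assumptions added for illustration.

```python
import numpy as np

a = np.arange(10).reshape(2, -1)      # shape (2, 5)
b = np.repeat(1, 10).reshape(2, -1)   # shape (2, 5)

# All three methods join the arrays along axis 1 (columns).
results = [np.concatenate([a, b], axis=1), np.hstack([a, b]), np.c_[a, b]]
all_agree = all(np.array_equal(results[0], r) for r in results[1:])
print(all_agree, results[0].shape)

# Horizontal stacking needs the same number of rows in every input.
c = np.ones((3, 5), dtype=int)        # hypothetical array with 3 rows
try:
    np.hstack([a, c])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```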
6. Create the following pattern without hardcoding. Use only NumPy functions and the input array a below.
a = np.array([1, 2, 3])
np.r_[np.repeat(a, 3), np.tile(a, 3)]
# > array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
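The difference between the two building blocks may be easier to see separately. This sketch assumes a = np.array([1, 2, 3]), which is what the expected output implies.

```python
import numpy as np

a = np.array([1, 2, 3])  # assumed input, inferred from the expected output

# repeat duplicates each element in place; tile duplicates the whole array.
repeated = np.repeat(a, 3)   # [1 1 1 2 2 2 3 3 3]
tiled = np.tile(a, 3)        # [1 2 3 1 2 3 1 2 3]

# np.concatenate gives the same result as np.r_ for 1D inputs.
pattern = np.concatenate([repeated, tiled])
print(pattern)
```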
7. Get the items common to arrays a and b
a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])
np.intersect1d(a,b)
# > array([2, 4])
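If the positions of the common items are also needed, np.intersect1d accepts return_indices=True (available in NumPy 1.15 and later). A minimal sketch; the names common, a_idx and b_idx are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 2, 3, 4, 3, 4, 5, 6])
b = np.array([7, 2, 10, 2, 7, 4, 9, 4, 9, 8])

# The result is sorted and de-duplicated; the index arrays point at the
# first occurrence of each common value in a and in b.
common, a_idx, b_idx = np.intersect1d(a, b, return_indices=True)
print(common, a_idx, b_idx)
```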
8. Remove from array a all items that are present in array b
a = np.array([1,2,3,4,5])
b = np.array([5,6,7,8,9])
# From 'a' remove all of 'b'
np.setdiff1d(a,b)
# > array([1, 2, 3, 4])
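Note that np.setdiff1d returns sorted, unique values. When the original order and duplicates should be kept, a boolean mask built with np.isin is one alternative. The sample arrays below are hypothetical and chosen to make the difference visible.

```python
import numpy as np

a = np.array([5, 3, 3, 1, 9])  # hypothetical input with duplicates, unsorted
b = np.array([5, 9])

as_set = np.setdiff1d(a, b)    # sorted and de-duplicated
as_mask = a[~np.isin(a, b)]    # keeps order and duplicates
print(as_set, as_mask)
```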
9. Get the positions where the elements of a and b match
a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])
np.where(a == b)
# > (array([1, 3, 5, 7]),)
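np.where with a single argument returns a tuple with one index array per dimension, which is why the output above is wrapped in parentheses. A short sketch of how to unpack it; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 2, 3, 4, 3, 4, 5, 6])
b = np.array([7, 2, 10, 2, 7, 4, 9, 4, 9, 8])

positions = np.where(a == b)[0]        # take the index array for axis 0
same_positions = np.flatnonzero(a == b)  # equivalent shortcut for 1D input
matched_values = a[positions]          # the values at the matching positions
print(positions, matched_values)
```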
10. Get all items of array a that are between 5 and 10 (inclusive)
a = np.arange(15)
# Method 1
index = np.where((a >= 5) & (a <= 10))
a[index]
# Method 2:
index = np.where(np.logical_and(a>=5, a<=10))
a[index]
# > array([ 5,  6,  7,  8,  9, 10])
# Method 3: (thanks loganzk!)
a[(a >= 5) & (a <= 10)]
11. Convert the function maxx, which works on two scalars, so that it works on two arrays
def maxx(x, y):
    """Get the maximum of two items"""
    if x >= y:
        return x
    else:
        return y
pair_max = np.vectorize(maxx, otypes=[float])
a = np.array([5, 7, 9, 8, 6, 4, 5])
b = np.array([6, 3, 4, 8, 9, 7, 1])
pair_max(a, b)
# > array([ 6., 7., 9., 8., 9., 7., 5.])
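np.vectorize is a convenience wrapper that still calls the Python function once per element. For this particular function, the built-in np.maximum gives the same values; the sketch below compares the two. The names via_vectorize and via_maximum are illustrative.

```python
import numpy as np

def maxx(x, y):
    """Get the maximum of two items"""
    return x if x >= y else y

pair_max = np.vectorize(maxx, otypes=[float])

a = np.array([5, 7, 9, 8, 6, 4, 5])
b = np.array([6, 3, 4, 8, 9, 7, 1])

via_vectorize = pair_max(a, b)   # float output, because of otypes=[float]
via_maximum = np.maximum(a, b)   # element-wise maximum, keeps the integer dtype
print(via_vectorize, via_maximum)
```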
12. Swap the first and second columns of the array arr
# Input
arr = np.arange(9).reshape(3,3)
arr
# Solution
arr[:, [1,0,2]]
# > array([[1, 0, 2],
# > [4, 3, 5],
# > [7, 6, 8]])
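Indexing with a list of column positions returns a new array and leaves arr unchanged. If the swap should happen in place, assignment is one option, sketched here under that assumption.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# Fancy indexing produces a copy, not a view.
swapped = arr[:, [1, 0, 2]]
shares = np.shares_memory(arr, swapped)
print(shares)

# In-place variant: the right-hand side is copied before it is assigned.
arr[:, [0, 1]] = arr[:, [1, 0]]
print(arr)
```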
13. Swap the first and second rows of the array arr
# Input
arr = np.arange(9).reshape(3,3)
# Solution
arr[[1,0,2], :]
# > array([[3, 4, 5],
# > [0, 1, 2],
# > [6, 7, 8]])
14. Reverse the columns of the 2D array arr
# Input
arr = np.arange(9).reshape(3,3)
# Solution
arr[:, ::-1]
# > array([[2, 1, 0],
# > [5, 4, 3],
# > [8, 7, 6]])
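The same slicing idea extends to rows and to both axes at once. A brief sketch for comparison; np.fliplr is shown only as an equivalent spelling of the column reversal.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

cols_reversed = arr[:, ::-1]     # reverse the column order
rows_reversed = arr[::-1]        # reverse the row order
both_reversed = arr[::-1, ::-1]  # reverse rows and columns

same_as_fliplr = np.array_equal(cols_reversed, np.fliplr(arr))
print(rows_reversed)
print(both_reversed)
```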
15. Create a 2D array of shape 5x3 containing random floats between 5 and 10
# Solution Method 1:
rand_arr = np.random.randint(low=5, high=10, size=(5,3)) + np.random.random((5,3))
# print(rand_arr)
# Solution Method 2:
rand_arr = np.random.uniform(5,10, size=(5,3))
print(rand_arr)
# > [[ 8.50061025 9.10531502 6.85867783]
# > [ 9.76262069 9.87717411 7.13466701]
# > [ 7.48966403 8.33409158 6.16808631]
# > [ 7.75010551 9.94535696 5.27373226]
# > [ 8.0850361 5.56165518 7.31244004]]
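Newer NumPy versions (1.17 and later) also offer the Generator interface, which makes reproducible draws straightforward. This is a sketch of an alternative, not the method used in the exercise; the seed value is arbitrary.

```python
import numpy as np

# default_rng returns a Generator; the seed (0 here) is an arbitrary choice.
rng = np.random.default_rng(0)

# uniform draws from the half-open interval [5, 10).
rand_arr = rng.uniform(5, 10, size=(5, 3))
print(rand_arr.shape, rand_arr.min() >= 5, rand_arr.max() < 10)
```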
16. Print or display only 3 decimal places of the NumPy array rand_arr
# Input: create the random array
rand_arr = np.random.random((5, 3))
# Limit to 3 decimal places
np.set_printoptions(precision=3)
rand_arr[:4]
# > array([[ 0.443, 0.109, 0.97 ],
# > [ 0.388, 0.447, 0.191],
# > [ 0.891, 0.474, 0.212],
# > [ 0.609, 0.518, 0.403]])
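np.set_printoptions changes the display globally and does not alter the stored values. Where a temporary setting is preferable, the np.printoptions context manager (NumPy 1.15 and later) is one option. A minimal sketch with illustrative variable names:

```python
import numpy as np

rand_arr = np.random.random((5, 3))
precision_before = np.get_printoptions()["precision"]

# The precision applies only inside the with block.
with np.printoptions(precision=3):
    text = str(rand_arr)
    print(text)

precision_after = np.get_printoptions()["precision"]

# np.round, in contrast, changes the values themselves.
rounded = np.round(rand_arr, 3)
```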
17. Pretty-print rand_arr by suppressing scientific notation (such as 1e10)
# Reset printoptions to default
np.set_printoptions(suppress=False)
# Create the random array
np.random.seed(100)
rand_arr = np.random.random([3,3])/1e3
rand_arr
# > array([[ 5.434049e-04, 2.783694e-04, 4.245176e-04],
# > [ 8.447761e-04, 4.718856e-06, 1.215691e-04],
# > [ 6.707491e-04, 8.258528e-04, 1.367066e-04]])
np.set_printoptions(suppress=True, precision=6) # precision is optional
rand_arr
# > array([[ 0.000543, 0.000278, 0.000425],
# > [ 0.000845, 0.000005, 0.000122],
# > [ 0.000671, 0.000826, 0.000137]])
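To get fixed-point text for a single array without touching the global print options, np.array2string takes suppress_small and precision arguments. A sketch under that assumption; the sample values are taken from the output above.

```python
import numpy as np

values = np.array([5.434049e-04, 4.718856e-06])  # sample values from the output above

default_text = np.array2string(values)  # small magnitudes trigger scientific notation
fixed_text = np.array2string(values, suppress_small=True, precision=6)
print(default_text)
print(fixed_text)
```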