
头歌 (Touge) Python Practice: NumPy Data Statistics

admin Views: 36 2024-03-25


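Level 1: Creating ndarray Arrays of Specific Forms

Task Description

Task for this level: write a program that builds ndarray arrays of specific forms and prints them.

Background

NumPy is a numerical computing library for Python that provides efficient multidimensional array and matrix operations. Its core data structure is the ndarray array.
The NumPy module is brought in with an import statement and is conventionally given the alias np.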
import numpy as np

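Defining ndarray arrays

An ndarray is a multidimensional array that stores elements of a single data type. It is created with numpy.array(object), where object is a sequence of data.
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> type(a)
<class 'numpy.ndarray'>
>>> a
array([1, 2, 3, 4])
Commonly used attributes of an ndarray:
1. ndim: number of dimensions
2. shape: length of the array along each dimension
3. size: total number of elements
4. dtype: data type of the elements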

>>> b = np.array([[1,2,3],[3,4,5]])
>>> b.ndim
2
>>> b.shape
(2, 3)
>>> b.size
6
>>> b.dtype
dtype('int32')
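
The dtype('int32') shown above depends on the platform: the default integer type has historically been 32-bit on Windows and 64-bit on most Linux and macOS builds. A minimal sketch, assuming you want the same element type everywhere, passes dtype explicitly:

```python
import numpy as np

# Request 64-bit integers explicitly instead of relying on the platform default.
b = np.array([[1, 2, 3], [3, 4, 5]], dtype=np.int64)

print(b.ndim)   # number of dimensions
print(b.shape)  # length along each dimension
print(b.size)   # total number of elements
print(b.dtype)  # element type requested above
```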

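Reshaping ndarray arrays

As long as the number of elements stays the same, you can change an array's shape by assigning a new value to its shape attribute.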

>>> b
array([[1, 2, 3],
[3, 4, 5]])
>>> b.size
6
>>> b.shape
(2, 3)
>>> b.shape=(3,2)
>>> b
array([[1, 2],
[3, 3],
[4, 5]])
>>> b.size
6
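
Assigning to shape changes the array in place. As a hedged alternative sketch, the reshape method returns a reshaped array and leaves the original shape alone; the names c, d and error_raised below are only for illustration.

```python
import numpy as np

b = np.array([[1, 2, 3], [3, 4, 5]])

# reshape returns a reshaped array; b.shape itself is not modified.
c = b.reshape(3, 2)

# -1 asks NumPy to infer that dimension from the element count (6 / 2 = 3).
d = b.reshape(-1, 2)

# A target shape whose element count differs from b.size is rejected.
try:
    b.reshape(4, 2)
    error_raised = False
except ValueError:
    error_raised = True

print(b.shape, c.shape, d.shape, error_raised)
```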

Creating special arrays

NumPy provides several functions for quickly creating arrays with special forms, for example:

Arithmetic sequence: arange(start, stop, step); the stop value itself is excluded
>>> np.arange(1,10,2)
array([1, 3, 5, 7, 9])
Geometric sequence: logspace(start, stop, num); start and stop are exponents of base 10, so the values run from 10**start to 10**stop
>>> np.logspace(1,2,5)
array([ 10. , 17.7827941 , 31.6227766 , 56.23413252, 100. ])
All-zeros array: zeros(shape)
>>> np.zeros((2,3))
array([[0., 0., 0.],
       [0., 0., 0.]])
All-ones array: ones(shape)
>>> np.ones((3,2))
array([[1., 1.],
       [1., 1.],
       [1., 1.]])
Identity matrix: eye(n)
>>> np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Diagonal matrix: diag(list)
>>> np.diag([1,2,3])
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
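
The relationship between logspace and its exponent arguments is easy to get wrong, so here is a small sketch that rebuilds np.logspace(1, 2, 5) by hand from evenly spaced exponents. The variable names are illustrative only.

```python
import numpy as np

# logspace(1, 2, 5) spaces the exponents evenly between 1 and 2,
# then raises 10 to each of them.
exponents = np.linspace(1, 2, 5)
by_hand = 10.0 ** exponents
built_in = np.logspace(1, 2, 5)

# In a geometric sequence, the ratio between neighbours is constant.
ratios = built_in[1:] / built_in[:-1]

# arange excludes its stop value, while linspace includes it by default.
print(np.arange(1, 10, 2))
print(np.linspace(1, 9, 5))
```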

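Random numbers

The numpy.random module provides functions for generating random numbers. Commonly used ones:

np.random.random: random floats in [0, 1); the shape is passed as a single tuple
np.random.rand: uniformly distributed random floats in [0, 1); each dimension is passed as a separate argument
np.random.randn: random floats drawn from the standard normal distribution
np.random.randint: random integers in a given range (the upper bound is excluded)
np.random.seed: sets the random seed; the same seed produces the same sequence of random numbers.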

>>> np.random.random((2,3))
array([[0.37504179, 0.87765674, 0.77247682],
[0.66231209, 0.63139262, 0.74569536]])
>>> np.random.rand(2,3)
array([[0.87504014, 0.12031278, 0.53230602],
[0.93854757, 0.17804695, 0.65469302]])
>>> np.random.randn(2,3)
array([[-1.15633193, 0.01430615, -0.78105448],
[ 2.21185689, 0.20768009, -0.65431958]])
>>> np.random.randint(1,10,size=[2,3])
array([[5, 9, 6],
[2, 8, 9]])
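
The outputs above differ on every run because no seed was set. The following sketch shows how reseeding reproduces the same draws. It also shows the newer Generator interface (np.random.default_rng), which this tutorial does not use; it is included only as an assumption about what a reader on a recent NumPy might prefer.

```python
import numpy as np

# Legacy interface: reseeding the global state repeats the same sequence.
np.random.seed(3)
first = np.random.randint(0, 10, size=10)
np.random.seed(3)
second = np.random.randint(0, 10, size=10)
print(np.array_equal(first, second))

# Generator interface: two generators built from the same seed agree as well.
rng_a = np.random.default_rng(3)
rng_b = np.random.default_rng(3)
g1 = rng_a.integers(0, 10, size=10)
g2 = rng_b.integers(0, 10, size=10)
print(np.array_equal(g1, g2))
```

The two interfaces use different algorithms, so first and g1 are not expected to match each other even with the same seed.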

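Programming Requirements

Following the hints, fill in the code between Begin and End in the editor on the right.

Test Description

The platform will test the code you write. Expected output: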

the result of 1.1 is:
[[-2. 0. 0. 0. 0.]
[ 0. -1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 2.]]
the result of 1.2 is:
[[-8 -6 -4]
[-2 0 2]
[ 4 6 8]]
the result of 1.3 is:
[[False False False]
[False False False]
[ True True True]]
the result of 1.4 is:
[8 9 3 8 8 0 5 3 9 9]
the result of 1.5 is:
[ 0.44122749 -0.33087015 2.43077119 -0.25209213 0.10960984 1.58248112 -0.9092324 -0.59163666 0.18760323 -0.32986996 -1.19276461 -0.20487651 -0.35882895 0.6034716 -1.66478853 -0.70017904 1.15139101 1.85733101 -1.51117956 0.64484751]

See the "Expected output" results for details.

Code

"""
Assignment 2: NumPy statistics applications
"""
'''
1. Create arrays of specific forms
'''
import numpy as np
'''
1.1
Build and print the following array:
[[-2. 0. 0. 0. 0.]
[ 0. -1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 2.]]
'''
print("the result of 1.1 is:")
############begin############
np_1 = np.array([[-2., 0., 0., 0., 0.],
[0., -1., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 2.]])
print(np_1)
########### end #############
'''
1.2
Generate and print this special array:
[[-8 -6 -4]
[-2 0 2]
[ 4 6 8]]
'''
print("the result of 1.2 is:")
############begin############
np_2 = np.array([[-8, -6, -4],
[-2, 0, 2],
[4, 6, 8]
])
print(np_2)
########### end #############
'''
1.3
Print this special array:
[[False False False]
[False False False]
[ True True True]]
'''
print("the result of 1.3 is:")
############begin############
np_3 = np.array([[False, False, False],
[False, False, False],
[True, True, True]
])
print(np_3)
########### end #############
'''
1.4
Generate and print a random array of length 10 in the range [0,10] (seed = 3):
[8 9 3 8 8 0 5 3 9 9]
'''
print("the result of 1.4 is:")
############begin############
np.random.seed(3)
np_4 = np.random.randint(0, 10, size=(10))
print(np_4)
########### end #############
'''
1.5
Generate and print a random array of length 20 drawn from the normal distribution (seed = 5)
[ 0.44122749 -0.33087015 2.43077119 -0.25209213 0.10960984 1.58248112
-0.9092324 -0.59163666 0.18760323 -0.32986996 -1.19276461 -0.20487651
-0.35882895 0.6034716 -1.66478853 -0.70017904 1.15139101 1.85733101
-1.51117956 0.64484751]
'''
print("the result of 1.5 is:")
############begin############
np.random.seed(5)
np_5 = np.random.randn(20)
print(np_5)
########### end #############
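
The solution above types out the arrays for 1.1 to 1.3 literally. As a hedged alternative, the same three arrays can be generated with the functions from the Background section. This is only one possible construction, and the comparison used for 1.3 is an assumption about what the exercise intends.

```python
import numpy as np

# 1.1: the floats -2..2 placed on the diagonal of a 5x5 array.
np_1 = np.diag(np.arange(-2.0, 3.0))

# 1.2: the even numbers from -8 to 8 arranged as 3x3.
np_2 = np.arange(-8, 10, 2).reshape(3, 3)

# 1.3: a boolean array produced by comparing every element of np_2 with 2.
np_3 = np_2 > 2

print(np_1)
print(np_2)
print(np_3)
```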

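Level 2: Accessing ndarray Arrays

Task Description

Task for this level: write a program that indexes into an ndarray array and prints the specified results.

Background

Accessing arrays by index

As with a list, array elements can be accessed by index subscripts and by slicing.
A multidimensional array can be indexed in two ways, as shown below.
>>> a = np.array([[1,2],[3,4]])
>>> a[1,0] # method 1
3
>>> a[1][0] # method 2
3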
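
The two indexing forms agree for single elements but not once slices are involved. A short sketch of the difference, with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

element = a[1, 0]     # row 1, column 0
first_col = a[:, 0]   # every row, column 0
second_row = a[1, :]  # row 1, every column

# Chained indexing applies one index at a time: a[:] is the whole array,
# so a[:][0] is the first ROW, not the first column.
chained = a[:][0]

print(element, first_col, second_row, chained)
```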

Changing array shape

Changing dimensions: reshape
>>> a = np.arange(1,9)
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8])
>>> a.reshape([2,4])
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
Flattening: ravel
>>> a = np.arange(1,9).reshape([2,4])
>>> a
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
>>> a.ravel()
array([1, 2, 3, 4, 5, 6, 7, 8])
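
One detail worth knowing about ravel is that it returns a view of the original data when it can, whereas flatten (not covered in this tutorial, mentioned here only for contrast) always copies. A minimal sketch:

```python
import numpy as np

a = np.arange(1, 9).reshape(2, 4)

# order='F' reads the array column by column instead of row by row.
column_major = a.ravel(order='F')

flat_view = a.ravel()    # shares memory with a for this contiguous array
flat_copy = a.flatten()  # independent copy

# Writing through the view also changes a; the copy is unaffected.
flat_view[0] = 100

print(a[0, 0], flat_copy[0], column_major)
```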

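Programming Requirements

Following the hints, fill in the code between Begin and End in the editor on the right to complete this level.

Test Description

The platform will test the code you write. Expected output: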

the result of 2.1 is:
[ 1 3 5 7 9 11 13 15 17 19]
the result of 2.2 is:
[ 0 5 10 15]
the result of 2.3 is:
[[ 0 5]
[10 15]]
the result of 2.4 is:
[[ 0 1 2 3]
[ 8 9 10 11]
[16 17 18 19]]
the result of 2.5 is:
[ 3 19]
the result of 2.6 is:
[ 0 2 4 6 8 10 12 14 16 18 1 3 5 7 9 11 13 15 17 19]
the result of 2.7 is:
[[1 2 1]
[4 5 4]
[7 8 7]]

Code

"""
Assignment 2: NumPy statistics applications
"""
'''
2
Array access
Given the array a = np.array(range(0,20)), print:
'''
import numpy as np
a = np.array(range(0,20))
'''
2.1
[ 1 3 5 7 9 11 13 15 17 19]
'''
print("the result of 2.1 is:")
############begin############
a_1 = a[1:20:2]
print(a_1)
########### end #############
'''
2.2
[ 0 5 10 15]
'''
print("the result of 2.2 is:")
############begin############
a_2 = a[0:20:5]
print(a_2)
########### end #############
'''
2.3
[[ 0 5]
[10 15]]
'''
print("the result of 2.3 is:")
############begin############
a_3 = a_2.reshape([2, 2])
print(a_3)
########### end #############
'''
2.4
[[ 0 1 2 3]
[ 8 9 10 11]
[16 17 18 19]]
'''
print("the result of 2.4 is:")
############begin############
a_4 = np.array([a[0:4],a[8:12], a[16:20]])
print(a_4)
########### end #############
'''
2.5
[ 3 19]
'''
print("the result of 2.5 is:")
############begin############
a_5 = np.array([a[3], a[19]])
print(a_5)
########### end #############
'''
2.6
[ 0 2 4 6 8 10 12 14 16 18 1 3 5 7 9 11 13 15 17 19]
'''
print("the result of 2.6 is:")
############begin############
a_6_1 = a[0:19:2]
a_6_2 = a[1:20:2]
a_6 = np.array([a_6_1, a_6_2])
a_6 = a_6.ravel()
print(a_6)
########### end #############
'''
2.7
[[1 2 1]
[4 5 4]
[7 8 7]]
'''
print("the result of 2.7 is:")
############begin############
a_7 = np.array([[a[1], a[2], a[1]],
[a[4], a[5], a[4]],
[a[7], a[8], a[7]],
])
print(a_7)
########### end #############
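
Several of the answers above assemble the result from individual elements. As a hedged alternative sketch, 2.4 to 2.7 can also be written with reshape, integer-list indexing and np.concatenate; these are not the only valid answers.

```python
import numpy as np

a = np.array(range(0, 20))

# 2.4: view a as 5 rows of 4, then keep every second row.
a_4 = a.reshape(5, 4)[::2]

# 2.5: index with a list of positions.
a_5 = a[[3, 19]]

# 2.6: even positions followed by odd positions.
a_6 = np.concatenate([a[::2], a[1::2]])

# 2.7: arrange 1..9 as 3x3, then pick columns 0, 1, 0.
a_7 = a[1:10].reshape(3, 3)[:, [0, 1, 0]]

print(a_4, a_5, a_6, a_7, sep="\n")
```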

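Level 3: Simple Matrix Transformations

Task Description

Task for this level: write a program that solves several matrix computation problems.

Background

The NumPy module provides matrix, a structure dedicated to matrix computation, along with related functions.

Defining a matrix

np.matrix(object) creates a matrix object.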

>>> a
array([[1, 2],
[3, 4]])
>>> m = np.matrix(a)
>>> m
matrix([[1, 2],
[3, 4]])

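Matrix computation

Element-wise addition and subtraction of matrices: m1±m2
Three kinds of matrix multiplication:
Matrix times a scalar: m1*3
Matrix times matrix (matrix product): m1*m2
Element-wise product of two matrices: np.multiply(m1,m2)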

>>> a
array([[1, 2],
[3, 4]])
>>> m1 = np.matrix(a)
>>> m1
matrix([[1, 2],
[3, 4]])
>>> b
array([[5, 6],
[7, 8]])
>>> m2 = np.matrix(b)
>>> m2
matrix([[5, 6],
[7, 8]])
>>> m1*3
matrix([[ 3, 6],
[ 9, 12]])
>>> m1*m2
matrix([[19, 22],
[43, 50]])
>>> np.multiply(m1,m2)
matrix([[ 5, 12],
[21, 32]])
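
The same three products can be computed on plain ndarray objects, which the NumPy documentation generally recommends over the matrix class. In this sketch, note that * means element-wise multiplication for ndarrays, and the matrix product is written with the @ operator.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

scaled = a * 3       # array times scalar
product = a @ b      # matrix product, same result as m1*m2 above
elementwise = a * b  # element-wise product, same result as np.multiply(m1, m2)

print(scaled, product, elementwise, sep="\n")
```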

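Programming Requirements

Following the hints, fill in the code between Begin and End in the editor on the right to complete this level.

Test Description

The platform will test the code you write. Expected output: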

the result of 3.1 is:
[[ 3 6 9]
[12 15 18]
[21 24 27]]
the result of 3.2 is:
[[0. 2. 3.]
[4. 4. 6.]
[7. 8. 8.]]
the result of 3.3 is:
[[1 2 3 1 2 3]
[4 5 6 4 5 6]
[7 8 9 7 8 9]]
the result of 3.4 is:
[[1 4 7]
[2 5 8]
[3 6 9]]

Code

"""
Assignment 2: NumPy statistics applications
"""
'''
3. Matrix computation
Given the matrix m = np.mat("1 2 3;4 5 6;7 8 9"), compute and print:
'''
import numpy as np
m = np.mat("1 2 3;4 5 6;7 8 9")
'''
3.1
[[ 3 6 9]
[12 15 18]
[21 24 27]]
'''
print("the result of 3.1 is:")
############begin############
m_1 = m * 3
print(m_1)
########### end #############
'''
3.2
[[0. 2. 3.]
[4. 4. 6.]
[7. 8. 8.]]
'''
print("the result of 3.2 is:")
############begin############
n_1 = np.mat('1. 0. 0.;0. 1. 0.;0. 0. 1.')
m_2 = m - n_1
print(m_2)
########### end #############
'''
3.3
[[1 2 3 1 2 3]
[4 5 6 4 5 6]
[7 8 9 7 8 9]]
'''
print("the result of 3.3 is:")
############begin############
#n_2 = np.mat('0 0 0 1 2 3;0 0 0 4 5 6;0 0 0 7 8 9')
#m_3 = m + n_2
m_3 = np.insert(m, 3, [1, 4 ,7], axis=1)
m_3 = np.insert(m_3, 4, [2, 5 ,8], axis=1)
m_3 = np.insert(m_3, 5, [3, 6, 9], axis=1)
print(m_3)
########### end #############
'''
3.4
Transposed matrix
[[1 4 7]
[2 5 8]
[3 6 9]]
'''
print("the result of 3.4 is:")
############begin############
m_4 = m.T
print(m_4)
########### end #############
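
Two of the answers above can be shortened. This sketch subtracts an identity matrix built with np.eye for 3.2 and joins m with itself using np.hstack for 3.3. It builds m with np.matrix rather than np.mat, on the assumption that the np.mat alias may not be available in recent NumPy releases (2.0 and later, as far as I know).

```python
import numpy as np

m = np.matrix("1 2 3;4 5 6;7 8 9")

# 3.2: subtracting the 3x3 identity lowers each diagonal entry by one
# and turns the result into floats.
m_2 = m - np.eye(3)

# 3.3: place two copies of m side by side.
m_3 = np.hstack((m, m))

print(m_2)
print(m_3)
```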

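Level 4: NumPy Statistics Application

Task Description

Task for this level: write a program that reads the file iris_sepal_length.csv and computes the frequency distribution of the sepal length samples in the iris dataset.

Background

To complete this level you need to know: 1. how to read a csv file, 2. how to use the cumsum function, 3. how to format output.

Reading a csv data file with NumPy

NumPy's loadtxt(file, delimiter) function reads a text file. The file parameter is the file path and the delimiter parameter sets the separator, which is a comma for csv files. It returns an array: two-dimensional when the file has several rows and columns, one-dimensional when the file has a single column.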
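
To show the return shapes without needing a file on disk, this sketch feeds loadtxt from in-memory text through io.StringIO. The numbers are made up for illustration and are not taken from iris_sepal_length.csv.

```python
import io
import numpy as np

# A single-column file comes back as a 1-D array.
single_column = io.StringIO("5.1\n4.9\n4.7\n")
x = np.loadtxt(single_column, delimiter=",")

# A file with several comma-separated columns comes back as a 2-D array.
two_columns = io.StringIO("5.1,3.5\n4.9,3.0\n4.7,3.2\n")
y = np.loadtxt(two_columns, delimiter=",")

print(x.shape, y.shape)
```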

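Common statistical functions

Function	Purpose
amin	minimum
amax	maximum
sum	sum
mean	mean
std	standard deviation
var	variance
cumsum	cumulative sum of the elements

Note: NumPy can compute statistics along different dimensions, selected with the axis parameter. For a 2-D array, axis=1 aggregates along each row (one result per row), and axis=0 aggregates down each column (one result per column).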
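
A small sketch of the axis parameter on a 2x3 array, using sum and cumsum from the table above. The array contents are arbitrary.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

col_sums = data.sum(axis=0)  # one sum per column
row_sums = data.sum(axis=1)  # one sum per row

# Without axis, cumsum runs over the flattened array.
running = np.cumsum(data)

# With axis=1, the running total restarts on every row.
running_rows = np.cumsum(data, axis=1)

print(col_sums, row_sums, running, running_rows, sep="\n")
```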

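Programming Requirements

Following the hints, fill in the code between Begin and End in the editor on the right to complete this level.

Test Description

The platform will test the code you write. Expected output: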

the result of 4 is:
Simple size: 150
Range:Size(Percent)
4 - 5: 22 (14.7%)
5 - 6: 61 (40.7%)
6 - 7: 54 (36.0%)
7 - 8: 13 (8.7%)

See the expected evaluation output for details.

Code

"""
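Assignment 2: NumPy statistics applications
"""
'''
4. NumPy statistics application
The file iris_sepal_length.csv stores 150 iris sepal length samples. Use NumPy's statistics functions to compute the frequency distribution of sepal length.
Hint: np.cumsum()
Output format: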
Simple size: 150
Range:Size(Percent)
4 - 5: 22 (14.7%)
5 - 6: 61 (40.7%)
6 - 7: 54 (36.0%)
7 - 8: 13 (8.7%)
'''
import numpy as np
print("the result of 4 is:")
############begin############
x = np.loadtxt('./iris_sepal_length.csv', delimiter=',')
print('Simple size: '+ str(len(x)))
print()
print('Range:Size(Percent)')
count_1, count_2, count_3, count_4= 0, 0, 0, 0
for i in x:
    if (i >= 4 and i < 5):
        count_1 += 1
    if (i >= 5 and i < 6):
        count_2 += 1
    if (i >= 6 and i < 7):
        count_3 += 1
    if (i >= 7 and i < 8):
        count_4 += 1
count = count_1 + count_2 + count_3 + count_4
print('4 - 5:', count_1, '({:.1%})'.format(count_1 / count))
print('5 - 6:', count_2, '({:.1%})'.format(count_2 / count))
print('6 - 7:', count_3, '({:.1%})'.format(count_3 / count))
print('7 - 8:', count_4, '({:.1%})'.format(count_4 / count))
########### end #############
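
The loop above counts each range by hand. As a hedged sketch of a vectorized alternative, np.histogram can produce the per-range counts and np.cumsum the running totals that the hint mentions. The sample values below are hypothetical stand-ins for the contents of iris_sepal_length.csv, so the printed numbers will not match the expected output of the exercise.

```python
import numpy as np

# Hypothetical sample; the real exercise would load iris_sepal_length.csv.
x = np.array([4.3, 4.9, 5.0, 5.4, 5.8, 6.1, 6.9, 7.7])
edges = np.array([4, 5, 6, 7, 8])

# Each bin is [low, high), except the last one, which also includes 8.
counts, _ = np.histogram(x, bins=edges)

# Running total of samples up to the end of each bin.
cumulative = np.cumsum(counts)

lines = []
for low, high, c in zip(edges[:-1], edges[1:], counts):
    lines.append('{} - {}: {} ({:.1%})'.format(int(low), int(high), int(c), c / x.size))

print('Range:Size(Percent)')
print('\n'.join(lines))
```

The last bin of np.histogram is closed on the right, which differs slightly from the strict i < 8 test in the loop; that only matters if a sample equals 8 exactly.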
