The pandas resample function

2023-01-28  敬子v

Data source: https://pan.baidu.com/s/1EFqJFXf70t2Rubkh6D19aw (extraction code: syqg)

Exploring US crime data, 1960 - 2014

Step 1: Import the necessary libraries

import pandas as pd
import numpy as np

Step 2: Import the dataset from the following path

path1='pandas_exercise/exercise_data/US_Crime_Rates_1960_2014.csv' # forward slashes avoid the invalid '\e' escape and also work on Windows

Step 3: Load the data into a DataFrame named crime

crime=pd.read_csv(path1)
print(crime.head())

Step 4: What is the data type of each column? Use info()

print(crime.info()) # info() prints its own summary and returns None, hence the trailing None in the output
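The behaviour of info() can be checked on a small frame. The sketch below uses a hypothetical two-row DataFrame named toy (not the crime dataset) to show that info() returns None, and that the dtypes attribute is a compact alternative when only the column types are needed.

```python
import pandas as pd

# Hypothetical toy frame; the column names only mimic the crime dataset.
toy = pd.DataFrame({
    'Year': pd.Series([1960, 1961], dtype='int64'),
    'Murder': pd.Series([9110, 8740], dtype='int64'),
})

result = toy.info()      # prints the summary itself
print(result is None)    # info() has no return value

col_types = toy.dtypes   # a Series mapping column name to dtype
print(col_types)
```

Calling crime.info() without the surrounding print() would therefore give the same summary without the extra None line.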

Step 5: Convert the Year column to datetime64 using pd.to_datetime

crime['Year']=pd.to_datetime(crime.Year,format='%Y')
print(crime.head())
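As a minimal sketch of what this conversion does, the snippet below applies pd.to_datetime with format='%Y' to a hypothetical three-row frame named toy. Each integer year becomes a timestamp for 1 January of that year, which is why the output for step 5 shows dates such as 1960-01-01.

```python
import pandas as pd

# Hypothetical toy frame with integer years, as in the raw CSV.
toy = pd.DataFrame({'Year': [1960, 1961, 1962]})

# format='%Y' tells pandas to read each value as a four-digit year.
toy['Year'] = pd.to_datetime(toy['Year'], format='%Y')

print(toy['Year'])
print(toy['Year'].dt.year.tolist())  # the year is still recoverable via .dt
```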

Step 6: Set the Year column as the index of the DataFrame using set_index

crime=crime.set_index('Year',drop=True)
print(crime.head())
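One reason to move the dates into the index is that a DatetimeIndex supports label-based selection by date string and is required by resample. The sketch below, on a hypothetical frame named toy, shows set_index followed by a lookup of a single year through .loc.

```python
import pandas as pd

# Hypothetical toy frame; values are made up for illustration.
toy = pd.DataFrame({
    'Year': pd.to_datetime(['1960', '1961', '1962'], format='%Y'),
    'Murder': [10, 20, 30],
})

# drop=True (the default) removes Year from the columns once it is the index.
toy = toy.set_index('Year', drop=True)

# Partial string indexing: select every row that falls in 1961.
rows_1961 = toy.loc['1961']
print(rows_1961)
```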

Step 7: Delete the column named Total using del

del crime['Total']
print(crime.head())
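del modifies the DataFrame in place. If the original frame should stay intact, DataFrame.drop is a possible alternative, since it returns a new frame. The sketch below uses a hypothetical frame named toy to contrast the two.

```python
import pandas as pd

# Hypothetical toy frame; values are made up for illustration.
toy = pd.DataFrame({'Total': [3, 7], 'Violent': [1, 2], 'Property': [2, 5]})

# drop returns a copy without the column and leaves toy untouched.
trimmed = toy.drop(columns=['Total'])
print(list(toy.columns))
print(list(trimmed.columns))

# del removes the column from toy itself.
del toy['Total']
print(list(toy.columns))
```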

Step 8: Group the DataFrame by Year into ten-year bins and sum each bin, using time-series resampling (resample)

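crimes=crime.resample('10AS').sum() # sum every column over each ten-year bin ('AS' means year start; pandas 2.2+ deprecates it in favor of 'YS')
crimes['Population']=crime['Population'].resample('10AS').max() # a summed population is meaningless, so replace it with each decade's maximum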
print(crimes)
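The two-step pattern above (sum everything, then overwrite Population with its maximum) can also be written as a single agg call with one aggregation per column. The following is a sketch on hypothetical data, not the crime dataset: the frame toy covers 1960 to 1979, and the alias '10YS' is used as the year-start spelling that newer pandas versions prefer over '10AS'.

```python
import pandas as pd

# Hypothetical toy data: 20 yearly rows, 1960 through 1979.
years = pd.to_datetime([str(y) for y in range(1960, 1980)], format='%Y')
toy = pd.DataFrame(
    {'Population': range(100, 120), 'Violent': [1] * 20},
    index=years,
)

# One ten-year bin per decade, anchored at the start of the year.
# Population takes the largest value in the bin; Violent is summed.
decades = toy.resample('10YS').agg({'Population': 'max', 'Violent': 'sum'})
print(decades)
```

Each output row is labelled with the first day of its decade, matching the 1960-01-01, 1970-01-01, ... labels in the output for step 8.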

Step 9: When was the most dangerous time to live in US history?

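print(crime.idxmax(0)) # idxmax(0) returns, for each column, the index label of its maximum; it is called on the yearly crime frame, so the result lists peak years rather than decades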
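To make the behaviour of idxmax concrete, here is a small sketch on a hypothetical frame named toy whose index holds decade start dates. For every column, idxmax(axis=0) returns the index label of the row with the largest value. Applied to a decade-level frame such as crimes, the same call would return decade start dates instead of individual years.

```python
import pandas as pd

# Hypothetical decade-level toy frame; values are made up for illustration.
toy = pd.DataFrame(
    {'Murder': [5, 9, 7], 'Robbery': [30, 20, 10]},
    index=pd.to_datetime(['1960', '1970', '1980'], format='%Y'),
)

# For each column, the index label at which the column peaks.
peaks = toy.idxmax(axis=0)
print(peaks)
```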

Output

# Step 3
   Year  Population    Total  ...  Burglary  Larceny_Theft  Vehicle_Theft
0  1960   179323175  3384200  ...    912100        1855400         328200
1  1961   182992000  3488000  ...    949600        1913000         336000
2  1962   185771000  3752200  ...    994300        2089600         366800
3  1963   188483000  4109500  ...   1086400        2297800         408300
4  1964   191141000  4564600  ...   1213200        2514400         472800
[5 rows x 12 columns]
# Step 4
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   Year                55 non-null     int64
 1   Population          55 non-null     int64
 2   Total               55 non-null     int64
 3   Violent             55 non-null     int64
 4   Property            55 non-null     int64
 5   Murder              55 non-null     int64
 6   Forcible_Rape       55 non-null     int64
 7   Robbery             55 non-null     int64
 8   Aggravated_assault  55 non-null     int64
 9   Burglary            55 non-null     int64
 10  Larceny_Theft       55 non-null     int64
 11  Vehicle_Theft       55 non-null     int64
dtypes: int64(12)
memory usage: 5.3 KB
None
# Step 5
        Year  Population    Total  ...  Burglary  Larceny_Theft  Vehicle_Theft
0 1960-01-01   179323175  3384200  ...    912100        1855400         328200
1 1961-01-01   182992000  3488000  ...    949600        1913000         336000
2 1962-01-01   185771000  3752200  ...    994300        2089600         366800
3 1963-01-01   188483000  4109500  ...   1086400        2297800         408300
4 1964-01-01   191141000  4564600  ...   1213200        2514400         472800
[5 rows x 12 columns]
# Step 6
            Population    Total  ...  Larceny_Theft  Vehicle_Theft
Year                             ...                              
1960-01-01   179323175  3384200  ...        1855400         328200
1961-01-01   182992000  3488000  ...        1913000         336000
1962-01-01   185771000  3752200  ...        2089600         366800
1963-01-01   188483000  4109500  ...        2297800         408300
1964-01-01   191141000  4564600  ...        2514400         472800
[5 rows x 11 columns]
# Step 7
            Population  Violent  ...  Larceny_Theft  Vehicle_Theft
Year                             ...                              
1960-01-01   179323175   288460  ...        1855400         328200
1961-01-01   182992000   289390  ...        1913000         336000
1962-01-01   185771000   301510  ...        2089600         366800
1963-01-01   188483000   316970  ...        2297800         408300
1964-01-01   191141000   364220  ...        2514400         472800
[5 rows x 10 columns]
# Step 8
            Population   Violent  ...  Larceny_Theft  Vehicle_Theft
Year                              ...                              
1960-01-01   201385000   4134930  ...       26547700        5292100
1970-01-01   220099000   9607930  ...       53157800        9739900
1980-01-01   248239000  14074328  ...       72040253       11935411
1990-01-01   272690813  17527048  ...       77679366       14624418
2000-01-01   307006550  13968056  ...       67970291       11412834
2010-01-01   318857056   6072017  ...       30401698        3569080
[6 rows x 10 columns]
# Step 9
Population           2014-01-01
Violent              1992-01-01
Property             1991-01-01
Murder               1991-01-01
Forcible_Rape        1992-01-01
Robbery              1991-01-01
Aggravated_assault   1993-01-01
Burglary             1980-01-01
Larceny_Theft        1991-01-01
Vehicle_Theft        1991-01-01
dtype: datetime64[ns]
