Series
2019-01-31
庵下桃花仙
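A Series is a one-dimensional array-like object that holds a sequence of values and an associated index of labels. When no index is given, pandas creates a default integer index from 0 to N - 1.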
In [1]: import pandas as pd
In [3]: obj = pd.Series([4, 7, -5, 3])
In [4]: obj
Out[4]:
0 4
1 7
2 -5
3 3
dtype: int64
Two attributes, values and index, expose the underlying array and the labels
In [5]: obj.values
Out[5]: array([ 4, 7, -5, 3], dtype=int64)
In [6]: obj.index
Out[6]: RangeIndex(start=0, stop=4, step=1)
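A minimal sketch of what the two attributes hold, assuming a pandas version that provides Series.to_numpy() (0.24 or later). The variable names values and index_labels are only illustrative.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# The values are stored in a NumPy array; to_numpy() returns the same data as .values
values = obj.to_numpy()

# The default index is a RangeIndex; converting it to a list shows the integer labels
index_labels = list(obj.index)

print(type(values).__name__)
print(index_labels)
```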
Pass an index to label each data point
In [7]: obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
In [8]: obj2
Out[8]:
d 4
b 7
a -5
c 3
dtype: int64
In [10]: obj2.index
Out[10]: Index(['d', 'b', 'a', 'c'], dtype='object')
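The index must supply exactly one label per value. The sketch below shows what happens when it does not; the name length_error is a hypothetical helper used only to capture the exception.

```python
import pandas as pd

# Four values, four labels: accepted
obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

# Four values, three labels: pandas raises ValueError
try:
    pd.Series([4, 7, -5, 3], index=['d', 'b', 'a'])
    length_error = None
except ValueError as exc:
    length_error = exc

print(length_error)
```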
Use labels to select a single value, assign to it, or select several values with a list of labels
In [12]: obj2['a']
Out[12]: -5
In [13]: obj2['d'] = 6
In [14]: obj2[['c', 'a', 'd']]
Out[14]:
c 3
a -5
d 6
dtype: int64
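Label lookups can also be written with .loc, which is explicitly label-based. This is a sketch on a fresh copy of the data (so 'd' is still 4); the label 'z' is a made-up label that does not exist in the index.

```python
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

single = obj2.loc['a']               # one label returns a scalar
subset = obj2.loc[['c', 'a', 'd']]   # a list of labels returns a Series in that order

# A label that is not in the index raises KeyError
try:
    obj2.loc['z']
    missing_raises = False
except KeyError:
    missing_raises = True

# get() returns a default instead of raising, like dict.get
fallback = obj2.get('z', 0)
```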
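Boolean filtering, scalar multiplication, and NumPy math functions all preserve the link between index labels and values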
In [15]: obj2[obj2 > 0]
Out[15]:
d 6
b 7
c 3
dtype: int64
In [16]: obj * 2
Out[16]:
0 8
1 14
2 -10
3 6
dtype: int64
In [18]: import numpy as np
In [19]: np.exp(obj2)
Out[19]:
d 403.428793
b 1096.633158
a 0.006738
c 20.085537
dtype: float64
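A short check of that claim, assuming obj2 holds the values from the session after 'd' was set to 6. The comparison obj2 > 0 itself produces a boolean Series with the same labels, which is what the filter uses.

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([6, 7, -5, 3], index=['d', 'b', 'a', 'c'])

mask = obj2 > 0          # boolean Series, same index as obj2
positive = obj2[mask]    # keeps only the labels where mask is True
doubled = obj2 * 2       # element-wise, labels unchanged
exponent = np.exp(obj2)  # NumPy ufuncs return a Series with the same labels
```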
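A Series can be thought of as a fixed-length, ordered dict that maps index labels to values, so it supports the in operator and can be built directly from a dict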
In [21]: 'e' in obj2
Out[21]: False
In [22]: 'b' in obj2
Out[22]: True
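One detail worth sketching: as with a dict, in tests the labels (the keys), not the values. The variable names below are illustrative only.

```python
import pandas as pd

obj2 = pd.Series([6, 7, -5, 3], index=['d', 'b', 'a', 'c'])

label_found = 'b' in obj2                  # 'b' is a label
value_as_label = 7 in obj2                 # 7 is a value, not a label
value_found = bool(7 in obj2.values)       # search the underlying array instead

# to_dict() converts back to a plain dict of label -> value
as_dict = obj2.to_dict()
```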
In [23]: sdata = {'0hio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
In [24]: obj3 = pd.Series(sdata)
In [25]: obj3
Out[25]:
0hio 35000
Texas 71000
Oregon 16000
Utah 5000
dtype: int64
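Pass an index to control the order of the labels in the resulting Series. Labels with no matching key in sdata get NaN: California is absent, and Ohio does not match the key '0hio', which is typed with a zero. Utah is dropped because it is not in states.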
In [26]: states = ['California', 'Ohio', 'Oregon', 'Texas']
In [27]: obj4 = pd.Series(sdata, index=states)
In [28]: obj4
Out[28]:
California NaN
Ohio NaN
Oregon 16000.0
Texas 71000.0
dtype: float64
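For comparison, a sketch of the same construction with the dict key spelled 'Ohio' (an assumption about the intended data, not what the session above ran). Only California is then missing, and the dtype still becomes float64 because NaN is a floating-point value.

```python
import pandas as pd

# Assumed spelling: 'Ohio' with a letter O, unlike the key in the session
sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

obj4 = pd.Series(sdata, index=states)

print(obj4)
```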
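NaN (not a number) marks missing values. pandas provides the isnull and notnull functions to detect missing data, and Series has the same checks as instance methods.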
In [29]: pd.isnull(obj4)
Out[29]:
California True
Ohio True
Oregon False
Texas False
dtype: bool
In [30]: pd.notnull(obj4)
Out[30]:
California False
Ohio False
Oregon True
Texas True
dtype: bool
In [31]: obj4.isnull()
Out[31]:
California True
Ohio True
Oregon False
Texas False
dtype: bool
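The boolean results are ordinary Series, so they can be summed or used as filters. A minimal sketch, rebuilding obj4 by hand with the values shown above:

```python
import numpy as np
import pandas as pd

obj4 = pd.Series([np.nan, np.nan, 16000.0, 71000.0],
                 index=['California', 'Ohio', 'Oregon', 'Texas'])

# True counts as 1, so the sum is the number of missing entries
n_missing = int(obj4.isnull().sum())

# notnull() as a boolean filter keeps only the entries that have data
present = obj4[obj4.notnull()]

print(n_missing)
print(present)
```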
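Series automatically align on index labels in arithmetic operations. The result carries the union of both indexes, and any label that lacks a value in either operand yields NaN.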
In [32]: obj3
Out[32]:
0hio 35000
Texas 71000
Oregon 16000
Utah 5000
dtype: int64
In [33]: obj4
Out[33]:
California NaN
Ohio NaN
Oregon 16000.0
Texas 71000.0
dtype: float64
In [34]: obj3 + obj4
Out[34]:
0hio NaN
California NaN
Ohio NaN
Oregon 32000.0
Texas 142000.0
Utah NaN
dtype: float64
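When a label missing from one side should count as zero rather than produce NaN, Series.add accepts a fill_value. The sketch below uses two small made-up Series (a and b are hypothetical, not the obj3 and obj4 from the session).

```python
import pandas as pd

a = pd.Series({'Texas': 71000, 'Oregon': 16000, 'Utah': 5000})
b = pd.Series({'Oregon': 16000, 'Texas': 71000, 'California': 1000})

# Plain +: labels found in only one Series become NaN
plain = a + b

# add() with fill_value=0: a label missing on one side is treated as 0
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```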
Both the Series object and its index have a name attribute
In [35]: obj4.name = 'population'
In [36]: obj4.index.name = 'state'
In [37]: obj4
Out[37]:
state
California NaN
Ohio NaN
Oregon 16000.0
Texas 71000.0
Name: population, dtype: float64
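The names can also be set without assigning attributes after the fact. A sketch, assuming the constructor's name argument and Series.rename_axis, using a two-row stand-in for obj4:

```python
import pandas as pd

# name= sets the Series name at construction time
obj4 = pd.Series([16000.0, 71000.0], index=['Oregon', 'Texas'], name='population')

# rename_axis returns a new Series whose index carries the given name
obj4 = obj4.rename_axis('state')

# Arithmetic with a scalar keeps both names
doubled = obj4 * 2
```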
The index can be changed in place by assignment
In [38]: obj
Out[38]:
0 4
1 7
2 -5
3 3
dtype: int64
In [40]: obj.index = ['Bob','Steve', 'Jeff', 'Ryan']
In [41]: obj
Out[41]:
Bob 4
Steve 7
Jeff -5
Ryan 3
dtype: int64
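Assignment mutates the existing object. When the original should stay untouched, Series.rename with a mapping returns a relabeled copy instead; labels not in the mapping are left as they are. A sketch with illustrative variable names:

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# rename() with a dict maps old labels to new ones and returns a new Series
renamed = obj.rename({0: 'Bob', 1: 'Steve'})
original_labels = list(obj.index)   # obj itself is unchanged so far

# Assigning to .index replaces all labels on obj in place
obj.index = ['Bob', 'Steve', 'Jeff', 'Ryan']
```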