Pandas Tips

Pandas_Select_Data_Boolen

2020-03-31  Kaspar433

import pandas as pd
import numpy as np

iris = pd.read_csv('iris.csv')
iris.head(2)

out:
sepal_length    sepal_width petal_length    petal_width species
0   5.1 3.5 1.4 0.2 setosa
1   4.9 3.0 1.4 0.2 setosa

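Another common operation is filtering data with a boolean vector.

The operators are: | for or, & for and, and ~ for not.

Each condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and <.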
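
As a rough illustration of the parentheses rule, the sketch below filters a small hand-built frame. The name df and its five rows (copied from outputs shown in this post) are stand-ins for the full iris.csv, and the flag unparenthesized_ok exists only for the demonstration.

```python
import pandas as pd

# Stand-in for iris.csv: five rows copied from the outputs in this post.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.1, 7.3, 7.7],
        "sepal_width": [3.5, 3.0, 3.0, 2.9, 3.8],
    },
    index=[0, 1, 102, 107, 117],
)

# Each comparison is parenthesized, so & combines two boolean Series.
mask = (df.sepal_length > 7) & (df.sepal_width < 3)
selected = df[mask]

# Without parentheses Python groups the expression as
# df.sepal_length > (7 & df.sepal_width) < 3, which raises instead of filtering.
try:
    df.sepal_length > 7 & df.sepal_width < 3
    unparenthesized_ok = True
except (TypeError, ValueError):
    unparenthesized_ok = False
```

Only the parenthesized form produces a usable mask; the unparenthesized expression ends in an exception rather than a silently wrong result.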

Indexing a Series with a boolean vector works exactly as it does with a NumPy ndarray:

iris[iris.sepal_length>7]

out:
sepal_length    sepal_width petal_length    petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica

The same boolean vector can also be passed to .loc:

iris.loc[iris.sepal_length>7]

out:
sepal_length    sepal_width petal_length    petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
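
To make the NumPy parallel concrete, here is a minimal sketch on a four-value Series. The Series s and its values are picked by hand from rows shown in this post; they are an assumption for illustration, not the full dataset.

```python
import pandas as pd

# Hand-picked sepal_length values, keeping their original row labels.
s = pd.Series([5.1, 7.1, 4.9, 7.6], index=[0, 102, 1, 105], name="sepal_length")

# A comparison yields a boolean Series that shares the index of s.
mask = s > 7

# Indexing with the mask keeps only the rows where it is True.
pd_selected = s[mask]
loc_selected = s.loc[mask]

# The same pattern on the underlying NumPy array.
arr = s.to_numpy()
np_selected = arr[arr > 7]
```

Both s[mask] and s.loc[mask] return the same rows, and the NumPy version returns the same values without the index labels.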

& (and)

iris[(iris.sepal_length>7) & (iris.sepal_width<3)]

out:
sepal_length    sepal_width petal_length    petal_width species
107 7.3 2.9 6.3 1.8 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
130 7.4 2.8 6.1 1.9 virginica

The same filter with .loc:

iris.loc[(iris.sepal_length>7) & (iris.sepal_width<3)]

out:
sepal_length    sepal_width petal_length    petal_width species
107 7.3 2.9 6.3 1.8 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
130 7.4 2.8 6.1 1.9 virginica
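
When several conditions are combined, it may help readability to name each mask first. The sketch below does that on a hand-built subset of the rows shown above, and compares the result with DataFrame.query, an alternative this post does not use. The names df, is_long, and is_narrow are illustrative assumptions.

```python
import pandas as pd

# Subset of rows copied from the outputs in this post.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.1, 7.3, 7.2, 7.7, 7.7, 7.4],
        "sepal_width": [3.5, 3.0, 2.9, 3.6, 2.6, 2.8, 2.8],
    },
    index=[0, 102, 107, 109, 118, 122, 130],
)

# Naming each condition keeps the combined expression short.
is_long = df.sepal_length > 7
is_narrow = df.sepal_width < 3
by_mask = df[is_long & is_narrow]

# query() expresses the same filter as a string, where "and" is allowed.
by_query = df.query("sepal_length > 7 and sepal_width < 3")
```

Both forms select rows 107, 118, 122, and 130 from this subset, matching the output above.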

| (or)

iris[(iris.sepal_length>7) | (iris.sepal_width>4)]

out:
sepal_length    sepal_width petal_length    petal_width species
15  5.7 4.4 1.5 0.4 setosa
32  5.2 4.1 1.5 0.1 setosa
33  5.5 4.2 1.4 0.2 setosa
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica


The same filter with .loc:

iris.loc[(iris.sepal_length>7) | (iris.sepal_width>4)]

out:
sepal_length    sepal_width petal_length    petal_width species
15  5.7 4.4 1.5 0.4 setosa
32  5.2 4.1 1.5 0.1 setosa
33  5.5 4.2 1.4 0.2 setosa
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
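
A common slip is to reach for the Python keyword or instead of |. The sketch below, on a hand-built four-row frame (df, long_mask, wide_mask, and keyword_or_works are illustrative names, with rows copied from this post), shows that | takes the element-wise union while or raises an error because a Series has no single truth value.

```python
import pandas as pd

# Four rows copied from the outputs in this post.
df = pd.DataFrame(
    {
        "sepal_length": [5.7, 5.2, 7.1, 4.9],
        "sepal_width": [4.4, 4.1, 3.0, 3.0],
    },
    index=[15, 32, 102, 1],
)

long_mask = df.sepal_length > 7
wide_mask = df.sepal_width > 4

# | keeps a row when at least one of the two masks is True.
either = df[long_mask | wide_mask]

# The keyword `or` asks for the truth value of a whole Series, which is ambiguous.
try:
    df[long_mask or wide_mask]
    keyword_or_works = True
except ValueError:
    keyword_or_works = False
```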

~ (not)

iris[~(iris.sepal_length<=7)]

out:
sepal_length    sepal_width petal_length    petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica

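The unary "-" operator gives the same result here, but ~ is the operator listed above for not, and NumPy rejects "-" on a boolean ndarray, so ~ is the safer choice.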

iris[-(iris.sepal_length<=7)]

out:
sepal_length    sepal_width petal_length    petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica

The same filter with .loc:

iris.loc[~(iris.sepal_length<=7)]

out:
sepal_length    sepal_width petal_length    petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
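
In the examples above, ~(iris.sepal_length<=7) selects the same rows as iris.sepal_length>7. That equivalence relies on the column having no missing values. The sketch below uses an invented three-value Series with one NaN (not taken from iris.csv) to show where the two forms would differ.

```python
import numpy as np
import pandas as pd

# Invented values: the NaN row is added purely for illustration.
s = pd.Series([5.1, 7.1, np.nan], name="sepal_length")

# Any comparison with NaN is False, so negating it yields True for that row.
negated = ~(s <= 7)

# The direct comparison stays False for the NaN row.
direct = s > 7
```

Here negated keeps the NaN row while direct drops it, so the two filters are interchangeable only on columns without missing data.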

Using map()

criterion = iris.sepal_length.map(lambda s: s>7.5)
iris[criterion]

out:
sepal_length    sepal_width petal_length    petal_width species
105 7.6 3.0 6.6 2.1 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
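
For a simple threshold like this one, the mask built with map() should match the one produced by a direct comparison. The sketch below checks that on a hand-built four-row frame (df and vectorized are illustrative names, with values copied from this post). map() is mainly useful when the condition cannot be written as a vectorized expression.

```python
import pandas as pd

# Four sepal_length values copied from the outputs in this post.
df = pd.DataFrame(
    {"sepal_length": [7.1, 7.6, 7.7, 7.9]},
    index=[102, 105, 117, 131],
)

# map() calls the lambda once per element and collects the results.
criterion = df.sepal_length.map(lambda s: s > 7.5)

# The vectorized comparison builds the same mask in one step.
vectorized = df.sepal_length > 7.5

selected = df[criterion]
```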