An Introduction to the pandas MultiIndex

2020-02-26  python测试开发

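A pandas DataFrame usually has an "index": a column that gives each row a label. It works much like a primary key in a database table, although pandas does not require index values to be unique. pandas also supports the MultiIndex, where the row index is a composite key built from several columns.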
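
To make the "composite key" idea concrete, here is a minimal sketch. The city/year/sales data is hypothetical and not from the original example; it only shows that a single column may not label rows uniquely while a pair of columns can.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome"],
    "year": [2019, 2020, 2019],
    "sales": [10, 12, 7],
})

# Single-column index: each row is labelled by one value.
by_city = df.set_index("city")

# MultiIndex: each row is labelled by a (city, year) tuple.
by_city_year = df.set_index(["city", "year"])

single_is_unique = by_city.index.is_unique            # "Oslo" appears twice
composite_is_unique = by_city_year.index.is_unique    # every (city, year) pair is distinct
n_levels = by_city_year.index.nlevels                 # number of index levels

# Look up one cell by its full composite key.
oslo_2020_sales = by_city_year.loc[("Oslo", 2020), "sales"]
```

Here the composite key plays the role that a multi-column primary key would play in a database table.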

Creating an unindexed DataFrame from a CSV file

>>> import pandas, io
>>> data = io.StringIO('''Fruit,Color,Count,Price
... Apple,Red,3,$1.29
... Apple,Green,9,$0.99
... Pear,Red,25,$2.59
... Pear,Green,26,$2.79
... Lime,Green,99,$0.39
... ''')
>>> df_unindexed = pandas.read_csv(data)
>>> df_unindexed
   Fruit  Color  Count  Price
0  Apple    Red      3  $1.29
1  Apple  Green      9  $0.99
2   Pear    Red     25  $2.59
3   Pear  Green     26  $2.79
4   Lime  Green     99  $0.39
>>> # Promote the Fruit and Color columns to a two-level row index
>>> df = df_unindexed.set_index(['Fruit', 'Color'])
>>> df
             Count  Price
Fruit Color
Apple Red        3  $1.29
      Green      9  $0.99
Pear  Red       25  $2.59
      Green     26  $2.79
Lime  Green     99  $0.39
>>> # xs selects a cross-section: all rows whose first level (Fruit) is 'Apple'
>>> df.xs('Apple')
       Count  Price
Color
Red        3  $1.29
Green      9  $0.99
>>> # Select on a named level instead of the first one
>>> df.xs('Red', level='Color')
       Count  Price
Fruit
Apple      3  $1.29
Pear      25  $2.59
>>> df.loc['Apple', :]
       Count  Price
Color
Red        3  $1.29
Green      9  $0.99
>>> # A full (Fruit, Color) tuple selects a single row, returned as a Series
>>> df.loc[('Apple', 'Red'), :]
Count        3
Price    $1.29
Name: (Apple, Red), dtype: object
>>>
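
The session above selects on an inner level with xs. As a hedged sketch of an alternative, the same rows can be picked with loc and pandas.IndexSlice, which keeps both index levels in the result. The variable names (csv_text, df_sorted, green, green_xs) are my own; sorting the index first is a precaution, since the pandas documentation recommends a sorted MultiIndex for slicing.

```python
import io
import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""
df = pd.read_csv(io.StringIO(csv_text)).set_index(["Fruit", "Color"])

# Sort the MultiIndex lexicographically before slicing it.
df_sorted = df.sort_index()

# IndexSlice builds the per-level selector: any Fruit, Color == 'Green'.
idx = pd.IndexSlice
green = df_sorted.loc[idx[:, "Green"], :]

# For comparison: xs drops the selected level from the result by default.
green_xs = df.xs("Green", level="Color")
```

With loc and IndexSlice the result still has the (Fruit, Color) index; with xs only the Fruit level remains.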

https://www.somebits.com/~nelson/pandas-multiindex-slice-demo.html

pandas.DataFrame.xs

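This method takes a key argument and selects data at a particular level of a MultiIndex. It also works with a single-level index: like loc, it accesses rows by index label. Unlike loc, xs is read-only and cannot be used to assign values.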
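
A small sketch of the two points above, using made-up animal data: on a single-level index xs and loc return the same row, and on a MultiIndex the drop_level parameter of xs controls whether the selected level stays in the result.

```python
import pandas as pd

# Single-level index (illustrative data).
single = pd.DataFrame(
    {"num_legs": [4, 2], "num_wings": [0, 2]},
    index=pd.Index(["cat", "penguin"], name="animal"),
)
row_xs = single.xs("cat")      # Series for the row labelled 'cat'
row_loc = single.loc["cat"]    # same row via loc
same = row_xs.equals(row_loc)

# Two-level index (illustrative data).
multi = pd.DataFrame(
    {"num_legs": [4, 4, 2]},
    index=pd.MultiIndex.from_tuples(
        [("mammal", "cat"), ("mammal", "dog"), ("bird", "penguin")],
        names=["class", "animal"],
    ),
)

# Default: the selected level ('class') is dropped from the result.
dropped = multi.xs("mammal", level="class")

# drop_level=False keeps the full two-level index.
kept = multi.xs("mammal", level="class", drop_level=False)
```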

>>> d = {'num_legs': [4, 4, 2, 2],
...      'num_wings': [0, 0, 2, 2],
...      'class': ['mammal', 'mammal', 'mammal', 'bird'],
...      'animal': ['cat', 'dog', 'bat', 'penguin'],
...      'locomotion': ['walks', 'walks', 'flies', 'walks']}
>>> df = pandas.DataFrame(data=d)
>>> df
   num_legs  num_wings   class   animal locomotion
0         4          0  mammal      cat      walks
1         4          0  mammal      dog      walks
2         2          2  mammal      bat      flies
3         2          2    bird  penguin      walks
>>> df = df.set_index(['class', 'animal', 'locomotion'])
>>> df
                           num_legs  num_wings
class  animal  locomotion
mammal cat     walks              4          0
       dog     walks              4          0
       bat     flies              2          2
bird   penguin walks              2          2
>>> df.xs('mammal')
                   num_legs  num_wings
animal locomotion
cat    walks              4          0
dog    walks              4          0
bat    flies              2          2
>>> df.xs(('mammal', 'dog'))
sys:1: PerformanceWarning: indexing past lexsort depth may impact performance.
            num_legs  num_wings
locomotion
walks              4          0
>>> df.xs('cat', level=1)
                   num_legs  num_wings
class  locomotion
mammal walks              4          0
>>> df.xs(('bird', 'walks'),level=[0, 'locomotion'])
         num_legs  num_wings
animal
penguin         2          2
>>> df.xs('num_wings', axis=1)
class   animal   locomotion
mammal  cat      walks         0
        dog      walks         0
        bat      flies         2
bird    penguin  walks         2
Name: num_wings, dtype: int64
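
The PerformanceWarning in the session above appears because the index was built in row order and is not lexicographically sorted. One common remedy, sketched here under that assumption, is to call sort_index() before doing partial-key lookups. The warnings filter is only there to turn any PerformanceWarning into an error, so the example fails loudly if the warning were still raised.

```python
import warnings
import pandas as pd

d = {'num_legs': [4, 4, 2, 2],
     'num_wings': [0, 0, 2, 2],
     'class': ['mammal', 'mammal', 'mammal', 'bird'],
     'animal': ['cat', 'dog', 'bat', 'penguin'],
     'locomotion': ['walks', 'walks', 'flies', 'walks']}
df = pd.DataFrame(data=d).set_index(['class', 'animal', 'locomotion'])

# Sort all index levels so lookups can use the sorted structure.
sorted_df = df.sort_index()
is_sorted = sorted_df.index.is_monotonic_increasing

with warnings.catch_warnings():
    # Escalate PerformanceWarning to an exception inside this block.
    warnings.simplefilter("error", pd.errors.PerformanceWarning)
    result = sorted_df.xs(('mammal', 'dog'))
```

Note that sorting changes the row order (bird now comes before mammal), so code that depends on the original order should keep a reference to the unsorted frame.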