
Pandas.DataFrame: Inserting Columns and Rows

2016-08-17  Read by 64099 people  16926b49840e

This article uses an example CSV file to show how to insert columns and rows into a DataFrame.
File name: example.csv
Contents:

date spring summer autumn winter
2000 12.233881 16.907301 15.692383 14.085962
2001 12.847481 16.750469 14.514066 13.503746
2002 13.558175 17.203393 15.699948 13.233652
2003 12.654725 16.894915 15.661465 12.843479
2004 13.253730 17.046967 15.209054 14.364791
2005 13.444305 16.745982 16.622188 11.610823
2006 13.505696 16.833579 15.497928 12.199344
2007 13.488526 16.667733 15.817014 13.743822
2008 13.151532 16.486507 15.729573 12.932336
2009 13.457715 16.639238 18.260180 12.653159
2010 13.194548 16.728689 15.426353 13.883358
2011 14.347794 16.689421 14.176580 12.366542
2012 13.605087 17.130568 14.717968 13.292552
2013 13.027908 17.386193 16.203455 13.186121
2014 12.746682 16.544287 14.736768 12.870651
2015 13.465904 16.506123 12.442437 11.018138

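Inserting columns

First split the data by column, then insert the removed columns back into the original DataFrame. The split uses .pop(), which removes a column from the DataFrame in place and returns it as a Series.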

In [1]:
import numpy as np
import pandas as pd
 
table = pd.read_csv('example.csv')
table

Out[1]:
date    spring  summer  autumn  winter
0   2000    12.233881   16.907301   15.692383   14.085962
1   2001    12.847481   16.750469   14.514066   13.503746
2   2002    13.558175   17.203393   15.699948   13.233652
3   2003    12.654725   16.894915   15.661465   12.843479
4   2004    13.253730   17.046967   15.209054   14.364791
5   2005    13.444305   16.745982   16.622188   11.610823
6   2006    13.505696   16.833579   15.497928   12.199344
7   2007    13.488526   16.667733   15.817014   13.743822
8   2008    13.151532   16.486507   15.729573   12.932336
9   2009    13.457715   16.639238   18.260180   12.653159
10  2010    13.194548   16.728689   15.426353   13.883358
11  2011    14.347794   16.689421   14.176580   12.366542
12  2012    13.605087   17.130568   14.717968   13.292552
13  2013    13.027908   17.386193   16.203455   13.186121
14  2014    12.746682   16.544287   14.736768   12.870651
15  2015    13.465904   16.506123   12.442437   11.018138
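
A note on the delimiter: the listing of example.csv above shows the fields separated by spaces, while pd.read_csv splits on commas by default. The call above therefore yields five columns only if the file on disk is comma-delimited. If your copy is whitespace-delimited instead, something like the following sketch should work. The in-memory text `raw` and the name `table_ws` are stand-ins introduced here for illustration and are not part of the original example.

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the first rows of example.csv,
# assuming the fields are separated by spaces as in the listing.
raw = io.StringIO(
    "date spring summer autumn winter\n"
    "2000 12.233881 16.907301 15.692383 14.085962\n"
    "2001 12.847481 16.750469 14.514066 13.503746\n"
)

# sep=r"\s+" treats any run of whitespace as a single delimiter.
table_ws = pd.read_csv(raw, sep=r"\s+")

print(table_ws.columns.tolist())
print(table_ws.shape)
```

With a real file, the same `sep` argument would be passed alongside the file name.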

In [2]:
date = table.pop('date')
date

Out[2]:
0     2000
1     2001
2     2002
3     2003
4     2004
5     2005
6     2006
7     2007
8     2008
9     2009
10    2010
11    2011
12    2012
13    2013
14    2014
15    2015
Name: date, dtype: int64

In [3]:
summer = table.pop('summer')
summer

Out[3]:
0     16.907301
1     16.750469
2     17.203393
3     16.894915
4     17.046967
5     16.745982
6     16.833579
7     16.667733
8     16.486507
9     16.639238
10    16.728689
11    16.689421
12    17.130568
13    17.386193
14    16.544287
15    16.506123
Name: summer, dtype: float64

In [4]:
winter = table.pop('winter')
winter

Out[4]:
0     14.085962
1     13.503746
2     13.233652
3     12.843479
4     14.364791
5     11.610823
6     12.199344
7     13.743822
8     12.932336
9     12.653159
10    13.883358
11    12.366542
12    13.292552
13    13.186121
14    12.870651
15    11.018138
Name: winter, dtype: float64

In [5]:
table

Out[5]:
spring  autumn
0   12.233881   15.692383
1   12.847481   14.514066
2   13.558175   15.699948
3   12.654725   15.661465
4   13.253730   15.209054
5   13.444305   16.622188
6   13.505696   15.497928
7   13.488526   15.817014
8   13.151532   15.729573
9   13.457715   18.260180
10  13.194548   15.426353
11  14.347794   14.176580
12  13.605087   14.717968
13  13.027908   16.203455
14  12.746682   14.736768
15  13.465904   12.442437
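
The cells above rely on the fact that .pop() changes the DataFrame it is called on. Here is a minimal sketch of that behaviour on a two-row stand-in frame (the names `df`, `summer_col` and `trimmed` are made up for this illustration), together with .drop(columns=...), which returns a new frame and leaves the original untouched:

```python
import pandas as pd

# Small stand-in frame built from the first two rows of the example data.
df = pd.DataFrame({
    "date": [2000, 2001],
    "spring": [12.233881, 12.847481],
    "summer": [16.907301, 16.750469],
})

# pop() removes the column from df itself and returns it as a Series.
summer_col = df.pop("summer")
print(df.columns.tolist())      # ['date', 'spring']
print(summer_col.name)          # summer

# drop() builds a new frame; df still has both remaining columns afterwards.
trimmed = df.drop(columns=["spring"])
print(trimmed.columns.tolist()) # ['date']
print(df.columns.tolist())      # ['date', 'spring']
```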

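The split is done; now the columns go back in. A column that belongs at the far right can be created directly by assigning to its label. The other columns are inserted with the .insert() method, whose first argument is the 0-based position of the new column.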

In [6]:
table.insert(0,'date',date)
table

Out[6]:
date    spring  autumn
0   2000    12.233881   15.692383
1   2001    12.847481   14.514066
2   2002    13.558175   15.699948
3   2003    12.654725   15.661465
4   2004    13.253730   15.209054
5   2005    13.444305   16.622188
6   2006    13.505696   15.497928
7   2007    13.488526   15.817014
8   2008    13.151532   15.729573
9   2009    13.457715   18.260180
10  2010    13.194548   15.426353
11  2011    14.347794   14.176580
12  2012    13.605087   14.717968
13  2013    13.027908   16.203455
14  2014    12.746682   14.736768
15  2015    13.465904   12.442437

In [7]:
table.insert(2,'summer',summer)
table

Out[7]:
date    spring  summer  autumn
0   2000    12.233881   16.907301   15.692383
1   2001    12.847481   16.750469   14.514066
2   2002    13.558175   17.203393   15.699948
3   2003    12.654725   16.894915   15.661465
4   2004    13.253730   17.046967   15.209054
5   2005    13.444305   16.745982   16.622188
6   2006    13.505696   16.833579   15.497928
7   2007    13.488526   16.667733   15.817014
8   2008    13.151532   16.486507   15.729573
9   2009    13.457715   16.639238   18.260180
10  2010    13.194548   16.728689   15.426353
11  2011    14.347794   16.689421   14.176580
12  2012    13.605087   17.130568   14.717968
13  2013    13.027908   17.386193   16.203455
14  2014    12.746682   16.544287   14.736768
15  2015    13.465904   16.506123   12.442437

In [8]:
table['winter'] = winter
table

Out[8]:
date    spring  summer  autumn  winter
0   2000    12.233881   16.907301   15.692383   14.085962
1   2001    12.847481   16.750469   14.514066   13.503746
2   2002    13.558175   17.203393   15.699948   13.233652
3   2003    12.654725   16.894915   15.661465   12.843479
4   2004    13.253730   17.046967   15.209054   14.364791
5   2005    13.444305   16.745982   16.622188   11.610823
6   2006    13.505696   16.833579   15.497928   12.199344
7   2007    13.488526   16.667733   15.817014   13.743822
8   2008    13.151532   16.486507   15.729573   12.932336
9   2009    13.457715   16.639238   18.260180   12.653159
10  2010    13.194548   16.728689   15.426353   13.883358
11  2011    14.347794   16.689421   14.176580   12.366542
12  2012    13.605087   17.130568   14.717968   13.292552
13  2013    13.027908   17.386193   16.203455   13.186121
14  2014    12.746682   16.544287   14.736768   12.870651
15  2015    13.465904   16.506123   12.442437   11.018138
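
The three steps above can be condensed into one small sketch. It uses a two-row stand-in frame named `df` (an assumption made for brevity) and also shows one thing the walkthrough does not: by default .insert() raises ValueError when the column label already exists.

```python
import pandas as pd

df = pd.DataFrame({
    "spring": [12.233881, 12.847481],
    "autumn": [15.692383, 14.514066],
})

# insert(loc, column, value): loc is the 0-based position of the new column.
df.insert(0, "date", [2000, 2001])
df.insert(2, "summer", [16.907301, 16.750469])

# Plain label assignment adds a new column at the right edge.
df["winter"] = [14.085962, 13.503746]

# Inserting a label that is already present is rejected by default.
try:
    df.insert(1, "date", [0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(df.columns.tolist())  # ['date', 'spring', 'summer', 'autumn', 'winter']
print(duplicate_rejected)   # True
```

Note that .insert() modifies the frame in place and returns None, so its result should not be assigned back to the variable.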

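Inserting rows

So far I have not found a function or method that inserts a row directly, so my approach is to split the DataFrame first and then join the pieces.

Create a one-row DataFrame to insert between rows 2 and 3 of table. Split table into an upper and a lower part, then join the three pieces with the append method. Note the ignore_index=True argument: if it is not set to True, the rows of the new DataFrame keep their old index labels instead of being renumbered.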

In [9]:
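# One-row frame to insert; date is an int so that column keeps its integer dtype
insertRow = pd.DataFrame([[0,0.,0.,0.,0.]],columns=['date','spring','summer','autumn','winter'])
above = table.loc[:2]   # .loc slices by label and includes the end label: rows 0 to 2
below = table.loc[3:]   # rows 3 onward
# DataFrame.append was removed in pandas 2.0; this line needs an older version
newdata = above.append(insertRow,ignore_index=True).append(below,ignore_index=True)
newdata

Out[9]:
date    spring  summer  autumn  winter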
0   2000    12.233881   16.907301   15.692383   14.085962
1   2001    12.847481   16.750469   14.514066   13.503746
2   2002    13.558175   17.203393   15.699948   13.233652
3   0   0.000000    0.000000    0.000000    0.000000
4   2003    12.654725   16.894915   15.661465   12.843479
5   2004    13.253730   17.046967   15.209054   14.364791
6   2005    13.444305   16.745982   16.622188   11.610823
7   2006    13.505696   16.833579   15.497928   12.199344
8   2007    13.488526   16.667733   15.817014   13.743822
9   2008    13.151532   16.486507   15.729573   12.932336
10  2009    13.457715   16.639238   18.260180   12.653159
11  2010    13.194548   16.728689   15.426353   13.883358
12  2011    14.347794   16.689421   14.176580   12.366542
13  2012    13.605087   17.130568   14.717968   13.292552
14  2013    13.027908   17.386193   16.203455   13.186121
15  2014    12.746682   16.544287   14.736768   12.870651
16  2015    13.465904   16.506123   12.442437   11.018138
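
The split-and-join idea can be wrapped in a small function. The helper below, `insert_row`, is hypothetical and not part of pandas; it is only a sketch of one way to do it. It slices with .iloc (by position) rather than .loc (by label), so it does not depend on the index being 0, 1, 2, ..., and it joins the pieces with pd.concat so that it also runs on pandas versions without DataFrame.append.

```python
import pandas as pd


def insert_row(df, pos, row):
    """Return a new frame with `row` (a dict) placed at position `pos`.

    Hypothetical helper for illustration; not a pandas function.
    """
    new = pd.DataFrame([row], columns=df.columns)
    # iloc slices by position: rows before pos, the new row, rows from pos on.
    return pd.concat([df.iloc[:pos], new, df.iloc[pos:]], ignore_index=True)


# Stand-in frame with the first four rows of two columns of the example data.
df = pd.DataFrame({
    "date": [2000, 2001, 2002, 2003],
    "spring": [12.233881, 12.847481, 13.558175, 12.654725],
})

result = insert_row(df, 3, {"date": 0, "spring": 0.0})
print(result["date"].tolist())  # [2000, 2001, 2002, 0, 2003]
print(len(df))                  # 4, the input frame is left unchanged
```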

The pieces can also be joined with pd.concat(); again, note ignore_index=True.

In [10]:
newdata2=pd.concat([above,insertRow,below],ignore_index=True)
newdata2

Out[10]:
date    spring  summer  autumn  winter
0   2000    12.233881   16.907301   15.692383   14.085962
1   2001    12.847481   16.750469   14.514066   13.503746
2   2002    13.558175   17.203393   15.699948   13.233652
3   0   0.000000    0.000000    0.000000    0.000000
4   2003    12.654725   16.894915   15.661465   12.843479
5   2004    13.253730   17.046967   15.209054   14.364791
6   2005    13.444305   16.745982   16.622188   11.610823
7   2006    13.505696   16.833579   15.497928   12.199344
8   2007    13.488526   16.667733   15.817014   13.743822
9   2008    13.151532   16.486507   15.729573   12.932336
10  2009    13.457715   16.639238   18.260180   12.653159
11  2010    13.194548   16.728689   15.426353   13.883358
12  2011    14.347794   16.689421   14.176580   12.366542
13  2012    13.605087   17.130568   14.717968   13.292552
14  2013    13.027908   17.386193   16.203455   13.186121
15  2014    12.746682   16.544287   14.736768   12.870651
16  2015    13.465904   16.506123   12.442437   11.018138
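
To see what ignore_index=True actually changes, compare the index of the result with and without it. This is a minimal sketch on three tiny stand-in frames (`top`, `new` and `bottom` are names made up here):

```python
import pandas as pd

top = pd.DataFrame({"date": [2000, 2001]})                  # index 0, 1
new = pd.DataFrame({"date": [0]})                           # index 0
bottom = pd.DataFrame({"date": [2002, 2003]}, index=[2, 3]) # index 2, 3

# Without ignore_index each piece keeps its own labels, so 0 appears twice.
kept = pd.concat([top, new, bottom])
print(kept.index.tolist())        # [0, 1, 0, 2, 3]

# With ignore_index=True the result is renumbered from 0.
renumbered = pd.concat([top, new, bottom], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]
```

A duplicated label such as the repeated 0 makes label-based lookups like kept.loc[0] return more than one row, which is usually not what is wanted after inserting a row.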