14- 连接与修补 concat、combine_first

2018-08-26  本文已影响47人  蓝剑狼

连接 - 沿轴执行连接操作

pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, copy=True)

# 连接:concat

s1 = pd.Series([1,2,3])
s2 = pd.Series([2,3,4])
s3 = pd.Series([1,2,3],index = ['a','c','h'])
s4 = pd.Series([2,3,4],index = ['b','e','d'])
print(pd.concat([s1,s2]))
print(pd.concat([s3,s4]).sort_index())
print('-----')
# 默认axis=0,行+行

print(pd.concat([s3,s4], axis=1))
print('-----')
# axis=1,列+列,成为一个Dataframe
#执行结果
0    1
1    2
2    3
0    2
1    3
2    4
dtype: int64
a    1
b    2
c    2
d    4
e    3
h    3
dtype: int64
-----
     0    1
a  1.0  NaN
b  NaN  2.0
c  2.0  NaN
d  NaN  4.0
e  NaN  3.0
h  3.0  NaN
-----
# 连接方式:join,join_axes

s5 = pd.Series([1,2,3],index = ['a','b','c'])
s6 = pd.Series([2,3,4],index = ['b','c','d'])
print(pd.concat([s5,s6], axis= 1))
print(pd.concat([s5,s6], axis= 1, join='inner'))
print(pd.concat([s5,s6], axis= 1, join_axes=[['a','b','d']]))
# join:{'inner','outer'},默认为“outer”。如何处理其他轴上的索引。outer为联合和inner为交集。
# join_axes:指定联合的index
#执行结果
 0    1
a  1.0  NaN
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0
   0  1
b  2  2
c  3  3
     0    1
a  1.0  NaN
b  2.0  2.0
d  NaN  4.0
# 覆盖列名

sre = pd.concat([s5,s6], keys = ['one','two'])
print(sre,type(sre))
print(sre.index)
print('-----')
# keys:序列,默认值无。使用传递的键作为最外层构建层次索引

sre = pd.concat([s5,s6], axis=1, keys = ['one','two'])
print(sre,type(sre))
# axis = 1, 覆盖列名
#执行结果
one  a    1
     b    2
     c    3
two  b    2
     c    3
     d    4
dtype: int64 <class 'pandas.core.series.Series'>
MultiIndex(levels=[['one', 'two'], ['a', 'b', 'c', 'd']],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 1, 2, 3]])
-----
   one  two
a  1.0  NaN
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0 <class 'pandas.core.frame.DataFrame'>

修补 pd.combine_first()

df1 = pd.DataFrame([[np.nan, 3., 5.], [-4.6, np.nan, np.nan],[np.nan, 7., np.nan]])
df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5., 1.6, 4]],index=[1, 2])
print(df1)
print(df2)
print(df1.combine_first(df2))
print('-----')

根据index,df1的空值被df2替代

如果df2的index多于df1,则更新到df1上,比如index=['a',1]

df1.update(df2)
print(df1)

update,直接df2覆盖df1,相同index位置

执行结果

0 1 2
0 NaN 3.0 5.0
1 -4.6 NaN NaN
2 NaN 7.0 NaN
0 1 2
1 -42.6 NaN -8.2
2 -5.0 1.6 4.0
0 1 2
0 NaN 3.0 5.0
1 -4.6 NaN -8.2
2 -5.0 7.0 4.0


  0    1    2

0 NaN 3.0 5.0
1 -42.6 NaN -8.2
2 -5.0 1.6 4.0

上一篇下一篇

猜你喜欢

热点阅读