Merging Data with pd.merge
2020-01-25
清梦载星河
1. pd.merge()
pandas.merge(left, right,
how='inner', on=None,
left_on=None, right_on=None,
left_index=False, right_index=False,
sort=False, suffixes=('_x', '_y'),
copy=True, indicator=False,
validate=None)
- left : DataFrame
- right : DataFrame or named Series
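- how : {'left', 'right', 'outer', 'inner'}, default 'inner' (which set operation decides the keys kept in the result)
  - left: keep every key from the left frame; columns from the right frame are NaN where there is no match
  - right: keep every key from the right frame; columns from the left frame are NaN where there is no match
  - inner: intersection of the keys in both frames
  - outer: union of the keys in both frames
- on : label or list (usable only when the named column exists in both DataFrames)
- left_on / right_on : label or list, or array-like (join two datasets whose key columns have different names)
- left_index / right_index : bool, default False (use the index as the join key instead of a column)
- suffixes : tuple of (str, str), default ('_x', '_y') (custom suffixes for overlapping column names)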
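The suffixes parameter is easiest to see with two frames that share a non-key column. The following is a minimal sketch; the frames `left` and `right` and their `rank` column are made up for illustration and do not appear elsewhere in this post.

```python
import pandas as pd

# Hypothetical frames: both carry a non-key column named 'rank'.
left = pd.DataFrame({'name': ['saryn', 'volt', 'loki'],
                     'rank': [30, 28, 25]})
right = pd.DataFrame({'name': ['saryn', 'volt', 'loki'],
                      'rank': [3, 1, 2]})

# With the default suffixes, the overlapping column is split into rank_x / rank_y.
default_merge = pd.merge(left, right, on='name')
print(list(default_merge.columns))

# Custom suffixes make it clearer which frame each column came from.
custom_merge = pd.merge(left, right, on='name', suffixes=('_left', '_right'))
print(list(custom_merge.columns))
```

The key column `name` is not suffixed, because it is the join key rather than an overlapping data column.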
2. Example code for pd.merge()
# Simple join
# on can be omitted here: the frames share exactly one column name, and merge uses it as the key
import pandas as pd
df1 = pd.DataFrame({'Warframe': ['saryn', 'volt', 'trinity', 'loki'],
                    'group': ['A', 'B', 'C', 'D']})
df2 = pd.DataFrame({'Warframe': ['volt', 'loki', 'saryn', 'trinity'],
                    'support': [2004, 2008, 2012, 2014]})
print(pd.merge(df1, df2))

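# left_on and right_on: the key columns have different names in the two frames
df3 = pd.DataFrame({'name': ['saryn', 'volt', 'loki', 'trinity'],
                    'support': [2012, 2004, 2008, 2014]})
print(df3)
print("===========")
# The result keeps both key columns, 'Warframe' and 'name'
print(pd.merge(df1, df3,
               left_on='Warframe', right_on='name'))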

# left_index and right_index: join on the index of each frame
df11 = df1.set_index('Warframe')
df22 = df2.set_index('Warframe')
print(df11)
print(df22)
print(pd.merge(df11, df22,
               left_index=True, right_index=True))

# The how parameter
df5 = pd.DataFrame({'name': ['Ember', 'Frost', 'Garuda'],
                    'ability': ['Fire', 'Ice', 'Blood']},
                   columns=['name', 'ability'])
df6 = pd.DataFrame({'name': ['Garuda', 'Hydroid'],
                    'face': ['beautiful', 'emmm']},
                   columns=['name', 'face'])
print(df5)
print("===================================")
print(df6)
print("===================================")
# The default is how='inner', so this call and the next print the same result
print(pd.merge(df5, df6))
print("===================================")
print(pd.merge(df5, df6, how='inner'))
print("===================================")
# Keep every row of df5; 'face' is NaN for Ember and Frost
print(pd.merge(df5, df6, how='left'))
print("===================================")
# Keep every row of df6; 'ability' is NaN for Hydroid
print(pd.merge(df5, df6, how='right'))
print("===================================")
# Keep the keys from both frames
print(pd.merge(df5, df6, how='outer'))
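To check which rows each how option keeps, the indicator parameter from the signature in section 1 can help: with indicator=True, merge adds a `_merge` column marking each row as `left_only`, `right_only`, or `both`. The sketch below redefines df5 and df6 so that it runs on its own; the `row_counts` dictionary is only a helper introduced here for illustration.

```python
import pandas as pd

# Same data as the how example, redefined so this block is self-contained.
df5 = pd.DataFrame({'name': ['Ember', 'Frost', 'Garuda'],
                    'ability': ['Fire', 'Ice', 'Blood']})
df6 = pd.DataFrame({'name': ['Garuda', 'Hydroid'],
                    'face': ['beautiful', 'emmm']})

row_counts = {}  # illustrative helper: number of rows per join type
for how in ['inner', 'left', 'right', 'outer']:
    merged = pd.merge(df5, df6, on='name', how=how, indicator=True)
    row_counts[how] = len(merged)
    # _merge is categorical; convert to str for a compact printout
    print(how, len(merged), merged['_merge'].astype(str).tolist())
```

Only 'Garuda' appears in both frames, so the inner join has one row, while the outer join has one row for each of the four distinct names.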
