Attribute Reduction
2019-09-20
叫我老村长
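- The raw data contains too many attributes. Following the airline customer value LRFMC model, select the six attributes relevant to the model: LOAD_TIME, FFP_DATE, LAST_TO_END, FLIGHT_COUNT, SEG_KM_SUM and avg_discount.
- Drop the other attributes that are not needed, such as the membership card number.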
import numpy as np
import pandas as pd

def reduction_data(data):
    # Keep only the six attributes the LRFMC model needs.
    data = data[['LOAD_TIME', 'FFP_DATE', 'LAST_TO_END', 'FLIGHT_COUNT', 'SEG_KM_SUM', 'avg_discount']]
    # L is the time between FFP_DATE and LOAD_TIME.
    d_ffp = pd.to_datetime(data['FFP_DATE'])
    d_load = pd.to_datetime(data['LOAD_TIME'])
    res = d_load - d_ffp
    data2 = data.copy()
    # Convert the time difference to months, counting one month as 30 days.
    data2['L'] = res.map(lambda x: x / np.timedelta64(30 * 24 * 60, 'm'))
    data2['R'] = data['LAST_TO_END']
    data2['F'] = data['FLIGHT_COUNT']
    data2['M'] = data['SEG_KM_SUM']
    data2['C'] = data['avg_discount']
    # Return only the five LRFMC indicators.
    data3 = data2[['L', 'R', 'F', 'M', 'C']]
    return data3

# data: the raw customer DataFrame, assumed to be loaded beforehand.
data3 = reduction_data(data)
print(data3)
---------- Data after processing with the code above ----------
           L   R    F       M         C
0  90.200000   1  210  580717  0.961639
1  86.566667   7  140  293678  1.252314
2  87.166667  11  135  283712  1.254676
3  68.233333  97   23  281336  1.090870
4  60.533333   5  152  309928  0.970658
5  74.700000  79   92  294585  0.967692
6  97.700000   1  101  287042  0.965347
7  48.400000   3   73  287230  0.962070
8  34.266667   6   56  321489  0.828478
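To try the reduction step without the original dataset, here is a minimal self-contained sketch. The sample rows are made up for illustration, and the names `sample`, `to_lrfmc` and the `MEMBER_NO` column are assumptions, not part of the code above. It uses a vectorized division by `pd.Timedelta(days=30)` instead of the `map` call, which should give the same 30-day months.

```python
import pandas as pd

# Hypothetical sample rows: the values are invented for illustration
# and are not taken from the airline dataset.
sample = pd.DataFrame({
    'MEMBER_NO': [1001, 1002],  # hypothetical extra column, dropped below
    'FFP_DATE': ['2014-01-01', '2013-03-31'],
    'LOAD_TIME': ['2014-03-02', '2014-03-31'],
    'LAST_TO_END': [10, 200],
    'FLIGHT_COUNT': [5, 2],
    'SEG_KM_SUM': [12000, 3000],
    'avg_discount': [0.8, 0.6],
})

def to_lrfmc(df):
    """Return a new frame holding only the five LRFMC indicators."""
    # Time between joining (FFP_DATE) and LOAD_TIME, as a timedelta Series.
    membership = pd.to_datetime(df['LOAD_TIME']) - pd.to_datetime(df['FFP_DATE'])
    return pd.DataFrame({
        'L': membership / pd.Timedelta(days=30),  # months of 30 days, as float
        'R': df['LAST_TO_END'],
        'F': df['FLIGHT_COUNT'],
        'M': df['SEG_KM_SUM'],
        'C': df['avg_discount'],
    })

lrfmc = to_lrfmc(sample)
print(lrfmc)
```

In the first sample row the two dates are 60 days apart, so L comes out as 2.0. Building a fresh DataFrame rather than adding columns to a slice also avoids pandas' chained-assignment warning.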