2018-04-03 Kaiwei Learns Data Series - Feature
2018-04-04
Kaiweio
Imports
Prerequisites:
# Import the libraries we will be using
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style='ticks', palette='Set2')
plt.rcParams['figure.figsize'] = 10, 8
import sys
sys.path.append("..")
from ds_utils.features_pipeline import pipeline_from_config
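We use a mail-response dataset from a real direct marketing campaign. Each record represents an individual who was targeted with a direct marketing offer. The offer was a solicitation for a charitable donation.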
The columns (features) are:
Feature | Description |
---|---|
income | household income |
Firstdate | date associated with the first gift by this individual |
Lastdate | date associated with the most recent gift |
Amount | average gift amount by this individual over all periods (including zeros) |
rfaf2 | frequency code |
rfaa2 | donation amount code |
pepstrfl | flag indicating a star donor |
glast | amount of last gift |
gavr | amount of average gift |
The target variable is `class`. It equals one if the individual gave in this campaign and zero otherwise.
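Because the target is a binary flag, it can be useful to check how balanced it is before modelling. The following is only a minimal sketch on a tiny hand-made frame: the values are invented for illustration and are not the real mailing data, and `toy_df`, `class_share`, and `response_rate` are hypothetical names. Only the column names are taken from the feature description.

```python
import pandas as pd

# Hypothetical toy records (NOT the real mailing data); only the column
# names mirror the feature description.
toy_df = pd.DataFrame({
    "income": [3, 5, 1, 4, 2, 6],
    "glast": [10.0, 25.0, 5.0, 15.0, 20.0, 7.0],
    "class": [0, 1, 0, 0, 0, 1],
})

# Share of each target value: 1 = gave in this campaign, 0 = did not.
class_share = toy_df["class"].value_counts(normalize=True).sort_index()

# For a 0/1 target, the mean is the fraction of positive responses.
response_rate = toy_df["class"].mean()

print(class_share)
print(f"response rate: {response_rate:.2f}")
```

The same two lines should work on the real frame once it is loaded, assuming its target column is named `class` as described.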
# Load the data
mailing_url = "https://gist.githubusercontent.com/anonymous/5275f1f59be561ec9734c90d80d176b9/raw/f92227f9b8cdca188c1e89094804b8e46f14f30b/-"
mailing_df = pd.read_csv(mailing_url)
# Let's take a look at the data
mailing_df.head(5)
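Beyond `head`, a few quick checks usually help before building features: the frame's shape, missing values per column, and which columns are non-numeric codes. The sketch below runs on a small stand-in frame so that it does not depend on the download; the values, including the letter codes in `rfaa2` and the missing `income` entry, are invented for illustration, and `toy_df` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mailing_df (invented values, real column names).
toy_df = pd.DataFrame({
    "income": [3.0, np.nan, 1.0, 4.0],
    "rfaa2": ["G", "E", "F", "G"],
    "glast": [10.0, 25.0, 5.0, 15.0],
    "class": [0, 1, 0, 0],
})

# Number of records and number of columns.
n_rows, n_cols = toy_df.shape

# Count of missing values in each column.
missing_per_column = toy_df.isna().sum()

# Columns that are not numeric and may need encoding before modelling.
non_numeric_columns = toy_df.select_dtypes(exclude="number").columns.tolist()

print(n_rows, n_cols)
print(missing_per_column)
print(non_numeric_columns)
```

Replacing `toy_df` with `mailing_df` should give the same overview for the real data.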