seaborn regression and PairGrid
2018-12-02
榴莲气象
[Visualizing linear relationships](http://seaborn.pydata.org/tutorial/regression.html#regression-tutorial)
Python in practice: visualizing data with seaborn scatterplot matrices (pairs plots)
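The real benefit of the PairGrid class shows when we want to write custom functions that map different information onto the plot. For example, I might want to add the Pearson correlation coefficient between two variables to the scatterplot. To do this, I would write a function that takes two arrays, computes the statistic, and then draws it on the plot. The code below shows how this is done (credit goes to a Stack Overflow answer):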
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Function to calculate the correlation coefficient between two arrays
def corr(x, y, **kwargs):
    # Calculate the Pearson correlation coefficient
    coef = np.corrcoef(x, y)[0, 1]
    # Make the label
    label = r'$\rho$ = ' + str(round(coef, 2))
    # Add the label to the current axes, positioned in axes coordinates
    ax = plt.gca()
    ax.annotate(label, xy=(0.2, 0.95), size=20, xycoords=ax.transAxes)

# Create a pair grid instance
# df is assumed to be an already loaded DataFrame with the columns used below
# height was named size before seaborn 0.9
grid = sns.PairGrid(data=df[df['year'] == 2007],
                    vars=['life_exp', 'log_pop', 'log_gdp_per_cap'], height=4)

# Map the plots to the locations
grid = grid.map_upper(plt.scatter, color='darkred')
grid = grid.map_upper(corr)
grid = grid.map_lower(sns.kdeplot, cmap='Reds')
grid = grid.map_diag(plt.hist, bins=10, edgecolor='k', color='darkred')
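
The snippet above depends on a DataFrame `df` that is not defined in these notes. As a rough, self-contained sketch of the same idea, the version below uses synthetic data instead. The column names `a`, `b`, `c` and the helper names `corr_label` and `annotate_corr` are made up for illustration and do not come from the original article.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def corr_label(x, y):
    """Return the Pearson coefficient and the text used for the annotation."""
    coef = np.corrcoef(x, y)[0, 1]
    return coef, r"$\rho$ = {:.2f}".format(coef)


def annotate_corr(x, y, **kwargs):
    """PairGrid callback: write the coefficient into the current axes."""
    _, label = corr_label(x, y)
    ax = plt.gca()
    ax.annotate(label, xy=(0.2, 0.95), size=12, xycoords=ax.transAxes)


# Synthetic stand-in data (hypothetical columns a, b, c)
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.5, size=200),  # strongly correlated with a
    "c": rng.normal(size=200),                     # independent noise
})

grid = sns.PairGrid(demo, vars=["a", "b", "c"])
grid.map_upper(plt.scatter, s=10)   # scatterplots above the diagonal
grid.map_upper(annotate_corr)       # coefficient drawn on top of each scatterplot
grid.map_diag(plt.hist, bins=10)    # histograms on the diagonal
```

Calling `map_upper` twice layers the annotation over the scatterplot in the same axes, which is why the callback only needs `plt.gca()` and never creates axes of its own.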
seaborn pairgrid: using kdeplot with 2 hues
from scipy import stats
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

def corrfunc(x, y, **kws):
    r, _ = stats.pearsonr(x, y)
    ax = plt.gca()
    # Option A: count how many annotations are already present and stack below them
    n = len([c for c in ax.get_children() if
             isinstance(c, matplotlib.text.Annotation)])
    pos = (.1, .9 - .1*n)
    # Option B (overrides option A): set the position for every label by hand
    pos = (.1, .9) if kws['label'] == 'Yes' else (.1, .8)
    # Take care: this relies on a 'label' keyword being passed in by PairGrid
    ax.annotate("{}: r = {:.2f}".format(kws['label'], r),
                xy=pos, xycoords=ax.transAxes)

tips = sns.load_dataset("tips")
# height was named size before seaborn 0.9
g = sns.PairGrid(data=tips, vars=['tip', 'total_bill'], hue="smoker", height=4)
g.map_upper(plt.scatter, s=10)
# sns.distplot is deprecated since seaborn 0.11
g.map_diag(sns.distplot, kde=False)
g.map_lower(sns.kdeplot, cmap="Blues_d")
g.map_lower(corrfunc)
g.add_legend()
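
The callback above indexes `kws['label']` directly and hard-codes the level name 'Yes'. A more defensive variant might look like the sketch below. It is only an illustration under stated assumptions: the toy DataFrame, its `group` column, and the name `corr_by_hue` are hypothetical, and synthetic data replaces the tips dataset so that nothing has to be downloaded.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.text import Annotation
from scipy import stats


def corr_by_hue(x, y, **kws):
    """Annotate r for the subset being drawn, stacking one line per call."""
    r, _ = stats.pearsonr(x, y)
    ax = plt.gca()
    # Number of annotations already in these axes decides the vertical offset
    n = sum(isinstance(c, Annotation) for c in ax.get_children())
    label = kws.get("label")  # tolerate a missing label instead of raising KeyError
    if label is None:
        text = "r = {:.2f}".format(r)
    else:
        text = "{}: r = {:.2f}".format(label, r)
    ax.annotate(text, xy=(0.1, 0.9 - 0.1 * n), xycoords=ax.transAxes)


# Synthetic stand-in data with a two-level grouping column (hypothetical)
rng = np.random.default_rng(1)
x = rng.normal(size=120)
toy = pd.DataFrame({
    "x": x,
    "y": x + rng.normal(scale=0.8, size=120),
    "group": np.repeat(["Yes", "No"], 60),
})

g = sns.PairGrid(toy, vars=["x", "y"], hue="group")
g.map_lower(plt.scatter, s=10)
g.map_lower(corr_by_hue)  # called once per hue level for each lower panel
g.add_legend()
```

Because the vertical offset is derived from the number of existing annotations, this variant does not need to know the hue level names in advance.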