pandas test1: Filtering a table by P-value and FC value
2020-05-31
夕颜00
1. File: 附件3_蛋白质定量和差异分析列表.xlsx (Attachment 3: protein quantification and differential analysis list)
![](https://img.haomeiwen.com/i22798912/65f403839698eeb6.png)
![](https://img.haomeiwen.com/i22798912/0d8f0ad8613ee83f.png)
![](https://img.haomeiwen.com/i22798912/2b89ae0b31b89090.png)
2. Goal:
Filter the rows by P-value and FC (fold change) to generate the sample expression matrix.
![](https://img.haomeiwen.com/i22798912/9220ac1fb60dd924.png)
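The filtering idea can be tried on a small table before touching the real workbook. The sketch below is only an illustration: the toy data, the column names `pvalue` and `fc`, and the helper `filter_significant` are all hypothetical, and it selects columns by name, whereas the script in this post selects them by position. The thresholds (P < 0.05, FC above 1 or below -1) are the ones used in the script.

```python
import pandas as pd

# Hypothetical toy data; these column names are assumptions, not taken from the real sheet.
toy = pd.DataFrame({
    "protein": ["P1", "P2", "P3", "P4"],
    "sample_a": [10.0, 20.0, 30.0, 40.0],
    "fc": [1.5, -2.0, 0.3, 2.5],
    "pvalue": [0.01, 0.03, 0.001, 0.20],
})


def filter_significant(df, p_col="pvalue", fc_col="fc", p_max=0.05, fc_min=1.0):
    """Keep rows with p < p_max and fc > fc_min or fc < -fc_min."""
    # abs() > fc_min is equivalent to (fc > fc_min) | (fc < -fc_min)
    mask = (df[p_col] < p_max) & (df[fc_col].abs() > fc_min)
    return df[mask]


kept = filter_significant(toy)
print(kept["protein"].tolist())  # ['P1', 'P2']
```

P3 is dropped because its FC is too small, and P4 is dropped because its P-value is too large.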
3. Script:
import pandas as pd
import numpy as np
import re
# Input workbook and output CSV path
file3 = "附件3_蛋白质定量和差异分析列表.xlsx"
out = "out.csv"
# Read the sheet that holds the B vs. D differential analysis
df = pd.read_excel(file3, sheet_name="B VS D显著性差异分析")
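# Keep rows whose last column (P-value) is < 0.05 and whose second-to-last column (FC) is > 1 or < -1
mask = (df.iloc[:, -1] < 0.05) & ((df.iloc[:, -2] > 1) | (df.iloc[:, -2] < -1))
# Select column 1 plus columns 5 to 16 (0-based positions); copy() gives an independent frame
df1 = df[mask].iloc[:, np.r_[1, 5:17]].copy()
# print(df1.columns)
# Remove the literal suffix; regex=False stops "." from matching any character
df1.columns = df1.columns.str.replace(".raw.PG.Quantity", "", regex=False)
# Remove bracketed numbers such as "[1]"; df1.columns is array-like (a pandas Index)
df1.columns = [re.sub(r"\[\d+\]", "", v) for v in df1.columns]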
df1.to_csv(out, sep=",", index=False)
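
Two steps in the script are easy to misread: the positional selection with `np.r_` and the header clean-up. Here is a minimal sketch of both on a one-row toy frame. The header strings (such as `[1] B1.raw.PG.Quantity`) are hypothetical examples in the style the script appears to expect, and `clean_header` is an illustrative helper. Its final `strip()` is an addition that the original script does not perform.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical headers; the real sheet's column names may differ.
toy = pd.DataFrame(
    [[0, "P1", 1, 2, 3.0, 4.0]],
    columns=["idx", "protein", "x", "y",
             "[1] B1.raw.PG.Quantity", "[2] D1.raw.PG.Quantity"],
)

# np.r_ concatenates scalars and slices into one index array: positions 1, 4, 5
cols = np.r_[1, 4:6]
sub = toy.iloc[:, cols].copy()


def clean_header(name):
    """Drop the literal suffix and any bracketed number, then trim whitespace."""
    name = name.replace(".raw.PG.Quantity", "")  # plain str.replace is always literal
    name = re.sub(r"\[\d+\]", "", name)          # raw string avoids invalid-escape warnings
    return name.strip()                          # extra step, not in the original script


sub.columns = [clean_header(c) for c in sub.columns]
print(cols.tolist())         # [1, 4, 5]
print(sub.columns.tolist())  # ['protein', 'B1', 'D1']
```

Without the `strip()` call, a header like `[1] B1.raw.PG.Quantity` would come out as ` B1` with a leading space, so it is worth checking the real headers before deciding whether to trim.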