数据分析案例

我的第一个数据分析项目——51job“数据分析”岗位分析(数据清

2018-09-20  本文已影响0人  大力SAMA

一、统一薪资格式


1
2
3

薪资的类型一共有6种,把薪资统一到千/月,对薪资是一个范围的数据分开放入2列(low_salary,high_salary),薪资只有一个值的放入low_salary

主要代码内容如下:

def Salary(path,table):
    workbook = load_workbook(path)
    sheet = workbook[table]
    n = sheet.max_row
    
    a = Fraction(10,12)#万/年换算成千/月 先乘10再除以12
    b = Fraction(20,1000)#每月工作20天

    #6种薪资类型
    k1 = "万/月"
    k2 = "千/月"
    k3 = "万/年"
    k4 = "元/天"
    k5 = "千以下/月"
    k6 = "万以下/年"

    salarydic1 = {k1:10,k2:1,k3:a}#处理有“-”的数据
    salarydic2 = {k4:b,k5:1,k6:a}#处理无“-”的数据
    
    sheet.cell(1,2).value = "low_salary"
    sheet.cell(1,3).value = "high_salary"
    
    for i in range(2,n+1):
        s = sheet.cell(i,1).value
        if s:
            if "-" in s:
                for k in salarydic1:
                    if k in s:
                        slist = s.split("-")
                        low_salary = float(slist[0])*salarydic1[k]
                        high_salary = float(slist[1].replace(k,""))*salarydic1[k]
                        #print(round(low_salary,2),round(high_salary,2))
                        sheet.cell(i,2).value = round(low_salary,2)
                        sheet.cell(i,3).value = round(high_salary,2)
            else:
                for k in salarydic2:
                    if k in s:
                        low_salary = float(s.replace(k,""))*salarydic2[k]
                        #print(round(low_salary,2))
                        sheet.cell(i,2).value = round(low_salary,2)
    workbook.save(path)
薪资处理结果

二、简化所在行业,取第一个逗号前的文本


行业显示

主要代码内容如下:

def Companyfield(path,table):
    workbook = load_workbook(path)
    sheet = workbook[table]
    n = sheet.max_row
    
    sheet.cell(1,3).value = "new companyfield"
    for i in range(2,n+1):
        s = sheet.cell(i,2).value
        if s:
            if "," in s:
                s = s.split(",")[0]
            else:
                s = s
            sheet.cell(i,3).value = s
    workbook.save(path)
上一篇下一篇

猜你喜欢

热点阅读