Automating Office Work for Epidemic Prevention and Control (Part 1)

2022-05-24  不懂球的2大业

1. Background

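1. Coordinate with the Big Data Bureau and receive the list of people who did not complete nucleic acid testing the previous day.
2. Split the list by platform and send each platform's contact person the names of its untested people.
3. Urge each platform contact to complete the make-up sampling count, and summarize the reasons for anyone who was not resampled.
4. Report the summary back to the Big Data Bureau.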

2. Improvements

2.1 Improvement to step 2

import pandas as pd
import datetime

# Read the daily data
data = pd.read_excel("data.xls")
# Select the people who did not meet the daily requirement ("不达标" = not compliant)
un_satisfy = data[data["isDabiao"] == "不达标"]
# Load the company-to-platform mapping
map_company = pd.read_excel("map1.xls")
# Left join; the result gains the platform column (real_company)
result = pd.merge(left=un_satisfy, right=map_company, on="company", how="left")
# Group by platform
group_by_res = list(result.groupby("real_company"))
# Save each platform's data under a formatted name: date, AM/PM, platform, head count
now = datetime.datetime.now()
for platform, rows in group_by_res:
    file_name = "%s%s%s%d.xlsx" % (now.date(), now.strftime("%p"), platform, len(rows))
    rows.to_excel(excel_writer=file_name, sheet_name="sheet_1")
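To make the output of the loop above easier to picture, here is a minimal standard-library sketch of the same group-then-name idea. The helper names `build_file_name` and `group_by_platform`, the platform names, and the records are all made up for illustration; the AM/PM marker is computed from the hour as a locale-independent stand-in for `%p`, which is a substitution and not what the script above does.

```python
import datetime
from collections import defaultdict

def build_file_name(platform, n_people, now):
    """Date + AM/PM + platform + head count, mirroring the naming pattern above."""
    half_day = "AM" if now.hour < 12 else "PM"  # stand-in for strftime('%p')
    return "%s%s%s%d.xlsx" % (now.date().isoformat(), half_day, platform, n_people)

def group_by_platform(records):
    """Group (person, platform) pairs by platform; a stand-in for the merged table."""
    groups = defaultdict(list)
    for person, platform in records:
        groups[platform].append(person)
    return dict(groups)

# Made-up records, for illustration only
records = [("A", "PlatformX"), ("B", "PlatformY"), ("C", "PlatformX")]
now = datetime.datetime(2022, 5, 24, 9, 30)

names = {
    platform: build_file_name(platform, len(people), now)
    for platform, people in group_by_platform(records).items()
}
print(names)
```

Putting the head count in the file name lets a platform contact see how many people need follow-up without opening the file.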

2.2 Improvement to step 3

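Sample summary: 1603 people were due for sampling and 1511 were actually sampled. System verification showed that 92 people were not sampled, of whom 41 have since been resampled, 39 have left the company, 4 cannot be tested within 48 hours of receiving their third vaccine dose, 2 are recuperating at home after an injury, and 6 have left the area (离苏).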
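The figures in the summary above can be cross-checked before the report is sent. The following is a small sketch, not part of the original workflow: the function name `check_summary` and the English reason keys are assumptions, while the numbers are taken from the summary.

```python
def check_summary(expected, sampled, reasons):
    """Return the unsampled count after verifying that the reason counts add up to it."""
    unsampled = expected - sampled
    total = sum(reasons.values())
    if total != unsampled:
        raise ValueError("reasons cover %d people, expected %d" % (total, unsampled))
    return unsampled

# Counts copied from the sample summary; the keys are illustrative labels
reasons = {
    "resampled": 41,
    "left the company": 39,
    "within 48 hours of third vaccine dose": 4,
    "recuperating at home after injury": 2,
    "left the area": 6,
}

unsampled = check_summary(1603, 1511, reasons)
print(unsampled)  # 92
```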
import pandas as pd
import numpy as np
from jieba import lcut
from gensim.corpora import Dictionary
from gensim.models import TfidfModel
from gensim.similarities import SparseMatrixSimilarity

def make_labels(labels):
    # Tokenize each label with jieba, then build the dictionary and bag-of-words corpus
    labels = [lcut(label) for label in labels]
    dictionary = Dictionary(labels)
    corpus = [dictionary.doc2bow(label) for label in labels]
    return dictionary,corpus

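def make_choice_word_index(key_word,dictionary,corpus):
    # Return the index of the label most similar to key_word under TF-IDF similarity
    kw_vector = dictionary.doc2bow(lcut(key_word))
    # Fit TF-IDF on the label corpus and index the weighted labels
    tfidf = TfidfModel(corpus)
    tf_texts = tfidf[corpus]
    num_features = len(dictionary.token2id)
    sparse_matrix = SparseMatrixSimilarity(tf_texts,num_features)
    # Weight the remark the same way and compare it against every label
    tf_kw = tfidf[kw_vector]
    similarities = sparse_matrix.get_similarities(tf_kw)
    index = np.argmax(similarities)
    return index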

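# Category labels, kept in Chinese because they are matched against Chinese remarks:
# tested; resigned; vaccinated, not tested; injured or post-surgery (at home or in hospital), not tested;
# on leave and back in hometown, not tested; on leave, not tested
labels = ["已做","已离职跑路","打疫苗未做","因受伤、手术居家医院未做","请假回老家未做","请假未做"]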

dictionary,corpus = make_labels(labels)

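# Read the follow-up results; the "备注" (remarks) column holds each free-text reason
data = pd.read_excel("data.xlsx")

lst = list(data.loc[:,"备注"])

# One counter per label
count = [0] * len(labels)

# Assign each remark to its most similar label and tally it.
# A remark that shares no tokens with any label scores 0 for all of them and lands on index 0.
for word in lst:
    index = make_choice_word_index(word,dictionary,corpus)
    count[index] = count[index] + 1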

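print("Total:",sum(count),"people.",
      "Resampled:",count[0],"people ",
      "Left the company:",count[1],"people ",
      "Cannot be tested within 48 hours of the third vaccine dose:",count[2],"people ",
      "Recuperating at home after injury or surgery, not tested:",count[3],"people ",
      "Left the area:",count[4],"people ",
      "On leave, not tested:",count[5],"people "
     )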
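One caveat of the argmax-based matching above: when a remark shares no tokens with any label, every similarity is zero and `np.argmax` returns index 0, so the remark would be counted under the first label. The sketch below illustrates the matching idea in plain Python and adds a fallback that returns `None` for such remarks. It is a simplified illustration, not a reproduction of gensim's exact weighting; the function names, the English tokens, and the `threshold` parameter are all assumptions made for the example.

```python
import math
from collections import Counter

def weigh(tokens, idf):
    """Weight known tokens by tf * idf and normalize the vector to unit length."""
    tf = Counter(tok for tok in tokens if tok in idf)
    vec = {tok: cnt * idf[tok] for tok, cnt in tf.items() if idf[tok] > 0}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {tok: w / norm for tok, w in vec.items()} if norm else {}

def tfidf_vectors(docs):
    """Compute idf over tokenized docs and return it with the weighted doc vectors."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log2(n / df[tok]) for tok in df}
    return idf, [weigh(doc, idf) for doc in docs]

def best_label(tokens, idf, label_vectors, threshold=0.0):
    """Index of the most similar label, or None when no score exceeds threshold."""
    query = weigh(tokens, idf)
    scores = [
        sum(w * vec.get(tok, 0.0) for tok, w in query.items())
        for vec in label_vectors
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > threshold else None

# Hypothetical pre-tokenized labels (the real script tokenizes Chinese text with jieba)
label_tokens = [
    ["already", "tested"],
    ["resigned"],
    ["vaccinated", "not", "tested"],
    ["on", "leave", "not", "tested"],
]
idf, label_vectors = tfidf_vectors(label_tokens)

print(best_label(["resigned", "last", "week"], idf, label_vectors))
print(best_label(["vaccinated", "yesterday"], idf, label_vectors))
print(best_label(["unrelated", "remark"], idf, label_vectors))
```

Remarks that come back as `None` could be set aside for manual review instead of being added to the first category.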
