Python crawler: handling Sina Weibo time formats

2021-02-25  format_b1d8

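Sina Weibo shows post times in quite a few different formats, which makes them fiddly to handle. I wrote a small demo so I can look it up again later.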

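from datetime import datetime
from datetime import timedelta

# Sample inputs. Each assignment overrides the previous one, so only the
# last line takes effect; reorder or comment out lines to try another format.
publish_time = '02月19日 20:19'          # month-day, no year
publish_time = '2020年12月18日 13:36'    # full date with year
publish_time = '今天06:50'               # "today 06:50"
publish_time = '10分钟前'                # "10 minutes ago"
publish_time = '20秒前'                  # "20 seconds ago"
# The next two carry a trailing "转赞人数超过200" (over 200 reposts/likes) note.
publish_time = '2021-02-25  08:32 转赞人数超过200:00'
publish_time = '今天 08:32 转赞人数超过200'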
# Drop the trailing repost/like-count note if there is one.
if '人数' in publish_time:
    result = publish_time.split(' ')
    result.pop()  # remove the last token by position, not by value
    publish_time = ' '.join(result)
if "刚刚" in publish_time:  # "just now"
    publish_time = datetime.now().strftime('%Y-%m-%d %H:%M')
elif "分钟" in publish_time:  # "N minutes ago"
    minute = publish_time[:publish_time.find("分钟")]
    minute = timedelta(minutes=int(minute))
    publish_time = (datetime.now() - minute).strftime("%Y-%m-%d %H:%M")
elif "今天" in publish_time:  # "today HH:MM"
    today = datetime.now().strftime("%Y-%m-%d")
    # strip() avoids a double space when the input is '今天 08:32'
    clock = publish_time.replace('今天', '').strip()
    publish_time = today + " " + clock
elif '年' in publish_time:  # full date: YYYY年MM月DD日 HH:MM
    publish_time = publish_time.replace('年', '-').replace('月', '-').replace('日', '')
elif "月" in publish_time:  # no year shown: assume the current year
    year = datetime.now().strftime("%Y")
    publish_time = year + "-" + publish_time.replace('月', '-').replace('日', '')
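elif "秒" in publish_time:  # "N seconds ago"
    second = timedelta(seconds=int(publish_time[:publish_time.find("秒")]))
    publish_time = (datetime.now() - second).strftime('%Y-%m-%d %H:%M')
else:  # already 'YYYY-MM-DD HH:MM'; just collapse repeated spaces
    publish_time = ' '.join(publish_time.split())
publish_time = publish_time + ":00"  # pad the seconds field
print("Weibo publish time: " + publish_time)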
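For reuse inside a crawler, the same branches can be wrapped in a function. The version below is only a sketch: the name `normalize_weibo_time` and the `now` parameter are my own additions rather than anything from Weibo or a library, and it assumes the inputs look like the samples above. Passing `now` explicitly makes the relative formats ("N minutes ago", "today") reproducible, and parsing with `strptime` means a malformed string raises `ValueError` instead of passing through silently.

```python
from datetime import datetime, timedelta
from typing import Optional


def normalize_weibo_time(raw: str, now: Optional[datetime] = None) -> str:
    """Return raw as 'YYYY-MM-DD HH:MM:00' (hypothetical helper)."""
    now = now or datetime.now()
    fmt = '%Y-%m-%d %H:%M'

    text = raw.strip()
    # Assumption: a repost/like-count note is always the last
    # space-separated token, as in the samples above.
    if '人数' in text:
        text = text.rsplit(' ', 1)[0]
    text = ' '.join(text.split())  # collapse repeated spaces

    if '刚刚' in text:  # "just now"
        moment = now
    elif '分钟' in text:  # "N minutes ago"
        moment = now - timedelta(minutes=int(text[:text.find('分钟')]))
    elif '秒' in text:  # "N seconds ago"
        moment = now - timedelta(seconds=int(text[:text.find('秒')]))
    elif '今天' in text:  # "today HH:MM"
        clock = text.replace('今天', '').strip()
        moment = datetime.strptime(now.strftime('%Y-%m-%d') + ' ' + clock, fmt)
    elif '年' in text:  # full date with year
        moment = datetime.strptime(text, '%Y年%m月%d日 %H:%M')
    elif '月' in text:  # no year shown: assume the year of `now`
        moment = datetime.strptime(f'{now.year}年{text}', '%Y年%m月%d日 %H:%M')
    else:  # already 'YYYY-MM-DD HH:MM'
        moment = datetime.strptime(text, fmt)
    return moment.strftime(fmt) + ':00'


if __name__ == '__main__':
    fixed_now = datetime(2021, 2, 25, 9, 0)
    for sample in ['02月19日 20:19', '今天06:50', '10分钟前']:
        print(sample, '->', normalize_weibo_time(sample, now=fixed_now))
```

In a real crawl you would normally omit `now`, so that the current time is used; the fixed value here is just for checking the output by hand.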