Hacker News Article Hotness Ranking Algorithm

2019-11-16  mapoor

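The Hacker News article hotness ranking algorithm is fairly simple and effective. It measures how "hot" an article is along three dimensions: upvotes, downvotes, and submission time.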
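
Written as a formula, the score computed by the `hot` function in this post looks roughly like the following. The symbols U (upvotes), D (downvotes), and t (submission time as Unix seconds) are notation introduced here for illustration; they do not appear in the original code.

```latex
\[
s = U - D, \qquad
\mathrm{hot}(U, D, t) = \operatorname{sgn}(s)\,\log_{10}\bigl(\max(|s|, 1)\bigr) + \frac{t - 1134028003}{45000}
\]
```

Since 45000 seconds is 12.5 hours, each tenfold increase in |s| offsets 12.5 hours of age. The code additionally rounds the result to 7 decimal places.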


The corresponding Python code is below (its header comment credits Reddit's `_sorts.pyx`):

# Rewritten code from /r2/r2/lib/db/_sorts.pyx

from datetime import datetime, timedelta
from math import log

epoch = datetime(1970, 1, 1)

def epoch_seconds(date):
    td = date - epoch
    return td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000)

def score(ups, downs):
    return ups - downs

def hot(ups, downs, date):
    s = score(ups, downs)
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003
    return round(sign * order + seconds / 45000, 7)
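
To make the trade-off between votes and time concrete, here is a minimal sketch that repeats the `hot` function above and compares two posts. The timestamps and vote counts are made up for illustration, and `epoch_seconds` is shortened with `timedelta.total_seconds()`, which should give the same value as the original expression.

```python
from datetime import datetime, timedelta
from math import log

EPOCH = datetime(1970, 1, 1)

def epoch_seconds(date):
    # Seconds since the Unix epoch for a naive datetime
    return (date - EPOCH).total_seconds()

def hot(ups, downs, date):
    s = ups - downs
    order = log(max(abs(s), 1), 10)            # log10 of the net score
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003
    return round(sign * order + seconds / 45000, 7)

# Hypothetical posts: one older with 100 net votes, one 12.5 hours newer with 10
older = datetime(2019, 11, 16, 0, 0, 0)
newer = older + timedelta(seconds=45000)

popular_old = hot(100, 0, older)   # order = 2
modest_new = hot(10, 0, newer)     # order = 1, plus 1 from the time term

print(popular_old, modest_new)
```

The two scores come out essentially equal: ten times as many net votes buys exactly 45000 seconds (12.5 hours) of extra freshness.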

Below is a version modified for ranking smzdm products (http://39.106.99.186/):

from datetime import datetime
from math import log

epoch = datetime(1970, 1, 1)

def epoch_seconds(date):
    td = date - epoch
    return td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000)

def score(ups, downs):
    return ups - downs / 2.0  # there are too many downvotes, so count only half of them

def hot(ups, downs, date):
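    s = score(ups - 1, downs)  # subtract 1 to exclude the poster's own upvote
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003  # seconds since 2005-12-08 07:46:43 UTC (15:46:43 UTC+8)

    # w: a net score of s = 10 is worth as much as w extra hours of recency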
    w = 1.25
    return round(sign * order + seconds / (3600*w), 7)
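
As a rough illustration of what `w` changes, the sketch below ranks three made-up items with the modified formula, once with `w = 1.25` and once with `w = 12.5` (which matches the original divisor of 45000). The item data, the `w` keyword argument, and the helper name `rank` are assumptions added for this example, not part of the original code.

```python
from datetime import datetime
from math import log

EPOCH = datetime(1970, 1, 1)

def hot(ups, downs, date, w=1.25):
    # Modified score: drop the poster's own upvote, count downvotes at half weight
    s = (ups - 1) - downs / 2.0
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = (date - EPOCH).total_seconds() - 1134028003
    return round(sign * order + seconds / (3600 * w), 7)

# Hypothetical items for illustration only
items = [
    {"title": "A", "ups": 101, "downs": 0,  "date": datetime(2019, 11, 16, 8, 0)},
    {"title": "B", "ups": 11,  "downs": 0,  "date": datetime(2019, 11, 16, 10, 0)},
    {"title": "C", "ups": 3,   "downs": 20, "date": datetime(2019, 11, 16, 10, 0)},
]

def rank(items, w):
    # Highest hot score first
    key = lambda it: hot(it["ups"], it["downs"], it["date"], w)
    return [it["title"] for it in sorted(items, key=key, reverse=True)]

fast_decay = rank(items, w=1.25)   # ['B', 'A', 'C']
slow_decay = rank(items, w=12.5)   # ['A', 'B', 'C']
print(fast_decay, slow_decay)
```

With `w = 1.25`, item B's two-hour head start outweighs A's tenfold lead in votes. With `w = 12.5`, votes dominate and A stays on top. A smaller `w` therefore makes the list turn over faster.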

For a more refined approach, see Reddit's ranking algorithms.

Reference: https://medium.com/hacking-and-gonzo/how-reddit-ranking-algorithms-work-ef111e33d0d9
