CBOW / Skip-gram

2017-08-14, by 小绿叶mj

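Overall architecture:

[Figure: CBOW & Skip-gram]

CBOW predicts the center word from its surrounding context words, while Skip-gram predicts the surrounding context words from the center word.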

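The Skip-gram model's objective is to maximize the average log probability:

$$\frac{1}{T}\sum_{t=1}^{T}\ \sum_{-c \le j \le c,\ j \ne 0} \log p(w_{t+j} \mid w_t)$$

where $w_1, \dots, w_T$ is the sequence of training words and $c$ is the size of the context window.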

For Skip-gram, a larger context window produces more training examples and can yield more accurate representations, but training takes longer.
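To make the window-size trade-off concrete, here is a minimal sketch of how (center, context) training pairs could be generated. The function name `skipgram_pairs` and the toy sentence are illustrative assumptions, not part of word2vec itself.

```python
def skipgram_pairs(tokens, window):
    """Return (center, context) pairs for every word within `window` positions of each center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # clip the window at the sentence start
        hi = min(len(tokens), i + window + 1)  # clip the window at the sentence end
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy sentence, used only for illustration.
sentence = "the quick brown fox jumps".split()
for c in (1, 2):
    print(f"window={c}: {len(skipgram_pairs(sentence, c))} training pairs")
```

Even on this five-word sentence, widening the window from 1 to 2 increases the number of pairs, which is where both the extra signal and the extra training time come from.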

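Softmax formulation:

$$p(w_O \mid w_I) = \frac{\exp\left({v'_{w_O}}^{\top} v_{w_I}\right)}{\sum_{w=1}^{W} \exp\left({v'_{w}}^{\top} v_{w_I}\right)}$$

where $v_w$ and $v'_w$ are the input and output vector representations of word $w$, and $W$ is the vocabulary size.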
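As a rough illustration of why the full softmax is expensive, the sketch below scores one input vector against every output vector in the vocabulary before normalizing. The vectors are made-up toy values and `softmax_prob` is a hypothetical helper name; subtracting the maximum score is a standard numerical-stability step that is not part of the definition.

```python
import math


def softmax_prob(v_in, out_vectors, target):
    """p(target | input) = exp(v'_target . v_in) / sum over all words w of exp(v'_w . v_in)."""
    # One dot product per vocabulary word: this is the O(W) cost.
    scores = [sum(a * b for a, b in zip(v_out, v_in)) for v_out in out_vectors]
    m = max(scores)  # shift scores for numerical stability; the ratio is unchanged
    exps = [math.exp(s - m) for s in scores]
    return exps[target] / sum(exps)


# Toy 2-dimensional vectors for a 3-word vocabulary (assumed values).
v_in = [0.5, -0.2]
out_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
probs = [softmax_prob(v_in, out_vectors, w) for w in range(len(out_vectors))]
print(probs)
```

With a real vocabulary of hundreds of thousands of words, the list comprehension over `out_vectors` is the step that the tricks in this post try to avoid.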

Tricks:
1) Hierarchical Softmax
The main advantage is that instead of evaluating W output nodes in the neural network to obtain the probability distribution, it is needed to evaluate only about log2(W) nodes.
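In short, it builds a binary tree with the vocabulary words as leaves, so a word's probability is computed along a single root-to-leaf path instead of over all W words, which reduces computation.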
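The binary-tree idea can be sketched as follows. This is a simplified illustration that assumes a balanced tree stored in heap layout with a power-of-two vocabulary; word2vec itself uses a Huffman tree, and the names `hs_prob` and `node_vectors` are hypothetical. Each inner node makes a left/right decision with a sigmoid, and a word's probability is the product of the decisions on its path.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hs_prob(word, h, node_vectors, vocab_size):
    """Return (probability of `word`, number of inner nodes evaluated)."""
    node = vocab_size + word  # leaves occupy heap indices W .. 2W-1
    prob, visited = 1.0, 0
    while node > 1:           # walk from the leaf up to the root (index 1)
        parent = node // 2
        sign = 1.0 if node % 2 == 0 else -1.0  # left child: +score, right child: -score
        score = sum(a * b for a, b in zip(node_vectors[parent], h))
        prob *= sigmoid(sign * score)
        node = parent
        visited += 1
    return prob, visited


random.seed(0)
W, dim = 8, 4  # toy vocabulary size and vector dimension (assumed values)
# One vector per inner node 1 .. W-1; index 0 is unused padding.
node_vectors = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(W)]
h = [random.uniform(-1, 1) for _ in range(dim)]

results = [hs_prob(w, h, node_vectors, W) for w in range(W)]
total = sum(p for p, _ in results)
print(total, [n for _, n in results])
```

Because sigmoid(x) + sigmoid(-x) = 1 at every inner node, the leaf probabilities form a valid distribution, yet each one touches only log2(W) nodes rather than all W.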

2) Negative Sampling
Sorry, I don't understand this one.

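3) Subsampling of Frequent Words
Each word $w_i$ in the training set is discarded with probability:

$$P(w_i) = 1 - \sqrt{\frac{t}{f(w_i)}}$$

where $f(w_i)$ is the word's frequency and $t$ is a threshold, typically around $10^{-5}$.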
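A small sketch of the subsampling rule, assuming the discard probability from the word2vec paper, P(w) = 1 - sqrt(t / f(w)), with f the word's relative frequency and t the threshold. The helper name `discard_prob` and the clamp to zero for words rarer than t are assumptions added for illustration.

```python
import math


def discard_prob(freq, t=1e-5):
    """Probability of dropping a word with relative frequency `freq`."""
    # Words with freq <= t would give a negative value, so clamp to 0 (never discarded).
    return max(0.0, 1.0 - math.sqrt(t / freq))


# Hypothetical relative frequencies, from very common to very rare.
for f in (1e-2, 1e-4, 1e-5, 1e-6):
    print(f"f={f:g}: discard with p={discard_prob(f):.4f}")
```

Very frequent words are dropped most of the time, while words at or below the threshold are always kept, which is what rebalances the training data toward rarer words.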
