How to Use TextGrocery with Python 3
2019-10-11
郭彦超
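TextGrocery is an efficient short-text classification tool. Later on we will use it to train text rules so that works can be tagged automatically based on their content. The project's author no longer maintains it, and the latest release only supports Python 2. To use it under Python 3 as well, the following changes are needed.
First, install TextGrocery with pip: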
pip install tgrocery
# The project is no longer maintained by its author; the latest version is 0.14
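Module not found
- No module named 'converter'
Do not install a third-party converter package. TextGrocery ships its own converter module in its install directory. Edit the __init__ file and put a "." in front of converter so that it is imported as an explicit relative import.
# 1. Change /home/bigdata/anaconda3/lib/python3.7/site-packages/tgrocery/__init__.py to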
from .classifier import *
from .converter import *
# 2. In ./site-packages/tgrocery/classifier.py, add the leading "." as well
from .converter import GroceryTextConverter
from .learner import *
from .base import *
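The manual edits above could also be scripted. The following is a minimal sketch, not part of TextGrocery: the helper name `make_imports_relative` and the list of sibling module names are assumptions for illustration. It rewrites `from X import ...` lines into explicit relative imports, which Python 3 requires for modules inside the same package.

```python
import re

# Sibling modules inside the tgrocery package mentioned in this article.
# This list is an assumption; adjust it to the files actually present.
LOCAL_MODULES = ("classifier", "converter", "learner", "base")


def make_imports_relative(source, local_modules=LOCAL_MODULES):
    """Turn `from converter import X` into `from .converter import X`."""
    names = "|".join(re.escape(m) for m in local_modules)
    pattern = re.compile(
        r"^(\s*from\s+)(" + names + r")(\s+import\s)", re.MULTILINE
    )
    # Insert a dot between "from " and the module name; other lines are untouched.
    return pattern.sub(r"\1.\2\3", source)


if __name__ == "__main__":
    old = "from converter import GroceryTextConverter\nfrom learner import *\nimport os\n"
    print(make_imports_relative(old))
```

Lines that are already relative, or that import unrelated modules such as base64, are left unchanged. Back up the original files before running anything like this against site-packages.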
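- No module named 'cPickle'
Python 2's cPickle module no longer exists under that name in Python 3 (its C implementation was folded into the standard pickle module). First install pickle5, then edit the converter file:
pip install pickle5
vi ./site-packages/tgrocery/converter.py
Change import cPickle to import pickle5 as cPickle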
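As an alternative sketch (this is not what the steps above do): on Python 3 the standard library's pickle module already uses its C implementation when available, so a guarded import can serve both interpreters without an extra package. Whether TextGrocery's saved models need anything specific to pickle5 is not stated here, so treat this as an option to try rather than a guaranteed replacement.

```python
try:
    import cPickle  # Python 2: C-accelerated pickle
except ImportError:
    import pickle as cPickle  # Python 3: stdlib pickle under the old name

# Round-trip a small mapping to confirm the alias behaves like cPickle.
# The sample dict is made up for illustration.
payload = {"education": 0, "sports": 1}
restored = cPickle.loads(cPickle.dumps(payload, -1))
print(restored == payload)
```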
- No module named 'base'
# Edit site-packages/tgrocery/converter.py and put a "." in front of base
from .base import *
print is a function in Python 3 (parentheses are required)
# Edit site-packages/tgrocery/base.py
print(self.draw_table(
    zip(
        ['%.2f%%' % (s * 100) for s in self.accuracy_labels.values()],
        ['%.2f%%' % (s * 100) for s in self.recall_labels.values()]
    ),
    self.accuracy_labels.keys(),
    ('accuracy', 'recall')
))
NameError: name 'unicode' is not defined
Python 3 replaced unicode with str. Replace every occurrence of unicode in site-packages/tgrocery/classifier.py with str.
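Instead of editing every occurrence, a small compatibility alias is another option. This is a sketch under the assumption that the module only uses unicode for isinstance checks and conversions; the helper `to_text` is hypothetical and not part of TextGrocery.

```python
try:
    unicode  # exists on Python 2
except NameError:
    unicode = str  # Python 3: str is already Unicode text


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding bytes if necessary (hypothetical helper)."""
    if isinstance(value, bytes) and not isinstance(value, unicode):
        return value.decode(encoding)
    return unicode(value)


print(to_text("托福".encode("utf-8")))
```

Placing the alias near the top of a module keeps the rest of the file untouched, which makes the patch easier to review or revert.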
TypeError: The argument should be plain text
Comment out the following statements:
# vi site-packages/tgrocery/classifier.py
if not isinstance(text, str):
    raise TypeError('The argument should be plain text')
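One plausible reason this check starts failing after the unicode-to-str replacement, assuming the surrounding code encodes the text to UTF-8 before the test (those lines are not shown here): in Python 3, str.encode returns bytes, and bytes is not an instance of str. The snippet below only illustrates that language behavior; `is_plain_text` is a hypothetical alternative to removing the check entirely.

```python
text = "考生必读"

encoded = text.encode("utf-8")  # bytes on Python 3, not str

print(type(encoded).__name__)
print(isinstance(encoded, str))


def is_plain_text(value):
    """Accept both text and UTF-8 encoded bytes (hypothetical check)."""
    return isinstance(value, (str, bytes))
```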
Point the jieba.cache directory at the current install directory
# vi site-packages/jieba/__init__.py
self.tmp_dir = "/home/bigdata/anaconda3/lib/python3.7/site-packages/jieba/"
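Editing jieba's source may be avoidable: jieba's README documents that the default tokenizer, jieba.dt, exposes tmp_dir and cache_file attributes for choosing where the cache is written. A minimal sketch follows; the helper name and the example directory are assumptions, and the attribute has to be set before jieba builds its dictionary.

```python
import os


def use_cache_dir(tokenizer, cache_dir):
    """Point a jieba tokenizer's cache at cache_dir (created if missing)."""
    os.makedirs(cache_dir, exist_ok=True)
    tokenizer.tmp_dir = cache_dir
    return tokenizer


# Intended usage, assuming jieba is installed:
#   import jieba
#   use_cache_dir(jieba.dt, "/home/bigdata/jieba_cache")  # hypothetical path
#   jieba.initialize()
```

Setting the attribute at runtime survives a jieba upgrade, whereas a patched __init__.py is overwritten by the next pip install.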
'dict' object has no attribute 'iteritems'
In site-packages/tgrocery/converter.py, replace every iteritems with items.
All done. The official example runs as follows:
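For reference, the Python 3 form of such a loop looks like this. The dictionary below is a made-up stand-in, not TextGrocery's actual data.

```python
# Hypothetical token-to-index mapping used only for illustration.
tok2idx = {"托福": 0, "语法": 1, "高考": 2}

# Python 2: for tok, idx in tok2idx.iteritems(): ...
# Python 3: items() returns a lightweight view, so iteritems() no longer exists.
pairs = sorted(tok2idx.items(), key=lambda kv: kv[1])
idx2tok = [tok for tok, _ in pairs]
print(idx2tok)
```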
>>> from tgrocery import Grocery
>>> grocery = Grocery('sample')
>>> train_src = [
... ('education', '名师指导托福语法技巧:名词的复数形式'),
... ('education', '中国高考成绩海外认可 是“狼来了”吗?'),
... ('sports', '图文:法网孟菲尔斯苦战进16强 孟菲尔斯怒吼'),
... ('sports', '四川丹棱举行全国长距登山挑战赛 近万人参与')
... ]
>>> grocery.train(train_src)
Building prefix dict from the default dictionary ...
Dumping model to file cache /home/bigdata/anaconda3/lib/python3.7/site-packages/jieba/jieba.cache
Loading model cost 0.595 seconds.
Prefix dict has been built succesfully.
*
optimization finished, #iter = 3
Objective value = -1.092381
nSV = 8
<tgrocery.Grocery object at 0x7ffedbea5290>
>>> grocery.predict('考生必读:新托福写作考试评分标准')
<tgrocery.base.GroceryPredictResult object at 0x7ffed68e9610>
>>> grocery.predict('考生必读:新托福写作考试评分标准').accuracy_labels
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'GroceryPredictResult' object has no attribute 'accuracy_labels'
>>> grocery.predict('考生必读:新托福写作考试评分标准').dec_values
{'education': 0.03393735426359166, 'sports': -0.033937354263591644}
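Given the dec_values dictionary printed in the session above, the predicted label is presumably the one with the largest decision value. A minimal sketch using those numbers; the helper name `best_label` is an assumption and not part of TextGrocery's API.

```python
def best_label(dec_values):
    """Return the label with the highest decision value."""
    if not dec_values:
        raise ValueError("dec_values is empty")
    return max(dec_values, key=dec_values.get)


# Values copied from the session output above.
dec_values = {'education': 0.03393735426359166, 'sports': -0.033937354263591644}
print(best_label(dec_values))  # education
```

The gap between the two values is small here, which is unsurprising for a model trained on only four samples.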