Python 2.7.x String Comparison Pitfalls

2018-11-16  LaxChan

String prefix notes
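As a minimal sketch of what the prefixes mean (the variable names are illustrative, not from the original post): a `u` prefix produces text, while a `b` prefix produces bytes, which on Python 2.7 is simply an alias for `str`. Encoding text to UTF-8 gives bytes of a different length, and decoding brings the original text back.

```python
# Escapes are used instead of literal CJK characters so the file needs no
# source-encoding declaration on Python 2.7.
text = u"\u4e2d\u6587"        # u prefix: text (unicode on 2.7, str on 3.x)
data = text.encode("utf-8")   # encoded bytes (str on 2.7, bytes on 3.x)
raw = b"abc"                  # b prefix: bytes; on 2.7 an alias for str

print(len(text))                       # 2 characters
print(len(data))                       # 6 bytes, 3 per character in UTF-8
print(data.decode("utf-8") == text)    # True: decoding restores the text
```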

The problem

a12 = [s for s in a1 if s in a2]

First attempt

a12 = [s for s in a1 if s.encode("utf-8") in a2]
('diff|loop speed ', 4998.8330078125ms, '| encode loop speed ', 123.25ms)

Using the intersection function

a13 = list(set(a1).intersection(set(a2)))
('diff|loop speed ', 4998.8330078125ms, '| encode loop speed ', 123.25ms, '|intersection speed : ', 3.626953125ms)
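One way to combine the two ideas above is to normalize every element to text once and then do set lookups, so that no per-comparison conversion happens. The following is only a sketch under that assumption; `to_text` and `common_items` are hypothetical helper names, not functions from the original post, and UTF-8 is assumed as the encoding of any byte strings.

```python
def to_text(s, encoding="utf-8"):
    """Return s as text, decoding it first if it is a byte string."""
    if isinstance(s, bytes):      # bytes is str on Python 2.7
        return s.decode(encoding)
    return s


def common_items(a, b, encoding="utf-8"):
    """Items of a (original order kept) whose text form also appears in b."""
    # Normalize b once and store it in a set for constant-time lookups.
    text_b = {to_text(s, encoding) for s in b}
    return [s for s in a if to_text(s, encoding) in text_b]


mixed_a = [u"a", u"\u4e2d", u"x"]
mixed_b = [b"a", u"\u4e2d".encode("utf-8"), u"y"]
print(common_items(mixed_a, mixed_b))
```

Unlike `list(set(a1).intersection(set(a2)))`, this keeps the order and duplicates of the first list, which may or may not be what a given caller wants.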

Preliminary conclusion

Comparing the approaches

def loopComp(a,b):
    c=[s for s in a if s in b]
    print('loopComp ret size : ',len(c))

def intersectionComp(a,b):
    c=list(set(a).intersection(set(b)))
    print('intersectionComp ret size : ',len(c))

def encodeIntersectionComp(a,b):
    a1=[s.encode("utf-8") for s in a]
    c=list(set(a1).intersection(set(b)))
    print('encodeIntersectionComp ret size : ',len(c))

def encodeloopComp(a,b):
    c=[s for s in a if s.encode("utf-8") in b]
    print('encodeloopComp ret size : ',len(c))

print('==========same encode list==========')
%time loopComp(a1,a2)
%time encodeloopComp(a1,a2)
%time intersectionComp(a1,a2)
%time encodeIntersectionComp(a1,a2)

print('==========diff encode list==========')
%time loopComp(a1,a3)
%time encodeloopComp(a1,a3)
%time intersectionComp(a1,a3)
%time encodeIntersectionComp(a1,a3)
==========same encode list==========
('loopComp ret size : ', 3559)
CPU times: user 172 ms, sys: 3.1 ms, total: 175 ms
Wall time: 167 ms
('encodeloopComp ret size : ', 3559)
CPU times: user 4.79 s, sys: 4.86 ms, total: 4.8 s
Wall time: 4.82 s
('intersectionComp ret size : ', 3559)
CPU times: user 920 µs, sys: 0 ns, total: 920 µs
Wall time: 851 µs
('encodeIntersectionComp ret size : ', 3559)
CPU times: user 4.97 ms, sys: 0 ns, total: 4.97 ms
Wall time: 4.88 ms
==========diff encode list==========
('loopComp ret size : ', 3559)
CPU times: user 4.81 s, sys: 7.46 ms, total: 4.82 s
Wall time: 4.83 s
('encodeloopComp ret size : ', 3559)
CPU times: user 125 ms, sys: 0 ns, total: 125 ms
Wall time: 126 ms
('intersectionComp ret size : ', 3559)
CPU times: user 3.53 ms, sys: 0 ns, total: 3.53 ms
Wall time: 3.54 ms
('encodeIntersectionComp ret size : ', 3559)
CPU times: user 2.34 ms, sys: 0 ns, total: 2.34 ms
Wall time: 2.32 ms
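The `%time` lines above are IPython magics and will not run in a plain Python script. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The lists below are hypothetical stand-ins, since the post does not show how `a1`, `a2`, and `a3` were built, so the absolute numbers will differ from those reported above.

```python
import timeit

# Hypothetical test data: two overlapping lists of text strings.
list_a = [u"item-%d" % i for i in range(2000)]
list_b = [u"item-%d" % i for i in range(1000, 3000)]


def loop_common(x, y):
    # Each "in y" scans the list, so this is O(len(x) * len(y)).
    return [s for s in x if s in y]


def set_common(x, y):
    # Hash-based lookup: roughly O(len(x) + len(y)).
    return list(set(x).intersection(y))


loop_secs = timeit.timeit(lambda: loop_common(list_a, list_b), number=3)
set_secs = timeit.timeit(lambda: set_common(list_a, list_b), number=3)
print("loop: %.4fs  set: %.4fs" % (loop_secs, set_secs))
```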

Conclusion

Extension


==========same encode list==========
loopComp ret size :  3559
CPU times: user 129 ms, sys: 1.32 ms, total: 130 ms
Wall time: 129 ms
encodeloopComp ret size :  0
CPU times: user 253 ms, sys: 122 µs, total: 253 ms
Wall time: 253 ms
intersectionComp ret size :  3559
CPU times: user 605 µs, sys: 0 ns, total: 605 µs
Wall time: 706 µs
encodeIntersectionComp ret size :  0
CPU times: user 1.31 ms, sys: 0 ns, total: 1.31 ms
Wall time: 1.32 ms
==========diff encode list==========
loopComp ret size :  3559
CPU times: user 123 ms, sys: 0 ns, total: 123 ms
Wall time: 122 ms
encodeloopComp ret size :  0
CPU times: user 248 ms, sys: 0 ns, total: 248 ms
Wall time: 249 ms
intersectionComp ret size :  3559
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 689 µs
encodeIntersectionComp ret size :  0
CPU times: user 1.47 ms, sys: 0 ns, total: 1.47 ms
Wall time: 1.3 ms
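The output format above (no tuple parentheses around the printed values) suggests this second run was made on Python 3, although that is an inference rather than something the post states. If so, the zero-sized results for the encode variants are expected: on Python 3 a `bytes` object never compares equal to a `str`, so encoding one side guarantees no matches. A small sketch with made-up sample data:

```python
import sys

names = [u"alpha", u"beta"]                    # illustrative sample data
encoded = [s.encode("utf-8") for s in names]

# On Python 3 bytes never equal str, so this list is empty there.
# On Python 2.7 ASCII-only str and unicode compare equal, so it is not.
hits = [s for s in encoded if s in names]
print(sys.version_info[0], len(hits))

# Decoding back to text finds the matches on either version.
decoded_hits = [s for s in encoded if s.decode("utf-8") in names]
print(len(decoded_hits))
```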