A Minimal Guide to line_profiler, a Python Performance Profiling Tool

2021-01-08  井底蛙蛙呱呱呱

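Python is often criticized for being slow, but we still have to use it, so speed problems have to be solved by optimizing the code. line_profiler is a profiling tool for exactly that: it reports the total run time and the average time per execution of every line of a function, so that we can focus optimization on the most expensive lines.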

Installation:

pip install line_profiler

1. Minimal usage

Below we use line_profiler to see where the time goes, line by line, in a simple example.

import random

def do_stuff():
    numbers = []
    for i in range(1000):
        numbers.append(random.randint(0,1000))
    s = sum(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    return None
 

if __name__ == '__main__':
    from line_profiler import LineProfiler
    
    lp = LineProfiler()
    lp_wrapper = lp(do_stuff)
    lp_wrapper()
    lp.print_stats()

Now we run it from the command line to see where the time went:

$ root@cpu-k8ss-0 # python loader_test.py 
Timer unit: 1e-06 s

Total time: 0.0075 s
File: loader_test.py
Function: do_stuff at line 88

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    88                                           def do_stuff():
    89         1          1.0      1.0      0.0      numbers = []
    90      1001        697.0      0.7      9.3      for i in range(1000):
    91      1000       6093.0      6.1     81.2          numbers.append(random.randint(0,1000))
    92         1         10.0     10.0      0.1      s = sum(numbers)
    93         1        240.0    240.0      3.2      l = [numbers[i]/43 for i in range(len(numbers))]
    94         1        458.0    458.0      6.1      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    95         1          1.0      1.0      0.0      return None

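The output shows that the profiled function is do_stuff, which starts at line 88 of the script, and that it took 0.0075 s in total. The Timer unit line means that the values in the Time and Per Hit columns are in microseconds (1e-06 s).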

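The columns that follow have these meanings:

Line #: the line number in the file.
Hits: the number of times the line was executed.
Time: the total time spent executing the line, in timer units.
Per Hit: the average time per execution of the line, in timer units.
% Time: the time spent on the line as a percentage of the total time recorded for the function.
Line Contents: the source code of the line.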
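As a quick check on how the columns relate, the sketch below recomputes Per Hit and % Time from the Hits and Time values printed above. The `rows` list is copied from that output; the helper name `derive_columns` is hypothetical and not part of line_profiler.

```python
# (line number, hits, time in timer units) copied from the profile output above
rows = [
    (89, 1, 1.0),
    (90, 1001, 697.0),
    (91, 1000, 6093.0),
    (92, 1, 10.0),
    (93, 1, 240.0),
    (94, 1, 458.0),
    (95, 1, 1.0),
]
TIMER_UNIT = 1e-06  # seconds per timer unit, from the "Timer unit" line


def derive_columns(rows):
    """Recompute (Per Hit, % Time) for each line (hypothetical helper)."""
    total = sum(time for _, _, time in rows)
    return {
        line: (round(time / hits, 1), round(100 * time / total, 1))
        for line, hits, time in rows
    }


derived = derive_columns(rows)
total_seconds = sum(time for _, _, time in rows) * TIMER_UNIT
print(f"Total time: {total_seconds:.4f} s")
for line, (per_hit, percent) in derived.items():
    print(f"line {line}: per hit {per_hit}, {percent}% of total")
```

The recomputed values match the printed columns, for example 6.1 per hit and 81.2% for line 91.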

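The output shows that most of the time goes to the first for loop, which generates the data: lines 90 and 91 together account for 90.5% (9.3% + 81.2%). We can try to optimize it with a list comprehension.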
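The rewritten function is not listed in the original, so the sketch below is reconstructed from the Line Contents column of the profile output that follows. Note that it draws from `random.randint(1,100)` rather than `random.randint(0,1000)`, exactly as that output shows; this probably has little effect on the timing, but it does mean the two runs are not strictly identical workloads.

```python
import random


def do_stuff():
    # Build the list in a single list comprehension instead of an
    # explicit for loop with repeated numbers.append(...) calls.
    numbers = [random.randint(1, 100) for i in range(1000)]
    s = sum(numbers)
    l = [numbers[i] / 43 for i in range(len(numbers))]
    m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
    return None


do_stuff()
```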

$ root@cpu-k8ss-0 # python loader_test.py 
Timer unit: 1e-06 s

Total time: 0.00503 s
File: loader_test.py
Function: do_stuff at line 88

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    88                                           def do_stuff():
    89                                               # numbers = []
    90                                               # for i in range(1000):
    91                                               # numbers.append(random.randint(0,1000))
    92         1       4434.0   4434.0     88.2      numbers = [random.randint(1,100) for i in range(1000)]
    93         1          8.0      8.0      0.2      s = sum(numbers)
    94         1        194.0    194.0      3.9      l = [numbers[i]/43 for i in range(len(numbers))]
    95         1        393.0    393.0      7.8      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    96         1          1.0      1.0      0.0      return None

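With the list comprehension, the total time drops from 0.0075 s to 0.00503 s, and the share of the run time spent generating the data falls slightly, from 90.5% to 88.2%.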
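Because line_profiler adds some overhead to every line it traces, the loop version (about 2000 traced line executions) may be penalized more than the one-line comprehension. A rough cross-check with the standard library's `timeit` module, sketched below, avoids that bias. The function names `with_loop` and `with_comprehension` are hypothetical, and the timings depend on the machine, so no expected output is shown.

```python
import random
import timeit


def with_loop():
    # Original approach: explicit loop with append
    numbers = []
    for i in range(1000):
        numbers.append(random.randint(0, 1000))
    return numbers


def with_comprehension():
    # Optimized approach: a single list comprehension
    return [random.randint(0, 1000) for i in range(1000)]


RUNS = 200  # number of calls to average over
loop_time = timeit.timeit(with_loop, number=RUNS)
comp_time = timeit.timeit(with_comprehension, number=RUNS)
print(f"for loop:           {loop_time / RUNS * 1e6:.1f} us per call")
print(f"list comprehension: {comp_time / RUNS * 1e6:.1f} us per call")
```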

Passing arguments to the function

Next we move the generation of numbers outside the function and pass the data in as an argument:

import random

def do_stuff(numbers):
    # numbers = []
    # for i in range(1000):
    # numbers.append(random.randint(0,1000))
    #numbers = [random.randint(1,100) for i in range(1000)]
    s = sum(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    return None

if __name__ == '__main__':
    from line_profiler import LineProfiler
    lp = LineProfiler()
    lp_wrapper = lp(do_stuff)
    # Build the argument and pass it in
    numbers = [random.randint(1,100) for i in range(100000)]
    lp_wrapper(numbers)
    lp.print_stats()

Run it from the command line:

$ root@cpu-k8ss-0 # python loader_test.py 
Timer unit: 1e-06 s

Total time: 0.084887 s
File: loader_test.py
Function: do_stuff at line 88

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    88                                           def do_stuff(numbers):
    89                                               # numbers = []
    90                                               # for i in range(1000):
    91                                               # numbers.append(random.randint(0,1000))
    92                                               #numbers = [random.randint(1,100) for i in range(1000)]
    93         1        788.0    788.0      0.9      s = sum(numbers)
    94         1      26504.0  26504.0     31.2      l = [numbers[i]/43 for i in range(len(numbers))]
    95         1      57593.0  57593.0     67.8      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    96         1          2.0      2.0      0.0      return None

Now the run time is dominated by the division (31.2%) and the string concatenation (67.8%).
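One possible next step, not taken in the original, is to iterate over the list directly instead of indexing with `range(len(numbers))`. The sketch below is an assumption about what might help; `do_stuff_direct` is a hypothetical name, both functions return their results only so that they can be compared, and any gain should be confirmed by profiling again.

```python
import random


def do_stuff(numbers):
    # Original version: index into the list on every iteration
    s = sum(numbers)
    l = [numbers[i] / 43 for i in range(len(numbers))]
    m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
    return s, l, m


def do_stuff_direct(numbers):
    # Hypothetical variant: iterate over the elements directly
    s = sum(numbers)
    l = [x / 43 for x in numbers]
    m = ['hello' + str(x) for x in numbers]
    return s, l, m


numbers = [random.randint(1, 100) for i in range(100000)]
# Both variants must produce identical results
assert do_stuff(numbers) == do_stuff_direct(numbers)
```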

Showing the run time of inner function calls

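In the examples above all the code lived in one function, and we only looked at the line timings of that function. line_profiler can also show the line timings inside the functions it calls.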

import random

def generate_numbers(n):
    numbers = [random.randint(1,100) for i in range(1000)]
    return numbers

 
def do_stuff(n):
    numbers = generate_numbers(n)
    s = sum(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    return None

if __name__ == '__main__':
    from line_profiler import LineProfiler
    lp = LineProfiler()
    lp_wrapper = lp(do_stuff)
    # Also record per-line timings for the called function
    lp.add_function(generate_numbers)
    # Build the argument and pass it in
    n = 100000
    lp_wrapper(n)
    lp.print_stats()

Run it from the command line:

$ root@cpu-k8ss-0 # python loader_test.py 
Timer unit: 1e-06 s

Total time: 0.005112 s
File: loader_test.py
Function: generate_numbers at line 87

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    87                                           def generate_numbers(n):
    88         1       5111.0   5111.0    100.0      numbers = [random.randint(1,100) for i in range(1000)]
    89         1          1.0      1.0      0.0      return numbers

Total time: 0.005709 s
File: loader_test.py
Function: do_stuff at line 92

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    92                                           def do_stuff(n):
    93         1       5125.0   5125.0     89.8      numbers = generate_numbers(n)
    94         1          8.0      8.0      0.1      s = sum(numbers)
    95         1        189.0    189.0      3.3      l = [numbers[i]/43 for i in range(len(numbers))]
    96         1        387.0    387.0      6.8      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
    97         1          0.0      0.0      0.0      return None

The output now lists the per-line run time of both functions, which is very convenient.
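One detail in the listing above: generate_numbers ignores its argument n and always builds 1000 numbers, which is why the timings stay small even though n = 100000 is passed in. If the list size is meant to follow the argument, a minimal fix is to use `n` in the comprehension, as sketched below. This changes the workload, so the timings shown above would no longer apply.

```python
import random


def generate_numbers(n):
    # Use the argument n instead of the hard-coded 1000
    numbers = [random.randint(1, 100) for i in range(n)]
    return numbers


print(len(generate_numbers(100000)))  # prints 100000
```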

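line_profiler can also collect statistics through a decorator: mark the functions you want to optimize with @profile, then run the script from the command line with $ kernprof -l script_to_profile.py.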
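A minimal sketch of the decorator workflow is shown below. `kernprof -l` injects a `profile` decorator into Python's builtins while the script runs; the no-op fallback is an addition here (not from the original) so that the same file still runs under plain `python`. The file name `script_to_profile.py` follows the command above.

```python
# script_to_profile.py
import builtins
import random

# Under `kernprof -l`, `profile` is available as a builtin.
# Otherwise fall back to a decorator that does nothing (assumption:
# the script should stay runnable without kernprof).
profile = getattr(builtins, 'profile', lambda func: func)


@profile
def do_stuff(numbers):
    s = sum(numbers)
    l = [numbers[i] / 43 for i in range(len(numbers))]
    m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
    return None


numbers = [random.randint(1, 100) for i in range(100000)]
do_stuff(numbers)
```

Running `kernprof -l -v script_to_profile.py` prints the per-line statistics right away. Without `-v`, the results are written to `script_to_profile.py.lprof` and can be viewed with `python -m line_profiler script_to_profile.py.lprof`.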

