
Python comprehensions (in great detail)

2019-07-19  小董不太懂

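Comprehensions are a characteristic feature of Python: a construct that builds a new data sequence from an existing one. There are three kinds, supported in both Python 2 (dict and set comprehensions from 2.7 onward) and Python 3: list comprehensions, dict comprehensions, and set comprehensions.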


The basic format of a list comprehension is:
[expression for variable in iterable] or [expression for variable in iterable if condition]

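It comes in two forms:

* [x for x in data if condition]

Here the if acts as a filter: only the elements of data that satisfy the condition are kept, and they are collected into a new list.

* [exp1 if condition else exp2 for x in data]

Here if...else selects the value: an element of data that satisfies the condition is processed as exp1, otherwise as exp2, and the results are collected into a new list.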
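The difference between the two forms can be sketched as follows. The sample list and the variable names are made up for illustration; the point is that a trailing if can shorten the result, while a conditional expression in front of for always yields one output per input.

```python
data = [1, 2, 3, 4, 5, 6]

# Form 1: a trailing `if` filters, so the result may be shorter than data.
evens = [x for x in data if x % 2 == 0]

# Form 2: a conditional expression before `for` maps every element,
# so the result always has the same length as data.
labels = ["even" if x % 2 == 0 else "odd" for x in data]

# The two can be combined: filter first, then choose a value.
combined = ["big" if x > 4 else "small" for x in data if x % 2 == 0]

print(evens)     # [2, 4, 6]
print(labels)    # ['odd', 'even', 'odd', 'even', 'odd', 'even']
print(combined)  # ['small', 'small', 'big']
```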

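To make this concrete, consider the following template:
variable = [out_exp_res for out_exp in input_list if out_exp == 2]
out_exp_res:  the expression that produces each element of the list; it can be a call to a function that returns a value
for out_exp in input_list:  iterates over input_list and feeds each out_exp into the out_exp_res expression
if out_exp == 2:  filters which values are passed on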
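One way to read this template is as shorthand for an ordinary for loop. The sketch below assumes out_exp_res is a function call (the function body here is hypothetical, chosen only so the code runs) and checks that the comprehension and the expanded loop agree.

```python
def out_exp_res(value):
    # Hypothetical transformation, used only for this illustration.
    return value * 10

input_list = [1, 2, 3, 2]

# Comprehension form.
variable = [out_exp_res(out_exp) for out_exp in input_list if out_exp == 2]

# Equivalent explicit loop.
expanded = []
for out_exp in input_list:
    if out_exp == 2:
        expanded.append(out_exp_res(out_exp))

print(variable)              # [20, 20]
print(variable == expanded)  # True
```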

Here are a few more examples:

multiples = [i for i in range(30) if i % 3 == 0]
print(multiples)
Output: [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

def squared(x):
    return x * x
multiples = [squared(i) for i in range(30) if i % 3 == 0]
print(multiples)
Output: [0, 9, 36, 81, 144, 225, 324, 441, 576, 729]

data = ['driver', '2017-07-13', 1827.0, 2058.0, 978.0, 1636.0, 1863.0, 2537.0, 1061.0]
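(1) To pick the values greater than 2000 from this list, form ① looks like the natural choice: [x for x in data if x > 2000]. In Python 2 this returns ['driver', '2017-07-13', 2058.0, 2537.0]: the strings survive because CPython 2 orders values of different types by a fixed rule under which any string compares greater than any number (they are not treated as infinity). In Python 3 the same expression raises TypeError, because a str and an int cannot be ordered against each other.
(2) To transform elements rather than filter them, for example converting every float to an int while leaving the strings unchanged, use form ②: [int(x) if type(x) == float else x for x in data], which gives: ['driver', '2017-07-13', 1827, 2058, 978, 1636, 1863, 2537, 1061]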
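If the goal in (1) is really "numbers greater than 2000", one possible approach that behaves the same on Python 2 and Python 3 is to check the type before comparing. This is a sketch rather than the only way to do it; the name large is made up for the example.

```python
data = ['driver', '2017-07-13', 1827.0, 2058.0, 978.0, 1636.0, 1863.0, 2537.0, 1061.0]

# Keep only real numbers before comparing, so a string is never compared with 2000.
# bool is excluded explicitly because it is a subclass of int.
large = [x for x in data
         if isinstance(x, (int, float)) and not isinstance(x, bool) and x > 2000]

print(large)  # [2058.0, 2537.0]
```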

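Two examples are hardly enough, so let's practice by typing in some code ourselves.
Example 1: from a list of strings, drop those whose length is 3 or less and convert the rest to uppercase: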

>>> names = ['Bob','Tom','alice','Jerry','Wendy','Smith']
>>> new_names = [name.upper()for name in names if len(name)>3]
>>> print(new_names)
['ALICE', 'JERRY', 'WENDY', 'SMITH']

Example 2: generate a list of times of day at 5-minute intervals:

>>> time = ['%.2d:%.2d' % (h, m) for h in range(24) for m in range(0, 60, 5)]
>>> print(time)
['00:00', '00:05', '00:10', '00:15', '00:20', '00:25', '00:30', '00:35', '00:40', '00:45', '00:50', '00:55', '01:00', '01:05', '01:10', '01:15', '01:20', '01:25', '01:30', '01:35', '01:40', '01:45', '01:50', '01:55', '02:00', '02:05', '02:10', '02:15', '02:20', '02:25', '02:30', '02:35', '02:40', '02:45', '02:50', '02:55', '03:00', '03:05', '03:10', '03:15', '03:20', '03:25', '03:30', '03:35', '03:40', '03:45', '03:50', '03:55', '04:00', '04:05', '04:10', '04:15', '04:20', '04:25', '04:30', '04:35', '04:40', '04:45', '04:50', '04:55', '05:00', '05:05', '05:10', '05:15', '05:20', '05:25', '05:30', '05:35', '05:40', '05:45', '05:50', '05:55', '06:00', '06:05', '06:10', '06:15', '06:20', '06:25', '06:30', '06:35', '06:40', '06:45', '06:50', '06:55', '07:00', '07:05', '07:10', '07:15', '07:20', '07:25', '07:30', '07:35', '07:40', '07:45', '07:50', '07:55', '08:00', '08:05', '08:10', '08:15', '08:20', '08:25', '08:30', '08:35', '08:40', '08:45', '08:50', '08:55', '09:00', '09:05', '09:10', '09:15', '09:20', '09:25', '09:30', '09:35', '09:40', '09:45', '09:50', '09:55', '10:00', '10:05', '10:10', '10:15', '10:20', '10:25', '10:30', '10:35', '10:40', '10:45', '10:50', '10:55', '11:00', '11:05', '11:10', '11:15', '11:20', '11:25', '11:30', '11:35', '11:40', '11:45', '11:50', '11:55', '12:00', '12:05', '12:10', '12:15', '12:20', '12:25', '12:30', '12:35', '12:40', '12:45', '12:50', '12:55', '13:00', '13:05', '13:10', '13:15', '13:20', '13:25', '13:30', '13:35', '13:40', '13:45', '13:50', '13:55', '14:00', '14:05', '14:10', '14:15', '14:20', '14:25', '14:30', '14:35', '14:40', '14:45', '14:50', '14:55', '15:00', '15:05', '15:10', '15:15', '15:20', '15:25', '15:30', '15:35', '15:40', '15:45', '15:50', '15:55', '16:00', '16:05', '16:10', '16:15', '16:20', '16:25', '16:30', '16:35', '16:40', '16:45', '16:50', '16:55', '17:00', '17:05', '17:10', '17:15', '17:20', '17:25', '17:30', '17:35', '17:40', '17:45', '17:50', '17:55', '18:00', '18:05', '18:10', '18:15', '18:20', '18:25', 
'18:30', '18:35', '18:40', '18:45', '18:50', '18:55', '19:00', '19:05', '19:10', '19:15', '19:20', '19:25', '19:30', '19:35', '19:40', '19:45', '19:50', '19:55', '20:00', '20:05', '20:10', '20:15', '20:20', '20:25', '20:30', '20:35', '20:40', '20:45', '20:50', '20:55', '21:00', '21:05', '21:10', '21:15', '21:20', '21:25', '21:30', '21:35', '21:40', '21:45', '21:50', '21:55', '22:00', '22:05', '22:10', '22:15', '22:20', '22:25', '22:30', '22:35', '22:40', '22:45', '22:50', '22:55', '23:00', '23:05', '23:10', '23:15', '23:20', '23:25', '23:30', '23:35', '23:40', '23:45', '23:50', '23:55']

Example 3: build a list of tuples (x, y), where x is an even number and y is an odd number, both taken from range(5), i.e. 0 to 4:

pairs = [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1]
print(pairs)

Output: [(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
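A comprehension with several for and if clauses reads left to right in the same order as the equivalent nested loops. The sketch below expands Example 3 into explicit loops to show that order; the name expanded is introduced only for the comparison.

```python
pairs = [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1]

# The same logic written as nested loops, clause by clause.
expanded = []
for x in range(5):
    if x % 2 == 0:
        for y in range(5):
            if y % 2 == 1:
                expanded.append((x, y))

print(pairs == expanded)  # True
```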

Example 4: build the list 3, 6, 9 (the last column) from M:

M = [[1,2,3],[4,5,6],[7,8,9]]
list_1 = [row[2] for row in M]
print(list_1)

Output: [3, 6, 9]

Example 5: build the list 1, 5, 9 (the main diagonal) from M:

M = [[1,2,3],[4,5,6],[7,8,9]]
list_1 = [M[x][x] for x in range(len(M)) ]
print(list_1)

Output: [1, 5, 9]

Example 6: compute the element-wise products of the matrices M and N:

M = [[1,2,3],[4,5,6],[7,8,9]]
N = [[2,2,2],[3,3,3], [4,4,4]]
products = [M[row][col]*N[row][col] for row in range(3) for col in range(3)]
print(products)

Output: [2, 4, 6, 12, 15, 18, 28, 32, 36]
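Example 6 flattens the result into a single list. If the 3x3 shape should be kept, a nested comprehension is one option. The sketch below uses zip to pair rows and elements instead of indexing, which is a design choice of this example and not something the original code does.

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
N = [[2, 2, 2], [3, 3, 3], [4, 4, 4]]

# The outer comprehension walks over pairs of rows,
# the inner one over pairs of elements within those rows.
products = [[m * n for m, n in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]
print(products)  # [[2, 4, 6], [12, 15, 18], [28, 32, 36]]

# Flattening the nested result gives the same list as the index-based version.
flat = [value for row in products for value in row]
print(flat)      # [2, 4, 6, 12, 15, 18, 28, 32, 36]
```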

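Note:

Using () to create a generator:
Replacing the [] of a list comprehension with () gives a generator expression.

multiples = (i for i in range(30) if i % 3 == 0)
print(type(multiples))

Output: <class 'generator'>  (Python 2 prints <type 'generator'>)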
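Unlike a list, a generator produces its values lazily and can be consumed only once. A short sketch of that behaviour, reusing the multiples-of-3 example (the names first, second, rest, leftover and total are introduced here for illustration):

```python
multiples = (i for i in range(30) if i % 3 == 0)

# Values are computed on demand, one at a time.
first = next(multiples)
second = next(multiples)
print(first, second)  # 0 3

# list() consumes whatever is left.
rest = list(multiples)
print(rest)           # [6, 9, 12, 15, 18, 21, 24, 27]

# The generator is now exhausted.
leftover = list(multiples)
print(leftover)       # []

# A generator expression can be passed straight to a function such as sum()
# without building an intermediate list.
total = sum(i for i in range(30) if i % 3 == 0)
print(total)          # 135
```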

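Let's first look at the basic template of a dict comprehension: { key:value for key,value in existing_data_structure }
This differs a little from the list case because a dict entry has two parts, key and value, but the idea is much the same: the expression part can now operate on both key and value.
Here are the most common uses.
Example 1: a dict comprehension combined with enumerate:

strings = ['import','is','with','if','file','exception','shim','lucy']
index_by_word = {k:v for v,k in enumerate(strings)}
print(index_by_word)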
Output: {'import': 0, 'is': 1, 'with': 2, 'if': 3, 'file': 4, 'exception': 5, 'shim': 6, 'lucy': 7}

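Let's take this example a step further. Above, k is the string and v is its index. What happens if we swap k and v in the for clause?

strings = ['import','is','with','if','file','exception','shim','lucy']
word_by_index = {k:v for k,v in enumerate(strings)}
print(word_by_index)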
Output: {0: 'import', 1: 'is', 2: 'with', 3: 'if', 4: 'file', 5: 'exception', 6: 'shim', 7: 'lucy'}

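Clearly, the order in which k and v are unpacked in the for clause decides which one becomes the key and which one the value.
For more on the enumerate() function, see: https://www.runoob.com/python/python-func-enumerate.html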
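enumerate() also accepts a start argument, which helps when positions should begin at 1 instead of 0. A small sketch with a shortened word list (the name position_by_word is made up for this example):

```python
strings = ['import', 'is', 'with']

# enumerate(strings, start=1) yields (1, 'import'), (2, 'is'), (3, 'with').
position_by_word = {word: pos for pos, word in enumerate(strings, start=1)}

print(position_by_word)  # {'import': 1, 'is': 2, 'with': 3}
```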
Example 2: swap the keys and values of a dict:

person = {'角色名':'宫本武藏','定位':'刺客'}
person_reverse = {k:v for v,k in person.items()}
# person_reverse = {v:k for k,v in person.items()}  # this also works
print(person_reverse)
Output: {'宫本武藏': '角色名', '刺客': '定位'}
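Swapping keys and values works cleanly only when the values are unique (and hashable). If two keys share a value, the later entry silently overwrites the earlier one. The sketch below uses made-up data to show the effect, together with one possible way of keeping every key by grouping them into lists.

```python
# Hypothetical data in which two keys share the value 'admin'.
roles = {'Alice': 'admin', 'Bob': 'user', 'Carol': 'admin'}

# Plain inversion: 'Alice' is lost because 'Carol' overwrites her.
inverted = {role: name for name, role in roles.items()}
print(inverted)  # {'admin': 'Carol', 'user': 'Bob'}

# Grouping variant: collect every name that has a given role.
grouped = {role: [name for name, r in roles.items() if r == role]
           for role in set(roles.values())}
print(grouped['admin'])  # ['Alice', 'Carol']
print(grouped['user'])   # ['Bob']
```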

Example 3: the keys of the source data mix uppercase and lowercase letters; we want the sum of the values for each letter, regardless of case:

nums = {'a':10,'b':20,'A':5,'B':3,'d':4}
num_frequency = {k.lower(): nums.get(k.lower(), 0) + nums.get(k.upper(), 0)
                 for k in nums.keys()}
# nums is a dict; nums.get(k.lower(), 0) looks up the lowercase key in nums and
# returns its value if found, otherwise the default 0; nums.get(k.upper(), 0) works the same way
print(num_frequency)
Output: {'a': 15, 'b': 23, 'd': 4}
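The comprehension in Example 3 recomputes the same sum once for every spelling of a letter and only looks at the all-lowercase and all-uppercase forms of each key. For comparison, here is a single-pass loop that accumulates the totals. It is an alternative sketch, not the article's method, and the name totals is introduced for the example.

```python
nums = {'a': 10, 'b': 20, 'A': 5, 'B': 3, 'd': 4}

totals = {}
for key, value in nums.items():
    lowered = key.lower()
    # Add this value to whatever has been accumulated for the letter so far.
    totals[lowered] = totals.get(lowered, 0) + value

print(totals)  # {'a': 15, 'b': 23, 'd': 4}
```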

Example 4: we have a list of fruits and want the length of each fruit's name:

fruits = ['apple','orange','banana','mango','peach']
fruits_dict = {fruit:len(fruit) for fruit in fruits}
print(fruits_dict)
Output: {'apple': 5, 'orange': 6, 'banana': 6, 'mango': 5, 'peach': 5}

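Now for the basic template of a set comprehension: { expression for item in Sequence if conditional }
A set comprehension looks much like a list comprehension, but since the result is a set, it is usually used to take advantage of properties specific to sets.
If you are not familiar with the set data structure, see: https://segmentfault.com/a/1190000018109634?_ea=7068836

Example 1: First, an example that relies on the uniqueness of set elements. We have a list called names that stores names; the data is messy, with uppercase, lowercase, and duplicate entries. We want to remove the duplicates and normalize each name so that only its first letter is capitalized, which a set comprehension does directly: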

names = [ 'Bob', 'JOHN', 'alice', 'bob', 'ALICE', 'James', 'Bob','JAMES','jAMeS' ]
new_names = {n[0].upper() + n[1:].lower() for n in names}
print(new_names)
Output (a set is unordered, so the element order may differ between runs): {'Bob', 'James', 'John', 'Alice'}
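For plain ASCII names like these, str.capitalize() gives the same result as n[0].upper() + n[1:].lower(), and wrapping the set in sorted() makes the printed order reproducible. A sketch under those assumptions:

```python
names = ['Bob', 'JOHN', 'alice', 'bob', 'ALICE', 'James', 'Bob', 'JAMES', 'jAMeS']

# capitalize() uppercases the first character and lowercases the rest.
new_names = {n.capitalize() for n in names}

# sorted() returns a list, so the output order is deterministic.
print(sorted(new_names))  # ['Alice', 'Bob', 'James', 'John']
```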
