Traversing the files in a directory: recursion, depth-first, and breadth-first
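The os.path.join() function
Syntax:
os.path.join(path1[, path2[, ...]])
Return value:
The path components joined into a single path.
Note: if any component is an absolute path, every component before the last absolute one is discarded.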
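To make the note about absolute paths concrete, here is a small sketch. It calls posixpath.join directly (os.path is that module on macOS and Linux) so the results do not depend on the machine running it; the paths themselves are only illustrative.

```python
import posixpath  # os.path is this module on macOS/Linux

# Components are joined with a single separator.
plain = posixpath.join("/Users/miraco", "PycharmProjects", "test.py")
print(plain)        # /Users/miraco/PycharmProjects/test.py

# An absolute component discards everything that came before it.
restarted = posixpath.join("/Users/miraco", "/tmp", "test.py")
print(restarted)    # /tmp/test.py

# With several absolute components, only the last one counts.
last_wins = posixpath.join("a", "/b", "/c", "d")
print(last_wins)    # /c/d

# An empty last component leaves a trailing separator.
trailing = posixpath.join("/Users/miraco", "")
print(trailing)     # /Users/miraco/
```

This is why the traversal code can safely join a directory with the bare names returned by os.listdir(): those names are never absolute.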
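import os

def getall(path):
    # list the entries (files and sub-directories) directly under path
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            # a directory: recurse into it first, then report it
            getall(filepath)
            print("目录:", filepath)   # 目录 = directory (label kept to match the output below)
        else:
            print("文件:", filename)   # 文件 = file

getall(r"/Users/miraco/PycharmProjects")  # put your own path here; this one is used as the example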
Output:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/miraco/PycharmProjects/untitled/test333.py
文件: .DS_Store
文件: pimrc2017.txt
文件: pimrc2017.txt
文件: globecom2017.txt
文件: wcnc2017.txt
目录: /Users/miraco/PycharmProjects/Paper Research/所有文章
文件: 物联网.txt
目录: /Users/miraco/PycharmProjects/Paper Research/物联网
文件: .DS_Store
文件: 众包.txt
目录: /Users/miraco/PycharmProjects/Paper Research/众包
文件: Combining Dynamic Clustering and Scheduling for Coordinated Multi-Point Transmission in LTE.pdf
文件: Capacity of Infrastructure-based Cooperative Vehicular Networks.pdf
文件: Cooperative Transmission in Cognitive and Energy Harvesting-based D2D Networks.pdf
文件: .DS_Store
文件: Cournot-Nash Equilibria for Bandwidth Allocation under Base-Station Cooperation.pdf
文件: A Benchmark for D2D in Cellular Networks- The Importance of Information.pdf
文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件: Hybrid Coordination Function Controlled Channel Access for Latency-Sensitive Tactile Applications .pdf
文件: 协作通信调研结果:.docx
文件: An Optimal LTE-V2I-Based Cooperative Communication Scheme for Vehicular Networks.pdf
文件: A D2D Mode Selection Scheme with Energy Consumption Minimization Underlaying Two-tier Heterogeneous Cellular Networks.pdf
文件: 08292169.pdf
文件: Power Allocation for Full-Duplex Cooperative Non-Orthogonal Multiple Access Systems.pdf
文件: ON:OFF Reporting Mechanism for Robust Cooperative Sensing in Cognitive IoT Networks.pdf
文件: 协作通信.txt
文件: User Scheduling for Non-orthogonal Transmission in UAV-Assisted Relay Network.pdf
文件: Computation Collaboration in Ultra Dense Network Integrated with Mobile Edge Computing.pdf
文件: High-Throughput and Fair Scheduling for Access Point Cooperation in Dense Wireless Networks.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/协作通信
文件: Delay Efficient Disconnected RSU Placement Algorithm for VANET Safety Applications.pdf
文件: On the Handover Security Key Update and Residence Management in LTE Networks.pdf
文件: Increasing the Security of Wireless Communication Through Relaying and Interference Generation.pdf
文件: A Semi-Outsourcing Secure Data Privacy Scheme for IoT Data Transmission.pdf
文件: Security Enhancement to Successive Interference Cancellation Algorithm for Non-Orthogonal Multiple Access (NOMA).pdf
文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件: A Comparative Study of Possible Solutions for Transmission of Vehicular Safety Messages in LTE-based Networks.pdf
文件: Privacy-Aware Offloading in Mobile-Edge Computing.pdf
文件: 安全隐私.txt
文件: Towards Scalable and Privacy Preserving Commercial Content Dissemination in Social Wireless Networks.pdf
文件: Fairness and Safety Capacity Oriented Resource Allocation Scheme for D2D Communications.pdf
文件: Physical Layer Security in D2D-enabled Cellular Networks- Artificial Noise Assisted.pdf
文件: Privacy-Preserving Data Forwarding in VANETs- A Personal-Social Behavior Based Approach.pdf
文件: Privacy-preserving and Multi-dimensional Range Query in Two-tiered Wireless Sensor Networks.pdf
文件: UAV Assisted Public Safety Communications with LTE-Advanced HetNets and FeICIC.pdf
文件: Dependent Interferer Arrangement for Physical Layer Security- Secrecy Outage Probability in Clustered Wireless Networks.pdf
文件: A Load Balancing Scheme for Supporting Safety Applications in Heterogeneous Software Defined LTE-V Networks.pdf
文件: Promoting Security and Efficiency in D2D Underlay Communication- A Bargaining Game Approach.pdf
文件: Enhancing Physical Layer Security of OFDM Systems Using Channel Shortening.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/安全隐私
文件: Content-Centric Event-Insensitive Big Data Reduction in Internet of Things .pdf
文件: Twitter as a Source for Spatial Traffic Information in Big Data-Enabled Self-Organizing Networks.pdf
文件: Edge Big Data-Enabled Low-Cost Indoor Localization Based on Bayesian Analysis of RSS.pdf
文件: Reliable Content Dissemination in Internet of Vehicles Using Social Big Data.pdf
文件: Big Data Driven Similarity Based U-Model for Online Social Networks.pdf
文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件: Multi-Keyword Fuzzy and Sortable Ciphertext Retrieval Scheme for big data .pdf
文件: Profit Maximization Auction and Data Management in Big Data Markets.pdf
文件: 大数据.txt
文件: Features Selection Model for Internet of e-Health Things using Big Data.pdf
文件: A Big Data Deep Reinforcement Learning Approach to Next Generation Green Wireless Networks.pdf
文件: Big Data Synchronization among Isolated Data Servers in Disaster.pdf
文件: A Hybrid Location Privacy Protection Scheme in Big Data Environment.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/大数据
文件: 自组织和传感器.txt
目录: /Users/miraco/PycharmProjects/Paper Research/自组织和传感器
文件: downtitle.cpython-36.pyc
目录: /Users/miraco/PycharmProjects/Paper Research/__pycache__
文件: researching.py
文件: 干扰协调管理缓解.txt
目录: /Users/miraco/PycharmProjects/Paper Research/干扰协调管理缓解
文件: D2d中继.txt
目录: /Users/miraco/PycharmProjects/Paper Research/D2d中继
文件: 刘绍博的论文调研.zip
文件: 车联网.txt
目录: /Users/miraco/PycharmProjects/Paper Research/车联网
文件: downtitle.py
文件: sortandfilter.py
文件: globecom2017.txt
文件: researching.py
文件: downtitle.py
文件: sortandfilter.py
文件: 运行脚本之前阅读.rtf
目录: /Users/miraco/PycharmProjects/Paper Research/代码
文件: wcnc2017.txt
文件: A Contract-Based Incentive Mechanism for Data Caching in Ultra-Dense Small-Cells Networks .pdf
文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件: Fine-grained Incentive Mechanism for Sensing Augmented Spectrum Database.pdf
文件: Distributed Caching via Rewarding- An Incentive Caching Model for ICN.pdf
文件: QoS-based Incentive Mechanism for Mobile Data Offloading.pdf
文件: Incentive Mechanism for Cached-Enabled Small Cell Sharing- A Stackelberg Game Approach.pdf
文件: 合作激励.txt
文件: 合作激励调研.docx
文件: Incentive Based Cooperative Content Caching in Social Wireless Networks.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/合作激励
目录: /Users/miraco/PycharmProjects/Paper Research
文件: .DS_Store
文件: convert.py
文件: test.py
文件: replica.conf.txt
文件: hosts.txt
文件: leetcode.py
文件: hosts2.txt
文件: encodings.xml
文件: hosts.iml
文件: profiles_settings.xml
目录: /Users/miraco/PycharmProjects/hosts/.idea/inspectionProfiles
文件: workspace.xml
文件: modules.xml
文件: misc.xml
目录: /Users/miraco/PycharmProjects/hosts/.idea
目录: /Users/miraco/PycharmProjects/hosts
文件: .DS_Store
文件: 666.py
文件: Wcnc151617Statistics.py
文件: Globecom141516.py
文件: WCNC2015.py
文件: downtitle.cpython-36.pyc
文件: exp3.cpython-36.pyc
文件: exp2.cpython-36.pyc
文件: exp.cpython-36.pyc
目录: /Users/miraco/PycharmProjects/untitled/__pycache__
文件: test.py
文件: exp2.py
文件: exp3.py
文件: downtitle.py
文件: test333.py
文件: exp.py
文件: encodings.xml
文件: profiles_settings.xml
目录: /Users/miraco/PycharmProjects/untitled/.idea/inspectionProfiles
文件: workspace.xml
文件: untitled.iml
文件: modules.xml
文件: misc.xml
目录: /Users/miraco/PycharmProjects/untitled/.idea
目录: /Users/miraco/PycharmProjects/untitled
Of course, the results can also be collected in lists and printed all at once:
import os

allfilepath = []   # full path of every file found
allfilename = []   # bare name of every file found

def getall(path):
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            getall(filepath)
        else:
            allfilepath.append(filepath)
            allfilename.append(filename)

getall(r"/Users/miraco/PycharmProjects")  # put your own path here
print("Files:", allfilename)
This prints the list of collected file names.
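One thing to watch with module-level lists is that they keep growing if getall() is called a second time. A possible variant, sketched here under the assumption that returning the results is acceptable, builds the lists inside the function. The name collect_files and the temporary demo tree are made up for illustration.

```python
import os
import tempfile

def collect_files(path):
    """Recursively collect (names, paths) of all files under path."""
    names, paths = [], []
    for entry in sorted(os.listdir(path)):   # sorted only to make the order repeatable
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            sub_names, sub_paths = collect_files(full)
            names.extend(sub_names)
            paths.extend(sub_paths)
        else:
            names.append(entry)
            paths.append(full)
    return names, paths

# Demo on a throwaway tree: root/a.txt and root/sub/b.txt
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        open(os.path.join(root, rel), "w").close()
    names, paths = collect_files(root)
    again = collect_files(root)      # a second call returns the same result, nothing piles up

print(names)  # ['a.txt', 'b.txt']
```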
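There are several ways to traverse a directory tree, notably depth-first and breadth-first.
Depth-first traversal, using an explicit stack in place of recursion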
import os

def getall(path):
    realfilelist = []
    mystack = []
    # push the starting directory
    mystack.append(path)
    while len(mystack) != 0:
        # pop the most recently pushed directory
        openpath = mystack.pop()
        # list everything in that directory
        filelist = os.listdir(openpath)
        for filename in filelist:
            abspath = os.path.join(openpath, filename)  # full path of the entry
            if os.path.isdir(abspath):
                # a directory: push it
                mystack.append(abspath)
            else:
                # a file: record it
                realfilelist.append(abspath)
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)
Output: (shown as a screenshot, image.png, in the original)
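The stack version above relies on os.path.isdir(), which follows symbolic links, so a link that points back to a parent directory can make the walk descend into the same tree again and again. The following is only a sketch of one way to guard against that, using os.scandir() and is_dir(follow_symlinks=False). The name getall_safe is hypothetical, and treating a symlink as an ordinary entry and skipping unreadable directories are assumptions, not something the original code does.

```python
import os
import tempfile

def getall_safe(path):
    """Stack-based depth-first walk that never descends into symlinked directories."""
    files = []
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    else:
                        files.append(entry.path)   # regular files and symlinks land here
        except PermissionError:
            # unreadable directory: skip it instead of aborting the whole walk
            continue
    return files

# Demo: root/sub/b.txt plus a link root/sub/loop that points back at root
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    open(os.path.join(root, "sub", "b.txt"), "w").close()
    try:
        os.symlink(root, os.path.join(root, "sub", "loop"))
    except (OSError, NotImplementedError):
        pass  # symlinks may be unavailable, e.g. on some Windows setups
    found = [os.path.relpath(p, root) for p in getall_safe(root)]

print(sorted(found))
```

The walk terminates even though the link forms a cycle, because the link is recorded as an entry rather than pushed onto the stack.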
A look at the collections module (material from Liao Xuefeng)
collections is a built-in Python module that provides many useful container classes.
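namedtuple
We know that a tuple can represent an immutable collection. For example, the 2D coordinates of a point can be written as:
>>> p = (1, 2)
However, looking at (1, 2), it is hard to tell that this tuple represents a coordinate.
Defining a class just for this would be overkill, and this is where namedtuple comes in handy: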
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x
1
>>> p.y
2
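namedtuple is a function that creates a custom tuple subclass. It fixes the number of elements and lets you refer to each element by attribute name instead of by index.
This makes it easy to define a data type that is immutable like a tuple but whose fields can be accessed by name.
We can verify that the Point object we created is an instance of a subclass of tuple: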
>>> isinstance(p, Point)
True
>>> isinstance(p, tuple)
True
Similarly, to represent a circle by its coordinates and radius, we can define it with namedtuple:
# namedtuple('name', [list of attribute names]):
Circle = namedtuple('Circle', ['x', 'y', 'r'])
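As a quick illustration of what the Circle type gives you, the sketch below reads fields by name, unpacks it like a plain tuple, and shows that it cannot be modified in place. The variable names and values are made up; _replace() and _asdict() are standard namedtuple methods.

```python
from collections import namedtuple
import math

Circle = namedtuple('Circle', ['x', 'y', 'r'])

c = Circle(x=0, y=0, r=2)
area = math.pi * c.r ** 2        # fields are read by name
x, y, r = c                      # it still unpacks like a plain tuple
bigger = c._replace(r=3)         # "modifying" returns a new object; c is unchanged
as_dict = c._asdict()            # mapping of field name -> value

print(c)        # Circle(x=0, y=0, r=2)
print(bigger)   # Circle(x=0, y=0, r=3)

try:
    c.r = 10                     # tuples are immutable, so this fails
    assigned = True
except AttributeError:
    assigned = False
print(assigned)  # False
```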
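deque
A list gives fast access by index, but inserting or deleting elements anywhere other than the end is slow: a list is stored contiguously, so every element after the insertion point has to be shifted, which gets expensive on large lists.
deque is a double-ended queue built for efficient insertion and deletion at both ends, which makes it a good fit for queues and stacks: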
>>> from collections import deque
>>> q = deque(['a', 'b', 'c'])
>>> q.append('x')
>>> q.appendleft('y')
>>> q
deque(['y', 'a', 'b', 'c', 'x'])
Besides append() and pop(), which list also has, deque supports appendleft() and popleft(), so elements can be added to or removed from the head very efficiently.
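The four methods can be combined to get either stack or queue behaviour from the same object. A minimal sketch, with made-up task names:

```python
from collections import deque

q = deque(['a', 'b', 'c'])
q.append('x')          # add on the right
q.appendleft('y')      # add on the left

right = q.pop()        # take from the right (stack behaviour, LIFO)
left = q.popleft()     # take from the left
print(right, left, q)  # x y deque(['a', 'b', 'c'])

# FIFO queue: append on the right, popleft on the left
tasks = deque()
for name in ('first', 'second', 'third'):
    tasks.append(name)
order = [tasks.popleft() for _ in range(len(tasks))]
print(order)           # ['first', 'second', 'third']
```

The popleft() pattern is the one the breadth-first traversal in this post depends on.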
defaultdict
With a dict, referencing a key that does not exist raises KeyError. If you want a default value returned instead, use defaultdict:
>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'N/A')
>>> dd['key1'] = 'abc'
>>> dd['key1'] # key1 exists
'abc'
>>> dd['key2'] # key2 does not exist, so the default is returned
'N/A'
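Note that the default value is produced by calling a function, and that function is passed in when the defaultdict is created.
Apart from supplying a default for missing keys (looking a key up with [] also stores the default under that key), defaultdict behaves exactly like dict.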
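A common use is grouping, where the factory is list. The sketch below groups a few file names by extension; the names are just samples in the spirit of the listing earlier in this post.

```python
from collections import defaultdict
import os

filenames = ['exp.py', 'test.py', 'hosts.txt', 'misc.xml', 'wcnc2017.txt']

by_ext = defaultdict(list)          # missing keys start out as an empty list
for name in filenames:
    ext = os.path.splitext(name)[1]
    by_ext[ext].append(name)

grouped = dict(by_ext)
print(grouped)
# {'.py': ['exp.py', 'test.py'], '.txt': ['hosts.txt', 'wcnc2017.txt'], '.xml': ['misc.xml']}

# Looking a key up with [] creates it; "in" and .get() do not.
before = '.pdf' in by_ext           # False
via_get = by_ext.get('.pdf')        # None, and still no '.pdf' key
via_index = by_ext['.pdf']          # [] and '.pdf' is now a key
after = '.pdf' in by_ext            # True
print(before, via_get, via_index, after)  # False None [] True
```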
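OrderedDict
In older versions of Python (before 3.7), the keys of a dict were not guaranteed to be ordered, so the order seen when iterating over a dict could not be relied on. Since Python 3.7 a plain dict preserves insertion order as well.
OrderedDict keeps keys in order on every version and adds order-aware methods such as move_to_end():
>>> from collections import OrderedDict
>>> d = dict([('a', 1), ('b', 2), ('c', 3)])
>>> d # on Python 3.5 and earlier the key order may look arbitrary
{'a': 1, 'c': 3, 'b': 2}
>>> od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> od # OrderedDict keeps the keys in order
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
Note that the keys of an OrderedDict are kept in insertion order, not sorted by the key itself: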
>>> od = OrderedDict()
>>> od['z'] = 1
>>> od['y'] = 2
>>> od['x'] = 3
>>> list(od.keys()) # keys come back in insertion order
['z', 'y', 'x']
OrderedDict can be used to implement a FIFO (first in, first out) dict: when the capacity is exceeded, the earliest-added key is removed first:
from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):

    def __init__(self, capacity):
        super(LastUpdatedOrderedDict, self).__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        containsKey = 1 if key in self else 0
        if len(self) - containsKey >= self._capacity:
            last = self.popitem(last=False)
            print('remove:', last)
        if containsKey:
            del self[key]
            print('set:', (key, value))
        else:
            print('add:', (key, value))
        OrderedDict.__setitem__(self, key, value)
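The class above is built on popitem(last=False), which removes the oldest entry. The short sketch below shows that call on its own, together with move_to_end(), so the eviction order is easier to follow. The keys and values are arbitrary.

```python
from collections import OrderedDict

od = OrderedDict()
for key in ('z', 'y', 'x'):
    od[key] = len(od)              # z -> 0, y -> 1, x -> 2

oldest = od.popitem(last=False)    # remove and return the earliest-inserted pair
print(oldest)                      # ('z', 0)

od['w'] = 3                        # new keys go to the end
od.move_to_end('y')                # an existing key can be moved to the end explicitly
keys = list(od.keys())
print(keys)                        # ['x', 'w', 'y']
```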
Counter
Counter is a simple counter. For example, counting how many times each character occurs:
>>> from collections import Counter
>>> c = Counter()
>>> for ch in 'programming':
... c[ch] = c[ch] + 1
...
>>> c
Counter({'g': 2, 'm': 2, 'r': 2, 'a': 1, 'i': 1, 'o': 1, 'n': 1, 'p': 1})
Counter is in fact also a subclass of dict. The result above shows that the characters 'g', 'm', and 'r' each appear twice, and every other character appears once.
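The explicit loop is useful for seeing what happens, but Counter can also count an iterable directly. A minimal sketch of that shortcut, along with most_common() and update():

```python
from collections import Counter

c = Counter('programming')         # counts every element of the iterable
print(c['g'], c['p'], c['z'])      # 2 1 0  (a missing key counts as 0, no KeyError)

top3 = c.most_common(3)            # ties come out in first-seen order on Python 3.7+
print(top3)                        # [('r', 2), ('g', 2), ('m', 2)]

c.update('gg')                     # add more observations
print(c['g'])                      # 4
```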
Breadth-first traversal: first in, first out
import os
import collections

def getall(path):
    queue = collections.deque([])  # the queue of directories still to visit
    realfilelist = []              # list that collects the file paths
    # enqueue the starting directory
    queue.append(path)
    while len(queue) != 0:
        onepath = queue.popleft()        # FIFO: take the element at the left end
        filelist = os.listdir(onepath)   # list the entries of that directory
        for filename in filelist:        # check every file or folder
            abspath = os.path.join(onepath, filename)  # build the full path
            if os.path.isdir(abspath):   # a directory
                queue.append(abspath)    # enqueue it
            else:
                realfilelist.append(abspath)  # a file: record its path
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)
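The only real difference between the stack version and the queue version is which end the next directory is taken from. To see the effect on visiting order, here is a sketch that runs both on a small temporary tree. The function name walk, its breadth_first flag, and the tree layout are made up for the demo, and the entries are sorted so the result is repeatable.

```python
import os
import tempfile
from collections import deque

def walk(root, breadth_first):
    """Return file paths relative to root.

    breadth_first=True  -> popleft(): oldest directory first (queue, FIFO)
    breadth_first=False -> pop():     newest directory first (stack, LIFO)
    """
    pending = deque([root])
    found = []
    while pending:
        current = pending.popleft() if breadth_first else pending.pop()
        for name in sorted(os.listdir(current)):
            full = os.path.join(current, name)
            if os.path.isdir(full):
                pending.append(full)
            else:
                found.append(os.path.relpath(full, root))
    return found

with tempfile.TemporaryDirectory() as root:
    # Layout: root/a/2.txt, root/b/3.txt, root/b/deep/1.txt
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "b", "deep"))
    for parts in (("a", "2.txt"), ("b", "3.txt"), ("b", "deep", "1.txt")):
        open(os.path.join(root, *parts), "w").close()
    bfs = walk(root, breadth_first=True)
    dfs = walk(root, breadth_first=False)

print(bfs)  # on macOS/Linux: ['a/2.txt', 'b/3.txt', 'b/deep/1.txt']
print(dfs)  # on macOS/Linux: ['b/3.txt', 'b/deep/1.txt', 'a/2.txt']
```

Breadth-first finishes both first-level directories before going into deep, while the stack version works through everything under b before it touches a.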