Traversing the files in a directory with recursion, depth-first search, and breadth-first search

2018-06-24  夏威夷的芒果

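The os.path.join() function

Syntax:

os.path.join(path1[, path2[, ...]])

Return value:

The given path components joined into a single path.

Note: if any argument is an absolute path, every argument before it is discarded and joining restarts from that absolute path.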
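To make the note above concrete, here is a minimal sketch of both behaviours. The component names ("projects", "demo", "ignored") are made up for illustration, and the root directory is obtained with os.path.abspath(os.sep) so the example does not hard-code a platform.

```python
import os

# Relative components are joined with the platform's path separator.
joined = os.path.join("projects", "demo", "main.py")
print(joined)

# An absolute component resets the result: "ignored" is dropped.
absolute = os.path.abspath(os.sep)  # the filesystem root, e.g. "/" on macOS/Linux
reset = os.path.join("ignored", absolute, "demo")
print(reset)
```

This is why the traversal code below always passes the parent directory first and the bare entry name second.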

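import os

def getall(path):
    # List every entry (files and subdirectories) directly inside path.
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            # Recurse first, so a directory is printed after its contents.
            getall(filepath)
            print("目录:", filepath)   # "目录" means "directory"
        else:
            print("文件:", filename)   # "文件" means "file"

getall(r"/Users/miraco/PycharmProjects")   # put your own path here; this one is the example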

Output:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/miraco/PycharmProjects/untitled/test333.py
文件: .DS_Store
文件: pimrc2017.txt
文件: pimrc2017.txt
文件: globecom2017.txt
文件: wcnc2017.txt
目录: /Users/miraco/PycharmProjects/Paper Research/所有文章
文件: 物联网.txt
目录: /Users/miraco/PycharmProjects/Paper Research/物联网
文件: .DS_Store
文件: 众包.txt
目录: /Users/miraco/PycharmProjects/Paper Research/众包
文件: Combining Dynamic Clustering and Scheduling for Coordinated Multi-Point Transmission in LTE.pdf
文件: Capacity of Infrastructure-based Cooperative Vehicular Networks.pdf
文件: Cooperative Transmission in Cognitive and Energy Harvesting-based D2D Networks.pdf
文件: .DS_Store
文件: Cournot-Nash Equilibria for Bandwidth Allocation under Base-Station Cooperation.pdf
文件: A Benchmark for D2D in Cellular Networks- The Importance of Information.pdf
文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件: Hybrid Coordination Function Controlled Channel Access for Latency-Sensitive Tactile Applications .pdf
文件: 协作通信调研结果:.docx
文件: An Optimal LTE-V2I-Based Cooperative Communication Scheme for Vehicular Networks.pdf
文件: A D2D Mode Selection Scheme with Energy Consumption Minimization Underlaying Two-tier Heterogeneous Cellular Networks.pdf
文件: 08292169.pdf
文件: Power Allocation for Full-Duplex Cooperative Non-Orthogonal Multiple Access Systems.pdf
文件: ON:OFF Reporting Mechanism for Robust Cooperative Sensing in Cognitive IoT Networks.pdf
文件: 协作通信.txt
文件: User Scheduling for Non-orthogonal Transmission in UAV-Assisted Relay Network.pdf
文件: Computation Collaboration in Ultra Dense Network Integrated with Mobile Edge Computing.pdf
文件: High-Throughput and Fair Scheduling for Access Point Cooperation in Dense Wireless Networks.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/协作通信
文件: Delay Efficient Disconnected RSU Placement Algorithm for VANET Safety Applications.pdf
文件: On the Handover Security Key Update and Residence Management in LTE Networks.pdf
文件: Increasing the Security of Wireless Communication Through Relaying and Interference Generation.pdf
文件: A Semi-Outsourcing Secure Data Privacy Scheme for IoT Data Transmission.pdf
文件: Security Enhancement to Successive Interference Cancellation Algorithm for Non-Orthogonal Multiple Access (NOMA).pdf
文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件: A Comparative Study of Possible Solutions for Transmission of Vehicular Safety Messages in LTE-based Networks.pdf
文件: Privacy-Aware Offloading in Mobile-Edge Computing.pdf
文件: 安全隐私.txt
文件: Towards Scalable and Privacy Preserving Commercial Content Dissemination in Social Wireless Networks.pdf
文件: Fairness and Safety Capacity Oriented Resource Allocation Scheme for D2D Communications.pdf
文件: Physical Layer Security in D2D-enabled Cellular Networks- Artificial Noise Assisted.pdf
文件: Privacy-Preserving Data Forwarding in VANETs- A Personal-Social Behavior Based Approach.pdf
文件: Privacy-preserving and Multi-dimensional Range Query in Two-tiered Wireless Sensor Networks.pdf
文件: UAV Assisted Public Safety Communications with LTE-Advanced HetNets and FeICIC.pdf
文件: Dependent Interferer Arrangement for Physical Layer Security- Secrecy Outage Probability in Clustered Wireless Networks.pdf
文件: A Load Balancing Scheme for Supporting Safety Applications in Heterogeneous Software Defined LTE-V Networks.pdf
文件: Promoting Security and Efficiency in D2D Underlay Communication- A Bargaining Game Approach.pdf
文件: Enhancing Physical Layer Security of OFDM Systems Using Channel Shortening.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/安全隐私
文件: Content-Centric Event-Insensitive Big Data Reduction in Internet of Things .pdf
文件: Twitter as a Source for Spatial Traffic Information in Big Data-Enabled Self-Organizing Networks.pdf
文件: Edge Big Data-Enabled Low-Cost Indoor Localization Based on Bayesian Analysis of RSS.pdf
文件: Reliable Content Dissemination in Internet of Vehicles Using Social Big Data.pdf
文件: Big Data Driven Similarity Based U-Model for Online Social Networks.pdf
文件: Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件: Multi-Keyword Fuzzy and Sortable Ciphertext Retrieval Scheme for big data .pdf
文件: Profit Maximization Auction and Data Management in Big Data Markets.pdf
文件: 大数据.txt
文件: Features Selection Model for Internet of e-Health Things using Big Data.pdf
文件: A Big Data Deep Reinforcement Learning Approach to Next Generation Green Wireless Networks.pdf
文件: Big Data Synchronization among Isolated Data Servers in Disaster.pdf
文件: A Hybrid Location Privacy Protection Scheme in Big Data Environment.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/大数据
文件: 自组织和传感器.txt
目录: /Users/miraco/PycharmProjects/Paper Research/自组织和传感器
文件: downtitle.cpython-36.pyc
目录: /Users/miraco/PycharmProjects/Paper Research/__pycache__
文件: researching.py
文件: 干扰协调管理缓解.txt
目录: /Users/miraco/PycharmProjects/Paper Research/干扰协调管理缓解
文件: D2d中继.txt
目录: /Users/miraco/PycharmProjects/Paper Research/D2d中继
文件: 刘绍博的论文调研.zip
文件: 车联网.txt
目录: /Users/miraco/PycharmProjects/Paper Research/车联网
文件: downtitle.py
文件: sortandfilter.py
文件: globecom2017.txt
文件: researching.py
文件: downtitle.py
文件: sortandfilter.py
文件: 运行脚本之前阅读.rtf
目录: /Users/miraco/PycharmProjects/Paper Research/代码
文件: wcnc2017.txt
文件: A Contract-Based Incentive Mechanism for Data Caching in Ultra-Dense Small-Cells Networks .pdf
文件: Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件: Fine-grained Incentive Mechanism for Sensing Augmented Spectrum Database.pdf
文件: Distributed Caching via Rewarding- An Incentive Caching Model for ICN.pdf
文件: QoS-based Incentive Mechanism for Mobile Data Offloading.pdf
文件: Incentive Mechanism for Cached-Enabled Small Cell Sharing- A Stackelberg Game Approach.pdf
文件: 合作激励.txt
文件: 合作激励调研.docx
文件: Incentive Based Cooperative Content Caching in Social Wireless Networks.pdf
目录: /Users/miraco/PycharmProjects/Paper Research/合作激励
目录: /Users/miraco/PycharmProjects/Paper Research
文件: .DS_Store
文件: convert.py
文件: test.py
文件: replica.conf.txt
文件: hosts.txt
文件: leetcode.py
文件: hosts2.txt
文件: encodings.xml
文件: hosts.iml
文件: profiles_settings.xml
目录: /Users/miraco/PycharmProjects/hosts/.idea/inspectionProfiles
文件: workspace.xml
文件: modules.xml
文件: misc.xml
目录: /Users/miraco/PycharmProjects/hosts/.idea
目录: /Users/miraco/PycharmProjects/hosts
文件: .DS_Store
文件: 666.py
文件: Wcnc151617Statistics.py
文件: Globecom141516.py
文件: WCNC2015.py
文件: downtitle.cpython-36.pyc
文件: exp3.cpython-36.pyc
文件: exp2.cpython-36.pyc
文件: exp.cpython-36.pyc
目录: /Users/miraco/PycharmProjects/untitled/__pycache__
文件: test.py
文件: exp2.py
文件: exp3.py
文件: downtitle.py
文件: test333.py
文件: exp.py
文件: encodings.xml
文件: profiles_settings.xml
目录: /Users/miraco/PycharmProjects/untitled/.idea/inspectionProfiles
文件: workspace.xml
文件: untitled.iml
文件: modules.xml
文件: misc.xml
目录: /Users/miraco/PycharmProjects/untitled/.idea
目录: /Users/miraco/PycharmProjects/untitled

Of course, you can also collect the results in a list and print them all at once:

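import os

allfilepath = []   # full path of every file found
allfilename = []   # bare name of every file found

def getall(path):
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            getall(filepath)
        else:
            allfilepath.append(filepath)
            allfilename.append(filename)

getall(r"/Users/miraco/PycharmProjects")   # put your own path here
print("文件:", allfilename)   # "文件" means "file"

The output is a single list containing every file name.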
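For comparison, the standard library's os.walk can do the same collection without hand-written recursion. This is an alternative to the function above, not the author's method; the helper name list_files and the demo file names are made up for illustration.

```python
import os
import tempfile

def list_files(root):
    """Return the full path of every file below root."""
    found = []
    # os.walk yields (directory, subdirectory names, file names) per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found

# Demo on a throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        open(os.path.join(root, rel), "w").close()
    files = sorted(os.path.relpath(p, root) for p in list_files(root))

print(files)
```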

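There are several ways to traverse a directory tree; the two main ones are depth-first and breadth-first traversal.

Depth-first traversal, simulating the call stack with an explicit stack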

import os

def getall(path):
    realfilelist = []
    mystack = []
    # Push the starting directory.
    mystack.append(path)

    while len(mystack) != 0:
        # Pop the most recently pushed directory.
        openpath = mystack.pop()
        # List everything inside it.
        filelist = os.listdir(openpath)
        for filename in filelist:
            abspath = os.path.join(openpath, filename)  # full path (absolute if path is absolute)
            if os.path.isdir(abspath):
                # A directory: push it to be visited later.
                mystack.append(abspath)
            else:
                # A file: record it.
                realfilelist.append(abspath)
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)

Output (a screenshot, image.png, in the original):

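The collections module (material from 廖雪峰, Liao Xuefeng)

collections is a module in Python's standard library that provides many useful container classes.

namedtuple

A tuple can represent an immutable collection. For example, the two-dimensional coordinates of a point can be written as:

>>> p = (1, 2)

But looking at (1, 2), it is hard to tell that this tuple represents a coordinate.

Defining a class just for this would be overkill, and this is where namedtuple comes in handy: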

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x
1
>>> p.y
2

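namedtuple is a function that creates a custom tuple subclass. It fixes the number of elements and lets you refer to each element by attribute name instead of by index.

This makes namedtuple a convenient way to define a small data type that is immutable like a tuple yet readable through named attributes.

You can verify that the Point object is an instance of a tuple subclass: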

>>> isinstance(p, Point)
True
>>> isinstance(p, tuple)
True

Similarly, a circle described by its coordinates and radius can also be defined with namedtuple:

# namedtuple('TypeName', [list of field names]):
Circle = namedtuple('Circle', ['x', 'y', 'r'])
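A short sketch of how such a Circle might be used. The sample values and the area calculation are illustrative assumptions; _replace is the standard namedtuple method for deriving a changed copy.

```python
import math
from collections import namedtuple

Circle = namedtuple('Circle', ['x', 'y', 'r'])
c = Circle(0, 0, 2)

# Fields are readable by name, and the value unpacks like a plain tuple.
x, y, r = c
area = math.pi * c.r ** 2

# Instances are immutable: assigning to a field raises AttributeError.
try:
    c.r = 5
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a new instance with the given fields changed.
bigger = c._replace(r=5)
print(c, bigger, mutated)
```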

deque

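When data is stored in a list, access by index is fast, but inserting or deleting anywhere other than the end is slow: a list is a contiguous array, so every later element has to be shifted, and the cost grows with the size of the list.

deque is a double-ended queue built for efficient insertion and deletion at both ends, which makes it a good fit for queues and stacks: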

>>> from collections import deque
>>> q = deque(['a', 'b', 'c'])
>>> q.append('x')
>>> q.appendleft('y')
>>> q
deque(['y', 'a', 'b', 'c', 'x'])

Besides list's append() and pop(), deque also supports appendleft() and popleft(), so adding or removing elements at the head is very efficient.
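The two usage patterns that matter for the traversals in this post can be sketched as follows: popping from the right end gives a stack, popping from the left end gives a queue. The element values are arbitrary.

```python
from collections import deque

d = deque(['a', 'b', 'c'])

# Stack usage (last in, first out): push and pop on the right end.
d.append('x')
top = d.pop()          # takes 'x', the element pushed last

# Queue usage (first in, first out): push on the right, pop on the left.
d.append('y')
first = d.popleft()    # takes 'a', the element that has waited longest

print(top, first, list(d))
```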

defaultdict

With a dict, looking up a key that does not exist raises KeyError. If you want a default value returned instead, use defaultdict:

>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'N/A')
>>> dd['key1'] = 'abc'
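>>> dd['key1'] # key1 exists
'abc'
>>> dd['key2'] # key2 does not exist, so the default is returned
'N/A'

Note that the default value comes from calling a function, and that function is passed in when the defaultdict is created.

Apart from supplying a default when a key is missing (and storing it under that key), a defaultdict behaves exactly like a dict.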
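A common use of this is grouping, where the factory is list so that every new key starts with an empty list. The sketch below groups some made-up file names by extension; the names and the rsplit-based extension logic are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical file names, grouped by extension.
names = ['a.py', 'b.txt', 'c.py']
by_ext = defaultdict(list)   # a missing key starts out as an empty list
for name in names:
    ext = name.rsplit('.', 1)[-1]
    by_ext[ext].append(name)

print(dict(by_ext))

# Merely reading a missing key also inserts it with the default value.
_ = by_ext['pdf']
print('pdf' in by_ext)
```

If the side effect of inserting on lookup is unwanted, dict.get(key, default) on the same object returns a fallback without storing it.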

OrderedDict

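Before Python 3.7, the keys of a dict were unordered: when iterating over a dict, the order of the keys could not be relied on. (Since Python 3.7, a plain dict is guaranteed to keep insertion order.)

To keep keys in order regardless of the Python version, use OrderedDict: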

>>> from collections import OrderedDict
>>> d = dict([('a', 1), ('b', 2), ('c', 3)])
>>> d # dict keys are unordered (output from an older Python, before dicts kept insertion order)
{'a': 1, 'c': 3, 'b': 2}
>>> od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> od # OrderedDict keys are ordered
OrderedDict([('a', 1), ('b', 2), ('c', 3)])

Note that OrderedDict keeps keys in insertion order; it does not sort them by the key itself:

>>> od = OrderedDict()
>>> od['z'] = 1
>>> od['y'] = 2
>>> od['x'] = 3
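>>> list(od.keys()) # returned in the order the keys were inserted
['z', 'y', 'x']

OrderedDict can be used to build a FIFO (first in, first out) dict that, once its capacity is exceeded, deletes the earliest-added key first: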

from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):

    def __init__(self, capacity):
        super(LastUpdatedOrderedDict, self).__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        containsKey = 1 if key in self else 0
        if len(self) - containsKey >= self._capacity:
            last = self.popitem(last=False)
            print('remove:', last)
        if containsKey:
            del self[key]
            print('set:', (key, value))
        else:
            print('add:', (key, value))
        OrderedDict.__setitem__(self, key, value)
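The class above relies on popitem(last=False) to evict the oldest entry. A minimal sketch of that call on a plain OrderedDict, with arbitrary keys and values, shows the difference from the default popitem():

```python
from collections import OrderedDict

od = OrderedDict()
for key in ('z', 'y', 'x'):
    od[key] = key.upper()

# popitem(last=False) removes the oldest entry (FIFO);
# the default popitem() removes the newest one (LIFO).
oldest = od.popitem(last=False)
newest = od.popitem()

print(oldest, newest, list(od))
```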

Counter

Counter is a simple counter. For example, to count how many times each character occurs:

>>> from collections import Counter
>>> c = Counter()
>>> for ch in 'programming':
...     c[ch] = c[ch] + 1
...
>>> c
Counter({'g': 2, 'm': 2, 'r': 2, 'a': 1, 'i': 1, 'o': 1, 'n': 1, 'p': 1})

Counter is in fact a subclass of dict. The result above shows that the characters 'g', 'm', and 'r' each occur twice and every other character occurs once.
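The explicit loop is not strictly needed: a Counter can be built directly from any iterable. The sketch below counts the same string and also shows two conveniences, a zero count for missing keys and most_common().

```python
from collections import Counter

c = Counter('programming')

print(c['g'], c['z'])     # a missing key counts as 0 instead of raising KeyError
print(c.most_common(3))   # the three most frequent characters with their counts
```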

Breadth-first traversal (first in, first out)

import os
import collections


def getall(path):
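    queue = collections.deque([])  # a FIFO queue of directories still to visit
    realfilelist = []  # collects the path of every file found
    # Enqueue the starting directory.
    queue.append(path)

    while len(queue) != 0:
        onepath = queue.popleft()  # first in, first out: take from the left end
        filelist = os.listdir(onepath)    # list the entries of that directory
        for filename in filelist:     # examine each file or folder
            abspath = os.path.join(onepath, filename)     # build the full path
            if os.path.isdir(abspath):        # a folder:
                queue.append(abspath)         # enqueue it
            else:
                realfilelist.append(abspath)   # a file: record its path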
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)
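The stack version and the queue version differ only in which end of the pending list the next directory is taken from. The sketch below merges the two into one hypothetical function, walk_files, and runs it on a small temporary tree so the difference in visiting order is visible. The tree layout and file names are made up, and the entries are sorted only to make the demo repeatable.

```python
import os
import tempfile
from collections import deque

def walk_files(root, breadth_first):
    pending = deque([root])
    files = []
    while pending:
        # popleft() takes the oldest directory (queue, breadth-first);
        # pop() takes the newest one (stack, depth-first).
        current = pending.popleft() if breadth_first else pending.pop()
        for name in sorted(os.listdir(current)):
            path = os.path.join(current, name)
            if os.path.isdir(path):
                pending.append(path)
            else:
                files.append(os.path.relpath(path, root))
    return files

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "deep"))
    os.makedirs(os.path.join(root, "b"))
    for rel in ("top.txt", os.path.join("a", "a.txt"),
                os.path.join("b", "b.txt"), os.path.join("a", "deep", "d.txt")):
        open(os.path.join(root, rel), "w").close()
    bfs = walk_files(root, breadth_first=True)
    dfs = walk_files(root, breadth_first=False)

print("BFS:", bfs)
print("DFS:", dfs)
```

In the breadth-first result the files of both first-level directories come before anything in a/deep, while the depth-first result finishes one branch before returning to the other.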