Traversing the Files in a Directory with Recursion, Depth-First, and Breadth-First Search

2018-06-24, by 夏威夷的芒果

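The os.path.join() function

Syntax:

os.path.join(path1[, path2[, ...]])

Return value:

The given path components joined into a single path.

Note: if one of the components is an absolute path, every component before it is discarded and joining continues from that absolute component.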
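The discard rule in the note is easy to miss, so here is a minimal sketch of it. The components used (`projects`, `demo`, `main.py`, `/tmp`) are made-up names for illustration, not paths from this article, and the comments assume a POSIX-style system.

```python
import os

# Plain relative components are joined with the platform's separator.
joined = os.path.join("projects", "demo", "main.py")
print(joined)

# "/tmp" starts at the root, so everything before it ("projects") is dropped.
reset = os.path.join("projects", "/tmp", "main.py")
print(reset)
```

This is why the traversal code in this article always joins a directory with a bare entry name: neither part restarts the path.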

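import os

def getall(path):
    # List every entry (files and subdirectories) directly under path
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            # Recurse first, so a directory is printed after its contents
            getall(filepath)
            print("目录：", filepath)  # "目录" means "directory"
        else:
            print("文件：", filename)  # "文件" means "file"

getall(r"/Users/miraco/PycharmProjects")  # put your own path here; this one is used as the example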

Output:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/miraco/PycharmProjects/untitled/test333.py
文件： .DS_Store
文件： pimrc2017.txt
文件： pimrc2017.txt
文件： globecom2017.txt
文件： wcnc2017.txt
目录： /Users/miraco/PycharmProjects/Paper Research/所有文章
文件： 物联网.txt
目录： /Users/miraco/PycharmProjects/Paper Research/物联网
文件： .DS_Store
文件： 众包.txt
目录： /Users/miraco/PycharmProjects/Paper Research/众包
文件： Combining Dynamic Clustering and Scheduling for Coordinated Multi-Point Transmission in LTE.pdf
文件： Capacity of Infrastructure-based Cooperative Vehicular Networks.pdf
文件： Cooperative Transmission in Cognitive and Energy Harvesting-based D2D Networks.pdf
文件： .DS_Store
文件： Cournot-Nash Equilibria for Bandwidth Allocation under Base-Station Cooperation.pdf
文件： A Benchmark for D2D in Cellular Networks- The Importance of Information.pdf
文件： Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件： Hybrid Coordination Function Controlled Channel Access for Latency-Sensitive Tactile Applications .pdf
文件： 协作通信调研结果：.docx
文件： An Optimal LTE-V2I-Based Cooperative Communication Scheme for Vehicular Networks.pdf
文件： A D2D Mode Selection Scheme with Energy Consumption Minimization Underlaying Two-tier Heterogeneous Cellular Networks.pdf
文件： 08292169.pdf
文件： Power Allocation for Full-Duplex Cooperative Non-Orthogonal Multiple Access Systems.pdf
文件： ON:OFF Reporting Mechanism for Robust Cooperative Sensing in Cognitive IoT Networks.pdf
文件： 协作通信.txt
文件： User Scheduling for Non-orthogonal Transmission in UAV-Assisted Relay Network.pdf
文件： Computation Collaboration in Ultra Dense Network Integrated with Mobile Edge Computing.pdf
文件： High-Throughput and Fair Scheduling for Access Point Cooperation in Dense Wireless Networks.pdf
目录： /Users/miraco/PycharmProjects/Paper Research/协作通信
文件： Delay Efficient Disconnected RSU Placement Algorithm for VANET Safety Applications.pdf
文件： On the Handover Security Key Update and Residence Management in LTE Networks.pdf
文件： Increasing the Security of Wireless Communication Through Relaying and Interference Generation.pdf
文件： A Semi-Outsourcing Secure Data Privacy Scheme for IoT Data Transmission.pdf
文件： Security Enhancement to Successive Interference Cancellation Algorithm for Non-Orthogonal Multiple Access (NOMA).pdf
文件： Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件： A Comparative Study of Possible Solutions for Transmission of Vehicular Safety Messages in LTE-based Networks.pdf
文件： Privacy-Aware Offloading in Mobile-Edge Computing.pdf
文件： 安全隐私.txt
文件： Towards Scalable and Privacy Preserving Commercial Content Dissemination in Social Wireless Networks.pdf
文件： Fairness and Safety Capacity Oriented Resource Allocation Scheme for D2D Communications.pdf
文件： Physical Layer Security in D2D-enabled Cellular Networks- Artificial Noise Assisted.pdf
文件： Privacy-Preserving Data Forwarding in VANETs- A Personal-Social Behavior Based Approach.pdf
文件： Privacy-preserving and Multi-dimensional Range Query in Two-tiered Wireless Sensor Networks.pdf
文件： UAV Assisted Public Safety Communications with LTE-Advanced HetNets and FeICIC.pdf
文件： Dependent Interferer Arrangement for Physical Layer Security- Secrecy Outage Probability in Clustered Wireless Networks.pdf
文件： A Load Balancing Scheme for Supporting Safety Applications in Heterogeneous Software Defined LTE-V Networks.pdf
文件： Promoting Security and Efficiency in D2D Underlay Communication- A Bargaining Game Approach.pdf
文件： Enhancing Physical Layer Security of OFDM Systems Using Channel Shortening.pdf
目录： /Users/miraco/PycharmProjects/Paper Research/安全隐私
文件： Content-Centric Event-Insensitive Big Data Reduction in Internet of Things .pdf
文件： Twitter as a Source for Spatial Traffic Information in Big Data-Enabled Self-Organizing Networks.pdf
文件： Edge Big Data-Enabled Low-Cost Indoor Localization Based on Bayesian Analysis of RSS.pdf
文件： Reliable Content Dissemination in Internet of Vehicles Using Social Big Data.pdf
文件： Big Data Driven Similarity Based U-Model for Online Social Networks.pdf
文件： Lightweight and Privacy-preserving Fog-assisted Information Sharing Scheme for Health Big Data.pdf
文件： Multi-Keyword Fuzzy and Sortable Ciphertext Retrieval Scheme for big data .pdf
文件： Profit Maximization Auction and Data Management in Big Data Markets.pdf
文件： 大数据.txt
文件： Features Selection Model for Internet of e-Health Things using Big Data.pdf
文件： A Big Data Deep Reinforcement Learning Approach to Next Generation Green Wireless Networks.pdf
文件： Big Data Synchronization among Isolated Data Servers in Disaster.pdf
文件： A Hybrid Location Privacy Protection Scheme in Big Data Environment.pdf
目录： /Users/miraco/PycharmProjects/Paper Research/大数据
文件： 自组织和传感器.txt
目录： /Users/miraco/PycharmProjects/Paper Research/自组织和传感器
文件： downtitle.cpython-36.pyc
目录： /Users/miraco/PycharmProjects/Paper Research/__pycache__
文件： researching.py
文件： 干扰协调管理缓解.txt
目录： /Users/miraco/PycharmProjects/Paper Research/干扰协调管理缓解
文件： D2d中继.txt
目录： /Users/miraco/PycharmProjects/Paper Research/D2d中继
文件： 刘绍博的论文调研.zip
文件： 车联网.txt
目录： /Users/miraco/PycharmProjects/Paper Research/车联网
文件： downtitle.py
文件： sortandfilter.py
文件： globecom2017.txt
文件： researching.py
文件： downtitle.py
文件： sortandfilter.py
文件： 运行脚本之前阅读.rtf
目录： /Users/miraco/PycharmProjects/Paper Research/代码
文件： wcnc2017.txt
文件： A Contract-Based Incentive Mechanism for Data Caching in Ultra-Dense Small-Cells Networks .pdf
文件： Mobility Aware Caching Incentive Scheme for D2D Cellular Networks.pdf
文件： Fine-grained Incentive Mechanism for Sensing Augmented Spectrum Database.pdf
文件： Distributed Caching via Rewarding- An Incentive Caching Model for ICN.pdf
文件： QoS-based Incentive Mechanism for Mobile Data Offloading.pdf
文件： Incentive Mechanism for Cached-Enabled Small Cell Sharing- A Stackelberg Game Approach.pdf
文件： 合作激励.txt
文件： 合作激励调研.docx
文件： Incentive Based Cooperative Content Caching in Social Wireless Networks.pdf
目录： /Users/miraco/PycharmProjects/Paper Research/合作激励
目录： /Users/miraco/PycharmProjects/Paper Research
文件： .DS_Store
文件： convert.py
文件： test.py
文件： replica.conf.txt
文件： hosts.txt
文件： leetcode.py
文件： hosts2.txt
文件： encodings.xml
文件： hosts.iml
文件： profiles_settings.xml
目录： /Users/miraco/PycharmProjects/hosts/.idea/inspectionProfiles
文件： workspace.xml
文件： modules.xml
文件： misc.xml
目录： /Users/miraco/PycharmProjects/hosts/.idea
目录： /Users/miraco/PycharmProjects/hosts
文件： .DS_Store
文件： 666.py
文件： Wcnc151617Statistics.py
文件： Globecom141516.py
文件： WCNC2015.py
文件： downtitle.cpython-36.pyc
文件： exp3.cpython-36.pyc
文件： exp2.cpython-36.pyc
文件： exp.cpython-36.pyc
目录： /Users/miraco/PycharmProjects/untitled/__pycache__
文件： test.py
文件： exp2.py
文件： exp3.py
文件： downtitle.py
文件： test333.py
文件： exp.py
文件： encodings.xml
文件： profiles_settings.xml
目录： /Users/miraco/PycharmProjects/untitled/.idea/inspectionProfiles
文件： workspace.xml
文件： untitled.iml
文件： modules.xml
文件： misc.xml
目录： /Users/miraco/PycharmProjects/untitled/.idea
目录： /Users/miraco/PycharmProjects/untitled

The results can also be collected into lists and printed all at once:

import os

allfilepath = []  # full path of every file found
allfilename = []  # bare name of every file found

def getall(path):
    filelist = os.listdir(path)
    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            getall(filepath)  # descend into the subdirectory
        else:
            allfilepath.append(filepath)
            allfilename.append(filename)

getall(r"/Users/miraco/PycharmProjects")  # put your own path here
print("文件：", allfilename)  # "文件" means "files"

The output is the list of collected file names.
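For comparison, the standard library's os.walk performs the same kind of descent without hand-written recursion. The following is a minimal sketch, assuming the goal is the same pair of flat lists (names and full paths); `collect_files` is a hypothetical helper name, not something from the original post.

```python
import os

def collect_files(root):
    """Return (names, paths) for every file under root, using os.walk."""
    names, paths = [], []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            names.append(name)
            paths.append(os.path.join(dirpath, name))
    return names, paths
```

Returning the lists instead of filling module-level globals means the function can be called more than once without mixing results. One difference worth knowing: by default os.walk does not descend into symlinked directories, whereas the os.path.isdir() check above does.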

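A directory tree can be traversed in more than one order. The two common ones are depth-first and breadth-first traversal.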
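To see how the two orders differ before touching the file system, here is a small sketch on an in-memory tree. The tree and its node names are invented for illustration. The only difference between the two functions is which end of the container the next directory is taken from.

```python
from collections import deque

# A made-up directory tree: each key is a directory, each value its subdirectories.
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

def depth_first(start):
    order, stack = [], [start]
    while stack:
        node = stack.pop()        # take the most recently added directory
        order.append(node)
        stack.extend(tree[node])
    return order

def breadth_first(start):
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()    # take the oldest directory in the queue
        order.append(node)
        queue.extend(tree[node])
    return order

print(depth_first("root"))    # ['root', 'b', 'b1', 'a', 'a2', 'a1']
print(breadth_first("root"))  # ['root', 'a', 'b', 'a1', 'a2', 'b1']
```

Depth-first finishes one branch before starting the next. Breadth-first visits everything at one level before going a level deeper.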

Depth-first traversal, simulated with an explicit stack

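import os

def getall(path):
    realfilelist = []
    mystack = []
    # Push the starting directory onto the stack
    mystack.append(path)

    while len(mystack) != 0:
        # Pop the most recently pushed directory
        openpath = mystack.pop()
        # List everything directly inside that directory
        filelist = os.listdir(openpath)
        for filename in filelist:
            # Full path of the entry (absolute here, because the starting path is absolute)
            abspath = os.path.join(openpath, filename)
            if os.path.isdir(abspath):
                # A directory: push it so that it is explored later
                mystack.append(abspath)
            else:
                # A file: record it
                realfilelist.append(abspath)
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)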

Output: [screenshot, image.png]

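About the collections module (material from 廖雪峰, Liao Xuefeng)

collections is a module in Python's standard library that provides a number of useful container classes.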

namedtuple

A tuple can represent an immutable collection. For example, the two-dimensional coordinates of a point can be written as:

>>> p = (1, 2)

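However, looking at (1, 2) alone, it is hard to tell that this tuple represents a coordinate.

Defining a class for it would be overkill, and this is where namedtuple comes in: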

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x
1
>>> p.y
2

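namedtuple is a function that creates a custom tuple subclass with a fixed number of elements, whose elements can be referenced by attribute name instead of by index.

This makes it easy to define a data type that is immutable like a tuple yet can be accessed by attribute.

We can verify that the Point object created above is an instance of a tuple subclass: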

>>> isinstance(p, Point)
True
>>> isinstance(p, tuple)
True

Similarly, a circle described by its coordinates and radius can also be defined with namedtuple:

# namedtuple('name', [list of field names]):
Circle = namedtuple('Circle', ['x', 'y', 'r'])
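As a short illustration of how such a type behaves in use, the sketch below reads fields by name, unpacks the value like an ordinary tuple, and shows that assignment is rejected. The concrete values are arbitrary.

```python
import math
from collections import namedtuple

Circle = namedtuple('Circle', ['x', 'y', 'r'])

c = Circle(0, 0, 2)
area = math.pi * c.r ** 2      # fields are read by name
x, y, r = c                    # and it still unpacks like a plain tuple

try:
    c.r = 5                    # tuples are immutable, so this fails
except AttributeError as exc:
    print("cannot assign:", exc)

bigger = c._replace(r=5)       # _replace returns a new, modified copy
print(bigger, c._asdict())
```

`_replace` and `_asdict` start with an underscore only to avoid clashing with field names; they are part of the documented namedtuple interface.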

deque

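When data is stored in a list, access by index is fast, but inserting or deleting anywhere except at the end is slow: a list is stored as a contiguous array, so every element after that position has to be shifted. With a large amount of data this becomes inefficient.

deque is a double-ended queue designed for efficient insertion and deletion at both ends, which makes it well suited for queues and stacks: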

>>> from collections import deque
>>> q = deque(['a', 'b', 'c'])
>>> q.append('x')
>>> q.appendleft('y')
>>> q
deque(['y', 'a', 'b', 'c', 'x'])

Besides list's append() and pop(), deque also supports appendleft() and popleft(), so elements can be added to or removed from the head very efficiently.
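The same deque can act as a queue or as a stack depending on which end elements are taken from. A minimal sketch, with placeholder values:

```python
from collections import deque

pending = deque(['a', 'b', 'c'])

# Queue (first in, first out): take from the left end.
first = pending.popleft()    # 'a'

# Stack (last in, first out): take from the right end.
last = pending.pop()         # 'c'

print(first, last, pending)  # a c deque(['b'])

# The list equivalent of popleft() is pop(0), which shifts every remaining
# element and therefore costs time proportional to the list's length.
items = ['a', 'b', 'c']
assert items.pop(0) == first
```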

defaultdict

With a dict, referencing a key that does not exist raises KeyError. To get a default value back when the key is missing, use defaultdict:

>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'N/A')
>>> dd['key1'] = 'abc'
>>> dd['key1'] # key1 exists
'abc'
>>> dd['key2'] # key2 does not exist, so the default value is returned
'N/A'

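Note that the default value is produced by calling a function, and that function is passed in when the defaultdict is created.

Apart from returning a default value for a missing key, defaultdict behaves exactly like dict. Be aware that looking up a missing key with dd[key] also inserts that key with the default value.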
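A common use of this is grouping, where the default factory is `list`. The sketch below groups file names by extension; the file names are hypothetical stand-ins for a directory listing.

```python
import os
from collections import defaultdict

# Hypothetical file names, standing in for a directory listing.
filenames = ["downtitle.py", "test.py", "hosts.txt", "misc.xml", "notes.txt"]

by_ext = defaultdict(list)        # a missing key starts out as an empty list
for name in filenames:
    _stem, ext = os.path.splitext(name)
    by_ext[ext].append(name)      # no "if ext not in by_ext" check needed

print(dict(by_ext))
# {'.py': ['downtitle.py', 'test.py'], '.txt': ['hosts.txt', 'notes.txt'], '.xml': ['misc.xml']}
```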

OrderedDict

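In Python versions before 3.7, the key order of a dict is not guaranteed, so the order seen when iterating over a dict cannot be relied on. Since Python 3.7, dict preserves insertion order as part of the language (CPython 3.6 already did so as an implementation detail).

To keep the key order explicitly, use OrderedDict: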

>>> from collections import OrderedDict
>>> d = dict([('a', 1), ('b', 2), ('c', 3)])
>>> d # before Python 3.7, dict key order is not guaranteed
{'a': 1, 'c': 3, 'b': 2}
>>> od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> od # OrderedDict keys keep their insertion order
OrderedDict([('a', 1), ('b', 2), ('c', 3)])

Note that OrderedDict keys are kept in insertion order; they are not sorted by the keys themselves:

>>> od = OrderedDict()
>>> od['z'] = 1
>>> od['y'] = 2
>>> od['x'] = 3
>>> list(od.keys()) # keys are returned in insertion order
['z', 'y', 'x']

OrderedDict can be used to implement a FIFO (first in, first out) dict that removes the earliest added key once its capacity is exceeded:

from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):

    def __init__(self, capacity):
        super(LastUpdatedOrderedDict, self).__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        containsKey = 1 if key in self else 0
        # Adding a new key to a full dict: evict the oldest entry first
        if len(self) - containsKey >= self._capacity:
            last = self.popitem(last=False)
            print('remove:', last)
        if containsKey:
            # Delete and re-insert so that the updated key moves to the end
            del self[key]
            print('set:', (key, value))
        else:
            print('add:', (key, value))
        OrderedDict.__setitem__(self, key, value)
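The eviction in the class above rests on `popitem(last=False)`. Here is that call in isolation, as a small sketch using the same three keys as the earlier example:

```python
from collections import OrderedDict

od = OrderedDict()
od['z'] = 1
od['y'] = 2
od['x'] = 3

oldest = od.popitem(last=False)   # removes from the front: FIFO
newest = od.popitem()             # default last=True removes from the back: LIFO

print(oldest, newest, list(od.keys()))   # ('z', 1) ('x', 3) ['y']
```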

Counter

Counter is a simple counter. For example, to count how many times each character occurs:

>>> from collections import Counter
>>> c = Counter()
>>> for ch in 'programming':
...     c[ch] = c[ch] + 1
...
>>> c
Counter({'g': 2, 'm': 2, 'r': 2, 'a': 1, 'i': 1, 'o': 1, 'n': 1, 'p': 1})

Counter is in fact also a subclass of dict. The result above shows that the characters 'g', 'm', and 'r' each occur twice, and every other character occurs once.
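The explicit loop can usually be replaced by passing the iterable straight to Counter. A brief sketch of that, together with `most_common`:

```python
from collections import Counter

c = Counter('programming')   # count every character of the iterable at once
print(c['g'], c['z'])        # 2 0  (a missing key counts as 0, no KeyError)

top = c.most_common(3)       # the three highest counts, as (char, count) pairs
print(top)
```

Unlike defaultdict, reading a missing key from a Counter returns 0 without inserting the key.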

Breadth-first traversal: first in, first out

import os
import collections


def getall(path):
    queue = collections.deque([])  # queue of directories still to visit
    realfilelist = []  # list collecting the file paths
    # Enqueue the starting directory
    queue.append(path)

    while len(queue) != 0:
        onepath = queue.popleft()  # FIFO queue: take the element at the left end
        filelist = os.listdir(onepath)  # list the entries of that directory
        for filename in filelist:  # check each file or folder
            abspath = os.path.join(onepath, filename)  # build the full path
            if os.path.isdir(abspath):  # the path is a directory
                queue.append(abspath)  # enqueue it
            else:
                realfilelist.append(abspath)  # a file: record its full path
    return realfilelist

arr = getall(r"/Users/miraco/PycharmProjects")
for item in arr:
    print(item)
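As a possible variation on the breadth-first version, the sketch below yields paths one at a time instead of building a list, skips directories it cannot read, and does not follow symlinked directories. `iter_files_bfs` is a hypothetical name, and the switch to os.scandir and the sorting of entries are my own choices rather than part of the original code.

```python
import os
from collections import deque

def iter_files_bfs(root):
    """Yield file paths under root, level by level (breadth-first)."""
    queue = deque([root])
    while queue:
        current = queue.popleft()
        try:
            # Sort by name so the order within a directory is predictable
            entries = sorted(os.scandir(current), key=lambda e: e.name)
        except OSError:
            continue                      # unreadable directory: skip it
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                queue.append(entry.path)  # visit after the current level
            else:
                yield entry.path
```

Because it is a generator, `for path in iter_files_bfs(some_dir): print(path)` starts printing before the whole tree has been scanned.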
