File Operations with the codecs Module

2018-08-14 EnjoyWT

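When converting between encodings, Unicode usually serves as the intermediate form: first decode the string from its source encoding into Unicode, then encode that Unicode into the target encoding.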

[Figure: unicode.png]
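The decode-then-encode path described above can be sketched as follows. This is a minimal illustration written for Python 3; the sample string and the choice of GBK as the source encoding are assumptions made for the example, not something the original text specifies.

```python
# Bytes as they might arrive from a GBK-encoded source (assumed for illustration)
gbk_bytes = '哈哈哈'.encode('gbk')

# Step 1: decode the source encoding into Unicode text
text = gbk_bytes.decode('gbk')

# Step 2: encode the Unicode text into the target encoding
utf8_bytes = text.encode('utf-8')
```

The same characters end up as different byte sequences in the two encodings, which is why going through Unicode is needed rather than reinterpreting the bytes directly.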

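In Python 2 the default encoding is ASCII, so reading or writing through a file object obtained directly from the built-in open tends to fail on text that contains Chinese characters (or any other non-ASCII characters). It is therefore recommended to open the file with an explicit encoding, preferably UTF-8.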
Usage

The following code reads a file and collects the contents of each line, stripped of surrounding whitespace, into a list.

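import codecs

# Open the file for reading and decode its contents as UTF-8
with codecs.open('test.txt', 'r', 'utf-8') as f:
    # Strip surrounding whitespace (including the newline) from each line
    lines = [line.strip() for line in f]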

The following code writes one line of English and one line of Chinese to a file.

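import codecs

# Open the file for writing; text is encoded as UTF-8 when written
with codecs.open('test.txt', 'w', 'utf-8') as f:
    f.write('Hello World!\n')
    # On Python 2, pass a unicode literal instead: u'哈哈哈\n'
    f.write('哈哈哈\n')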
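To check that writing and reading agree, the two snippets can be combined into a round trip. The sketch below is only an illustration: the temporary directory and the file name roundtrip.txt are hypothetical and chosen so that no real test.txt is touched.

```python
import codecs
import os
import tempfile

# Hypothetical scratch path, used only for this example
path = os.path.join(tempfile.mkdtemp(), 'roundtrip.txt')

# Write one English line and one Chinese line, encoded as UTF-8
with codecs.open(path, 'w', 'utf-8') as f:
    f.write(u'Hello World!\n')
    f.write(u'哈哈哈\n')

# Read the file back, decoding from UTF-8 and stripping each line
with codecs.open(path, 'r', 'utf-8') as f:
    lines = [line.strip() for line in f]

# Read the raw bytes to see what was actually stored on disk
with open(path, 'rb') as f:
    raw = f.read()
```

Here lines holds the decoded text, while raw holds the UTF-8 bytes that codecs.open produced when writing.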
File Read/Write Modes

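The table below lists the modes. The mode is the argument passed when obtaining the file object, and the first three are the ones used most often.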