Python File Operations

Published: 2018-02-24 20:24:31 · Editor: admin

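Basic file operations in Python: open, read, and write.

The workflow for operating on a file:

1. Open the file, get a file handle, and assign it to a variable

2. Operate on the file through the handle

3. Close the file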
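The three steps can be sketched as follows. This is only an illustration: the file name demo.txt and the temporary directory are assumptions made so the example does not touch any real file. The second half shows the with statement, which is not used elsewhere in this article but performs step 3 automatically.

```python
import os
import tempfile

# Hypothetical demo file, created in a temp directory so nothing real is touched.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Steps 1-3 written out by hand: open, operate, close.
f = open(path, "w", encoding="utf-8")   # 1. open and keep the file object
try:
    f.write("hello\n")                  # 2. operate through the file object
finally:
    f.close()                           # 3. close, even if step 2 raised

# The same three steps with a with-statement, which closes the file for us.
with open(path, encoding="utf-8") as f:
    content = f.read()

print(content)
print(f.closed)  # the with-block has already closed the file
```

Either form works; the with form is harder to get wrong because the close cannot be forgotten.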

Create a new text file, Yesterday.txt, containing the lyrics of the song "Yesterday When I Was Young".

Its contents are as follows:

Somehow, it seems the love I knew was always the most destructive kind
Yesterday when I was young
The taste of life was sweet
As rain upon my tongue
I teased at life as if it were a foolish game
The way the evening breeze
May tease the candle flame
The thousand dreams I dreamed
The splendid things I planned
I always built to last on weak and shifting sand
I lived by night and shunned the naked light of day
And only now I see how the time ran away
Yesterday when I was young
So many lovely songs were waiting to be sung
So many wild pleasures lay in store for me
And so much pain my eyes refused to see
I ran so fast that time and youth at last ran out
I never stopped to think what life was all about
And every conversation that I can now recall
Concerned itself with me and nothing else at all
The game of love I played with arrogance and pride
And every flame I lit too quickly, quickly died
The friends I made all somehow seemed to slip away
And only now I'm left alone to end the play, yeah
Oh, yesterday when I was young
So many, many songs were waiting to be sung
So many wild pleasures lay in store for me
And so much pain my eyes refused to see
There are so many songs in me that won't be sung
I feel the bitter taste of tears upon my tongue
The time has come for me to pay for yesterday
When I was young

Open the file and read its contents:

data = open("Yesterday.txt").read()
print(data)

Running this raises an error:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 105: illegal multibyte sequence

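Why?

The file was saved as UTF-8, but when no encoding is given, open() uses the system locale's preferred encoding, which on Chinese-language Windows is GBK. The two do not match, so specify the encoding explicitly when opening the file: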

data = open("Yesterday.txt",encoding="utf-8").read()
print(data)

Run it again and the error is gone.
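The mismatch can be reproduced on any platform with a small sketch. The helper try_read and the file name utf8_demo.txt are made up for this illustration. The file is written as UTF-8 and then read back once as UTF-8 and once as GBK; the GBK attempt does not give back the original text.

```python
import locale
import os
import tempfile

text = "我爱北京天安门"
path = os.path.join(tempfile.mkdtemp(), "utf8_demo.txt")

# Save the text as UTF-8 bytes, as an editor configured for UTF-8 would.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

def try_read(file_path, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        with open(file_path, encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError:
        return None

# The locale encoding that open() typically falls back to when encoding= is omitted.
print(locale.getpreferredencoding(False))

utf8_result = try_read(path, "utf-8")
gbk_result = try_read(path, "gbk")
print(utf8_result == text)  # decoding with the encoding used for writing
print(gbk_result == text)   # decoding with a mismatched encoding
```

Passing encoding= explicitly makes the script behave the same way regardless of the machine's locale.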

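The code above is not good practice, though.

data is just a string held in memory. The file object returned by open() was never kept, so it cannot be closed explicitly or used for any further operation on the file; all we got out of it was a single read.

Normally you open the file, assign the file object to a variable (conventionally f), and drop the chained read():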

f = open("Yesterday.txt",encoding="utf-8")

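f is an in-memory object, commonly called a file handle (in Python's own terminology, a file object).

The handle is the program's in-memory representation of the open file.

It carries the file's name, its mode and encoding, and the current read/write position, and every later read or write goes through it.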
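To make this concrete, here is a small sketch that inspects a file object. The file attrs_demo.txt is a throwaway created just for the example; name, mode, encoding, closed and tell() are standard parts of Python's text file objects.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "attrs_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("line one\nline two\n")

f = open(path, encoding="utf-8")
info = {
    "name": f.name,          # the path that was passed to open()
    "mode": f.mode,          # 'r' when no mode is given
    "encoding": f.encoding,  # the text encoding in use
    "closed": f.closed,      # False while the file is open
}
start = f.tell()             # position before anything is read
f.readline()
after_one_line = f.tell()    # position has moved past the first line
f.close()

print(info)
print(start, after_one_line, f.closed)
```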

Reading

data = f.read()
print(data)

Writing

f.write("我爱北京天安门")

Running this raises an error:

io.UnsupportedOperation: not writable

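Why? Because a file opened without a mode is opened in read-only mode ('r') by default, so writing is not allowed.

So the open() call needs to change.

'w' selects write mode; the mode is passed as the second argument, after the file name: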

f = open("Yesterday.txt",'w',encoding="utf-8")
data = f.read()
f.write("我爱北京天安门")

Running this raises an error:

io.UnsupportedOperation: not readable

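Because 'w' mode can only write, not read.

In addition, the contents of Yesterday.txt have been wiped. 'w' creates a new file: if the file already exists, it is truncated as soon as it is opened, so it is now empty.

If the file does not exist, it is created.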
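Both behaviours of 'w' can be checked with a short sketch. The file w_demo.txt is a throwaway used only here, so the real Yesterday.txt is left alone.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "w_demo.txt")
existed_before = os.path.exists(path)

# 'w' creates the file when it does not exist yet.
with open(path, "w", encoding="utf-8") as f:
    f.write("first version\n")
size_after_write = os.path.getsize(path)

# Opening the same file with 'w' again truncates it immediately,
# even though nothing is written this time.
f = open(path, "w", encoding="utf-8")
f.close()
size_after_reopen = os.path.getsize(path)

print(existed_before, size_after_write, size_after_reopen)
```

This is why an accidental open(..., 'w') on a file you care about is destructive even if the script fails on the very next line.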

Write two lines:

f = open("Yesterday.txt",'w',encoding="utf-8")
f.write("我爱北京天安门\n")
f.write("天安门上太阳升\n")
# Close the file so the buffered text is flushed to disk
f.close()

Check the file contents; the two lines have been written.

There is also an append mode for writing.

'a' stands for append:

f = open("Yesterday.txt",'a',encoding="utf-8")
f.write("伟大领袖毛主席\n")
# Release the resource
f.close()

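Check the file contents; the new line has been added after the existing ones.

Likewise, read() cannot be used in 'a' mode; calling it raises an error.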
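A minimal sketch of both points, again on a throwaway file (a_demo.txt is a name chosen for the example): text written in 'a' mode lands after the existing content, and read() on such a file object raises io.UnsupportedOperation.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a_demo.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")          # appended after the existing content
    try:
        f.read()                 # 'a' is write-only, just like 'w'
        read_failed = False
    except io.UnsupportedOperation:
        read_failed = True

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)
print(read_failed)
```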

Copy the lyrics above back into Yesterday.txt.

Read the first 5 lines:

f = open("Yesterday.txt",'r',encoding="utf-8")
for i in range(5):
    # readline() reads one line per call; strip() removes surrounding whitespace, including the newline
    print(f.readline().strip())
f.close()

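Output (judging by this output, the file used here also has a Chinese translation under each lyric line, which the listing above omits):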

Somehow, it seems the love I knew was always the most destructive kind

不知为何，我经历的爱情总是最具毁灭性的的那种

Yesterday when I was young

昨日当我年少轻狂

The taste of life was sweet
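One detail worth knowing: readline() returns an empty string once the end of the file is reached, so a fixed range(5) loop on a shorter file simply prints blank lines. As an alternative sketch (not the method used in this article), itertools.islice stops early on its own. The helper head and the file head_demo.txt are hypothetical names for this example.

```python
import os
import tempfile
from itertools import islice

path = os.path.join(tempfile.mkdtemp(), "head_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

def head(file_path, n):
    """Return at most the first n lines of the file, without trailing newlines."""
    with open(file_path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, n)]

first_two = head(path, 2)
first_five = head(path, 5)   # the file only has three lines

# readline() at end-of-file returns an empty string rather than raising.
with open(path, encoding="utf-8") as f:
    for _ in range(3):
        f.readline()
    at_eof = f.readline()

print(first_two)
print(first_five)
print(repr(at_eof))
```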

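readlines() reads every line up to end-of-file (EOF) and returns them as a list, one element per line.

Note: avoid readlines() on large files. It loads the whole file into memory at once, which can make the program stall or run out of memory.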

f = open("Yesterday.txt",'r',encoding="utf-8")
print(f.readlines())
f.close()

Output:

['Somehow, it seems the love I knew was always the most destructive kind\n', '不知为何，我经历的爱情总是最具毁灭性的的那种\n',...]
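As the output shows, each element keeps its trailing newline. A short sketch on a throwaway file (lines_demo.txt is a name picked for the example) compares readlines() with read().splitlines(), which drops the line endings:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\n")

# readlines() keeps the newline at the end of every element.
with open(path, encoding="utf-8") as f:
    raw_lines = f.readlines()

# read().splitlines() returns the same lines without the line endings.
with open(path, encoding="utf-8") as f:
    clean_lines = f.read().splitlines()

print(raw_lines)
print(clean_lines)
```

Both variants hold the entire file in memory, so the warning about large files applies to either one.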

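Read the file and print a special marker when the loop reaches index 3, that is, the fourth line.

enumerate() takes an iterable (such as a list, tuple, or string) and yields (index, item) pairs, so you get each item together with its index. It is mostly used in for loops.

So we use enumerate() to get the index and, when the index is 3, print the marker in place of that line: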
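To see what enumerate() yields on its own, here is a tiny sketch on a plain list (the list contents are made up). Indexes start at 0 by default, which is why index 3 is the fourth item; the optional start argument shifts the numbering.

```python
lines = ["first", "second", "third", "fourth"]

# Default numbering starts at 0.
pairs = list(enumerate(lines))

# start=1 gives human-style line numbers instead.
numbered = list(enumerate(lines, start=1))

print(pairs)
print(numbered)
```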

f = open("Yesterday.txt",'r',encoding="utf-8")
for index,line in enumerate(f.readlines()):
    if index == 3:
        print('------Here is the segmenting line------')
        # Skip the rest of this iteration
        continue
    print(line.strip())
f.close()

Output:

Somehow, it seems the love I knew was always the most destructive kind

不知为何，我经历的爱情总是最具毁灭性的的那种

Yesterday when I was young

------Here is the segmenting line------

The taste of life was sweet

...

The approach above is crude: readlines() still loads the whole file into a list first.

Here is a more efficient way:

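f = open("Yesterday.txt",'r',encoding="utf-8")
# Counter, starting at 0
count = 0
# Iterate over the file object, one line at a time
for line in f:
    if count == 3:
        print('------Here is the segmenting line------')
        # Increment the counter
        count += 1
        # Skip the rest of this iteration
        continue
    # Print the line without its trailing newline
    print(line.strip())
    # Increment the counter
    count += 1
f.close()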

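The output is the same as above.

A plain loop over the file object yields only the lines, not their numbers, so a counter is added here for the check. (enumerate(f) would work as well and reads just as lazily.)

This approach uses very little memory. It never loads the whole file; each iteration reads only the next line (plus a small internal buffer), and once the loop moves on, the previous line is no longer referenced and can be reclaimed.

So at any moment the program holds roughly one line of the file, and even very large files are no problem.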
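As a variation on the counter version, enumerate() can wrap the file object directly, which keeps the line-by-line reading and supplies the index at the same time. The sketch below uses a generated file of 100,000 lines; the helper find_line and the file name big_demo.txt are assumptions made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    for n in range(1, 100001):
        f.write(f"line {n}\n")

def find_line(file_path, wanted_index):
    """Return the line at wanted_index (0-based), or None if the file is shorter."""
    with open(file_path, encoding="utf-8") as f:
        # enumerate(f) pulls one line at a time; no list of all lines is built.
        for index, line in enumerate(f):
            if index == wanted_index:
                return line.rstrip("\n")
    return None

fourth = find_line(path, 3)
missing = find_line(path, 10**6)

print(fourth)
print(missing)
```

The function returns as soon as the wanted line is found, so the rest of the file is never read.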


Blogger info

Name: Run
Occupation: mystery
Email: 383697894@qq.com
Location: Shanghai · Songjiang
