Problem description
I am trying to read files using Python's ftplib without writing them to disk. Something roughly equivalent to:
    def get_page(url):
        try:
            return urllib.urlopen(url).read()
        except:
            return ""
But using FTP.
I tried:
    def get_page(path):
        try:
            ftp = FTP('ftp.site.com', 'anonymous', 'passwd')
            return ftp.retrbinary('RETR ' + path, open('page').read())
        except:
            return ''
But this doesn't work. The only examples in the docs involve writing files using the ftp.retrbinary('RETR README', open('README', 'wb').write) format. Is it possible to read FTP files without writing them first?
Recommended answer
Well, you have the answer right in front of you: the FTP.retrbinary method accepts as its second parameter a reference to a function that is called, once per received block, whenever file content is retrieved from the FTP connection.
Here is a simple example:
    #!/usr/bin/env python
    from ftplib import FTP

    def writefunc(s):
        # Called once for each block of bytes received from the server.
        print("read:", s)

    ftp = FTP('ftp.kernel.org')
    ftp.login()  # anonymous login
    ftp.retrbinary('RETR /pub/readme_about_bz2_files', writefunc)
You should implement writefunc so that it actually appends the data it receives to an internal variable. Something like this, which uses a callable object:
    #!/usr/bin/env python
    from ftplib import FTP

    class Reader:
        def __init__(self):
            self.data = b""  # retrbinary delivers bytes on Python 3

        def __call__(self, s):
            self.data += s  # accumulate each block

    ftp = FTP('ftp.kernel.org')
    ftp.login()
    r = Reader()
    ftp.retrbinary('RETR /pub/readme_about_bz2_files', r)
    print(r.data)
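If a dedicated class feels heavy, the bound append method of a plain list can probably serve as the callback just as well. This is only a minimal sketch, assuming Python 3 and an FTP object that is already connected and logged in; the helper name fetch_chunks is hypothetical and not part of ftplib:

```python
from ftplib import FTP  # only needed by callers that create the connection


def fetch_chunks(ftp, path):
    """Return the remote file at `path` as bytes.

    `ftp` is assumed to be an already logged-in ftplib.FTP instance
    (or any object with a compatible retrbinary method).
    """
    chunks = []  # each callback invocation appends one block of bytes
    ftp.retrbinary('RETR ' + path, chunks.append)
    return b''.join(chunks)  # join once at the end
```

With the connection from the example above, `fetch_chunks(ftp, '/pub/readme_about_bz2_files')` would return the whole file as one bytes object. Joining once at the end avoids building a new bytes object on every block.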
Update: I realized that there is a class in the Python standard library that is meant for this kind of thing, io.BytesIO:
    #!/usr/bin/env python
    from ftplib import FTP
    from io import BytesIO

    ftp = FTP('ftp.kernel.org')
    ftp.login()
    r = BytesIO()  # in-memory buffer; nothing is written to disk
    ftp.retrbinary('RETR /pub/readme_about_bz2_files', r.write)
    print(r.getvalue())
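To tie this back to the get_page function from the question, the BytesIO approach could be wrapped up roughly as follows. This is a sketch under assumptions rather than a definitive implementation: it targets Python 3, the helper read_ftp_file and the host/user/passwd parameters are illustrative names, and returning empty bytes on failure simply mirrors the question's return '' behaviour:

```python
from ftplib import FTP, all_errors
from io import BytesIO


def read_ftp_file(ftp, path):
    """Read the remote file at `path` into memory and return it as bytes."""
    buf = BytesIO()
    # retrbinary calls buf.write once per received block.
    ftp.retrbinary('RETR ' + path, buf.write)
    return buf.getvalue()


def get_page(host, path, user='anonymous', passwd=''):
    """Fetch `path` from `host`; return b'' if anything goes wrong."""
    try:
        # FTP connects and logs in when host and user are given; the
        # context manager closes the connection when the block exits.
        with FTP(host, user, passwd) as ftp:
            return read_ftp_file(ftp, path)
    except all_errors:
        return b''
```

Catching ftplib.all_errors instead of using a bare except keeps unrelated problems, such as a KeyboardInterrupt, from being silently swallowed.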