Reading PDF Files with Python

2019-12-12  月夜星空下

pdf.py

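from PyPDF2 import PdfFileReader  # legacy PyPDF2 (< 3.0) API


def getTextPDF(pdfFileName):
    """Return the text of every page in the PDF, joined by newlines."""
    # Open in binary mode; the with-block closes the file when done.
    with open(pdfFileName, 'rb') as pdf_file:
        read_pdf = PdfFileReader(pdf_file)
        text = []
        # range() excludes its upper bound, so pass getNumPages() directly;
        # subtracting 1 would skip the last page.
        for i in range(read_pdf.getNumPages()):
            text.append(read_pdf.getPage(i).extractText())
    return '\n'.join(text)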
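
One detail worth checking in the page loop is the upper bound passed to range(). The sketch below uses a plain list as a hypothetical stand-in for the pages of a PDF (no PDF library involved) to show that range(0, n - 1) stops one page short, while range(n) visits every index.

```python
# Hypothetical stand-ins for the pages of a three-page PDF.
pages = ['page 1', 'page 2', 'page 3']
n = len(pages)


def collect(indices):
    """Return the pages found at the given indices, in order."""
    return [pages[i] for i in indices]


# range() excludes its stop value, so n - 1 drops the final index.
skips_last = collect(range(0, n - 1))
# range(n) yields 0 .. n-1, which covers every page.
all_pages = collect(range(n))

print(skips_last)  # ['page 1', 'page 2']
print(all_pages)   # ['page 1', 'page 2', 'page 3']
```

The same reasoning applies to any loop that indexes pages by number: the stop value should be the page count itself.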

TestPDFs.py

import pdf

# Path to the PDF to read; change this to a file on your own machine.
pdfFile = '/Users/lilong/Desktop/1.pdf'
# pdfFileEncrypted = 'sonnets.pdf'
print("PDF 1:\n", pdf.getTextPDF(pdfFile))
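
PyPDF2 3.0 and later removed the legacy names used in pdf.py (PdfFileReader, getNumPages, getPage, extractText) in favour of PdfReader, reader.pages and page.extract_text(). As a minimal sketch, assuming one of those newer releases is installed, the same function could look like the following. The names extract_text_from_reader and get_text_pdf are hypothetical helpers introduced here for illustration, not part of the original post or of the library.

```python
def extract_text_from_reader(reader):
    """Join the text of every page in reader.pages with newlines.

    `reader` is assumed to expose a `pages` sequence whose items have an
    extract_text() method, as PyPDF2 >= 3.0's PdfReader does.
    """
    # Iterating over reader.pages avoids index arithmetic entirely,
    # so there is no upper bound to get wrong.
    return '\n'.join(page.extract_text() or '' for page in reader.pages)


def get_text_pdf(pdf_file_name):
    """Hypothetical counterpart of getTextPDF for the newer API."""
    # Imported here so the helper above can be used without PyPDF2.
    from PyPDF2 import PdfReader  # assumption: PyPDF2 >= 3.0 is installed

    with open(pdf_file_name, 'rb') as pdf_file:
        return extract_text_from_reader(PdfReader(pdf_file))
```

Calling get_text_pdf(pdfFile) from TestPDFs.py would then take the place of pdf.getTextPDF(pdfFile). Splitting out the joining step keeps the page handling testable without a real PDF on disk.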