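Python PyPDF2 & pyPdf: error when processing multi-page PDF documents
When a PDF file has multiple pages, this error may come out to greet you! It is raised while a PDF dictionary is being read, when the same key is defined more than once.
The fix is to edit the library's generic.py file directly.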
(1) pyPdf
The file is located roughly here:
/usr/lib/python2.7/site-packages/pyPdf/generic.py
if data.has_key(key):
    # multiple definitions of key not permitted
    raise utils.PdfReadError, "multiple definitions in dictionary"
data[key] = value
This code is at around lines 532-536.
Change it to:
if not data.get(key):
    data[key] = value
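The effect of the patched check can be tried outside the library. The following is only a minimal sketch with a plain dict: the helper name `store_first_definition` and the sample entries are hypothetical and are not part of pyPdf.

```python
def store_first_definition(data, key, value):
    """Mimic the patched check: keep an existing entry, ignore later ones."""
    if not data.get(key):
        data[key] = value
    return data


# Hypothetical parsed entries; "/Type" is defined twice.
entries = [("/Type", "/Page"), ("/Count", 3), ("/Type", "/Pages")]

data = {}
for key, value in entries:
    store_first_definition(data, key, value)

print(data["/Type"])  # the first definition of "/Type" is the one kept

# Caveat: data.get(key) is also falsy for stored values such as 0 or "",
# so a later definition replaces an earlier falsy one.
falsy = store_first_definition({"/Count": 0}, "/Count", 5)
print(falsy["/Count"])
```

Note the caveat in the last lines: if that behavior is not wanted, a test such as `key not in data` would be stricter, though that differs from the change described here.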
(2) PyPDF2
The file is located roughly here:
/usr/lib/python2.7/site-packages/PyPDF2/generic.py
if not data.get(key):
    data[key] = value
elif pdf.strict:
    # multiple definitions of key not permitted
    raise PdfReadError(
        "Multiple definitions in dictionary at byte %s for key %s" \
        % (utils.hexStr(stream.tell()), key))
else:
    warnings.warn(
        "Multiple definitions in dictionary at byte %s for key %s" \
        % (utils.hexStr(stream.tell()), key), PdfReadWarning)
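The three branches above (store, raise in strict mode, warn otherwise) can be sketched in isolation. This is an illustration under assumptions, not the library's code: `PdfReadError` and `PdfReadWarning` are local stand-ins defined here, `read_entries` is a hypothetical helper, and the byte offset is left out of the message.

```python
import warnings


class PdfReadError(Exception):
    """Local stand-in for the library's error class (illustration only)."""


class PdfReadWarning(UserWarning):
    """Local stand-in for the library's warning class (illustration only)."""


def read_entries(entries, strict):
    """Hypothetical helper that follows the same three branches."""
    data = {}
    for key, value in entries:
        if not data.get(key):
            data[key] = value
        elif strict:
            # strict mode: a repeated key is an error
            raise PdfReadError(
                "Multiple definitions in dictionary for key %s" % key)
        else:
            # non-strict mode: keep the first value and only warn
            warnings.warn(
                "Multiple definitions in dictionary for key %s" % key,
                PdfReadWarning)
    return data


duplicated = [("/Type", "/Page"), ("/Type", "/Pages")]

try:
    read_entries(duplicated, strict=True)
    strict_failed = False
except PdfReadError:
    strict_failed = True

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lenient = read_entries(duplicated, strict=False)

print(strict_failed, lenient, len(caught))
```

In the sketch, the strict call fails on the repeated key, while the non-strict call keeps the first value and records one warning.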
Change it to:
if not data.get(key):
    data[key] = value
# elif pdf.strict:
#     # multiple definitions of key not permitted
#     raise PdfReadError(
#         "Multiple definitions in dictionary at byte %s for key %s" \
#         % (utils.hexStr(stream.tell()), key))
# else:
#     warnings.warn(
#         "Multiple definitions in dictionary at byte %s for key %s" \
#         % (utils.hexStr(stream.tell()), key), PdfReadWarning)
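Commenting out both branches makes repeated keys pass silently. A similar result for the warning branch alone might be reachable without touching the installed file, by filtering the warning category with the standard `warnings` module. The sketch below only demonstrates that filtering mechanism: `PdfReadWarning` is a local stand-in and `read_entries_lenient` is a hypothetical helper. Whether this fits your installation (for example, whether the reader can be created in non-strict mode so that the warning branch is reached) is an assumption to verify.

```python
import warnings


class PdfReadWarning(UserWarning):
    """Local stand-in for the library's warning class (illustration only)."""


def read_entries_lenient(entries):
    """Hypothetical helper: keep the first value, warn on repeated keys."""
    data = {}
    for key, value in entries:
        if not data.get(key):
            data[key] = value
        else:
            warnings.warn(
                "Multiple definitions in dictionary for key %s" % key,
                PdfReadWarning)
    return data


duplicated = [("/Type", "/Page"), ("/Type", "/Pages")]

with warnings.catch_warnings(record=True) as shown:
    warnings.simplefilter("always")                  # report every warning...
    warnings.simplefilter("ignore", PdfReadWarning)  # ...except this category
    result = read_entries_lenient(duplicated)

print(result, len(shown))
```

The filter added last is checked first, so warnings of the stand-in category are dropped while other warnings would still be reported.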
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.