[Bug]: File generated by OCRmyPDF doesn't open in all PDF editors #1400

Open
sklart opened this issue Oct 1, 2024 · 2 comments

sklart commented Oct 1, 2024

Describe the bug

Hello!
I ran into an interesting problem with the file Сертификат качества №3490 (опора 1У110-5+10, 2 шт.).pdf.

It opens only in PDF-XChange Editor 10.4.1.389; it does not open in Foxit PDF Editor Pro 12.1.1.15289 or in Acrobat Pro 2023.006.20360 (64-Bit).
My guess is that pikepdf saves the file in some PDF variant that the other editors do not support.
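
A quick first check of the "unsupported PDF format" guess is to look at the version declared in the file header and at the end-of-file marker. The sketch below uses only the Python standard library; the function names are hypothetical and it is not part of OCRmyPDF or pikepdf. It inspects raw bytes only, so it can hint at a problem but cannot prove that a file is valid.

```python
import re
from pathlib import Path
from typing import Optional


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version from a '%PDF-x.y' header, or None if no header is found."""
    # Only the start of the file is searched; 1024 bytes is an assumed window.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def has_eof_marker(data: bytes) -> bool:
    """Report whether a '%%EOF' marker appears near the end of the file."""
    return b"%%EOF" in data[-1024:]


def describe(path: Path) -> str:
    """Summarize the header version and EOF marker for one file."""
    data = path.read_bytes()
    return (
        f"{path.name}: version={pdf_header_version(data)}, "
        f"eof_marker={has_eof_marker(data)}"
    )
```

Running `describe()` on the file before and after OCR would show whether the declared version changed. The document catalog can also declare a version, which this sketch does not read.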

The PDF is created as follows:

  1. Scan the document on the MFP.
  2. Process it manually in PDF-XChange Editor: rotate pages and resize them to A4 or A3 (the MFP sometimes produces PDFs with non-standard page dimensions of several hundred mm).
  3. Run OCR on every PDF in the folder with ocrmypdf (which uses pikepdf to process PDFs), using the command:
    FOR /r %F IN (*.pdf) DO ocrmypdf -l eng+rus --rotate-pages --skip-text --optimize 1 --output-type pdf "%F" "%~fF"
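
The command above appears to pass the same path as both input and output, so each scan is replaced by its OCR result. A minimal sketch of a variant that keeps the pre-OCR files is shown here, assuming Python is available and `ocrmypdf` is on the PATH. The helper names and the separate output folder are hypothetical; the flags are the ones from the command above.

```python
import subprocess
from pathlib import Path
from typing import List

# Flags copied from the command in this report.
OCR_FLAGS = [
    "-l", "eng+rus",
    "--rotate-pages",
    "--skip-text",
    "--optimize", "1",
    "--output-type", "pdf",
]


def build_command(src: Path, root: Path, out_dir: Path) -> List[str]:
    """Build an ocrmypdf call that writes to out_dir, mirroring src's path under root."""
    dst = out_dir / src.relative_to(root)
    return ["ocrmypdf", *OCR_FLAGS, str(src), str(dst)]


def run_all(root: Path, out_dir: Path) -> None:
    """Run OCR on every PDF under root; choose an out_dir outside root."""
    for src in sorted(root.rglob("*.pdf")):
        cmd = build_command(src, root, out_dir)
        # Create the mirrored subfolder before ocrmypdf writes into it.
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        # check=False: one failing file does not stop the rest of the batch.
        subprocess.run(cmd, check=False)
```

Calling `run_all(Path("scans"), Path("scans_ocr"))` (both folder names are placeholders) would leave the originals untouched, which makes it possible to share the input file when a result does not open.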

If I convert a page of the document to an image in PDF-XChange Editor and then save it back to PDF, the file opens in all editors.

I would still like to know what causes this behavior.
Can anyone explain it?

P.S. I first reported this in the pikepdf issue tracker, but I was told that pikepdf is not the cause and that this is more likely an OCRmyPDF issue.

Steps to reproduce

No response

Files

No response

How did you download and install the software?

PyPI (pip, poetry, pipx, etc.)

OCRmyPDF version

ocrmypdf 16.4.3

Relevant log output

No response

sklart added the "triage" label Oct 1, 2024

jbarlow83 (Collaborator) commented

Please provide the input file, before processing with ocrmypdf.

jbarlow83 added the "need test file" label and removed the "triage" label Oct 1, 2024