When the input PDF is in Chinese, text copied from the OCRmyPDF output PDF contains extra spaces #1391

Open
1 of 3 tasks
deict opened this issue Aug 30, 2024 · 1 comment
Labels
triage Issue needs triage

Comments

deict commented Aug 30, 2024

Simple sanity checks

  • This is an issue with an app that uses OCRmyPDF for OCR
  • I am using a recent version of the third party app
  • I will include a file that reproduces the issuse

Third party app name and version

No response

Describe the bug

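Screenshot after recognition with ocrmypdf: image
Text copied from the recognized PDF: ‘短期 负 荷 预 测; 影 响负荷 的 因 素 很 多 ,存 在 着 不 确 定 性 ,首 先 需 要 进 行 一 定 的 数 据 清 洗 ,
过 滁 掉 一 些 数 据 , 然 后 进 行 特 征 选 择 , 选 取 上 一 天 的 负 荷’
Extra spaces appear between the characters.

Screenshot of the run: image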
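
As a possible stopgap on the copied text (not a fix for the OCR text layer itself), the spaces between Chinese characters could be removed in post-processing. The following is a minimal sketch; the function name `strip_cjk_spaces` and the chosen Unicode ranges are assumptions for illustration, not part of OCRmyPDF. Spaces next to Latin letters or ASCII punctuation are deliberately left alone.

```python
import re

# Assumed character ranges: CJK Unified Ideographs, CJK symbols and
# punctuation, and halfwidth/fullwidth forms. Adjust for other scripts.
_CJK = r"\u4e00-\u9fff\u3000-\u303f\uff00-\uffef"

# Match spaces or tabs that sit directly between two CJK characters.
# Lookarounds keep the characters out of the match, so every gap in a
# run like "负 荷 预 测" is matched and removed.
_GAP = re.compile(rf"(?<=[{_CJK}])[ \t]+(?=[{_CJK}])")


def strip_cjk_spaces(text: str) -> str:
    """Remove whitespace between adjacent CJK characters (hypothetical helper)."""
    return _GAP.sub("", text)


if __name__ == "__main__":
    print(strip_cjk_spaces("短期 负 荷 预 测"))
```

This only cleans text after it has been copied; the extra spaces remain in the PDF text layer.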

Steps to reproduce

1. Import attached file into Paperless-ngx
2. Trigger OCR
3. Check log file
4. ....\ocrmypdf -l chi_sim C:/Users/15179/Pictures/pp.pdf C:/Users/15179/Pictures/pp2.pdf

Files

pp.pdf

OCRmyPDF version

16.4.3

Relevant log output

No response

deict added the triage (Issue needs triage) label on Aug 30, 2024

wywzxxz commented Sep 21, 2024

3 participants