Improve accuracy on long input lines #32

Merged 2 commits on Feb 27, 2024
Conversation

@robertknight (Owner) commented Feb 27, 2024

Increase the maximum width of text line images after preprocessing, to reduce the amount by which long input lines are squashed horizontally. This reduces the frequency of letters being dropped or misinterpreted when processing long lines.

The downside is that the time taken to recognize a line is proportional to its width after preprocessing, so this significantly (e.g. 2x) increases recognition time for long lines.
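The trade-off above can be illustrated with a minimal sketch. It assumes preprocessing scales each line image to a fixed height and then caps the width; the function names, the target height of 64, and the two width caps are hypothetical values chosen for illustration, not the project's actual code or settings.

```python
def preprocessed_width(src_width: int, src_height: int,
                       target_height: int, max_width: int) -> int:
    """Width of a line image after scaling to target_height, capped at max_width."""
    # Width the line would have if the aspect ratio were preserved.
    natural = round(src_width * target_height / src_height)
    return max(1, min(natural, max_width))


def horizontal_squash(src_width: int, src_height: int,
                      target_height: int, max_width: int) -> float:
    """Factor by which the line is compressed horizontally (1.0 means none)."""
    natural = src_width * target_height / src_height
    return max(1.0, natural / max_width)


# Hypothetical long line: 3000x30 px, scaled to a height of 64 px.
for cap in (800, 2400):  # illustrative caps only
    width = preprocessed_width(3000, 30, 64, cap)
    squash = horizontal_squash(3000, 30, 64, cap)
    print(f"cap={cap}: width={width}, squash={squash:.2f}x")
```

Under these assumptions, raising the cap shrinks the squash factor, so each glyph keeps more horizontal pixels, while the wider output means proportionally more recognition work per line.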

Using the test image in this commit, taken from a screenshot of https://en.wikipedia.org/wiki/Polar_bear:

Before:

Fosis of polar bears are uneomman 108l the odesf known fosi is a 180.000- o 10-000-year-Od jan bone. bund on Prince Cheres Foreland
Nowau,in 2un4 /? Saienicis in he 20th century sumised that polr hears oiredty dexendted fon a popuatio of brown bears, posiby in easlen
Stheria or Alesta @1 3) whactontial DhWA studies in the I9lls and 2001s sunported the satus oy the polar bear asv  derivathve ofthe hown bear
linding thet sona brown bear pooulations were more dosaliy elated to polar beas thean to ater brown bears partceuiarty the ABC sands heasot
Sountheast Alaska P01 12 12010 stoy esimated that the polar hear imneage solit irom other brown bears around 150,000 years 280. 0.

Processing time ~850ms.

After:

Fossils of polar bears are uncommon. [12][15] The oldest known fossil is a 130.000- to 110,000-year-old jaw bone, found on Prince Charles Foreland,
Norway, in 2004,120)[1) Scientists in the 20th century surmised that polar bears directly descended from a population of brown bears, possibly in eastern
Siberia or Alaska.[12][15] Mitochondrial DNA studies in the 1990s and 2000s supported the status of the polar bear as a derivative of the brown bear.
finding that some brown bear populations were more closely related to polar bears than to other brown bears, particularly the ABC Islands bears of
Southeast Alaska.[20][21]22] A 2010 study estimated that the polar bear lineage split from other brown bears around 150,000 years ago./20]

Processing time ~1500ms.
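As a rough check of the "2x" figure against the two approximate timings reported for this test image (a single measurement each, so only indicative):

```python
# Approximate processing times reported for the test image.
before_ms = 850
after_ms = 1500

slowdown = after_ms / before_ms
print(f"slowdown: {slowdown:.2f}x")  # about 1.76x
```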

Fixes #31

Increase the maximum width of text line images after preprocessing, to reduce
the amount by which long input lines are squashed horizontally. This reduces
the frequency of letters being dropped or misinterpreted when processing long
lines.

The downside is that the time taken to recognize a line is proportional to its
width after preprocessing, so this significantly (e.g. 2x) increases recognition
time for long lines.

Using the test image in this commit, taken from a screenshot of
https://en.wikipedia.org/wiki/Polar_bear:

Before:

```
Fosis of polar bears are uneomman 108l the odesf known fosi is a 180.000- o 10-000-year-Od jan bone. bund on Prince Cheres Foreland
Nowau,in 2un4 /? Saienicis in he 20th century sumised that polr hears oiredty dexendted fon a popuatio of brown bears, posiby in easlen
Stheria or Alesta @1 3) whactontial DhWA studies in the I9lls and 2001s sunported the satus oy the polar bear asv  derivathve ofthe hown bear
linding thet sona brown bear pooulations were more dosaliy elated to polar beas thean to ater brown bears partceuiarty the ABC sands heasot
Sountheast Alaska P01 12 12010 stoy esimated that the polar hear imneage solit irom other brown bears around 150,000 years 280. 0.
```

Processing time ~850ms.

After: (See polar-bears.expected.txt in this commit)

Processing time ~1500ms.

Fixes #31

During training the max width is only 800px, but the model generalizes well to
longer sequences at inference time.
@robertknight merged commit 8dec774 into main on Feb 27, 2024
2 checks passed
@robertknight deleted the long-lines branch on February 27, 2024 09:28
Successfully merging this pull request may close these issues.

Improve recognition accuracy for long text lines