Improve support for small typos #137

Open
offsky opened this issue Oct 17, 2024 · 2 comments

offsky commented Oct 17, 2024

Would it be possible to improve the fuzzy search algorithm so it can handle simple typos? Here is an example:

Screenshot 2024-10-17 at 11 09 47 AM

If you have perfect spelling as above you get a good match (69%) and the highlight captures the entire phrase. However, if you transpose the last two letters of the search string, the score drops to 25% and ranks below a result that has shorter matches. Also, the highlight is no longer showing as much as it could. The search string has swapped the last two letters, but the highlight has now dropped the last 10 letters. I was expecting the highlight to at least contain "turn any list into an animat" since this is an exact match. I could understand if the word "animated" gets dropped from the highlight, but why is the prior word "an" being dropped as well?

Screenshot 2024-10-17 at 11 10 07 AM

Here is a more extreme example: the "x" at the end of the search string causes the entire result to be omitted, even though the first 5 words of the search appear perfectly in the result. The misspelled word in the search poisons the entire query.

Screenshot 2024-10-17 at 11 18 06 AM
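
Both behaviours described above are what a strict in-order character matcher would produce. The following is a minimal sketch of such a matcher, assuming a plain greedy subsequence search; it is not fuzzysort's actual implementation, and `subsequenceMatch` and the sample target string are hypothetical names chosen for illustration.

```typescript
// Hypothetical greedy subsequence matcher (NOT fuzzysort's real code).
// Returns the index in `target` of each matched query character,
// or null if the query characters do not all appear in order.
function subsequenceMatch(query: string, target: string): number[] | null {
  const q = query.toLowerCase();
  const t = target.toLowerCase();
  const indices: number[] = [];
  let from = 0;
  for (let i = 0; i < q.length; i++) {
    const found = t.indexOf(q.charAt(i), from);
    if (found === -1) return null; // one unmatched character rejects the whole query
    indices.push(found);
    from = found + 1;
  }
  return indices;
}

// Hypothetical target text, made up for this example.
const sampleTarget = "turn any list into an animated gallery";

// Exact spelling: one contiguous run of indices 0..29.
console.log(subsequenceMatch("turn any list into an animated", sampleTarget));

// Transposed "de": the 'e' can only be found later, in "gallery" (index 35),
// so the match becomes scattered, which a scorer would typically penalise.
console.log(subsequenceMatch("turn any list into an animatde", sampleTarget));

// Trailing "x": no 'x' exists in the target, so the result is null.
console.log(subsequenceMatch("turn any list into an animatex", sampleTarget));
```

Under this model the transposed query still matches, but only by borrowing a stray letter from further along the target, and a character that appears nowhere in the target removes the result entirely.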

farzher (Owner) commented Oct 19, 2024

> Would it be possible to improve the fuzzy search algorithm so it can handle simple typos?

no.

early versions used to handle a single transpose like in your 2nd example and return a normal score of 69%, but this caused more problems than it solved. lots of issues asking why things were a match and it was because of this.

i can't think of a simple/fast way to deal with typos that doesn't cause other issues.

if you care about typos you should use a different search library like fuse.js, which often returns nonsense results, probably a direct consequence of supporting typos
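
The trade-off described in this comment can be sketched as follows. This is only an illustration of one way to tolerate a single adjacent transposition, assuming a simple subsequence matcher underneath; it is not how early versions of the library did it, and `isSubsequence`, `adjacentTranspositions` and `matchesWithOneTranspose` are hypothetical helper names.

```typescript
// Hypothetical helper: true if every character of `query` occurs in `target` in order.
function isSubsequence(query: string, target: string): boolean {
  let from = 0;
  for (let i = 0; i < query.length; i++) {
    const found = target.indexOf(query.charAt(i), from);
    if (found === -1) return false;
    from = found + 1;
  }
  return true;
}

// Hypothetical helper: every variant of `query` with one adjacent pair swapped.
function adjacentTranspositions(query: string): string[] {
  const variants: string[] = [];
  for (let i = 0; i < query.length - 1; i++) {
    if (query.charAt(i) === query.charAt(i + 1)) continue; // swap would change nothing
    variants.push(
      query.slice(0, i) + query.charAt(i + 1) + query.charAt(i) + query.slice(i + 2)
    );
  }
  return variants;
}

// Try the query as typed, then each single-transposition variant.
function matchesWithOneTranspose(query: string, target: string): boolean {
  if (isSubsequence(query, target)) return true;
  return adjacentTranspositions(query).some((v) => isSubsequence(v, target));
}

// The typo from this issue is now tolerated.
console.log(matchesWithOneTranspose("animatde", "animated"));

// But so is a query for a different word: "form" now matches "from",
// the kind of result that leaves users asking why something matched.
console.log(matchesWithOneTranspose("form", "from"));

// A stray character that is not in the target is still rejected.
console.log(matchesWithOneTranspose("animatex", "animated"));
```

In this sketch a query of length n costs up to n - 1 extra passes per target, and it accepts matches the user never typed, which is consistent with both the speed concern and the confusing-match reports mentioned above.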

@leeoniya

shameless plug for https://github.com/leeoniya/uFuzzy
