regex for language codes does not accept es-419 #1352
Comments
Thanks for reporting this; it's really something we've missed. The regexp that validates languages is in https://github.com/Chainlit/chainlit/blob/main/backend/chainlit/server.py#L647. I'd love a PR correcting this!
@dosu What would you suggest as a regex to extend support for BCP 47?
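Hey @dokterbob! I'm here to help you with any bugs, questions, or contributions you have while waiting for a human maintainer. Let's squash those bugs together! 🤖
To extend support for BCP 47 language tags, including numeric region codes like es-419, you could use a pattern such as:
^[a-zA-Z]{2,3}(-[a-zA-Z0-9]{2,3})?(-[a-zA-Z0-9]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$
This pattern allows for:
- a primary language subtag of 2 to 3 letters
- an optional second subtag of 2 to 3 letters or digits (for example, a region such as 419)
- an optional third subtag of 2 to 8 letters or digits
- an optional private-use part introduced by -x-, made of 1 to 8 letters or digits
This should cover language codes like es-419.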
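As a quick sanity check, the pattern suggested in this comment can be run against a few tags with Python's re module. This is only a sketch: the helper name is_accepted and the sample tags are illustrative assumptions, and the pattern is the one quoted in the thread, not necessarily what Chainlit ships.

```python
import re

# Pattern quoted from the comment above. It is a pragmatic subset,
# not a complete BCP 47 grammar.
LANGUAGE_PATTERN = re.compile(
    r"^[a-zA-Z]{2,3}(-[a-zA-Z0-9]{2,3})?(-[a-zA-Z0-9]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$"
)


def is_accepted(tag: str) -> bool:
    """Return True if the whole tag matches the suggested pattern (hypothetical helper)."""
    return LANGUAGE_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    # Sample tags chosen for illustration only.
    for tag in ["es", "es-ES", "es-419", "de-CH-1996", "zh-Hant-TW", "es_419"]:
        print(f"{tag}: {is_accepted(tag)}")
```

With this pattern, es-419 is accepted, but a tag with a script subtag followed by a region, such as zh-Hant-TW, is still rejected, so it should be read as a partial fix and not as full BCP 47 support.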
Hi! I am experiencing the same issue. What @DanielVelaJ proposes solved it, but this should be treated as an important fix.
Describe the bug
When using Chainlit, setting Google Chrome's language to "español (Latinoamérica)" causes the application to fail with a 422 Unprocessable Entity error. The issue arises because Chainlit's language validation pattern does not accept the language code es-419, which corresponds to "español (Latinoamérica)". As a result, the application is unable to load translations and settings, preventing it from functioning properly.
To Reproduce
Steps to reproduce the behavior:
Set Google Chrome Language to "español (Latinoamérica)".
Run a Chainlit Application: Run chainlit hello to start a basic Chainlit application.
Open the Application: Go to http://localhost:8000.
Observe the Error: Open the browser's developer tools (press F12 or right-click and select Inspeccionar (Inspect), then go to the Console tab). The console shows 422 Unprocessable Entity errors related to requests to /project/translations and /project/settings with the query parameter language=es-419.
Expected behavior
Chainlit should accept the es-419 language code corresponding to "español (Latinoamérica)" and load the appropriate translations if available. If translations for es-419 are not available, the application should gracefully fall back to a default language (e.g., es for general Spanish or en for English) without causing errors. The application should load normally and be fully functional regardless of the browser's language settings.
Screenshots
Desktop (please complete the following information):
Smartphone (please complete the following information):
Not applicable.
Additional context
Error Details:
The server returns the following error message:
Cause of the Issue:
The error occurs because Chainlit's validation regex for the language query parameter does not accept numeric region codes like 419. The regex pattern only allows alphabetic characters in the region and variant parts, so es-419 (which corresponds to "español (Latinoamérica)") is rejected.
Impact:
Users with Google Chrome set to "español (Latinoamérica)" cannot load Chainlit applications properly, affecting accessibility for Spanish-speaking users in Latin America and the Caribbean.
Workaround:
Changing Chrome's language setting to general Spanish (es) or Spanish (Spain) (es-ES) allows the application to load correctly. However, this is not an ideal solution for end-users who prefer "español (Latinoamérica)".
Suggested Fix:
Modify the Validation Regex:
Update the regex pattern in Chainlit's code to accept numeric region codes. For example:
This change allows numeric values in the region and variant parts, accommodating language codes like es-419.
Graceful Fallback:
Implement logic to default to a base language (e.g., es) if a specific regional variant is not supported. If es-419 translations are not available, Chainlit should use es.json or en.json without causing errors.
References:
Chainlit Documentation:
The documentation mentions that translation files are named after the language code and that the language is dynamically set based on the browser's language. However, it does not specify limitations regarding numeric region codes.
IETF Language Tags:
According to the IETF BCP 47 standard, language tags like es-419 are valid and commonly used to represent regional variations.
Additional Notes:
Reproducing the Issue: The issue was observed exclusively on Google Chrome with the language set to "español (Latinoamérica)". Other browsers were not tested.
Translation Files: Attempting to add an es-419.json translation file did not resolve the issue because the validation pattern rejects the es-419 code.
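The graceful fallback described under Suggested Fix could be sketched roughly as follows. This is a minimal illustration, not Chainlit's implementation: the function name resolve_language, its parameters, and the list of available translations are all hypothetical. It drops trailing subtags one at a time (es-419, then es) until an available translation is found, which is similar in spirit to the lookup scheme of RFC 4647.

```python
from typing import Iterable


def resolve_language(requested: str, available: Iterable[str], default: str = "en") -> str:
    """Pick the closest available translation for a requested language tag.

    Hypothetical helper: tries the full tag first, then progressively
    shorter prefixes, and finally returns the default language.
    """
    # Compare case-insensitively, but return the name as it was provided.
    by_lower = {tag.lower(): tag for tag in available}
    parts = requested.replace("_", "-").split("-")
    while parts:
        candidate = "-".join(parts).lower()
        if candidate in by_lower:
            return by_lower[candidate]
        parts.pop()  # es-419 -> es
    # Assumption: the default translation always exists.
    return default


if __name__ == "__main__":
    translations = ["en", "es"]  # e.g. en.json and es.json are present
    print(resolve_language("es-419", translations))
    print(resolve_language("fr-CA", translations))
```

With only en and es translations present, a request for es-419 resolves to es, and an unsupported language such as fr-CA resolves to the default en, so neither request needs to end in an error.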