Test cases for languages #118

Open
heatherleaf opened this issue Jan 3, 2019 · 5 comments

Comments

@heatherleaf
Member

Can we define how to write test cases for RGL languages?

Here's a simple suggestion: every language (starting with new ones) should have a small corpus with positive examples. The corpus can consist of lines of the following kind:

  1. jag sover i huset -- LangEng: I sleep in the house
  2. huset är stort -- Lang: PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetCN (DetQuant DefArt NumSg) (UseN house_N)) (UseComp (CompAP (PositA big_A)))))) NoVoc
  3. ibland sover jag inte

Testing should be fairly easy, and can be done automatically (perhaps part of Travis CI?):

  1. Translate the sentence into the given language (here LangEng) and test whether any of the translations is the given one
  2. Parse the sentence, and test if any of the parse trees is the given one
  3. Parse the sentence and test that it gets at least one parse
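
A minimal sketch of how the three kinds of corpus lines and their checks could fit together, in Python. Everything here is an assumption for illustration: the names `Case`, `read_case` and `check` are hypothetical, and the `parse` and `translate` arguments stand in for whatever actually calls GF, which this sketch does not do.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Case(NamedTuple):
    sentence: str
    target: Optional[str]    # e.g. "LangEng" or "Lang"; None if nothing is expected
    expected: Optional[str]  # a translation or an abstract tree; None if nothing is expected


def read_case(line: str) -> Case:
    """Split 'sentence -- Module: expected' into its parts."""
    sentence, sep, rest = line.partition(" -- ")
    if not sep:
        # No expectation given: the sentence only has to parse.
        return Case(line.strip(), None, None)
    target, _, expected = rest.partition(":")
    return Case(sentence.strip(), target.strip(), expected.strip())


def check(
    case: Case,
    abstract: str,
    parse: Callable[[str], Iterable[str]],
    translate: Callable[[str, str], Iterable[str]],
) -> bool:
    """Apply the check that matches the kind of line (hypothetical helper)."""
    if case.target is None:
        # Kind 3: pass if there is at least one parse.
        return any(True for _ in parse(case.sentence))
    if case.target == abstract:
        # Kind 2: the given tree must be among the parse trees.
        return case.expected in parse(case.sentence)
    # Kind 1: the given sentence must be among the translations.
    return case.expected in translate(case.sentence, case.target)
```

Passing the parser and translator in as functions keeps the file format independent of how GF is invoked, so the same checks could run locally or in CI.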
@heatherleaf
Member Author

@inariksit's GFTest would be good to use here, but that will involve more work to set up.

@inariksit
Member

@heatherleaf I think gftest can be used as an inspiration to create test sentences, which are then stored in the corpus. gftest doesn't have any notion of translation equivalents: it just generates test cases for a single language (sure, it can then linearise them in multiple languages), but you can't tell gftest "expect X to be translated as Y and alert me if not". For such a case, basic unit tests are a much better idea.

What would be nice, if gftest wasn't so slow for certain languages, would be to use the feature that tests a single grammar against its older version, and outputs all differences. But given that this feature can take hours to run for any reasonably complex language (and still minutes for, say, Dutch), it's not a good idea to put it in Travis.

Regarding your issue #117, I am already using gftest for sanity checking my big updates in Arabic. So for your point "2. There must be test cases: at least a simple parallel corpus or treebank with 10-20 sentences", I can easily include a set of sentences that were wrong before my update and are now correct. I haven't bothered to document my pull requests before, because in practice I merge them immediately, and afaik there is nobody in the GF core team who knows Arabic to any relevant extent. But if we want to set a good example for future RGL contributors, I can start doing that for my future pull requests.

@johnjcamilleri
Member

Sounds like a good start. The format should probably use a few more line breaks though, for developer sanity:

jag sover i huset
LangEng: I sleep in the house

huset är stort
Lang: PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetCN (DetQuant DefArt NumSg) (UseN house_N)) (UseComp (CompAP (PositA big_A)))))) NoVoc

ibland sover jag inte

To add a bit of structure:

  • All files such as the one above should go in <language>/test/<name>.tests, e.g. swedish/test/basic.tests
  • This way we can have multiple test files per language to allow for some organisation.
  • We have a top-level script which looks for all these files for all (or specified) languages and runs them as you outlined above. Can this be written in plain Haskell? Do we need Shell/Batch versions?
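
As a rough sketch of the discovery half of such a top-level script, here is one way to collect the files, written in Python rather than Haskell or Shell purely for illustration. The function name `find_test_files` is hypothetical, and the layout `<language>/test/<name>.tests` is taken from the list above.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Optional


def find_test_files(
    root: Path, languages: Optional[Iterable[str]] = None
) -> Dict[str, List[Path]]:
    """Group every <language>/test/<name>.tests file under root by language.

    If languages is given, only those language directories are included.
    """
    wanted = set(languages) if languages is not None else None
    found: Dict[str, List[Path]] = {}
    for path in sorted(root.glob("*/test/*.tests")):
        language = path.parent.parent.name  # the <language> directory name
        if wanted is None or language in wanted:
            found.setdefault(language, []).append(path)
    return found
```

Calling `find_test_files(Path("."))` from the repository root would pick up all languages, while `find_test_files(Path("."), ["swedish"])` would restrict the run to one.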

@heatherleaf
Member Author

I like your suggestion.

Should the examples also specify the top-level category or just assume the default one? (Which is Utt, right?)

Perhaps the first line in each example should also specify its concrete grammar? Like this:

LangSwe: jag sover i huset
LangEng: I sleep in the house

LangSwe: huset är stort
Lang: PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetCN (DetQuant DefArt NumSg) (UseN house_N)) (UseComp (CompAP (PositA big_A)))))) NoVoc

LangSwe: ibland sover jag inte

The advantage is then that we can test different modules (e.g., Lang, Dict, Extra, ...)
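
To make the prefixed format concrete, here is a minimal reader for it, sketched in Python. It assumes that examples are separated by blank lines, that every line starts with a module name followed by a colon, and that an example has at most one expectation line; the names `Example` and `parse_examples` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Example(NamedTuple):
    grammar: str                   # module of the input sentence, e.g. "LangSwe"
    sentence: str
    expect_module: Optional[str]   # e.g. "LangEng" or "Lang"; None if absent
    expected: Optional[str]        # translation or abstract tree; None if absent


def _split_prefixed(line: str):
    """Split 'Module: text' at the first colon."""
    name, sep, text = line.partition(":")
    if not sep:
        raise ValueError(f"missing 'Module:' prefix in line: {line!r}")
    return name.strip(), text.strip()


def parse_examples(text: str) -> List[Example]:
    """Read blank-line-separated examples from the contents of a test file."""
    examples = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        if len(lines) > 2:
            raise ValueError(f"expected at most two lines per example: {block!r}")
        grammar, sentence = _split_prefixed(lines[0])
        if len(lines) == 2:
            module, expected = _split_prefixed(lines[1])
        else:
            module, expected = None, None
        examples.append(Example(grammar, sentence, module, expected))
    return examples
```

Because each example names its own module, one file could mix inputs from, say, Lang and Extra grammars without any extra configuration.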

@heatherleaf
Member Author

Regarding Haskell vs Shell: I would prefer non-dependence on Haskell, if it's not too complicated.

It would be nice if RGL authors do not have to install Haskell.


3 participants