Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Many templates have been floating around in the DMOJ community for validation and input handling in checkers. This commit aims to consolidate them. It has two main goals:

- Correct. Duh.
- Simple. Other templates that circulate, including the ones I have published, are too complex. People naively try and write their own. I am sick and tired of reading over incorrect validators.

These templates forgo some principles of good design (such as object-oriented programming) in favour of pure simplicity. They should be simple enough that they are understandable by the broader community, and are not a black box. Hopefully this also dissuades re-writing.
Showing 65 changed files with 662 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
name: build | ||
on: [push, pull_request] | ||
jobs: | ||
lint: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- name: Install clang-format 12 | ||
run: | | ||
wget -O clang-format https://github.com/DMOJ/clang-tools-static-binaries/releases/download/master-5ea3d18c/clang-format-12_linux-amd64 | ||
chmod a+x ./clang-format | ||
- name: Run clang-format | ||
run: find sample_files/problem_setting \( -name '*.h' -or -name '*.cpp' -or -name '*.c' \) -print0 | xargs -0 ./clang-format --dry-run -Werror --color | ||
cpp_template_tests: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- name: Run C++ template tests | ||
run: | | ||
cd sample_files/problem_setting/test | ||
./run_test.sh |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# C++ Problem Setting Templates - `cpp_psetting_templates` | ||
|
||
There are three C++ input-handling templates provided for aiding problem setters. They are as follows: | ||
|
||
- [Validator Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/validator.cpp) | ||
- [Identical Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/identical_checker_interactor.cpp) | ||
- [Standard Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/standard_checker_interactor.cpp) | ||
|
||
## Validator | ||
|
||
This is a template for validating the input data of problems. It aims to be simple and, of course, correct. It contains eight functions. The first three are whitespace functions: | ||
|
||
- `void readSpace()` expects a space at the current position in the input, and aborts the program if there is not a space. | ||
- `void readNewLine()` expects a newline at the current position in the input. | ||
- `void readEOF()` expects the input file to end immediately at the current position. | ||
|
||
The remaining five are for actual content: | ||
|
||
- `std::string readToken(char min_char = 0, char max_char = 127)` returns the next token in the input stream. A token is defined as a whitespace-separated string. If the next character in the input is a whitespace character, this method aborts the program. The optional arguments `min_char` and `max_char` can be used to enforce a range on the characters in the token. For instance, `readToken('a', 'z')` reads a string of lowercase English letters. | ||
- `std::string readLine(char min_char = 0, char max_char = 127)` returns the next line in the input stream. Specifically, it reads until it encounters a `\n`, and discards it (the newline is not part of the returned string). `min_char` and `max_char` are the same as for `readToken`. If `readLine` encounters an EOF, it fails. | ||
- `long long readInt(long long lo, long long hi)` parses the next token as an integer. It aborts on overflow, malformed integers, and if the resultant integer is not in the range [lo, hi], inclusive. Leading zeroes and `-0` are not accepted. | ||
- `long double readFloat(long double lo, long double hi, long double eps = 1e-9)` parses the next token as a float. It aborts on overflow, malformed floats, and if the resultant float is not in the range [lo, hi], inclusive, using the provided epsilon to perform the comparison. Scientific notation and NaNs are not accepted, nor are leading zeroes. `-0` is allowed. Trailing zeroes are also permitted. | ||
- `std::vector<T> readIntArray(size_t N, long long lo, long long hi)` parses the next space-separated N integers into an array, and then reads a final newline. It must be given a template argument, which is the type of the array elements. For example, `readIntArray<int>(5, 1, 10)` reads five space-separated integers into a `std::vector<int>`, where each integer is in the range [1, 10], inclusive. | ||
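
The token formats described for `readInt` and `readFloat` can be illustrated with a small sketch. This is not the template code: the function names `isCanonicalInt` and `isPlainFloat` are hypothetical, and `std::regex` is used here purely for brevity, whereas the checker templates in this commit use POSIX `regex.h`. Range checks and overflow handling are left out.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: accepts "0" or an optionally negative integer with no
// leading zeroes. "-0" and "+3" are rejected.
bool isCanonicalInt(const std::string &token) {
  static const std::regex re("(0|-?[1-9][0-9]*)");
  return std::regex_match(token, re); // regex_match requires a full match
}

// Hypothetical helper: optional sign, integer part without leading zeroes,
// optional fractional part. No scientific notation and no NaN; "-0" and
// trailing zeroes are accepted.
bool isPlainFloat(const std::string &token) {
  static const std::regex re("-?(0|[1-9][0-9]*)(\\.[0-9]+)?");
  return std::regex_match(token, re);
}
```

Under these assumptions, `isCanonicalInt("007")` and `isPlainFloat("1e5")` are both rejected, while `isPlainFloat("1.50")` is accepted.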
|
||
A small caveat: `readToken` and `readLine` will throw if the string exceeds 10 million characters. | ||
|
||
`readFloat()` will likely be of no use for many validators, and can be safely deleted. Similarly, `readIntArray` can be deleted if unneeded. | ||
|
||
## Checkers/Interactors | ||
|
||
The next pair of templates are for checkers/interactors. The difference is the type of whitespace handling: the identical checker/interactor expects whitespace to match exactly. The standard checker/interactor handles whitespace like the `standard` checker. | ||
|
||
The checkers and interactors are designed for the `coci` bridged checker/interactor type. However, updating the codes used and the order of command line parameters to work with other types should not be challenging. | ||
|
||
Both files can be used for either checkers/interactors, with the following caveat: interactors MUST close `stdout` BEFORE calling `readEOF()`, so that the user process can terminate in case it _also_ expects an EOF. Checker stdout is used for feedback displayed to the user, and as such `stdout` should not be closed in this case. Validators also do not need to worry about this - only interactors do, and they should only call `readEOF()` once they have finished communicating with the user, to clean up and assert that the user didn't send any trailing data. | ||
|
||
The general format of the checkers/interactors is the same as the validator, with a few changes: | ||
|
||
- `readSpace(), readNewLine(), readEOF()`: Under the identical checker, these return Presentation Error if the check fails. Under the standard checker, these return WA. | ||
- `readToken()`: Under the identical checker, this returns Presentation Error if the token is empty, and WA if any character is not in range. | ||
- `readLine()`: Under the identical checker, this returns Presentation Error if an EOF is encountered, and WA if any character is not in range. This function cannot be used correctly under the standard checker, and so is not provided in that template. | ||
- `readInt(), readIntArray(), readFloat()`: Returns WA if the token is malformed or out of range. | ||
|
||
Additionally, two new functions are provided. | ||
|
||
- `exitWA()` unconditionally exits with a WA verdict. | ||
- `assertWA(bool)` takes a condition and exits with WA if the condition is false. | ||
|
||
Under the identical checker, corresponding functions `exitPE()` and `assertPE(bool)` are provided. Standard checkers should not use the Presentation Error code, as the builtin `standard` checker does not use this code. | ||
|
||
Finally, there is an empty function `errorHook()`. This function is called whenever one of the provided functions would exit with an error. It should be used to do custom handling, such as providing partial points for outputting part of an answer, or outputting `-1` in interactors to signal errors to the user submission. | ||
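
As a rough sketch of the interactor case, the hook below sends `-1` to the user submission before the interactor exits. The helper `errorHookTo` is a hypothetical name introduced only so the behaviour can be exercised on an arbitrary stream. Whether `-1` is the right sentinel depends on the problem statement, so treat it as an assumption.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>

// Hypothetical helper (not part of the templates): writes the error sentinel
// and flushes, so the submission sees it before the interactor exits.
void errorHookTo(std::ostream &to_user) {
  to_user << "-1\n";
  to_user.flush();
}

// Possible replacement for the template's empty errorHook() in an interactor,
// where stdout is connected to the user submission.
void errorHook() { errorHookTo(std::cout); }
```

A checker would normally leave the hook empty or print feedback instead, since checker `stdout` is shown to the user rather than fed to a running submission.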
|
||
## Standard Checker/Interactor Design | ||
|
||
This section is purely for those interested in the design and inner workings of the standard checker/interactor routines. | ||
|
||
The general overview is that `readSpace()` should read non-line whitespace characters, `readNewLine` should read whitespace and expect a line whitespace character, and `readEOF` should read all whitespace and check for EOF. Additionally, any leading whitespace in the input should be trimmed. | ||
|
||
There are two major challenges with making a standard checker/interactor design ergonomic: | ||
- Under interactors, it is not acceptable to consume all whitespace in the `readNewLine` method, as the user submission will likely output a single line and then wait for the interactor to send another query. If the interactor naively tried to consume all whitespace, it would block, and the user submission would TLE. | ||
- After reading the end of the input, it's most ergonomic to have the checker read a newline, and then call `readEOF()`, as this is the canonical input format. However, the standard checker allows users to forgo the last newline, and if the `readNewLine()` method expected a newline, we would erroneously return WA. | ||
|
||
To solve both of these problems, we employ a lazy whitespace checking scheme. `readSpace()` and `readNewLine()` simply set a flag for `readToken()`. `readToken()` then consumes the whitespace and validates it, before reading the token. Additionally, `readEOF()`, if called after `readNewLine()`, ignores the flag and consumes all whitespace, and then checks for EOF. |
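
The lazy scheme can be sketched over an in-memory buffer. This is an illustration under stated assumptions, not the template itself: `LazyReader` and its members are hypothetical names, failures are reported by returning `false` where the template would exit with WA, and the check against calling two whitespace methods in a row is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

class LazyReader {
public:
  explicit LazyReader(std::string data) : data_(std::move(data)) {}

  // Whitespace calls only record what is expected; nothing is consumed yet.
  void readSpace() { pending_ = Pending::Space; }
  void readNewLine() { pending_ = Pending::NewLine; }

  // Consumes the pending whitespace, validates it, then reads one token.
  bool readToken(std::string &token) {
    if (pending_ == Pending::None) {
      throw std::logic_error("readToken called twice in a row");
    }
    bool saw_any = false, saw_line = false;
    skipWhitespace(saw_any, saw_line);
    bool ok = true;
    if (pending_ == Pending::Space) ok = saw_any && !saw_line;
    if (pending_ == Pending::NewLine) ok = saw_line;
    pending_ = Pending::None;
    token.clear();
    while (pos_ < data_.size() && !isSpace(data_[pos_])) {
      token.push_back(data_[pos_++]);
    }
    return ok && !token.empty();
  }

  // Ignores any pending flag: skips all whitespace, then checks for the end.
  bool readEOF() {
    bool saw_any = false, saw_line = false;
    skipWhitespace(saw_any, saw_line);
    pending_ = Pending::None;
    return pos_ == data_.size();
  }

private:
  enum class Pending { None, Space, NewLine, All };

  static bool isSpace(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  }
  void skipWhitespace(bool &saw_any, bool &saw_line) {
    while (pos_ < data_.size() && isSpace(data_[pos_])) {
      saw_any = true;
      if (data_[pos_] == '\n' || data_[pos_] == '\r') saw_line = true;
      ++pos_;
    }
  }

  std::string data_;
  std::size_t pos_ = 0;
  Pending pending_ = Pending::All; // leading whitespace is trimmed
};
```

With the buffer `"1 2\n3"`, calling `readNewLine()` after the last token and then `readEOF()` succeeds even though the final newline is missing, because the newline expectation is never checked eagerly.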
sample_files/problem_setting/identical_checker_interactor.cpp (129 changes: 129 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
#include <cctype> | ||
#include <cstdio> | ||
#include <cstdlib> | ||
#include <fstream> | ||
#include <regex.h> | ||
#include <stdexcept> | ||
#include <string> | ||
#include <vector> | ||
|
||
namespace regex_helpers { | ||
regex_t compile(const char *pattern) { | ||
regex_t re; | ||
if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { | ||
throw std::runtime_error("Pattern failed to compile."); | ||
} | ||
return re; | ||
} | ||
bool match(regex_t re, const std::string &text) { | ||
return regexec(&re, text.c_str(), 0, NULL, 0) == 0; | ||
} | ||
} // namespace regex_helpers | ||
|
||
void errorHook(); | ||
void exitWA() { | ||
errorHook(); | ||
std::exit(1); | ||
} | ||
void exitPE() { | ||
errorHook(); | ||
std::exit(2); | ||
} | ||
void assertWA(bool condition) { | ||
if (!condition) { | ||
exitWA(); | ||
} | ||
} | ||
void assertPE(bool condition) { | ||
if (!condition) { | ||
exitPE(); | ||
} | ||
} | ||
void readSpace() { assertPE(getchar() == ' '); } | ||
void readNewLine() { assertPE(getchar() == '\n'); } | ||
void readEOF() { assertPE(getchar() == EOF); } | ||
std::string readToken(char min_char = 0, char max_char = 127) { | ||
static constexpr size_t MAX_TOKEN_SIZE = 1e7; | ||
std::string token; | ||
int c = getchar(); | ||
assertPE(!isspace(c)); | ||
while (!isspace(c) && c != EOF) { | ||
assertWA(token.size() < MAX_TOKEN_SIZE); | ||
assertWA(min_char <= c && c <= max_char); | ||
token.push_back(char(c)); | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
return token; | ||
} | ||
std::string readLine(char min_char = 0, char max_char = 127) { | ||
static constexpr size_t MAX_LINE_SIZE = 1e7; | ||
std::string line; | ||
int c = getchar(); | ||
while (c != '\n') { | ||
assertPE(c != EOF); | ||
assertWA(line.size() < MAX_LINE_SIZE); | ||
assertWA(min_char <= c && c <= max_char); | ||
line.push_back(char(c)); | ||
c = getchar(); | ||
} | ||
return line; | ||
} | ||
long long readInt(long long lo, long long hi) { | ||
static regex_t re = regex_helpers::compile("^(0|-?[1-9][0-9]*)$"); | ||
std::string token = readToken(); | ||
assertWA(regex_helpers::match(re, token)); | ||
|
||
long long parsedInt; | ||
try { | ||
parsedInt = stoll(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(lo <= parsedInt && parsedInt <= hi); | ||
return parsedInt; | ||
} | ||
long double readFloat(long double min, long double max, | ||
long double eps = 1e-9) { | ||
static regex_t re = regex_helpers::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); | ||
std::string token = readToken(); | ||
assertWA(regex_helpers::match(re, token)); | ||
long double parsedDouble; | ||
try { | ||
parsedDouble = stold(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); | ||
return parsedDouble; | ||
} | ||
template <typename T> | ||
std::vector<T> readIntArray(size_t N, long long lo, long long hi) { | ||
std::vector<T> arr; | ||
arr.reserve(N); | ||
for (size_t i = 0; i < N; i++) { | ||
arr.push_back(readInt(lo, hi)); | ||
if (i != N - 1) { | ||
readSpace(); | ||
} | ||
} | ||
readNewLine(); | ||
return arr; | ||
} | ||
void errorHook() {} | ||
|
||
// If this is a checker: | ||
// int main(int argc, char **argv) { | ||
// std::ifstream judge_input(argv[1]); | ||
// freopen(argv[2], "r", stdin); | ||
// std::ifstream judge_answer(argv[3]); | ||
// } | ||
|
||
// If this is an interactor: | ||
// int main(int argc, char **argv) { | ||
// std::ifstream judge_input(argv[1]); | ||
// std::ifstream judge_answer(argv[2]); | ||
// } |
sample_files/problem_setting/standard_checker_interactor.cpp (171 changes: 171 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
#include <cctype> | ||
#include <cstdio> | ||
#include <cstdlib> | ||
#include <fstream> | ||
#include <regex.h> | ||
#include <stdexcept> | ||
#include <string> | ||
#include <vector> | ||
|
||
void assertWA(bool); | ||
|
||
// Implementation of the tricky whitespace logic for standard checkers. | ||
namespace standard_whitespace_detail { | ||
enum WhitespaceFlag { NONE = 0, SPACE = 1, NEWLINE = 2, ALL = 3 }; | ||
WhitespaceFlag current_flag = ALL; // At checker start, consume all whitespace. | ||
|
||
void pokeFlag(WhitespaceFlag flag) { | ||
if (current_flag != NONE && (current_flag != NEWLINE || flag != ALL)) { | ||
throw std::runtime_error("Never call two whitespace methods in a row, " | ||
"except for readNewLine() followed by readEOF()."); | ||
} | ||
current_flag = flag; | ||
} | ||
|
||
enum ConsumeResult { | ||
NO_WHITESPACE, | ||
NO_LINES, | ||
LINES, | ||
}; | ||
ConsumeResult consumeWhitespace() { | ||
int c = getchar(); | ||
ConsumeResult result = NO_WHITESPACE; | ||
while (isspace(c) && c != EOF) { | ||
if (result == NO_WHITESPACE) { | ||
result = NO_LINES; | ||
} | ||
if (c == '\r' || c == '\n') { | ||
result = LINES; | ||
} | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
current_flag = NONE; | ||
return result; | ||
} | ||
|
||
void preReadToken() { | ||
switch (current_flag) { | ||
case NONE: | ||
throw std::runtime_error( | ||
"Must not call readInt (or readToken, or readFloat) twice in a row!"); | ||
case SPACE: | ||
assertWA(consumeWhitespace() == NO_LINES); | ||
break; | ||
case NEWLINE: | ||
assertWA(consumeWhitespace() == LINES); | ||
break; | ||
case ALL: | ||
consumeWhitespace(); | ||
break; | ||
} | ||
} | ||
} // namespace standard_whitespace_detail | ||
|
||
namespace regex_helpers { | ||
regex_t compile(const char *pattern) { | ||
regex_t re; | ||
if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { | ||
throw std::runtime_error("Pattern failed to compile."); | ||
} | ||
return re; | ||
} | ||
bool match(regex_t re, const std::string &text) { | ||
return regexec(&re, text.c_str(), 0, NULL, 0) == 0; | ||
} | ||
} // namespace regex_helpers | ||
|
||
void errorHook(); | ||
void exitWA() { | ||
errorHook(); | ||
std::exit(1); | ||
} | ||
void assertWA(bool condition) { | ||
if (!condition) { | ||
exitWA(); | ||
} | ||
} | ||
void readSpace() { | ||
standard_whitespace_detail::pokeFlag(standard_whitespace_detail::SPACE); | ||
} | ||
void readNewLine() { | ||
standard_whitespace_detail::pokeFlag(standard_whitespace_detail::NEWLINE); | ||
} | ||
void readEOF() { | ||
standard_whitespace_detail::pokeFlag(standard_whitespace_detail::ALL); | ||
standard_whitespace_detail::consumeWhitespace(); | ||
assertWA(getchar() == EOF); | ||
} | ||
std::string readToken(char min_char = 0, char max_char = 127) { | ||
standard_whitespace_detail::preReadToken(); | ||
static constexpr size_t MAX_TOKEN_SIZE = 1e7; | ||
std::string token; | ||
int c = getchar(); | ||
assertWA(!isspace(c)); | ||
while (!isspace(c) && c != EOF) { | ||
assertWA(token.size() < MAX_TOKEN_SIZE); | ||
assertWA(min_char <= c && c <= max_char); | ||
token.push_back(char(c)); | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
return token; | ||
} | ||
long long readInt(long long lo, long long hi) { | ||
static regex_t re = regex_helpers::compile("^(0|-?[1-9][0-9]*)$"); | ||
std::string token = readToken(); | ||
assertWA(regex_helpers::match(re, token)); | ||
|
||
long long parsedInt; | ||
try { | ||
parsedInt = stoll(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(lo <= parsedInt && parsedInt <= hi); | ||
return parsedInt; | ||
} | ||
long double readFloat(long double min, long double max, | ||
long double eps = 1e-9) { | ||
static regex_t re = regex_helpers::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); | ||
std::string token = readToken(); | ||
assertWA(regex_helpers::match(re, token)); | ||
long double parsedDouble; | ||
try { | ||
parsedDouble = stold(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); | ||
return parsedDouble; | ||
} | ||
template <typename T> | ||
std::vector<T> readIntArray(size_t N, long long lo, long long hi) { | ||
std::vector<T> arr; | ||
arr.reserve(N); | ||
for (size_t i = 0; i < N; i++) { | ||
arr.push_back(readInt(lo, hi)); | ||
if (i != N - 1) { | ||
readSpace(); | ||
} | ||
} | ||
readNewLine(); | ||
return arr; | ||
} | ||
void errorHook() {} | ||
|
||
// If this is a checker: | ||
// int main(int argc, char **argv) { | ||
// std::ifstream judge_input(argv[1]); | ||
// freopen(argv[2], "r", stdin); | ||
// std::ifstream judge_answer(argv[3]); | ||
// } | ||
|
||
// If this is an interactor: | ||
// int main(int argc, char **argv) { | ||
// std::ifstream judge_input(argv[1]); | ||
// std::ifstream judge_answer(argv[2]); | ||
// } |