Problem Package Format
This is the 2023-07-draft version of the Kattis problem package format.
Overview
This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.
General Requirements
- The package must consist of a single directory containing files as described below. The directory name must consist solely of lower case letters a–z and digits 0–9. Alternatively, the package can be a ZIP-compressed archive of such a directory with identical base name and extension `.kpp` or `.zip`.
- All file names for files included in the package must match the regexp `[a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]`, i.e., they must be of length at least 2, consist solely of lower or upper case letters a–z, A–Z, digits 0–9, period, dash, or underscore, but must not begin or end with period, dash, or underscore.
- All text files for a problem must be UTF-8 encoded and not have a byte-order mark (BOM).
- All text files must have Unix-style line endings (newline/LF byte only). Note that LF is line-ending and not line-separating in POSIX, which means that all non-empty text files must end with a newline.
- All floating-point numbers must be given as the external character sequences defined by IEEE 754-2008 and may use up to double precision.
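As a non-normative sketch, the file-name rule above can be expressed as a Python check (the helper name is an assumption, not part of the format):

```python
import re

# At least 2 characters from [a-zA-Z0-9_.-], and the name must not
# begin or end with period, dash, or underscore.
FILENAME_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]")

def is_valid_package_filename(name: str) -> bool:
    return FILENAME_RE.fullmatch(name) is not None
```

For example, `is_valid_package_filename("problem.yaml")` holds, while `.hidden` and single-character names are rejected.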
Programs
There are a number of different kinds of programs that may be provided in the problem package: submissions, input validators, and output validators. All programs are always represented by a single file or directory. In other words, if a program consists of several files, these must be provided in a single directory. In the case that a program is a single file, it is treated as if a directory with the same name takes its place, which contains only that file. The name of the program, for the purpose of referring to it within the package, is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.
Validators (but not submissions) in the form of a directory may include two POSIX-compliant scripts, `build` and `run`. If at least one of these two files is included:

- First, if the `build` script is present, it will be run. The working directory will be (a copy of) the program directory. The `run` file must exist after `build` is done.
- Then, the `run` file (which now exists) must be executable, and will be invoked in the same way as a single file program.
Programs without `build` and `run` scripts are built and run according to what language is used. Language is determined by the `language` key in `submissions.yaml` if present (for submissions only); otherwise, by looking at the file endings as specified in the languages table. For validators, this language must be C++ or Python 3. If a single language can't be determined, building fails.
For languages where there could be several entry points, the entry point specified by the `entrypoint` key in `submissions.yaml` if present (for submissions only) is used; otherwise, the default entry point in the languages table will be used.
The binary (and other artifacts that result from compiling the program) must be placed in the program's directory (or a copy of it). Each submission must be run with a working directory that contains (a copy of) the submitted files and any compiled binaries, but it must not contain any of the files described in the "Test data" section. Each input validator must be run with a working directory that contains the files in the program directory of the input validator in question. Each output validator must be run with a working directory that contains the submitted files and any compiled binaries of the submission being validated.
Problem Metadata
Metadata about the problem (e.g., source, license, limits) are provided in a YAML file named `problem.yaml` placed in the root directory of the package.
The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.
Key | Type | Default | Comments |
---|---|---|---|
problem_format_version | String | Required. Version of the Problem Package Format used for this package. If using this version of the Format, it must be the string 2023-07-draft . The string will be in the form <yyyy>-<mm> for a stable version, <yyyy>-<mm>-draft or draft for a draft version, or legacy or legacy-icpc for the version before the addition of problem_format_version. Documentation for version <version> is available at https://www.kattis.com/problem-package-format/spec/problem_package_format/<version> . | |
type | String or sequence of strings | pass-fail | Type of problem. Values must be from the ones defined below. |
name | String or map of strings | Required. The name of the problem in each of the languages for which a problem statement exists. See below for details. | |
uuid | String | Required. UUID identifying the problem. | |
credits | String or map with keys as defined below. | Who should get credits. See below for details. | |
source | String, a sequence, or a map as defined below. | The problem set from which the problem originates. See below for details. | |
license | String | unknown | License under which the problem may be used. Value has to be one of the ones defined below. |
rights_owner | String | Value of authors, if present, otherwise value of source. | Owner of the copyright of the problem. If not present, authors are the owners. If author is not present either, source is owner. Required if license is something other than unknown or public domain . Forbidden if license is public domain . |
limits | Map with keys as defined below | see definition below | |
keywords | Sequence of strings | List of keywords. | |
languages | String or sequence of strings | all | List of programming languages or the string all . |
constants | map of strings to int, float, or string | Global constant values used by the problem. See definition below. |
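As an illustrative, non-normative example, a minimal `problem.yaml` using these keys might look like this (all values are hypothetical):

```yaml
problem_format_version: 2023-07-draft
type: pass-fail
name: Adding Apples
uuid: 550e8400-e29b-41d4-a716-446655440000
credits: Authy McAuth <[email protected]>
source: Example Contest 2023
license: cc by-sa
keywords: [math, implementation]
limits:
  memory: 2048
```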
Type
Allowed values for type.
Value | Incompatible with | Comments |
---|---|---|
pass-fail | scoring | Default. Submissions are judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc). |
scoring | pass-fail | A submission that is accepted is additionally given a score, which is a non-negative numeric value (and the goal is to maximize this value). |
multi-pass | submit-answer | A submission should run multiple times with inputs for the next pass generated by the output validator of the current pass. |
interactive | submit-answer | The output validator is run interactively with the submission. |
submit-answer | multi-pass, interactive | A submission consists of the answers to the test cases, instead of source code for a program that produces the answers. |
Name
If there are statements in more than one language, the name
field must be a map with the language codes as keys and the problem names as values. The set of languages for which name
is given must exactly match the set of languages for which a problem statement exists.
If only a single problem statement exists, this may be a string with the name of the problem in that language (but a map with a single key is allowed).
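For example, a problem with statements in English and Danish (names here are hypothetical) would use a map:

```yaml
name:
  en: Adding Apples
  da: Æblesammenlægning
```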
Credits
Allowed keys for credits.
A person is specified as either a string with their full name (optionally with an email formatted as `Full Name <[email protected]>`), or a map with keys `name` and `email`, which both have values of type string.
Key | Type | Default | Comments |
---|---|---|---|
authors | Person or sequence of persons | The people who conceptualized the problem. | |
contributors | Person or sequence of persons | The people who developed the problem package, such as statement, validators, and test data. | |
testers | Person or sequence of persons | The people who tested the problem package, for example by providing a solution and reviewing the statement. | |
translators | Map of strings to sequences of persons | The people who translated the statement to other languages. Each key must be an ISO 639 language code. | |
acknowledgements | Person or sequence of persons | Extra acknowledgements or special thanks that do not fit the other categories. |
The examples

```yaml
credits:
  authors: [Author One, Author Two <[email protected]>]
```

and

```yaml
credits:
  authors:
    - name: Author One
    - name: Author Two
      email: [email protected]
```

are both valid ways to describe the same two authors.
A full example would be

```yaml
credits:
  authors: Authy McAuth <[email protected]>
  contributors:
    - Authy McAuth <[email protected]>
    - name: Additional Contributor
      email: [email protected]
  testers:
    - Tester One
    - Tester Two
    - Tester Three
  translators:
    da:
      - name: Mads Jensen
        email: [email protected]
    eo:
      - Ludoviko Lazaro Zamenhofo
  acknowledgements:
    - Inspirational Speaker 1
    - Inspirational Speaker 2
```

which demonstrates all the available credit types.
Credits are sometimes omitted when authors instead choose to only give source credit, but both may be specified. If a string is provided instead of a map for credits, such as

```yaml
credits: Authy McAuth <[email protected]>
```

it is treated as if only a single author is being specified, so it is equivalent to

```yaml
credits:
  authors: Authy McAuth <[email protected]>
```

to support a less verbose credits section.
Source
The source is defined by a sequence of maps, each containing two keys, `name` and `url`. The key `name` is required, but the key `url` is optional. If `url` is omitted, a string may be used instead of a map to specify the name. If the sequence contains only one item, whether it be a string or a map, then it can be specified as that single string or map.
If specified, each key `name` should typically map to the name (and year) of the problem set (such as a contest or a course) where the problem was first used or for which it was created, and the key `url` should map to a link to the event's page.
A sequence is used in the special case that the first use was actually from multiple (typically simultaneous) sources.
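The forms described above can be illustrated as follows (names and URLs are hypothetical):

```yaml
# Single source, name only:
source: Example Contest 2023

# Single source as a map with an optional url:
source:
  name: Example Contest 2023
  url: https://contest.example.org/2023

# Multiple simultaneous sources:
source:
  - name: Example Contest 2023
    url: https://contest.example.org/2023
  - Example Mirror Contest 2023
```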
License
Allowed values for license.
Values other than `unknown` or `public domain` require `rights_owner` to have a value.
Value | Comments | Link |
---|---|---|
unknown | The default value. In practice means that the problem can not be used. | |
public domain | There are no known copyrights on the problem, anywhere in the world. | http://creativecommons.org/about/pdm |
cc0 | CC0, "no rights reserved". | http://creativecommons.org/about/cc0 |
cc by | CC attribution. | http://creativecommons.org/licenses/by/4.0/ |
cc by-sa | CC attribution, share alike. | http://creativecommons.org/licenses/by-sa/4.0/ |
educational | May be freely used for educational purposes. | |
permission | Used with permission. The rights owner must be contacted for every additional use. |
Limits
A map with the following keys:
Key | Comments | Default | Typical system default |
---|---|---|---|
time_multipliers | optional | see below | |
time_limit | optional float, in seconds | see below | |
time_resolution | optional float, in seconds | 1.0 | |
memory | optional, in MiB | system default | 2048 |
output | optional, in MiB | system default | 8 |
code | optional, in KiB | system default | 128 |
compilation_time | optional, in seconds | system default | 60 |
compilation_memory | optional, in MiB | system default | 2048 |
validation_time | optional, in seconds | system default | 60 |
validation_memory | optional, in MiB | system default | 2048 |
validation_output | optional, in MiB | system default | 8 |
For most keys, the system default will be used if nothing is specified. This can vary, but you SHOULD assume that it's reasonable. Only specify limits when the problem needs a specific limit, but do specify limits even if the "typical system default" is what is needed.
Problem Timing
`time_multipliers` is a map with the following keys:
Key | Comments | Default |
---|---|---|
ac_to_time_limit | float | 2.0 |
time_limit_to_tle | float | 1.5 |
The value of `time_limit` is an integer or floating-point problem time limit in seconds. The time multipliers specify safety margins relative to the slowest accepted submission, `T_ac`, and the fastest time_limit_exceeded submission, `T_tle`. The `time_limit` must satisfy `T_ac * ac_to_time_limit <= time_limit` and `time_limit * time_limit_to_tle <= T_tle`. In these calculations, `T_tle` is treated as infinity if the problem does not provide at least one time_limit_exceeded submission.
If no `time_limit` is provided, the default value is the smallest integer multiple of `time_resolution` that satisfies the above inequalities. It is an error if no such multiple exists. The `time_resolution` key is ignored if the problem provides an explicit time limit (and in particular, the time limit is not required to be a multiple of the resolution). Since time multipliers are more future-proof than absolute time limits, avoid specifying `time_limit` whenever practical.
Judge systems should make a best effort to respect the problem time limit, and should warn when importing a problem whose time limit is specified with precision greater than can be resolved by system timers.
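The default time-limit computation described above can be sketched as follows (a non-normative illustration; the function name is an assumption):

```python
import math
from typing import Optional

def default_time_limit(t_ac: float, t_tle: Optional[float],
                       ac_to_time_limit: float = 2.0,
                       time_limit_to_tle: float = 1.5,
                       time_resolution: float = 1.0) -> float:
    """Smallest integer multiple of time_resolution satisfying both
    t_ac * ac_to_time_limit <= limit and limit * time_limit_to_tle <= t_tle."""
    lower = t_ac * ac_to_time_limit
    # T_tle is treated as infinity if no time_limit_exceeded submission exists.
    upper = math.inf if t_tle is None else t_tle / time_limit_to_tle
    # Round the lower bound up to the next multiple of the resolution.
    limit = math.ceil(lower / time_resolution) * time_resolution
    if limit > upper:
        raise ValueError("no valid time limit exists for these margins")
    return limit
```

For example, with a slowest accepted time of 0.8 s and the default multipliers and resolution, the default time limit would be 2.0 s.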
Languages
A list of programming language codes from the table below, or `all`.
If a list is given, the problem may only be solved using those programming languages.
File endings in parentheses are not used for determining language.
Code | Language | Default entry point | File endings |
---|---|---|---|
algol68 | Algol 68 | | .a68 |
apl | APL | | .apl |
bash | Bash | | .sh |
ada | Ada | | .adb, .ads |
c | C | | .c |
cgmp | C with GMP | | (.c) |
cobol | COBOL | | .cob |
cpp | C++ | | .cc, .cpp, .cxx, .c++, .C |
cppgmp | C++ with GMP | | (.cc, .cpp, .cxx, .c++, .C) |
crystal | Crystal | | .cr |
csharp | C# | | .cs |
dart | Dart | | .dart |
fsharp | F# | | .fs |
fortran | Fortran | | .f90 |
gerbil | Gerbil | | .ss |
go | Go | | .go |
haskell | Haskell | | .hs |
java | Java | Main | .java |
javascript | JavaScript | main.js | .js |
julia | Julia | | .jl |
kotlin | Kotlin | MainKt | .kt |
lisp | Common Lisp | main.{lisp,cl} | .lisp, .cl |
lua | Lua | | .lua |
nim | Nim | | .nim |
objectivec | Objective-C | | .m |
ocaml | OCaml | | .ml |
octave | Octave | | (.m) |
odin | Odin | | .odin |
pascal | Pascal | | .pas |
perl | Perl | | .pm, (.pl) |
php | PHP | main.php | .php |
prolog | Prolog | | .pl |
python2 | Python 2 | main.py2 | (.py), .py2 |
python3 | Python 3 | main.py | .py, .py3 |
python3numpy | Python 3 with NumPy | main.py | (.py, .py3) |
ruby | Ruby | | .rb |
rust | Rust | | .rs |
scala | Scala | | .scala |
simula | Simula | | .sim |
snobol | Snobol | | .sno |
swift | Swift | | .swift |
typescript | TypeScript | | .ts |
visualbasic | Visual Basic | | .vb |
zig | Zig | | .zig |
Constants
A map of names to values. Names must match the following regex: `[a-zA-Z_][a-zA-Z0-9_]*`. Constant sequences are tokens of the form `{{name}}`, where `name` is one of the names defined in `constants`. Tags `{{xyz}}` containing a name that is not defined are not modified, but may produce a warning.
All constant sequences in the following files will be replaced by the value of the corresponding constant:
- problem statements
- input and output validators
- included code
- example submissions
- testdata.yaml
Constant sequences are not replaced in test data files or in `problem.yaml` itself.
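This substitution rule can be sketched as follows (a non-normative illustration; the helper name is hypothetical):

```python
import re

# Tokens of the form {{name}} where name matches [a-zA-Z_][a-zA-Z0-9_]*.
TOKEN_RE = re.compile(r"\{\{([a-zA-Z_][a-zA-Z0-9_]*)\}\}")

def substitute_constants(text: str, constants: dict) -> str:
    def repl(match):
        name = match.group(1)
        # Tags whose name is not defined are left unmodified.
        return str(constants[name]) if name in constants else match.group(0)
    return TOKEN_RE.sub(repl, text)
```

For example, `substitute_constants("n <= {{max_n}}, {{unknown}}", {"max_n": 100})` yields `"n <= 100, {{unknown}}"`.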
Problem Statements
The problem statement of the problem is provided in the directory `problem_statement/`.
This directory must contain one file per language, for at least one language, named `problem.<language>.<filetype>`, that contains the problem text itself, including input and output specifications. Language must be given as the shortest ISO 639 code. If needed, a hyphen and an ISO 3166-1 alpha-2 code may be appended to the ISO 639 code. Optionally, the language code can be left out; the default is then English (`en`). Filetype can be either `tex` for LaTeX files, `md` for Markdown, or `pdf` for PDF.
Please note that many kinds of transformations on the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems, will not be possible for PDF problem statements, so this format should be avoided if at all possible.
Auxiliary files needed by the problem statement files must all be in `problem_statement/`. `problem.<language>.<filetype>` should reference auxiliary files as if the working directory is `problem_statement/`. Supported image file formats are `.png`, `.jpg`, `.jpeg`, and `.pdf`.
Sample Data
- For problem statements provided in LaTeX or Markdown: the statement file must contain only the problem description and input/output specifications and no sample data. It is the judge system's responsibility to append the sample data.
- For problem statements provided as PDFs: the judge system will display the PDF verbatim, and therefore any sample data must be included in the PDF. The judge system is not required to reconcile sample data embedded in PDFs with the `sample` test data group nor to validate it in any other way.
LaTeX Environment and Supported Subset
Problem statements provided in LaTeX must consist only of the problem statement body (i.e., the content that would be placed within a `document` environment). It is the judging system's responsibility to wrap this text in an appropriate LaTeX class.
The LaTeX class shall provide the convenience environments `Input`, `Output`, and `Interaction` for delineating sections of the problem statement. It shall also provide the following commands:
- `\problemname{name}`, which must be the first line of the problem statement. `name` gives a LaTeX-formatted problem name to be used when rendering the problem statement header. This argument can be empty, in which case the `name` value matching the problem statement's language from `problem.yaml` is used instead.
- `\illustration{width}{filename}{caption}`, a convenience command for adding a figure to the problem statement. `width` is a floating-point argument specifying the width of the figure as a fraction of the total width of the problem statement; `filename` is the image to display; and `caption` is the text to include below the figure. The illustration should be flushed right with text flowing around it (as in a `wrapfigure`).
Arbitrary LaTeX is not guaranteed to render correctly by HTML-based judging systems. However, judging systems must make a best effort to correctly render at minimum the following LaTeX subset when displaying a LaTeX problem statement:
- All MathJax-supported TeX commands within inline (`$ $`) and display (`$$ $$`) math mode.
- The following text-mode environments: `itemize`, `enumerate`, `lstlisting`, `verbatim`, `quote`, `center`, `tabular`, `figure`, and `wrapfigure` (from the `wrapfig` package).
- `\item` within list environments, and `\hline`, `\cline`, `\multirow`, `\multicolumn` within tables.
- The following typesetting constructs: smart quotes (' ', << >>, `` ''), dashes (--, ---), non-breaking space (~), ellipses (`\ldots` and `\textellipsis`), and `\noindent`.
- The following font weight and size modifiers: `\bf`, `\textbf`, `\it`, `\textit`, `\t`, `\tt`, `\texttt`, `\emph`, `\underline`, `\sout`, `\textsc`, `\tiny`, `\scriptsize`, `\small`, `\normalsize`, `\large`, `\Large`, `\LARGE`, `\huge`, `\Huge`.
- `\includegraphics` from the package `graphicx`, including the Polygon-style workaround for scaling the image using `\def \htmlPixelsInCm`.
- The miscellaneous commands `\url`, `\href`, `\section`, `\subsection`, and `\epigraph`.
Attachments
Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory `attachments/`.
Solution description
A description of how the problem is intended to be solved is provided in the directory `solution/`.
This directory must contain one file per language, for at least one language, named `solution.<language>.<filetype>`. Language is given the same way as for problem statements. Optionally, the language code can be left out; the default is then English (`en`). The set of languages used can be different from what was used for the problem statement. Filetype can be either `tex` for LaTeX files, `md` for Markdown, or `pdf` for PDF.
Auxiliary files needed by the solution description files must all be in `solution/`. `solution.<language>.<filetype>` should reference auxiliary files as if the working directory is `solution/`.
Exactly how the solution description is used is up to the user or tooling.
Test data
The test data are provided in subdirectories of `data/`: the sample data in `data/sample/` and the secret data in `data/secret/`.
All files and directories associated with a single test case have the same base name with varying extensions. Here, the base name is defined to be the relative path from the `data` directory to the test case input file, without extensions. For example, the files `secret/test.in` and `secret/test.ans` are associated with the same test case, which has the base name `secret/test`. The existence of the `.in` file declares the existence of the test case. If the test case exists, then an associated `.ans` file must exist, while the others are optional. If the test case does not exist, then the other files must not exist. The table below summarizes the supported test data:
Extension | Described In | Summary |
---|---|---|
.in | Input | Input piped to standard input |
.ans | Expected (AC) answer | |
.hint | Annotations | Hint for solving the test case |
.desc | Annotations | Purpose of the test |
.png , .jpg , .jpeg , .svg | Annotations | Illustration of the test case |
.interaction | Interactive Problems | Interaction protocol sample |
.args | Input | Input passed as command-line arguments |
.files | Input | Input available via file I/O |
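The base-name convention above can be sketched as a test case discovery routine (a non-normative illustration; the function name is hypothetical):

```python
from pathlib import Path

def discover_test_cases(data_dir: str) -> dict:
    """Map each test case base name (relative path without extension)
    to its associated files, keyed by extension."""
    data = Path(data_dir)
    cases = {}
    for in_file in sorted(data.rglob("*.in")):
        # The existence of the .in file declares the test case.
        base = in_file.relative_to(data).with_suffix("")
        ans = in_file.with_suffix(".ans")
        # An associated .ans file must then also exist.
        if not ans.exists():
            raise FileNotFoundError(f"missing answer file for {base}")
        cases[base.as_posix()] = {".in": in_file, ".ans": ans}
    return cases
```

A package with `data/sample/hello.in` and `data/sample/hello.ans` would yield the single base name `sample/hello`.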
Judge systems may assume that the result of running a program on a test case is deterministic. For any two test cases, if the contents of their `.in` and `.args` files and `.files` directory are equivalent, then the input of the two test cases is equivalent. This means that for any two test cases, if their input, output validator flags (see test data groups below), and the contents of their `.ans` files are equivalent, then the test cases are equivalent. The assumption of determinism means that a judge system could choose to reuse the result of a previous run, or to re-run the equivalent test case.
Input
Each test case can supply input via standard input, command-line arguments, and/or the file system. These options are not exclusive. For a test case with base name `test`, the file `test.in` is piped to the submission as standard input. The submission will be run with the whitespace-separated tokens in `test.args` as command-line arguments, if that file exists. Note that usually the submission's entry point, whether it be a binary or an interpreted file, will be the very first command-line argument. However, there are languages such as Java where there is no initial command-line argument representing the entry point.
The directory `test.files`, if it exists, contains privileged data files available to the submission via file I/O. All files in this directory must be copied into the submission's working directory after compiling, but before executing, the submission, possibly overwriting the compiled submission file or included data in the case of name conflicts.
Annotations
One hint, description, and/or illustration file may be provided per test case. The files must share the base name of the associated test case. Description and illustration files are meant to be privileged information.
Category | File type | Filename extension | Remark |
---|---|---|---|
hint | text | .hint | |
description | text | .desc | privileged information |
illustration | image | .png , .jpg , .jpeg , or .svg | privileged information |
- A hint provides feedback for solving a test case to, e.g., somebody whose submission didn't pass.
- A description conveys the purpose of a test case. It is an explanation of what aspect or edge case of the solution the input file is meant to test.
- An illustration provides a visualization of the associated test case. Note that at most one image file may exist for each test case.
Interactive Problems
For interactive problems, any sample test cases must provide an interaction protocol as a text file with the extension `.interaction` for each sample, demonstrating the communication between the submission and the output validator, meant to be displayed in the problem statement. An interaction protocol consists of a series of lines starting with `>` and `<`. Lines starting with `>` signify output from the submission to the output validator, while lines starting with `<` signify input from the output validator to the submission.
A sample test case may have just an `.interaction` file without a corresponding `.in` and `.ans` file. However, if either an `.in` or an `.ans` file is present, the other one must also be present. Unlike `.in` and `.ans` files for non-interactive problems, interactive `.in` and `.ans` files must not be displayed to teams: not in the problem statement, nor as part of a sample input download. If you want to provide files related to interactive problems (such as testing tools or input files), you can use attachments.
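As an illustrative, non-normative example, an `.interaction` file for a hypothetical number-guessing problem might look like:

```
>500
<higher
>750
<correct
```

Here each `>` line is a guess written by the submission, and each `<` line is the response sent back by the output validator.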
Test Data Groups
The test data for the problem can be organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subgroups of test data (i.e., subdirectories).
At the top level, the test data is divided into exactly two groups: `sample` and `secret`. These two groups may be further split into subgroups as desired.
The result of a test data group is computed by applying a grader to all of the sub-results (test cases and subgroups) in the group. See Scoring for more details.
Test cases and groups will be used in lexicographical order on file base name. If a specific order is desired, a numbered prefix such as `00`, `01`, `02`, `03`, and so on, can be used.
In each test data group, a YAML file `testdata.yaml` may be placed to specify how the result of the test data group should be computed. Some of the keys and their associated values will be inherited from the `testdata.yaml` in the closest ancestor group (from the test case up to the root `data` directory) that has one. Others need to be explicitly defined in the group's `testdata.yaml` file; otherwise they are set to the default values. If there is no `testdata.yaml` file in the root `data` group, one is implicitly added with the default values.
The format of `testdata.yaml` is as follows:
Key | Type | Default | Inheritance | Comments |
---|---|---|---|---|
scoring | Map | See Scoring | Not inherited | Description of how the results of the group test cases and subgroups should be aggregated. |
input_validator_flags | String or map of strings to strings | empty string | Inherited | Arguments passed to each input validator for this test data group. If a string then those are the flags that will be passed to each input validator for this test data group. If a map then each key is the name of the input validator and the value is the flags to pass to that input validator for this test data group. Validators not present in the map are run without flags. |
output_validator_flags | String | empty string | Inherited | Arguments passed to the output validator for this test data group. |
run_samples | Boolean | true | Not applicable | Signifies whether samples should be run, see below for details. |
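As an illustrative, non-normative example, a `testdata.yaml` for a subgroup might pass flags to the validators (the flag values here are hypothetical and depend entirely on what the problem's validators accept):

```yaml
input_validator_flags: --max_n 1000
output_validator_flags: case_insensitive
```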
Skipping Execution of Samples
In some problems, the author may want to provide samples in the statement for demonstrative purposes, but does not want them to be run. One example is a problem that requires probabilistic methods to solve: the sample input is small for the sake of demonstration, but the intended solution works reliably only for large test cases. The option `run_samples` may be used to signal to the judging system whether samples should or should not be run.
To prevent sample test cases from being run, the author may specify `run_samples: false` in the root `data` group's `testdata.yaml` file. This option may only be used at the root, and if omitted, its value defaults to `true`. The samples should still be validated by input and output validators. Note that the keys `input_validator_flags` and `output_validator_flags` in `testdata.yaml`, as described above, may be used to alter the behavior of the validators for the sample test data group.
Invalid Test Cases
The `data` directory may contain directories of test cases that must be rejected by validation. Their goal is to ensure the integrity and quality of the test data and validation programs.
Invalid input
The files under `invalid_input` are invalid inputs. Unlike in `sample` and `secret`, there are no `.ans` files. Each `tc.in` under `invalid_input` must be rejected by at least one input validator.
Invalid output
The test cases in `invalid_output` describe invalid outputs for non-interactive problems. They consist of three files: the input file `tc.in` must pass input validation, the default answer file `tc.ans` must pass output validation, and the output file `tc.out` must fail output validation.
In particular, for an existing feedback directory `dir`,

```
<output_validator_program> tc.in tc.ans dir [flags] < tc.ans   # MUST PASS
<output_validator_program> tc.in tc.ans dir [flags] < tc.out   # MUST FAIL
```

These invalid test case directories can be organised into a tree-like structure similar to `secret` and contain flags in `testdata.yaml` files that are passed to the validators.
Included Files
Files that should be included with all submissions are provided in one non-empty directory per supported language. Files that should be included for all languages are placed in the non-empty directory `include/default/`. Files that should be included for a specific language, overriding the default, are provided in the non-empty directory `include/<language>/`. If either no included files are provided, or default files are provided, then the set of languages in which a submission is allowed to be is unrestricted. However, if included files are provided for specific languages but no default files are provided, then the set of languages is restricted to the specified languages.
The files should be copied from a language directory, based on the language of the submission, to the submission files before compiling, but after checking whether the submission exceeds the code limit, overwriting files from the submission in the case of a name collision. Language must be given as one of the language codes in the language table in the overview section. If any of the included files are supposed to be the main file (i.e., a driver), that file must have the language-dependent name as given in the table referred to above.
Example Submissions
Correct and incorrect solutions to the problem are provided in subdirectories of submissions/
. The possible subdirectories are:
Value | Requirement | Comment |
---|---|---|
accepted | Accepted as a correct solution for all test cases. | At least one is required. |
partially_accepted | Overall verdict must be Accepted. Overall score must be less than max_score . | Must not be used for pass-fail problems. |
rejected | Is rejected (i.e., not accepted) for any reason in the final verdict. | |
wrong_answer | Wrong answer for some test case. May be too slow or crash for other test cases. | |
time_limit_exceeded | Too slow relative to the safety margin for some test case. May output wrong answers or crash for other test cases. | |
run_time_error | Crashes for some test case. May output wrong answers or be too slow for other test cases. | Very rarely useful. |
Every file or directory in these directories represents a separate solution. It is mandatory to provide at least one accepted solution.
Metadata about the example submissions are provided in a YAML file named `submissions.yaml` placed directly in the `submissions/` directory. The top-level keys in `submissions.yaml` are globs matching example submissions. For example, `accepted/*` would match any submission in the `submissions/accepted/` directory.
Each glob maps to a map with keys as defined below, specifying metadata for all submissions that are matched by the glob.
| Key | Type | Default | Comment |
|---|---|---|---|
| language | String | As determined by file endings given in the language list | |
| entrypoint | String | As specified in the language list | |
| authors | Person or sequence of persons | | Author(s) of the submission(s) |
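As an illustration, a `submissions.yaml` might look as follows. The file names, author names, and the `python3` language code are hypothetical; valid language codes come from the language table in the overview section.

```yaml
# submissions.yaml — hypothetical example
accepted/*:
  authors: Jane Judge
accepted/fast.py:
  language: python3
  authors:
    - Jane Judge
    - John Jury
```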
Input Validators
Input validators, for verifying the correctness of the input files, are provided in `input_validators/`. Input validators can be specified as VIVA files (with file ending `.viva`), Checktestdata files (with file ending `.ctd`), or as programs. Programs must either be written in C++ or Python 3, or must provide a `build` or `run` script as specified above.
All input validators provided will be run on every input file. Validation fails if any validator fails.
Invocation
An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.
All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.
When invoked, the input validator will get the input file on stdin.
It should be possible to invoke the validator on the command line as follows:
<input_validator_program> [arguments] < inputfile
Output
The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.
Exit codes
The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.
Dependencies
The validator MUST NOT read any files outside those defined in the Invocation section. Its result MUST depend only on these files and the arguments.
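A minimal input validator in Python might look like the sketch below. The input format it checks (a line with a count n, then a line of exactly n integers) is invented for illustration; only the stdin/exit-code conventions come from this specification.

```python
import re
import sys

def validate(data: str) -> None:
    """Raise AssertionError if `data` is not a valid input file
    for our hypothetical problem."""
    lines = data.split("\n")
    # First line: a positive integer n, with no leading zeros.
    assert re.fullmatch(r"[1-9][0-9]*", lines[0]), "first line must be a positive integer"
    n = int(lines[0])
    # Second line: exactly n space-separated integers.
    assert re.fullmatch(r"-?[0-9]+( -?[0-9]+)*", lines[1]), "bad integer line"
    assert len(lines[1].split(" ")) == n, "wrong number of integers"
    # Text files must end with exactly one trailing LF.
    assert len(lines) == 3 and lines[2] == "", "file must end with a single newline"

def main() -> int:
    try:
        validate(sys.stdin.read())
    except AssertionError as err:
        print(err, file=sys.stderr)
        return 1   # any exit code other than 42 marks the input as invalid
    return 42      # exit code 42 signals successful validation
```

A deployed validator would end with `sys.exit(main())`, so that it can be invoked as `./input_validator [arguments] < inputfile`.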
Output Validator
Overview
An output validator is a program that is given the output of a submitted program, together with the corresponding input file, and a correct answer file for the input, and then decides whether the output provided is a correct output for the given input file.
A validator program must be an application (executable or interpreted) capable of being invoked with a command line call. The details of this invocation are described below. The validator program has two ways of reporting back the results of validating:
- The validator must give a judgement (see Reporting a judgement).
- The validator may give additional feedback, e.g., an explanation of the judgement to humans (see Reporting Additional Feedback).
A custom output validator is used if the problem requires more complicated output validation than what is provided by the default diff variant described below. It must be provided as the directory `output_validator/`. It must either be written in C++ or Python 3, or must provide a `build` or `run` script as specified above. It must adhere to the output validator specification described below. If no custom validator is provided, the default output validator is used.
The output validator will be run on the output for every test data file using the arguments specified for the test data group.
Default Output Validator Specification
The default output validator is essentially a beefed-up diff. In its default mode, it tokenizes the files to compare and compares them token by token. It supports the following command-line arguments to control how tokens are compared.
| Arguments | Description |
|---|---|
| `case_sensitive` | Indicates that comparisons should be case-sensitive. |
| `space_change_sensitive` | Indicates that changes in the amount of whitespace should be rejected (by default, any sequence of one or more whitespace characters is equivalent). |
| `float_relative_tolerance ε` | Indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details). |
| `float_absolute_tolerance ε` | Indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details). |
| `float_tolerance ε` | Short-hand for applying ε as both relative and absolute tolerance. |
When both a relative and an absolute tolerance are supplied, a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating-point numbers is accepted for floating-point tokens. So, for instance, if a token in the answer file says `0.0314`, a token of `3.14000000e-2` in the output file would be accepted.
It is an error to provide any of the `float_relative_tolerance`, `float_absolute_tolerance`, or `float_tolerance` arguments more than once, or to provide `float_tolerance` alongside `float_relative_tolerance` and/or `float_absolute_tolerance`.
If no floating point tolerance has been set, floating point tokens are treated just like any other token and have to match exactly.
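The acceptance rule for a single token pair can be sketched as follows. This is a simplification: the real default validator also tokenizes the files and handles the other flags, and the function name here is invented.

```python
def float_token_ok(judge_token: str, team_token: str,
                   rel_tol=None, abs_tol=None) -> bool:
    """Accept `team_token` if it is within the relative OR the absolute
    tolerance of `judge_token`. With no tolerance set, tokens must
    match exactly as strings."""
    if rel_tol is None and abs_tol is None:
        return judge_token == team_token
    try:
        judge, team = float(judge_token), float(team_token)
    except ValueError:
        # Not parseable as floats: fall back to exact comparison.
        return judge_token == team_token
    diff = abs(judge - team)
    if abs_tol is not None and diff <= abs_tol:
        return True
    if rel_tol is not None and diff <= rel_tol * abs(judge):
        return True
    return False
```

With a relative tolerance set, the answer token `0.0314` accepts the output token `3.14000000e-2`, since both parse to the same value; without any tolerance, the same pair is rejected because the strings differ.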
Invocation
When invoked, the output validator is passed at least three command-line parameters, and the output stream to validate is provided on stdin.
It should be possible to invoke the validator on the command line as follows:
<output_validator_program> input judge_answer feedback_dir [additional_arguments] < team_output [ > team_input ]
The parameters listed above have the following meaning:

- `input`: a string specifying the name of the input data file that was used to test the program whose results are being validated.
- `judge_answer`: a string specifying the name of an arbitrary "answer file" which acts as input to the validator program. The answer file may, but is not necessarily required to, contain the "correct answer" for the problem. For example, it might contain the output that was produced by a judge's solution for the problem when run with the input file as input. Alternatively, the answer file might contain information, in arbitrary format, which instructs the validator in some way about how to accomplish its task. The meaning of the contents of the answer file is not defined by this format.
- `feedback_dir`: a string specifying the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the output file. The `feedback_dir` must end with a path separator (typically '/' or '\' depending on the operating system), so that simply appending a filename to `feedback_dir` gives the path to a file in the feedback directory.
- `additional_arguments`: in case the problem specifies additional `validator_flags`, these are passed as additional arguments to the validator on the command line.
- `team_output`: the output produced by the program being validated is given on the validator's standard input pipe.
- `team_input`: when running the validator in interactive mode, everything written on the validator's standard output pipe is given to the program being validated. Note that when running interactively, the program only receives the output produced by the validator and does not have direct access to the input file.
The two files pointed to by input and judge_answer must exist (though they are allowed to be empty) and the validator program must be allowed to open them for reading. The directory pointed to by feedback_dir must also exist.
Reporting a judgement
A validator program is required to report its judgement by exiting with specific exit codes:
- If the output is a correct output for the input file (i.e., the submission that produced the output is to be Accepted), the validator exits with exit code 42.
- If the output is incorrect (i.e., the submission that produced the output is to be judged as Wrong Answer), the validator exits with exit code 43.
Any other exit code (including 0!) indicates that the validator did not operate properly, and the judging system invoking the validator must take measures to report this to contest personnel. The purpose of these somewhat exotic exit codes is to avoid conflicts with other exit codes that may result when the validator crashes. For instance, if the validator is written in Java, any unhandled exception results in the program crashing with an exit code of 1, making it unsuitable to assign a judgement meaning to this exit code.
Reporting Additional Feedback
The purpose of the feedback directory is to allow the validator program to report more information to the judging system than just the accept/reject verdict. Using the feedback directory is optional for a validator program, so if one just wants to write a bare-bones minimal validator, it can be ignored.
The validator is free to create different files in the feedback directory, in order to provide different kinds of information to the judging system, in a simple but organized way. For instance, there may be a "judgemessage.txt" file, the contents of which gives a message that is presented to a judge reviewing the current submission (typically used to help the judge verify why the submission was judged as incorrect, by specifying exactly what was wrong with its output). Other examples of files that may be useful in some contexts (though not in the ICPC) are a score.txt file, giving the submission a score based on other factors than correctness, or a teammessage.txt file, giving a message to the team that submitted the solution, providing additional feedback on the submission.
A judging system that implements this format must support the judgemessage.txt file described above (i.e., the content of the judgemessage.txt file, if produced by the validator, must be provided by the judging system to a human judge examining the submission). Support for other files is optional.
Note that a validator may choose to ignore the feedback directory entirely. In particular, the judging system must not assume that the validator program creates any files there at all.
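A bare-bones custom output validator in Python, following the invocation and exit-code conventions above, might look like this. The token-by-token comparison is just an illustrative choice, and the function name is invented.

```python
def judge(input_path: str, answer_path: str, feedback_dir: str,
          team_output: str) -> int:
    """Return 42 (accepted) or 43 (wrong answer) for `team_output`.
    `input_path` is always passed but unused in this sketch."""
    with open(answer_path) as f:
        expected = f.read().split()
    actual = team_output.split()
    if actual == expected:
        return 42
    # Optional feedback for the human judge; feedback_dir already
    # ends with a path separator, so plain concatenation works.
    with open(feedback_dir + "judgemessage.txt", "w") as f:
        f.write("Expected %r, got %r\n" % (expected, actual))
    return 43
```

When deployed, the script would end with `sys.exit(judge(sys.argv[1], sys.argv[2], sys.argv[3], sys.stdin.read()))`.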
Multi-pass validation
A multi-pass validator can be used for problems where the submission should be run multiple times sequentially, using a new input generated by the output validator during the previous invocation of the submission.

The time and memory limits apply to each invocation separately.
To signal that the submission should be run again, the output validator must exit with code 42 and write the new input to the file `nextpass.in` in the feedback directory. Judging stops if no `nextpass.in` was created or the output validator exited with any other code. Note that `nextpass.in` is removed before the next pass.

It is a judge error to create the `nextpass.in` file and exit with any code other than 42.

All other files inside the feedback directory are guaranteed to persist between passes. In particular, the validator should only append text to judgemessage.txt, so that it provides combined feedback for all passes.
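The pass-signalling convention can be sketched as a small helper for a multi-pass output validator. The helper name is invented; only the `nextpass.in` filename and exit code 42 come from this specification.

```python
import os

def finish_pass(feedback_dir: str, next_input=None) -> int:
    """Exit-code helper for a multi-pass output validator.

    If `next_input` is given, write it to nextpass.in in the feedback
    directory so the submission is run again with that input; otherwise
    accept without creating nextpass.in, which stops judging."""
    if next_input is not None:
        with open(os.path.join(feedback_dir, "nextpass.in"), "w") as f:
            f.write(next_input)
    return 42  # creating nextpass.in and exiting 42 triggers another pass
```

The validator would call `sys.exit(finish_pass(feedback_dir, new_input))` to request another pass, or `sys.exit(finish_pass(feedback_dir))` to accept on the final pass.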
Samples for multi-pass problems must be provided in a `.interaction` file, as for interactive problems. Passes are separated by a line containing `---` (three dashes). When the problem is not interactive, simply start each pass with a number of lines starting with `<`, containing the sample input, followed by some lines starting with `>`, containing the sample answer.
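For a hypothetical non-interactive two-pass problem, such a `.interaction` file could look like this (all contents invented for illustration):

```
<3
<1 2 3
>6
---
<2
<4 5
>9
```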
Examples
An example of a judgemessage.txt file:
Team failed at test case 14.
Team output: "31", Judge answer: "30".
Team failed at test case 18.
Team output: "hovercraft", Judge answer: "7".
Summary: 2 test cases failed.
An example of a teammessage.txt file:
Almost all test cases failed — are you even trying to solve the problem?
Validator standard output and standard error
A validator program is allowed to write any kind of debug information to its standard error pipe. This information may be displayed to the user upon invocation of the validator.
Scoring
For pass-fail problems, the verdict of a submission is the first non-accepted verdict, where test cases are run in lexicographical order of their full file paths (note that `sample` comes before `secret` in this order).
In scoring problems, submissions are given a non-negative score. The goal of the submission is to maximize the score.
Given a submission, scores are determined for test cases, test groups, and the submission itself.
The score of a failed test case is always 0. By default, the score of an accepted test case is 1. If a custom output validator produces a score, or if a different `score` is set in `testdata.yaml`, then the product of those values is used instead.
The score of a test group is determined by its subgroups and test cases. If it has no subgroups or test cases, its score is 0. Otherwise, the score depends on the aggregation mode, which is either `min` or `sum`; the score of the group is then the minimum or the sum of all sub-scores, respectively. If using `min` aggregation, the verdict must not be accepted if any sub-verdict is rejected. If using `sum` aggregation, the verdict must be accepted if any sub-verdict is accepted.
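The two aggregation modes can be sketched as follows. This is a simplification that aggregates plain numeric scores only; real judging also tracks verdicts as described above, and the function name is invented.

```python
def group_score(sub_scores, aggregation: str):
    """Aggregate the scores of a group's subgroups and test cases."""
    if not sub_scores:
        return 0  # a group with no subgroups or test cases scores 0
    if aggregation == "min":
        return min(sub_scores)
    if aggregation == "sum":
        return sum(sub_scores)
    raise ValueError("aggregation must be 'min' or 'sum'")
```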
The score of the submission is its score on the topmost test group `data`.
The score must be 0 whenever the topmost verdict is not accepted. Generally, an accepted verdict is accompanied by a strictly positive score, but note that there can be a distinction between a non-accepted verdict and an accepted verdict with score 0: an accepted verdict with a score of 0 could mean that the output is well formed but not good enough to score points.
For example, consider an optimization problem such as the Travelling Salesperson Problem. The goal is to provide the shortest possible route, and the score is determined by how short a route the submission outputs. If a submission instead outputs the longest possible route, it may get a verdict of accepted, because the output is well formed. However, the route is so long that by most metrics, the submission's score would be 0. In most cases, an output validator should probably reject such a test case rather than accept it with a score of 0, but this is a detail left to the author of the problem.
The scoring behavior is configured by the following flags under `scoring` in `testdata.yaml`:
| Key | Type | Description |
|---|---|---|
| score | String | The score assigned to an accepted input file in the group. If a scoring output validator is used, this score is multiplied by the score from the validator. |
| max_score | String | A number specifying the maximum score allowed for this test group. It is an error to exceed this. |
| aggregation | `sum` or `min` | If `sum`, the score is the sum of the subresult scores. If `min`, the score is the minimum of the subresult scores. |
The defaults are as follows:

- The `data` and `secret` groups: `score` is 1, `aggregation` is `sum`.
- The `sample` group: `score` is 0, `aggregation` is `sum`.
- Other groups: `score` is 1, `aggregation` is `min`.
- `max_score` is the sum of the `max_score` of the subresults if `aggregation` is `sum`, and the minimum of the `max_score` of the subresults if `aggregation` is `min`. For individual test cases, the `max_score` here means the `score` value of the group they are in.
These defaults are chosen to minimize the configuration needed for a typical IOI-style problem, where points are given for passing all test cases in test data groups called subtasks. To achieve that behaviour, create groups `secret/subtask1`, `secret/subtask2`, …, and set the `score` value in each `secret/subtaskX/testdata.yaml`.
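For example, a subtask worth 25 points could use a `testdata.yaml` like the following (the point value is hypothetical):

```yaml
# secret/subtask1/testdata.yaml
scoring:
  score: 25
```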