Problem Package Format

This is the 2023-07-draft version of the Kattis problem package format.

Overview

This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.

General Requirements

  • The package must consist of a single directory containing files as described below. The directory name must consist solely of lower case letters a–z and digits 0–9. Alternatively, the package can be a ZIP-compressed archive of such a directory with identical base name and extension .kpp or .zip.
  • All file names for files included in the package must match the regexp
    [a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]
    
    i.e., they must be of length at least 2, consist solely of lower or upper case letters a–z, A–Z, digits 0–9, period, dash, or underscore, but must not begin or end with period, dash, or underscore.
  • All text files for a problem must be UTF-8 encoded and not have a byte-order mark (BOM).
  • All text files must have Unix-style line endings (newline/LF byte only). Note that LF is line-ending and not line-separating in POSIX, which means that all non-empty text files must end with a newline.
  • All floating-point numbers must be given as the external character sequences defined by IEEE 754-2008 and may use up to double precision.

Programs

There are a number of different kinds of programs that may be provided in the problem package: submissions, input validators, and output validators. All programs are always represented by a single file or directory. In other words, if a program consists of several files, these must be provided in a single directory. In the case that a program is a single file, it is treated as if a directory with the same name takes its place, which contains only that file. The name of the program, for the purpose of referring to it within the package, is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.

Validators, but not submissions, in the form of a directory may include two POSIX-compliant scripts, build and run. If at least one of these two files is included:

  1. First, if the build script is present, it will be run. The working directory will be (a copy of) the program directory. The run file must exist after build is done.
  2. Then, the run file (which now exists) must be executable, and will be invoked in the same way as a single file program.
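
As an illustration, a build script for a validator provided as a directory could be as simple as the following sketch (the source file name validate.cpp and the compiler invocation are hypothetical; any POSIX-compliant script that produces an executable run file works):

#!/bin/sh
# Compile the validator; after this script finishes, an executable file
# named "run" must exist in the program directory.
g++ -O2 -o run validate.cpp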

Programs without build and run scripts are built and run according to what language is used. Language is determined by the language key in submissions.yaml if present (for submissions only); otherwise, by looking at the file endings as specified in the languages table. For validators, this language must be C++ or Python 3. If a single language can't be determined, building fails.

For languages where there could be several entry points, the entrypoint specified by the entrypoint key in submissions.yaml if present (for submissions only) is used; otherwise, the default entry point in the languages table will be used.

The binary (and other artifacts that result from compiling the program) must be placed in the program's directory (or a copy of it). Each submission must be run with a working directory that contains (a copy of) the submitted files and any compiled binaries, but it must not contain any of the files described in the "Test data" section. Each input validator must be run with a working directory that contains the files in the program directory of the input validator in question. Each output validator must be run with a working directory that contains the submitted files and any compiled binaries of the submission being validated.

Problem Metadata

Metadata about the problem (e.g., source, license, limits) are provided in a YAML file named problem.yaml placed in the root directory of the package.

The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.

Key Type Default Comments
problem_format_version String Required. Version of the Problem Package Format used for this package. If using this version of the Format, it must be the string 2023-07-draft. The string will be in the form <yyyy>-<mm> for a stable version, <yyyy>-<mm>-draft or draft for a draft version, or legacy or legacy-icpc for the version before the addition of problem_format_version. Documentation for version <version> is available at https://www.kattis.com/problem-package-format/spec/problem_package_format/<version>.
type String or sequence of strings pass-fail Type of problem. Values must be among those defined below.
name String or map of strings Required. The name of the problem in each of the languages for which a problem statement exists. See below for details.
uuid String Required. UUID identifying the problem.
credits String or map with keys as defined below. Who should get credits. See below for details.
source String, a sequence, or a map as defined below. The problem set from which the problem originates. See below for details.
license String unknown License under which the problem may be used. Value must be one of those defined below.
rights_owner String Value of authors, if present, otherwise value of source. Owner of the copyright of the problem. If rights_owner is not present, the authors are the owners; if authors is not present either, the source is the owner. Required if license is something other than unknown or public domain. Forbidden if license is public domain.
limits Map with keys as defined below see definition below
keywords Sequence of strings List of keywords.
languages String or sequence of strings all List of programming languages or the string all.
constants Map of strings to int, float, or string Global constant values used by the problem. See definition below.
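
As an illustration, a minimal problem.yaml for a pass-fail problem could look as follows (all values, including the UUID, are hypothetical):

problem_format_version: 2023-07-draft
type: pass-fail
name:
  en: Adding Apples
uuid: 12345678-1234-1234-1234-123456789abc
credits:
  authors: Authy McAuth
source: Example Contest 2023
license: cc by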

Type

Allowed values for type.

Value Incompatible with Comments
pass-fail scoring Default. Submissions are judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc).
scoring pass-fail A submission that is accepted is additionally given a score, which is a non-negative numeric value (and the goal is to maximize this value).
multi-pass submit-answer A submission should be run multiple times, with the input for the next pass generated by the output validator during the current pass.
interactive submit-answer The output validator is run interactively with the submission.
submit-answer multi-pass, interactive A submission consists of the answers to the test cases, instead of source code for a program that produces the answers.

Name

If there are statements in more than one language, the name field must be a map with the language codes as keys and the problem names as values. The set of languages for which name is given must exactly match the set of languages for which a problem statement exists.

If only a single problem statement exists, name may be a string giving the name of the problem in that language (but a map with a single key is also allowed).
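
For example, a problem with statements in English and Swedish could specify (the names are hypothetical):

name:
  en: Adding Apples
  sv: Addera äpplen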

Credits

Allowed keys for credits.

A person is specified as either a string with their full name (optionally with an email formatted as Full Name <[email protected]>), or a map with keys name and email, which both have values of type string.

Key Type Default Comments
authors Person or sequence of persons The people who conceptualized the problem.
contributors Person or sequence of persons The people who developed the problem package, such as statement, validators, and test data.
testers Person or sequence of persons The people who tested the problem package, for example by providing a solution and reviewing the statement.
translators Map of strings to sequences of persons The people who translated the statement to other languages. Each key must be an ISO 639 language code.
acknowledgements Person or sequence of persons Extra acknowledgements or special thanks that do not fit the other categories.

The examples

credits:
  authors: [Author One, Author Two <[email protected]>]

and

credits:
  authors:
  - name: Author One
  - name: Author Two
    email: [email protected]

are both valid ways to describe the same two authors.

A full example would be

credits:
  authors: Authy McAuth <[email protected]>
  contributors:
  - Authy McAuth <[email protected]>
  - name: Additional Contributor
    email: [email protected]
  testers:
  - Tester One
  - Tester Two
  - Tester Three
  translators:
    da:
    - name: Mads Jensen
      email: [email protected]
    eo:
    - Ludoviko Lazaro Zamenhofo
  acknowledgements:
    - Inspirational Speaker 1
    - Inspirational Speaker 2

which demonstrates all the available credit types.

Credits are sometimes omitted when authors instead choose to only give source credit, but both may be specified. If a string is provided instead of a map for credits, such as

credits: Authy McAuth <[email protected]>

it is treated as if only a single author is being specified, so it is equivalent to

credits:
  authors: Authy McAuth <[email protected]>

to support a less verbose credits section.

Source

The source is defined by a sequence of maps, each containing two keys, name and url. The key name is required, but the key url is optional. If url is omitted, a string may be used instead of a map to specify the name. If the sequence contains only one item, whether it be a string or a map, then it can be specified as a single string or a map.

If specified, each key name should typically map to the name (and year) of the problem set (such as a contest or a course) where the problem was first used or for which it was created, and the key url should map to a link to the event's page.

A sequence is used in the special case where the first use was from multiple (typically simultaneous) sources.
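
For example, a problem first used at a single (hypothetical) contest could be described as:

source:
  name: Example Contest 2023
  url: https://contest.example.org/2023

or, since url is optional, simply as source: Example Contest 2023.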

License

Allowed values for license.

Values other than unknown or public domain require rights_owner to have a value.

Value Comments Link
unknown The default value. In practice means that the problem can not be used.
public domain There are no known copyrights on the problem, anywhere in the world. http://creativecommons.org/about/pdm
cc0 CC0, "no rights reserved". http://creativecommons.org/about/cc0
cc by CC attribution. http://creativecommons.org/licenses/by/4.0/
cc by-sa CC attribution, share alike. http://creativecommons.org/licenses/by-sa/4.0/
educational May be freely used for educational purposes.
permission Used with permission. The rights owner must be contacted for every additional use.

Limits

A map with the following keys:

Key Comments Default Typical system default
time_multipliers optional see below
time_limit optional float, in seconds see below
time_resolution optional float, in seconds 1.0
memory optional, in MiB system default 2048
output optional, in MiB system default 8
code optional, in KiB system default 128
compilation_time optional, in seconds system default 60
compilation_memory optional, in MiB system default 2048
validation_time optional, in seconds system default 60
validation_memory optional, in MiB system default 2048
validation_output optional, in MiB system default 8

For most keys, the system default is used if nothing is specified. System defaults can vary, but you SHOULD assume that they are reasonable. Only specify limits when the problem needs a specific limit, but in that case do specify them even if the "typical system default" happens to be what is needed.
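
For example, a problem that genuinely needs more memory and a larger output limit than usual might specify only those two keys (the values below are hypothetical) and leave everything else to the system defaults:

limits:
  memory: 4096
  output: 16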

Problem Timing

time_multipliers is a map with the following keys:

Key Comments Default
ac_to_time_limit float 2.0
time_limit_to_tle float 1.5

The value of time_limit is an integer or floating-point problem time limit in seconds. The time multipliers specify safety margins relative to the slowest accepted submission, T_ac, and fastest time_limit_exceeded submission, T_tle. The time_limit must satisfy T_ac * ac_to_time_limit <= time_limit and time_limit * time_limit_to_tle <= T_tle. In these calculations, T_tle is treated as infinity if the problem does not provide at least one time_limit_exceeded submission.

If no time_limit is provided, the default value is the smallest integer multiple of time_resolution that satisfies the above inequalities. It is an error if no such multiple exists. The time_resolution key is ignored if the problem provides an explicit time limit (and in particular, the time limit is not required to be a multiple of the resolution). Since time multipliers are more future-proof than absolute time limits, avoid specifying time_limit whenever practical.

Judge systems should make a best effort to respect the problem time limit, and should warn when importing a problem whose time limit is specified with precision greater than can be resolved by system timers.
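
As a worked example with hypothetical timings: suppose the slowest accepted submission takes T_ac = 0.8 seconds and the defaults ac_to_time_limit = 2.0, time_limit_to_tle = 1.5 and time_resolution = 1.0 are used. Any explicit time_limit must then satisfy 0.8 * 2.0 = 1.6 <= time_limit as well as time_limit * 1.5 <= T_tle. If no time_limit is given, the default becomes 2.0 seconds (the smallest integer multiple of 1.0 that is at least 1.6), which in turn requires every time_limit_exceeded submission to take at least 3.0 seconds.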

Languages

A list of programming language codes from the table below or all.

If a list is given, the problem may only be solved using those programming languages.

File endings in parentheses are not used for determining the language.

Code Language Default entry point File endings
algol68 Algol 68 .a68
apl APL .apl
bash Bash .sh
ada Ada .adb, .ads
c C .c
cgmp C with GMP (.c)
cobol COBOL .cob
cpp C++ .cc, .cpp, .cxx, .c++, .C
cppgmp C++ with GMP (.cc, .cpp, .cxx, .c++, .C)
crystal Crystal .cr
csharp C# .cs
dart Dart .dart
fsharp F# .fs
fortran Fortran .f90
gerbil Gerbil .ss
go Go .go
haskell Haskell .hs
java Java Main .java
javascript JavaScript main.js .js
julia Julia .jl
kotlin Kotlin MainKt .kt
lisp Common Lisp main.{lisp,cl} .lisp, .cl
lua Lua .lua
nim Nim .nim
objectivec Objective-C .m
ocaml OCaml .ml
octave Octave (.m)
odin Odin .odin
pascal Pascal .pas
perl Perl .pm, (.pl)
php PHP main.php .php
prolog Prolog .pl
python2 Python 2 main.py2 (.py), .py2
python3 Python 3 main.py .py, .py3
python3numpy Python 3 with NumPy main.py (.py, .py3)
ruby Ruby .rb
rust Rust .rs
scala Scala .scala
simula Simula .sim
snobol Snobol .sno
swift Swift .swift
typescript TypeScript .ts
visualbasic Visual Basic .vb
zig Zig .zig

Constants

A map of names to values. Names must match the following regex: [a-zA-Z_][a-zA-Z0-9_]*. Constant sequences are tokens (regex words) of the form {{name}}, where name is one of the names defined in constants. Tags {{xyz}} containing a name that is not defined are not modified, but may trigger a warning.

All constant sequences in the following files will be replaced by the value of the corresponding constant:

  • problem statements
  • input and output validators
  • included code
  • example submissions
  • testdata.yaml

Constant sequences are not replaced in test data files or in problem.yaml itself.
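
For example, a problem could declare a constant in problem.yaml (the name max_n is hypothetical):

constants:
  max_n: 100000

Every occurrence of the token {{max_n}} in, e.g., the problem statement or an input validator is then replaced by 100000.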

Problem Statements

The problem statement of the problem is provided in the directory problem_statement/.

This directory must contain one file per language, for at least one language, named problem.<language>.<filetype>, that contains the problem text itself, including input and output specifications. Language must be given as the shortest ISO 639 code. If needed, a hyphen and an ISO 3166-1 alpha-2 code may be appended to an ISO 639 code. Optionally, the language code can be left out; the default is then English (en). Filetype can be tex for LaTeX files, md for Markdown, or pdf for PDF.

Please note that many kinds of transformations of the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems, will not be possible for PDF problem statements, so this format should be avoided if at all possible.

Auxiliary files needed by the problem statement files must all be in problem_statement/. problem.<language>.<filetype> should reference auxiliary files as if the working directory is problem_statement/. Image file formats supported are .png, .jpg, .jpeg, and .pdf.

Sample Data

  • For problem statements provided in LaTeX or Markdown: the statement file must contain only the problem description and input/output specifications and no sample data. It is the judge system's responsibility to append the sample data.
  • For problem statements provided as PDFs: the judge system will display the PDF verbatim, and therefore any sample data must be included in the PDF. The judge system is not required to reconcile sample data embedded in PDFs with the sample test data group nor to validate it in any other way.

LaTeX Environment and Supported Subset

Problem statements provided in LaTeX must consist only of the problem statement body (i.e., the content that would be placed within a document environment). It is the judging system's responsibility to wrap this text in an appropriate LaTeX class.

The LaTeX class shall provide the convenience environments Input, Output, and Interaction for delineating sections of the problem statement. It shall also provide the following commands:

  • \problemname{name}, which must be the first line of the problem statement. name gives a LaTeX-formatted problem name to be used when rendering the problem statement header. This argument can be empty, in which case the name value matching the problem statement's language from problem.yaml is used instead.
  • \illustration{width}{filename}{caption}, a convenience command for adding a figure to the problem statement. width is a floating-point argument specifying the width of the figure, as a fraction of the total width of the problem statement; filename is the image to display, and caption, the text to include below the figure. The illustration should be flushed right with text flowing around it (as in a wrapfigure).

Arbitrary LaTeX is not guaranteed to render correctly by HTML-based judging systems. However, judging systems must make a best effort to correctly render at minimum the following LaTeX subset when displaying a LaTeX problem statement:

  1. All MathJax-supported TeX commands within inline ($ $) and display ($$ $$) math mode.
  2. The following text-mode environments: itemize, enumerate, lstlisting, verbatim, quote, center, tabular, figure, wrapfigure (from the wrapfig package).
  3. \item within list environments and \hline, \cline, \multirow, \multicol within tables.
  4. The following typesetting constructs: smart quotes (' ', << >>, `` ''), dashes (--, ---), non-breaking space (~), ellipses (\ldots and \textellipsis), and \noindent.
  5. The following font weight and size modifiers: \bf, \textbf, \it, \textit, \t, \tt, \texttt, \emph, \underline, \sout, \textsc, \tiny, \scriptsize, \small, \normalsize, \large, \Large, \LARGE, \huge, \Huge.
  6. \includegraphics from the package graphicx, including the Polygon-style workaround for scaling the image using \def \htmlPixelsInCm.
  7. The miscellaneous commands \url, \href, \section, \subsection, and \epigraph.

Attachments

Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory attachments/.

Solution description

A description of how the problem is intended to be solved is provided in the directory solution/.

This directory must contain one file per language, for at least one language, named solution.<language>.<filetype>. Language is given the same way as for problem statements. Optionally, the language code can be left out; the default is then English (en). The set of languages used can be different from what was used for the problem statement. Filetype can be tex for LaTeX files, md for Markdown, or pdf for PDF.

Auxiliary files needed by the solution description files must all be in solution/. solution.<language>.<filetype> should reference auxiliary files as if the working directory is solution/.

Exactly how the solution description is used is up to the user or tooling.

Test data

The test data are provided in subdirectories of data/: the sample data in data/sample/ and the secret data in data/secret/.

All files and directories associated with a single test case have the same base name with varying extensions. Here base name is defined to be the relative path from the data directory to the test case input file, without extensions. For example, the files secret/test.in and secret/test.ans are associated with the same test case that has the base name secret/test. The existence of the .in file declares the existence of the test case. If the test case exists, then an associated .ans file must exist while the others are optional. If the test case does not exist, then the other files must not exist. The table below summarizes the supported test data:

Extension Described In Summary
.in Input Input piped to standard input
.ans Expected (AC) answer
.hint Annotations Hint for solving the test case
.desc Annotations Purpose of the test
.png, .jpg, .jpeg, .svg Annotations Illustration of the test case
.interaction Interactive Problems Interaction protocol sample
.args Input Input passed as command-line arguments
.files Input Input available via file I/O

Judge systems may assume that the result of running a program on a test case is deterministic. For any two test cases, if the contents of their .in and .args files and .files directory are equivalent, then the input of the two test cases is equivalent. This means that for any two test cases, if their input, output validator flags (see test data groups below) and the contents of their .ans files are equivalent, then the test cases are equivalent. The assumption of determinism means that a judge system could choose to reuse the result of a previous run, or to re-run the equivalent test case.

Input

Each test case can supply input via standard input, command-line arguments, and/or the file system. These options are not exclusive. For a test case with base name test, the file test.in is piped to the submission as standard input. The submission will be run with the whitespace-separated tokens in test.args as command-line arguments, if the file exists. Note that the submission's entry point, whether a binary or an interpreted file, will usually be the very first command-line argument. However, there are languages, such as Java, where there is no initial command-line argument representing the entry point.

The directory test.files, if it exists, contains privileged data files available to the submission via file I/O. All files in this directory must be copied into the submission's working directory after compiling, but before executing the submission, possibly overwriting the compiled submission file or included data in the case of name conflicts.

Annotations

One hint, description, and/or illustration file may be provided per test case. The files must share the base name of the associated test case. Description and illustration files are meant to be privileged information.

Category File type Filename extension Remark
hint text .hint
description text .desc privileged information
illustration image .png, .jpg, .jpeg, or .svg privileged information
  • A hint provides feedback for solving a test case to, e.g., somebody whose submission didn't pass.

  • A description conveys the purpose of a test case. It is an explanation of what aspect or edge case of the solution the input file is meant to test.

  • An illustration provides a visualization of the associated test case. Note that at most one image file may exist for each test case.

Interactive Problems

For interactive problems, any sample test cases must provide an interaction protocol as a text file with the extension .interaction for each sample demonstrating the communication between the submission and the output validator, meant to be displayed in the problem statement. An interaction protocol consists of a series of lines starting with > and <. Lines starting with > signify an output from the submission to the output validator, while < signify an input from the output validator to the submission.

A sample test case may have just an .interaction file without corresponding .in and .ans files. However, if either a .in or an .ans file is present, the other must also be present. Unlike .in and .ans files for non-interactive problems, interactive .in and .ans files must not be displayed to teams: not in the problem statement, nor as part of the sample input download. If you want to provide files related to interactive problems (such as testing tools or input files) you can use attachments.
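
As a purely hypothetical example for a number-guessing problem, a sample .interaction file could look as follows, where the validator first sends an upper bound and then answers each guess:

<1000
>500
<lower
>250
<correct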

Test Data Groups

The test data for the problem can be organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subgroups of test data (i.e., subdirectories).

At the top level, the test data is divided into exactly two groups: sample and secret. These two groups may be further split into subgroups as desired.

The result of a test data group is computed by applying a grader to all of the sub-results (test cases and subgroups) in the group. See Scoring for more details.

Test cases and groups will be used in lexicographical order on file base name. If a specific order is desired a numbered prefix such as 00, 01, 02, 03, and so on, can be used.

In each test data group, a YAML file testdata.yaml may be placed to specify how the result of the test data group should be computed. Some of the keys and their associated values are inherited from the testdata.yaml of the closest ancestor group that has one (searching from the test case up to the root data directory). Others need to be explicitly defined in the group's own testdata.yaml file; otherwise they are set to their default values. If there is no testdata.yaml file in the root data group, one is implicitly added with the default values.

The format of testdata.yaml is as follows:

Key Type Default Inheritance Comments
scoring Map See Scoring Not inherited Description of how the results of the group test cases and subgroups should be aggregated.
input_validator_flags String or map of strings to strings empty string Inherited Arguments passed to each input validator for this test data group. If a string then those are the flags that will be passed to each input validator for this test data group. If a map then each key is the name of the input validator and the value is the flags to pass to that input validator for this test data group. Validators not present in the map are run without flags.
output_validator_flags String empty string Inherited Arguments passed to the output validator for this test data group.
run_samples Boolean true Not applicable Signifies whether samples should be run, see below for details.
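
For example, a testdata.yaml placed in data/secret/ could look as follows (the flag values are hypothetical; the output validator flags here assume the default output validator):

input_validator_flags: --max_n 1000
output_validator_flags: float_tolerance 1e-6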

Skipping Execution of Samples

In some problems, the author may want to provide samples in the statement for demonstrative purposes but does not want them to be run. One example is problems that require probabilistic methods to solve. In such a problem, the sample input is small for the sake of demonstration, but the intended solution works reliably only for large test cases. The option run_samples may be used to signal to the judging system whether samples should be run.

To prevent sample test cases from being run, the author may specify

run_samples: false

in the root data group's testdata.yaml file. This option may only be used at the root and if omitted, its value defaults to true. The samples should still be validated by input and output validators. Note that the keys input_validator_flags and output_validator_flags in testdata.yaml, as described above, may be used to alter the behavior of the validators for the sample test data group.

Invalid Test Cases

The data directory may contain directories of test cases that must be rejected by validation. Their goal is to ensure integrity and quality of the test data and validation programs.

Invalid input

The files under invalid_input are invalid inputs. Unlike in sample and secret, there are no .ans files. Each tc.in under invalid_input must be rejected by at least one input validator.

Invalid output

The test cases in invalid_output describe invalid outputs for non-interactive problems. They consist of three files. The input file tc.in must pass input validation. The default answer file tc.ans must pass output validation. The output file tc.out must fail output validation.

In particular, for an existing feedback directory dir,

<output_validator_program> tc.in tc.ans dir [flags] < tc.ans # MUST PASS
<output_validator_program> tc.in tc.ans dir [flags] < tc.out # MUST FAIL

The invalid_input and invalid_output directories can be organised into a tree-like structure similar to secret and may contain testdata.yaml files whose flags are passed to the validators.

Included Files

Files that should be included with all submissions are provided in one non-empty directory per supported language. Files that should be included for all languages are placed in the non-empty directory include/default/. Files that should be included for a specific language, overriding the default, are provided in the non-empty directory include/<language>/. If no included files are provided, or if default files are provided, then the set of languages a submission may be written in is unrestricted. However, if included files are provided for specific languages but no default files are provided, then the set of allowed languages is restricted to those languages.

Based on the language of the submission, the files are copied from the corresponding language directory to the submission files before compiling, but after checking whether the submission exceeds the code limit, overwriting files from the submission in the case of a name collision. Language must be given as one of the language codes in the language table in the overview section. If any of the included files is supposed to be the main file (i.e., a driver), that file must have the language-dependent name given in the table referred to above.

Example Submissions

Correct and incorrect solutions to the problem are provided in subdirectories of submissions/. The possible subdirectories are:

Value Requirement Comment
accepted Accepted as a correct solution for all test cases. At least one is required.
partially_accepted Overall verdict must be Accepted. Overall score must be less than max_score. Must not be used for pass-fail problems.
rejected Is rejected (i.e., not accepted) for any reason in the final verdict.
wrong_answer Wrong answer for some test case. May be too slow or crash for other test cases.
time_limit_exceeded Too slow relative to the safety margin for some test case. May output wrong answers or crash for other test cases.
run_time_error Crashes for some test case. May output wrong answers or be too slow for other test cases. Very rarely useful.

Every file or directory in these directories represents a separate solution. It is mandatory to provide at least one accepted solution.

Metadata about the example submissions are provided in a YAML file named submissions.yaml placed directly in the submissions/ directory. The top level keys in submissions.yaml are globs matching example submissions. For example, accepted/* would match any submission in the submissions/accepted/ directory.

Each glob maps to a map with keys as defined below, specifying metadata for all submissions that are matched by the glob.

Key Type Default Comment
language String As determined by file endings given in the language list
entrypoint String As specified in the language list
authors Person or sequence of persons Author(s) of submission(s)
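
For example, a submissions.yaml could look as follows (the globs, names, and language are hypothetical):

accepted/*:
  authors: Authy McAuth
wrong_answer/greedy.py:
  language: python3
  authors: Another Author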

Input Validators

Input validators, for verifying the correctness of the input files, are provided in input_validators/. Input validators can be specified as VIVA files (with file ending .viva), Checktestdata files (with file ending .ctd), or as programs. Programs must either be written in C++ or Python 3, or must provide a build or run script as specified above.

All input validators provided will be run on every input file. Validation fails if any validator fails.

Invocation

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

When invoked, the input validator will get the input file on stdin.

The validator should be possible to use as follows on the command line:

<input_validator_program> [arguments] < inputfile

Output

The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.

Exit codes

The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.

Dependencies

The validator MUST NOT read any files outside those defined in the Invocation section. Its result MUST depend only on these files and the arguments.
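
As an illustration only, a minimal single-file input validator written in Python 3 could look as follows, assuming a hypothetical input format of a single line containing one integer n with 1 <= n <= 1000:

import re
import sys

def reject(message):
    # Any exit code other than 42 means the input could not be confirmed as valid.
    print(message, file=sys.stderr)
    sys.exit(43)

data = sys.stdin.read()
# Hypothetical format: exactly one line with a positive integer and no extra whitespace.
if not re.fullmatch(r'[1-9][0-9]*\n', data):
    reject('input is not a single positive integer followed by a newline')
n = int(data)
if not 1 <= n <= 1000:
    reject('n = {} is out of range'.format(n))
sys.exit(42)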

Output Validator

Overview

An output validator is a program that is given the output of a submitted program, together with the corresponding input file, and a correct answer file for the input, and then decides whether the output provided is a correct output for the given input file.

A validator program must be an application (executable or interpreted) capable of being invoked with a command line call. The details of this invocation are described below. The validator program has two ways of reporting back the results of validating:

  1. The validator must give a judgement (see Reporting a judgement).
  2. The validator may give additional feedback, e.g., an explanation of the judgement to humans (see Reporting Additional Feedback).

A custom output validator is used if the problem requires more complicated output validation than what is provided by the default diff variant described below. It must be provided as the directory output_validator/. It must either be written in C++ or Python 3, or must provide a build or run script as specified above. It must adhere to the Output validator specification described below. If no custom validator is provided, the default output validator will be used.

The output validator will be run on the output for every test data file using the arguments specified for the test data group.

Default Output Validator Specification

The default output validator is essentially a beefed-up diff. In its default mode, it tokenizes the files to compare and compares them token by token. It supports the following command-line arguments to control how tokens are compared.

Arguments Description
case_sensitive indicates that comparisons should be case-sensitive.
space_change_sensitive indicates that changes in the amount of whitespace should be rejected (the default is that any sequence of 1 or more whitespace characters are equivalent).
float_relative_tolerance ε indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details).
float_absolute_tolerance ε indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details).
float_tolerance ε short-hand for applying ε as both relative and absolute tolerance.

When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating point numbers is accepted for floating point tokens. So, for instance, if a token in the answer file says 0.0314, a token of 3.14000000e-2 in the output file would be accepted.

It is an error to provide any of the float_relative_tolerance, float_absolute_tolerance, or float_tolerance arguments more than once, or to provide a float_tolerance alongside float_relative_tolerance and/or float_absolute_tolerance.

If no floating point tolerance has been set, floating point tokens are treated just like any other token and have to match exactly.
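
The acceptance rule for a single floating-point token can be sketched as follows (a simplified illustration; it assumes the relative error is measured against the judge's value, and a token is accepted if it is within either tolerance):

def float_token_accepted(team, judge, rel_tol=None, abs_tol=None):
    # Accept the token if it is within the absolute or the relative tolerance.
    error = abs(team - judge)
    within_abs = abs_tol is not None and error <= abs_tol
    within_rel = rel_tol is not None and error <= rel_tol * abs(judge)
    return within_abs or within_rel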

Invocation

When invoked, the output validator will be passed at least three command-line parameters, and the output stream to validate is given on stdin.

The validator should be possible to use as follows on the command line:

<output_validator_program> input judge_answer feedback_dir [additional_arguments] < team_output [ > team_input ]

The meanings of the parameters listed above are:

  • input: a string specifying the name of the input data file that was used to test the program whose results are being validated.

  • judge_answer: a string specifying the name of an arbitrary "answer file" which acts as input to the validator program. The answer file may, but is not necessarily required to, contain the "correct answer" for the problem. For example, it might contain the output that was produced by a judge's solution for the problem when run with input file as input. Alternatively, the "answer file" might contain information, in arbitrary format, which instructs the validator in some way about how to accomplish its task. The meaning of the contents of the answer file is not defined by this format.

  • feedback_dir: a string which specifies the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the output file. The feedback_dir must end with a path separator (typically '/' or '\' depending on the operating system), so that simply appending a filename to feedback_dir gives the path to a file in the feedback directory.

  • additional_arguments: in case the problem specifies additional validator_flags, these are passed as additional arguments to the validator on the command line.

  • team_output: the output produced by the program being validated is given on the validator's standard input pipe.

  • team_input: when running the validator in interactive mode, everything written on the validator's standard output pipe is given to the program being validated. Please note that when running interactively, the program will only receive the output produced by the validator and will not have direct access to the input file.

The two files pointed to by input and judge_answer must exist (though they are allowed to be empty) and the validator program must be allowed to open them for reading. The directory pointed to by feedback_dir must also exist.

Reporting a judgement

A validator program is required to report its judgement by exiting with specific exit codes:

  • If the output is a correct output for the input file (i.e., the submission that produced the output is to be Accepted), the validator exits with exit code 42.
  • If the output is incorrect (i.e., the submission that produced the output is to be judged as Wrong Answer), the validator exits with exit code 43.

Any other exit code (including 0!) indicates that the validator did not operate properly, and the judging system invoking the validator must take measures to report this to contest personnel. The purpose of these somewhat exotic exit codes is to avoid conflicts with other exit codes that result when the validator crashes. For instance, if the validator is written in Java, any unhandled exception results in the program crashing with an exit code of 1, making it unsuitable to assign a judgement meaning to this exit code.

Reporting Additional Feedback

The purpose of the feedback directory is to allow the validator program to report more information to the judging system than just the accept/reject verdict. Using the feedback directory is optional for a validator program, so if one just wants to write a bare-bones minimal validator, it can be ignored.

The validator is free to create different files in the feedback directory, in order to provide different kinds of information to the judging system, in a simple but organized way. For instance, there may be a "judgemessage.txt" file, the contents of which gives a message that is presented to a judge reviewing the current submission (typically used to help the judge verify why the submission was judged as incorrect, by specifying exactly what was wrong with its output). Other examples of files that may be useful in some contexts (though not in the ICPC) are a score.txt file, giving the submission a score based on other factors than correctness, or a teammessage.txt file, giving a message to the team that submitted the solution, providing additional feedback on the submission.

A judging system that implements this format must support the judgemessage.txt file described above (i.e., the content of the judgemessage.txt file, if produced by the validator, must be provided by the judging system to a human judge examining the submission). Supporting other files is optional.

Note that a validator may choose to ignore the feedback directory entirely. In particular, the judging system must not assume that the validator program creates any files there at all.
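
As an illustration only (and not a description of the default output validator), a bare-bones custom output validator in Python 3 that compares whitespace-separated tokens exactly could look like this:

import sys

def main():
    # input, judge_answer and feedback_dir as described in the Invocation section.
    input_path, answer_path, feedback_dir = sys.argv[1], sys.argv[2], sys.argv[3]

    with open(answer_path) as f:
        judge_tokens = f.read().split()
    team_tokens = sys.stdin.read().split()

    if team_tokens == judge_tokens:
        sys.exit(42)  # accepted

    # Optional feedback for a human judge; feedback_dir already ends with a path separator.
    with open(feedback_dir + 'judgemessage.txt', 'w') as f:
        f.write('Expected {} tokens, got {}.\n'.format(len(judge_tokens), len(team_tokens)))
    sys.exit(43)  # wrong answer

if __name__ == '__main__':
    main()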

Multi-pass validation

A multi-pass validator can be used for problems that should run the submission multiple times sequentially, using a new input generated by the output validator during the previous invocation of the submission.

The time and memory limit apply for each invocation separately.

To signal that the submission should be run again, the output validator must exit with code 42 and output the new input in the file nextpass.in in the feedback directory. Judging stops if no nextpass.in was created or if the output validator exited with any other code. Note that nextpass.in will be removed before the next pass.

It is a judge error to create the nextpass.in file and exit with any code other than 42.

All other files inside the feedback directory are guaranteed to persist between passes. In particular, the validator should only append text to judgemessage.txt so as to provide combined feedback for all passes.

Samples for multi-pass problems must be provided in a .interaction file, like for interactive problems. Passes are separated by a line containing --- (three dashes). When the problem is not interactive, simply start each pass by a number of lines starting with <, containing the sample input, followed by some lines starting with >, containing the sample answer.
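
As a hypothetical example for a non-interactive two-pass problem, a sample .interaction file could look as follows, with the passes separated by a line containing three dashes:

<3
<1 2 3
>6
---
<6
>720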

Examples

An example of a judgemessage.txt file:

Team failed at test case 14.
Team output: "31", Judge answer: "30".
Team failed at test case 18.
Team output: "hovercraft", Judge answer: "7".
Summary: 2 test cases failed.

An example of a teammessage.txt file:

Almost all test cases failed — are you even trying to solve the problem?

Validator standard output and standard error

A validator program is allowed to write any kind of debug information to its standard error pipe. This information may be displayed to the user upon invocation of the validator.

Scoring

For pass-fail problems, the verdict of a submission is the first non-accepted verdict, where test cases are run in lexicographical order of their full file paths (note that sample comes before secret in this order).

In scoring problems, submissions are given a non-negative score. The goal of the submission is to maximize the score.

Given a submission, scores are determined for test cases, test groups, and the submission itself.

The score of a failed test case is always 0. By default, the score of an accepted test case is 1. If a custom output validator produces a score or if a different score is set in testdata.yaml, then the product of those values is used instead.

The score of a test group is determined by its subgroups and test cases. If it has no subgroups or test cases, then its score is 0. Otherwise, the score depends on the aggregation mode, which is either min or sum. The score of the group is then the minimum or the sum of all sub-scores, respectively. If using min aggregation, then the verdict must not be accepted if any sub-verdict is rejected. If using sum aggregation, then the verdict must be accepted if any sub-verdict is accepted.

The score of the submission is its score on the topmost test group data.

Ultimately, the score cannot be non-zero if the topmost verdict is not accepted. Generally, an accepted verdict will be accompanied by a strictly positive score. However, note that there can be a distinction between a non-accepted verdict and an accepted verdict with score 0. An accepted verdict with a score of 0 could mean that the output is well formed but not good enough to score points.

For example, consider an optimization problem such as the Travelling Salesperson Problem. The goal is to provide the shortest possible route, and score is determined by how short of a route the submission outputs. If a submission instead outputs the longest possible route, it may get a verdict of accepted, because the output is well formed. However, the route is so long that by most metrics, the submission's score would be 0. In most cases, an output validator should probably reject a test case rather than accepting and returning a score of 0, but this is a detail left to the author of the problem.

The scoring behavior is configured by the following flags under scoring in testdata.yaml:

Key Type Description
score String The score assigned to an accepted input file in the group. If a scoring output validator is used, this score is multiplied by the score from the validator.
max_score String A number specifying the maximum score allowed for this test group. It is an error to exceed this.
aggregation sum or min If sum, the score is the sum of the subresult scores. If min, the score is the minimum of the subresult scores.

The defaults are as follows:

  • The root data group and the secret group: score is 1, aggregation is sum.
  • The sample group: score is 0, aggregation is sum.
  • Other groups: score is 1, aggregation is min.
  • max_score is the sum of the max_score of the subresults if aggregation is sum, and the minimum of the max_score of the subresults if aggregation is min. For individual test cases, max_score here means the score value of the group they are in.

They are chosen to minimize the configuration needed for a typical IOI-style problem, where points are given for passing all test cases in test data groups called subtasks. To achieve that behaviour, create groups secret/subtask1, secret/subtask2, … and set the score value in the secret/subtaskX/testdata.yaml.
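
For example, a hypothetical subtask worth 30 points could use the following data/secret/subtask1/testdata.yaml:

scoring:
  score: 30

Since aggregation defaults to min for groups below secret, the subtask scores 30 points only if every test case in it is accepted, and the subtask scores are then summed at the secret level.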