Authoring ARC Validation Packages
In general, any script or tool can be used to validate aspects of an ARC. However, to be compatible with the Continuous Quality Control (CQC) pipelines on the DataHUB, validation packages must adhere to certain guidelines and conventions, both in terms of format and generated output.
In CQC pipelines, validation packages are pulled from the ARC Validation Package Registry (AVPR) and executed in isolated environments via the arc-validate CLI tool. Validation packages must adhere to the structure (e.g. programming language, metadata) and output requirements outlined below to be both publishable to AVPR and executable by arc-validate.
See also the relevant sections in the ARC specification
Validation package structure
Supported validation package file formats
Validation packages must be self-contained, single-file scripts.
The following programming languages can currently be used to create validation packages:
- F# (`.fsx`)
  - Software package management in F# scripts MUST use `#r nuget ...` directives to reference any external dependencies.
  - F# scripts are executed via `dotnet fsi`.
- Python (`.py`)
  - Software package management in Python scripts MUST use uv inline script dependencies to reference any external dependencies. Ideally also specify the Python language version.
  - Python scripts are executed via `uv run`.
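As a hedged illustration of the Python requirements above, a script using uv's inline script metadata could look like the following. The Python version bound and the `pyyaml` dependency are assumptions chosen for the example, and `placeholder_validation` is a hypothetical stand-in for real validation logic.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "pyyaml",
# ]
# ///
# The comment block above is uv's inline script metadata. `uv run` reads it
# and provides a matching interpreter plus the listed dependencies.
# Both the version bound and the dependency are illustrative assumptions.
# In a real validation package, the package metadata (see "Package metadata")
# has to come before this block, since it must be the first thing in the file.


def placeholder_validation() -> bool:
    """Hypothetical stand-in for the validation logic of a package."""
    return True


if __name__ == "__main__":
    print("validation passed:", placeholder_validation())
```

Such a script would be started with `uv run my_package.py`, where the file name is hypothetical.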
Package metadata
Package metadata are used to display information about the validation package in the AVPR, keep track of versioning, release notes, etc.
The metadata MUST be the first thing occurring in the validation package file and can be either formatted as a standalone string or bound to a variable. It is formatted as YAML frontmatter, enclosed by triple-dashed lines (---).
String frontmatter
In F# validation packages, the frontmatter MUST be enclosed in a multi-line comment (`(* ... *)`).

    (*
    ---
    <yaml frontmatter here>
    ---
    *)

In Python validation packages, the frontmatter MUST be enclosed in a multi-line string (`""" ... """`).

    """
    ---
    <yaml frontmatter here>
    ---
    """

Frontmatter bindings
You can bind YAML frontmatter as a string inside your package. This is recommended because it lets you re-use the metadata in your package code.

In F#, the binding must be placed at the start of the file to the name `PACKAGE_METADATA` with a `[<Literal>]` attribute, exactly like this:

    let [<Literal>] PACKAGE_METADATA = """(*
    ---
    <yaml frontmatter here>
    ---
    *)"""

In Python, the binding must be placed at the start of the file to the name `PACKAGE_METADATA`, exactly like this:

    PACKAGE_METADATA = """
    ---
    <yaml frontmatter here>
    ---
    """

Re-use in the package code (F#):

    // The F# library for writing ARC validation packages, adjust version!
    #r "nuget: ARCExpect, 5.0.1"

    Setup.ValidationPackage(
        metadata = Setup.Metadata(PACKAGE_METADATA),
        CriticalValidationCases = [...]
    )
    |> Execute.ValidationPipeline(
        basePath = ...
    )

A library for Python is a work in progress. For now, you can use string processing and YAML parsing libraries to extract metadata from the `PACKAGE_METADATA` variable.
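The extraction mentioned above can be sketched with the standard library alone. This is a minimal sketch, not an official helper: `extract_frontmatter` and `parse_flat_fields` are hypothetical names, and the naive parser only handles top-level `Key: value` scalars. For nested fields such as authors or tags, a YAML library (for example PyYAML's `yaml.safe_load`) applied to the extracted frontmatter is the more robust choice.

```python
PACKAGE_METADATA = """
---
Name: my-package
MajorVersion: 1
MinorVersion: 0
PatchVersion: 0
---
"""


def extract_frontmatter(text: str) -> str:
    """Return the YAML between the first two lines that consist of '---'."""
    lines = text.splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(fences) < 2:
        raise ValueError("no YAML frontmatter enclosed by '---' lines found")
    return "\n".join(lines[fences[0] + 1 : fences[1]])


def parse_flat_fields(frontmatter: str) -> dict[str, str]:
    """Naive parser for top-level 'Key: value' scalars (values stay strings).

    Indented lines, list items, and block scalars ('|') are skipped.
    """
    fields = {}
    for line in frontmatter.splitlines():
        if not line or line[0].isspace() or line.startswith("-"):
            continue  # empty, indented, or list line: not a top-level scalar
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value and value != "|":
            fields[key.strip()] = value
    return fields


fields = parse_flat_fields(extract_frontmatter(PACKAGE_METADATA))
version = ".".join(
    fields[key] for key in ("MajorVersion", "MinorVersion", "PatchVersion")
)
print(f"{fields['Name']}@{version}")  # prints my-package@1.0.0
```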
Mandatory fields
| Field | Type | Description |
|---|---|---|
| Name | string | the name of the package |
| MajorVersion | int | the major version of the package |
| MinorVersion | int | the minor version of the package |
| PatchVersion | int | the patch version of the package |
| Summary | string | a single sentence description (less than 50 words) of the package |
| Description | string | an unconstrained free text description of the package |
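A pre-submission check of the mandatory fields in the table above might look like the following sketch. It assumes the frontmatter has already been parsed into a dictionary by a YAML parser; `check_mandatory_fields` is a hypothetical helper and not the check that AVPR itself performs.

```python
# Field names and types as listed in the table of mandatory fields.
MANDATORY_FIELDS = {
    "Name": str,
    "MajorVersion": int,
    "MinorVersion": int,
    "PatchVersion": int,
    "Summary": str,
    "Description": str,
}


def check_mandatory_fields(metadata: dict) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for field, expected_type in MANDATORY_FIELDS.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
            continue
        value = metadata[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")
    summary = metadata.get("Summary")
    if isinstance(summary, str) and len(summary.split()) >= 50:
        problems.append("Summary should have fewer than 50 words")
    return problems
```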
Example:
F#:

    let [<Literal>] PACKAGE_METADATA = """(*
    ---
    Name: my-package
    MajorVersion: 1
    MinorVersion: 0
    PatchVersion: 0
    Summary: My package does the thing.
    Description: |
      My package does the thing.
      It does it very good, it does it very well.
      It does it very fast, it does it very swell.
    ---
    *)"""

Python:

    PACKAGE_METADATA = """
    ---
    Name: my-package
    MajorVersion: 1
    MinorVersion: 0
    PatchVersion: 0
    Summary: My package does the thing.
    Description: |
      My package does the thing.
      It does it very good, it does it very well.
      It does it very fast, it does it very swell.
    ---
    """

Optional fields
| Field | Type | Description |
|---|---|---|
| Publish | bool | a boolean value indicating whether the package should be published to the registry. If set to true, the package will be built and pushed to the registry. If set to false (or not present), the package will be ignored. |
| Authors | author[] | the authors of the package. For more information about mandatory and optional fields in this object, see Objects > Author |
| Tags | string[] | a list of tags with optional ontology annotations that describe the package. For more information about mandatory and optional fields in this object, see Objects > Tag |
| ReleaseNotes | string[] | a list of release notes for the package indicating changes from previous versions |
| CQCHookEndpoint | string | an optional URL to a CQC Hook endpoint that can be used for continuous quality control (CQC) integration. If provided, this endpoint will be called with validation results after each package execution. |
Example:
F#:

    let [<Literal>] PACKAGE_METADATA = """(*
    ---
    Name: my-package
    MajorVersion: 1
    MinorVersion: 0
    PatchVersion: 0
    Summary: My package does the thing.
    Description: |
      My package does the thing.
      It does it very good, it does it very well.
      It does it very fast, it does it very swell.
    Publish: true
    Authors:
      - FullName: John Doe
        Email: j@d.com
        Affiliation: University of Nowhere
        AffiliationLink: https://nowhere.edu
      - FullName: Jane Doe
        Email: jj@d.com
        Affiliation: University of Somewhere
        AffiliationLink: https://somewhere.edu
    Tags:
      - Name: validation
      - Name: my-tag
        TermSourceREF: my-ontology
        TermAccessionNumber: MO:12345
    ReleaseNotes: |
      - initial release
      - does the thing
      - does it well
    CQCHookEndpoint: https://some-url.xd
    ---
    *)"""

Python:

    PACKAGE_METADATA = """
    ---
    Name: my-package
    MajorVersion: 1
    MinorVersion: 0
    PatchVersion: 0
    Summary: My package does the thing.
    Description: |
      My package does the thing.
      It does it very good, it does it very well.
      It does it very fast, it does it very swell.
    Publish: true
    Authors:
      - FullName: John Doe
        Email: j@d.com
        Affiliation: University of Nowhere
        AffiliationLink: https://nowhere.edu
      - FullName: Jane Doe
        Email: jj@d.com
        Affiliation: University of Somewhere
        AffiliationLink: https://somewhere.edu
    Tags:
      - Name: validation
      - Name: my-tag
        TermSourceREF: my-ontology
        TermAccessionNumber: MO:12345
    ReleaseNotes: |
      - initial release
      - does the thing
      - does it well
    CQCHookEndpoint: https://some-url.xd
    ---
    """

Objects
Author
Author metadata about the people that create and maintain the package.
| Field | Type | Description | Mandatory |
|---|---|---|---|
| FullName | string | the full name of the author | yes |
| Email | string | the email address of the author | no |
| Affiliation | string | the affiliation (e.g. institution) of the author | no |
| AffiliationLink | string | a link to the affiliation of the author | no |
Tag
Tags can be any string with an optional ontology annotation from a controlled vocabulary:
| Field | Type | Description | Mandatory |
|---|---|---|---|
| Name | string | the name of the tag | yes |
| TermSourceREF | string | Reference to a controlled vocabulary source | no |
| TermAccessionNumber | string | Accession in the referenced controlled vocabulary source | no |
Versioning packages
Packages SHOULD be versioned according to the semantic versioning standard. This means that the version number of a package should be incremented according to the following rules:
- Major version: incremented when you make changes incompatible with previous versions
- Minor version: incremented when you add functionality in a backwards-compatible manner
- Patch version: incremented when you make backwards-compatible bug fixes
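The increment rules above can be sketched as a small helper that works on the three version fields of the package metadata. `bump_version` is a hypothetical name. Resetting the lower-order parts to zero follows the semantic versioning standard.

```python
def bump_version(major: int, minor: int, patch: int, change: str) -> tuple[int, int, int]:
    """Return the next (major, minor, patch) for a given kind of change.

    change: "major" for incompatible changes,
            "minor" for backwards-compatible new functionality,
            "patch" for backwards-compatible bug fixes.
    """
    if change == "major":
        return (major + 1, 0, 0)
    if change == "minor":
        return (major, minor + 1, 0)
    if change == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown kind of change: {change!r}")


print(bump_version(3, 1, 0, "patch"))  # prints (3, 1, 1)
```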
Validation Output
Validation packages MUST create a folder structure in the folder they are executed in that looks exactly like this:
- <execution_directory>
  - .arc-validate-results
    - <package-name>@<package_version>
      - validation_report.xml
      - validation_summary.json
      - badge.svg
For example, for the package invenio with version 3.1.0, the following folder structure MUST be created:
- <execution_directory>
  - .arc-validate-results
    - invenio@3.1.0
      - validation_report.xml
      - validation_summary.json
      - badge.svg
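Creating the package output folder from Python can be sketched as follows. `package_output_dir` is a hypothetical helper, and the temporary directory only stands in for the execution directory so that the example runs anywhere.

```python
import tempfile
from pathlib import Path


def package_output_dir(execution_dir: Path, name: str, version: str) -> Path:
    """Create and return <execution_dir>/.arc-validate-results/<name>@<version>."""
    out = Path(execution_dir) / ".arc-validate-results" / f"{name}@{version}"
    out.mkdir(parents=True, exist_ok=True)
    return out


# Demonstration in a throwaway directory instead of a real ARC.
with tempfile.TemporaryDirectory() as tmp:
    out = package_output_dir(Path(tmp), "invenio", "3.1.0")
    created = out.is_dir()
    relative = out.relative_to(tmp).as_posix()

print(relative)  # prints .arc-validate-results/invenio@3.1.0
```

In a real package, the execution directory would typically be `Path.cwd()`, and the three output files would be written into the returned folder.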
JUnit report
A file in JUnit XML format named `validation_report.xml` MUST be created inside the package output folder. This file contains detailed information about all validation cases executed by the package, including their status (passed, failed, errored), execution time, and any error messages or stack traces.
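A report of this kind could be produced with Python's standard library, as in the sketch below. It uses the commonly seen `testsuites` / `testsuite` / `testcase` layout with `failure` and `error` child elements. Which elements and attributes the CQC pipeline actually relies on is not stated here, so treat the layout, the `build_junit_report` helper, and the sample case names as assumptions.

```python
import xml.etree.ElementTree as ET


def build_junit_report(suite_name: str, cases: list[tuple[str, str, str, float]]) -> str:
    """Build a JUnit-style XML string.

    cases: (name, status, message, seconds) tuples, where status is one of
    "passed", "failed", or "errored".
    """
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(cases)),
        failures=str(sum(1 for case in cases if case[1] == "failed")),
        errors=str(sum(1 for case in cases if case[1] == "errored")),
    )
    for name, status, message, seconds in cases:
        case = ET.SubElement(suite, "testcase", name=name, time=f"{seconds:.3f}")
        if status == "failed":
            ET.SubElement(case, "failure", message=message)
        elif status == "errored":
            ET.SubElement(case, "error", message=message)
    root = ET.Element("testsuites")
    root.append(suite)
    return ET.tostring(root, encoding="unicode")


report = build_junit_report(
    "my-package",
    [
        ("has investigation title", "passed", "", 0.012),
        ("has at least one study", "failed", "no study found", 0.004),
    ],
)
```

The resulting string would then be written to `validation_report.xml` inside the package output folder.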
Badges
An SVG badge named `badge.svg` MUST be created inside the package output folder. This badge summarizes the overall validation status of the package execution. The badge SHOULD indicate whether the validation passed or failed, and MAY include additional information such as the number of tests executed or the percentage of tests passed. This badge will be displayed on the ARC homepage.
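A very plain badge can be rendered as an SVG string without any dependencies. The following is a minimal sketch: the `render_badge` helper, the colors, the text layout, and the width estimate are all illustrative assumptions rather than a prescribed badge design.

```python
from xml.sax.saxutils import escape


def render_badge(package: str, passed: int, total: int) -> str:
    """Return a one-box SVG badge: green if all tests passed, red otherwise."""
    color = "#4c1" if passed == total else "#e05d44"
    label = escape(f"{package}: {passed}/{total} passed")
    width = 8 * len(label) + 20  # rough width estimate, not real text metrics
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="20" role="img">'
        f"<title>{label}</title>"
        f'<rect width="{width}" height="20" rx="3" fill="{color}"/>'
        f'<text x="{width // 2}" y="14" fill="#fff" text-anchor="middle" '
        f'font-family="sans-serif" font-size="11">{label}</text>'
        "</svg>"
    )


badge = render_badge("invenio", 5, 6)
```

The returned string would be written to `badge.svg` in the package output folder.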
Summary
A JSON file named `validation_summary.json` MUST be created inside the package output folder. This file contains a summary of the validation results, including the total number of tests executed, the number of tests passed, failed, and errored, as well as other information necessary to trigger downstream processes in CQC pipelines.
The current schema is available in the ARC specification.
Submission to AVPR
To submit a validation package to the ARC Validation Package Registry (AVPR), go to the AVPR repository on GitHub and file a PR as stated in the repository's README.