Persistence

ABQ queues persist information about completed test runs for the purposes of retries and result aggregation. Data is persisted locally on the machine where abq runs, and can optionally be persisted to remote storage locations.
Local persistence

ABQ queues (invoked via abq start) persist test results and test suite manifests as files on the local filesystem in order to facilitate retries and result aggregation. Where these files are stored can be configured via the following environment variables:

- ABQ_PERSISTED_MANIFESTS_DIR: the directory where persisted manifest files should be stored. The ABQ process must have read and write access to all files in the directory. If not specified, a temporary directory is used.
- ABQ_PERSISTED_RESULTS_DIR: the directory where persisted results files should be stored. The ABQ process must have read and write access to all files in the directory. If not specified, a temporary directory is used.
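As a sketch, a deployment might point both variables at dedicated directories before starting the queue, so persisted files survive queue restarts. The directory layout below is hypothetical; choose locations with adequate disk space:

```shell
# Hypothetical layout: keep persisted files under one dedicated data
# directory instead of letting ABQ fall back to temporary directories.
ABQ_DATA_DIR="${ABQ_DATA_DIR:-$HOME/.abq-data}"
mkdir -p "$ABQ_DATA_DIR/manifests" "$ABQ_DATA_DIR/results"
export ABQ_PERSISTED_MANIFESTS_DIR="$ABQ_DATA_DIR/manifests"
export ABQ_PERSISTED_RESULTS_DIR="$ABQ_DATA_DIR/results"
# abq start   # the queue now reads and writes under $ABQ_DATA_DIR
```
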
Manifest and results files are written for all test suites executed on an ABQ queue instance, and as such, will steadily increase local disk space usage. Remote persistence can alleviate disk usage by offloading files over time.
When self-hosting ABQ, be sure to measure your needs and appropriately configure disk resource limits. In practice, we've found these files are typically not very large, and an ABQ queue can typically handle hundreds of test runs with a 1GB disk. However, each deployment of ABQ may vary.
Remote persistence

ABQ queues support configuring remote storage locations for the purposes of non-local persistence, offloading files to reduce disk usage, and sharing test runs between ABQ queue instances. Let's walk through how to configure ABQ remote persistence for each of these use cases.

Remote persistence can target AWS S3, or a custom remote persistence implementation that communicates via an IPC interface.

By default, ABQ only persists files locally. To enable remote persistence, set the environment variable ABQ_REMOTE_PERSISTENCE_STRATEGY to s3 or custom.
Persistence to AWS S3

Setting ABQ_REMOTE_PERSISTENCE_STRATEGY=s3 will persist manifests, results, and the state of test suite runs to an AWS S3 bucket. Persistence to S3 requires the following additional environment variables to be set:

- ABQ_REMOTE_PERSISTENCE_S3_BUCKET: the S3 bucket files should be written to.
- ABQ_REMOTE_PERSISTENCE_S3_KEY_PREFIX: the prefix to use for keys written to the configured S3 bucket.

AWS credentials and region information are read from the environment, using the standard AWS environment variable support.
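Putting the pieces together, an S3-backed configuration might look like the following. The bucket name, key prefix, and region are placeholder values for illustration:

```shell
# Enable S3-backed remote persistence. Bucket, prefix, and region here
# are hypothetical; substitute your own values.
export ABQ_REMOTE_PERSISTENCE_STRATEGY=s3
export ABQ_REMOTE_PERSISTENCE_S3_BUCKET=my-abq-artifacts
export ABQ_REMOTE_PERSISTENCE_S3_KEY_PREFIX=abq/production
# Credentials are read from the standard AWS sources: AWS_ACCESS_KEY_ID /
# AWS_SECRET_ACCESS_KEY, AWS_REGION, an instance profile, etc.
export AWS_REGION=us-east-1
# abq start
```
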
Custom persistence

Setting ABQ_REMOTE_PERSISTENCE_STRATEGY=custom will persist manifests, results, and the state of test suite runs by calling a custom, user-provided executable.

This strategy requires the ABQ_REMOTE_PERSISTENCE_COMMAND environment variable to be set. The variable should be a comma-delimited string of the executable and the head arguments that should be called to perform an operation on the remote persistence target. The executable will be called in the following form:

<executable> <...arguments> <mode> <file-type> <run-id> <local-path>

Where:

- <mode> is either "store" or "load", depending on whether the file should be stored into the remote location or loaded from the remote location.
- <file-type> is "manifest", "results", or "run_state".
- <run-id> is the run ID of the test suite run.
- <local-path> is the path to the file on the local filesystem.

If the mode is "store", the content to upload should be read from the given path. If the mode is "load", the downloaded content should be written to the given path.

If the mode is "load" and the content does not exist remotely, the executable should exit with a non-zero exit code, optionally writing a message to stderr describing the failure.

If the command exits with a non-zero exit code, ABQ will consider the remote persistence operation (either loading or storing) to have failed.

The provided executable must be discoverable in the PATH that abq start is executed with.
As an example, if you have a Node.js script abq-remote-persister.js that performs remote persistence operations, the configured environment variable would be

ABQ_REMOTE_PERSISTENCE_COMMAND="node,abq-remote-persister.js"

supposing that node is in the PATH.
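To make the protocol concrete, here is a minimal sketch of a custom persister written as a shell script. It backs "remote" storage with a local directory rather than real object storage, and every name in it (the script filename, STORE_DIR) is hypothetical; a real implementation would upload to and download from a durable store instead:

```shell
# Write an illustrative persister script implementing the store/load
# protocol described above. This is a sketch, not a production persister.
cat > abq-remote-persister.sh <<'EOF'
#!/bin/sh
set -e
# "Remote" storage is simulated with a local directory for illustration.
STORE_DIR="${STORE_DIR:-$HOME/.abq-remote-store}"
mode="$1"; file_type="$2"; run_id="$3"; local_path="$4"
key="$STORE_DIR/$file_type-$run_id"
mkdir -p "$STORE_DIR"
case "$mode" in
  store)
    # store: read the content to upload from <local-path>.
    cp "$local_path" "$key" ;;
  load)
    # load: exit non-zero (with a stderr message) if the content does not
    # exist remotely; otherwise write it to <local-path>.
    [ -f "$key" ] || { echo "no remote $file_type for run $run_id" >&2; exit 1; }
    cp "$key" "$local_path" ;;
  *)
    echo "unknown mode: $mode" >&2; exit 1 ;;
esac
EOF
chmod +x abq-remote-persister.sh
```

With this script discoverable in PATH, the corresponding configuration would be ABQ_REMOTE_PERSISTENCE_COMMAND="sh,abq-remote-persister.sh".
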
Syncing of manifests and results files
When remote persistence is configured:
- Manifests are synced to the remote persistence target after all entries in a test run's manifest have been assigned to a worker. The manifest is not modified again.
- Results are synced to the remote persistence target when there are no writes of results to the local persistence target in-flight. That is, writes to the remote target are batched and executed when the local state is stable.
Offloading
To reduce local disk usage, ABQ supports offloading manifest and results files from the local file system to remote persistence. Offloaded manifest and results files are downloaded and cached to the local file system on-demand, as retries or result aggregations request them.
Offloading can be configured via the following environment variables:

- ABQ_OFFLOAD_MANIFESTS_CRON: if configured, the cron schedule on which manifest files stored locally should be offloaded to the remote storage location. A manifest file is only eligible for offload if it is stale; see ABQ_OFFLOAD_STALE_FILE_THRESHOLD_HOURS below. All times in the cron expression are interpreted as UTC.
- ABQ_OFFLOAD_RESULTS_CRON: if configured, the cron schedule on which results files stored locally should be offloaded to the remote storage location. A results file is only eligible for offload if it is stale; see ABQ_OFFLOAD_STALE_FILE_THRESHOLD_HOURS below. All times in the cron expression are interpreted as UTC.
- ABQ_OFFLOAD_STALE_FILE_THRESHOLD_HOURS: the threshold, in hours, since the last time a file was accessed before it is eligible for offloading to the configured remote persistence location, if any is configured. The default is 6 hours.

If remote persistence is not configured, offloading is disabled. If either *_CRON environment variable is unset, its respective files will not be offloaded.
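For instance, a deployment might offload files that have been idle for half a day on a nightly schedule. The expressions below assume a standard five-field cron syntax and are illustrative only; use whatever cron format your abq version accepts:

```shell
# Illustrative offload schedule (cron times are interpreted as UTC).
# Five-field cron syntax is an assumption here, not a guarantee.
export ABQ_OFFLOAD_MANIFESTS_CRON="0 3 * * *"     # nightly at 03:00 UTC
export ABQ_OFFLOAD_RESULTS_CRON="30 3 * * *"      # nightly at 03:30 UTC
export ABQ_OFFLOAD_STALE_FILE_THRESHOLD_HOURS=12  # offload after 12h idle
# abq start
```
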
Sharing test runs between instances
One of the most powerful features of remote persistence is the ability to retry a test suite execution on an ABQ instance different from the one the test suite previously executed on.

If the same remote persistence strategy is configured on multiple ABQ queue instances, each instance can load test runs that initially executed on another instance.
At this time, sharing test runs between instances observes the following restrictions:

- run_state files are schema-versioned, and no schema-version compatibility guarantees across versions of ABQ queues are provided at this time. run_state files are guaranteed to be compatible if shared between ABQ queues of the same version. If an ABQ queue loads a run_state file it is incompatible with, the remote test run state will not be loaded. Executing a test suite whose run state file failed to load will fall back on executing the test suite as a fresh run, similar to the pre-1.4.0 behavior.
- The same run ID may not be executed in parallel on two different ABQ queue instances sharing the same remote persistence. For a given run ID, an ABQ queue will assume exclusive ownership of the test suite run associated with that run ID.
At this time, abq does not verify whether it indeed has exclusive ownership of a run ID. If you are self-hosting ABQ, you must ensure that run IDs are routed to a unique ABQ instance for the duration of a test run. However, once a test run is complete, retries of the test run may be routed to another ABQ instance, so long as the exclusive-ownership constraint continues to hold for the duration of the retry.
If you would like to avoid self-hosting, RWX's managed hosting of ABQ supports routing test runs under these constraints.