Using the CLI

The THOA CLI gives you full control over workloads, compute resources, and environments, which makes it well suited to automation and reproducibility.
With the CLI, you can submit jobs, manage datasets, explore available tools, and monitor job history, all from your terminal.

Tip:
You can always add --help to any command (like thoa run --help) to see a full list of options and detailed usage examples right in your terminal.

Here’s a quick overview of available commands:

  • thoa run – Submit and run jobs in the cloud using custom code and compute specs
  • thoa dataset – Manage, list, and query your datasets
  • thoa tools – Search for available tools and environments
  • thoa jobs – Monitor your past jobs and query job metadata

How to Use the API Key

Before running any CLI commands, you need to authenticate using your THOA API key. This key allows the CLI to connect securely to your workspace and submit jobs.

1. Create an API Key

You can create an API key here:
Create an API Key

Important: Anyone with access to this key can run jobs from your account and view, modify, or delete all of your datasets and results. Store it securely and never commit it to GitHub or share it.


2. Save the API Key

To use your API key with the CLI, it needs to be set as an environment variable called THOA_API_KEY.

To save it permanently:

macOS / Linux (bash / zsh)

You have two options:

Option 1: Manually edit your shell config file

Open your shell config file:

nano ~/.bashrc

(or ~/.zshrc if you are using zsh)

Add this line at the bottom:

export THOA_API_KEY="your-api-key-here"

Then reload your config:

source ~/.bashrc

Option 2: Append it automatically from the command line (use ~/.zshrc instead of ~/.bashrc if you are using zsh)

echo 'export THOA_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

Windows (PowerShell)

Use setx to store the variable permanently:

setx THOA_API_KEY "your-api-key-here"

Important: You must restart PowerShell (or open a new terminal window) after using setx before the variable becomes available.


To set it just for the current session:

If you only want to set the API key temporarily (until you close the terminal), use the following:

macOS / Linux

export THOA_API_KEY="your-api-key-here"

Windows (PowerShell)

$env:THOA_API_KEY = "your-api-key-here"

These will only apply to the current terminal session. Once you close the terminal, you’ll need to run the command again.
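
When the CLI is called from a script, it can help to fail early if the key is missing. The following is a minimal sketch, not part of the THOA CLI: the function name require_thoa_key is hypothetical, and it only checks that the variable is non-empty, not that the key is valid.

```shell
# Hypothetical helper (not part of the THOA CLI): fail fast when the key is missing.
require_thoa_key() {
  if [ -z "${THOA_API_KEY:-}" ]; then
    echo "THOA_API_KEY is not set; export it before running thoa commands." >&2
    return 1
  fi
  # Confirm presence without printing the secret itself.
  echo "THOA_API_KEY is set (${#THOA_API_KEY} characters)."
}
```

Calling require_thoa_key at the top of an automation script stops the run before any job is submitted with missing credentials.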


thoa run

Description

The thoa run command is used to submit jobs to Thoa’s compute infrastructure.
Jobs let you run arbitrary code in isolated, reproducible environments of nearly any size, all without managing infrastructure.


Examples

Running a Python script

thoa run --cmd "python script.py" \
  --input ./inputdata \
  --output ./ \
  --tools python \
  --n-cores 16 \
  --ram 64 \
  --storage 10

Running an R script

thoa run --cmd "Rscript script.R" \
  --input ./inputdata \
  --output ./ \
  --tools r-base \
  --n-cores 16 \
  --ram 64 \
  --storage 10

Running a Bash script

thoa run --cmd "bash script.sh" \
  --input ./inputdata \
  --output ./ \
  --tools bash \
  --n-cores 16 \
  --ram 64 \
  --storage 10

Running raw Bash

thoa run --cmd 'echo "Hello from THOA!" && touch file.txt' \
  --input ./inputdata \
  --output ./outputdata \
  --tools bash \
  --n-cores 16 \
  --ram 64 \
  --storage 10

🧪 Running a Nextflow workflow (Coming soon!)

thoa run --cmd 'nextflow run pipeline.nf' \
  --input ./inputdata \
  --output ./outputdata \
  --tools nextflow \
  --n-cores 16 \
  --ram 64 \
  --storage 50

🧪 Running a Snakemake workflow (Coming soon!)

Check back later for more details on running Snakemake pipelines in Thoa.


Arguments

Flag                Description
--input             Path(s) to input files or folders. Defaults to the current directory.
--input-dataset     Use a pre-uploaded dataset by ID instead of re-uploading files.
--output (req)      Folder path inside the job container where output files will be found.
--cmd               The shell command to run inside the compute environment.
--tools             Comma-separated tools to load (e.g. fastqc, r-base, python).
--env-source        Define the environment using a Conda YAML, Docker image, or saved env.
--download-dir      Local directory where output files will be downloaded after the job finishes.
--n-cores           Number of CPU cores to allocate.
--ram               GB of RAM to allocate.
--storage           GB of free space available for outputs after inputs are uploaded.
--run-async         If true, streams logs and outputs in real time. Defaults to false.
--job-name          (Optional) Give your job a custom name.
--job-description   (Optional) Text description to help you track the job.
--dry-run           Runs validation only, without submitting the job. Useful for testing.
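
One way to use --dry-run is to validate a job before actually submitting it. The wrapper below is a sketch under two assumptions that the text above does not confirm: that --dry-run is a bare switch with no value, and that thoa exits with a non-zero status when validation fails. The name run_checked is hypothetical.

```shell
# Hypothetical wrapper: validate a job with --dry-run, then submit it for real.
# Assumes --dry-run is a bare switch and that thoa exits non-zero on failed validation.
run_checked() {
  if thoa run "$@" --dry-run; then
    # Validation passed, so submit the same job without --dry-run.
    thoa run "$@"
  else
    echo "Validation failed; job not submitted." >&2
    return 1
  fi
}

# Usage (all flags are taken from the table above):
# run_checked --cmd "python script.py" --input ./inputdata --output ./ \
#   --tools python --n-cores 4 --ram 16 --storage 10
```

Check thoa run --help to confirm how --dry-run behaves before relying on this pattern.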

thoa dataset

Description

The thoa dataset command group lets you manage and interact with datasets stored in Thoa.
Datasets are collections of uploaded input files used when launching jobs.
Each dataset is uniquely versioned and stored in the cloud, so you can reuse them across multiple jobs without re-uploading the same data.
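
To illustrate reuse, the sketch below runs several scripts against one pre-uploaded dataset through the --input-dataset flag of thoa run. It assumes that --input-dataset accepts the dataset UUID and that the scripts are already part of that dataset; the function name run_on_dataset and the job naming scheme are hypothetical.

```shell
# Sketch: run several scripts against one pre-uploaded dataset.
# Assumes --input-dataset accepts the dataset UUID and the scripts live in that dataset.
run_on_dataset() {
  dataset_id="$1"
  shift
  for script in "$@"; do
    # Name each job after its script, e.g. qc.py -> run-qc.
    thoa run --cmd "python $script" \
      --input-dataset "$dataset_id" \
      --output ./ \
      --tools python \
      --job-name "run-${script%.py}" || return 1
  done
}

# Usage:
# run_on_dataset <DATASET_UUID> qc.py align.py
```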


Examples

List your top 10 datasets by size

thoa dataset list --sort-by size --number 10

Download specific files from a dataset

thoa dataset download <DATASET_UUID> ./localdir --include "reads/*.fastq.gz"

Find the largest dataset by number of files

thoa dataset list --sort-by files --number 1

Arguments

thoa dataset list

Lists all datasets available to your workspace.

thoa dataset list

Optional flags:

Flag            Description
--number, -n    Number of datasets to display
--sort-by, -s   Sort field: created, size, or files (default: created, descending)
--desc          Sort in descending order (default: true)

thoa dataset download <DATASET_UUID> <DESTINATION_PATH>

Downloads a dataset (or selected files from it) to a local folder.

thoa dataset download <DATASET_UUID> ./outputdir

Optional flags:

Flag            Description
--include, -i   Only download files matching these public IDs or globs
--exclude, -e   Exclude files matching these public IDs or globs
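
The two filters can be sketched together as follows. This assumes that --include and --exclude may be combined in one call and that exclusions are applied on top of inclusions, which the text above does not state. The glob patterns and the function name download_forward_reads are illustrative only.

```shell
# Sketch: download FASTQ files from reads/, skipping the reverse-read (_R2) files.
# Assumes --include and --exclude can be combined; patterns are illustrative.
download_forward_reads() {
  # Quote the globs so the local shell passes them to thoa unexpanded.
  thoa dataset download "$1" "$2" \
    --include "reads/*.fastq.gz" \
    --exclude "reads/*_R2.fastq.gz"
}

# Usage:
# download_forward_reads <DATASET_UUID> ./localdir
```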

thoa dataset ls <DATASET_UUID>

Lists all files in a dataset by its UUID.

thoa dataset ls 157d2823-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Optional flags:

Flag            Description
--level, -l     How many levels of the file hierarchy to display (integer)

thoa tools

Description

The thoa tools command displays information about the tools currently supported by Thoa job environments.
At the moment, Thoa supports tools from Bioconda and conda-forge, with additional repositories planned soon.

Running thoa tools provides direct links to the complete lists of available tools.


Examples

Show all supported tools

thoa tools

Supported Tools

At the moment, Thoa supports every package available from the following repositories:

  • Bioconda
  • conda-forge


thoa jobs

Description

The thoa jobs command group lets you view jobs you have previously run with Thoa. Each job represents a single execution and includes details such as execution status, timestamps, and associated datasets.

Use thoa jobs to track job progress and explore your job history.


Examples

List your most recent jobs

thoa jobs list

Show the 5 newest jobs

thoa jobs list --number 5

Sort jobs alphabetically by status

thoa jobs list --sort-by status --asc

Commands


thoa jobs list

Displays jobs in a table with columns:

  • Name – Job name
  • ID – Full job ID
  • Started – When the job began
  • Status – Job state (created, running, completed, …)
  • Input Dataset – Input dataset name
  • Output Dataset – Output dataset name

thoa jobs list

Options

Flag            Description
--number, -n    Limit how many jobs to display
--sort-by, -s   Sort field: started or status (default: started)
--asc           Sort in ascending order (default: descending)
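
To follow job progress from a terminal, the list command can be re-run at intervals. The helper below is a hypothetical sketch that uses only the documented thoa jobs list --number flag; the function name poll_jobs and its two arguments (number of refreshes and delay in seconds) are illustrative.

```shell
# Sketch: print the five newest jobs a fixed number of times, pausing between refreshes.
# poll_jobs is a hypothetical helper; it only re-runs the documented list command.
poll_jobs() {
  rounds="${1:-3}"   # how many times to refresh
  delay="${2:-30}"   # seconds to wait between refreshes
  i=0
  while [ "$i" -lt "$rounds" ]; do
    thoa jobs list --number 5 || return 1
    i=$((i + 1))
    # Skip the pause after the final refresh.
    [ "$i" -lt "$rounds" ] && sleep "$delay"
  done
  return 0
}

# Usage: refresh 10 times, once per minute
# poll_jobs 10 60
```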

Summary

Use the CLI to:

  • Define compute resources explicitly
  • Launch reproducible containerized jobs
  • Manage your environments and datasets
  • Automate large-scale workflows easily