Recipe Template | Thoa Cookbook

Introduction

This recipe provides a standardized and reproducible workflow to analyze high-throughput biological data and transform raw files into interpretable results.

It is designed to answer questions such as:

What biological changes are present between samples or conditions?
Which features (genes, variants, regions) are significantly affected?

Data

Describe the input data (e.g., bulk RNA-seq, WGS, scRNA-seq) and starting format (e.g., FASTQ, count matrices).

Methods & Tools

Briefly describe the main processing steps and tools used (e.g., QC, alignment, quantification, statistics, Nextflow, nf-core).

Target Audience

State who this recipe is for (e.g., bioinformaticians, lab scientists, data analysts) and the required skill level.

Purpose

Explain why this recipe is useful (e.g., reproducibility, automation, standardized analysis).

Input Data Overview

Briefly describe the structure and type of input data required for this recipe.

Required Files

Raw reads (.fastq.gz): Primary sequencing reads
Sample metadata (.csv): Sample IDs and experimental conditions
Reference genome (.fa): Genome sequence for alignment
Gene annotation (.gtf): Gene models for quantification
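
A quick sanity check of the sample metadata can catch problems before a long run starts. The sketch below is illustrative only: the column names `sample_id` and `condition` are assumptions, since the template does not fix a metadata schema.

```python
import csv
import io


def load_samples(csv_text, required=("sample_id", "condition")):
    """Parse sample metadata and return a list of row dicts.

    The column names are hypothetical; adapt them to the recipe's real sheet.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [col for col in required if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    # Sample IDs must be unique so outputs can be traced back to samples.
    ids = [row["sample_id"] for row in rows]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    if duplicates:
        raise ValueError(f"duplicate sample IDs: {duplicates}")
    return rows


# Toy metadata, made up for illustration.
example = "sample_id,condition\nsample1,control\nsample2,treated\n"
samples = load_samples(example)
```

In a real project the text would come from the metadata file, for example `load_samples(open("metadata/samples.csv").read())`.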

Input Directory Structure (optional)

project/
  data/
    sample1_R1.fastq.gz
    sample1_R2.fastq.gz
  metadata/
    samples.csv
  reference/
    genome.fa
    genes.gtf
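
The layout above can be verified programmatically before launching anything. This is a minimal sketch that assumes the exact file names from the example tree; the helper name `missing_inputs` is hypothetical, and a real recipe would list its own files.

```python
from pathlib import Path

# Relative paths taken from the example tree; adjust for real projects.
EXPECTED_INPUTS = [
    "data/sample1_R1.fastq.gz",
    "data/sample1_R2.fastq.gz",
    "metadata/samples.csv",
    "reference/genome.fa",
    "reference/genes.gtf",
]


def missing_inputs(project_dir, expected=EXPECTED_INPUTS):
    """Return the expected relative paths that are not files under project_dir."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_inputs("project")` returns an empty list when every file is in place, and otherwise the relative paths that still need to be provided.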

(Optional) Input figure or screenshot showing the data layout.

Tutorial

Step-by-step guide to run the recipe.

1. Prepare the Environment

Describe how to access the platform, workspace, or compute environment (e.g., login, project setup, permissions).

2. Configure Parameters (optional)

Describe which parameters must be defined and where they are configured (e.g., config file, UI form, environment variables).

<parameter_name>: <value>
<parameter_name>: <value>
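
If parameters are kept in the `name: value` form shown above, they can be read with a few lines of code. This is a sketch under that assumption only; the parameter names in the example are invented, and a recipe using YAML or a workflow-specific config format should use the matching parser instead.

```python
def parse_params(text):
    """Parse 'name: value' lines into a dict, skipping blanks and '#' comments."""
    params = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"line {line_no}: expected 'name: value', got {raw!r}")
        params[name.strip()] = value.strip()
    return params


# Hypothetical parameter names, for illustration only.
params = parse_params("genome: reference/genome.fa\nthreads: 8\n")
```

All values are returned as strings; converting `threads` to an integer is left to the caller so that the parser stays format-agnostic.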

3. Run the Workflow

Describe how the workflow is started (e.g., command-line, web interface, job submission button).

<command or action to start the workflow>
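
For a command-line recipe, the launch command can be assembled in code so that it is logged and reproducible. The sketch below assumes, purely for illustration, that the recipe wraps the nf-core/rnaseq pipeline and runs it with the Docker profile; the template itself names Nextflow and nf-core only as examples. It also assumes the samplesheet already follows the format that pipeline expects, which may differ from the metadata file described earlier.

```python
import shlex


def build_command(samplesheet, outdir, fasta, gtf, resume=False):
    """Assemble a Nextflow command line as a list of arguments."""
    cmd = [
        "nextflow", "run", "nf-core/rnaseq",  # assumed pipeline, not from the template
        "-profile", "docker",
        "--input", samplesheet,
        "--outdir", outdir,
        "--fasta", fasta,
        "--gtf", gtf,
    ]
    if resume:
        cmd.append("-resume")  # reuse cached results from an earlier run
    return cmd


cmd = build_command("metadata/samples.csv", "results",
                    "reference/genome.fa", "reference/genes.gtf")
print(shlex.join(cmd))  # shows the command without executing it
```

To actually start the run, pass the list to `subprocess.run(cmd, check=True)`. Building the command as a list avoids shell-quoting problems with paths that contain spaces.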

4. Monitor & Troubleshoot

Explain how users can:

Track job status
Access logs and reports
Resume or restart failed runs
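
When logs are plain text, a small filter can surface the lines worth reading first. This is a generic sketch: the `ERROR` marker and the sample log text are assumptions, not the format of any particular workflow engine.

```python
def error_lines(log_text, markers=("ERROR",)):
    """Return (line_number, line) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


# Made-up log text, for illustration only.
log = "INFO  step 1 started\nERROR step 1 failed\nINFO  shutting down\n"
hits = error_lines(log)
```

Here `hits` contains one entry pointing at line 2. Passing extra markers, such as `("ERROR", "WARN")`, widens the search.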

Result Data Overview

Explain what the outputs are and how to find them.

Key Outputs

| Output    | Location                    | Description     |
| --------- | --------------------------- | --------------- |
| Report    | results/multiqc_report.html | QC summary      |
| Counts    | results/counts.tsv          | Raw gene counts |
| BAM files | results/bams/               | Aligned reads   |

Output Tree (optional)

results/
  bams/
  counts.tsv
  multiqc_report.html

Example Figure

Final Analysis & Interpretation

Describe what users should look for in the results.

What does a successful run look like?
How should users interpret findings?
What are common next steps (e.g., differential gene expression analysis, enrichment analysis)?
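
One simple indicator of a successful run is that every sample has a plausible total read count. The sketch below sums raw counts per sample, assuming the counts table is tab-separated with a gene identifier in the first column and one column per sample; that layout is an assumption, and the toy table is invented.

```python
import csv
import io


def library_sizes(tsv_text):
    """Sum raw counts per sample column of a tab-separated counts table."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column is assumed to hold gene IDs
    totals = dict.fromkeys(samples, 0)
    for row in reader:
        if not row:
            continue
        for sample, value in zip(samples, row[1:]):
            totals[sample] += int(value)
    return totals


# Toy counts table, made up for illustration.
toy = "gene_id\tsample1\tsample2\ngeneA\t10\t0\ngeneB\t5\t7\n"
print(library_sizes(toy))  # {'sample1': 15, 'sample2': 7}
```

A sample whose total is zero, or far below the others, usually deserves a look at the QC report before any downstream statistics are run.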

References & Resources

Pipeline documentation
Relevant paper or method
THOA support: hello@thoa.io