Prompt Template

This is a simple prompt that tells an AI how to turn your serial task script into Labtasker submit and run scripts.

# Labtasker Task Decomposition Guide

## Overview
Labtasker is a CLI tool that distributes tasks across computing resources. It uses two primary commands:
- `labtasker task submit` - Queues task arguments for later execution
- `labtasker loop` - Continuously fetches and executes queued tasks

This guide will help you decompose a serial task script into Labtasker submit and run scripts.

## Task Decomposition Process
For any serial task script, you need to create:
1. A **submit script** that queues all task variations
2. A **run script** that executes those tasks when resources are available

### Pattern Recognition
- Identify loops and parameter variations in the original script
- Move parameter generation to the submit script
- Create a template for task execution in the run script
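
Before queuing anything, it can help to check that the extracted loops generate the parameter combinations you expect. The following is a minimal dry-run sketch: it prints the submit commands instead of running them. The helper name `dry_run_submit` and the argument names `arg1` and `arg2` are assumptions chosen for illustration and are not part of Labtasker.

```bash
#!/bin/bash
# Dry-run sketch: print one submit command per parameter combination
# instead of actually queuing tasks.
# `dry_run_submit` is a hypothetical helper, not a Labtasker command.
dry_run_submit() {
    local arg1 arg2
    for arg1 in 0 1 2; do
        for arg2 in 3 4 5; do
            # Each printed line corresponds to one task the submit script would queue.
            echo "labtasker task submit -- --arg1=$arg1 --arg2=$arg2"
        done
    done
}

dry_run_submit
```

Once the printed commands look right, the `echo` can be dropped so that the same loops perform the real submission.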

## Example 1: Simple Parameter Grid

### Original Script
```bash
#!/bin/bash
# Simple parameter grid search
# tags: experimental

for arg1 in {0..2}; do
    for arg2 in {3..5}; do
        python main.py --arg1 $arg1 --arg2 $arg2
    done
done
```

### Submit Script
```bash
#!/bin/bash
submit_date=$(date +%s)

for arg1 in {0..2}; do
    for arg2 in {3..5}; do
        labtasker task submit --name simple_grid_search --metadata "{'tags': ['experimental', '$submit_date']}" -- --arg1=$arg1 --arg2=$arg2
    done
done
```

### Run Script
```bash
#!/bin/bash
labtasker loop -- python main.py --arg1 '%(arg1)' --arg2 '%(arg2)'
```

## Example 2: Complex Environment and Parameters

### Original Script
```bash
#!/bin/bash

export CUDA_HOME=/usr/local/cuda-12.1

BASE_LOG_DIR=/path/to/logs

DATASETS=("imagenet" "cifar10" "mnist" "custom_dataset")
DATASET_DESCRIPTIONS=("A person's \"image\" net" "A person's cifar 10" "A person's mnist" "A person's custom dataset")

MODELS=("resnet50" "vit" "transformer" "alexnet")

for idx in "${!DATASETS[@]}"; do
  for model_idx in "${!MODELS[@]}"; do

    DATASET_DESCRIPTION=${DATASET_DESCRIPTIONS[$idx]}
    DATASET=${DATASETS[$idx]}
    MODEL=${MODELS[$model_idx]}
    LOG_DIR="$BASE_LOG_DIR/$(echo "$DATASET" | tr '[:upper:]' '[:lower:]')/$MODEL"

    echo "Processing Dataset: $(echo "$DATASET" | tr '[:lower:]' '[:upper:]')"
    echo "Dataset Description: ${DATASET_DESCRIPTION}"
    echo "Model (short): ${MODEL:0:3}"
    echo "Log Directory: ${LOG_DIR}"

    # Execute training command
    python train.py --dataset "$DATASET" \
                    --dataset-description "$DATASET_DESCRIPTION" \
                    --model "$MODEL" \
                    --cuda-home "$CUDA_HOME" \
                    --log-dir "$LOG_DIR"
  done
done

echo "All tasks completed successfully."
```

### Submit Script
```bash
#!/bin/bash

export CUDA_HOME=/usr/local/cuda-12.1
BASE_LOG_DIR=/path/to/logs

DATASETS=("imagenet" "cifar10" "mnist" "custom_dataset")
DATASET_DESCRIPTIONS=("A person's \"image\" net" "A person's cifar 10" "A person's mnist" "A person's custom dataset")
MODELS=("resnet50" "vit" "transformer" "alexnet")

for idx in "${!DATASETS[@]}"; do
  for model_idx in "${!MODELS[@]}"; do
    DATASET_DESCRIPTION=${DATASET_DESCRIPTIONS[$idx]}
    DATASET=${DATASETS[$idx]}
    MODEL=${MODELS[$model_idx]}
    LOG_DIR="$BASE_LOG_DIR/$(echo "$DATASET" | tr '[:upper:]' '[:lower:]')/$MODEL"

    echo "Submitting task for dataset: $DATASET, model: $MODEL"

    labtasker task submit -- \
                    --dataset="$DATASET" \
                    --dataset-description="$DATASET_DESCRIPTION" \
                    --model="$MODEL" \
                    --cuda-home="$CUDA_HOME" \
                    --log-dir="$LOG_DIR"
  done
done

echo "All tasks submitted successfully."
```

### Run Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1

labtasker loop -- \
            python train.py --dataset '%(dataset)' \
            --dataset-description '%(dataset_description)' \
            --model '%(model)' \
            --cuda-home '%(cuda_home)' \
            --log-dir '%(log_dir)'
```
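
In this example, the flag `--dataset-description` from the submit script appears as `%(dataset_description)` in the run script. The sketch below illustrates that naming rule as this guide describes it (hyphens become underscores). The helper `flag_to_placeholder` is hypothetical and only mimics the rule; it is not Labtasker's own implementation.

```bash
#!/bin/bash
# Hypothetical helper illustrating the naming rule described in this guide.
# It is NOT part of Labtasker and may not match its behavior in every case.
flag_to_placeholder() {
    local name
    name="${1%%=*}"                              # drop "=value" if present
    name="${name#--}"                            # drop the leading "--"
    name=$(printf '%s' "$name" | tr '-' '_')     # hyphens become underscores
    printf '%%(%s)\n' "$name"
}

flag_to_placeholder --dataset-description
flag_to_placeholder --log-dir=/path/to/logs
```

Running each submitted flag through such a helper gives a quick way to see which placeholder names the run script is expected to use.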

## Example 3: Complex Scripts (Using the `labtasker loop --script-path` Option)

### Original Script
```bash
#!/bin/bash

export CUDA_HOME=/usr/local/cuda-12.1

for dataset in imagenet cifar10 mnist; do
  for model in resnet50 vit transformer; do
    LOG_DIR=/path/to/logs/$dataset/$model
    python train.py --dataset $dataset \
      --model $model \
      --cuda-home $CUDA_HOME \
      --log-dir $LOG_DIR
  done
done

echo "done"
```

### Submit Script

```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1

for dataset in imagenet cifar10 mnist; do
  for model in resnet50 vit transformer; do
    LOG_DIR=/path/to/logs/$dataset/$model

    labtasker task submit -- --CUDA_HOME="$CUDA_HOME" --LOG_DIR="$LOG_DIR" --dataset="$dataset" --model="$model"

  done
done
```

### Run Script
```bash
#!/bin/bash

export CUDA_HOME=/usr/local/cuda-12.1

LABTASKER_TASK_SCRIPT=$(mktemp)

cat <<'LABTASKER_LOOP_EOF' > "$LABTASKER_TASK_SCRIPT"
CUDA_HOME=%(CUDA_HOME)
LOG_DIR=%(LOG_DIR)
dataset=%(dataset)
model=%(model)
    python train.py --dataset $dataset \
      --model $model \
      --cuda-home $CUDA_HOME \
      --log-dir $LOG_DIR
LABTASKER_LOOP_EOF

labtasker loop --executable /bin/bash --script-path "$LABTASKER_TASK_SCRIPT"

echo "done"
```
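
The run script above writes its template with a quoted heredoc delimiter (`<<'LABTASKER_LOOP_EOF'`). The sketch below shows why the quoting matters: with a quoted delimiter, bash writes `$dataset` into the template verbatim, while an unquoted delimiter expands it when the template is written. The function names and the `TEMPLATE_EOF` delimiter are assumptions made for this illustration.

```bash
#!/bin/bash
# A value in the outer shell that should NOT leak into the template.
dataset="expanded-too-early"

# Quoted delimiter: the heredoc body is emitted verbatim.
render_quoted() {
    cat <<'TEMPLATE_EOF'
dataset=%(dataset)
echo "$dataset"
TEMPLATE_EOF
}

# Unquoted delimiter: $dataset is expanded while the template is written.
render_unquoted() {
    cat <<TEMPLATE_EOF
dataset=%(dataset)
echo "$dataset"
TEMPLATE_EOF
}

render_quoted
echo "---"
render_unquoted
```

In both cases the `%(dataset)` placeholder is untouched by bash, but only the quoted form keeps `$dataset` intact, so the variable is resolved later, when the task script itself runs.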

## Key Points to Remember

- All variables passed to `labtasker task submit` become available as `%(variable_name)` in the run script. All submitted variables MUST be used in the run script, and all variables used in the run script MUST be submitted in advance.
- DO NOT APPLY NORMALIZATION ON YOUR OWN. E.g. if the original script uses `--value_a=1`, the run script must not change it to `--value_a 1`. Your script should respect the original format.
- The names of submitted arguments WILL BE normalized. E.g. an argument submitted as `--value-a=1` is referenced in the run script as `--value-a=%(value_a)`.
- Complex values with spaces or special characters should be properly quoted
- Environment variables and preprocessing can be included in the submit script
- Environment variables should also be preserved in the run script in case they're needed
- The run script acts as a template that's filled with task-specific values at runtime
- For special values in the submit script, e.g. negative numbers (`--value=-1`) or empty or blank strings (`--value=""` or `--value=" "`), use the `--value=<value>` form instead of `--value <value>`.
- Argument interpolation using the `%(variable_name)` syntax is only valid under the `labtasker loop` command. If a bash script itself needs `%(variable_name)` syntax, use the `--script-path` option to specify the path to a script that contains the interpolation syntax.
- Handle environment variables properly. Wrong: `labtasker loop -- CUDA_VISIBLE_DEVICES=0 python main.py ...`; right: `CUDA_VISIBLE_DEVICES=0 labtasker loop -- python main.py ...`.
- If the user provides task-relevant info such as a name, tags, or a description, consider using `--name` or `--metadata` to record it. Remember that `--metadata` must be a JSON string that can be converted to a Python dict using `literal_eval`.
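
The first key point (every submitted variable is used, and every used variable is submitted) can be checked mechanically. The following is a minimal sketch of such a check. Both helpers, `list_placeholders` and `list_submitted`, are hypothetical and not part of Labtasker, and the hyphen-to-underscore conversion simply follows the normalization rule stated above.

```bash
#!/bin/bash
# Hypothetical helper: list the placeholder names used in a run command or template.
list_placeholders() {
    printf '%s\n' "$1" \
        | grep -o '%([A-Za-z_][A-Za-z0-9_]*)' \
        | sed 's/^%(//; s/)$//' \
        | sort -u
}

# Hypothetical helper: turn submitted flags into the names the run script should use.
list_submitted() {
    local flag name
    for flag in "$@"; do
        name="${flag%%=*}"      # drop "=value"
        name="${name#--}"       # drop the leading "--"
        printf '%s\n' "$name" | tr '-' '_'
    done | sort -u
}

run_cmd="python main.py --arg1 '%(arg1)' --arg2 '%(arg2)'"
if [ "$(list_placeholders "$run_cmd")" = "$(list_submitted --arg1=0 --arg2=3)" ]; then
    echo "submit and run scripts agree"
else
    echo "mismatch between submitted arguments and placeholders" >&2
fi
```

The comparison works on sorted, de-duplicated names, so the order of arguments in either script does not matter.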

Now, I will provide an original script that needs to be decomposed for Labtasker. Decompose it following the steps above.