Prompt Template
This is a simple prompt that tells an AI assistant how to turn your serial task script into Labtasker submit and run scripts.
# Labtasker Task Decomposition Guide
## Overview
Labtasker is a CLI tool that distributes tasks across computing resources. It uses two primary commands:
- `labtasker task submit` - Queues task arguments for later execution
- `labtasker loop` - Continuously fetches and executes queued tasks
This guide will help you decompose a serial task script into Labtasker submit and run scripts.
## Task Decomposition Process
For any serial task script, you need to create:
1. A **submit script** that queues all task variations
2. A **run script** that executes those tasks when resources are available
### Pattern Recognition
- Identify loops and parameter variations in the original script
- Move parameter generation to the submit script
- Create a template for task execution in the run script
## Example 1: Simple Parameter Grid
### Original Script
```bash
#!/bin/bash
# Simple parameter grid search
# tags: experimental
for arg1 in {0..2}; do
    for arg2 in {3..5}; do
        python main.py --arg1 $arg1 --arg2 $arg2
    done
done
```
### Submit Script
```bash
#!/bin/bash
submit_date=$(date +%s)
for arg1 in {0..2}; do
    for arg2 in {3..5}; do
        labtasker task submit --name simple_grid_search --metadata "{'tags': ['experimental', '$submit_date']}" -- --arg1=$arg1 --arg2=$arg2
    done
done
```
### Run Script
```bash
#!/bin/bash
labtasker loop -- python main.py --arg1 '%(arg1)' --arg2 '%(arg2)'
```
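Before queuing anything, it can help to preview what the submit script would send. The following is a minimal sketch, not part of Labtasker itself: `dry_run_submit` is a hypothetical helper that prints each `labtasker task submit` command instead of running it, so the parameter grid can be inspected first.

```bash
#!/bin/bash
# Hypothetical dry-run helper (not a Labtasker feature): print each submit
# command instead of executing it, so the parameter grid can be checked.
dry_run_submit() {
    for arg1 in 0 1 2; do          # same values as {0..2}
        for arg2 in 3 4 5; do      # same values as {3..5}
            echo "labtasker task submit -- --arg1=$arg1 --arg2=$arg2"
        done
    done
}

dry_run_submit
```

A 3 x 3 grid should print nine lines, one per task; piping the output through `wc -l` gives a quick count.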
## Example 2: Complex Environment and Parameters
### Original Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
BASE_LOG_DIR=/path/to/logs
DATASETS=("imagenet" "cifar10" "mnist" "custom_dataset")
DATASET_DESCRIPTIONS=("A person's \"image\" net" "A person's cifar 10" "A person's mnist" "A person's custom dataset")
MODELS=("resnet50" "vit" "transformer" "alexnet")
for idx in "${!DATASETS[@]}"; do
    for model_idx in "${!MODELS[@]}"; do
        DATASET_DESCRIPTION=${DATASET_DESCRIPTIONS[$idx]}
        DATASET=${DATASETS[$idx]}
        MODEL=${MODELS[$model_idx]}
        LOG_DIR="$BASE_LOG_DIR/$(echo "$DATASET" | tr '[:upper:]' '[:lower:]')/$MODEL"
        echo "Processing Dataset: $(echo "$DATASET" | tr '[:lower:]' '[:upper:]')"
        echo "Dataset Description: ${DATASET_DESCRIPTION}"
        echo "Model (short): ${MODEL:0:3}"
        echo "Log Directory: ${LOG_DIR}"
        # Execute training command
        python train.py --dataset "$DATASET" \
            --dataset-description "$DATASET_DESCRIPTION" \
            --model "$MODEL" \
            --cuda-home "$CUDA_HOME" \
            --log-dir "$LOG_DIR"
    done
done
echo "All tasks completed successfully."
```
### Submit Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
BASE_LOG_DIR=/path/to/logs
DATASETS=("imagenet" "cifar10" "mnist" "custom_dataset")
DATASET_DESCRIPTIONS=("A person's \"image\" net" "A person's cifar 10" "A person's mnist" "A person's custom dataset")
MODELS=("resnet50" "vit" "transformer" "alexnet")
for idx in "${!DATASETS[@]}"; do
    for model_idx in "${!MODELS[@]}"; do
        DATASET_DESCRIPTION=${DATASET_DESCRIPTIONS[$idx]}
        DATASET=${DATASETS[$idx]}
        MODEL=${MODELS[$model_idx]}
        LOG_DIR="$BASE_LOG_DIR/$(echo "$DATASET" | tr '[:upper:]' '[:lower:]')/$MODEL"
        echo "Submitting task for dataset: $DATASET, model: $MODEL"
        labtasker task submit -- \
            --dataset="$DATASET" \
            --dataset-description="$DATASET_DESCRIPTION" \
            --model="$MODEL" \
            --cuda-home="$CUDA_HOME" \
            --log-dir="$LOG_DIR"
    done
done
echo "All tasks submitted successfully."
```
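The dataset descriptions in this example contain spaces, apostrophes, and double quotes, so the quoting in the submit script matters. As a minimal sketch using only standard shell features, the hypothetical helper `show_args` below prints one received argument per line, which makes it easy to see whether a `--key=value` pair arrives as a single argument.

```bash
#!/bin/bash
# Hypothetical helper: print each received argument on its own line.
show_args() {
    printf '%s\n' "$@"
}

DATASET_DESCRIPTION="A person's \"image\" net"

# Quoted: the whole description stays inside one argument.
show_args --dataset-description="$DATASET_DESCRIPTION"

# Unquoted: the shell splits the value on spaces into several arguments.
show_args --dataset-description=$DATASET_DESCRIPTION
```

The first call prints a single line, while the second prints four, which is why the submit script wraps every expansion in double quotes.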
### Run Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
labtasker loop -- \
    python train.py --dataset '%(dataset)' \
    --dataset-description '%(dataset_description)' \
    --model '%(model)' \
    --cuda-home '%(cuda_home)' \
    --log-dir '%(log_dir)'
```
## Example 3: Complex Scripts (Using `labtasker loop --script-path` option)
### Original Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
for dataset in imagenet cifar10 mnist; do
    for model in resnet50 vit transformer; do
        LOG_DIR=/path/to/logs/$dataset/$model
        python train.py --dataset $dataset \
            --model $model \
            --cuda-home $CUDA_HOME \
            --log-dir $LOG_DIR
    done
done
echo "done"
```
### Submit Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
for dataset in imagenet cifar10 mnist; do
    for model in resnet50 vit transformer; do
        LOG_DIR=/path/to/logs/$dataset/$model
        labtasker task submit -- --CUDA_HOME="$CUDA_HOME" --LOG_DIR="$LOG_DIR" --dataset="$dataset" --model="$model"
    done
done
```
### Run Script
```bash
#!/bin/bash
export CUDA_HOME=/usr/local/cuda-12.1
LABTASKER_TASK_SCRIPT=$(mktemp)
cat <<'LABTASKER_LOOP_EOF' > "$LABTASKER_TASK_SCRIPT"
CUDA_HOME=%(CUDA_HOME)
LOG_DIR=%(LOG_DIR)
dataset=%(dataset)
model=%(model)
python train.py --dataset $dataset \
    --model $model \
    --cuda-home $CUDA_HOME \
    --log-dir $LOG_DIR
LABTASKER_LOOP_EOF
labtasker loop --executable /bin/bash --script-path "$LABTASKER_TASK_SCRIPT"
echo "done"
```
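The run script above leaves the temporary file created by `mktemp` behind. One possible refinement, sketched here with standard shell features only, is an `EXIT` trap that removes the file. `run_with_cleanup` is a hypothetical wrapper, and the `labtasker loop` line is kept as a comment so the snippet stays runnable without Labtasker installed.

```bash
#!/bin/bash
# Hypothetical wrapper: the body runs in a subshell "( ... )", so the EXIT
# trap fires as soon as the wrapper finishes.
run_with_cleanup() (
    task_script=$(mktemp)
    trap 'rm -f "$task_script"' EXIT

    # Stand-in for the heredoc that writes the task template.
    echo 'echo "task body goes here"' > "$task_script"

    # labtasker loop --executable /bin/bash --script-path "$task_script"

    echo "$task_script"   # print the path so callers can verify cleanup
)

leftover=$(run_with_cleanup)
[ -e "$leftover" ] || echo "temporary script removed"
```

Whether this cleanup is wanted depends on the workflow: keeping the file can be useful for debugging a failed task.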
## Key Points to Remember
- All variables passed to `labtasker task submit` become available as `%(variable_name)` in the run script. All submitted variables MUST be used in the run script. All variables used in the run script MUST be submitted in advance.
- DO NOT APPLY NORMALIZATION ON YOUR OWN. E.g. this is prohibited: original: `--value_a=1`; run: `--value_a 1`. Your script should respect the original format.
- Submitted argument names WILL BE normalized: hyphens in the name become underscores in the placeholder. E.g. an argument submitted as `--value-a=1` is written as `--value-a=%(value_a)` in the run script.
- Complex values with spaces or special characters should be properly quoted
- Environment variables and preprocessing can be included in the submit script
- Environment variables should also be preserved in the run script in case they're needed
- The run script acts as a template that's filled with task-specific values at runtime
- For special values in the submit script, such as negative numbers (`--value=-1`) or empty or blank strings (`--value=""`, `--value=" "`), use `--value=<value>` rather than `--value <value>`.
- Argument interpolation with the `%(variable_name)` syntax is only valid in the command passed to `labtasker loop`. If a longer bash script needs `%(variable_name)` placeholders, put them in a script file and pass it with the `--script-path` option.
- Properly handle env variables: wrong: `labtasker loop -- CUDA_VISIBLE_DEVICES=0 python main.py ...`; right: `CUDA_VISIBLE_DEVICES=0 labtasker loop -- python main.py ...`.
- If the user provides task-related information such as a name, tags, or a description, consider recording it with `--name` or `--metadata`. Remember that `--metadata` must be a JSON-like string that can be converted to a Python dict using `literal_eval`, e.g. `"{'tags': ['experimental']}"`.
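The naming rule in the list above can be checked mechanically. The sketch below assumes that turning hyphens into underscores is the only change applied to argument names (the examples in this guide show no other); `to_placeholder` is a hypothetical helper, not a Labtasker command.

```bash
#!/bin/bash
# Hypothetical helper: derive the placeholder for a submitted flag, assuming
# the only normalization is "hyphens in the name become underscores".
to_placeholder() {
    name=${1%%=*}     # drop "=value" if present
    name=${name#--}   # drop the leading "--"
    printf '%%(%s)\n' "$(printf '%s' "$name" | tr '-' '_')"
}

to_placeholder --dataset-description="A person's mnist"   # prints %(dataset_description)
to_placeholder --LOG_DIR=/path/to/logs                    # prints %(LOG_DIR)
```

Comparing these placeholders against the ones used in the run script is a quick way to confirm that every submitted argument is consumed and nothing unsubmitted is referenced.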
Now, I will provide an original script that needs to be decomposed for Labtasker. You need to decompose it as per the above steps.