AI-Flux: LLM Batch Processing Pipeline for HPC Systems
A streamlined solution for running Large Language Models (LLMs) in batch mode on HPC systems powered by Slurm. AI-Flux uses the OpenAI-compatible API format with a JSONL-first architecture for all interactions.
 JSONL Input                 Batch Processing               Results
(OpenAI Format)              (Ollama + Model)            (JSON Output)
       │                             │                         │
       ▼                             ▼                         ▼
 ┌──────────┐                ┌──────────────┐            ┌──────────┐
 │  Batch   │                │              │            │  Output  │
 │ Requests │───────────────▶│   Model on   │───────────▶│ Results  │
 │ (JSONL)  │                │    GPU(s)    │            │  (JSON)  │
 └──────────┘                └──────────────┘            └──────────┘
AI-Flux processes JSONL files in a standardized OpenAI-compatible batch API format, enabling efficient processing of thousands of prompts on HPC systems with minimal overhead.
To get started, create a conda environment and install AI-Flux from source:
conda create -n aiflux python=3.11 -y
conda activate aiflux
pip install -e .
cp .env.example .env
# Edit .env with your SLURM account and model details
The primary workflow for AI-Flux is submitting JSONL files for batch processing on SLURM:
from aiflux.slurm import SlurmRunner
from aiflux.core.config import Config
# Setup SLURM configuration
config = Config()
slurm_config = config.get_slurm_config()
slurm_config.account = "myaccount"
# Initialize runner
runner = SlurmRunner(config=slurm_config)
# Submit JSONL file directly for processing
job_id = runner.run(
    input_path="prompts.jsonl",
    output_path="results.json",
    model="llama3.2:3b",
    batch_size=4,
)
print(f"Job submitted with ID: {job_id}")
The JSONL input format follows the OpenAI Batch API specification, with one request object per line:
{"custom_id":"request1","method":"POST","url":"/v1/chat/completions","body":{"model":"llama3.2:3b","messages":[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Explain quantum computing"}],"temperature":0.7,"max_tokens":500}}
{"custom_id":"request2","method":"POST","url":"/v1/chat/completions","body":{"model":"llama3.2:3b","messages":[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"What is machine learning?"}],"temperature":0.7,"max_tokens":500}}
For advanced options like custom batch sizes, processing settings, or SLURM configuration, see the Configuration Guide.
For advanced model configuration, see the Models Guide.
AI-Flux includes a command-line interface for submitting batch processing jobs:
# Process JSONL file directly (core functionality)
aiflux run --model llama3.2:3b --input data/prompts.jsonl --output results/output.json
For detailed command options:
aiflux --help
Results are saved as a JSON array in the user’s workspace, pairing each input request with its output:
[
  {
    "input": {
      "custom_id": "request1",
      "method": "POST",
      "url": "/v1/chat/completions",
      "body": {
        "model": "llama3.2:3b",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant"},
          {"role": "user", "content": "Original prompt text"}
        ],
        "temperature": 0.7,
        "max_tokens": 1024
      },
      "metadata": {
        "source_file": "example.txt"
      }
    },
    "output": {
      "id": "chat-cmpl-123",
      "object": "chat.completion",
      "created": 1699123456,
      "model": "llama3.2:3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Generated response text"
          },
          "finish_reason": "stop"
        }
      ]
    },
    "metadata": {
      "model": "llama3.2:3b",
      "timestamp": "2023-11-04T12:34:56.789Z",
      "processing_time": 1.23
    }
  }
]
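Downstream code can read this file and extract the generated text. A minimal sketch (the file name matches the output_path used earlier):
import json

with open("results.json") as f:
    results = json.load(f)

for entry in results:
    custom_id = entry["input"]["custom_id"]
    # Each output follows the OpenAI chat completion shape shown above.
    answer = entry["output"]["choices"][0]["message"]["content"]
    print(f"{custom_id}: {answer}")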
AI-Flux provides utility converters to help prepare JSONL files from various input formats:
# Convert CSV to JSONL
aiflux convert csv --input data/papers.csv --output data/papers.jsonl --template "Summarize: {text}"
# Convert directory to JSONL
aiflux convert dir --input data/documents/ --output data/docs.jsonl --recursive
For code examples of converters, see the examples directory.
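For orientation, the CSV conversion above can also be approximated without the CLI. A minimal sketch using only the standard library (the column name "text" and the file paths are illustrative; the bundled converters handle more options):
import csv
import json

# Build one chat-completion request per CSV row, filling the prompt
# template with the row's "text" column (mirrors --template "Summarize: {text}").
with open("data/papers.csv", newline="") as src, \
     open("data/papers.jsonl", "w") as dst:
    for i, row in enumerate(csv.DictReader(src), start=1):
        request = {
            "custom_id": f"request{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "llama3.2:3b",
                "messages": [
                    {"role": "user", "content": f"Summarize: {row['text']}"}
                ],
                "temperature": 0.7,
                "max_tokens": 500,
            },
        }
        dst.write(json.dumps(request) + "\n")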
We welcome contributions! Please see CONTRIBUTING.md for guidelines.