Overview

The gen() function generates text from the language model at the current position in your prompt program.

Syntax

sgl.gen(name, max_tokens=128, temperature=1.0, ...)

Parameters

name (str): Variable name under which the generated text is stored. Access it with state[name].
max_tokens (int, default: 128): Maximum number of tokens to generate.
min_tokens (int): Minimum number of tokens to generate.
temperature (float, default: 1.0): Sampling temperature. Higher values (e.g., 1.5) make output more random; lower values (e.g., 0.2) make it more deterministic.
top_p (float, default: 1.0): Nucleus sampling threshold. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches top_p.
top_k (int, default: -1): Top-k sampling. Only the k most likely tokens are considered. -1 disables the filter.
min_p (float, default: 0.0): Minimum probability threshold for token sampling. Tokens whose probability falls below min_p times the probability of the most likely token are excluded.
stop (str | List[str]): Stop sequences. Generation stops when any of these strings is generated.
stop_token_ids (List[int]): Token IDs that stop generation when produced.
stop_regex (str | List[str]): Regular expressions that stop generation when the output matches one of them.
frequency_penalty (float, default: 0.0): Penalizes tokens according to how often they have already appeared. Positive values reduce repetition.
presence_penalty (float, default: 0.0): Penalizes tokens that have already appeared at least once. Positive values encourage topic diversity.
ignore_eos (bool, default: false): Whether to ignore end-of-sequence tokens and keep generating.
regex (str): Regular expression constraint. The generated text must match this pattern.
json_schema (str): JSON schema constraint, passed as a string. The generated text must be valid JSON that conforms to this schema.
choices (List[str]): If provided, gen() behaves like select() and chooses from these options.
return_logprob (bool): Whether to return log probabilities for generated tokens.
logprob_start_len (int): Position from which log probabilities are computed.
top_logprobs_num (int): Number of top log probabilities to return per token.
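
To make the effect of top_k, top_p, and min_p more concrete, the sketch below applies each filter to a toy next-token distribution in plain Python. It is an illustration under stated assumptions, not SGLang's sampling code: the helper name `filter_candidates`, the toy probabilities, and the order in which the filters are applied are all hypothetical, and min_p is interpreted relative to the probability of the most likely token.

```python
def filter_candidates(probs, top_k=-1, top_p=1.0, min_p=0.0):
    """Return the tokens that survive top_k, top_p, and min_p filtering.

    probs maps token -> probability. Hypothetical helper for illustration;
    the filter order used here is an assumption.
    """
    # Sort tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # top_k: keep only the k most likely tokens (-1 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # min_p: drop tokens below min_p times the top token's probability.
    threshold = min_p * kept[0][1]
    return [token for token, p in kept if p >= threshold]


toy = {" Paris": 0.60, " Lyon": 0.25, " Nice": 0.10, " Rome": 0.05}
print(filter_candidates(toy))              # all four tokens survive
print(filter_candidates(toy, top_k=2))     # [' Paris', ' Lyon']
print(filter_candidates(toy, top_p=0.8))   # [' Paris', ' Lyon']
print(filter_candidates(toy, min_p=0.2))   # [' Paris', ' Lyon']
```

With the default values (top_k=-1, top_p=1.0, min_p=0.0) no token is filtered out, which is why these parameters only change behavior once they are set explicitly.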

Usage

Basic Generation

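import sglang as sgl

# Assumes a default backend has already been configured with
# sgl.set_default_backend(); run() needs one to reach a model.

@sgl.function
def simple_gen(s):
    s += "The capital of France is"
    s += sgl.gen("answer", max_tokens=10)

state = simple_gen.run()
print(state["answer"])  # e.g. " Paris"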

With Stop Sequences

@sgl.function
def generate_list(s):
    s += "List three colors:\n"
    s += sgl.gen("colors", max_tokens=50, stop="\n\n")

state = generate_list.run()
print(state["colors"])
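
The effect of a stop sequence can be pictured as cutting the raw model output at the earliest match. The plain-Python sketch below mimics that behavior on a hard-coded string. The helper `truncate_at_stop` and the sample text are hypothetical, and the sketch assumes the stop string itself is excluded from the returned text.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop string.

    Hypothetical helper; assumes the stop string is excluded from the result.
    """
    stops = [stop] if isinstance(stop, str) else list(stop)
    # Find where each stop string first appears, ignoring ones that never match.
    positions = [text.find(s) for s in stops if s in text]
    return text[:min(positions)] if positions else text


raw = "1. Red\n2. Green\n3. Blue\n\nHere are three more colors:"
print(truncate_at_stop(raw, "\n\n"))           # the three-item list only
print(truncate_at_stop(raw, ["\n\n", "3."]))   # cut earlier, at "3."
```

When several stop strings are given, whichever one appears first in the output decides where the text ends.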

Temperature Control

@sgl.function
def creative_writing(s, prompt):
    s += prompt
    s += sgl.gen("story", max_tokens=200, temperature=1.5)  # More creative

@sgl.function
def factual_qa(s, question):
    s += question
    s += sgl.gen("answer", max_tokens=50, temperature=0.0)  # Deterministic
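
To see why a higher temperature gives more varied output, it can help to look at how temperature reshapes a probability distribution. The sketch below is a plain-Python illustration, not SGLang internals: the function name `softmax_with_temperature` and the toy logits are hypothetical, and treating a temperature of 0 as greedy decoding is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding, so all
        # probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy scores for three candidate tokens
for t in (0.0, 0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, while higher temperatures spread it across the alternatives.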

Constrained Generation with Regex

@sgl.function
def generate_email(s):
    s += "Generate an email address:\n"
    s += sgl.gen(
        "email",
        max_tokens=30,
        regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    )
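
Because a token limit may cut generation short, it can be worth re-checking the result against the same pattern before using it. The sketch below does this with Python's standard `re` module. The helper `matches_constraint` and the sample strings are hypothetical stand-ins for state["email"].

```python
import re

# Same pattern as in the gen() call above.
EMAIL_PATTERN = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"


def matches_constraint(text, pattern=EMAIL_PATTERN):
    """Return True when the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


for candidate in ["alice@example.com", "alice@example", "mail me at a@b.io"]:
    print(candidate, matches_constraint(candidate))
```

`re.fullmatch` is used rather than `re.search` so that extra text around an otherwise valid address is rejected.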

JSON Schema Constraint

import json

@sgl.function
def generate_person(s):
    schema = json.dumps({
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "email": {"type": "string"}
        },
        "required": ["name", "age"]
    })
    
    s += "Generate a person:\n"
    s += sgl.gen("person", max_tokens=100, json_schema=schema)

state = generate_person.run()
person = json.loads(state["person"])
print(person["name"], person["age"])
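
If the parsed result feeds into other code, a small defensive check on the fields the schema marks as required can catch truncated or unexpected output early. The following is a minimal sketch in plain Python. The helper `parse_person` is hypothetical, it checks only this example's schema by hand, and the sample string stands in for state["person"].

```python
import json


def parse_person(raw):
    """Parse raw JSON text and check the fields used by the example schema.

    Hypothetical helper; it does not implement general JSON Schema validation.
    """
    person = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(person, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(person.get("name"), str):
        raise ValueError("'name' must be a string")
    age = person.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("'age' must be an integer")
    if "email" in person and not isinstance(person["email"], str):
        raise ValueError("'email' must be a string when present")
    return person


sample = '{"name": "Alice", "age": 30}'  # stand-in for state["person"]
person = parse_person(sample)
print(person["name"], person["age"])
```

Since "email" is not listed under "required" in the schema, the check accepts objects that omit it.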

Specialized Variants

gen_int()

Generates an integer value.
sgl.gen_int(name, max_tokens=10, ...)
Automatically constrains generation to match integer format (digits with optional +/- prefix). Example:
@sgl.function
def math_problem(s):
    s += "What is 25 + 17? Answer: "
    s += sgl.gen_int("result", max_tokens=5)

state = math_problem.run()
print(int(state["result"]))  # 42
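
The integer format described above (digits with an optional +/- prefix) can be written as a regular expression and used to parse the result defensively. This is a sketch under assumptions: `INT_PATTERN` is inferred from that description rather than taken from SGLang's source, and `parse_generated_int` is a hypothetical helper.

```python
import re

INT_PATTERN = r"[-+]?[0-9]+"  # assumed pattern: optional sign, then digits


def parse_generated_int(text):
    """Parse integer text, tolerating surrounding whitespace."""
    cleaned = text.strip()
    if re.fullmatch(INT_PATTERN, cleaned) is None:
        raise ValueError(f"not an integer: {text!r}")
    return int(cleaned)


print(parse_generated_int("42"))     # 42
print(parse_generated_int("-7\n"))   # -7
print(parse_generated_int("+15"))    # 15
```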

gen_string()

Generates a string value.
sgl.gen_string(name, max_tokens=50, ...)
Automatically constrains generation to match quoted string format. Example:
@sgl.function
def extract_name(s, text):
    s += f"Extract the name from: {text}\nName: "
    s += sgl.gen_string("name", max_tokens=20)

state = extract_name.run(text="Hello, I'm Alice.")
print(state["name"])  # "Alice"
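
Since the value follows a quoted string format, the stored text may include the surrounding double quotes. A small helper can remove them before further use. `strip_quotes` below is a hypothetical helper, and the inputs are hard-coded stand-ins for state["name"].

```python
def strip_quotes(text):
    """Remove one pair of surrounding double quotes, if present."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


print(strip_quotes('"Alice"'))   # Alice
print(strip_quotes('Alice'))     # Alice
```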

Accessing Generated Content

The generated text is stored in the state object and can be accessed by name:
state = my_function.run()
generated_text = state["variable_name"]
