pyflink.dataframe.ai.llm.LLMAccessor.predict

LLMAccessor.predict(*input_cols: str, provider: str = None, model: str = None, output_type: Mapping[str, str | DataType] | None = None, config: Dict[str, str] = None) → DataFrame

Run a model prediction over the given input columns.

This is the general-purpose prediction method of the LLM accessor.

When a provider is configured, a temporary model is created with the given input/output schema. When using a catalog model (no provider), the model’s registered schema is used and output_type is ignored.

Parameters:

*input_cols – Column names to use as input.
provider – Provider name. If omitted, the default provider is used. If no provider is configured at all, model is treated as a catalog model name.
model – Provider model name (e.g. "qwen-plus") or catalog model name.
output_type – Output column schema as a dict {name: type}. Type can be a SQL type string (e.g. "STRING") or a DataType object. Defaults to {"output": "STRING"}. Only used when a provider is configured. Ignored for catalog models.
config – Optional dict of runtime config options.
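
The interaction between provider, model and output_type can be sketched in plain Python. This is only an illustration of the rules described above, not PyFlink code: resolve_output_schema, DEFAULT_OUTPUT_TYPE, and the catalog_schema and default_provider arguments are hypothetical names introduced here, and the real implementation may differ.

```python
from typing import Mapping, Optional

# Documented default for output_type when a provider is configured.
DEFAULT_OUTPUT_TYPE = {"output": "STRING"}


def resolve_output_schema(
    provider: Optional[str],
    output_type: Optional[Mapping[str, str]],
    catalog_schema: Optional[Mapping[str, str]] = None,
    default_provider: Optional[str] = None,
) -> dict:
    """Hypothetical helper (not part of PyFlink) mirroring the documented rules."""
    # An explicit provider wins; otherwise fall back to the default provider.
    effective_provider = provider if provider is not None else default_provider

    if effective_provider is not None:
        # Provider configured: a temporary model uses the requested schema,
        # or the documented default when output_type is omitted.
        if output_type is None:
            return dict(DEFAULT_OUTPUT_TYPE)
        return dict(output_type)

    # No provider: the model is a catalog model, so its registered schema
    # is used and output_type is ignored.
    if catalog_schema is None:
        raise ValueError("a catalog model needs a registered schema")
    return dict(catalog_schema)
```

Under these assumptions, passing output_type together with a catalog model has no effect on the result, which is why the parameter is described as ignored in that case.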

Returns:

A new DataFrame with the model output columns appended.

Example:

>>> # Single output column (default)
>>> df.llm.predict("question", model="qwen-plus")
>>> # JSON structured output
>>> df.llm.predict("question", model="qwen-plus",
...     output_type={"output": "VARIANT"})
>>> # Multiple output columns
>>> df.llm.predict("question", model="qwen-plus",
...     output_type={"answer": DataType.string(),
...                  "score": DataType.float64()})
