AI / LLM#
The AI module provides integration with large language models (LLMs) for tasks such as
classification, sentiment analysis, text extraction, translation, summarization, PII masking,
and embedding generation. Access these functions through the df.llm accessor on a
DataFrame.
Example:
>>> import pyflink.dataframe as pf
>>> pf.set_provider("openai", api_key="...", model="gpt-4o")
>>> df = pf.from_dict({"text": ["Hello world", "Bonjour le monde"]})
>>> df.llm.ai_translate("text", source_lang="auto", target_lang="en").collect()
Providers#
Register and manage LLM providers.
|
Register a global provider configuration. |
|
Set the default provider for AI functions. |
List all registered provider names. |
Provider#
Base class and built-in provider implementations.
|
Base class for model providers. |
|
Provider for all OpenAI-compatible endpoints (openai-compat). |
|
Provider for Alibaba Cloud DashScope (dashscope). |
|
Provider for NVIDIA Triton Inference Server (triton). |
|
Generic provider for unknown or custom model providers. |
LLMAccessor#
Accessed via the df.llm property on a DataFrame.
Provides methods for common AI tasks.
|
Perform prediction using a model. |
|
Classify text into one of the provided labels. |
|
Analyze the sentiment of input text. |
|
Extract structured information from text. |
|
Translate text from one language to another. |
|
Summarize text to a maximum length. |
|
Mask sensitive information in text. |
|
Generate embedding vectors for text. |