PyFlink 1.20+vvr.11.7.dev0 documentation

pyflink.dataframe.sql.sql

sql(query: str, *, auto_bind: bool = True, **bindings: DataFrame) → DataFrame

Run a SQL SELECT query against DataFrames in scope, returning a DataFrame.

There are two independent ways DataFrames get bound to SQL names:

  • Explicit bindings, passed as **bindings keyword arguments. These are treated as a contract with the caller: any problem (an invalid view name, a non-DataFrame value, or a collision with an existing temporary table or view) is reported as ValueError.

  • Auto-bind (when auto_bind=True, the default): scans the caller’s globals and locals for DataFrame instances and registers each one under its Python variable name. This is best-effort: a name collision or an invalid view name produces a warning and the DataFrame is skipped, and a DataFrame that belongs to a different TableEnvironment is logged and skipped.
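
The two binding paths described above can be illustrated with a small, library-free sketch. This is not the PyFlink implementation: Frame is a stand-in for DataFrame, collect_bindings is a hypothetical helper, the view-name rule (a letter followed by letters, digits, or underscores) is an assumption made only for the demo, and so is the choice that locals shadow globals. It shows strict handling of explicit bindings, best-effort scanning of the caller's frame, and explicit bindings winning on a name collision.

```python
import inspect
import re
import warnings

# Hypothetical view-name rule, used only for this illustration.
_VALID_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


class Frame:
    """Stand-in for a DataFrame; not the PyFlink class."""

    def __init__(self, label):
        self.label = label


def collect_bindings(auto_bind=True, **bindings):
    """Return a name -> Frame mapping following the rules sketched above."""
    # Explicit bindings are strict: any problem raises ValueError.
    for name, value in bindings.items():
        if not _VALID_NAME.fullmatch(name):
            raise ValueError(f"invalid view name: {name!r}")
        if not isinstance(value, Frame):
            raise ValueError(f"binding {name!r} is not a Frame")

    resolved = {}
    if auto_bind:
        caller = inspect.currentframe().f_back
        # Assumption: a local variable shadows a global of the same name.
        scope = dict(caller.f_globals)
        scope.update(caller.f_locals)
        for name, value in scope.items():
            if not isinstance(value, Frame):
                continue
            if not _VALID_NAME.fullmatch(name):
                # Best-effort path: warn and skip instead of raising.
                warnings.warn(f"skipping {name!r}: not a valid view name")
                continue
            resolved[name] = value

    # Explicit bindings take precedence over auto-bound names.
    resolved.update(bindings)
    return resolved


def demo():
    orders = Frame("orders")
    users = Frame("users")
    _scratch = Frame("scratch")   # skipped with a warning in this sketch
    replacement = Frame("replacement")
    return collect_bindings(users=replacement)


result = demo()
```

Here result maps orders to the local Frame, while users maps to the explicitly passed Frame rather than the local variable of the same name. Because the scan looks at the immediate caller's frame, wrapping such a function in another helper changes which variables are visible to it.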

Parameters:
  • query – A SQL SELECT statement. Other statement kinds (INSERT, DDL) are not supported.

  • auto_bind – If True (default), auto-detect DataFrame instances in the caller’s globals and locals. See above.

  • **bindings – Explicit alias -> DataFrame mappings. These are always registered, even when auto_bind is False, and take precedence over auto-bound names on a name collision.

Returns:

A new DataFrame backed by the query result.

Note

This function is not safe under concurrent calls against the same TableEnvironment: the register / sql_query / drop sequence is not atomic. Don’t call it from multiple threads sharing one t_env.
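
If several threads must issue queries against one TableEnvironment, one possible workaround is to serialize the calls behind a lock. The sketch below is a generic pattern, not something this API provides: serialized is a hypothetical wrapper, and fake_sql is a stand-in for the real call so that the example runs without a Flink installation.

```python
import functools
import threading
import time

# One lock per shared TableEnvironment (assumption: a single shared t_env).
_query_lock = threading.Lock()


def serialized(run_query):
    """Wrap run_query so that only one call executes at a time."""

    @functools.wraps(run_query)
    def wrapper(*args, **kwargs):
        with _query_lock:
            return run_query(*args, **kwargs)

    return wrapper


# Stand-in for the real query call; it records how many calls overlap.
_counter_lock = threading.Lock()
in_flight = 0
max_in_flight = 0


def fake_sql(query):
    global in_flight, max_in_flight
    with _counter_lock:
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
    time.sleep(0.01)  # simulate the non-atomic register / query / drop work
    with _counter_lock:
        in_flight -= 1
    return query.upper()


safe_sql = serialized(fake_sql)

results = []
threads = [
    threading.Thread(target=lambda i=i: results.append(safe_sql(f"select {i}")))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, max_in_flight never exceeds 1. Note that a wrapper like this sits between the caller and the wrapped function, so if it were applied to sql() the auto-bind scan would presumably see the wrapper's frame rather than the original caller's variables; passing explicit bindings with auto_bind=False avoids depending on that.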

Example:

>>> import pyflink.dataframe as pf
>>> df1 = pf.from_dict({"a": [1, 2, 3], "b": ["x", "y", "z"]})
>>> df2 = pf.from_dict({"a": [1, 2, 3], "c": ["p", "q", "r"]})
>>> pf.sql("SELECT * FROM df1 JOIN df2 ON df1.a = df2.a").show()
