Catalog#

A Catalog reads and writes metadata such as databases, tables, views, and UDFs from a registered catalog. It connects a registered catalog to Flink’s Table API.

`Catalog.get_default_database`()	Get the name of the default database for this catalog.
`Catalog.list_databases`()	Get the names of all databases in this catalog.
`Catalog.get_database`(database_name)	Get a database from this catalog.
`Catalog.database_exists`(database_name)	Check if a database exists in this catalog.
`Catalog.create_database`(name, database, ...)	Create a database.
`Catalog.drop_database`(name, ignore_if_exists)	Drop a database.
`Catalog.alter_database`(name, new_database, ...)	Modify an existing database.
`Catalog.list_tables`(database_name)	Get names of all tables and views under this database.
`Catalog.list_views`(database_name)	Get names of all views under this database.
`Catalog.get_table`(table_path)	Get a CatalogTable or CatalogView identified by table_path.
`Catalog.table_exists`(table_path)	Check if a table or view exists in this catalog.
`Catalog.drop_table`(table_path, ...)	Drop a table or view.
`Catalog.rename_table`(table_path, ...)	Rename an existing table or view.
`Catalog.create_table`(table_path, table, ...)	Create a new table or view.
`Catalog.alter_table`(table_path, new_table, ...)	Modify an existing table or view.
`Catalog.list_partitions`(table_path[, ...])	Get CatalogPartitionSpec of all partitions of the table.
`Catalog.get_partition`(table_path, partition_spec)	Get a partition of the given table.
`Catalog.partition_exists`(table_path, ...)	Check whether a partition exists or not.
`Catalog.create_partition`(table_path, ...)	Create a partition.
`Catalog.drop_partition`(table_path, ...)	Drop a partition.
`Catalog.alter_partition`(table_path, ...)	Alter a partition.
`Catalog.list_functions`(database_name)	List the names of all functions in the given database.
`Catalog.get_function`(function_path)	Get the function.
`Catalog.function_exists`(function_path)	Check whether a function exists or not.
`Catalog.create_function`(function_path, ...)	Create a function.
`Catalog.alter_function`(function_path, ...)	Modify an existing function.
`Catalog.drop_function`(function_path, ...)	Drop a function.
`Catalog.get_table_statistics`(table_path)	Get the statistics of a table.
`Catalog.get_table_column_statistics`(table_path)	Get the column statistics of a table.
`Catalog.get_partition_statistics`(table_path, ...)	Get the statistics of a partition.
`Catalog.bulk_get_partition_statistics`(...)	Get a list of statistics of given partitions.
`Catalog.get_partition_column_statistics`(...)	Get the column statistics of a partition.
`Catalog.bulk_get_partition_column_statistics`(...)	Get a list of the column statistics for the given partitions.
`Catalog.alter_table_statistics`(table_path, ...)	Update the statistics of a table.
`Catalog.alter_table_column_statistics`(...)	Update the column statistics of a table.
`Catalog.alter_partition_statistics`(...)	Update the statistics of a table partition.
`Catalog.alter_partition_column_statistics`(...)	Update the column statistics of a table partition.

CatalogDatabase#

Represents a database object in a catalog.

`CatalogDatabase.create_instance`(properties)	Creates an instance of CatalogDatabase.
`CatalogDatabase.get_properties`()	Get a map of properties associated with the database.
`CatalogDatabase.get_comment`()	Get comment of the database.
`CatalogDatabase.copy`()	Get a deep copy of the CatalogDatabase instance.
`CatalogDatabase.get_description`()	Get a brief description of the database.
`CatalogDatabase.get_detailed_description`()	Get a detailed description of the database.

CatalogBaseTable#

CatalogBaseTable is the common parent of table and view. It has a map of key-value pairs defining the properties of the table.

`CatalogBaseTable.create_table`(schema[, ...])	Create an instance of CatalogBaseTable for the catalog table.
`CatalogBaseTable.create_view`(original_query, ...)	Create an instance of CatalogBaseTable for the catalog view.
`CatalogBaseTable.get_options`()	Returns a map of string-based options.
`CatalogBaseTable.get_schema`()	Get the schema of the table.
`CatalogBaseTable.get_unresolved_schema`()	Returns the schema of the table or view.
`CatalogBaseTable.get_comment`()	Get comment of the table or view.
`CatalogBaseTable.copy`()	Get a deep copy of the CatalogBaseTable instance.
`CatalogBaseTable.get_description`()	Get a brief description of the table or view.
`CatalogBaseTable.get_detailed_description`()	Get a detailed description of the table or view.

CatalogPartition#

Represents a partition object in a catalog.

`CatalogPartition.create_instance`(properties)	Creates an instance of CatalogPartition.
`CatalogPartition.get_properties`()	Get a map of properties associated with the partition.
`CatalogPartition.copy`()	Get a deep copy of the CatalogPartition instance.
`CatalogPartition.get_description`()	Get a brief description of the partition object.
`CatalogPartition.get_detailed_description`()	Get a detailed description of the partition object.
`CatalogPartition.get_comment`()	Get comment of the partition.

CatalogFunction#

Represents a function object in a catalog.

`CatalogFunction.create_instance`(class_name)	Creates an instance of CatalogFunction.
`CatalogFunction.get_class_name`()	Get the full name of the class backing the function.
`CatalogFunction.copy`()	Create a deep copy of the function.
`CatalogFunction.get_description`()	Get a brief description of the function.
`CatalogFunction.get_detailed_description`()	Get a detailed description of the function.
`CatalogFunction.is_generic`()	Whether or not the function is a Flink UDF.
`CatalogFunction.get_function_language`()	Get the language used for the function definition.

ObjectPath#

A database name and object (table/view/function) name combo in a catalog.

`ObjectPath.from_string`(full_name)
`ObjectPath.get_database_name`()
`ObjectPath.get_object_name`()
`ObjectPath.get_full_name`()
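
The relationship between the full name and its two parts can be sketched in plain Python. This is a stand-in that does not call PyFlink: the helper names `split_full_name` and `join_full_name` are hypothetical, and the `database.object` dot-separated format is an assumption about what `ObjectPath.from_string` and `ObjectPath.get_full_name` work with.

```python
from typing import Tuple


def split_full_name(full_name: str) -> Tuple[str, str]:
    """Split 'database.object' into (database_name, object_name).

    Assumption: a full name has exactly one dot separating two
    non-empty parts. This mirrors the idea behind ObjectPath, not
    its actual implementation.
    """
    parts = full_name.split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'database.object', got {full_name!r}")
    return parts[0], parts[1]


def join_full_name(database_name: str, object_name: str) -> str:
    """Build the full name from its two parts."""
    return f"{database_name}.{object_name}"


# Hypothetical database and table names, used only for illustration.
database_name, object_name = split_full_name("sales_db.orders")
full_name = join_full_name(database_name, object_name)
```

With PyFlink installed, the corresponding calls would be the `ObjectPath` methods listed above.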

CatalogPartitionSpec#

Represents a partition spec object in catalog. Partition columns and values are NOT of strict order, and they need to be re-arranged to the correct order by comparing with a list of strictly ordered partition keys.

`CatalogPartitionSpec.get_partition_spec`()	Get the partition spec as a key-value map.
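
The re-arrangement described above can be sketched in plain Python, without calling PyFlink. The helper `order_partition_values` and the sample keys and values are hypothetical. The sketch assumes the spec is a full spec, meaning it contains every partition key.

```python
from typing import Dict, List


def order_partition_values(spec: Dict[str, str], partition_keys: List[str]) -> List[str]:
    """Return the partition values in the order given by partition_keys."""
    missing = [key for key in partition_keys if key not in spec]
    if missing:
        raise KeyError(f"partition spec is missing keys: {missing}")
    return [spec[key] for key in partition_keys]


# Partition keys as declared on the table, in strict order (hypothetical).
partition_keys = ["year", "month", "day"]

# A key-value map such as get_partition_spec() might return, in arbitrary order.
spec = {"day": "07", "year": "2024", "month": "03"}

ordered_values = order_partition_values(spec, partition_keys)
```

The ordered key list, not the map, decides the final order, so two specs with the same entries inserted in a different order resolve to the same partition.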

CatalogTableStatistics#

Statistics for a non-partitioned table or a partition of a partitioned table.

`CatalogTableStatistics.get_row_count`()	The number of rows in the table or partition.
`CatalogTableStatistics.get_field_count`()	The number of files on disk.
`CatalogTableStatistics.get_total_size`()	The total size in bytes.
`CatalogTableStatistics.get_raw_data_size`()	The raw data size (size when loaded in memory) in bytes.
`CatalogTableStatistics.get_properties`()
`CatalogTableStatistics.copy`()	Create a deep copy of "this" instance.

CatalogColumnStatistics#

Column statistics of a table or partition.

`CatalogColumnStatistics.get_column_statistics_data`()
`CatalogColumnStatistics.get_properties`()
`CatalogColumnStatistics.copy`()

HiveCatalog#

A catalog implementation for Hive.

`HiveCatalog`(catalog_name[, ...])	A catalog implementation for Hive.

JdbcCatalog#

A catalog implementation for JDBC.

`JdbcCatalog`(catalog_name, default_database, ...)	A catalog implementation for JDBC.

Column#

Representation of a column in a ResolvedSchema.

A table column describes either a pyflink.table.catalog.PhysicalColumn, pyflink.table.catalog.ComputedColumn, or pyflink.table.catalog.MetadataColumn.

`Column.physical`(name, data_type)	Creates a regular table column that represents physical data.
`Column.computed`(name, resolved_expression)	Creates a computed column that is computed from the given `ResolvedExpression`.
`Column.metadata`(name, data_type, ...)	Creates a metadata column from metadata of the given column name or from metadata of the given key (if not null).
`Column.with_comment`(comment)	Add the comment to the column and return the new object.
`Column.is_physical`()	Returns whether the given column is a physical column of a table; neither computed nor metadata.
`Column.is_persisted`()	Returns whether the given column is persisted in a sink operation.
`Column.get_data_type`()	Returns the data type of this column.
`Column.get_name`()	Returns the name of this column.
`Column.get_comment`()	Returns the comment of this column.
`Column.as_summary_string`()	Returns a string that summarizes this column for printing to a console.
`Column.explain_extras`()	Returns an explanation of specific column extras next to name and type.
`Column.copy`(new_type)	Returns a copy of the column with a replaced `DataType`.
`Column.rename`(new_name)	Returns a copy of the column with a replaced name.

WatermarkSpec#

Representation of a watermark specification in ResolvedSchema.

It defines the rowtime attribute and a ResolvedExpression for watermark generation.

`WatermarkSpec.of`(rowtime_attribute, ...)	Creates a `WatermarkSpec` from a given rowtime attribute and a watermark expression.
`WatermarkSpec.get_rowtime_attribute`()	Returns the name of a rowtime attribute.
`WatermarkSpec.get_watermark_expression`()	Returns the `ResolvedExpression` for watermark generation.
`WatermarkSpec.as_summary_string`()	Prints the watermark spec in a readable way.

Constraint#

Integrity constraints, generally referred to simply as constraints, define the valid states of SQL-data by constraining the values in the base tables.

`Constraint.get_name`()	Returns the name of the constraint.
`Constraint.is_enforced`()	Constraints can either be enforced or non-enforced.
`Constraint.get_type`()	Returns the type of the constraint, which could be PRIMARY_KEY or UNIQUE_KEY.
`Constraint.as_summary_string`()	Prints the constraint in a readable way.

UniqueConstraint#

A unique key constraint. It can also be declared as a PRIMARY KEY.

`UniqueConstraint.get_columns`()	List of column names for which the primary key was defined.
`UniqueConstraint.get_type_string`()	Returns a string representation of the underlying constraint type.
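
The key column names from `UniqueConstraint.get_columns` relate to positional indexes such as those from `ResolvedSchema.get_primary_key_indexes`. The plain-Python sketch below shows one way to map between them. The helper `primary_key_indexes` and the sample column names are hypothetical. Returning the indexes in key-column order is an assumption.

```python
from typing import List


def primary_key_indexes(column_names: List[str], key_columns: List[str]) -> List[int]:
    """Map each key column name to its position in the schema's column list."""
    return [column_names.index(name) for name in key_columns]


# Hypothetical schema columns and primary key columns.
column_names = ["id", "region", "amount"]
key_columns = ["region", "id"]

indexes = primary_key_indexes(column_names, key_columns)

# A schema without a primary key yields an empty list.
no_key_indexes = primary_key_indexes(column_names, [])
```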

ResolvedSchema#

Schema of a table or view consisting of columns, constraints, and watermark specifications.

This class is the result of resolving a Schema into a final validated representation.

Data types and functions have been expanded to fully qualified identifiers.
Time attributes are represented in the column’s data type.
pyflink.table.Expression instances have been translated to pyflink.table.catalog.ResolvedExpression.

This class should not be passed into a connector, and it is therefore not serializable. Instead, the result of to_physical_row_data_type() can be passed around where necessary.

`ResolvedSchema.of`(columns)	Shortcut for a resolved schema of only columns.
`ResolvedSchema.physical`(column_names, ...)	Shortcut for a resolved schema of only physical columns.
`ResolvedSchema.get_column_count`()	Returns the number of `Column` of this schema.
`ResolvedSchema.get_columns`()	Returns all `Column` of this schema.
`ResolvedSchema.get_column_names`()	Returns all column names.
`ResolvedSchema.get_column_data_types`()	Returns all column data types.
`ResolvedSchema.get_column`(column_index_or_name)	Returns the `Column` instance for the given column index or name.
`ResolvedSchema.get_watermark_specs`()	Returns a list of watermark specifications each consisting of a rowtime attribute and watermark strategy expression.
`ResolvedSchema.get_primary_key`()	Returns the primary key if it has been defined.
`ResolvedSchema.get_primary_key_indexes`()	Returns the primary key indexes, if any, otherwise returns an empty list.
`ResolvedSchema.to_source_row_data_type`()	Converts all columns of this schema into a (possibly nested) row data type.
`ResolvedSchema.to_physical_row_data_type`()	Converts all physical columns of this schema into a (possibly nested) row data type.
`ResolvedSchema.to_sink_row_data_type`()	Converts all persisted columns of this schema into a (possibly nested) row data type.
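
The three row-type conversions differ only in which columns they keep: all columns, physical columns, or persisted columns. The plain-Python sketch below models that selection without calling PyFlink. `SketchColumn` is a hypothetical stand-in for `Column`. Treating a virtual metadata column as not persisted is an assumption not stated on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchColumn:
    """Hypothetical stand-in for a schema column."""
    name: str
    kind: str              # "physical", "computed", or "metadata"
    virtual: bool = False  # only meaningful for metadata columns (assumption)

    def is_physical(self) -> bool:
        # Physical means neither computed nor metadata.
        return self.kind == "physical"

    def is_persisted(self) -> bool:
        # Assumption: computed columns and virtual metadata columns
        # are not written by a sink; everything else is.
        if self.kind == "computed":
            return False
        if self.kind == "metadata":
            return not self.virtual
        return True


columns: List[SketchColumn] = [
    SketchColumn("id", "physical"),
    SketchColumn("price", "physical"),
    SketchColumn("total", "computed"),
    SketchColumn("offset", "metadata", virtual=True),
    SketchColumn("ts", "metadata"),
]

# Analogous to to_source_row_data_type(): every column.
source_fields = [c.name for c in columns]
# Analogous to to_physical_row_data_type(): physical columns only.
physical_fields = [c.name for c in columns if c.is_physical()]
# Analogous to to_sink_row_data_type(): persisted columns only.
sink_fields = [c.name for c in columns if c.is_persisted()]
```

In this sketch the sink view keeps the non-virtual metadata column `ts` but drops the computed column `total`, while the physical view drops both.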
