Input/Output#

Functions to read data from external sources into a DataFrame. For writing data, see the write_parquet, write_kafka, write_odps, write_paimon, write_sls, write_hologres, and write_custom methods on DataFrame.

Example:

>>> import pyflink.dataframe as pf
>>> df = pf.read_parquet("/path/to/data", schema={"id": pf.DataType.int64(), "name": pf.DataType.string()})
>>> df = pf.read_kafka("localhost:9092", schema={"key": pf.DataType.string()}, topic="my_topic")

read_parquet(path, *[, schema, columns, ...])

Read a Parquet file or directory into a DataFrame.

read_kafka(bootstrap_servers, *[, schema, ...])

Read data from one or more Kafka topics into a DataFrame.

read_odps(endpoint, *[, schema, columns, ...])

Read a MaxCompute (ODPS) table into a DataFrame.

read_paimon([path, schema, primary_key, ...])

Read a Paimon table into a DataFrame.

read_sls(endpoint, *, project, logstore[, ...])

Read an SLS (Aliyun Log Service) logstore into a DataFrame.

read_hologres(endpoint, *, db_name, ...[, ...])

Read a Hologres table into a DataFrame.

read_custom(connector, *[, schema, ...])

Read data from a custom connector into a DataFrame.