PyFlink 1.20+vvr.11.7.dev0 documentation

pyflink.datastream.formats.orc.OrcBulkWriters

class OrcBulkWriters

Convenience builder for creating a BulkWriterFactory that writes records with a predefined schema to ORC files in batches.

Example:

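>>> from pyflink.common import Configuration
>>> from pyflink.datastream.connectors.file_system import FileSink
>>> from pyflink.datastream.formats.orc import OrcBulkWriters
>>> from pyflink.table import DataTypes
>>> # OUTPUT_DIR (the target directory) and ds (a DataStream) are assumed to be defined.
>>> row_type = DataTypes.ROW([
...     DataTypes.FIELD('string', DataTypes.STRING()),
...     DataTypes.FIELD('int_array', DataTypes.ARRAY(DataTypes.INT()))
... ])
>>> sink = FileSink.for_bulk_format(
...     OUTPUT_DIR, OrcBulkWriters.for_row_type(
...         row_type=row_type,
...         writer_properties=Configuration(),
...         hadoop_config=Configuration(),
...     )
... ).build()
>>> ds.sink_to(sink)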

Added in version 1.16.0.

Methods

for_row_type(row_type[, writer_properties, ...])

Creates a BulkWriterFactory that writes records with a predefined schema to ORC files in batches.
