Skip to main content
Ctrl+K
PyFlink 1.20+vvr.11.7.dev0 documentation - Home PyFlink 1.20+vvr.11.7.dev0 documentation - Home
  • API Reference
  • Examples
  • API Reference
  • Examples

Section Navigation

  • PyFlink Table
  • PyFlink DataStream
    • StreamExecutionEnvironment
    • DataStream
    • Functions
    • State
    • Timer
    • Window
    • Checkpoint
    • Side Outputs
    • Asynchronous I/O
    • Connectors
    • Formats
  • PyFlink DataFrame
  • PyFlink Common
  • API Reference
  • PyFlink DataStream
  • DataStream
  • pyflink.datastream.data_stream.CachedDataStream.rescale

pyflink.datastream.data_stream.CachedDataStream.rescale#

CachedDataStream.rescale() → DataStream#

Sets the partitioning of the DataStream so that the output elements are distributed evenly to a subset of instances of the next operation in a round-robin fashion.

The subset of downstream operations to which the upstream operation sends elements depends on the degree of parallelism of both the upstream and downstream operation. For example, if the upstream operation has parallelism 2 and the downstream operation has parallelism 4, then one upstream operation would distribute elements to two downstream operations. If, on the other hand, the downstream operation has parallelism 4 then two upstream operations will distribute to one downstream operation while the other two upstream operations will distribute to the other downstream operations.

In cases where the different parallelisms are not multiples of each one or several downstream operations will have a differing number of inputs from upstream operations.

Returns:

The DataStream with rescale partitioning set.

previous

pyflink.datastream.data_stream.CachedDataStream.project

next

pyflink.datastream.data_stream.CachedDataStream.rebalance

On this page
  • CachedDataStream.rescale()

This Page

  • Show Source

Created using Sphinx 7.4.7.

Built with the PyData Sphinx Theme 0.16.1.