pyflink.table.GroupedTable.aggregate#
- GroupedTable.aggregate(func: Expression | UserDefinedAggregateFunctionWrapper) AggregatedTable[source]#
Performs a aggregate operation with an aggregate function. You have to close the aggregate with a select statement.
Example:
>>> agg = udaf(lambda a: (a.mean(), a.max()), ... result_type=DataTypes.ROW( ... [DataTypes.FIELD("a", DataTypes.FLOAT()), ... DataTypes.FIELD("b", DataTypes.INT())]), ... func_type="pandas") >>> tab.group_by(col('a')).aggregate(agg(col('b')).alias("c", "d")).select( ... col('a'), col('c'), col('d')) >>> # take all the columns as inputs >>> # pd is a Pandas.DataFrame >>> agg_row = udaf(lambda pd: (pd.a.mean(), pd.b.max()), ... result_type=DataTypes.ROW( ... [DataTypes.FIELD("a", DataTypes.FLOAT()), ... DataTypes.FIELD("b", DataTypes.INT())]), ... func_type="pandas") >>> tab.group_by(col('a')).aggregate(agg.alias("a", "b")).select(col('a'), col('b'))
- Parameters:
func – user-defined aggregate function.
- Returns:
The result table.
Added in version 1.13.0.