@sreeja97 sreeja97 commented Oct 18, 2025

@sreeja97 sreeja97 requested a review from rhshadrach as a code owner October 18, 2025 02:17
Member

@rhshadrach rhshadrach left a comment

Thanks for the PR!



@set_module("pandas")
class NamedAgg(NamedTuple):
Member

This seems too breaking. Previously, users could access NamedAgg.column after creation, but not if we inherit from tuple. Can we use a dataclass here instead:

@dataclasses.dataclass
class NamedAgg:
    column: Hashable
    aggfunc: AggScalar
    args: tuple = ()
    kwargs: dict = dataclasses.field(default_factory=dict)

    def __init__(self, column: Hashable, aggfunc: AggScalar, *args, **kwargs) -> None:
        self.column = column
        self.aggfunc = aggfunc
        self.args = args
        self.kwargs = kwargs

    def __getitem__(self, key: int):
        if key == 0:
            return self.column
        elif key == 1:
            return self.aggfunc
        elif key == 2:
            return self.args
        elif key == 3:
            return self.kwargs
        raise IndexError("index out of range")

We could then possibly deprecate __getitem__ access.
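
To make the suggestion above concrete, here is a minimal, self-contained sketch of the dataclass approach. It is not the pandas implementation: `AggFunc` is a stand-in for pandas' internal `AggScalar` alias, and the `DeprecationWarning` raised in `__getitem__` is only one hypothetical way positional access could be phased out.

```python
import dataclasses
import warnings
from typing import Any, Callable, Hashable, Union

# Stand-in for pandas' internal AggScalar alias (assumption for this sketch).
AggFunc = Union[str, Callable[..., Any]]


@dataclasses.dataclass
class NamedAgg:
    column: Hashable
    aggfunc: AggFunc
    args: tuple = ()
    kwargs: dict = dataclasses.field(default_factory=dict)

    # dataclass does not overwrite an explicitly defined __init__, so the
    # constructor can collect extra positional and keyword arguments.
    def __init__(self, column: Hashable, aggfunc: AggFunc, *args, **kwargs) -> None:
        self.column = column
        self.aggfunc = aggfunc
        self.args = args
        self.kwargs = kwargs

    def __getitem__(self, key: int):
        # Hypothetical deprecation path for tuple-style access.
        warnings.warn(
            "Indexing a NamedAgg is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Indexing a tuple raises IndexError for out-of-range keys.
        return (self.column, self.aggfunc, self.args, self.kwargs)[key]


agg = NamedAgg("a", "quantile", 0.25, interpolation="nearest")
print(agg.column, agg.aggfunc, agg.args, agg.kwargs)
```

Attribute access (`agg.column`, `agg.aggfunc`) keeps working as before, while `agg[0]` and `agg[1]` still return the same values but emit a warning in this sketch.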

>>> agg_1 = pd.NamedAgg(column=1, aggfunc=lambda x: np.mean(x))
>>> df.groupby("key").agg(result_a=agg_a, result_1=agg_1)
result_a result_1
>>> agg_b = pd.NamedAgg(column="b", aggfunc=lambda x: x.mean())
Member

I think the point here is to demonstrate that you can use a named tuple on columns that are not strings.

return original_func(series, *final_args, **final_kwargs)

wrapped._is_wrapped = True # type: ignore[attr-defined]
aggfunc = wrapped
Member

In line with the above, this changes aggfunc, which is a public attribute. Instead, I think we should use args/kwargs in the places within pandas that accept a NamedAgg.
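
As a rough illustration of that idea, the sketch below leaves `aggfunc` untouched and forwards `args`/`kwargs` only at the point where the aggregation is called. Everything here is hypothetical: `NamedAggSketch` is a simplified stand-in for `pd.NamedAgg` (it takes `args` and `kwargs` as explicit fields), `apply_named_agg` stands in for whatever pandas code consumes a NamedAgg, and a plain dict of lists stands in for a DataFrame.

```python
import dataclasses
from typing import Any, Callable, Hashable


@dataclasses.dataclass
class NamedAggSketch:
    """Hypothetical, simplified stand-in for pd.NamedAgg."""

    column: Hashable
    aggfunc: Callable[..., Any]
    args: tuple = ()
    kwargs: dict = dataclasses.field(default_factory=dict)


def apply_named_agg(data: dict, agg: NamedAggSketch) -> Any:
    """Hypothetical consumer: pass args/kwargs through at call time."""
    values = data[agg.column]
    # The user's function is called directly; it is never wrapped or replaced.
    return agg.aggfunc(values, *agg.args, **agg.kwargs)


def scaled_sum(values, factor, offset=0):
    return sum(values) * factor + offset


agg = NamedAggSketch(column=1, aggfunc=scaled_sum, args=(2,), kwargs={"offset": 1})
result = apply_named_agg({1: [10, 11, 12]}, agg)
print(result)
```

Because the wrapping step is gone, `agg.aggfunc` remains the exact object the user passed in, which is the property the comment above asks to preserve.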

Development

Successfully merging this pull request may close these issues.

ENH: pd.NamedAgg should accept args and kwargs
