Understanding the Pandas DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns

What Does This Mean?
Example
How to Fix the Warning
- Option 1: Exclude the Grouping Column
- Option 2: Explicitly Select Desired Columns
Conclusion

When using the .apply() method on a grouped DataFrame in pandas, you may encounter the following warning:

DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.

What Does This Mean?

This warning indicates that the grouping column(s) are still present in the resulting DataFrame after applying a function. Since these columns already serve as the index, their inclusion is often redundant.

Example

Consider the following DataFrame:

df = pd.DataFrame([
    ('Jack', 'Male', 10000),
    ('Anne', 'Female', 12000),
    ('Kate', 'Female', 15000),
    ('Dan', 'Male', 8000),
    ('Alice', 'Female', 9000)
], columns=['Name', 'Gender', 'Salary'])

grouped = df.groupby('Gender')

Let’s apply a function to find the maximum salary values within each gender group:

grouped.apply(lambda g: g.max())

This triggers the warning because the Gender column, used for grouping, appears in the resulting DataFrame despite already being an index.

DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns

How to Fix the Warning

There are two ways to resolve this:

Option 1: Exclude the Grouping Column

Pass include_groups=False to explicitly exclude the grouping column from the output:

grouped.apply(lambda g: g.max(), include_groups=False)

Option 2: Explicitly Select Desired Columns

If you only want specific columns in the output, apply the function to a subset:

df.groupby('Gender')[['Name', 'Salary']].apply(lambda g: g.max())

This ensures that only the selected columns are included in the final DataFrame.

Conclusion

This change in pandas helps avoid redundancy when working with grouped DataFrames. To future-proof your code and silence the warning, either set include_groups=False or explicitly select the columns you need.

Understanding the Pandas DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns

What Does This Mean?

Example

How to Fix the Warning

Option 1: Exclude the Grouping Column

Option 2: Explicitly Select Desired Columns

Conclusion

You May Also Like

Disrupt or Conform: A Tension Between Design and Business Operational Practices

Rethinking Design: Theory as a Practical Tool

Passkeys: Demystified

Representing Negative Binary Numbers