cudf.core.groupby.DataFrameGroupBy.head
- DataFrameGroupBy.head(n: int = 5, *, preserve_order: bool = True)
Return the first n rows of each group.
- Parameters:
- n
If positive: number of entries to include from the start of each group. If negative: number of entries to exclude from the end of each group.
- preserve_order
If True (default), return the n rows from each group in original dataframe order (this mimics pandas behavior but is more expensive). If you don't need the rows in original dataframe order, setting preserve_order=False improves performance. In both cases, the original index is preserved, so .loc-based indexing will work identically.
- Returns:
- Series or DataFrame
Subset of the original grouped object as determined by n
Examples
>>> import cudf
>>> df = cudf.DataFrame(
...     {
...         "a": [1, 0, 1, 2, 2, 1, 3, 2, 3, 3, 3],
...         "b": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
...     }
... )
>>> df.groupby("a").head(1)
   a  b
0  1  0
1  0  1
3  2  3
6  3  6
>>> df.groupby("a").head(-2)
   a  b
0  1  0
3  2  3
6  3  6
8  3  8
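The selection rule for n can be illustrated without a GPU. The following is a minimal pure-Python sketch of the semantics described above, not cuDF's implementation; the helper name group_head_rows is hypothetical and was chosen for this illustration. It returns the row positions that would be kept, sorted back into original order to mirror preserve_order=True.

```python
from collections import defaultdict


def group_head_rows(keys, n=5):
    """Return row positions kept by a per-group head(n), in original row order.

    Hypothetical helper for illustration only; it is not part of cuDF.
    """
    # Collect the row positions belonging to each group key.
    positions = defaultdict(list)
    for pos, key in enumerate(keys):
        positions[key].append(pos)

    kept = []
    for rows in positions.values():
        # A slice covers both cases: rows[:n] takes the first n rows when
        # n is positive and drops the last |n| rows when n is negative.
        kept.extend(rows[:n])

    # Sorting restores original row order (the preserve_order=True case).
    return sorted(kept)


# Same grouping column as in the example above.
a = [1, 0, 1, 2, 2, 1, 3, 2, 3, 3, 3]
print(group_head_rows(a, 1))   # [0, 1, 3, 6]
print(group_head_rows(a, -2))  # [0, 3, 6, 8]
```

The printed positions match the index labels in the two outputs above. Group 0 has a single row, so head(-2) contributes nothing from it.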