fujiaxiang
Member

@fujiaxiang fujiaxiang commented Dec 26, 2019

When columns are integers, df.groupby(label).quantile(<arraylike>) fails.
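
A minimal sketch of the kind of call affected. The frame below is made up for illustration (the data and the choice of column `0` as the group key are assumptions, not taken from the linked issue). On a pandas version that includes this fix, the call should return one row per (group, quantile) pair:

```python
import pandas as pd

# Hypothetical data: the column labels are the integers 0, 1, 2,
# and column 0 doubles as the grouping key.
df = pd.DataFrame(
    [[1, 10, 100], [1, 20, 200], [2, 30, 300], [2, 40, 400]],
    columns=[0, 1, 2],
)

# List-like q: one result row per (group, quantile) combination.
result = df.groupby(0).quantile([0.25, 0.75])
print(result)
```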

@fujiaxiang
Member Author

@jreback I know this isn't exactly what you thought of per our discussion here (#30462). This is by far the cleanest implementation I can think of. Reason being:

  1. I'm hoping to keep the function call to _get_cythonized_result the same whether q is scalar or not. Hence we would end up with a list of DataFrames, one per quantile. Each may carry an Index or a MultiIndex, so I feel it is not so clean to build the index for them one by one.
  2. If we concatenate them first, using reorder_levels method seems the most natural way of doing things. There's no clean way (that I know of) to place the quantile index level inside before concatenating them.
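
The two points above can be sketched with public pandas API only. This is an illustration of the concatenate-then-reorder idea, not the code in the PR: the per-quantile frames come from scalar `quantile` calls rather than `_get_cythonized_result`, the data is made up, and `sort_index` stands in for whatever row reordering the real implementation uses.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1.0, 2.0, 3.0, 4.0]})
qs = [0.25, 0.75]

# One result frame per scalar quantile, each indexed by the group key.
per_q = [df.groupby("key").quantile(q) for q in qs]

# Concatenating with keys=qs puts the quantile in the OUTERMOST index level.
stacked = pd.concat(per_q, axis=0, keys=qs)

# Move the quantile level to the innermost position: (quantile, key) -> (key, quantile).
nlevels = stacked.index.nlevels
order = list(range(1, nlevels)) + [0]
result = stacked.reorder_levels(order).sort_index()
print(result)
```

Building `order` from `nlevels` keeps the same code path working whether each per-quantile frame carries a plain Index or a MultiIndex.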

On a side note, I refactored the original for-loop, which was slow and not very readable, and used numpy to construct the same array (the indices variable) instead.
Let me know what you think, and I will adjust accordingly!
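
For readers following along, here is a hedged sketch of how a row-order array like this can be built both ways. It assumes the concatenated result is laid out quantile-major (all groups for the first quantile, then all groups for the second, and so on); `n_q` and `n_groups` are illustrative names, and the loop is a reconstruction for comparison, not the original code.

```python
import numpy as np

n_q, n_groups = 2, 3  # hypothetical sizes: 2 quantiles, 3 groups

# Loop version: for each group, pick its row from every quantile block.
loop_indices = []
for g in range(n_groups):
    for i in range(n_q):
        loop_indices.append(i * n_groups + g)

# Vectorized version: lay positions out as (quantile, group), transpose, flatten.
indices = np.arange(n_q * n_groups).reshape(n_q, n_groups).T.flatten()

print(loop_indices)
print(indices.tolist())
```

Passing such an array to `DataFrame.take` would reorder the rows so that each group's quantiles sit next to each other.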

Contributor

@jreback jreback left a comment

@fujiaxiang looks pretty good, nice that you didn't have to do a major refactor

@jreback jreback added Bug Groupby MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Dec 26, 2019
@fujiaxiang fujiaxiang requested a review from jreback December 26, 2019 13:46
@fujiaxiang fujiaxiang requested a review from jreback December 26, 2019 16:16
@jreback jreback added this to the 1.0 milestone Dec 26, 2019
@fujiaxiang fujiaxiang requested a review from jreback December 27, 2019 00:53
@jreback jreback merged commit 8e9b3ee into pandas-dev:master Dec 27, 2019
@jreback
Contributor

jreback commented Dec 27, 2019

thanks @fujiaxiang nicely done! keep em coming!

@fujiaxiang fujiaxiang deleted the bug_groupby_quantile_listlike_q_and_int_columns branch December 27, 2019 16:42
AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019
keechongtan added a commit to keechongtan/pandas that referenced this pull request Dec 29, 2019
…ndexing-1row-df

* upstream/master: (333 commits)
  CI: troubleshoot Web_and_Docs failing (pandas-dev#30534)
  WARN: Ignore NumbaPerformanceWarning in test suite (pandas-dev#30525)
  DEPR: camelCase in offsets, get_offset (pandas-dev#30340)
  PERF: implement scalar ops blockwise (pandas-dev#29853)
  DEPR: Remove Series.compress (pandas-dev#30514)
  ENH: Add numba engine for rolling apply (pandas-dev#30151)
  [ENH] Add to_markdown method (pandas-dev#30350)
  DEPR: Deprecate pandas.np module (pandas-dev#30386)
  ENH: Add ignore_index for df.drop_duplicates (pandas-dev#30405)
  BUG: The setting xrot=0 in DataFrame.hist() doesn't work with by and subplots pandas-dev#30288 (pandas-dev#30491)
  CI: Fix GBQ Tests (pandas-dev#30478)
  Bug groupby quantile listlike q and int columns (pandas-dev#30485)
  ENH: Add ignore_index for df.sort_values and series.sort_values (pandas-dev#30402)
  TYP: Typing hints in pandas/io/formats/{css,csvs}.py (pandas-dev#30398)
  BUG: raise on non-hashable Index name, closes pandas-dev#29069 (pandas-dev#30335)
  Replace "foo!r" to "repr(foo)" syntax pandas-dev#29886 (pandas-dev#30502)
  BUG: preserve EA dtype in transpose (pandas-dev#30091)
  BLD: add check to prevent tempita name error, clsoes pandas-dev#28836 (pandas-dev#30498)
  REF/TST: method-specific files for test_append (pandas-dev#30503)
  marked unused parameters (pandas-dev#30504)
  ...
Development

Successfully merging this pull request may close these issues.

groupby.quantile(<arraylike>) fails with AssertionError