Pandas, Groupby Where Column Value Is Greater Than X
I have a table like this:
timestamp avg_hr hr_quality avg_rr rr_quality activity sleep_summary_id
1422404668 66 229 0 0 13
Solution 1:
The simplest thing to do here is to filter the df first and then perform the groupby:
filtered = df2[df2['rr_quality'] > 0]
filtered.groupby([filtered.index.hour, 'sleep_summary_id'])
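As a minimal sketch of filtering before grouping, assuming df2 has a DatetimeIndex derived from the timestamp column (the sample values below are made up for illustration; only the column names come from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "avg_hr": [66, 70, 64, 80],
        "rr_quality": [0, 160, 200, 150],
        "sleep_summary_id": [13, 13, 13, 14],
    },
    index=pd.to_datetime(
        ["2015-01-28 00:10", "2015-01-28 00:20", "2015-01-28 00:40", "2015-01-28 01:05"]
    ),
)

# Filter first, then build the hour key from the filtered frame's own index
# so the key has the same length as the rows being grouped.
filtered = df2[df2["rr_quality"] > 0]
hourly_mean = filtered.groupby([filtered.index.hour, "sleep_summary_id"])["avg_hr"].mean()
print(hourly_mean)
```

The row with rr_quality equal to 0 never reaches the groupby, so it cannot influence any group's mean.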
EDIT
If you're intending to assign this back to your original df:
mask = df2['rr_quality'] > 0
filtered = df2[mask]
df2.loc[mask, 'AVG_HR'] = filtered.groupby([filtered.index.hour, 'sleep_summary_id'])['avg_hr'].transform('mean')
The loc call will mask the lhs so that the result of the transform aligns correctly.
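To illustrate how the masked assignment lines up, here is a small sketch under the same assumptions (a DatetimeIndex and invented sample values). The name group_mean is introduced only for this example:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "avg_hr": [66, 70, 64, 80],
        "rr_quality": [0, 160, 200, 150],
        "sleep_summary_id": [13, 13, 13, 14],
    },
    index=pd.to_datetime(
        ["2015-01-28 00:10", "2015-01-28 00:20", "2015-01-28 00:40", "2015-01-28 01:05"]
    ),
)

mask = df2["rr_quality"] > 0
filtered = df2[mask]

# transform returns one value per filtered row and keeps the filtered index.
group_mean = filtered.groupby([filtered.index.hour, "sleep_summary_id"])["avg_hr"].transform("mean")

# Assigning through the same mask aligns on the index; rows outside the mask stay NaN.
df2.loc[mask, "AVG_HR"] = group_mean
print(df2)
```

Using one mask on both sides keeps the set of rows being written identical to the set of rows the means were computed from.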
To filter using multiple conditions you need to use the element-wise operators &, | and ~ in place of and, or and not respectively. Additionally, you need to wrap each condition in parentheses due to operator precedence:
df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
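A short sketch of the three operators on made-up data (the values and the variable names both, either and not_rr are illustrative only):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "hr_quality": [229, 150, 210, 250],
        "rr_quality": [0, 160, 200, 150],
    }
)

# Parentheses are required because & and | bind more tightly than >= and >.
both = df2[(df2["rr_quality"] >= 150) & (df2["hr_quality"] > 200)]
either = df2[(df2["rr_quality"] >= 150) | (df2["hr_quality"] > 200)]
not_rr = df2[~(df2["rr_quality"] >= 150)]

print(both.index.tolist())
print(either.index.tolist())
print(not_rr.index.tolist())
```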
Solution 2:
I know this is old, but I wanted to add that pandas has a built-in groupby method for this, filter. The function passed to it must return a single True or False per group. Adapting the example from the pandas documentation to your case:
grouped_df2 = df2.groupby([df2.index.hour, 'sleep_summary_id', 'rr_quality'])
grouped_df2.filter(lambda x: (x['rr_quality'] > 0).all())
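A minimal sketch of groupby filter on invented data, assuming a DatetimeIndex. Note that the callable has to return one boolean per group, and that a group failing the test is dropped as a whole, which differs from a row-wise boolean mask:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "avg_hr": [66, 70, 64, 80],
        "rr_quality": [0, 160, 200, 150],
        "sleep_summary_id": [13, 13, 13, 14],
    },
    index=pd.to_datetime(
        ["2015-01-28 00:10", "2015-01-28 00:20", "2015-01-28 00:40", "2015-01-28 01:05"]
    ),
)

grouped = df2.groupby([df2.index.hour, "sleep_summary_id"])

# .all() reduces the per-row comparison to one bool per group.
# The hour-0 group contains a row with rr_quality == 0, so the whole group is dropped.
kept = grouped.filter(lambda g: (g["rr_quality"] > 0).all())
print(kept)
```

If the goal is to discard individual rows rather than whole groups, the boolean mask from Solution 1 is the closer fit.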