I created a dataframe using groupby and pd.cut to calculate the mean, std, and number of elements in each bin. I used agg(), with this command:

df_bin = df.groupby(pd.cut(df.In_X, ranges, include_lowest=True)).agg(['mean', 'std', 'size'])

df_bin looks like this:

                 X                  Y
                 mean   std size   mean         std  size
In_X                    
(10.424, 10.43] 10.425  NaN  1      0.003786    NaN   1
(10.43, 10.435] 10.4    NaN  0      NaN         NaN   0
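
For anyone wanting to reproduce a frame of this shape, here is a minimal sketch. The data and the `ranges` edges are made up (the original values are not shown in the question), and the `[["X", "Y"]]` selection and `observed=False` are assumptions added so that the empty bin is kept and only X and Y are aggregated:

```python
import pandas as pd

# Made-up data, NOT the original values: no row falls into (10.43, 10.435].
df = pd.DataFrame({
    "In_X": [10.425, 10.436, 10.438],
    "X": [10.425, 10.436, 10.438],
    "Y": [0.003786, 0.004, 0.005],
})
ranges = [10.424, 10.43, 10.435, 10.44]  # hypothetical bin edges

# observed=False keeps bins that contain no rows; their mean and std come out as NaN.
bins = pd.cut(df.In_X, ranges, include_lowest=True)
df_bin = df.groupby(bins, observed=False)[["X", "Y"]].agg(["mean", "std", "size"])
print(df_bin)
```

The result has two column levels (X/Y on top, mean/std/size below), which is why selecting a single column pair returns only that one column.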

I want to drop only the rows where the mean is NaN and create a new df_bin without them. I've tried:

  df_bin=df_bin['X', 'mean'].dropna()

But this drops all the other columns of df_bin and keeps only one column.

ziulfer

1 Answer

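Pull out all the mean columns and keep only the rows where none of them is null. Here df_bin_temp is the aggregated frame, and in my output the second-level names are abbreviated to m, s, i: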

df_bin = df_bin_temp[df_bin_temp.loc[:, pd.IndexSlice[:, 'mean']].notnull().all(axis=1)]
        X                Y       
        m   s  i         m   s  i
0  10.425 NaN  1  0.003786 NaN  1
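
A self-contained sketch of the mask approach, in case it helps. The frame is rebuilt by hand from the two rows shown in the question (with a plain integer index instead of the interval index), so the construction is an assumption rather than the original data:

```python
import numpy as np
import pandas as pd

# Two-level columns like the groupby/agg result in the question.
cols = pd.MultiIndex.from_product([["X", "Y"], ["mean", "std", "size"]])
df_bin_temp = pd.DataFrame(
    [[10.425, np.nan, 1, 0.003786, np.nan, 1],
     [10.4, np.nan, 0, np.nan, np.nan, 0]],
    columns=cols,
)

# Select every column whose second level is "mean", whatever the first level is.
mean_cols = df_bin_temp.loc[:, pd.IndexSlice[:, "mean"]]

# True for rows where all selected mean columns are non-null.
mask = mean_cols.notnull().all(axis=1)

# Boolean indexing keeps all six columns and only filters the rows.
df_bin = df_bin_temp[mask]
print(df_bin)
```

The std columns are ignored by the mask, so a row with a single element (std is NaN) is still kept.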

Or use dropna with subset, so that only the mean columns are checked:

df_bin = df_bin_temp.dropna(subset=df_bin_temp.loc[:, pd.IndexSlice[:, 'mean']].columns)
        X                Y       
        m   s  i         m   s  i
0  10.425 NaN  1  0.003786 NaN  1
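
A similar sketch for the `dropna(subset=...)` variant, again on a hand-built frame that mirrors the question's two rows (an assumption, not the original data). Passing the subset as a list of column tuples is my choice here:

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product([["X", "Y"], ["mean", "std", "size"]])
df_bin_temp = pd.DataFrame(
    [[10.425, np.nan, 1, 0.003786, np.nan, 1],
     [10.4, np.nan, 0, np.nan, np.nan, 0]],
    columns=cols,
)

# Column labels of a MultiIndex are tuples, e.g. ("X", "mean").
subset = df_bin_temp.loc[:, pd.IndexSlice[:, "mean"]].columns.tolist()

# Drop a row only if one of the mean columns is NaN; other columns are not checked.
df_bin = df_bin_temp.dropna(subset=subset)

# For comparison: a plain dropna() also looks at std, so every row here is dropped.
df_all_dropped = df_bin_temp.dropna()
print(df_bin)
```

The comparison at the end shows why a bare `dropna()` is not enough for this frame: the std of a one-element bin is NaN too.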
BENY
  • @sammywemmy do small change from this https://stackoverflow.com/questions/45740537/copying-multiindex-dataframes-with-pd-read-clipboard – BENY May 23 '20 at 01:57
  • I've asked a question in this sense today, please have a look here: https://stackoverflow.com/questions/61965631/access-pandas-dataframe-column-with-two-header-pandas/61965685?noredirect=1#comment109597610_61965685 – ziulfer May 23 '20 at 01:58
  • @YOBEN_S I was using `dropna()` with a regular dataframe before; now, for the multiindex dataframe, I used your first suggestion and it is quite a bit faster. Thanks! – ziulfer May 23 '20 at 15:38