
Python Pandas Series If Else Box Plot

I have a lot of data in a dictionary format and I am attempting to use pandas to print a string based on an IF ELSE statement. For my example I'll make up some data in a dict and convert t

Solution 1:

Sure, you can call pandas.Series.plot.box(), for example df['a'].plot.box(), to get the boxplot of your column a.
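As a minimal sketch of that call: the data below is an assumption (the question's dict is not shown in full; the values for a and b are read off the describe() output further down), and the Agg backend with savefig is used only so the snippet runs without a display. In an interactive session you would call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's data (one column per zone).
df = pd.DataFrame({"a": [1.5, 2.8, 9.3], "b": [3.3, 4.9, 7.2]})

# Series.plot.box() draws a boxplot of a single column and returns the Axes.
ax = df["a"].plot.box()
ax.set_title("Zone a")

plt.savefig("zone_a_box.png")  # replace with plt.show() when working interactively
plt.close("all")
```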

To fit with your question I would have done this:

import matplotlib.pyplot as plt

# df is the DataFrame built from the dict in the question (one column per zone)
def _print(x):
    # x is a single column of df (a Series); x.name is the column name
    if (x < 4).any():
        print('Zone %s does not make setpoint' % x.name)
        x.plot.box()
        plt.show()
        print(x.describe())
    else:
        print('Zone %s is Normal' % x.name)
        print('The average is %s' % x.mean())
    print('---')

# apply passes each column to _print in turn
df.apply(_print)

[Image: extract of the output for zone B and zone C]

Note that .describe() gives you the statistics behind the boxplot (min, quartiles, max) along with the count, mean and standard deviation (see documentation).

Nevertheless, I would have approached the problem differently, following the solution proposed here.


Another solution

You can build a boolean mask that splits the columns of your dataframe into those that make the setpoint and those that do not:

s = df.apply(lambda x: not (x < 4).any())
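To make the mask concrete, here is a small sketch of what s contains. The data is an assumption: columns a, b and d are read off the describe() output shown later in this answer, while the values for c are made up so that their mean is 11.3. The vectorized form (df >= 4).all() is offered as an alternative, not as part of the original answer; it gives the same result only when there are no NaN values.

```python
import pandas as pd

# Hypothetical stand-in for the question's data.
df = pd.DataFrame({
    "a": [1.5, 2.8, 9.3],
    "b": [3.3, 4.9, 7.2],
    "c": [10.3, 11.3, 12.3],  # made-up values with mean 11.3
    "d": [1.1, 1.9, 2.9],
})

# True for a column that makes the setpoint (no value below 4).
s = df.apply(lambda x: not (x < 4).any())

# Alternative without apply; equivalent here because the data has no NaN.
s_alt = (df >= 4).all()

failing = s[~s].index.tolist()  # columns with at least one value below 4
passing = s[s].index.tolist()   # columns where every value is >= 4

print(failing)  # ['a', 'b', 'd']
print(passing)  # ['c']
```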

Then plot the boxes for the columns that don't make the setpoint.
Plot them all in one figure if their ranges are not too different and there are not too many zones:

df[s[~s].index].boxplot()
plt.show()


Or separate them:

for col in s[~s].index:
    df[col].plot.box()
    plt.show()

In both cases, you can get the statistics in a dataframe:

statdf = df[s[~s].index].describe()
print(statdf)

              a         b         d
count  3.000000  3.000000  3.000000
mean   4.533333  5.133333  1.966667
std    4.178915  1.960442  0.901850
min    1.500000  3.300000  1.100000
25%    2.150000  4.100000  1.500000
50%    2.800000  4.900000  1.900000
75%    6.050000  6.050000  2.400000
max    9.300000  7.200000  2.900000

This way you can get the stat (say 'mean' for instance) with statdf.loc['mean'].
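A short sketch of reading values back out of the describe() result. The data is an assumption reconstructed from the table above (the order of the values within each column is a guess), and median_b is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical data matching the statistics printed above.
df = pd.DataFrame({
    "a": [1.5, 2.8, 9.3],
    "b": [3.3, 4.9, 7.2],
    "d": [1.1, 1.9, 2.9],
})

statdf = df.describe()

# A whole row of the summary: one mean per column, as a Series named 'mean'.
means = statdf.loc["mean"]
print(means)

# A single cell: row label first, then column label.
median_b = statdf.loc["50%", "b"]
print(median_b)
```

The row labels are the strings shown in the left-hand column of the table ('count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max').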

If you want to print the mean of the columns that do make the setpoint:

print(df[s[s].index].mean())

c    11.3
dtype: float64

Solution 2:

I don't really know if this is what you are looking for, but you are asking:

I want to add in an extra to create a box plot

You are trying to do this with df.Series.plot.box(), which raises the error AttributeError: 'DataFrame' object has no attribute 'Series'.

Try using df.boxplot() instead; the plot is then displayed at each plt.show() call.
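The difference can be reproduced with a few lines. This is only a sketch: the data is made up, and the Agg backend with savefig is an assumption so the snippet runs without a display (use plt.show() in an interactive session).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's data.
df = pd.DataFrame({"a": [1.5, 2.8, 9.3], "b": [3.3, 4.9, 7.2]})

# Series is a class in the pandas namespace, not an attribute of a DataFrame,
# so this lookup fails before any plotting happens.
try:
    df.Series.plot.box()
except AttributeError as exc:
    message = str(exc)
    print(message)

# DataFrame.boxplot() draws one box per numeric column on a single Axes.
ax = df.boxplot()
plt.savefig("all_zones_box.png")  # replace with plt.show() when working interactively
plt.close("all")
```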
