Python Dataframes: Describing A Single Column
Is there a way to apply df.describe() to just a single column of a DataFrame? For example, if I have several columns and I use df.describe(), it returns and describes all the
Solution 1:
Just select the column by name in square brackets:
df['column_name'].describe()
Example:
To get a single column:
df['1']
To get several columns:
df[['1','2']]
To get a single row by name:
df.loc['B']
or by integer position:
df.iloc[0]
To get a specific field (row 'C' of column '1'):
df.loc['C', '1']
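As a minimal sketch of the selections above, the snippet below builds a small DataFrame and applies each one. The data, the column labels '1' and '2', and the row labels 'A' to 'C' are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data: column labels '1' and '2', row labels 'A', 'B', 'C'
df = pd.DataFrame(
    {'1': [10, 20, 30], '2': [1.5, 2.5, 3.5]},
    index=['A', 'B', 'C'],
)

col_stats = df['1'].describe()   # summary statistics for column '1' only
two_cols = df[['1', '2']]        # sub-DataFrame with several columns
row_b = df.loc['B']              # single row selected by label
first_row = df.iloc[0]           # single row selected by integer position
field = df.loc['C', '1']         # single value: row 'C', column '1'

print(col_stats)
```

The result of df['1'].describe() is a Series, so individual statistics can be read by label, for example col_stats['mean'].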
Solution 2:
import pandas as pd
data=pd.read_csv('data.csv')
data[['column1', 'column2', 'column3']].describe()
Solution 3:
import pandas as pd
data = pd.read_csv("ad.data", header=None)
data[111].describe()
Or, for example, to describe the last column:
last_column = data[data.columns[-1]]
last_column.describe()
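The integer in data[111] works because reading with header=None labels the columns 0, 1, 2, ... as integers rather than strings. A small sketch of this, using an in-memory string in place of a real file (the CSV content here is invented for illustration):

```python
import io
import pandas as pd

# Hypothetical headerless CSV content standing in for a file on disk
csv_text = "1,2.0,a\n3,4.0,b\n5,6.0,c\n"
data = pd.read_csv(io.StringIO(csv_text), header=None)

# With header=None the column labels are the integers 0, 1, 2
first_stats = data[0].describe()

# The last column can be selected without knowing its label
last_stats = data[data.columns[-1]].describe()

print(first_stats)
print(last_stats)
```

For a non-numeric column such as the last one here, describe() reports count, unique, top, and freq instead of the mean and quantiles.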
Solution 4:
With a PySpark DataFrame, you can describe a single column like this:
df.describe("col1").toPandas()
or several columns like this:
df.describe(["col1", "col2"]).toPandas()
Solution 5:
To get the result as a table (a DataFrame):
df[['column_name']].describe()
To get the result as a Series:
df['column_name'].describe()
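The difference between the two forms is only the type of the result. A minimal sketch, assuming a made-up DataFrame with a numeric column called column_name:

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'column_name': [1, 2, 3, 4], 'other': [5, 6, 7, 8]})

as_table = df[['column_name']].describe()   # double brackets -> DataFrame
as_series = df['column_name'].describe()    # single brackets -> Series

print(type(as_table).__name__)
print(type(as_series).__name__)
```

The statistics themselves are the same in both cases; as_table.loc['mean', 'column_name'] and as_series['mean'] return the same value.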