Understanding Bracket Filter Syntax In Pandas
How does the following filter out the results in pandas? For example, with this statement: df[['name', 'id', 'group']][df.id.notnull()] I get 426 rows (it filters out everything
Solution 1:
Not a full answer, just a breakdown of df.id[df.id == 458514].
df.id returns a Series with the contents of the column id.
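As a quick illustration, here is a minimal sketch showing that attribute access and bracket access give back the same Series. The rows are invented for the example; only the column names come from the question.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "name": ["Tom", "Jerry", "Carl"],
    "id": [458514.0, None, 7.0],
    "group": ["a", "b", "a"],
})

col = df.id  # attribute access, same column as df["id"]
print(type(col).__name__)    # Series
print(col.equals(df["id"]))  # True
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so df["id"] is the safer spelling in general.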
df.id[...] slices that Series with either 1) a boolean mask, 2) a single index label or a list of them, or 3) a slice of the form start:end:step. If it receives a boolean mask, the mask must have the same length as the Series being sliced. If it receives index label(s), it returns those specific rows. Slicing works just as with Python lists, but start and end can be integer locations or index labels (e.g. ['a':'e'] returns all rows in between, including 'e').
df.id[df.id == 458514] returns a filtered Series using your boolean mask, i.e. only the items where df.id equals 458514. It also works with other boolean masks, as in df.id[df.name == 'Carl'] or df.id[df.name.isin(['Tom', 'Jerry'])].
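To make the three cases concrete, here is a small sketch on a made-up DataFrame. The data and the string index are assumptions chosen so that labels and positions differ. The last two lines apply the same idea to the statement from the question and show what should be an equivalent single-step form with .loc.

```python
import pandas as pd

# Hypothetical data with a string index so labels and positions differ.
df = pd.DataFrame(
    {
        "name": ["Tom", "Jerry", "Carl", "Anna", "Bob"],
        "id": [458514.0, 12.0, None, 458514.0, 99.0],
    },
    index=["a", "b", "c", "d", "e"],
)

# 1) Boolean mask: keeps only the rows where the mask is True.
by_value = df.id[df.id == 458514]
by_name = df.id[df.name.isin(["Tom", "Jerry"])]

# 2) A single index label, or a list of labels.
one = df.id["b"]
several = df.id[["a", "d"]]

# 3) Slices: a label slice includes its end point, an integer slice does not.
by_label = df.id["a":"c"]
by_position = df.id[0:2]

# The statement from the question: pick columns first, then filter rows
# with a boolean mask built from the original frame.
chained = df[["name", "id"]][df.id.notnull()]

# Same result in one step with .loc (rows selected by mask, columns by name).
direct = df.loc[df.id.notnull(), ["name", "id"]]
```

In the chained form the row count is decided entirely by the mask: every row whose id is null is dropped, and the first pair of brackets only narrows the columns.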
Read more in pandas' intro to data structures.