
Understanding Bracket Filter Syntax In Pandas

How does the following filter the results in pandas? For example, with this statement: df[['name', 'id', 'group']][df.id.notnull()] I get 426 rows (it filters out everything

Solution 1:

Not a full answer, just a breakdown of df.id[df.id==458514]

  • df.id returns a series with the contents of column id
  • df.id[...] indexes that series with either 1) a boolean mask, 2) a single index label or a list of them, or 3) a slice in the form start:end:step. A boolean mask must have the same length as the series being sliced. Index label(s) return those specific rows. Slicing works much as with Python lists, and start and end can be integer positions or index labels; unlike a positional slice, a label slice includes its end point (e.g. ['a':'e'] returns all rows from 'a' through 'e', including 'e').
  • df.id[df.id==458514] returns the series filtered by your boolean mask, i.e. only the items where df.id equals 458514. The mask can also be built from another column, as in df.id[df.name == 'Carl'] or df.id[df.name.isin(['Tom', 'Jerry'])].
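
The three cases above can be tried on a small frame. This is only a sketch: the rows, the string index 'w' to 'z', and the variable names are made up for illustration, and only the column names and the id 458514 come from the question.

```python
import pandas as pd

# Hypothetical data: only the column names and the id 458514 come from the question.
df = pd.DataFrame(
    {
        "name": ["Tom", "Jerry", "Carl", "Ann"],
        "id": [458514.0, None, 100.0, 200.0],
        "group": ["a", "b", "a", "b"],
    },
    index=["w", "x", "y", "z"],
)

# 1) Boolean mask: a True/False series with the same index as df.id.
mask = df.id == 458514
by_mask = df.id[mask]            # keeps only the rows where mask is True

# 2) A list of index labels returns exactly those rows.
by_label = df.id[["w", "y"]]

# 3) A label slice includes its end label, unlike a Python list slice.
by_slice = df.id["w":"y"]

# A mask built from another column works the same way.
other_col = df.id[df.name.isin(["Tom", "Jerry"])]

# The statement from the question: select columns, then keep rows with a non-null id.
question_filter = df[["name", "id", "group"]][df.id.notnull()]

print(by_mask, by_label, by_slice, other_col, question_filter, sep="\n\n")
```

In the last statement the mask comes from df, not from the column subset, which works because both share the same index.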

Read more in pandas' intro to data structures
