
Python - Unnest Cells In Pandas Dataframe

Suppose I have DataFrame df:

a  b  c
v  f  3|4|5
v  2  6
v  f  4|5

I'd like to produce this df:

a  b  c
v  f  3
v  f  4
v  f  5
v  2  6
v  f  4
v  f  5

I know how to make this transformation in R, usi

Solution 1:

You could:

import numpy as np

df = df.set_index(['a', 'b'])
# Append '| ' (note the trailing space) so every cell ends in a ' ' piece
# that the replace() call below can match.
df = df.astype(str) + '| '
df = (df.c.str.split('|', expand=True)
        .stack()
        .reset_index(-1, drop=True)
        .replace(' ', np.nan)  # the same single space ' ' as above
        .dropna()
        .reset_index())

to get:

   a  b  0
0  v  f  3
1  v  f  4
2  v  f  5
3  v  2  6
4  v  f  4
5  v  f  5
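One plausible reason for the `astype(str)` cast above: if `c` holds a non-string value (for example the integer 6), the `.str` accessor returns NaN for that cell instead of splitting it. The sketch below illustrates this with a hypothetical Series `raw`; the mixed types are an assumption, since the question does not say how `c` is stored.

```python
import pandas as pd

# Hypothetical column with one integer cell (assumed, not taken from the question).
raw = pd.Series(["3|4|5", 6, "4|5"], name="c")

# Without a cast, the integer cell is not split; .str.split yields NaN there.
uncast = raw.str.split("|")

# After casting to str, every cell splits into a list of pieces.
cast = raw.astype(str).str.split("|")

print(uncast)
print(cast)
```

If `c` already contains only strings, the cast is harmless.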

Solution 2:

Option 1

In [3404]: (df.set_index(['a', 'b'])['c']
              .str.split('|', expand=True).stack()
              .reset_index(name='c').drop('level_2', axis=1))
Out[3404]:
   a  b  c
0  v  f  3
1  v  f  4
2  v  f  5
3  v  2  6
4  v  f  4
5  v  f  5
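The split-and-stack approach can be wrapped in a small reusable function. This is only a sketch: the helper name `unnest_stack`, its parameters, and the reconstructed input frame are assumptions, not part of the original answer. The extra `dropna()` is a precaution in case `stack()` keeps the padding values produced by `expand=True`.

```python
import pandas as pd

# Assumed reconstruction of the question's input, with all values as strings.
df = pd.DataFrame({
    "a": ["v", "v", "v"],
    "b": ["f", "2", "f"],
    "c": ["3|4|5", "6", "4|5"],
})

def unnest_stack(frame, col, sep="|"):
    """Return one row per `sep`-separated piece of `col` (hypothetical helper)."""
    keys = [name for name in frame.columns if name != col]
    pieces = (
        frame.set_index(keys)[col]
        .str.split(sep, expand=True)  # one column per piece, short rows padded
        .stack()                      # long format: one row per piece
        .dropna()                     # discard padding if stack() kept it
    )
    # Drop the innermost index level (the piece position), then turn the
    # key levels back into ordinary columns.
    return pieces.droplevel(-1).reset_index(name=col)

out = unnest_stack(df, "c")
print(out)
```

Using `droplevel(-1)` avoids depending on the auto-generated `level_2` column name.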

Option 2: Using repeat and loc

In [3503]: s = df.c.str.split('|')

In [3504]: df.loc[df.index.repeat(s.str.len())].assign(c=np.concatenate(s))
Out[3504]:
   a  b  c
0  v  f  3
0  v  f  4
0  v  f  5
1  v  2  6
2  v  f  4
2  v  f  5

Details

In [3505]: s
Out[3505]:
0    [3, 4, 5]
1          [6]
2       [4, 5]
Name: c, dtype: object
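As a self-contained sketch of the repeat-and-loc idea: each row label is repeated once per piece in its list, and the flattened pieces then overwrite `c`. The input frame is an assumed reconstruction of the question's data, and the names `counts` and `out` are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's input, with all values as strings.
df = pd.DataFrame({
    "a": ["v", "v", "v"],
    "b": ["f", "2", "f"],
    "c": ["3|4|5", "6", "4|5"],
})

s = df["c"].str.split("|")   # Series of lists, one list per original row
counts = s.str.len()         # number of pieces per row

# Repeat each row label `counts` times, select those rows, then replace `c`
# with the pieces flattened in the same row order.
out = df.loc[df.index.repeat(counts)].assign(c=np.concatenate(s.to_numpy()))

# Optional: swap the repeated labels (0, 0, 0, 1, 2, 2) for a fresh 0..n-1 index.
out = out.reset_index(drop=True)
print(out)
```

This variant keeps every other column of `df` without listing them, which may be convenient when the frame is wide.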
