
Alter Number String In Pandas Column

Background

I have a sample df with a Text column containing 0, 1, or more than 1 ABC's:

import pandas as pd
df = pd.DataFrame({'Text' : ['Jon J Mmith ABC: 1111111 is this here',

Solution 1:

If every run of digits in the column should be replaced, you can do:

df['Text_ABC'] = df['Text'].replace(r'\d+', '***BLOCK***', regex=True)

But if you want to be more specific and only replace the numerics after ABC:, then you can use this:

df['Text_ABC'] = df['Text'].replace(r'ABC: \d+', 'ABC: ***BLOCK***', regex=True)

Giving you:

df
                                                Text  P_ID N_ID                                           Text_ABC
0             Jon J Smith  ABC: 1111111isthis here     1   A1           Jon J Smith  ABC: ***BLOCK*** isthis here
1            ABC: 1234567 Mary Lisa Rider found here     2   A2          ABC: ***BLOCK*** Mary Lisa Rider found here
2                            Jane A Doe is also here     3   A3                            Jane A Doe is also here
3  ABC: 2222222 Tom T Tucker is here ABC: 2222222...     4   A4  ABC: ***BLOCK*** Tom T Tucker is here ABC: ***BLOCK...

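As a regex, \d+ means "match one or more consecutive digits", so using it within replace says "replace each run of one or more consecutive digits with ***BLOCK***". The regex=True argument is what makes replace treat the first argument as a pattern rather than as a literal value to match against the whole cell.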
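To see the difference between the two patterns side by side, here is a minimal sketch. The DataFrame is a made-up stand-in (not the original sample df), and the column name Text_all is an assumption added only for this illustration.

```python
import pandas as pd

# Hypothetical sample data (not the original df): rows with 0, 1, and 2 "ABC:" numbers.
df = pd.DataFrame({
    "Text": [
        "Jane A Doe is also here",
        "ABC: 1234567 Mary Lisa Rider found here",
        "ABC: 2222222 Tom T Tucker is here ABC: 3333333 too, room 12",
    ]
})

# Replace every run of digits, wherever it appears in the string.
df["Text_all"] = df["Text"].replace(r"\d+", "***BLOCK***", regex=True)

# Replace only digit runs that directly follow "ABC: ".
df["Text_ABC"] = df["Text"].replace(r"ABC: \d+", "ABC: ***BLOCK***", regex=True)

print(df["Text_all"].tolist())
print(df["Text_ABC"].tolist())
```

In this sketch the trailing "12" in the last row is blocked by the first pattern but left alone by the second, which is the practical reason to prefer the more specific pattern when other numbers in the text must survive.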
