Convert Columns Of Time In Minutes Format To Time In HH:MM:SS Format In Pandas

I am using a script to interpolate stop times from the format HH:MM:SS into minute int values. The script is as follows. # read in new csv file reindexed = pd.read_csv('output/sto

Solution 1:

Per the previous question, perhaps the best thing to do would be to keep the original HH:MM:SS strings:

So instead of

for col in ('arrival_time', 'departure_time'):
    df = reindexed[col].str.extract(
        r'(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)').astype('float')
    reindexed[col] = df['hour'] * 60 + df['minute']

use

for col in ('arrival_time', 'departure_time'):
    newcol = '{}_minutes'.format(col)
    df = reindexed[col].str.extract(
        r'(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)').astype('float')
    reindexed[newcol] = df['hour'] * 60 + df['minute']

Then you don't have to do any new calculations to recover the HH:MM:SS strings. reindexed['arrival_time'] will still be the original HH:MM:SS strings, and reindexed['arrival_time_minutes'] would be the time duration in minutes.
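As a minimal sketch of this approach, the snippet below runs the same loop on a small made-up DataFrame (the sample times are assumptions for illustration, not the asker's data) and shows that the original strings survive alongside the new minute columns:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's stop times
reindexed = pd.DataFrame({
    'arrival_time': ['09:08:48', '11:55:11'],
    'departure_time': ['09:10:00', '12:00:30'],
})

for col in ('arrival_time', 'departure_time'):
    newcol = '{}_minutes'.format(col)
    # Split each HH:MM:SS string into named hour/minute/second columns
    parts = reindexed[col].str.extract(
        r'(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)').astype('float')
    # Seconds are ignored here, as in the loop above
    reindexed[newcol] = parts['hour'] * 60 + parts['minute']

print(reindexed)
```

The original columns are untouched, so no reverse conversion is needed when the HH:MM:SS strings are wanted again.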


Building on Jianxun Li's solution, to chop off the microseconds, you could multiply the minutes by 60 and then call astype(int):

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.rand(3) * 1000, columns=['minutes'])
df['HH:MM:SS'] = pd.to_timedelta((60*df['minutes']).astype('int'), unit='s')

which yields

      minutes  HH:MM:SS
0  548.813504  09:08:48
1  715.189366  11:55:11
2  602.763376  10:02:45

Note that the df['HH:MM:SS'] column contains pd.Timedeltas:

In [240]: df['HH:MM:SS'].iloc[0]
Out[240]: Timedelta('0 days 09:08:48')

However, if you try to store this data in a csv

In [223]: df.to_csv('/tmp/out', date_format='%H:%M:%S')

you get:

,minutes,HH:MM:SS
0,548.813503927,0 days 09:08:48.000000000
1,715.189366372,0 days 11:55:11.000000000
2,602.763376072,0 days 10:02:45.000000000

If the minute values amount to a day or more (1440 minutes or more), you would also see a nonzero number of days as part of the timedelta string representation:

np.random.seed(0)
df = pd.DataFrame(np.random.rand(3) * 10000, columns=['minutes'])
df['HH:MM:SS'] = pd.to_timedelta((60*df['minutes']).astype('int'), unit='s')

yields

       minutes        HH:MM:SS
0  5488.135039 3 days 19:28:08
1  7151.893664 4 days 23:11:53
2  6027.633761 4 days 04:27:38

That might not be what you want. In that case, instead of

df['HH:MM:SS'] = pd.to_timedelta((60*df['minutes']).astype('int'), unit='s')

per Phillip Cloud's solution you could use

import operator
fmt = operator.methodcaller('strftime', '%H:%M:%S')
df['HH:MM:SS'] = pd.to_datetime(df['minutes'], unit='m').map(fmt)

The result looks the same, but now the df['HH:MM:SS'] column contains strings:

In [244]: df['HH:MM:SS'].iloc[0]
Out[244]: '09:08:48'

Note that this chops off (omits) both the whole days and the microseconds. Writing the DataFrame to a CSV

In [229]: df.to_csv('/tmp/out', date_format='%H:%M:%S')

now yields

,minutes,HH:MM:SS
0,548.813503927,09:08:48
1,715.189366372,11:55:11
2,602.763376072,10:02:45
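Because the strftime approach drops whole days, a value such as 1500 minutes would come out as 01:00:00 rather than 25:00:00. If hours past 24 should be kept, one possible alternative is to build the string with plain integer arithmetic. The helper name minutes_to_hms and the sample values below are assumptions for illustration, and the sketch assumes non-negative minute values:

```python
import pandas as pd

def minutes_to_hms(minutes):
    """Format a non-negative minute count as HH:MM:SS, letting hours exceed 23."""
    total_seconds = int(minutes * 60)          # truncate fractional seconds
    hours, remainder = divmod(total_seconds, 3600)
    mins, secs = divmod(remainder, 60)
    return '{:02d}:{:02d}:{:02d}'.format(hours, mins, secs)

# Hypothetical sample values; the second one is longer than a day
df = pd.DataFrame({'minutes': [548.5, 1500.25]})
df['HH:MM:SS'] = df['minutes'].map(minutes_to_hms)
print(df)
```

The resulting column holds plain strings, so it is written to a CSV unchanged.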

Solution 2:

You may want to consider using pd.to_timedelta.

import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame(np.random.rand(10) * 1000, columns=['time_in_minutes'])

Out[94]: 
   time_in_minutes
0         548.8135
1         715.1894
2         602.7634
3         544.8832
4         423.6548
5         645.8941
6         437.5872
7         891.7730
8         963.6628
9         383.4415

# As Jeff suggests, pd.to_timedelta is a very handy tool to do this
df['time_delta'] = pd.to_timedelta(df.time_in_minutes, unit='m')


Out[96]: 
   time_in_minutes      time_delta
0         548.8135 09:08:48.810235
1         715.1894 11:55:11.361982
2         602.7634 10:02:45.802564
3         544.8832 09:04:52.990979
4         423.6548 07:03:39.287960
5         645.8941 10:45:53.646784
6         437.5872 07:17:35.232675
7         891.7730 14:51:46.380046
8         963.6628 16:03:39.765630
9         383.4415 06:23:26.491129
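The timedeltas above carry fractional seconds. If whole seconds are enough, one option (a sketch, not part of the original answer) is to round down with the .dt.floor accessor method; the two sample values here are made up for illustration:

```python
import pandas as pd

# Hypothetical sample values in minutes
df = pd.DataFrame({'time_in_minutes': [548.8135, 1.5]})

# Convert to timedeltas, then round down to whole seconds
df['time_delta'] = pd.to_timedelta(df['time_in_minutes'], unit='m').dt.floor('s')
print(df)
```

The column still holds Timedelta values rather than strings, so the days component will reappear when it is written to a CSV.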
