Check For Any Missing Dates In The Index
Solution 1:
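You can use DatetimeIndex.difference(other):
pd.date_range(start='2013-01-19', end='2018-01-29').difference(df.index)
It returns the elements of the calling index that are not present in other. Here the calling index is the full date range, so the result is the set of dates missing from df.index.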
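If the start and end dates are not known in advance, one option is to take them from the index itself. This is only a minimal sketch, assuming a daily DatetimeIndex; the sample frame and the names full_range and missing are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: a daily index with 2013-01-21 .. 2013-01-27 absent.
idx = pd.to_datetime(['2013-01-19', '2013-01-20', '2013-01-28', '2013-01-29'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0, 4.0]}, index=idx)

# Build the full daily range spanned by the index, then drop the dates we already have.
full_range = pd.date_range(start=df.index.min(), end=df.index.max(), freq='D')
missing = full_range.difference(df.index)

print(missing)
```

Note that this can only find gaps inside the observed range; dates missing before the first or after the last row are not detected.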
Solution 2:
Example:
As a minimal example, take this:
>>> df
              GWA_BTC   GWA_ETH  GWA_LTC  GWA_XLM  GWA_XRP
Date
2013-01-19  11,826.36  1,068.45   195.00     0.51     1.82
2013-01-20  13,062.68  1,158.71   207.58     0.52     1.75
2013-01-28  12,326.23  1,108.90   197.36     0.48     1.55
2013-01-29  11,397.52  1,038.21   184.92     0.47     1.43
And we can find the missing dates between 2013-01-19 and 2013-01-29.
Method 1:
See @Vaishali's answer.
Use .difference to find the difference between your datetime index and the set of all dates within your range:
pd.date_range('2013-01-19', '2013-01-29').difference(df.index)
Which returns:
DatetimeIndex(['2013-01-21', '2013-01-22', '2013-01-23', '2013-01-24',
'2013-01-25', '2013-01-26', '2013-01-27'],
dtype='datetime64[ns]', freq=None)
Method 2:
You can reindex your dataframe using all dates within your desired date range, and find where reindex has inserted NaNs.
To find the missing dates between 2013-01-19 and 2013-01-29:
>>> df.reindex(pd.date_range('2013-01-19', '2013-01-29')).isnull().all(axis=1)
2013-01-19    False
2013-01-20    False
2013-01-21     True
2013-01-22     True
2013-01-23     True
2013-01-24     True
2013-01-25     True
2013-01-26     True
2013-01-27     True
2013-01-28    False
2013-01-29    False
Freq: D, dtype: bool
The rows with True are the missing dates in your original dataframe.
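To turn that boolean result into a plain list of dates, the mask can be used to filter itself. A rough sketch, assuming only two of the columns from the example above (the names all_days, is_missing and missing_dates are illustrative):

```python
import pandas as pd

# Two columns from the sample frame above, rebuilt so the snippet runs on its own.
idx = pd.to_datetime(['2013-01-19', '2013-01-20', '2013-01-28', '2013-01-29'])
df = pd.DataFrame({'GWA_BTC': [11826.36, 13062.68, 12326.23, 11397.52],
                   'GWA_ETH': [1068.45, 1158.71, 1108.90, 1038.21]}, index=idx)

all_days = pd.date_range('2013-01-19', '2013-01-29')

# Rows inserted by reindex contain NaN in every column.
is_missing = df.reindex(all_days).isnull().all(axis=1)

# Keep only the index labels where the mask is True.
missing_dates = is_missing[is_missing].index
print(missing_dates)
```

One caveat: a date that exists in the original dataframe but holds NaN in every column would also be flagged, so Method 1 is the safer choice when the data itself can be entirely empty for a day.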
Solution 3:
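Assuming the data is daily (calendar days, not business days) and the index is sorted, this flags every date that follows a gap of more than one day:
df.index.to_series().diff().dt.days > 1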
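The boolean result only marks the first date after each gap. One possible way to summarise where each gap starts and how long it is, sketched under the assumption of a sorted daily index (the sample data and the column names last_seen, next_seen and days_missing are invented for this example):

```python
import pandas as pd

# Hypothetical sample data with one gap between 2013-01-20 and 2013-01-28.
idx = pd.to_datetime(['2013-01-19', '2013-01-20', '2013-01-28', '2013-01-29'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0, 4.0]}, index=idx)

dates = df.index.to_series()
step = dates.diff()              # time since the previous row (NaT for the first row)
after_gap = step.dt.days > 1     # True for the first date after each gap

gaps = pd.DataFrame({
    'last_seen': dates.shift(1)[after_gap],       # last date before the gap
    'next_seen': dates[after_gap],                # first date after the gap
    'days_missing': step.dt.days[after_gap] - 1,  # number of absent days in between
})
print(gaps)
```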
Solution 4:
I can't post a comment, but you could probably traverse the values and add 24 hours to the previous one to see whether it matches the current date.
import pandas as pd
a = [1, 2, 3, 4, 5]
b = [1, 0.4, 0.3, 0.5, 0.2]
df = pd.DataFrame({'a': a, 'b': b})
for i in range(len(df)):
    if i == 0:
        continue  # the first row has no previous value
    prev = df.loc[i - 1, 'a']
    # Add 1 (one day, if 'a' held dates) to the previous value and compare it with the current value
    if df.loc[i, 'a'] != prev + 1:
        print('gap before row', i)
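The same idea applied to an actual DatetimeIndex might look like the sketch below. It assumes a sorted daily index; the sample data and the names one_day and gaps are illustrative and not part of the original answer:

```python
import pandas as pd

# Hypothetical sample data with one gap between 2013-01-20 and 2013-01-28.
idx = pd.to_datetime(['2013-01-19', '2013-01-20', '2013-01-28', '2013-01-29'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0, 4.0]}, index=idx)

one_day = pd.Timedelta(days=1)
gaps = []

# Walk over consecutive pairs of dates and record every pair that is not exactly one day apart.
for prev, curr in zip(df.index[:-1], df.index[1:]):
    if prev + one_day != curr:
        gaps.append((prev, curr))

print(gaps)
```

An explicit loop like this is easy to follow but will be slower than the vectorised approaches above on a large index.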