How to Convert a Pandas Timestamp to datetime

Converting Timestamp to datetime

When working with time series in Pandas, two commonly used functions are:

  • pd.to_datetime()
  • pd.date_range()

Generating a time index in pandas

import pandas as pd

# pd.date_range(): build a DatetimeIndex of 20 consecutive days
index = pd.date_range("20210101", periods=20)
index
Out[29]: 
DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
               '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
               '2021-01-09', '2021-01-10', '2021-01-11', '2021-01-12',
               '2021-01-13', '2021-01-14', '2021-01-15', '2021-01-16',
               '2021-01-17', '2021-01-18', '2021-01-19', '2021-01-20'],
              dtype='datetime64[ns]', freq='D')
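
As a small illustration of other ways to call pd.date_range(), the sketch below builds one range from a start and end date and another with a 2-day step. The variable names are made up for this example.

```python
import pandas as pd

# Range bounded by start and end; both endpoints are included by default.
by_end = pd.date_range(start="2021-01-01", end="2021-01-05")

# Fixed number of periods with a 2-day step between entries.
every_other_day = pd.date_range("2021-01-01", periods=3, freq="2D")

print(len(by_end))          # 5
print(every_other_day[-1])  # 2021-01-05 00:00:00
```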


# pd.to_datetime(): parse integers such as 20210101 as dates
df = pd.DataFrame(data=range(20210101, 20210128), columns=["period"])
df["aa"] = pd.to_datetime(df["period"], format="%Y%m%d")
df
Out[24]: 
      period         aa
0   20210101 2021-01-01
1   20210102 2021-01-02
2   20210103 2021-01-03
3   20210104 2021-01-04
4   20210105 2021-01-05
5   20210106 2021-01-06
6   20210107 2021-01-07
7   20210108 2021-01-08
8   20210109 2021-01-09
9   20210110 2021-01-10
10  20210111 2021-01-11
11  20210112 2021-01-12
12  20210113 2021-01-13
13  20210114 2021-01-14
14  20210115 2021-01-15
15  20210116 2021-01-16
16  20210117 2021-01-17
17  20210118 2021-01-18
18  20210119 2021-01-19
19  20210120 2021-01-20
20  20210121 2021-01-21
21  20210122 2021-01-22
22  20210123 2021-01-23
23  20210124 2021-01-24
24  20210125 2021-01-25
25  20210126 2021-01-26
26  20210127 2021-01-27
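
Real data often contains values that are not valid dates. The sketch below assumes you would rather get NaT for such values than an exception, and uses the errors="coerce" option of pd.to_datetime(). The sample values are invented for illustration.

```python
import pandas as pd

# The second value is not a real calendar date (January has no 32nd day).
raw = pd.Series(["20210101", "20210132", "20210103"])

# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(raw, format="%Y%m%d", errors="coerce")

print(parsed.isna().tolist())  # [False, True, False]
```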

index[1]
Out[30]: Timestamp('2021-01-02 00:00:00', freq='D')
df["aa"][1]
Out[31]: Timestamp('2021-01-02 00:00:00')
df["aa"][1] == index[1]
Out[32]: True

type(df["aa"][1])
Out[33]: pandas._libs.tslibs.timestamps.Timestamp
type(index[1])
Out[34]: pandas._libs.tslibs.timestamps.Timestamp

Timestamp and datetime

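As the code above shows, pandas represents a point in time as pandas._libs.tslibs.timestamps.Timestamp.

The time type commonly used in plain Python, however, is datetime.datetime. A Timestamp can be converted to it with: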

  • to_pydatetime()
from datetime import datetime
t = datetime(2021, 1, 2)
type(t)
Out[54]: datetime.datetime
t
Out[55]: datetime.datetime(2021, 1, 2, 0, 0)
r = index[1].to_pydatetime()
type(r)
Out[57]: datetime.datetime
t == r
Out[58]: True
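
To make the relationship between the two types concrete, here is a minimal sketch. It relies on the fact that pd.Timestamp is a subclass of datetime.datetime, while to_pydatetime() returns a plain standard-library object. The variable names are made up for this example.

```python
from datetime import datetime

import pandas as pd

ts = pd.Timestamp("2021-01-02 03:04:05")
py_dt = ts.to_pydatetime()

print(isinstance(ts, datetime))  # True: Timestamp subclasses datetime.datetime
print(type(py_dt) is datetime)   # True: a plain standard-library datetime
print(py_dt == ts)               # True: both describe the same instant
```

Note that datetime.datetime only stores microseconds, so a Timestamp carrying nonzero nanoseconds cannot be represented exactly after conversion.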

Converting a pandas Timestamp to datetime

In [11]: ts = pd.Timestamp('2014-01-23 00:00:00', tz=None)
In [12]: ts.to_pydatetime()
Out[12]: datetime.datetime(2014, 1, 23, 0, 0)

 

It's also available on a DatetimeIndex
rng = pd.date_range('1/10/2011', periods=3, freq='D')
rng.to_pydatetime()
Out[60]: 
array([datetime.datetime(2011, 1, 10, 0, 0),
       datetime.datetime(2011, 1, 11, 0, 0),
       datetime.datetime(2011, 1, 12, 0, 0)], dtype=object)
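
For a DataFrame column rather than an index, one simple option is to convert element by element. This is a sketch only: it rebuilds a small version of the "aa" column used earlier and uses a list comprehension, which is not the only way to do this.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame({"period": [20210101, 20210102, 20210103]})
df["aa"] = pd.to_datetime(df["period"], format="%Y%m%d")

# Iterating over a datetime64 column yields Timestamp objects,
# each of which can be converted individually.
py_datetimes = [ts.to_pydatetime() for ts in df["aa"]]

print(py_datetimes[0])                                 # 2021-01-01 00:00:00
print(all(type(d) is datetime for d in py_datetimes))  # True
```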

Extracting hours, minutes, and more from a pandas Timestamp

Official documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#from-timestamps-to-epoch

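I recently needed the number of minutes elapsed since 0:00 for timestamps within a day. After checking the documentation, I came up with the following approach.

Suppose the data is: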

In [64]: stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='h')
In [65]: stamps
Out[65]: 
DatetimeIndex(['2012-10-08 18:15:05', '2012-10-08 19:15:05',
               '2012-10-08 20:15:05', '2012-10-08 21:15:05'],
              dtype='datetime64[ns]', freq='H')

First, get the number of seconds since 1970-01-01:

In [66]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
Out[66]: Int64Index([1349720105, 1349723705, 1349727305, 1349730905], dtype='int64')
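
As a quick sanity check on the epoch conversion, the sketch below converts the seconds back into timestamps with pd.to_datetime(..., unit="s") and compares them with the originals. The variable names are illustrative.

```python
import pandas as pd

stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="h")

# Seconds elapsed since the Unix epoch, as whole numbers.
epoch_seconds = (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

# Interpret the integers as seconds since the epoch to recover the timestamps.
restored = pd.to_datetime(epoch_seconds, unit="s")

print((restored == stamps).all())  # True
```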

Take the remainder modulo one day (86400 seconds) to get the seconds since 0:00:

In [67]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400
Out[67]: Int64Index([65705, 69305, 72905, 76505], dtype='int64')

Divide by 60 to get the minutes since 0:00:

In [68]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 /60
Out[68]: Float64Index([1095.0833333333333, 1155.0833333333333, 1215.0833333333333,
              1275.0833333333333], dtype='float64')

Similarly, divide by 3600 to get the hours:

In [69]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 /3600
Out[69]: Float64Index([18.25138888888889, 19.25138888888889, 20.25138888888889,
              21.25138888888889], dtype='float64')

Use floor division to get the whole hour. There are, of course, other ways to get the whole hour as well.

In [70]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 //3600
Out[70]: Int64Index([18, 19, 20, 21], dtype='int64')
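
One of those other ways, sketched here as an alternative to the epoch arithmetic above rather than as the author's method, is to use the .hour accessor of DatetimeIndex, and to subtract each timestamp's midnight (obtained with normalize()) to get the time elapsed within the day.

```python
import pandas as pd

stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="h")

# Whole hour of each timestamp, read directly from the index.
hours = stamps.hour

# Time elapsed since midnight of the same day, as a TimedeltaIndex.
since_midnight = stamps - stamps.normalize()

# Fractional minutes since 0:00.
minutes = since_midnight.total_seconds() / 60

print(hours.tolist())  # [18, 19, 20, 21]
print(minutes[0])      # about 1095.083
```

The same accessors (.dt.hour, .dt.normalize()) are available on a datetime column of a DataFrame.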

