How to convert a pandas Timestamp to a datetime

2022-07-16 13:51:15
Contents
Converting a Timestamp to datetime
Generating a time index with pandas
Timestamp and datetime
Extracting hours, minutes, etc. from a pandas Timestamp
Sample data

Converting a Timestamp to datetime

When working with time series in pandas, the two functions used most often are:

    pd.to_datetime()
    pd.date_range()

Generating a time index with pandas

    import pandas as pd

    # pd.date_range()
    index = pd.date_range("20210101",periods=20)
    index
    Out[29]: 
    DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
                   '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
                   '2021-01-09', '2021-01-10', '2021-01-11', '2021-01-12',
                   '2021-01-13', '2021-01-14', '2021-01-15', '2021-01-16',
                   '2021-01-17', '2021-01-18', '2021-01-19', '2021-01-20'],
                  dtype='datetime64[ns]', freq='D')
    
    
    # pd.to_datetime()
    df = pd.DataFrame(data=range(20210101,20210128),columns=["period"])
    df["aa"] = pd.to_datetime(df["period"],format="%Y%m%d")
    df
    Out[24]: 
          period         aa
    0   20210101 2021-01-01
    1   20210102 2021-01-02
    2   20210103 2021-01-03
    3   20210104 2021-01-04
    4   20210105 2021-01-05
    5   20210106 2021-01-06
    6   20210107 2021-01-07
    7   20210108 2021-01-08
    8   20210109 2021-01-09
    9   20210110 2021-01-10
    10  20210111 2021-01-11
    11  20210112 2021-01-12
    12  20210113 2021-01-13
    13  20210114 2021-01-14
    14  20210115 2021-01-15
    15  20210116 2021-01-16
    16  20210117 2021-01-17
    17  20210118 2021-01-18
    18  20210119 2021-01-19
    19  20210120 2021-01-20
    20  20210121 2021-01-21
    21  20210122 2021-01-22
    22  20210123 2021-01-23
    23  20210124 2021-01-24
    24  20210125 2021-01-25
    25  20210126 2021-01-26
    26  20210127 2021-01-27
    
    index[1]
    Out[30]: Timestamp('2021-01-02 00:00:00', freq='D')
    df["aa"][1]
    Out[31]: Timestamp('2021-01-02 00:00:00')
    df["aa"][1] == index[1]
    Out[32]: True
    
    type(df["aa"][1])
    Out[33]: pandas._libs.tslibs.timestamps.Timestamp
    type(index[1])
    Out[34]: pandas._libs.tslibs.timestamps.Timestamp
    

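Timestamp and datetime

As the code above shows, pandas represents a point in time as pandas._libs.tslibs.timestamps.Timestamp (exposed publicly as pd.Timestamp).

The type most commonly used in plain Python, however, is datetime.datetime.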

      # to_pydatetime()
      from datetime import datetime
      t = datetime(2021,1,2)
      type(t)
      Out[54]: datetime.datetime
      t
      Out[55]: datetime.datetime(2021, 1, 2, 0, 0)
      r = (index[1].to_pydatetime())
      type(r)
      Out[57]: datetime.datetime
      t == r
      Out[58]: True
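The comparison above works without any conversion because pd.Timestamp is a subclass of datetime.datetime. A minimal sketch of what to_pydatetime() actually changes follows. The variable names are illustrative only, and the nanosecond example assumes a pandas version whose to_pydatetime() accepts the warn argument.

```python
from datetime import datetime

import pandas as pd

ts = pd.Timestamp("2021-01-02")

# A Timestamp already passes isinstance checks for datetime.datetime ...
is_datetime_subclass = isinstance(ts, datetime)

# ... but its exact type is pandas' own class, not datetime.datetime.
exact_type_matches = type(ts) is datetime

# to_pydatetime() returns a plain datetime.datetime object.
py_dt = ts.to_pydatetime()

# datetime.datetime only has microsecond resolution, so nanoseconds are dropped.
# warn=False silences the warning pandas would otherwise emit for this loss.
ts_ns = pd.Timestamp("2021-01-02 00:00:00.000000001")
py_dt_truncated = ts_ns.to_pydatetime(warn=False)

print(is_datetime_subclass, exact_type_matches, type(py_dt) is datetime)
print(ts_ns.nanosecond, py_dt_truncated)
```

In practice the explicit conversion matters mostly when a library checks the exact type, or when the nanosecond component must not be silently carried along.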

Converting a pandas Timestamp to datetime

      In [11]: ts = pd.Timestamp('2014-01-23 00:00:00', tz=None)
      In [12]: ts.to_pydatetime()
      Out[12]: datetime.datetime(2014, 1, 23, 0, 0)

       

to_pydatetime() is also available on a DatetimeIndex:
      rng = pd.date_range('1/10/2011', periods=3, freq='D')
      rng.to_pydatetime()
      Out[60]: 
      array([datetime.datetime(2011, 1, 10, 0, 0),
             datetime.datetime(2011, 1, 11, 0, 0),
             datetime.datetime(2011, 1, 12, 0, 0)], dtype=object)
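The same idea can be applied to a DataFrame column such as the "aa" column built earlier. The sketch below is one possible way to do it, not the only one: it wraps the column in a DatetimeIndex, or iterates over it element by element. The names as_array and as_list are made up for this example, and the explicit astype(str) is a defensive assumption rather than something the original code needed.

```python
from datetime import datetime

import pandas as pd

# A small frame shaped like the one used earlier in the article.
df = pd.DataFrame({"period": range(20210101, 20210104)})
df["aa"] = pd.to_datetime(df["period"].astype(str), format="%Y%m%d")

# Option 1: wrap the column in a DatetimeIndex and convert in one call.
# The result is a NumPy object array of datetime.datetime values.
as_array = pd.DatetimeIndex(df["aa"]).to_pydatetime()

# Option 2: iterate over the column; each element is a pd.Timestamp.
as_list = [ts.to_pydatetime() for ts in df["aa"]]

print(as_array)
print(as_list)
```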

Extracting hours, minutes, etc. from a pandas Timestamp

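I recently needed the number of minutes between a timestamp and 0:00 of the same day. After looking through the documentation, I came up with the following approach.

Suppose the data is: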

      In [64]: stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='h')
      In [65]: stamps
      Out[65]: 
      DatetimeIndex(['2012-10-08 18:15:05', '2012-10-08 19:15:05',
                     '2012-10-08 20:15:05', '2012-10-08 21:15:05'],
                    dtype='datetime64[ns]', freq='H')

First, get the number of seconds elapsed since 1970-01-01:

      In [66]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
      Out[66]: Int64Index([1349720105, 1349723705, 1349727305, 1349730905], dtype='int64')
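As a quick sanity check on this step, the integers can be turned back into timestamps with pd.to_datetime(..., unit="s"). This is only a sketch: epoch_seconds and round_trip are names chosen for the example, and the printed index type depends on the pandas version (older releases show Int64Index, newer ones a plain Index).

```python
import pandas as pd

stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="h")

# Subtracting the epoch gives a TimedeltaIndex; floor-dividing by one second
# turns each timedelta into a whole number of seconds.
epoch_seconds = (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

# Interpreting the integers as seconds since the epoch recovers the timestamps.
round_trip = pd.to_datetime(epoch_seconds, unit="s")

print(list(epoch_seconds))
print(list(round_trip) == list(stamps))
```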

Take the remainder modulo one day (86400 seconds) to get the number of seconds since 0:00:

      In [67]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400
      Out[67]: Int64Index([65705, 69305, 72905, 76505], dtype='int64')

Divide by 60 to get the number of minutes since 0:00:

      In [68]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 /60
      Out[68]: Float64Index([1095.0833333333333, 1155.0833333333333, 1215.0833333333333,
                             1275.0833333333333], dtype='float64')

In the same way, dividing by 3600 gives the number of hours:

      In [69]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 /3600
      Out[69]: Float64Index([18.25138888888889, 19.25138888888889, 20.25138888888889,
                             21.25138888888889], dtype='float64')

Use floor division to get the whole number of hours (there are, of course, other ways to do this):

      In [70]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 //3600
      Out[70]: Int64Index([18, 19, 20, 21], dtype='int64')
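One of those other ways, sketched here as an alternative rather than as the author's method, is to read the hour and minute attributes of the DatetimeIndex directly, or to subtract midnight of the same day obtained with normalize(). The variable names below are illustrative only.

```python
import pandas as pd

stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="h")

# Component accessors return the calendar fields directly.
hours = stamps.hour
whole_minutes = stamps.hour * 60 + stamps.minute

# normalize() sets the time part to 0:00, so the difference is the
# time elapsed since midnight as a TimedeltaIndex.
since_midnight = stamps - stamps.normalize()
seconds = since_midnight // pd.Timedelta("1s")    # whole seconds
minutes = since_midnight / pd.Timedelta("1min")   # fractional minutes

print(list(hours))
print(list(whole_minutes))
print(list(seconds))
print(list(minutes))
```

Subtracting normalize() avoids the detour through the 1970 epoch and states the intent (time since midnight) more directly.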

The above is based on personal experience; I hope it serves as a useful reference.