Channable

heliclockter: "Time flies when you are having zones!"

November 22, 2022

Summary: We created our own Python library for reliably handling timestamps that follow the different formats and conventions of numerous marketplaces around the world.

Intro

One of our big challenges when processing orders or returns for customers and marketplaces all around the world is working with dates and times. Comparing, exchanging and storing timestamps is error-prone, most notably because of differences between timezones, daylight saving time and timestamp formats. Because we integrate with marketplaces all over the world, we deal with a large number of different datetime conventions. Additionally, we had some tables in Postgres with columns of the timestamp type instead of timestamptz. This is problematic because the timezone information of aware timestamps is lost when they are inserted into the database, which requires even more diligence in the application layer to ensure that all timestamps are converted to the same timezone before inserting.

Due to the above we were unable to consistently persist an ordered list of events. To illustrate: an order created in UTC+8 could be shown as if it was created later than an order that was actually created 6 hours later, but in a UTC+1 timezone. Or an order that was created at 10:00 UTC+2 could be shown as if it was created later than an order that was created at 11:00 UTC+2 from a different marketplace because the marketplaces differed in the way they relayed the timestamps. This was confusing for our customers because with a high enough volume of orders or returns, newer ones could get lost in the overview because they were inserted deep in the list.
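To make the first scenario concrete, here is a minimal sketch using only the standard library. The two orders and their timestamps are made up for illustration, and dropping the offset with replace(tzinfo=None) is only an approximation of timezone information being lost on insert.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical orders: A is created first, B six hours later in absolute time.
utc_plus_8 = timezone(timedelta(hours=8))
utc_plus_1 = timezone(timedelta(hours=1))

order_a = datetime(2022, 11, 22, 18, 0, tzinfo=utc_plus_8)  # 10:00 in UTC
order_b = datetime(2022, 11, 22, 17, 0, tzinfo=utc_plus_1)  # 16:00 in UTC

# Dropping the offset leaves only the local wall-clock time.
naive_a = order_a.replace(tzinfo=None)
naive_b = order_b.replace(tzinfo=None)

# Sorting on wall-clock time puts B before A, which is chronologically wrong.
naive_order = sorted([("A", naive_a), ("B", naive_b)], key=lambda item: item[1])

# Converting to UTC before sorting restores the true order.
utc_order = sorted(
    [
        ("A", order_a.astimezone(timezone.utc)),
        ("B", order_b.astimezone(timezone.utc)),
    ],
    key=lambda item: item[1],
)
```

With these values, naive_order lists B before A, while utc_order lists A before B.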

We needed a consistent and robust way of converting between all the input timestamp conventions and our internal modeling, so that we can chronologically sort orders and returns based on their absolute creation time in UTC+0, irrespective of where in the world they were created and which datetime standard was used. We also needed the solution to be statically type checkable with mypy, to help prevent us from accidentally releasing code that is not timezone-aware.

Existing solutions

Python has a built-in library, datetime, which is widely used to handle timestamps, but it does not enforce timezone-awareness. A TypeError is raised at runtime if a naive and an aware datetime object are compared, but mypy cannot statically differentiate between the two. Arrow and pendulum are two well-known third-party libraries for dealing with datetimes in Python. While they offer significant advantages over the built-in datetime library, they were not suitable for us for two main reasons:

  1. They assume UTC+0 by default when no timezone is given. We want to be more explicit about the assumptions and raise an error if a naive timestamp is being parsed without explicitly stating the assumed timezone.
  2. When statically type checking you cannot differentiate between timestamps in different timezones.

We found one other module which tries to enforce timezone-aware datetimes, datetimeutc. As the name suggests, it enforces UTC+0 on all datetime objects, but it is too limited in scope to be useful in our case, and it is not type checkable because it has no type hints. Additionally, it has not been updated since 2016 and is likely no longer maintained.
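The gap in the built-in datetime module described in this section can be illustrated with a short sketch. The function is_before is a hypothetical example, not part of any library.

```python
from datetime import datetime, timezone

naive = datetime(2022, 11, 22, 10, 0)
aware = datetime(2022, 11, 22, 10, 0, tzinfo=timezone.utc)


def is_before(a: datetime, b: datetime) -> bool:
    # Both arguments have the same static type, so mypy accepts any mix
    # of naive and aware values here.
    return a < b


try:
    is_before(naive, aware)
    comparison_failed = False
except TypeError:
    # Ordering a naive and an aware datetime only fails at runtime.
    comparison_failed = True
```

The call type checks without complaint, and the problem only surfaces once the comparison actually runs.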

Our original solution: NewType

The main goal was to enforce a robust interface between the input datetimes and our internal modeling with support for static type checking. Therefore, we created our own heliclockter module. We used NewType to allow for static type checking, as well as utilities to ensure the datetimes were in fact timezone-aware. For instance:

import datetime
from typing import NewType

# A `datetime_tz` is just a guaranteed timezone-aware `datetime.datetime`.
datetime_tz = NewType('datetime_tz', datetime.datetime)

def naive_datetime_to_datetime_tz(
    dt: datetime.datetime, timezone: datetime.tzinfo
) -> datetime_tz:
    """
    Converts a non timezone-aware `datetime.datetime` object into a `datetime_tz`
    object. Use this function whenever you have a naive `datetime.datetime`.
    """
    aware_dt = dt.replace(tzinfo=timezone)
    # `assert_aware_datetime` is a helper from the same module (not shown here).
    assert_aware_datetime(aware_dt)
    return datetime_tz(aware_dt)

datetime_local and datetime_utc were also declared and implemented to always enforce the local and UTC+0 timezone respectively. datetime_local was used to try to ensure that all datetimes inserted in the database were in the same timezone as the ones inserted previously. This later became less important, as we migrated all timestamp columns to timestamptz after we created dbcritic and started using it on all of our databases.

Together with an import linter, which ensured that the built-in datetime module as well as arrow and pendulum could not be imported anywhere in the project, this solution worked quite well. The import linter mitigated human error by forcing all developers to add any missing functionality to the heliclockter module, and the NewType objects allowed static type checking with mypy.
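The NewType approach described in this section can be tried out with the following self-contained sketch. The body of assert_aware_datetime is a hypothetical stand-in, since the real helper is not shown in this post, and to_datetime_tz and store_event are illustrative names.

```python
import datetime
from typing import NewType

datetime_tz = NewType('datetime_tz', datetime.datetime)


def assert_aware_datetime(dt: datetime.datetime) -> None:
    # Hypothetical stand-in for the real helper.
    # `utcoffset()` returns None for naive datetimes.
    assert dt.tzinfo is not None and dt.utcoffset() is not None, 'must be aware'


def to_datetime_tz(dt: datetime.datetime) -> datetime_tz:
    # The only sanctioned way to obtain a `datetime_tz` in this sketch.
    assert_aware_datetime(dt)
    return datetime_tz(dt)


def store_event(created_at: datetime_tz) -> str:
    # Only values that went through the checked constructor type check here.
    return created_at.isoformat()


aware = to_datetime_tz(
    datetime.datetime(2022, 11, 22, 10, 0, tzinfo=datetime.timezone.utc)
)
stored = store_event(aware)

# mypy rejects the next line, although it would run without error:
# store_event(datetime.datetime(2022, 11, 22, 10, 0))
```

Note that NewType exists only for the type checker: at runtime, aware is still a plain datetime.datetime, so nothing stops unchecked code from bypassing the helper.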

Introducing pydantic

The NewType solution worked well for our existing setup, but we wanted to start using pydantic to model external data to make use of the validation logic it offers. Therefore, we needed to rethink the solution to work with pydantic’s validators.

Current solution: Subclassing

By subclassing datetime.datetime, we can implement __get_validators__() directly in the timezone-aware classes. To ensure the three classes, datetime_tz, datetime_utc and datetime_local, enforce their respective timezone constraints, __init__() has been added to do an additional assertion, such as:

import datetime as _datetime
from typing import ClassVar, Optional
from zoneinfo import ZoneInfo

class datetime_tz(_datetime.datetime):
    """
    A `datetime_tz` is just a guaranteed timezone-aware `datetime.datetime`.
    """

    assumed_timezone_for_timezone_naive_input: ClassVar[Optional[ZoneInfo]] = None

    def __init__(  # pylint: disable=unused-argument
        self,
        year: int,
        month: int,
        day: int,
        hour: int = 0,
        minute: int = 0,
        second: int = 0,
        microsecond: int = 0,
        tzinfo: Optional[_datetime.tzinfo] = None,
    ) -> None:
        msg = f'{self.__class__} must have a timezone'
        assert tzinfo is not None and self.tzinfo is not None, msg
        tz_expected = self.assumed_timezone_for_timezone_naive_input or tzinfo

        msg = f'{self.__class__} got invalid timezone {self.tzinfo!r}'
        assert self.tzinfo == tz_expected, msg

        self.assert_aware_datetime(self)

This construction has two primary benefits:

  1. Static type checking with mypy is still possible, because of class inheritance. We can therefore know if a timestamp is in the local, UTC+0 or any timezone based on the exact class instance.
  2. It is runtime enforceable. A new instance of e.g. datetime_utc cannot be created if the timezone is wrong.
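Both benefits can be seen in a stripped-down sketch. The classes aware_datetime and utc_datetime below are hypothetical, simplified stand-ins for datetime_tz and datetime_utc, not the actual heliclockter implementation, and publish is an illustrative function.

```python
import datetime as _datetime


class aware_datetime(_datetime.datetime):
    """Hypothetical, simplified stand-in for `datetime_tz`."""

    def __init__(self, *args: object, **kwargs: object) -> None:
        # `datetime.__new__` has already built the value; only validate it here.
        assert self.tzinfo is not None, f'{type(self).__name__} must have a timezone'


class utc_datetime(aware_datetime):
    """Hypothetical, simplified stand-in for `datetime_utc`."""

    def __init__(self, *args: object, **kwargs: object) -> None:
        super().__init__(*args, **kwargs)
        assert self.utcoffset() == _datetime.timedelta(0), 'timezone must be UTC'


def publish(at: utc_datetime) -> str:
    # mypy only accepts `utc_datetime` here, not a plain or merely aware datetime.
    return at.isoformat()


published = publish(
    utc_datetime(2022, 11, 22, 10, 0, tzinfo=_datetime.timezone.utc)
)

try:
    # Constructing a "UTC" value with a UTC+1 offset fails at runtime.
    utc_datetime(
        2022, 11, 22, 10, 0,
        tzinfo=_datetime.timezone(_datetime.timedelta(hours=1)),
    )
    rejected = False
except AssertionError:
    rejected = True
```

Unlike with NewType, the guarantee here travels with the object itself, because the check runs whenever an instance is constructed.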

Enforcing timezone-awareness in Pydantic

The runtime enforceability also extends to pydantic models because we declare __get_validators__() in the datetime_tz class as:

@classmethod
def __get_validators__(cls) -> Iterator[Callable[[Any], Optional[datetime_tz]]]:
    yield cls._validate

@classmethod
def _validate(cls: Type[DateTimeTzT], v: Any) -> Optional[DateTimeTzT]:
    return cls.from_datetime(parse_datetime(v)) if v else None

Here, from_datetime is an extra utility that instantiates a timezone-aware subclass of datetime.datetime and asserts the timezone-awareness at runtime. Thus, when writing pydantic models like the following, bar is guaranteed to be a timezone-aware datetime object in the UTC+0 timezone at runtime:

from pydantic import BaseModel
from heliclockter import datetime_utc

class Foo(BaseModel):
    bar: datetime_utc

The pydantic validation logic is only added if pydantic is also available at runtime. Heliclockter can therefore be used, even if you are not using pydantic.

Manually creating datetime objects

Additionally, manually creating objects in the correct timezone is much simpler than before. To create a new datetime_utc instance with the current time, we used to do the following:

datetime_utc(datetime_to_datetime_tz(datetime.datetime.now(datetime.timezone.utc)))

With the subclasses, we can now simply write:

datetime_utc.now()

This still ensures that the returned instance is in fact an instance of datetime_utc, because now() from the datetime.datetime class has also been overridden to add some extra conversions and assertions:

@classmethod
def from_datetime(cls: Type[DateTimeTzT], dt: _datetime.datetime) -> DateTimeTzT:
    # Case datetime is naive and there is no assumed timezone.
    if dt.tzinfo is None and cls.assumed_timezone_for_timezone_naive_input is None:
        raise DatetimeTzError(
            'Cannot create aware datetime from naive if no tz is assumed'
        )

    # Case: datetime is naive, but the timezone is assumed.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=cls.assumed_timezone_for_timezone_naive_input)

    # Case: datetime is aware and the timezone is assumed, enforce that timezone.
    elif (assumed_tz := cls.assumed_timezone_for_timezone_naive_input) is not None:
        dt = dt.astimezone(assumed_tz)

    cls.assert_aware_datetime(dt)
    return cls(
        year=dt.year,
        month=dt.month,
        day=dt.day,
        hour=dt.hour,
        minute=dt.minute,
        second=dt.second,
        microsecond=dt.microsecond,
        tzinfo=dt.tzinfo,  # type: ignore[arg-type]
    )

@classmethod
def now(cls: Type[DateTimeTzT], tz: Optional[_datetime.tzinfo] = None) -> DateTimeTzT:
    tz = cls.assumed_timezone_for_timezone_naive_input or tz
    if tz is None:
        raise DatetimeTzError(
            'Must override assumed_timezone_for_timezone_naive_input '
            'or give a timezone when calling now'
        )
    return cls.from_datetime(_datetime.datetime.now(tz))
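The three cases handled by from_datetime above can be explored in isolation with a plain function. This is a sketch that mirrors the branching only: normalize is a hypothetical name, and it raises ValueError instead of the module's DatetimeTzError to stay self-contained.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def normalize(dt: datetime, assumed_tz: Optional[tzinfo]) -> datetime:
    # Naive input and no assumed timezone: refuse to guess.
    if dt.tzinfo is None and assumed_tz is None:
        raise ValueError('Cannot create aware datetime from naive if no tz is assumed')
    # Naive input with an assumed timezone: attach it, wall-clock time unchanged.
    if dt.tzinfo is None:
        return dt.replace(tzinfo=assumed_tz)
    # Aware input with an assumed timezone: convert, the instant is unchanged.
    if assumed_tz is not None:
        return dt.astimezone(assumed_tz)
    # Aware input and no assumed timezone: keep as-is.
    return dt


utc = timezone.utc
plus_two = timezone(timedelta(hours=2))

labelled = normalize(datetime(2022, 11, 22, 10, 0), utc)
converted = normalize(datetime(2022, 11, 22, 10, 0, tzinfo=plus_two), utc)
untouched = normalize(datetime(2022, 11, 22, 10, 0, tzinfo=plus_two), None)
```

The difference between the second and third case matters: replace() only attaches a timezone to the existing wall-clock time, while astimezone() shifts the wall-clock time so that the same instant is expressed in the target timezone. Here, labelled stays at 10:00 in UTC, while converted becomes 08:00 in UTC.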

Expanding the module

Adding new classes that enforce a certain timezone is simple. All you have to do is declare the assumed_timezone_for_timezone_naive_input class variable and give the class a suitable name. For example, a class that enforces the ‘CET’ timezone can be added in the following way:

from zoneinfo import ZoneInfo
from heliclockter import datetime_tz

class datetime_cet(datetime_tz):
    """
    A `datetime_cet` is a `datetime_tz` guaranteed to be in the 'CET' timezone.
    """
    assumed_timezone_for_timezone_naive_input = ZoneInfo('CET')

Enforcing only using heliclockter

At Channable we use this module in combination with an import linter to enforce that the built-in datetime and the third party libraries arrow and pendulum are not imported anywhere else. An example lint configuration looks like this:

[importlinter]
root_package = my_project
include_external_packages = True

[importlinter:contract:1]
name=The `datetime_tz` module should be used to handle dates and times.
type=forbidden
source_modules =
    my_project
forbidden_modules =
    datetime
    pendulum
    arrow

Conclusion

Working with datetimes in Python is hard, as static type checking of timezone information is lacking and input formats can vary widely depending on the origin of the timestamp. Our original implementation with NewType was not adequate as our use case changed when pydantic was introduced to the codebase. Switching to inheriting datetime.datetime combined with having the pydantic validator built into the class itself, as well as prohibiting importing the regular datetime, arrow or pendulum modules anywhere in the codebase, has greatly improved the consistency and reliability of all datetime behavior across the entire system.

About the name

heliclockter is a play on the words "clock" and "helicopter". The module aims to guide users and help them make little to no mistakes when handling datetimes, just like a helicopter parent strictly supervises their children.

We are also happy to announce that today we are releasing heliclockter as open source.

Peter Nilsson, Lead Development
