Ticket #202 (new enhancement)
cannot roundtrip offset-aware datetime instances
| Reported by: | aaugustin | Owned by: | xi |
|---|---|---|---|
| Priority: | normal | Component: | pyyaml |
| Severity: | normal | Keywords: | |
| Cc: | matt@… |
Description
I'd expect that yaml.load(yaml.dump(foo)) == foo for reasonable values of foo.
However, this isn't true for timezone-aware datetimes:
    >>> import datetime
    >>> from pytz import utc
    >>> import yaml
    >>> dt = datetime.datetime(2011, 9, 1, 10, 20, 30, 405060, tzinfo=utc)
    >>> yaml.load(yaml.dump(dt)) == dt
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't compare offset-naive and offset-aware datetimes
    >>> yaml.load(yaml.dump(dt))
    datetime.datetime(2011, 9, 1, 10, 20, 30, 405060)
PyYAML dumps the offset correctly, but when it loads the value, it returns a naive datetime in UTC, with the offset subtracted.
Instead, I suggest using a simple tzinfo class, such as the following (from http://docs.python.org/library/datetime.html) to represent offsets:
    from datetime import timedelta, tzinfo

    class FixedOffset(tzinfo):
        """Fixed offset in minutes east from UTC."""

        def __init__(self, offset, name):
            self.__offset = timedelta(minutes=offset)
            self.__name = name

        def utcoffset(self, dt):
            return self.__offset

        def tzname(self, dt):
            return self.__name

        def dst(self, dt):
            return timedelta(0)
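
For illustration only: on Python 3.2 and later, the standard library's datetime.timezone is a fixed-offset tzinfo of the same kind, so a rough sketch of what a loaded value could look like does not need a custom class. The 120-minute offset and the variable names are made up for the example; this is not what PyYAML returns today.

```python
from datetime import datetime, timedelta, timezone

# timezone(timedelta) is the standard library's fixed-offset tzinfo,
# playing the role of FixedOffset(120, "+02:00").
plus_two = timezone(timedelta(minutes=120))

# The same instant written at two different offsets.
a = datetime(2011, 9, 1, 10, 20, 30, 405060, tzinfo=timezone.utc)
b = datetime(2011, 9, 1, 12, 20, 30, 405060, tzinfo=plus_two)

same_instant = a == b        # aware datetimes compare by instant
offset_kept = b.utcoffset()  # the offset stays attached to the value
```

With the offset kept on the object, comparing the loaded value against the original no longer raises the TypeError shown above.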
Note that it often makes sense to handle naive datetimes (such as user input) in local times. In such cases, round-tripping a timezone-aware datetime through PyYAML — for example, dumping/loading fixtures in Django — will result in data corruption.
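
A minimal sketch of that corruption, using only the standard library and no PyYAML call: the loader's behaviour is imitated by converting to UTC and dropping the tzinfo, and the +02:00 local zone is an assumption chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed local zone of the application (illustrative).
local_zone = timezone(timedelta(hours=2))

original = datetime(2011, 9, 1, 12, 20, 30, tzinfo=local_zone)

# Imitates the load step described above: UTC wall-clock time, tzinfo dropped.
loaded = original.astimezone(timezone.utc).replace(tzinfo=None)

# The application then treats the naive value as local time.
reinterpreted = loaded.replace(tzinfo=local_zone)

# How far the stored instant has moved after one round trip.
drift = original - reinterpreted
```

Under these assumptions the value shifts by the full local offset on every dump/load cycle.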
See also #25 and this solution to the same problem.
Change History
comment:2 Changed 5 months ago by Matt Behrens <matt@…>
I started looking at what it would take to create a patch for this and I've come up on a few hard problems.
The first thing I did was go ahead and patch construct_yaml_timestamp so that if a + or -HH:MM timezone was specified, a tzinfo instance was created with that offset, much like was suggested in this ticket's description.
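
The offset handling described here could look roughly like the following. This is a sketch, not the code from the patch: tzinfo_from_offset and its regular expression are hypothetical, and datetime.timezone stands in for a FixedOffset-style class.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern for the +HH:MM / -HH:MM suffix (minutes optional).
_OFFSET_RE = re.compile(r"^(?P<sign>[+-])(?P<hours>\d{1,2})(?::(?P<minutes>\d{2}))?$")


def tzinfo_from_offset(text):
    """Hypothetical helper: turn 'Z' or a +/-HH:MM suffix into a tzinfo."""
    if text == "Z":
        return timezone.utc
    match = _OFFSET_RE.match(text)
    if match is None:
        raise ValueError("unrecognised offset: %r" % (text,))
    minutes = int(match.group("hours")) * 60 + int(match.group("minutes") or 0)
    if match.group("sign") == "-":
        minutes = -minutes
    return timezone(timedelta(minutes=minutes))


# A timestamp with a -05:00 suffix would then be constructed as an aware value.
aware = datetime(2001, 12, 14, 21, 59, 43, 100000,
                 tzinfo=tzinfo_from_offset("-05:00"))
```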
When I went to add UTC support for Z timezones, I started looking more critically at the implementation of the spec itself: http://yaml.org/type/timestamp.html. According to my reading, any timestamp that does not have a timezone, even one with no time specified at all, should be UTC. You can't localize a date instance, and date instances don't appear to make any assertion as to time of day, whereas the spec says a missing time should be read as 00:00:00Z. Thus construct_yaml_timestamp should never return a date instance for a date-only timestamp value, but instead a datetime with hour 0, minute 0, and tzinfo UTC.
I do fear that such changes will break a lot of code that is used to receiving either date instances or naïve datetime instances, though.
My current work on this problem, as a starting point for discussion: http://nopaste.info/1b9398393d.html Keep in mind that it breaks tests right now, largely because date and datetime instances cannot be compared with each other, and neither can naïve and offset-aware datetime instances.
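
To make the date-only case concrete, here is a small sketch of the promotion argued for above. The helper name date_to_utc_midnight is hypothetical, and datetime.timezone.utc is used as the UTC tzinfo.

```python
from datetime import date, datetime, timedelta, timezone


def date_to_utc_midnight(value):
    """Hypothetical helper: read a date-only timestamp as 00:00:00Z."""
    return datetime(value.year, value.month, value.day, tzinfo=timezone.utc)


# A date-only value such as 2002-12-14 becomes an aware datetime at midnight UTC.
d = date(2002, 12, 14)
promoted = date_to_utc_midnight(d)
```

Code that currently expects a plain date back would receive a datetime instead.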
comment:3 Changed 5 months ago by Matt Behrens <matt@…>
- Cc matt@… added
- Summary changed from PyYAML to cannot roundtrip offset-aware datetime instances
comment:4 Changed 5 months ago by Matt Behrens <matt@…>
I have a working implementation here: https://bitbucket.org/zigg/pyyaml
The new behavior can be switched on or off with a keyword argument to load et al.
I do not have tests for the new behavior yet. I will probably write a new test case for them.

This bug actually affects users of Django: https://code.djangoproject.com/ticket/18867