Version 3 (modified by xi, 11 years ago)


PyYAML Documentation

work in progress

Some RPG-ish descriptions are stolen from the Angband rogue-like game.

Basic usage

Start with importing the yaml package.

>>> import yaml


Warning: It is not safe to call yaml.load with any data received from an untrusted source! yaml.load is as powerful as pickle.load and so may call any Python function. For such data, use the yaml.safe_load function instead.
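
The difference can be illustrated with a minimal sketch, assuming a current PyYAML release. The hostile document below is a made-up example; the only calls used are yaml.safe_load and the yaml.YAMLError base exception.

```python
import yaml

document = """
- Hesperiidae
- Papilionidae
"""

# safe_load resolves only the standard YAML tags, so the result is built
# from plain Python objects (lists, dicts, strings, numbers and so on).
families = yaml.safe_load(document)

# A Python-specific tag that asks the loader to call a function is
# rejected by safe_load with an error instead of being executed.
hostile = "!!python/object/apply:os.getcwd []"
try:
    yaml.safe_load(hostile)
    rejected = False
except yaml.YAMLError:
    rejected = True
```

Catching yaml.YAMLError here is a design choice for the sketch: it is the common base class of PyYAML errors, so the same handler also covers syntax errors in the input.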

The function yaml.load converts a YAML document to a Python object.

>>> yaml.load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)

['Hesperiidae', 'Papilionidae', 'Apatelodidae', 'Epiplemidae']

yaml.load accepts a byte string, a Unicode string, an open binary file object, or an open Unicode file object. A byte string or a file must be encoded with utf-8, utf-16-be or utf-16-le encoding. yaml.load detects the encoding by checking the BOM (byte order mark) sequence at the beginning of the string/file. If no BOM is present, the utf-8 encoding is assumed.
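
As a rough illustration of the BOM detection described above, the sketch below encodes one document two ways and loads both byte strings. It assumes a current PyYAML release and uses yaml.safe_load; the variable names are only illustrative.

```python
import codecs
import yaml

text = u"hello: \u041f\u0440\u0438\u0432\u0435\u0442!\n"

# The same document as byte strings: UTF-8 without a BOM, and
# UTF-16-LE with an explicit BOM prepended.
as_utf8 = text.encode("utf-8")
as_utf16 = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Both byte strings should decode to the same Python object.
from_utf8 = yaml.safe_load(as_utf8)
from_utf16 = yaml.safe_load(as_utf16)
```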

>>> yaml.load(u"""
... hello: Привет!
... """)

{'hello': u'\u041f\u0440\u0438\u0432\u0435\u0442!'}

>>> stream = file('document.yaml', 'r')    # 'document.yaml' contains a single YAML document.
>>> yaml.load(stream)
[...]    # A Python object corresponding to the document.
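
Loading from a file can be sketched as follows. The temporary file and its contents are hypothetical stand-ins for 'document.yaml', and yaml.safe_load is used so the sketch is also suitable for untrusted files.

```python
import os
import tempfile
import yaml

# Create a small one-document file in a temporary location.
handle, path = tempfile.mkstemp(suffix=".yaml")
with os.fdopen(handle, "w") as stream:
    stream.write("- Hesperiidae\n- Papilionidae\n")

# The with-block closes the file once the document has been loaded.
with open(path, "r") as stream:
    families = yaml.safe_load(stream)

os.remove(path)
```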

If a string or a file contains several documents, you may load them all with the yaml.load_all function.

>>> documents = """
... ---
... name: The Set of Gauntlets 'Pauraegen'
... description: >
...   A set of handgear with sparks that crackle
...   across its knuckleguards.
... ---
... name: The Set of Gauntlets 'Paurnen'
... description: >
...   A set of gauntlets that gives off a foul,
...   acrid odour yet remains untarnished.
... ---
... name: The Set of Gauntlets 'Paurnimmen'
... description: >
...   A set of handgear, freezing with unnatural cold.
... """

>>> for data in yaml.load_all(documents):
...     print data

{'description': 'A set of handgear with sparks that crackle across its knuckleguards.\n',
'name': "The Set of Gauntlets 'Pauraegen'"}
{'description': 'A set of gauntlets that gives off a foul, acrid odour yet remains untarnished.\n',
'name': "The Set of Gauntlets 'Paurnen'"}
{'description': 'A set of handgear, freezing with unnatural cold.\n',
'name': "The Set of Gauntlets 'Paurnimmen'"}


YAML syntax

YAML tags and Python types







Scanner interface

Parser interface

Composer interface

Constructor interface

Resolver interface


Emitter interface

Serializer interface

Representer interface

Resolver interface


The yaml package

Deviations from the specification

Downloading and installing

Check it out from the SVN repository

Install it by running

$ python setup.py install

High-level API

Warning: The API is not stable and may change in the future.

Basic examples

Start with importing the package:

>>> import yaml

Define the input data:

>>> data = """
... - YAML
... - is
... - fun!
... """

The parser accepts string objects, unicode objects, open file objects, and unicode file objects.

Now convert it to a native Python object:

>>> yaml.load(data)
['YAML', 'is', 'fun!']

Conversely, you may convert a Python object into a YAML document:

>>> print yaml.dump(['YAML', 'is', 'fun!'])
- YAML
- is
- fun!
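
The two directions can be combined into a round trip. This is a small sketch, assuming a current PyYAML release; yaml.safe_dump and yaml.safe_load are the restricted counterparts of yaml.dump and yaml.load, and the sample data is made up.

```python
import yaml

original = {
    "name": "The Set of Gauntlets 'Paurnen'",
    "words": ["YAML", "is", "fun!"],
    "weight": 2.5,
}

# Serialize to a YAML string, then parse that string back.
text = yaml.safe_dump(original)
restored = yaml.safe_load(text)
```

For plain data such as lists, dictionaries, strings and numbers, the restored object should compare equal to the original.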

PyYAML 3000 supports many of the types defined in the YAML tags repository:

>>> data = """
... - ~
... - true
... - 3_141_592.653e-6
... - 3000
... - PyYAML3000 birthday: 2006-02-11
... - primes (sort of): !!set { 2, 3, 5, 7, 11, 13 }
... - pairs: !!pairs [1: 2, 3: 4, 5: 6]
... """
>>> for x in yaml.load(data): print x
None
True
3.141592653
3000
{'PyYAML3000 birthday': datetime.datetime(2006, 2, 11, 0, 0)}
{'primes (sort of)': set([2, 3, 5, 7, 11, 13])}
{'pairs': [(1, 2), (3, 4), (5, 6)]}
>>> print yaml.dump([None, True, False, 123, 123.456, 'a string',
... {'a': 'dictionary'}, ['a', 'list']])
- null
- true
- false
- 123
- 123.456
- a string
- a: dictionary
- - a
  - list

The following tags are supported: !!map, !!omap, !!pairs, !!set, !!seq, !!binary, !!bool, !!float, !!int, !!merge, !!null, !!str, !!timestamp, !!value.
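
A few of these tags can be checked directly. The sketch below assumes a current PyYAML release and uses yaml.safe_load; the key names are arbitrary, and the !!binary value is the base64 encoding of the bytes "hello".

```python
import yaml

data = yaml.safe_load("""
nothing: ~
flag: true
count: 3000
ratio: 1.5
primes: !!set {2, 3, 5}
pairs: !!pairs [1: 2, 3: 4]
blob: !!binary aGVsbG8=
""")

# Each value is constructed as the corresponding native Python type.
kinds = dict((key, type(value).__name__) for key, value in data.items())
```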

Defining custom tags

You may define constructors for your own application-specific tags. You may either use the function yaml.add_constructor or subclass yaml.YAMLObject.

Instances of yaml.YAMLObject subclasses are automatically serialized to YAML and vice versa. You only need to define the YAML tag with the yaml_tag class attribute.

class Person(yaml.YAMLObject):
    yaml_tag = '!Person'
    def __init__(self, first_name=None, last_name=None, email=None, birthday=None):
        self.first_name = first_name
        self.last_name = last_name
        self.email = email
        self.birthday = birthday
    def __repr__(self):
        return "%s(first_name=%r, last_name=%r, email=%r, birthday=%r)"  \
                % (self.__class__.__name__, self.first_name, self.last_name,
                   self.email, self.birthday)
>>> p = yaml.load("""
... !Person
... first_name: Kirill
... last_name: Simonov
... email: xi(at)
... birthday: null
... """)
>>> print p
Person(first_name='Kirill', last_name='Simonov', email='xi(at)', birthday=None)
>>> print yaml.dump(p)
!Person
last_name: Simonov
first_name: Kirill
email: xi(at)
birthday: null

If you don't want to use metaclass magic, you may define the constructor and representer as functions and register them:

def construct_person(constructor, node):
    # ...
def represent_person(representer, person):
    # ...
yaml.add_constructor('!Person', construct_person)
yaml.add_representer(Person, represent_person)
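
One possible way to fill in the two functions is sketched below. It assumes a current PyYAML release, where constructor.construct_mapping and representer.represent_mapping are available, and it uses a simplified, hypothetical Person class with two fields. The function bodies are an illustration, not the only way to write them.

```python
import yaml

class Person(object):
    def __init__(self, first_name=None, last_name=None):
        self.first_name = first_name
        self.last_name = last_name

def construct_person(constructor, node):
    # Convert the mapping node into a dict, then into a Person.
    fields = constructor.construct_mapping(node)
    return Person(**fields)

def represent_person(representer, person):
    # Emit the instance attributes as a mapping tagged '!Person'.
    return representer.represent_mapping('!Person', vars(person))

yaml.add_constructor('!Person', construct_person)
yaml.add_representer(Person, represent_person)

text = yaml.dump(Person('Kirill', 'Simonov'))
copy = yaml.load(text, Loader=yaml.Loader)
```

The Loader argument is passed explicitly because recent releases require it; the registration above applies to the default loader and dumper classes.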

Parsing and emitting multiple documents in a stream

If an input stream contains several documents, you may load all of them using the yaml.load_all function.

>>> data = """
... This is the first document
... --- # This is an empty document
... ---
... - this
... - is: the
...   last: document
... """
>>> for document in yaml.load_all(data): print document
This is the first document
None
['this', {'is': 'the', 'last': 'document'}]

You may also dump several documents into the same stream using the yaml.dump_all function.

>>> print yaml.dump_all(["The first document", None, ["The", "last", "document"]])
The first document
--- null
---
- The
- last
- document
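
The two functions are inverses for plain data, which can be sketched as a round trip. The example assumes a current PyYAML release and uses the restricted variants yaml.safe_dump_all and yaml.safe_load_all, which are not mentioned on this page.

```python
import yaml

documents = ["The first document", None, ["The", "last", "document"]]

# Write all documents into one stream, separated by document markers.
stream = yaml.safe_dump_all(documents)

# safe_load_all returns a generator, so collect it into a list.
restored = list(yaml.safe_load_all(stream))
```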

There are more features; check the source to find out.

Low-level API

PyYAML 3000 provides a low-level, event-based, easy-to-use parser and emitter API.


>>> data = """
... --- !tag
... scalar
... ---
... - &anchor item
... - another item
... - *anchor
... ---
... key: value
... ? - complex
...   - key
... : - complex
...   - value
... """
>>> for event in yaml.parse(data): print event
ScalarEvent(anchor=None, tag=u'!tag', implicit=(False, False), value=u'scalar')
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=u'anchor', tag=None, implicit=(True, False), value=u'item')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'another item')
MappingStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'key')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'value')
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'complex')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'key')
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'complex')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'value')
>>> events = [
... yaml.StreamStartEvent(encoding='utf-8'),
... yaml.DocumentStartEvent(explicit=True),
... yaml.MappingStartEvent(anchor=None, tag=None, implicit=True),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'flow sequence', implicit=(True, True)),
... yaml.SequenceStartEvent(anchor=None, tag=None, flow_style=True, implicit=True),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'123', implicit=(True, False)),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'456', implicit=(True, False)),
... yaml.SequenceEndEvent(),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'block scalar', implicit=(True, True)),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'YAML\nis\nfun!\n', style='|', implicit=(True, True)),
... yaml.MappingEndEvent(),
... yaml.DocumentEndEvent(explicit=True),
... yaml.StreamEndEvent(),
... ]

>>> print yaml.emit(events)
---
flow sequence: [123, 456]
block scalar: |
  YAML
  is
  fun!
...
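
Since yaml.parse produces events and yaml.emit consumes them, the two can be chained. The following is a small sketch, assuming a current PyYAML release; it checks the result by comparing the loaded data rather than the exact text, because the emitter may format the output differently from the input.

```python
import yaml

source = "flow sequence: [123, 456]\nblock scalar: |\n  YAML\n  is\n  fun!\n"

# Parse the text into a list of events and record their class names.
events = list(yaml.parse(source))
kinds = [type(event).__name__ for event in events]

# Feed the same events back to the emitter to regenerate a YAML string.
regenerated = yaml.emit(events)
```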

To Do

Long-term goals:

  • fix tabs, indentation for flow collections, indentation for scalars (min=1?), 'y' is !!bool
  • libyaml3000

Deviations from the specification

  • Rules for tabs in YAML are confusing. We are close, but not there yet. Perhaps both the spec and the parser should be fixed. Anyway, the best rule for tabs in YAML is to not use them at all.
  • Byte order mark. The initial BOM is stripped, but BOMs inside the stream are considered as parts of the content. It can be fixed, but it's not really important now.
  • Empty plain scalars are not allowed if an alias or tag is specified. This is done to prevent anomalies like [ !tag, value], which can be interpreted both as [ !<!tag,> value ] and [ !<!tag> "", "value" ]. The spec should be fixed.
  • Indentation of flow collections. The spec requires them to be indented more than their block parent node. Unfortunately, this rule makes many intuitively correct constructs invalid, for instance,
    block: {
    } # this is an indentation violation according to the spec.
  • ':' is not allowed for plain scalars in the flow mode. {1:2} is interpreted as { 1 : 2 }.