Version 12 (modified by xi, 11 years ago) (diff)


PyYAML Documentation

work in progress

RPG-ish descriptions are stolen from the Angband rogue-like game. Names of the heroes are generated with MudNames.

This documentation is very brief and incomplete. Feel free to fix or improve it.

Basic usage

Start with importing the yaml package.

>>> import yaml


Warning: It is not safe to call yaml.load with any data received from an untrusted source! yaml.load is as powerful as pickle.load and so may call any Python function. Check the yaml.safe_load function though.

The function yaml.load converts a YAML document to a Python object.

>>> yaml.load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)

['Hesperiidae', 'Papilionidae', 'Apatelodidae', 'Epiplemidae']

yaml.load accepts a string, a Unicode string, an open file object, or an open Unicode file object. A string or a file must be encoded with utf-8, utf-16-be or utf-16-le encoding. yaml.load detects the encoding by checking the BOM (byte order mark) sequence at the beginning of the string/file. If no BOM is present, the utf-8 encoding is assumed.
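The BOM rule described above can be sketched with the standard library alone. This is only an illustration of the rule as stated in the text, not PyYAML's actual reader code; the helper name detect_encoding is hypothetical, and the sketch is written for Python 3 rather than the Python 2 sessions used elsewhere on this page.

```python
import codecs


def detect_encoding(data: bytes) -> str:
    """Guess the encoding of a YAML byte stream from its BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # No BOM: the text above says utf-8 is assumed.
    return "utf-8"


# A UTF-16 (little-endian) document prefixed with its BOM.
sample = codecs.BOM_UTF16_LE + "hello: world".encode("utf-16-le")
encoding = detect_encoding(sample)

# The generic "utf-16" codec reads the BOM and strips it while decoding.
text = sample.decode("utf-16")
```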

yaml.load returns a Python object.

>>> yaml.load(u"""
... hello: Привет!
... """)

{'hello': u'\u041f\u0440\u0438\u0432\u0435\u0442!'}

>>> stream = file('document.yaml', 'r')    # 'document.yaml' contains a single YAML document.
>>> yaml.load(stream)
[...]    # A Python object corresponding to the document.

If a string or a file contains several documents, you may load them all with the yaml.load_all function.

>>> documents = """
... ---
... name: The Set of Gauntlets 'Pauraegen'
... description: >
...     A set of handgear with sparks that crackle
...     across its knuckleguards.
... ---
... name: The Set of Gauntlets 'Paurnen'
... description: >
...   A set of gauntlets that gives off a foul,
...   acrid odour yet remains untarnished.
... ---
... name: The Set of Gauntlets 'Paurnimmen'
... description: >
...   A set of handgear, freezing with unnatural cold.
... """

>>> for data in yaml.load_all(documents):
...     print data

{'description': 'A set of handgear with sparks that crackle across its knuckleguards.\n',
'name': "The Set of Gauntlets 'Pauraegen'"}
{'description': 'A set of gauntlets that gives off a foul, acrid odour yet remains untarnished.\n',
'name': "The Set of Gauntlets 'Paurnen'"}
{'description': 'A set of handgear, freezing with unnatural cold.\n',
'name': "The Set of Gauntlets 'Paurnimmen'"}

PyYAML allows you to construct a Python object of any type.

>>> yaml.load("""
... none: [~, null]
... bool: [true, false, on, off]
... int: 42
... float: 3.14159
... list: [LITE, RES_ACID, SUS_DEXT]
... dict: {hp: 13, sp: 5}
... """)

{'none': [None, None], 'int': 42, 'float': 3.1415899999999999,
'list': ['LITE', 'RES_ACID', 'SUS_DEXT'], 'dict': {'hp': 13, 'sp': 5},
'bool': [True, False, True, False]}

Even instances of Python classes can be constructed using the !!python/object tag.

>>> class Hero:
...     def __init__(self, name, hp, sp):
...         self.name = name
...         self.hp = hp
...         self.sp = sp
...     def __repr__(self):
...         return "%s(name=%r, hp=%r, sp=%r)" % (
...             self.__class__.__name__, self.name, self.hp, self.sp)

>>> yaml.load("""
... !!python/object:__main__.Hero
... name: Welthyr Syxgon
... hp: 1200
... sp: 0
... """)

Hero(name='Welthyr Syxgon', hp=1200, sp=0)

Note that the ability to construct an arbitrary Python object may be dangerous if you receive a YAML document from an untrusted source such as the Internet. The function yaml.safe_load limits this ability to simple Python objects like integers or lists.
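As a rough illustration of that limit, the following sketch (Python 3, assuming a recent PyYAML release) feeds yaml.safe_load a document tagged with !!python/object and a plain one. The variable names are only for the example.

```python
import yaml

untrusted = """
!!python/object:__main__.Hero
name: Welthyr Syxgon
hp: 1200
sp: 0
"""

# safe_load knows no constructor for the Python-specific tag, so it
# raises an error (a YAMLError subclass) instead of building an object.
try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    rejected = True

# Documents made of standard YAML types load as usual.
simple = yaml.safe_load("{hp: 13, sp: 5}")
```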


The yaml.dump function accepts a Python object and produces a YAML document.

>>> print yaml.dump({'name': 'Silenthand Olleander', 'race': 'Human',
... 'traits': ['ONE_HAND', 'ONE_EYE']})

name: Silenthand Olleander
race: Human
traits: [ONE_HAND, ONE_EYE]

yaml.dump accepts the second optional argument, which must be an open file. In this case, yaml.dump will write the produced YAML document into the file. Otherwise, yaml.dump returns the produced document.

>>> stream = file('document.yaml', 'w')
>>> yaml.dump(data, stream)    # Write a YAML representation of data to 'document.yaml'.
>>> print yaml.dump(data)      # Output the document to the screen.
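The two calling conventions can be compared without touching the file system. The sketch below is written for Python 3 and uses io.StringIO as a stand-in for an open file (an assumption made for the example) together with yaml.safe_dump, which takes the same optional stream argument.

```python
import io

import yaml

data = {
    "name": "Silenthand Olleander",
    "race": "Human",
    "traits": ["ONE_HAND", "ONE_EYE"],
}

# With a stream argument the document is written to the stream
# and the call itself returns None.
stream = io.StringIO()
result = yaml.safe_dump(data, stream)

# Without a stream the produced document is returned as a string.
text = yaml.safe_dump(data)

# Loading the text again gives back an equal object.
reloaded = yaml.safe_load(text)
```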

If you need to dump several YAML documents to a single stream, use the function yaml.dump_all. yaml.dump_all accepts a list or a generator producing Python objects to be serialized into a YAML document. The second optional argument is an open file.

>>> print yaml.dump([1,2,3], explicit_start=True)
--- [1, 2, 3]

>>> print yaml.dump_all([1,2,3], explicit_start=True)
--- 1
--- 2
--- 3

You may even dump instances of Python classes.

>>> class Hero:
...     def __init__(self, name, hp, sp):
...         self.name = name
...         self.hp = hp
...         self.sp = sp
...     def __repr__(self):
...         return "%s(name=%r, hp=%r, sp=%r)" % (
...             self.__class__.__name__, self.name, self.hp, self.sp)

>>> print yaml.dump(Hero("Galain Ysseleg", hp=-3, sp=2))

!!python/object:__main__.Hero {hp: -3, name: Galain Ysseleg, sp: 2}

yaml.dump supports a number of keyword arguments that specify formatting details for the emitter. For instance, you may set the preferred indentation and width, use the canonical YAML format, or force a preferred style for scalars and collections.

>>> print yaml.dump(range(50))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
  43, 44, 45, 46, 47, 48, 49]

>>> print yaml.dump(range(50), width=50, indent=4)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
    16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
    28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49]

>>> print yaml.dump(range(5), canonical=True)
---
!!seq [
  !!int "0",
  !!int "1",
  !!int "2",
  !!int "3",
  !!int "4",
]

>>> print yaml.dump(range(5), default_flow_style=False)
- 0
- 1
- 2
- 3
- 4

>>> print yaml.dump(range(5), default_flow_style=True, default_style='"')
[!!int "0", !!int "1", !!int "2", !!int "3", !!int "4"]

YAML syntax

A good introduction to the YAML syntax is Chapter 2 of the YAML specification.

You may also check the YAML cookbook. Note that it is focused on a Ruby implementation and uses the old YAML 1.0 syntax.

Here we present the most common YAML constructs together with the corresponding Python objects.


Documents

A YAML stream is a collection of zero or more documents. An empty stream contains no documents. Documents are separated with ---. Documents may optionally end with .... A single document may or may not be marked with ---.

Example of an implicit document:

- Multimedia
- Internet
- Education

Example of an explicit document:

---
- Afterstep
- Oroborus

Example of several documents in the same stream:

---
- Ada
- Assembly
- Awk
---
- Basic
---
- C
- C#    # Note that comments are denoted with ' #' (space and #).
- C++
- Cold Fusion
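To see how document markers split a stream, here is a small sketch for Python 3 with a recent PyYAML. It uses yaml.safe_load_all, the safe counterpart of yaml.load_all; the sample stream is made up for the example.

```python
import yaml

stream = """\
- Ada
- Awk
---
- Basic
...
---
- C
"""

# The first document is implicit (no leading '---'), the second one
# is closed with '...', and the third runs to the end of the stream.
documents = list(yaml.safe_load_all(stream))

# An empty stream contains no documents at all.
empty = list(yaml.safe_load_all(""))
```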

Block sequences

In the block context, sequence entries are denoted by - (dash and space):

- The Dagger 'Narthanc'
- The Dagger 'Nimthanc'
- The Dagger 'Dethanc'
# Python
["The Dagger 'Narthanc'", "The Dagger 'Nimthanc'", "The Dagger 'Dethanc'"]

Block sequences can be nested:

-
  - HTML
  - LaTeX
  - SGML
  - VRML
  - XML
  - YAML
-
  - BSD
  - GNU Hurd
  - Linux
# Python
[['HTML', 'LaTeX', 'SGML', 'VRML', 'XML', 'YAML'], ['BSD', 'GNU Hurd', 'Linux']]

It's not necessary to start a nested sequence with a new line:

- 1.1
- - 2.1
  - 2.2
- - - 3.1
    - 3.2
    - 3.3
# Python
[1.1, [2.1, 2.2], [[3.1, 3.2, 3.3]]]

A block sequence may be nested in a block mapping. Note that in this case it is not necessary to indent the sequence.

left hand:
- Ring of Teleportation
- Ring of Speed
right hand:
- Ring of Resist Fire
- Ring of Resist Cold
- Ring of Resist Poison
# Python
{'right hand': ['Ring of Resist Fire', 'Ring of Resist Cold', 'Ring of Resist Poison'],
'left hand': ['Ring of Teleportation', 'Ring of Speed']}

Block mappings

In the block context, keys and values of mappings are separated by : (colon and space):

base armor class: 0
base damage: [4,4]
plus to-hit: 12
plus to-dam: 16
plus to-ac: 0
# Python
{'plus to-hit': 12, 'base damage': [4, 4], 'base armor class': 0, 'plus to-ac': 0, 'plus to-dam': 16}

Complex keys are denoted with ? (question mark and space):

? !!python/tuple [0,0]
: The Hero
? !!python/tuple [0,1]
: Treasure
? !!python/tuple [1,0]
: Treasure
? !!python/tuple [1,1]
: The Dragon
# Python
{(0, 1): 'Treasure', (1, 0): 'Treasure', (0, 0): 'The Hero', (1, 1): 'The Dragon'}

Block mappings can be nested:

hero:
  hp: 34
  sp: 8
  level: 4
orc:
  hp: 12
  sp: 0
  level: 2
# Python
{'hero': {'hp': 34, 'sp': 8, 'level': 4}, 'orc': {'hp': 12, 'sp': 0, 'level': 2}}

A block mapping may be nested in a block sequence:

- name: PyYAML
  status: 4
  license: MIT
  language: Python
- name: PySyck
  status: 5
  license: BSD
  language: Python
# Python
[{'status': 4, 'language': 'Python', 'name': 'PyYAML', 'license': 'MIT'},
{'status': 5, 'license': 'BSD', 'name': 'PySyck', 'language': 'Python'}]

Flow collections

The syntax of flow collections in YAML is very close to the syntax of list and dictionary constructors in Python:

{ str: [15, 17], con: [16, 16], dex: [17, 18], wis: [16, 16], int: [10, 13], chr: [5, 8] }
# Python
{'dex': [17, 18], 'int': [10, 13], 'chr': [5, 8], 'wis': [16, 16], 'str': [15, 17], 'con': [16, 16]}


Scalars

There are 5 styles of scalars in YAML: plain, single-quoted, double-quoted, literal, and folded:

plain: Scroll of Remove Curse
single-quoted: 'EASY_KNOW'
double-quoted: "?"
literal: |    # Borrowed from
  by hjw              ___
     __              /.-.\
    /  )_____________\\  Y
   /_ /=== == === === =\ _\_
  ( /)=== == === === == Y   \
   `-------------------(  o  )
folded: >
  It removes all ordinary curses from all equipped items.
  Heavy or permanent curses are unaffected.
# Python
{'folded': 'It removes all ordinary curses from all equipped items. Heavy or permanent curses are unaffected.\n',
'literal': 'by hjw              ___\n   __              /.-.\\\n  /  )_____________\\\\  Y\n /_ /=== == === === =\\ _\\_\n( /)=== == === === == Y   \\\n `-------------------(  o  )\n',
'single-quoted': 'EASY_KNOW', 'double-quoted': '?', 'plain': 'Scroll of Remove Curse'}

Each style has its own quirks. A plain scalar does not use indicators to denote its start and end, therefore it's the most restricted style. Its natural applications are names of attributes and parameters.

Using single-quoted scalars, you may express any value that does not contain special characters. No escaping occurs for single quoted scalars except that duplicate quotes '' are replaced with a single quote '.

Double-quoted is the most powerful style and the only style that can express any scalar value. Double-quoted scalars allow escaping. Using the escape sequences \x** and \u****, you may express any ASCII or Unicode character.

There are two kinds of block scalar styles: literal and folded. The literal style is the most suitable style for large blocks of text such as source code. The folded style is similar to the literal style, but two adjacent non-empty lines are joined into a single line separated by a space character.
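The differences between the quoted and block styles are easiest to see side by side. The following sketch is written for Python 3 with a recent PyYAML; the keys and values are invented for the illustration.

```python
import yaml

# A raw string keeps the backslashes intact, so the escape sequences
# reach the YAML parser rather than being interpreted by Python.
doc = r"""
single: 'it''s'
double: "tab\there \u263A"
literal: |
  line one
  line two
folded: >
  line one
  line two
"""

data = yaml.safe_load(doc)

single = data["single"]    # doubled quote collapses to one quote
double = data["double"]    # \t and \u263A are decoded by YAML
literal = data["literal"]  # line breaks are preserved
folded = data["folded"]    # adjacent lines are joined with a space
```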


Aliases

Note that PyYAML does not yet support recursive objects.

Using YAML you may represent objects of arbitrary graph-like structures. If you want to refer to the same object from different parts of a document, you need to use anchors and aliases.

Anchors are denoted by the & indicator while aliases are denoted by *. For instance, the document

left hand: &A The Bastard Sword of Eowyn
right hand: *A

expresses the idea of a hero holding a heavy sword in both hands.
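A short sketch of how this looks from Python (Python 3, recent PyYAML assumed). The pack and saddlebag keys are hypothetical additions that show an alias to a collection: both keys end up referring to one and the same list object.

```python
import yaml

doc = """
left hand: &A The Bastard Sword of Eowyn
right hand: *A
pack: &items [rope, torch]
saddlebag: *items
"""

data = yaml.safe_load(doc)

# The alias resolves to the anchored value.
same_sword = data["left hand"] == data["right hand"]

# For collections the alias yields the very same Python object,
# so a change made through one key is visible through the other.
shared = data["pack"] is data["saddlebag"]
data["pack"].append("lantern")
saddlebag = data["saddlebag"]
```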


Tags

Tags are used to denote the type of a YAML node. Standard YAML tags are defined at

Tags may be implicit:

boolean: true
integer: 3
float: 3.14
# Python
{'boolean': True, 'integer': 3, 'float': 3.14}

or explicit:

boolean: !!bool "true"
integer: !!int "3"
float: !!float "3.14"
# Python
{'boolean': True, 'integer': 3, 'float': 3.14}

Plain scalars without explicitly defined tag are subject to implicit tag resolution. The scalar value is checked against a set of regular expressions and if one of them matches, the corresponding tag is assigned to the scalar. PyYAML allows an application to add custom implicit tag resolvers.
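A custom implicit resolver might look like the sketch below (Python 3, recent PyYAML assumed). The !dice tag, the DiceLoader class and the construct_dice function are hypothetical names invented for the example; registering on a subclass keeps the stock SafeLoader unchanged.

```python
import re

import yaml


class DiceLoader(yaml.SafeLoader):
    """Hypothetical loader subclass that also understands dice rolls."""


def construct_dice(loader, node):
    # Turn a scalar such as '2d6' into the tuple (2, 6).
    count, sides = loader.construct_scalar(node).split("d")
    return (int(count), int(sides))


# Plain scalars that match the pattern are assigned the !dice tag.
# The last argument lists the characters such a scalar may start with.
DiceLoader.add_implicit_resolver(
    "!dice", re.compile(r"^\d+d\d+$"), list("0123456789"))
DiceLoader.add_constructor("!dice", construct_dice)

implicit = yaml.load("roll: 2d6", Loader=DiceLoader)
explicit = yaml.load("roll: !dice 3d4", Loader=DiceLoader)

# The unmodified SafeLoader still sees an ordinary string.
plain = yaml.safe_load("roll: 2d6")
```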

YAML tags and Python types

The following table describes how nodes with different tags are converted to Python objects.

YAML tag                            Python type

Standard YAML tags
!!null                              None
!!bool                              bool
!!int                               int or long
!!float                             float
!!binary                            str
!!timestamp                         datetime.datetime
!!omap, !!pairs                     list of pairs
!!set                               set
!!str                               str or unicode
!!seq                               list
!!map                               dict

Python-specific tags
!!python/none                       None
!!python/bool                       bool
!!python/str                        str
!!python/unicode                    unicode
!!python/int                        int
!!python/long                       long
!!python/float                      float
!!python/complex                    complex
!!python/list                       list
!!python/tuple                      tuple
!!python/dict                       dict

Complex Python tags
!!python/module:package.module      package.module
!!python/object:module.cls          module.cls instance
!!python/object/new:module.cls      module.cls instance
!!python/object/apply:module.f      value of f(...)

String conversion

There are four tags that are converted to str and unicode values: !!str, !!binary, !!python/str, and !!python/unicode.

A !!str-tagged scalar is converted to a str object if its value is ASCII. Otherwise it is converted to unicode. A !!binary-tagged scalar is converted to a str object with its value decoded using the base64 encoding. A !!python/str scalar is converted to a str object encoded with the utf-8 encoding. A !!python/unicode scalar is converted to a unicode object.

Conversely, a str object is converted to

  1. a !!str scalar if its value is ASCII.
  2. a !!python/str scalar if its value is a correct utf-8 sequence.
  3. a !!binary scalar otherwise.

A unicode object is converted to

  1. a !!python/unicode scalar if its value is ASCII.
  2. a !!str scalar otherwise.

Names and modules

In order to represent static Python objects like functions or classes, you need to use a complex !!python/name tag. For instance, the function yaml.dump can be represented as

!!python/name:yaml.dump

Similarly, modules are represented using the tag !!python/module:

!!python/module:yaml



Any pickleable object can be serialized using the !!python/object tag:

!!python/object:module.Class { attribute: value, ... }

In order to support the pickle protocol, two additional forms of the !!python/object tag are provided:

!!python/object/new:module.Class
args: [argument, ...]
kwds: {key: value, ...}
state: ...
listitems: [item, ...]
dictitems: [key: value, ...]

!!python/object/apply:module.function
args: [argument, ...]
kwds: {key: value, ...}
state: ...
listitems: [item, ...]
dictitems: [key: value, ...]

If only the args field is non-empty, the above records can be shortened:

!!python/object/new:module.Class [argument, ...]
!!python/object/apply:module.function [argument, ...]


Warning: API stability is not guaranteed



If the YAML parser encounters an error condition, it raises an exception which is an instance of YAMLError or of its subclass. An application may catch this exception and warn the user.

try:
    config = yaml.load(file('config.yaml', 'r'))
except yaml.YAMLError, exc:
    print "Error in configuration file:", exc


Mark(name, index, line, column, buffer, pointer)

An instance of Mark points to a certain position in the input stream. name is the name of the stream, for instance it may be the filename if the input stream is a file. line and column are the line and column of the position (starting from 0). buffer, when it is not None, is a part of the input stream that contains the position, and pointer refers to the position in the buffer.
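Exceptions raised by the scanner and parser usually carry such a mark. The sketch below (Python 3, recent PyYAML assumed) turns it into a short message; describe_error is a hypothetical helper, and the mark is read with getattr because a bare YAMLError is not guaranteed to have one.

```python
import yaml


def describe_error(text):
    """Return None if text parses, else a one-line error location (1-based)."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        mark = getattr(exc, "problem_mark", None)
        if mark is None:
            return "invalid YAML"
        # Mark.line and Mark.column start from 0, so add 1 for display.
        return "invalid YAML at line %d, column %d" % (
            mark.line + 1, mark.column + 1)
    return None


ok = describe_error("items: [1, 2]\n")
broken = describe_error("items: [1, 2\n")  # the closing bracket is missing
```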


Tokens are produced by a YAML scanner. They are not really useful except for low-level YAML applications such as syntax highlighting.

The PyYAML scanner produces the following types of tokens:

StreamStartToken(encoding, start_mark, end_mark) # Start of the stream.
StreamEndToken(start_mark, end_mark) # End of the stream.
DirectiveToken(name, value, start_mark, end_mark) # YAML directive, either %YAML or %TAG.
DocumentStartToken(start_mark, end_mark) # '---'.
DocumentEndToken(start_mark, end_mark) # '...'.
BlockSequenceStartToken(start_mark, end_mark) # Start of a new block sequence.
BlockMappingStartToken(start_mark, end_mark) # Start of a new block mapping.
BlockEndToken(start_mark, end_mark) # End of a block collection.
FlowSequenceStartToken(start_mark, end_mark) # '['.
FlowMappingStartToken(start_mark, end_mark) # '{'.
FlowSequenceEndToken(start_mark, end_mark) # ']'.
FlowMappingEndToken(start_mark, end_mark) # '}'.
KeyToken(start_mark, end_mark) # Either '?' or start of a simple key.
ValueToken(start_mark, end_mark) # ':'.
BlockEntryToken(start_mark, end_mark) # '-'.
FlowEntryToken(start_mark, end_mark) # ','.
AliasToken(value, start_mark, end_mark) # '*value'.
AnchorToken(value, start_mark, end_mark) # '&value'.
TagToken(value, start_mark, end_mark) # '!value'.
ScalarToken(value, plain, style, start_mark, end_mark) # 'value'.

start_mark and end_mark denote the beginning and the end of a token.
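A small sketch of reading token types and marks (Python 3, recent PyYAML assumed). It scans a two-item block sequence and records where the first scalar starts and ends.

```python
import yaml

tokens = list(yaml.scan("- a\n- b\n"))

# The class names show the order in which the scanner emits tokens.
names = [type(token).__name__ for token in tokens]

# Marks are zero-based: the scalar 'a' sits on line 0, columns 2 to 3.
first_scalar = next(t for t in tokens if isinstance(t, yaml.ScalarToken))
span = (
    first_scalar.start_mark.line,
    first_scalar.start_mark.column,
    first_scalar.end_mark.column,
)
```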


>>> document = """
... ---
... block sequence:
... - BlockEntryToken
... block mapping:
...   ? KeyToken
...   : ValueToken
... flow sequence: [FlowEntryToken, FlowEntryToken]
... flow mapping: {KeyToken: ValueToken}
... anchors and tags:
... - &A !!int '5'
... - *A
... ...
... """

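>>> for token in yaml.scan(document):
...     print token

StreamStartToken(encoding='utf-8')

DocumentStartToken()

BlockMappingStartToken()

KeyToken()
ScalarToken(plain=True, style=None, value=u'block sequence')

ValueToken()
BlockEntryToken()
ScalarToken(plain=True, style=None, value=u'BlockEntryToken')

KeyToken()
ScalarToken(plain=True, style=None, value=u'block mapping')

ValueToken()
BlockMappingStartToken()

KeyToken()
ScalarToken(plain=True, style=None, value=u'KeyToken')
ValueToken()
ScalarToken(plain=True, style=None, value=u'ValueToken')
BlockEndToken()

KeyToken()
ScalarToken(plain=True, style=None, value=u'flow sequence')

ValueToken()
FlowSequenceStartToken()
ScalarToken(plain=True, style=None, value=u'FlowEntryToken')
FlowEntryToken()
ScalarToken(plain=True, style=None, value=u'FlowEntryToken')
FlowSequenceEndToken()

KeyToken()
ScalarToken(plain=True, style=None, value=u'flow mapping')

ValueToken()
FlowMappingStartToken()
KeyToken()
ScalarToken(plain=True, style=None, value=u'KeyToken')
ValueToken()
ScalarToken(plain=True, style=None, value=u'ValueToken')
FlowMappingEndToken()

KeyToken()
ScalarToken(plain=True, style=None, value=u'anchors and tags')

ValueToken()
BlockEntryToken()
AnchorToken(value=u'A')
TagToken(value=(u'!!', u'int'))
ScalarToken(plain=False, style="'", value=u'5')

BlockEntryToken()
AliasToken(value=u'A')

BlockEndToken()

DocumentEndToken()

StreamEndToken()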

Events are used by the low-level Parser and Emitter interfaces, which are similar to the SAX API. While the Parser parses a YAML stream and produces a sequence of events, the Emitter accepts a sequence of events and emits a YAML stream.

The following events are defined:

StreamStartEvent(encoding, start_mark, end_mark)
StreamEndEvent(start_mark, end_mark)
DocumentStartEvent(explicit, version, tags, start_mark, end_mark)
DocumentEndEvent(start_mark, end_mark)
SequenceStartEvent(anchor, tag, implicit, flow_style, start_mark, end_mark)
SequenceEndEvent(start_mark, end_mark)
MappingStartEvent(anchor, tag, implicit, flow_style, start_mark, end_mark)
MappingEndEvent(start_mark, end_mark)
AliasEvent(anchor, start_mark, end_mark)
ScalarEvent(anchor, tag, implicit, value, style, start_mark, end_mark)

The flow_style flag indicates if a collection is block or flow. The possible values are None, True, False. The style flag of a scalar event indicates the style of the scalar. Possible values are None, '', '\'', '"', '|', '>'. The implicit flag of a collection start event indicates if the tag may be omitted when the collection is emitted. The implicit flag of a scalar event is a pair of boolean values that indicate if the tag may be omitted when the scalar is emitted in a plain and non-plain style correspondingly.


>>> document = """
... scalar: &A !!int '5'
... alias: *A
... sequence: [1, 2, 3]
... mapping: {1: one, 2: two, 3: three}
... """

>>> for event in yaml.parse(document):
...     print event

StreamStartEvent()

DocumentStartEvent()

MappingStartEvent(anchor=None, tag=None, implicit=True)

ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'scalar')
ScalarEvent(anchor=u'A', tag=u'tag:yaml.org,2002:int', implicit=(False, False), value=u'5')

ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'alias')
AliasEvent(anchor=u'A')

ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'sequence')
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'1')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'2')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'3')
SequenceEndEvent()

ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'mapping')
MappingStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'1')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'one')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'2')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'two')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'3')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'three')
MappingEndEvent()

MappingEndEvent()

DocumentEndEvent()

StreamEndEvent()


>>> print yaml.emit([
...     yaml.StreamStartEvent(encoding='utf-8'),
...     yaml.DocumentStartEvent(explicit=True),
...     yaml.MappingStartEvent(anchor=None, tag=u'tag:yaml.org,2002:map', implicit=True, flow_style=False),
...     yaml.ScalarEvent(anchor=None, tag=u'tag:yaml.org,2002:str', implicit=(True, True), value=u'agile languages'),
...     yaml.SequenceStartEvent(anchor=None, tag=u'tag:yaml.org,2002:seq', implicit=True, flow_style=True),
...     yaml.ScalarEvent(anchor=None, tag=u'tag:yaml.org,2002:str', implicit=(True, True), value=u'Python'),
...     yaml.ScalarEvent(anchor=None, tag=u'tag:yaml.org,2002:str', implicit=(True, True), value=u'Perl'),
...     yaml.ScalarEvent(anchor=None, tag=u'tag:yaml.org,2002:str', implicit=(True, True), value=u'Ruby'),
...     yaml.SequenceEndEvent(),
...     yaml.MappingEndEvent(),
...     yaml.DocumentEndEvent(explicit=True),
...     yaml.StreamEndEvent(),
... ])

---
agile languages: [Python, Perl, Ruby]
...
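Since the Parser produces events and the Emitter consumes them, the two can be chained. The sketch below (Python 3, recent PyYAML assumed) parses a small document into events and passes them straight to yaml.emit; it checks the result by loading it again rather than by comparing text, since the exact layout is up to the emitter.

```python
import yaml

source = "agile languages: [Python, Perl, Ruby]\n"

# Parse the document into a list of events.
events = list(yaml.parse(source))
names = [type(event).__name__ for event in events]

# Feed the same events to the emitter; without a stream argument
# yaml.emit returns the produced text.
output = yaml.emit(events)

reloaded = yaml.safe_load(output)
```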


Nodes are entities in the YAML information model. There are three kinds of nodes: scalar, sequence, and mapping. In PyYAML, nodes are produced by Composer and can be serialized to a YAML stream by Serializer.

ScalarNode(tag, value, style, start_mark, end_mark)
SequenceNode(tag, value, flow_style, start_mark, end_mark)
MappingNode(tag, value, flow_style, start_mark, end_mark)

The style and flow_style flags have the same meaning as for events. The value of a scalar node must be a unicode string. The value of a sequence node is a list of nodes. The value of a mapping node is a dictionary whose keys and values are nodes.


>>> print yaml.compose("""
... kinds:
... - scalar
... - sequence
... - mapping
... """)

MappingNode(tag=u'tag:yaml.org,2002:map', value={
    ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'kinds'): SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[
        ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'scalar'),
        ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'sequence'),
        ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'mapping')])})

>>> print yaml.serialize(yaml.SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[
...     yaml.ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'scalar'),
...     yaml.ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'sequence'),
...     yaml.ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'mapping')]))

- scalar
- sequence
- mapping
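The two directions can also be combined. In this sketch (Python 3, recent PyYAML assumed) a document is composed into a node graph, the root node is inspected, and the graph is serialized back to text.

```python
import yaml

source = "kinds:\n- scalar\n- sequence\n- mapping\n"

# compose stops at the representation graph: nodes, not Python objects.
node = yaml.compose(source)
root_kind = type(node).__name__
root_tag = node.tag

# serialize turns the node graph back into a YAML document.
text = yaml.serialize(node)
reloaded = yaml.safe_load(text)
```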



BaseLoader, SafeLoader, and Loader provide the PyYAML Loader interfaces. Loader is the most common of them and should be used in most cases. BaseLoader does not resolve or support any tags and constructs only basic Python objects: lists, dictionaries and Unicode strings. SafeLoader supports only standard YAML tags; thus it does not construct class instances and is probably safe to use with documents received from an untrusted source.
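The difference between the loaders shows up on the same input. A minimal sketch for Python 3 with a recent PyYAML, where the Loader class is passed explicitly to yaml.load:

```python
import yaml

doc = "hp: 13\nalive: true\n"

# BaseLoader resolves no tags, so every scalar stays a string.
base = yaml.load(doc, Loader=yaml.BaseLoader)

# SafeLoader resolves the standard YAML tags (int, bool, ...).
safe = yaml.load(doc, Loader=yaml.SafeLoader)
```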

stream is an input YAML stream. It can be a string, a Unicode string, an open file, an open Unicode file.

Scanner interface

scan(stream, Loader=Loader)


Parser interface

Composer interface

Constructor interface

Resolver interface


Emitter interface

Serializer interface

Representer interface

Resolver interface


The yaml package

Deviations from the specification

Download and installing

Check it out from the SVN repository

Install it by running

$ python setup.py install

High-level API

Warning: API is not stable and may change in the future

Basic examples

Start with importing the package:

>>> import yaml

Define the input data:

>>> data = """
... - YAML
... - is
... - fun!
... """

The parser accepts string objects, unicode objects, open file objects, and unicode file objects.

Now convert it to a native Python object:

>>> yaml.load(data)
['YAML', 'is', 'fun!']

Conversely, you may convert a Python object into a YAML document:

>>> print yaml.dump(['YAML', 'is', 'fun!'])
- YAML
- is
- fun!

PyYAML 3000 supports many of the types defined in the YAML tags repository:

>>> data = """
... - ~
... - true
... - 3_141_592.653e-6
... - 3000
... - PyYAML3000 birthday: 2006-02-11
... - primes (sort of): !!set { 2, 3, 5, 7, 11, 13 }
... - pairs: !!pairs [1: 2, 3: 4, 5: 6]
... """
>>> for x in yaml.load(data): print x
None
True
3.141592653
3000
{'PyYAML3000 birthday': datetime.datetime(2006, 2, 11, 0, 0)}
{'primes (sort of)': set([2, 3, 5, 7, 11, 13])}
{'pairs': [(1, 2), (3, 4), (5, 6)]}
>>> print yaml.dump([None, True, False, 123, 123.456, 'a string',
... {'a': 'dictionary'}, ['a', 'list']])
- null
- true
- false
- 123
- 123.456
- a string
- a: dictionary
- - a
  - list

The following tags are supported: !!map, !!omap, !!pairs, !!set, !!seq, !!binary, !!bool, !!float, !!int, !!merge, !!null, !!str, !!timestamp, !!value.

Defining custom tags

You may define constructors for your own application-specific tags. You may use either the function yaml.add_constructor or subclass from yaml.YAMLObject.

Instances of yaml.YAMLObject are automatically serialized to YAML and vice versa. You only need to define the YAML tag with the yaml_tag variable.

class Person(yaml.YAMLObject):
    yaml_tag = '!Person'
    def __init__(self, first_name=None, last_name=None, email=None, birthday=None):
        self.first_name = first_name
        self.last_name = last_name
        self.email = email
        self.birthday = birthday
    def __repr__(self):
        return "%s(first_name=%r, last_name=%r, email=%r, birthday=%r)"  \
                % (self.__class__.__name__, self.first_name, self.last_name,
                   self.email, self.birthday)
>>> p = yaml.load("""
... !Person
... first_name: Kirill
... last_name: Simonov
... email: xi(at)
... birthday: null
... """)
>>> print p
Person(first_name='Kirill', last_name='Simonov', email='xi(at)', birthday=None)
>>> print yaml.dump(p)
last_name: Simonov
first_name: Kirill
email: xi(at)
birthday: null

If you don't want to use metaclass magic, you may define the constructor and representer as functions and register them:

def construct_person(constructor, node):
    # ...
def represent_person(representer, person):
    # ...
yaml.add_constructor('!Person', construct_person)
yaml.add_representer(Person, represent_person)
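One possible way to fill in the two functions is sketched below for Python 3 with a recent PyYAML. The bodies are an assumption, not the original author's code, and the Person class here is a cut-down stand-in. The functions are registered on hypothetical PersonLoader and PersonDumper subclasses so that the global loader and dumper stay untouched.

```python
import yaml


class Person:
    def __init__(self, first_name=None, last_name=None):
        self.first_name = first_name
        self.last_name = last_name


def construct_person(constructor, node):
    # Read the mapping under the !Person tag and build a Person from it.
    fields = constructor.construct_mapping(node)
    return Person(**fields)


def represent_person(representer, person):
    # Emit the instance as a mapping tagged with !Person.
    return representer.represent_mapping("!Person", {
        "first_name": person.first_name,
        "last_name": person.last_name,
    })


class PersonLoader(yaml.SafeLoader):
    """Hypothetical loader that knows the !Person tag."""


class PersonDumper(yaml.SafeDumper):
    """Hypothetical dumper that knows the Person class."""


PersonLoader.add_constructor("!Person", construct_person)
PersonDumper.add_representer(Person, represent_person)

text = yaml.dump(Person("Kirill", "Simonov"), Dumper=PersonDumper)
person = yaml.load(text, Loader=PersonLoader)
```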

Parsing and emitting multiple documents in a stream

If an input stream contains several documents, you may load all of them using the yaml.load_all function.

>>> data = """
... This is the first document
... --- # This is an empty document
... ---
... - this
... - is: the
...   last: document
... """
>>> for document in yaml.load_all(data): print document
This is the first document
None
['this', {'is': 'the', 'last': 'document'}]

You may also dump several documents into the same stream using the yaml.dump_all function.

>>> print yaml.dump_all(["The first document", None, ["The", "last", "document"]])
The first document
--- null
---
- The
- last
- document

There are more features, check the source to find out.

Low-level API

PyYAML 3000 provides a low-level, event-based, and easy-to-use parser and emitter API.


>>> data = """
... --- !tag
... scalar
... ---
... - &anchor item
... - another item
... - *anchor
... ---
... key: value
... ? - complex
...   - key
... : - complex
...   - value
... """
>>> for event in yaml.parse(data): print event
StreamStartEvent()
DocumentStartEvent()
ScalarEvent(anchor=None, tag=u'!tag', implicit=(False, False), value=u'scalar')
DocumentEndEvent()
DocumentStartEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=u'anchor', tag=None, implicit=(True, False), value=u'item')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'another item')
AliasEvent(anchor=u'anchor')
SequenceEndEvent()
DocumentEndEvent()
DocumentStartEvent()
MappingStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'key')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'value')
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'complex')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'key')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'complex')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value=u'value')
SequenceEndEvent()
MappingEndEvent()
DocumentEndEvent()
StreamEndEvent()
>>> events = [
... yaml.StreamStartEvent(encoding='utf-8'),
... yaml.DocumentStartEvent(explicit=True),
... yaml.MappingStartEvent(anchor=None, tag=None, implicit=True),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'flow sequence', implicit=(True, True)),
... yaml.SequenceStartEvent(anchor=None, tag=None, flow_style=True, implicit=True),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'123', implicit=(True, False)),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'456', implicit=(True, False)),
... yaml.SequenceEndEvent(),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'block scalar', implicit=(True, True)),
... yaml.ScalarEvent(anchor=None, tag=None, value=u'YAML\nis\nfun!\n', style='|', implicit=(True, True)),
... yaml.MappingEndEvent(),
... yaml.DocumentEndEvent(explicit=True),
... yaml.StreamEndEvent(),
... ]

>>> print yaml.emit(events)
---
flow sequence: [123, 456]
block scalar: |
  YAML
  is
  fun!
...

To Do

Long-term goals:

  • fix tabs, indentation for flow collections, indentation for scalars (min=1?), 'y' is !!bool,
  • libyaml3000

Deviations from the specification

  • rules for tabs in YAML are confusing. We are close, but not there yet. Perhaps both the spec and the parser should be fixed. Anyway, the best rule for tabs in YAML is to not use them at all.
  • Byte order mark. The initial BOM is stripped, but BOMs inside the stream are considered as parts of the content. It can be fixed, but it's not really important now.
  • Empty plain scalars are not allowed if alias or tag is specified. This is done to prevent anomalies like [ !tag, value], which can be interpreted both as [ !<!tag,> value ] and [ !<!tag> "", "value" ]. The spec should be fixed.
  • Indentation of flow collections. The spec requires them to be indented more than their block parent node. Unfortunately this rule makes many intuitively correct constructs invalid, for instance,
    block: {
    } # this is indentation violation according to the spec.
  • ':' is not allowed for plain scalars in the flow mode. {1:2} is interpreted as { 1 : 2 }.