Ticket #43 (closed defect: fixed)
path resolver has bug
| Reported by: | jstroud@… | Owned by: | xi |
|---|---|---|---|
| Priority: | high | Component: | pyyaml |
| Severity: | major | Keywords: | Path Resolver |
| Cc: |
Description
Path resolving does not work for Nodes within Sequence Nodes
[This is current for pyyaml 3.04.]
For example:
yaml.add_path_resolver(u'!MyObject', (u'myobjs', [list, False]))
should match all of the items in the myobjs sequence in this yaml:
"""
--- !MyConfig
myobjs :
- name: x
root: z
- name: p
root: q
...
"""
The problem is here in resolver.py (line 210):
elif isinstance(index_check, int):
if index_check != current_index:
return
If I'm infering the intent of the code correctly
(u'myobjs', (list, False))
should match all elements of the sequence keyed by "myobjs". However,
isinstance(False, int)
always returns True because bool is a subclass of int, so index_check will only equal current_index for the first element when False is specified as the index_check.
Additionally, I believe another bug exists in resolver.py. According to the code in add_path_resolver(), index_check defaults to True when not specified (resolver.py, line 39):
if isinstance(element, (list, tuple)):
if len(element) == 2:
node_check, index_check = element
elif len(element) == 1:
node_check = element[0]
index_check = True
The intendend meaning of True here is not clear because True causes every element in the sequence to fail the match. The relevant test is in check_resolver_prefix() (resolver.py, line 116):
if index_check is True and current_index is not None:
return
Thus, any path not specifying an index_check will never match. Most insidious to the user is that this case is never reported nor an exception thrown. If index_check is True, this code will only continue to test if current_index is None, which should never happen for any node of any sequence, because any node of a sequence must, by virtue of its existence, have an index.
To fix these problems, I propose the following patch to resolver.py:
44c44 < index_check = True --- > index_check = False 122c122 < elif isinstance(index_check, int): --- > elif isinstance(index_check, int) and index_check is not False:
This will set the default value of index_check to match all elements of a sequence if no index_check is given and will not attempt to compare False to the value of index_check.
At the bare minimum, the following patch should be strongly considered if a good reason exists to leave the defalut of index_check to True.
122c122 < elif isinstance(index_check, int): --- > elif isinstance(index_check, int) and index_check is not False:
Moreover, the code should provide some indication for what an index_check value of True actually means.
In fact, I propose re-writing the code to use None to match all elements, because
isinstance(None, int)
will always evaluate to False. This will also remove the question as to what the "other" bool value means because bools will not be used. I can provide a patch if this latter proposal sounds good to the developers.
Attachments
Change History
comment:3 Changed 6 years ago by xi
- Status changed from new to closed
- Resolution set to fixed
Thanks for the report. The bug is, indeed, real. It is fixed in [246]. Still note that the add_path_resolver is an experimental API which has never been completed and is subject to change.
The "index_check is True" is supposed to match against mapping keys. It makes sense because keys in YAML could be sequence or mapping nodes. For example:
>>> yaml.add_path_resolver('tag:yaml.org,2002:python/tuple', ([dict, True],), list) >>> yaml.load('[1, 2, 3]: 4') {(1, 2, 3): 4}

It looks you don't use the add_path_resolver function correctly. If you want to match all of the items in the myobjs sequence in
--- !MyConfig myobjs : - name: x root: z - name: p root: qthen call
Then the above document will be loaded as
--- !MyConfig myobjs : - !MyObject name: x root: z - !MyObject name: p root: qI guess this is what you needed?