Ticket #128 (new defect)
libyaml should detect duplicate keys and report an error.
| Reported by: | anonymous | Owned by: | xi |
|---|---|---|---|
| Priority: | high | Component: | libyaml |
| Severity: | major | Keywords: | |
| Cc: |
Description
In the 1.1 spec I believe this is considered an error; currently libyaml silently accepts duplicate keys.
Change History
comment:2 Changed 4 years ago by rblazecka@…
I was just bitten by this bug; I strongly believe that the library should throw an exception when this scenario is encountered.
The YAML spec as written allows the parser to continue parsing the rest of the file, as highlighted in the comment above. However, this appears to be a concession, not a rule (it "may" continue, not "must"). The preceding sentence states unequivocally that two equal keys are an error. If the processor does pass over the error, it _must_ issue a warning. There is no case in which duplicate keys are silently ignored.
In all of the use cases in my application, any invalid YAML should be an error. That is why I'd like a straightforward exception saying that the data couldn't be parsed.
In this simple test case:
import yaml
# safe_load builds plain Python objects; the second "one" silently wins
print(yaml.safe_load('''
one: 1
one: 4
'''))
pyyaml loads the data without any warning at all. The output of this program is:
{'one': 4}
There are no warnings or errors indicating any problem with the data stream. Ideally the library would be configurable between the two modes, but if that clutters the interface too much, I'd like it to just raise an exception.
(Note also that this output is incorrect compared to what the spec describes: it should ignore the second value, not keep it.)
comment:3 Changed 4 years ago by anonymous
a) this really sucks, loading something that's considered "an error" and not exposing that to the user is broken. b) libyaml is returning the last instance of the key, not the first.
comment:4 Changed 3 years ago by sven@…
I do not know why this issue has not been addressed for some months, but I have found a solution that fits my needs. See the unified diff below. It was hard to find that the file to change was constructor.py ... but...
--- constructur-r303.py Do Feb 18 18:10:45 2010
+++ constructor-mine.py Do Feb 18 18:12:29 2010
@@ -1,4 +1,3 @@
-
 __all__ = ['BaseConstructor', 'SafeConstructor', 'Constructor',
     'ConstructorError']
@@ -137,6 +136,12 @@
                 raise ConstructorError("while constructing a mapping", node.start_mark,
                         "found unacceptable key (%s)" % exc, key_node.start_mark)
             value = self.construct_object(value_node, deep=deep)
+            #print key
+            #print mapping
+            #print "-----------"
+            if key in mapping:
+                raise ConstructorError("while constructing a mapping", node.start_mark,
+                        "found already in-use key (%s)" % key, key_node.start_mark)
             mapping[key] = value
         return mapping
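
For anyone who cannot patch their installed copy, roughly the same check can probably be done from user code by subclassing a loader. The following is only a sketch, not part of PyYAML: the class name `UniqueKeyLoader` and the constant `MERGE_TAG` are made up for illustration, and it assumes a PyYAML version where `yaml.SafeLoader` exposes `construct_mapping` and `construct_object` as in the patched file above.

```python
import yaml
from yaml.constructor import ConstructorError
from yaml.nodes import MappingNode

# Tag the resolver assigns to the "<<" merge key (name chosen for this sketch).
MERGE_TAG = "tag:yaml.org,2002:merge"


class UniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader that raises on duplicate mapping keys (hypothetical helper)."""

    def construct_mapping(self, node, deep=False):
        if isinstance(node, MappingNode):
            seen = set()
            for key_node, _value_node in node.value:
                if key_node.tag == MERGE_TAG:
                    continue  # "<<" merge keys are expanded by the base class
                key = self.construct_object(key_node, deep=True)
                try:
                    is_duplicate = key in seen
                except TypeError:
                    continue  # unhashable key: leave it to the base class
                if is_duplicate:
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found duplicate key (%r)" % (key,), key_node.start_mark)
                seen.add(key)
        # Delegate the actual construction to the stock implementation.
        return super().construct_mapping(node, deep=deep)


try:
    yaml.load("one: 1\none: 4\n", Loader=UniqueKeyLoader)
except ConstructorError as exc:
    print("rejected:", exc.problem)
```

Keys that arrive through a `<<` merge are deliberately skipped, so overriding a merged key in the same mapping is still accepted; only keys written twice in one mapping are rejected.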
comment:5 Changed 3 years ago by sven.witterstein@…
- Priority changed from normal to high
- Severity changed from normal to major
A month later and no comment on my suggested patch. Sad. I think the importance and priority of this bug should be raised, which I am doing now (if I may). All that needs to be done is to apply my patch and maybe add a config parameter such as --strict-key-check or the like...
comment:6 Changed 3 years ago by strombrg@…
I need this too. It could conceivably prevent me from being able to use PyYAML in my current project.
comment:7 Changed 3 years ago by rdesgroppes
+1. JSON's RFC4627 requires that mappings keys merely “SHOULD” be unique, while YAML insists they “MUST” be. Technically, YAML therefore complies with the JSON spec, choosing to treat duplicates as an error. In practice, since JSON is silent on the semantics of such duplicates, the only portable JSON files are those with unique keys, which are therefore valid YAML files. (from http://www.yaml.org/spec/1.2/spec.html#id2759572)
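
For comparison, Python's standard `json` module also keeps the last value for a repeated key by default, but it lets the caller intercept the raw key/value pairs. A minimal sketch of rejecting duplicates on the JSON side (the function name `reject_duplicates` is made up for illustration):

```python
import json


def reject_duplicates(pairs):
    """Build a dict from (key, value) pairs, failing on a repeated key."""
    mapping = {}
    for key, value in pairs:
        if key in mapping:
            raise ValueError("duplicate key: %r" % (key,))
        mapping[key] = value
    return mapping


text = '{"one": 1, "one": 4}'

# Default behaviour: the later value silently replaces the earlier one.
print(json.loads(text))

# With the hook, the same document is rejected.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print("rejected:", exc)
```

`object_pairs_hook` is called for every JSON object with its pairs in document order, which is what makes the check possible without touching the parser.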
comment:8 Changed 2 years ago by amannijhawan@…
I think raising an error would be harsh and might break existing code for users. However, raising a warning would be useful.
comment:10 Changed 14 months ago by e.a.b.piel@…
Any progress on this matter? In my project I'd like to detect such an erroneous construct, but without pyyaml telling me anything I have no idea how to do that :-( I also understand from the spec that it should generate an error, but just a warning would already be very nice :-D

http://yaml.org/spec/1.1/#id932806
It is an error for two equal keys to appear in the same mapping node. In such a case the YAML processor may continue, ignoring the second key: value pair and issuing an appropriate warning. This strategy preserves a consistent information model for one-pass and random access applications.
I do not think the library should raise an exception in this case.
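
The strategy in the spec passage quoted above (keep the first pair, ignore the second, warn) and the stricter behaviour requested elsewhere in this ticket can be modelled independently of PyYAML. This is only an illustration of the two modes: `build_mapping` and its `strict` flag are hypothetical names, not an existing PyYAML interface.

```python
import warnings


def build_mapping(pairs, strict=False):
    """Build a dict from (key, value) pairs in document order.

    strict=False: keep the first value for a repeated key and warn.
    strict=True:  raise ValueError on the first repeated key.
    """
    mapping = {}
    for key, value in pairs:
        if key in mapping:
            if strict:
                raise ValueError("duplicate key: %r" % (key,))
            warnings.warn("duplicate key %r ignored" % (key,), stacklevel=2)
            continue  # ignore the second key: value pair
        mapping[key] = value
    return mapping


pairs = [("one", 1), ("one", 4)]
print(build_mapping(pairs))  # warns and keeps the first value
```

Note that this keeps `1` for `one`, whereas the test case earlier in the ticket shows pyyaml keeping `4`.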