Ticket #129 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

Incorrect Unicode BOM generation

Reported by: Valentin Nechayev <netchv@…>
Owned by: xi
Priority: normal
Component: pyyaml
Severity: normal
Keywords:
Cc:

Description

PyYAML 3.07 with Python 2.5 on FreeBSD (package name py25-yaml-3.07_2).

When yaml.dump() emits a stream encoded as utf-16be or utf-16le, it writes a byte-order mark (BOM), but the BOM is wrong: the emitter encodes the two characters U+00FF and U+00FE instead of the single character U+FEFF. Example:

>>> yaml.dump("xyz", encoding = 'utf-16be')
'\x00\xff\x00\xfe\x00x\x00y\x00z\x00\n\x00.\x00.\x00.\x00\n'

Instead, it should generate:

'\xfe\xff\x00x\x00y\x00z\x00\n\x00.\x00.\x00.\x00\n'

Fix:

--- 01/PyYAML-3.07/lib/yaml/emitter.py  2008-12-29 01:36:32.000000000 +0200
+++ work/PyYAML-3.07/lib/yaml/emitter.py        2009-06-06 16:48:39.000000000 +0300
@@ -787,7 +787,7 @@
     def write_stream_start(self):
         # Write BOM if needed.
         if self.encoding and self.encoding.startswith('utf-16'):
-            self.stream.write(u'\xFF\xFE'.encode(self.encoding))
+            self.stream.write(u'\uFEFF'.encode(self.encoding))
 
     def write_stream_end(self):
         self.flush_stream()
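
The difference between the two literals in the patch can be checked outside PyYAML. The following is a minimal sketch in Python 3 syntax (the ticket itself targets Python 2.5); the helper names buggy_bom and fixed_bom are made up for illustration and are not part of PyYAML.

```python
import codecs

def buggy_bom(encoding):
    # Pre-patch literal: two separate characters, U+00FF and U+00FE.
    # Each one is encoded as its own 16-bit code unit, giving four bytes.
    return '\xff\xfe'.encode(encoding)

def fixed_bom(encoding):
    # Post-patch literal: the single character U+FEFF, which the codec
    # serializes in the byte order named by the encoding.
    return '\ufeff'.encode(encoding)

for enc in ('utf-16-be', 'utf-16-le'):
    print(enc, buggy_bom(enc), fixed_bom(enc))

# The fixed form matches the BOM constants from the standard library.
assert fixed_bom('utf-16-be') == codecs.BOM_UTF16_BE
assert fixed_bom('utf-16-le') == codecs.BOM_UTF16_LE
```

For utf-16-be the pre-patch literal yields the four bytes seen at the start of the faulty output above, while the patched literal yields the expected two-byte mark.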

P.S. I guess it should also generate BOMs for utf-32*.

Change History

comment:1 Changed 5 years ago by xi

  • Status changed from new to closed
  • Resolution set to fixed

Thank you for the report and the fix. Fixed in [351].

UTF-32 is not supported by the YAML specification.
