Character encoding aliases for legacy web content
In order to be compatible with legacy web content when interpreting something like Content-Type: text/html; charset=latin1, tools need to use a particular set of aliases for encoding labels as well as some overriding rules. For example, US-ASCII and iso-8859-1 on the web are actually aliases for windows-1252, and a UTF-8 or UTF-16 BOM takes precedence over any other encoding declaration. The WHATWG Encoding standard defines all such details so that implementations do not have to reverse-engineer each other.
This module implements the parts of the Encoding standard that deal with encoding labels and BOM detection; the encoders and decoders themselves are the ones provided by Python.
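To make the two rules described above concrete, here is a minimal standard-library sketch of label aliasing and BOM precedence. It is not this module's code or API: the names `LABELS`, `PYTHON_CODECS`, `lookup`, `sniff_bom`, and `decode` are hypothetical, and the alias table is a small hand-picked subset of the labels listed in the WHATWG Encoding standard.

```python
import codecs

# Hypothetical, heavily abridged alias table: label -> canonical encoding name.
# The real standard lists many more labels per encoding.
LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "windows-1252": "windows-1252",
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "us-ascii": "windows-1252",
    "ascii": "windows-1252",
    "utf-16": "utf-16le",
    "utf-16le": "utf-16le",
    "utf-16be": "utf-16be",
}

# Canonical name -> name of the Python codec that does the actual decoding.
PYTHON_CODECS = {
    "utf-8": "utf-8",
    "windows-1252": "cp1252",
    "utf-16le": "utf-16-le",
    "utf-16be": "utf-16-be",
}


def lookup(label):
    """Return the canonical encoding name for a label, or None if unknown."""
    # Labels are matched after stripping ASCII whitespace, case-insensitively.
    return LABELS.get(label.strip("\t\n\f\r ").lower())


def sniff_bom(data):
    """Return (encoding name or None, data with any BOM removed)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", data[len(codecs.BOM_UTF8):]
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be", data[len(codecs.BOM_UTF16_BE):]
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le", data[len(codecs.BOM_UTF16_LE):]
    return None, data


def decode(data, declared_label):
    """Decode bytes; a BOM, when present, overrides the declared label."""
    bom_encoding, body = sniff_bom(data)
    name = bom_encoding or lookup(declared_label)
    if name is None:
        raise LookupError(f"unknown encoding label: {declared_label!r}")
    return body.decode(PYTHON_CODECS[name], errors="replace"), name
```

With this sketch, `decode(b"caf\xe9", "iso-8859-1")` decodes the bytes as windows-1252, while the same label on input that starts with a UTF-8 BOM is ignored in favour of UTF-8.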
Linter | Message | Location |
---|---|---|
input-labels (Identify input labels that do not match package names) | label 'python-pytest' does not match package name 'python2-pytest' | |
optional-tests (Make sure tests are only run when requested) | the 'check' phase should respect #:tests? | |