lynx/tests/lynx-dump/data/c1.html.exp
Kamil Dudka 5bdda90d01 Resolves: CVE-2021-38165 - implement a gating test
... based on `fmf` and `tmt`
2021-10-15 10:12:52 +02:00



Test of invalid NCRs 128-159
Authoring tools on MS Windows, in particular MS FrontPage ("WYSIWYG"
HTML editor), generate invalid Numerical Character References for
characters commonly found in positions 128...159 (0x80...0x9f) in
Windows fonts. Although these are valid codepoints for windows-1252
(and other windows-xxxx) charsets, valid NCRs always refer to the
document character set in the SGML sense, not to the character encoding
scheme (or charset). For HTML, the SGML document character set is
fixed; it is always a subset of Unicode (or ISO 10646). In Unicode and
its iso-8859-1 subset, values 128...159 are C1 control characters; they
must not appear in HTML. Valid NCRs for the intended characters use
Unicode values greater than 256.
Lynx tries to interpret some of the invalid codes, by assuming that
they are windows-1252 codepoints.
You may want to press '\' to view the source of this test.
Code invalid NCR valid NCR, description
normal in ALT
0x80 € € € #EURO SIGN
0x81  <20> #NOT USED
0x82 ‚ #SINGLE LOW-9 QUOTATION MARK
0x83 ƒ ƒ ƒ #LATIN SMALL LETTER F WITH HOOK
0x84 „ „ „ #DOUBLE LOW-9 QUOTATION MARK
0x85 … … … #HORIZONTAL ELLIPSIS
0x86 † † † #DAGGER
0x87 ‡ ‡ ‡ #DOUBLE DAGGER
0x88 ˆ ˆ ˆ #MODIFIER LETTER CIRCUMFLEX ACCENT
0x89 ‰ ‰ ‰ #PER MILLE SIGN
0x8a Š Š Š #LATIN CAPITAL LETTER S WITH CARON
0x8b ‹ #SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8c Œ Œ Œ #LATIN CAPITAL LIGATURE OE
0x8d  <20> #NOT USED
0x8e Ž Ž #NOT USED
0x8f  <20> #NOT USED
0x90  <20> #NOT USED
0x91 ‘ #LEFT SINGLE QUOTATION MARK
0x92 ’ #RIGHT SINGLE QUOTATION MARK
0x93 “ “ “ #LEFT DOUBLE QUOTATION MARK
0x94 ” ” ” #RIGHT DOUBLE QUOTATION MARK
0x95 • • • #BULLET
0x96 – #EN DASH
0x97 — — — #EM DASH
0x98 ˜ ˜ ˜ #SMALL TILDE
0x99 ™ ™ ™ #TRADE MARK SIGN
0x9a š š š #LATIN SMALL LETTER S WITH CARON
0x9b › #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9c œ œ œ #LATIN SMALL LIGATURE OE
0x9d  <20> #NOT USED
0x9e ž ž #NOT USED
0x9f Ÿ Ÿ Ÿ #LATIN CAPITAL LETTER Y WITH DIAERESIS
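The fallback described above, where invalid NCRs in the range 128...159 are read as windows-1252 codepoints, can be sketched roughly as follows. This is an illustration only and not Lynx's actual implementation: the function names `fix_c1_ncr` and `replace_ncrs` are hypothetical, and Python's built-in `cp1252` codec stands in for whatever mapping table Lynx uses. Python's codec also maps 0x8e and 0x9e to Z with caron, which the table above lists as NOT USED, so the two mappings are not assumed to agree on every row.

```python
import re

# Matches decimal (&#151;) and hexadecimal (&#x97;) numeric character references.
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def fix_c1_ncr(code: int) -> str:
    """Return the character for an NCR value.

    Values 128..159 are C1 controls in Unicode, so they are reinterpreted
    as windows-1252 bytes (assumption: Python's cp1252 table is close
    enough to the mapping described in the text).
    """
    if 0x80 <= code <= 0x9F:
        try:
            return bytes([code]).decode("cp1252")
        except UnicodeDecodeError:
            # Byte has no assignment in Python's cp1252 table (e.g. 0x81).
            return REPLACEMENT
    try:
        return chr(code)
    except ValueError:
        # Value is outside the Unicode range.
        return REPLACEMENT


def replace_ncrs(text: str) -> str:
    """Replace every numeric character reference in text."""
    def substitute(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return fix_c1_ncr(code)

    return NCR.sub(substitute, text)


if __name__ == "__main__":
    # Invalid NCR (151) and valid NCR (8212) for the same intended character.
    print(replace_ncrs("invalid: &#151;  valid: &#8212;"))
```

With this sketch, the invalid `&#151;` and the valid `&#8212;` both resolve to U+2014, which matches the EM DASH row of the table.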