man ascii2uni (Commands) - convert 7-bit ASCII representations to UTF-8 Unicode

NAME

ascii2uni - convert 7-bit ASCII representations to UTF-8 Unicode

SYNOPSIS

ascii2uni [options]

DESCRIPTION

ascii2uni converts various 7-bit ASCII representations to UTF-8. It reads from the standard input and writes to the standard output. The representations it understands are listed below under the command line options.
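
To illustrate what such a conversion involves, here is a minimal Python sketch. It is not the program's own implementation, only an illustration of two notations listed under the options below: a code point written as \u00E9, and UTF-8 bytes written as =C3=A9. The function names are hypothetical and exist only for this example.

```python
import re

def decode_u_escapes(text: str) -> bytes:
    """Replace each \\uXXXX escape (four hex digits) with the character
    it names, then encode the whole string as UTF-8."""
    converted = re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
    return converted.encode("utf-8")

def decode_equals_bytes(text: str) -> bytes:
    """Replace each =XX pair with the single raw byte it names.
    All other (ASCII) characters pass through unchanged."""
    data = text.encode("ascii")
    return re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )

if __name__ == "__main__":
    # Both notations describe the same character, U+00E9.
    print(decode_u_escapes(r"caf\u00E9"))
    print(decode_equals_bytes("caf=C3=A9"))
```

The first function starts from a code point and has to perform the UTF-8 encoding itself; the second starts from text that already spells out the UTF-8 bytes, so it only needs to turn each escaped pair back into a byte.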

COMMAND LINE OPTIONS

-h
Help. Print the usage message and exit.
-v
Print program version information and exit.
-p
Pure. Assume that the input consists entirely of escapes except for arbitrary (but non-null) amounts of separating whitespace.
-q
Be quiet. Do not chat unnecessarily.

Supply one of the following conversion specifications:

-A
Convert hexadecimal numbers with prefix U in angle-brackets (e.g. <U00E9>).
-B
Convert \x-escaped hexadecimal numbers (e.g. \x00E9).
-C
Convert \x escaped hexadecimal numbers in braces (e.g. \x{00E9}).
-D
Convert decimal HTML numeric character references (e.g. &#0233;).
-E
Convert hexadecimal with prefix U (e.g. U00E9).
-F
Convert hexadecimal with prefix u (e.g. u00E9).
-G
Convert hexadecimal in single quotes with prefix X (e.g. X'00E9').
-H
Convert hexadecimal HTML numeric character references (e.g. &#x00E9;).
-I
Convert hexadecimal UTF-8 with each byte's hex preceded by an =-sign (e.g. =C3=A9).
-J
Convert hexadecimal UTF-8 with each byte's hex preceded by a %-sign (e.g. %C3%A9). This is the URI escape format defined by RFC 2396.
-K
Convert octal UTF-8 with each byte escaped by a backslash (e.g. \303\251).
-L
Convert \U-escaped hexadecimal numbers outside the Basic Multilingual Plane (BMP) and \u-escaped hexadecimal numbers within it.
-P
Convert hexadecimal numbers with prefix U+ (e.g. U+00E9).
-Q
Convert HTML character entities (e.g. &eacute;). If an unknown character entity is encountered, a warning is issued and the Unicode Replacement Character (0xFFFD) is emitted.
-R
Convert raw hexadecimal numbers (e.g. 00E9).
-U
Convert \u-escaped hexadecimal numbers (e.g. \u00E9).
-X
Convert standard hexadecimal numbers (e.g. 0x00E9).
-Y
Convert all three types of HTML escape: hexadecimal character references, decimal character references, and character entities.
-Z <format>
Convert input using the supplied format. The format specified will be used as the format string in a call to sscanf(3) with a single argument consisting of a pointer to an unsigned long integer. For example, to obtain the same results as with the -U flag, the format would be: \u%04X.
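
The idea behind a user-supplied format can be sketched in Python as follows. This is a simplified emulation under stated assumptions, not the program's code: the helper name convert_with_format is hypothetical, only a single %X, %x, %d, or %u conversion with an optional width is supported, and details of sscanf(3) such as an optional 0x prefix, a sign, or whitespace handling are ignored.

```python
import re

def convert_with_format(fmt: str, text: str) -> str:
    """Replace every occurrence of fmt in text with the character whose
    code point is given by the number that the conversion matches."""
    spec = re.search(r"%(\d*)l?([Xxdu])", fmt)
    if spec is None:
        raise ValueError("format needs one %X, %x, %d or %u conversion")

    width = int(spec.group(1)) if spec.group(1) else 0
    base = 16 if spec.group(2) in "Xx" else 10
    digits = "[0-9A-Fa-f]" if base == 16 else "[0-9]"
    # As in sscanf, a width is the maximum number of characters read.
    quantifier = "{1,%d}" % width if width else "+"

    # Literal text before and after the conversion must match exactly.
    pattern = re.compile(
        re.escape(fmt[:spec.start()])
        + "(" + digits + quantifier + ")"
        + re.escape(fmt[spec.end():])
    )
    # chr() raises ValueError for numbers above 0x10FFFF.
    return pattern.sub(lambda m: chr(int(m.group(1), base)), text)

if __name__ == "__main__":
    # Same notation as the -U example: \u followed by hex digits.
    print(convert_with_format(r"\u%04X", r"caf\u00E9"))
    # Same notation as the -A example: <U...>.
    print(convert_with_format("<U%X>", "caf<U00E9>"))
```

The sketch shows why one numeric conversion is enough: the literal parts of the format locate the escape, and the single number that is read identifies the character to emit.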

EXIT STATUS

The following values are returned on exit:

0 SUCCESS
The input was successfully converted.
3 INFO
The user requested information such as the version number or usage synopsis and this has been provided.
5 BAD OPTION
An incorrect option flag was given on the command line.
7 OUT OF MEMORY
An attempt to allocate additional memory failed.
8 BAD RECORD
An ill-formed record was detected in the input.
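
A calling script can use these values to tell a failed conversion from an informational exit. The Python sketch below assumes that ascii2uni is installed and on the PATH; the names EXIT_MEANINGS, describe_exit, and run_ascii2uni are hypothetical and introduced only for this example.

```python
import subprocess

# Exit values as listed in this section.
EXIT_MEANINGS = {
    0: "SUCCESS",
    3: "INFO",
    5: "BAD OPTION",
    7: "OUT OF MEMORY",
    8: "BAD RECORD",
}

def describe_exit(code: int) -> str:
    """Map an exit value to its documented name."""
    return EXIT_MEANINGS.get(code, "UNKNOWN (%d)" % code)

def run_ascii2uni(flag: str, text: str) -> bytes:
    """Run ascii2uni quietly with one conversion flag (e.g. "-U") and
    return its standard output. Raise RuntimeError on a non-zero exit.
    Assumption: the ascii2uni executable is available on PATH."""
    proc = subprocess.run(
        ["ascii2uni", "-q", flag],
        input=text.encode("ascii"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        raise RuntimeError(
            "ascii2uni exited with " + describe_exit(proc.returncode)
        )
    return proc.stdout
```

For example, run_ascii2uni("-U", r"\u00E9") should return the UTF-8 bytes for that character if the program behaves as described above, while ill-formed input would surface as a RuntimeError that names BAD RECORD.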

SEE ALSO

uni2ascii(1)

AUTHOR

Bill Poser (billposer@alum.mit.edu)

LICENSE

GNU General Public License