man cstocs () - cstocs -- charset encoding convertor for the Czech and Slovak languages.
NAME
cstocs -- charset encoding convertor for the Czech and Slovak languages.
FORMAT
cstocs [options] src_encoding dst_encoding [files ...]
SYNOPSIS
cstocs il2 ascii < file | less
cstocs -i utf8 il2 file1 file2 file3
cstocs --help
DESCRIPTION
Cstocs is a simple conversion utility that changes the charset encoding of a text. It reads either the specified files or (if none are specified) the standard input, assumes that the input is encoded in src_encoding, and tries to reencode it into dst_encoding. The result is written to the standard output.
Run cstocs without parameters to get a short help text and a list of available encodings.
Characters that are not defined in src_encoding are passed to the output unchanged.
If the source text contains a character that is defined in src_encoding but not in dst_encoding, it can be handled in several ways. For example, the character e with caron (symbol ecaron) and d with caron (symbol dcaron) are included in the iso-8859-2 encoding but not in iso-8859-1. If you reencode 8859-2 text to 8859-1, you may want to do one of the following:
- 1.
- Keep it the same, option --nofillstring.
- 2.
- Produce no output in place of the ecaron symbol, option --null.
- 3.
- Substitute some string (possibly a space) for both ecaron and dcaron, option --fillstring.
- 4.
- Substitute the letter d for dcaron and e for ecaron. It is even possible to substitute a string for a symbol, so you can replace the AE Latin character with the string AE (letter A and letter E), or a plus-minus sign with the string +/-. These substitutions are described in the accent file.
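The four choices above can be modelled in a few lines of Python. This is a conceptual sketch only, not the cstocs implementation: the function name convert, the ACCENT table, and the use of Python codecs to decide which characters exist in the destination encoding are all assumptions made for illustration. It assumes the accent rules are consulted first and the fill string is used only when no rule applies, as the --fillstring description states.

```python
# Conceptual model of the four handling strategies; NOT how cstocs is written.
# ACCENT is a hypothetical stand-in for the accent file.
ACCENT = {"ě": "e", "ď": "d", "Æ": "AE", "±": "+/-"}


def convert(text, dst_encoding, accent=ACCENT, fill=" "):
    """Rewrite characters that dst_encoding cannot represent.

    accent=None models --noaccent (no accent file).
    fill=None   models --nofillstring (keep the character as it is).
    fill=""     models --null (drop the character).
    fill="..."  models --fillstring (default: a single space).
    """
    out = []
    for ch in text:
        try:
            ch.encode(dst_encoding)        # representable in the destination?
            out.append(ch)
        except UnicodeEncodeError:
            if accent and ch in accent:    # choice 4: accent file rule
                out.append(accent[ch])
            elif fill is None:             # choice 1: keep unchanged
                out.append(ch)
            else:                          # choices 2 and 3: empty or fill string
                out.append(fill)
    return "".join(out)


print(convert("děd", "latin-1"))                      # accent rule applies
print(convert("děd", "latin-1", accent=None))         # default fill string
print(convert("děd", "latin-1", accent=None, fill=""))
```

Here "latin-1" stands in for iso-8859-1; with "ascii" as the destination, the same model rewrites the AE character and the plus-minus sign through their multi-character rules.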
OPTIONS
- -i, -i.ext, --inplace.ext
- Files specified will be converted in place, using Perl's -i facility. Optionally, an extension for backup copies may be specified after the dot. This parameter has to be the first one, if specified.
- --dir directory
- Encoding files are taken from directory instead of the default, which is Cz/Cstocs/enc in the Perl lib tree. The location of encoding files can also be changed using the CSTOCSDIR environment variable, but the --dir option has the highest priority.
- --fillstring string
- If the source text contains a character that is defined in src_encoding but neither in dst_encoding nor in the accent file (or the accent file is not used), it is replaced by string. The default is a single space.
- --nofillstring
- Disable changes of characters that would otherwise have the fill string applied. This differs from --null, which removes the character from the output.
- --null
- Completely equivalent to --fillstring "".
- --nochange or --noaccent
- Do not use the accent file at all.
- --onebyone
- Use only those rules from the accent file that rewrite one character to one character. If this option is specified, the character ecaron will be rewritten to e, but the AE character will not be rewritten to the string AE.
- --onebymore
- Use all rules from the accent file. This is the default option.
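Two of the rules in this section are precedence or filtering rules, which may be easier to follow as code. The Python sketch below is illustrative only and is not taken from cstocs: the function names select_rules and encoding_dir, the mode strings, and the dictionary representation of the accent file are hypothetical. The default directory is given only as the relative path named above, since its location inside the Perl lib tree depends on the installation.

```python
import os


def select_rules(accent, mode="onebymore"):
    """Pick the accent rules that apply under a given option.

    mode="noaccent"  models --nochange / --noaccent: no rules at all.
    mode="onebyone"  models --onebyone: one character to one character only.
    mode="onebymore" models --onebymore (the default): every rule.
    """
    if mode == "noaccent":
        return {}
    if mode == "onebyone":
        return {src: dst for src, dst in accent.items()
                if len(src) == 1 and len(dst) == 1}
    return dict(accent)


def encoding_dir(dir_option=None, environ=None, default="Cz/Cstocs/enc"):
    """Resolve the encoding directory: --dir, then CSTOCSDIR, then the default."""
    if environ is None:
        environ = os.environ
    if dir_option:                          # --dir has the highest priority
        return dir_option
    return environ.get("CSTOCSDIR") or default


rules = {"ě": "e", "Æ": "AE"}               # hypothetical accent rules
print(select_rules(rules, "onebyone"))      # only the ecaron rule survives
print(encoding_dir("/tmp/enc", {"CSTOCSDIR": "/opt/enc"}))
```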
SEE ALSO
Cz::Cstocs(3).
AUTHOR
Jan Yenya Kasprzak has done the original Un*x implementation.
Jan Pazdziora, adelton@fi.muni.cz, created the Perl module version.