man thescoder (Commandes) - Compiler for OpenOffice.org 1.x thesaurus files

NAME

thescoder - Compiler for OpenOffice.org 1.x thesaurus files

SYNOPSIS

thescoder [options] inputfile prefix

DESCRIPTION

thescoder compiles thesaurus data into the .idx and .dat files used by the OpenOffice.org 1.x thesaurus.

Compared to the equivalent tool found in the OpenOffice.org development tools, thescoder has additional features:

•
it is possible to insert spaces between different synonyms;
•
it is possible to use any character as a separator between synonyms;
•
if a synonym is not itself present as a word, it is added automatically as a word without synonyms (and a warning is shown when this happens);
•
if a word has no synonyms, it is preserved (and a warning is shown when this happens);
•
the word list need not be sorted.
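
The behaviour listed above can be illustrated with a short Python sketch. It is not thescoder's actual implementation: the function name normalize_entries and the warning texts are invented here for illustration, and only the rules described in the list are modelled.

```python
def normalize_entries(lines, sep=";"):
    """Return (entries, warnings) for lines of the form 'word;syn;syn'.

    Hypothetical helper: it models the documented rules only,
    not the real thescoder code.
    """
    entries = {}
    warnings = []
    for line in lines:
        # Spaces around each field are dropped, so "a; b" equals "a;b".
        fields = [f.strip() for f in line.split(sep)]
        fields = [f for f in fields if f]
        if not fields:
            continue
        word, synonyms = fields[0], fields[1:]
        if not synonyms:
            # A word without synonyms is kept, with a warning.
            warnings.append(f"word without synonyms: {word}")
        entries[word] = synonyms
    # A synonym that never appears as a word becomes a word of its own.
    for synonyms in list(entries.values()):
        for syn in synonyms:
            if syn not in entries:
                warnings.append(f"synonym added as word: {syn}")
                entries[syn] = []
    return entries, warnings


entries, warnings = normalize_entries(
    ["poeta, araldo, verseggiatore", "bardo"], sep=","
)
# Sorting at output time is one way to accept unsorted input.
for word in sorted(entries):
    print(word, entries[word])
for message in warnings:
    print("warning:", message)
```

In this sketch "araldo" and "verseggiatore" end up as words with empty synonym lists, and "bardo" is kept despite having none.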

OPTIONS

-sep separator
Specify a different separator character for the input (the default is the semicolon, ';').

INPUT FORMAT

The input file is a text file.

The first line is the sequence of symbols that are considered correct, such as:

   qwertyuioplkjhgfdsazxcvbnmQWERTYUIOPLKJHGFDSAZXCVBNM-'àèéìòù

The following lines each contain a word followed by its synonyms. Fields are separated by a semicolon (';') or by the character specified with the -sep command-line option. For example:

   poeta;araldo; verseggiatore
   araldo; verseggiatore; poeta

The complete input file for this tiny sample thesaurus is then:

   qwertyuioplkjhgfdsazxcvbnmQWERTYUIOPLKJHGFDSAZXCVBNM-'àèéìòù
   poeta;araldo; verseggiatore
   araldo; verseggiatore; poeta
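
Before compiling, it can be useful to check that every word uses only the symbols declared on the first line. The following Python sketch does that. How thescoder itself treats other characters is not stated here, so this is only a pre-flight check under the assumption that such words deserve a second look; check_input is a hypothetical helper name.

```python
def check_input(text, sep=";"):
    """Return a list of (line_number, word, bad_chars) problems.

    Hypothetical pre-flight check, not part of thescoder.
    """
    lines = text.splitlines()
    if not lines:
        return []
    # The first line declares the accepted symbols.
    allowed = set(lines[0])
    problems = []
    for number, line in enumerate(lines[1:], start=2):
        for field in line.split(sep):
            word = field.strip()
            bad = sorted(set(word) - allowed)
            if bad:
                problems.append((number, word, "".join(bad)))
    return problems


sample = (
    "qwertyuioplkjhgfdsazxcvbnmQWERTYUIOPLKJHGFDSAZXCVBNM-'àèéìòù\n"
    "poeta;araldo; verseggiatore\n"
    "araldo; verseggiatore; poeta\n"
)
print(check_input(sample))
```

For the sample file the list of problems is empty. A word containing, say, a digit would be reported together with its line number.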

EXAMPLE

Here is a small tutorial on how to install a new thesaurus in OpenOffice.org. You must have an OpenOffice.org version between 1.0.2 and 2.0.

1
Create a text file as explained in the "INPUT FORMAT" section.
2
Run the command:
   thescoder thesaurus.txt th_it_IT
This creates the two files th_it_IT.idx and th_it_IT.dat. If you used a separator other than the semicolon, pass it with the -sep option. For example:
   thescoder -sep "," thesaurus.txt th_it_IT
3
Close all running OpenOffice.org applications, including the Quickstarter applet if you use it, so that the next time OpenOffice.org starts it reloads all its data, including the new thesaurus.
4
Copy the two .idx and .dat files into the directory where OpenOffice.org keeps its thesaurus data (on Debian it is /usr/share/myspell/dicts/).
5
Edit the file dictionary.lst and add the line:
   THES it IT th_it_IT
6
Open an OpenOffice.org application (Writer, for example).
7
Open the menu Tools/Options/Language Settings/Linguistics. You will be able to select your thesaurus in the "Available linguistic modules" section for the language of your thesaurus.
8
The thesaurus is now ready to use: just put the cursor on a word and press Ctrl+F7 to see its synonyms.
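
Steps 2 and 5 above lend themselves to scripting. The Python sketch below builds the thescoder command line and adds the THES line to dictionary.lst only if it is not already there. The helper names thescoder_command and ensure_thes_entry are invented for this example, and the path to dictionary.lst is left to the caller, since its location is not stated above.

```python
from pathlib import Path


def thescoder_command(inputfile, prefix, sep=None):
    """Build the argument list for step 2; -sep is added only when given."""
    args = ["thescoder"]
    if sep is not None:
        args += ["-sep", sep]
    return args + [inputfile, prefix]


def ensure_thes_entry(dictionary_lst, lang, country, basename):
    """Append 'THES lang country basename' to dictionary.lst unless present.

    Returns True if the line was added, False if it was already there.
    """
    path = Path(dictionary_lst)
    entry = f"THES {lang} {country} {basename}"
    text = path.read_text() if path.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Keep the existing content intact and end it with a newline first.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + entry + "\n")
    return True


print(thescoder_command("thesaurus.txt", "th_it_IT", sep=","))
```

The returned list can be passed to subprocess.run(..., check=True) on a machine where thescoder is installed. Calling ensure_thes_entry twice with the same arguments leaves a single THES line in the file.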

AUTHOR

thescoder has been written by Giuseppe Modugno (gppe.modugno@libero.it).

This manpage has been written by Enrico Zini (enrico@debian.org), translating parts of the original Italian readme file.