man Locale::Maketext::Fuzzy () - Maketext from already interpolated strings

NAME

Locale::Maketext::Fuzzy - Maketext from already interpolated strings

VERSION

This document describes version 0.02 of Locale::Maketext::Fuzzy.

SYNOPSIS

    package MyApp::L10N;
    use base 'Locale::Maketext::Fuzzy'; # instead of Locale::Maketext

    package MyApp::L10N::de;
    use base 'MyApp::L10N';
    our %Lexicon = (
        # Exact match should always be preferred if possible
        "0 camels were released."
            => "Exact match",

        # Fuzzy match candidate
        "[quant,_1,camel was,camels were] released."
            => "[quant,_1,Kamel wurde,Kamele wurden] freigegeben.",

        # This could also match fuzzily, but is less preferred
        "[_2] released[_1]"
            => "[_1][_2] ist frei[_1]",
    );

    package main;
    my $lh = MyApp::L10N->get_handle('de');

    # All ->maketext calls below will become ->maketext_fuzzy instead
    $lh->override_maketext(1);

    # This prints "Exact match"
    print $lh->maketext('0 camels were released.');

    # "1 Kamel wurde freigegeben." -- quant() gets 1
    print $lh->maketext('1 camel was released.');

    # "2 Kamele wurden freigegeben." -- quant() gets 2
    print $lh->maketext('2 camels were released.');

    # "3 Kamele wurden freigegeben." -- parameters are ignored
    print $lh->maketext('3 released.');

    # "4 Kamele wurden freigegeben." -- normal usage
    print $lh->maketext('[*,_1,camel was,camels were] released.', 4);

    # "!Perl ist frei!" -- matches the broader one
    # Note that the sequence ([_2] before [_1]) is preserved
    print $lh->maketext('Perl released!');

DESCRIPTION

This module is a subclass of Locale::Maketext, with additional support for localizing messages that already contain interpolated variables. This is most useful when the messages are returned by external modules; for example, to match "dir: command not found" against "[_1]: command not found".

Of course, this module is also useful if you're simply too lazy to use the

    $lh->maketext("[quant,_1,file,files] deleted.", $count);

syntax, but wish to write

    $lh->maketext_fuzzy("$count files deleted");

instead, and have the correct plural form figured out automatically.

If maketext_fuzzy seems too long to type, this module also provides an override_maketext method that turns all maketext calls into maketext_fuzzy calls.
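
For comparison, the explicit quant syntax mentioned above behaves as follows under plain Locale::Maketext, which ships with Perl. This is only a sketch: the Demo::I18N packages are hypothetical, and the _AUTO lexicon entry is used so that each key acts as its own translation.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Hypothetical application classes, declared inline for the example.
package Demo::I18N;
our @ISA = ('Locale::Maketext');

package Demo::I18N::en;
our @ISA = ('Demo::I18N');
# _AUTO makes every unknown key act as its own translation.
our %Lexicon = ( _AUTO => 1 );

package main;

my $plain = Demo::I18N->get_handle('en') or die "no language handle";

# The count is passed separately, and quant() picks the plural form.
my $one  = $plain->maketext('[quant,_1,file,files] deleted.', 1);
my $many = $plain->maketext('[quant,_1,file,files] deleted.', 3);

print "$one\n";     # 1 file deleted.
print "$many\n";    # 3 files deleted.
```

With maketext_fuzzy, the caller would instead pass the already interpolated sentence and let the lexicon key be found by matching.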

METHODS

$lh->maketext_fuzzy(key[, parameters...]);

This method takes exactly the same arguments as the maketext method of Locale::Maketext.

If key is found in the lexicons, it is applied in the same way as maketext. Otherwise, the method looks at all lexicon entries that could possibly yield key, by turning [...] sequences into (.*?) and matching the resulting regular expression against key.

Once it finds all candidate entries, the longest one replaces the key for the real maketext call. Variables matched by its bracket sequences ($1, $2...) are placed before parameters; the order of variables in the matched entry is correctly preserved.

For example, if the matched entry in %Lexicon is "Test [_1]", this call:

    $lh->maketext_fuzzy("Test string", "param");

is equivalent to this:

    $lh->maketext("Test [_1]", "string", "param");

However, most of the time you won't need to supply parameters to a maketext_fuzzy call, since all parameters are already interpolated into the string.
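
The matching steps described above can be sketched in plain Perl. This is a simplified illustration, not the module's actual code: the helper names key_to_regex and fuzzy_lookup are hypothetical, bracket contents are ignored (so parameter reordering and ~ escapes are not handled), and the alphabetical tie-break between equally long keys is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every [...] sequence in a lexicon key
# with a non-greedy capture group and anchor the result.
sub key_to_regex {
    my ($key) = @_;
    my @literals = split /\[[^\]]*\]/, $key, -1;    # text between brackets
    my $pattern  = join '(.*?)', map { quotemeta($_) } @literals;
    return qr/\A$pattern\z/s;
}

# Hypothetical helper: return the key to hand to maketext, followed by
# the variables captured from the phrase (none for an exact match).
sub fuzzy_lookup {
    my ($lexicon, $phrase) = @_;
    return ($phrase) if exists $lexicon->{$phrase};    # exact match wins

    my %candidate;
    for my $key (grep { /\[/ } keys %$lexicon) {
        my $re   = key_to_regex($key);
        my @vars = $phrase =~ $re;
        $candidate{$key} = \@vars if @vars;
    }
    return ($phrase) unless %candidate;                # nothing matched

    # "Longer is better"; ties broken alphabetically (an assumption).
    my ($best) = sort { length($b) <=> length($a) or $a cmp $b }
                 keys %candidate;
    return ($best, @{ $candidate{$best} });
}

my %lexicon = (
    'Test [_1]'                  => 'Probe [_1]',
    '[_1] released[_2]'          => '[_1] ist frei[_2]',
    '[_1] camels were released.' => '[_1] Kamele wurden freigegeben.',
);

# Captured variables come first, caller-supplied parameters after them.
my ($key, @vars) = fuzzy_lookup(\%lexicon, 'Test string');
my @args = ($key, @vars, 'param');
print join(' | ', @args), "\n";    # Test [_1] | string | param
```

The resulting list has the same shape as the arguments of the equivalent maketext call shown above: the matched key, then the extracted variables, then the original parameters.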

$lh->override_maketext([flag]);

If flag is true, this accessor method turns $lh->maketext into an alias for $lh->maketext_fuzzy, so all subsequent maketext calls in $lh's packages are automatically fuzzy. A false flag restores the original behaviour. If the flag is not specified, the method returns the current status of the override; the default is 0 (no overriding).

Note that this call only modifies the symbol table of the language class that $lh belongs to, so other languages are not affected. If you want to override all language handles in a certain application, try this:

    MyApp::L10N->override_maketext(1);
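
The scope rule above can be illustrated without the module itself. The sketch below uses hypothetical Demo::* packages and a hypothetical override_on method that only installs the alias (restoring and querying the flag are left out). It is not the module's code, just a demonstration of why an alias installed in one package's symbol table leaves sibling language classes alone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the fuzzy base class.
package Demo::Fuzzy;
sub new            { my ($class) = @_; return bless {}, $class }
sub maketext       { return 'plain' }
sub maketext_fuzzy { return 'fuzzy' }

# Simplified stand-in for override_maketext (install direction only):
# alias "maketext" to maketext_fuzzy in the invocant's own package.
sub override_on {
    my ($invocant) = @_;
    my $class = ref($invocant) || $invocant;
    no strict 'refs';
    *{"${class}::maketext"} = \&maketext_fuzzy;
    return 1;
}

# Hypothetical application base class and two language classes.
package Demo::L10N;
our @ISA = ('Demo::Fuzzy');

package Demo::L10N::de;
our @ISA = ('Demo::L10N');

package Demo::L10N::fr;
our @ISA = ('Demo::L10N');

package main;

my $de = Demo::L10N::de->new;
my $fr = Demo::L10N::fr->new;

$de->override_on;                   # touches only Demo::L10N::de
my $de_result = $de->maketext;      # resolved in Demo::L10N::de
my $fr_before = $fr->maketext;      # still resolved in Demo::Fuzzy

Demo::L10N->override_on;            # touches the shared application class
my $fr_after = $fr->maketext;       # now resolved in Demo::L10N

print "de: $de_result\n";           # de: fuzzy
print "fr before: $fr_before\n";    # fr before: plain
print "fr after: $fr_after\n";      # fr after: fuzzy
```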

CAVEATS

• The "longer is better" heuristic for determining the best match is reasonably good, but could certainly be improved.
• Currently, "[quant,_1,file] deleted" won't match "3 files deleted"; you'll have to write "[quant,_1,file,files] deleted" instead, or simply use "[_1] file deleted" as the lexicon key and put the correct plural form handling into the corresponding value.
• When used in combination with the Tie backend of Locale::Maketext::Lexicon, all keys are iterated over each time a fuzzy match is performed, which may cause a serious speed penalty. Patches welcome.

SEE ALSO

Locale::Maketext, Locale::Maketext::Lexicon

BACKGROUND

This particular module was written to facilitate an auto-extraction layer for Slashcode's Template Toolkit provider, based on HTML::Parser and Template::Parser. It would work like this:

    Input | <B>from the [% story.dept %] dept.</B>
    Output| <B>[%|loc( story.dept )%]from the [_1] dept.[%END%]</B>

Now, this layer suffers from the same linguistic problems as an ordinary Msgcat or Gettext framework does: what if we want to make ordinals from [% story.dept %] (i.e. "from the 3rd dept."), or expand the "dept." to "department" / "departments"?

The same problem occurred in RT's web interface, where it had to localize messages returned by external modules, which may already contain interpolated variables, e.g. "Successfully deleted 7 ticket(s) in 'c:\temp'.".

Since I didn't have the time to refactor DBI and DBI::SearchBuilder, I devised a loc_match method to pre-process their messages into one of the candidate strings, then applied the matched string to maketext.

Afterwards, I realized that instead of preparing a set of candidate strings, I could actually use the original lexicon file (i.e. PO files via Locale::Maketext::Lexicon) to match against. This is how Locale::Maketext::Fuzzy was born.

AUTHORS

Autrijus Tang <autrijus@autrijus.org>

COPYRIGHT

Copyright 2002 by Autrijus Tang <autrijus@autrijus.org>.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See <http://www.perl.com/perl/misc/Artistic.html>