man zlibc (Library functions) - transparently access compressed files.

Name

zlibc - transparently access compressed files.

Introduction

The zlibc package allows transparent on-the-fly uncompression of gzipped files. Your programs will be able to access any compressed file just as if it were uncompressed. Zlibc transparently uncompresses the data from these files as soon as they are read, just as a compressed filesystem would do. No kernel patch, no recompilation of your executables and no recompilation of the libraries is needed.

It is not (yet) possible to execute compressed files with zlibc. However, there is another package, called tcx, which is able to uncompress executables on the fly. On the other hand, tcx isn't able to uncompress data files on the fly. Fortunately, zlibc and tcx may coexist on the same machine without problems.

Warning

This manpage has been automatically generated from zlibc's texinfo documentation. However, this process is only approximate, and some items, such as cross-references, footnotes and indices, are lost in translation. Indeed, these items have no appropriate representation in the manpage format. Thus I strongly advise you to use the original texinfo doc.

*
To generate a printable copy from the texinfo doc, run the following commands:
    ./configure; make dvi; dvips zlibc.dvi
*
To generate an HTML copy, run:
    ./configure; make html
A premade HTML copy can be found at: http://www.tux.org/pub/knaff/zlibc/zlibc.html
*
To generate an info copy (browsable using emacs' info mode), run:
    ./configure; make info

The texinfo doc looks best when printed or viewed as HTML. In the info version, certain examples are difficult to read because of the quoting conventions used in info.

Where to get zlibc

Zlibc can be found at the following places (and their mirrors):

ftp://zlibc.linux.lu/zlibc-0.9j.tar.gz
ftp://www.tux.org/pub/knaff/zlibc/zlibc-0.9j.tar.gz
ftp://ibiblio.unc.edu/pub/Linux/libc/zlibc-0.9j.tar.gz

Before reporting a bug, make sure that it has not already been fixed in the alpha patches, which can be found at:

http://zlibc.linux.lu/
http://www.tux.org/pub/knaff/zlibc

These patches are named zlibc-version-ddmm.taz, where version stands for the base version, dd for the day and mm for the month. Due to a lack of space, I usually leave only the most recent patch.

There is a zlibc mailing list at zlibc @ www.tux.org . Please send all bug reports to this list. You may subscribe to the list by sending a message with 'subscribe zlibc @ www.tux.org' in its body to majordomo @ www.tux.org . (N.B. Please remove the spaces around the "@" both times. I left them there in order to fool spambots.) Announcements of new zlibc versions will also be sent to the list, in addition to the Linux announce newsgroups. The mailing list is archived at http://www.tux.org/hypermail/zlibc/latest

Installing zlibc

1.
If you install zlibc on Linux, make sure that your shared loader (ld-linux.so.1/ld.so) understands LD_PRELOAD. (Preferably ld.so-1.8.5 or more recent.)
2.
Type ./configure. This runs the GNU autoconf script, which configures the Makefile and the config.h file. You may pass compile-time configuration options to ./configure; see section Compile-time configuration via GNU autoconf for details.
3.
Type make to compile zlibc.
4.
Type make install to install zlibc and its associated programs to their final target.
5.
To use this module, set the environment variable LD_PRELOAD to point to the object. Example (sh syntax):
      LD_PRELOAD=/usr/local/lib/uncompress.o
      export LD_PRELOAD
or (csh syntax):
      setenv LD_PRELOAD /usr/local/lib/uncompress.o
On Linux, use /lib/uncompress.o instead of /usr/local/lib/uncompress.o .
You might want to put these lines in your .profile or .cshrc in order to have the uncompressing functions available all the time.
6.
Compress your files using gzip and enjoy.
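
As a cautious variant of step 5, the sketch below sets LD_PRELOAD only if the object really exists in one of the two places mentioned above, so that a wrong path is not exported by accident. The function name zlibc_find_object is hypothetical and not part of zlibc.

```shell
# Hypothetical helper (not part of zlibc): print the first readable
# candidate path, or fail if none of the candidates exists.
zlibc_find_object() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Enable zlibc only if the object is found in one of the documented places.
if obj=$(zlibc_find_object /lib/uncompress.o /usr/local/lib/uncompress.o); then
    LD_PRELOAD=$obj
    export LD_PRELOAD
fi
```

Placed in .profile, these lines leave the environment untouched on machines where zlibc is not installed.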

For security reasons, the dynamic loader disregards environmental variables such as LD_PRELOAD when executing set-uid programs.

However, on Linux, you can use zlibc with set-uid programs too, by using one of the two methods described below:

1.
You may put the path to uncompress.o into /etc/ld.so.preload instead of using LD_PRELOAD.
WARNING: If you use /etc/ld.so.preload, be sure to install uncompress.o on your root filesystem, for instance in /lib, as is done by the default configuration. Using a directory which is not available at boot time, such as /usr/local/lib, will cause trouble at the next reboot!
It is also wise to remove zlibc from /etc/ld.so.preload when installing a new version. First test the new version using LD_PRELOAD, and only if everything is OK, put it back into /etc/ld.so.preload. The zlibc package also supplies four statically linked programs, srm, smv, sln and ssln, which are equivalent to rm, mv, ln and ln -s. These can be used in case anything goes wrong with the installation.
2.
If you have a version of ld.so which is more recent than 1.9.0, you can set LD_PRELOAD to just contain the basename of uncompress.o without the directory. In that case, the file is found as long as it is in the shared library path (which usually contains /lib and /usr/lib). Because the search is restricted to the library search path, this also works for set-uid programs.
Example (sh syntax):
      LD_PRELOAD=uncompress.o
      export LD_PRELOAD
or (csh syntax):
      setenv LD_PRELOAD uncompress.o
The advantage of this approach over ld.so.preload is that zlibc can more easily be switched off in case something goes wrong.

Using zlibc

Once zlibc is installed, simply compress your biggest datafiles using gzip. Your programs are now able to uncompress these files on the fly whenever they need them.

Zlibc and links

Symbolic links

After compressing your datafiles, you also need to change any symbolic links pointing to them. Let's suppose that x is a symlink to tstfil:

> echo 'this is a test' >tstfil
> ln -s tstfil x
> ls -l
total 1
-rw-r--r--   1 alknaff  sirac          15 Feb 25 19:40 tstfil
lrwxrwxrwx   1 alknaff  sirac           8 Feb 25 19:40 x -> tstfil

After compressing it, you'll see the following listing:

> gzip tstfil
> ls -l
total 1
pr--r--r--   1 alknaff  sirac          15 Feb 25 19:40 tstfil
lrwxrwxrwx   1 alknaff  sirac           8 Feb 25 19:40 x -> tstfil

tstfil is now shown as a pipe by zlibc in order to warn programs that they cannot seek in it. Zlibc still shows it with its old name, and you can directly look at its contents:

> cat tstfil
this is a test

However, tstfil is not yet accessible using the symbolic link:

> cat x
cat: x: No such file or directory

In order to make tstfil accessible using the link, you have to destroy the link and remake it:

> rm x
/bin/rm: remove `x'? y
> ln -s tstfil x
> ls -l
total 1
pr--r--r--   1 alknaff  sirac          15 Feb 25 19:40 tstfil
lrwxrwxrwx   1 alknaff  sirac           8 Feb 25 19:44 x -> tstfil
> cat x
this is a test
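
Before remaking links by hand, it can help to find every symbolic link that is affected. The sketch below is meant to be run without zlibc preloaded, so that the shell sees the real files: it lists links whose target has disappeared but for which a file with the same name plus .gz exists. The function name list_stale_links is hypothetical, and the sketch assumes that a readlink utility is available and that file names contain no newlines.

```shell
# Hypothetical helper: list symlinks below a directory whose target is
# gone but whose target name plus .gz exists.
list_stale_links() {
    dir=${1:-.}
    find "$dir" -type l | while IFS= read -r link; do
        # [ -e ] follows the link, so it fails once the target is gone
        if [ -e "$link" ]; then
            continue
        fi
        target=$(readlink "$link") || continue
        case $target in
            /*) ;;                                   # absolute target
            *)  target=$(dirname "$link")/$target ;; # relative to the link
        esac
        if [ -f "$target.gz" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

In the example above, running list_stale_links . right after gzip tstfil would print ./x.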

Hard links

If a datafile has hard links pointing to it, gzip refuses to compress it.

> echo 'this is a test' >tstfil
> ln tstfil x
> ls -li
total 2
    166 -rw-r--r--   2 alknaff  sirac          15 Feb 25 19:46 tstfil
    166 -rw-r--r--   2 alknaff  sirac          15 Feb 25 19:46 x
> gzip tstfil
gzip: tstfil has 1 other link  -- unchanged

Thus you need to remove these hard links first, and remake them after compressing the file.

> rm x
/bin/rm: remove `x'? y
> gzip tstfil
> ln tstfil x
> ls -li
total 2
    167 pr--r--r--   2 alknaff  sirac          15 Feb 25 19:46 tstfil
    167 pr--r--r--   2 alknaff  sirac          15 Feb 25 19:46 x
> cat x
this is a test
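
To find out which names have to be removed and remade, one can list the other hard links of a file before compressing it. The sketch below compares inode numbers within one directory tree. The function name list_other_links is hypothetical, and the search only covers the given directory on a single filesystem.

```shell
# Hypothetical helper: list the other names (hard links) of a file.
# The file must be written the way find prints it, that is, starting
# with the directory argument (for example ./tstfil together with .).
list_other_links() {
    file=$1
    dir=${2:-.}
    # The first field of "ls -di" is the inode number
    inode=$(ls -di "$file" | awk '{print $1}')
    # -xdev keeps find on one filesystem; inode numbers are only unique there
    find "$dir" -xdev -inum "$inode" | while IFS= read -r path; do
        if [ "$path" != "$file" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

Each name printed this way can then be removed, and remade once the file has been compressed.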

How it works

Usually, programs don't make system calls directly, but instead call a library function which performs the actual system calls. For instance, to open a file, the program first calls the open library function, and then this function makes the actual syscall. Zlibc overrides the open function and other related functions in order to do the uncompression on the fly.

If the open system call fails because the file doesn't exist, zlibc constructs the filename of a compressed file by appending .gz to the filename supplied by the user program. If this compressed file exists, it is opened and piped through gunzip, and the descriptor of the read end of this pipe is returned to the caller.
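
The lookup order can be imitated in a few lines of shell. This is only an illustration of the fallback rule, not zlibc's actual code, which works inside the overridden library functions. The function name read_virtual is hypothetical.

```shell
# Hypothetical illustration of the fallback: use the file itself if it
# exists, otherwise uncompress FILE.gz, otherwise report an error.
read_virtual() {
    file=$1
    ext=.gz                       # extension appended to find the real file
    if [ -e "$file" ]; then
        cat < "$file"             # the file exists: nothing to uncompress
    elif [ -e "$file$ext" ]; then
        gzip -dc < "$file$ext"    # uncompressor reads stdin, writes stdout
    else
        echo "$file: No such file or directory" >&2
        return 1
    fi
}
```

With only tstfil.gz on disk, read_virtual tstfil prints the uncompressed contents, much as cat tstfil does when zlibc is active.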

In some cases, the compressed file is first uncompressed into a temporary file, and a read descriptor for this file is passed to the caller. This is necessary if the caller wants to call lseek on the file or mmap it. A description of the data files for which using a temporary file is necessary can be given in the configuration files /usr/local/etc/zlibc.conf (/etc/zlibc.conf on Linux) and ~/.zlibrc. See section Configuration files, for a detailed description of their syntax.

Many user programs try to check the existence of a given file by other system calls before actually opening it. That's why zlibc also overrides these system calls. If for example the user program tries to stat a file, this call is also intercepted.

The compressed file, which exists physically on the disk, is also called 'the real file', and the uncompressed file, whose existence is only simulated by zlibc, is called 'the virtual file'.

Customization

The behavior of zlibc can be tailored using configuration files or environment variables. This customization should normally not be needed, as the compiled-in defaults are already pretty complete.

Environmental variables

Environmental variables come in two kinds: switch variables have a boolean value and can only be turned on or off, whereas string variables can have arbitrary strings as values.

Switch variables

These variables represent a flag which can be turned on or off. If their value is on or 1 they are turned on; if their value is off or 0 they are turned off. All other values are ignored. If the same flag can be turned on or off using config files, the environmental variable always has priority.

LD_ZLIB_VERBOSE
If this variable is turned on, informational messages are printed on many operations of zlibc. Moreover, error messages are printed in order to point out errors in the configuration files, if any. If this variable is turned off, errors are silently ignored.
LD_ZLIB_UNLINK
If this variable is turned on, and if the user program tries to unlink a virtual (uncompressed) file, zlibc translates this call into unlinking the real file. If this variable is turned off, unlink calls on virtual files are ignored.
LD_ZLIB_DISABLE
If this variable is turned on, zlibc is switched off.
LD_ZLIB_READDIR_COMPR
If this variable is turned on, the readdir function shows the real (compressed) files instead of the virtual (uncompressed) files.
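
The on/off rule for switch variables can be written down as a small shell function. This is a sketch of the rule as described in this section, not code taken from zlibc. The function name switch_value is hypothetical, and its second argument stands for whatever setting applies when the variable is ignored.

```shell
# Hypothetical helper: interpret a switch variable.
#   $1 = value of the environmental variable (may be empty)
#   $2 = setting to keep when the value is ignored
switch_value() {
    case $1 in
        on|1)  echo on ;;
        off|0) echo off ;;
        *)     echo "$2" ;;   # all other values are ignored
    esac
}

# Example: a misspelled value leaves the previous setting unchanged.
switch_value yes off
```

The last line prints off, because yes is not one of the four recognized values.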

String variables

These variables have a string value, which represents a file, a directory or a command.

LD_ZLIB_TMP
This is the name of the directory where the temporary uncompressed files are put. The default is /tmp.
LD_ZLIB_EXT
This is the extension which is appended to a virtual file name in order to obtain the real (compressed) file name. The default is .gz.
LD_ZLIB_UNCOMPRESSOR
This is the name of the program to be invoked to uncompress the data. The default is gzip -dc.
LD_ZLIB_CONFFILE
This is the name of an additional configuration file. If this variable is defined and if the corresponding file exists, the configuration described in this file overrides the configurations in ~/.zlibrc and in /usr/local/etc/zlibc.conf (/etc/zlibc.conf on Linux).
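
For example, the following sh lines choose the directory for temporary files and turn on verbose messages for the current session. The directory used here is an arbitrary choice made for illustration; the variable names are the ones documented in this section.

```shell
# Arbitrary example: reuse $TMPDIR if it is set, otherwise fall back to /tmp
LD_ZLIB_TMP=${TMPDIR:-/tmp}
# Switch variable: print informational and error messages
LD_ZLIB_VERBOSE=on
export LD_ZLIB_TMP LD_ZLIB_VERBOSE
```

These settings only have an effect in programs started with zlibc preloaded.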

Compiled-in defaults

It is possible to operate zlibc entirely without configuration files. In this case, it uses the compiled-in defaults. These are generated at compile-time from the zlibrc.sample file. This file has the same syntax as the configuration files described above (see section Configuration files). If you want to change the compiled-in defaults of zlibc, edit that file, and remake.

Compile-time configuration via GNU autoconf

Before it can be compiled, zlibc must be configured using the GNU autoconf script ./configure. In most circumstances, running ./configure without any parameters is enough. However, you may customize zlibc using various options to ./configure. The following options are supported:

--prefix directory
Prefix used for any directories used by zlibc. By default, this is /usr/local. Zlibc is installed in $prefix/lib and looks for its system-wide configuration file in $prefix/etc. Man pages are installed in $prefix/man, info pages in $prefix/info, etc. On Linux, if you use zlibc via /etc/ld.so.preload, you should use / as the prefix instead of the default /usr/local.
--sysconfdir directory
Directory containing the system-wide configuration file zlibc.conf. By default, this is derived from prefix (see above).
--disable-runtime-conf
Disables run time configuration via environmental variables and via the configuration files. This may be needed in hyper secure environments.
--disable-env-conf
Disables run time configuration via environmental variables.
--disable-have-proc
Tells zlibc not to use the /proc filesystem to find out the commandline of the programs for which it runs, even if a working /proc is detected.
--enable-have-proc
Tells zlibc to use the /proc filesystem to find out the commandline of the programs for which it runs, even if no working /proc is detected.
--with-compr-ext=extension
Uses extension as the filename extension of compressed files. The default is .gz.
--with-extlen=length
Allows compressed filename extensions of at most length characters to be set via runtime configuration. The default is 5.
--with-tmpdir=directory
Uses directory to store the uncompressed files. The default is /tmp.
--with-uncompressor=uncompressor-command-line
Defines how the program for uncompressing files should be invoked. This command should read the compressed file from stdin, and output the uncompressed data to stdout. The default is gzip -dc.

In addition to the above-listed options, the standard GNU autoconf options apply. Type ./configure --help to get a complete list of these.

See also