NAME

prcs-synch - synchronize PRCS projects between repositories

SYNOPSIS

Synchronize between old and new repositories, both on local disk:

    prcs-synch --proj=myproject --remote-repo=/somewhere/old \
        --local-repo=/somewhere/new

Synchronize over ssh:

    prcs-synch --proj=myproj --ssh=alterego@nowhere.net

OPTIONS

Standard GNU-style options are used.

--proj=name (mandatory)
PRCS project name.
--tmpdir=directory (optional)
Scratch area; it should have ample space relative to the size of the project. Defaults to /tmp.
--debug=level (optional)
Debug level, higher means more. Default 1.
--local-repo=directory (optional)
Local repository: this is where updates will be placed. If not specified, the environment variable PRCS_REPOSITORY is tried, then the PRCS internal configuration.
--remote-repo=directory (sometimes optional)
Remote repository, i.e. where the original versions are to be synched from. This is mandatory unless --ssh is specified, in which case the same heuristics are tried as for the local repository. (If you are accustomed to $PRCS_REPOSITORY being set for your personal environment, try ~/.ssh/environment on the remote host.)
--dir={ up | down | both } (optional)
Controls direction of synchronization. By default, both is assumed: you want to synchronize changes in both directions. up means send changes to remote from local only; down means get changes from remote only.
--ssh= [ user@ ] host (optional)
Access the remote repository using ssh (Secure SHell). This must be installed as ssh (and of course the remote host must be running the server). Give it a hostname or user@host syntax. You may do synchronization in any direction, but if going upstream, you need to have this script installed and in your default path on the remote machine as well. ssh may prompt you for passwords or passphrases multiple times; to avoid this annoyance, you should set up ssh properly so you have an authenticated identity, and use ssh-agent to make the authentication transparent. If in doubt as to whether this is working, you should be able to do this and have it not prompt at all:
    ssh -l user host echo looks good
For a quick test, this will probably work:
    ssh-agent sh -c 'ssh-add; prcs-synch ....'
--ssh-opts=option list (optional)
Pass extra options to ssh. Normally none should be needed.
--remote-tmpdir=directory (optional)
Like --tmpdir, but on the remote host if using ssh.
--help
Display this documentation.

DESCRIPTION

prcs-synch tries to synchronize two PRCS repositories (actually, just a single project at a time). It requires a remote repository, which is assumed to have recent changes, vs. a local repository which is out of date (it has not yet seen these new versions). All versions present in both repositories must match up in all relevant details. prcs-synch will try to maintain this synchronization. You could subsequently reverse positions and synchronize local versions into the remote repository, too. If you are doing this kind of bidirectional stuff, it is your responsibility to ensure that you never check in different versions with the same name into the two repositories; prcs-synch currently may not notice the discrepancy and may fail unpredictably. So, e.g., use one repository for most stuff, but reserve a specially-named branch or two on which all local checkins will be made; this is the simplest way to ensure that you do not accidentally overlap.

Please use this only on projects already existing in both repositories. If you need to create one or another from scratch, this is easy enough to do with prcs package and prcs unpackage (not to mention faster, safer, and more preserving of metainformation).

PREREQUISITES

• A reasonably recent PRCS.
• Perl 5.004.
• The SysV utility tsort in your path (most systems have this somewhere, let me know if this is a problem; check /usr/ccs/bin/ in Solaris?).
• Ssh if you are doing remote synchs, with an ssh server on the remote host.
• A copy of this script on the remote host to do upstream synchs.

IMPLEMENTATION

CAVEAT USER: Implementation relies on undocumented and possibly unstable aspects of PRCS. The heart is the update function, which actually does the synchronization of remote versions into the local repository.

The basic algorithm is as follows (this may be incomprehensible):

For each new version to be added, go through each of its parents in turn. Check out the project files for both the local and remote varieties of that parent. In each of these pairs, go through each file by name, collecting its file family; ignore revision number correspondences for now on the assumption that they will be the same (they should be). The list of files should match up one-to-one, else error. For each filename, create a correspondence between the local and remote internal file families. (This correspondence should be retained across versions, in fact, as an extra sanity check.)

Now check out the version's remote project file. Edit it as follows. First, swap New-Foo with Foo (just edit the symbols!); this will have the effect of creating the correct New-Foo, while blanking out the Foo (which has no effect anyway). Second, substitute all file families with the corresponding local versions. If there is no corresponding local version, ipso facto this is a new file (it did not appear in any of its parents), so we blank out the internal descriptor to indicate this. (If it was deleted, then we do not care.) Third, change the project version to one of the parents, say the first one.

Fourth (the nasty part): for each file descriptor, check to see if the version number (v.n.) in the new remote is the same as in one of the old remotes. If so, then copy in the v.n. from the corresponding old local. If not, perform RCS subtraction and look for the result in the old remotes. If found, use the corresponding old local. If not, signal an error. (RCS subtraction: if ending in ....1.n+1, go to ....1.n; if ....n.1, go to ....; if just 1.1, then there should have been no prior version, so check that.)

Fifth: make a note of the original user & time of checkin.

Now you can check in. Phew.
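
The "RCS subtraction" used in the fourth step can be sketched in a few lines of shell. This is only an illustration of the rule as stated above, not the script's actual code (which is Perl); the function name rcs_predecessor is made up for this example. Trunk numbers such as 2.1 are not covered by the rule in the text, so the sketch simply reports them as an error.

```shell
#!/bin/sh
# Sketch of "RCS subtraction": print the RCS predecessor of a version number.
# rcs_predecessor is a hypothetical name used for illustration only.
rcs_predecessor() {
    vn=$1
    if [ "$vn" = "1.1" ]; then
        return 1                     # 1.1 has no predecessor
    fi
    last=${vn##*.}                   # final numeric component
    prefix=${vn%.*}                  # everything before the final component
    if [ "$last" -gt 1 ]; then
        # xxx.n+1 -> xxx.n
        printf '%s\n' "$prefix.$((last - 1))"
    else
        case $prefix in
            *.*) printf '%s\n' "${prefix%.*}" ;;  # xxx.n.1 -> xxx
            *)   return 2 ;;         # e.g. 2.1: assumption, not covered by the rule
        esac
    fi
}

rcs_predecessor 1.5          # 1.5 is 1.n+1 with n=4
rcs_predecessor 1.2.1.1      # first revision on a branch off 1.2
rcs_predecessor 1.2.1.3      # later revision on the same branch
rcs_predecessor 1.1 || echo "no predecessor"
```

A nonzero exit status stands in for the "no prior version" case, so a caller can test for it directly instead of parsing output.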

INTERNAL FUNCTIONS

Too many weird little functions in here.

Returns the text of the project file for a given version. Cached.

get_mapping: Given a project name and the text of some version of the project file, returns a mapping for the file descriptors. The mapping is a hash ref, from file names (external) to hashrefs of: the total file descriptor string as it appears (total); the file family (ff); and the RCS version number (vn). Cached.

remap_by_ff: Take a mapping as from get_mapping and rekey it by file family. Same as before but now ff is replaced by name.

Looks for the correspondence between file families between the two repositories. The indicated version only is checked for this call. %ff_mapping will be a map from project name, to remote family, to local family. If there is ever a mismatch (between versions) an error will be raised.

Find the RCS predecessor to this RCS number. If it is of the form xxx.n+1, then we get xxx.n. If of the form xxx.n.1, then we get xxx. If it is 1.1, then it has no predecessor so undef is returned.

Find the local RCS version number which presumably corresponds to the ancestor of the given remote one. Pass in the observed remote version number; and for both the local and remote repositories, the observed file family (may be undef for local), and lists consisting of total/vn/name hashes (as from remap_by_ff, but for the correct file family only; undef if not present), one for each parent version, in corresponding order. The presumed local version number will be searched for and returned; if the file is observed to be fresh, undef will be returned. This function is where all the nasty logic really lives.

Update the specified PRCS project according to the contents of a master repository. Only the specified list of versions will be updated. Each version specifier should be of the form [$version, @parents], i.e. a list reference giving first the version to update from the package file, then its parents (there may be multiple in the case of a merge). The function will determine which order to do the updates in. Expects /tmp to be available and have sufficient space for scratch space; also expects PRCS to be installed as prcs, and a topological sort as tsort.

parse_info_short: Given an input filehandle, returns a reflist to all versions present in that prcs info listing. Project name should be specified.

Same as parse_info_short, but input assumed to be from prcs info -l, and result versions are update-ready lists w/ parent information. Currently not that flexible w.r.t. PRCS output format: assumes that the parent versions immediately follow main line, & there is at least one extra line after that (e.g. Version-Log).

Update from one repository to another based on the versions currently present in each. @from_repo must be long format; @to_repo may be either long or short.
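
How tsort can yield a usable update order may be easier to see with a small example. The sketch below assumes the parent information has been written out as "parent child" pairs, one per line; the version names are invented, and the way the real script feeds tsort is not shown in this document.

```shell
#!/bin/sh
# Sketch: order versions so that every parent is updated before its children.
# The version graph is hypothetical: 0.3 is a merge of 0.2 and branch.1.
order=$(printf '%s\n' \
    '0.1 0.2' \
    '0.1 branch.1' \
    '0.2 0.3' \
    'branch.1 0.3' | tsort)

# One version per line; 0.1 comes first and 0.3 last. The relative order
# of 0.2 and branch.1 is left to tsort, since neither depends on the other.
printf '%s\n' "$order"
```

Any order that tsort prints is safe to process from top to bottom, because a version never appears before one of its parents.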

BUGS

It is impossible to make the newly created versions have the same login and checkin time as the original. But this information is recorded at the beginning of the version log.

Something about package upload seems to be capable of killing Linux PPP links completely! Please let me know if you have problems with a transmission stopping partway through (no harm should come of it, as the unpackaging should fail noisily). This may be a network-code bug, or an ssh bug.

Deletions

Version deletions will screw things up somewhat; if this causes problems, the synch will be stopped and you will be asked to rerun it. NOTE that this only works if you have deleted a version (due to a mistake) before checking in any versions with that as its parent; trying to synch from a repository which contains versions derived from now-deleted versions will cause a failure!

If you really need to do synching on projects containing internal deletions (those with nondeleted children), you have two options (both untested). If you know you want to do the deletion ahead of time, synch up the projects first, then do matching deletions in each repository. If you have already done the internal deletions and want to continue with (or begin) synchronization, you will have to manually synch up those versions with immediate deleted ancestors, in which case you are on your own (though it would probably not be that difficult if you read the implementation notes to this script).

TODO

Should chdir to some directory with no interesting subdirectories. Otherwise some PRCS commands will be interpreted as referring to a subdirectory rather than a project name, which produces mysterious errors.

Should check when updating a new version to see if the version log already contains an [Originally checked in by...], and just keep that one instead of adding a new one.

Upon dying, it should list all of the scratch directories it was using, for debugging purposes.

Efficiency

In the case of remote synchs, the entire project package is transmitted, which is surely wasteful when only a small portion of that is actually used; but this keeps the code much simpler and hopefully stabler than it would otherwise be. Conceivably it would be possible to remotely check out the required versions into a directory tree and .tar.gz the whole mess, relying on gzip to notice the redundancy; this would still not reduce waste in the case of huge projects only a few files of which are changing each time.

Ssh is run a number of times within a remote session, at the cost of some connection overhead.

Ssh compression and project-file compression are always turned on where they would reduce bandwidth requirements. This probably causes a little unnecessary overhead when running over a fast local network.

--revision option

Should permit you to synch up only certain revisions or branches. This could be somewhat tricky, though, as it would need to ensure that the reflexive transitive closure (R.T.C.) of the versions requested fell within those versions plus the versions already existing in the destination repository: i.e. that there are no gaps.

--all-proj option

Would synch on all projects present in both repositories, and skip some overhead; substitute for --proj option.

Robustness

Examine common versions quickly to ensure critical parts match up. Currently, it is expected that the user is being careful to keep branches separate. However, mistakes on this point might cause other sorts of errors in which it is less apparent what went wrong.

The script cannot and does not lock the repository between checkins. In principle this would not break anything, providing you don't do anything dumb like delete project versions while it is running!

Testing

Needs to be tested in many more & more obscure circumstances than it has. (That said, I have been using it since late April without problems.)

AUTHOR

Jesse Glick <jglick@sig.bsh.com>. Please send comments & bug reports.

REVISION

This is alpha-level software, use at your own risk.

$ProjectHeader: prcs 1.3-release.1 Sun, 28 Oct 2001 18:18:09 -0800 jmacd $

Copyright (C) 1998 Strategic Interactive Group, all rights reserved. This software may be used under the terms of the GNU General Public License. There is no warranty of any kind whatsoever.