Re: CVS migration help
Hi Michael,
I've checkout the trunk version (the version on ubuntu hardy heron is quite old : 2.0.1)
succeed in cvs2svn conversion,
unfortunately it crashes in the same way that bzr cvsps-import does :
thomas <at> home:~/temp/bzr$ cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | bzr fast-import -
bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 834, in run_bzr_catch_errors
return run_bzr(argv)
File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 790, in run_bzr
ret = run(*run_argv)
File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 492, in run_argv_aliases
return self.run(**all_cmd_args)
File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 199, in run
params, verbose)
File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 77, in _run
return proc.process(p.iter_commands)
File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 83, in process
self._process(command_iter)
File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 317, in _process
processor.ImportProcessor._process(self, command_iter)
File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 105, in _process
handler(self, cmd)
File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 486, in commit_handler
handler.process()
File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 164, in process
for fc in self.command.file_iter():
File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 312, in iter_file_commands
yield self._parse_file_modify(line[2:])
File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 365, in _parse_file_modify
path = self._path(params[2])
File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 493, in _path
return s.decode('utf_8')
File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
bzr 1.3.1 on python 2.5.2.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'fast-import', '-']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_US.UTF-8'
plugins:
bzrtools /usr/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.3.0]
cvsps_import /home/thomas/.bazaar/plugins/cvsps_import [unknown]
fastimport /home/thomas/.bazaar/plugins/fastimport [unknown]
launchpad /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
*** Bazaar has encountered an internal error.
Please report a bug at
https://bugs.launchpad.net/bzr/+filebug including this traceback, and a description of what you
were doing when the error occurred.
I don't think it's related to cvs2svn or cvsps as it fails in both cases.
It should be a bzr bug.
I've successfully converted my project to git repository format with these set of command :
export CVSROOT=/home/thomas/temp/cvs2git/cvs/files
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp crf-irp
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-model crf-irp-model
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-monitor crf-irp-monitor
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-portail crf-irp-portail
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-utilities crf-irp-utilities
Is it possible to convert the git version of my sources to bzr ? maybe it would be successfull.
Thomas.
On Tue, Oct 7, 2008 at 12:41, Michael Haggerty
<mhagger <at> alum.mit.edu> wrote:
Jelmer Vernooij wrote:
> Am Dienstag, den 07.10.2008, 00:01 +0200 schrieb Thomas Manson:
>> I've look to it... but didn't tryed yet...
>>
>> It really misses straightforward howto (for all tools except bzr
>> cvsimport)
> cvsps-import should be the best solution here, we should just fixing
> that imho. What's blocking you from using it?
No conversion tool that is based on cvsps will be able to do a truly
reliable job of migrating from CVS. cvsps, which was written for
another purpose, simply is not robust enough and does not emit enough
information for a complete conversion. I gave many concrete examples of
its shortcomings on the Mercurial mailing list [1].
Deducing a project's history from CVS's incomplete records is a very
tricky thing; cvs2svn's feature list [2] will give you an idea of the
kinds of things an industrial-strength converter needs to handle.
cvs2svn deduces the CVS changesets itself, using a much more robust
algorithm than that used by cvsps. (The main disadvantage of cvs2svn is
that it can only be used for one-time conversions, not for tracking a
live CVS repository incrementally.)
cvs2svn/cvs2git can create output in git-fast-import format [3], which
should also be readable by the bzr fast-import tool. It hasn't gotten
much testing in "cvs2bzr" mode, but given that 90% of the job is
inferring CVS's history, it should not be too much work to fix any
problems in the "2bzr" part. Therefore, any feedback would be much
appreciated.
(By the way, if you want to use cvs2svn to convert to bzr, I suggest
that you use the trunk version of cvs2svn, which has several
improvements compared to release 2.1.1.)
Michael
[1] http://selenic.com/pipermail/mercurial-devel/2008-February/004975.html
[2] http://cvs2svn.tigris.org/features.html
[3] http://cvs2svn.tigris.org/cvs2git.html
# (Be in -*- python -*- mode.)
#
# ====================================================================
# Copyright (c) 2006-2007 CollabNet. All rights reserved.
#
# This software is licensed as described in the file COPYING, which
# you should have received as part of this distribution. The terms
# are also available at http://subversion.tigris.org/license-1.html.
# If newer versions of this license are posted there, you may use a
# newer version instead, at your option.
#
# This software consists of voluntary contributions made by many
# individuals. For exact contribution history, see the revision
# history and logs, available at http://cvs2svn.tigris.org/.
# ====================================================================
# #####################
# ## PLEASE READ ME! ##
# #####################
#
# This is a template for an options file that can be used to configure
# cvs2svn. Many options do not have defaults, so it is easier to copy
# this file and modify what you need rather than creating a new
# options file from scratch.
#
# This file is in Python syntax, but you don't need to know Python to
# modify it. But if you *do* know Python, then you will be happy to
# know that you can use arbitary Python constructs to do fancy
# configuration tricks.
#
# But please be aware of the following:
#
# * In many places, leading whitespace is significant in Python (it is
# used instead of curly braces to group statements together).
# Therefore, if you don't know what you are doing, it is best to
# leave the whitespace as it is.
#
# * In normal strings, Python uses backslashes ("\") are used as an
# escape character. Therefore you need to be careful, especially
# when specifying regular expressions or Windows filenames. It is
# recommended that you use "raw strings" for these cases.
# Backslashes in raw strings are treated literally. A raw string is
# written by prefixing an "r" character to a string. Example:
#
# ctx.sort_executable = r'c:\windows\system32\sort.exe'
#
# Two identifiers will have been defined before this file is executed,
# and can be used freely within this file:
#
# ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds
# many configuration options
#
# run_options -- an instance of the OptionsFileRunOptions class
# (see cvs2svn_lib/run_options.py), which holds some variables
# governing how cvs2svn is run
# Import some modules that are used in setting the options:
import re
from cvs2svn_lib import config
from cvs2svn_lib import changeset_database
from cvs2svn_lib.common import CVSTextDecoder
from cvs2svn_lib.log import Log
from cvs2svn_lib.project import Project
from cvs2svn_lib.svn_output_option import DumpfileOutputOption
from cvs2svn_lib.svn_output_option import ExistingRepositoryOutputOption
from cvs2svn_lib.svn_output_option import NewRepositoryOutputOption
from cvs2svn_lib.revision_manager import NullRevisionRecorder
from cvs2svn_lib.revision_manager import NullRevisionExcluder
from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader
from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader
from cvs2svn_lib.checkout_internal import InternalRevisionRecorder
from cvs2svn_lib.checkout_internal import InternalRevisionExcluder
from cvs2svn_lib.checkout_internal import InternalRevisionReader
from cvs2svn_lib.symbol_strategy import AllBranchRule
from cvs2svn_lib.symbol_strategy import AllTagRule
from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule
from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule
from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule
from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule
from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule
from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule
from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule
from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform
from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform
from cvs2svn_lib.property_setters import AutoPropsPropertySetter
from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter
from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter
from cvs2svn_lib.property_setters import CVSRevisionNumberSetter
from cvs2svn_lib.property_setters import DefaultEOLStyleSetter
from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter
from cvs2svn_lib.property_setters import ExecutablePropertySetter
from cvs2svn_lib.property_setters import KeywordsPropertySetter
from cvs2svn_lib.property_setters import MimeMapper
from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter
# To choose the level of logging output, uncomment one of the
# following lines:
#Log().log_level = Log.WARN
#Log().log_level = Log.QUIET
Log().log_level = Log.NORMAL
#Log().log_level = Log.VERBOSE
#Log().log_level = Log.DEBUG
# There are several possible options for where to put the output of a
# cvs2svn conversion. Please choose one of the following and adjust
# the parameters as necessary:
# Use this output option if you would like cvs2svn to create a new SVN
# repository and store the converted repository there. The first
# argument is the path to which the repository should be written (this
# repository must not already exist). The second (optional) argument
# allows a --fs-type option to be passed to "svnadmin create". The
# third (optional) argument can be specified to set the
# --bdb-txn-nosync option on a bdb repository. The fourth (optional)
# argument can be specified to set a list of verbatim options to be passed
# to "svnadmin create":
ctx.output_option = NewRepositoryOutputOption(
r'/path/to/svnrepo', # Path to repository
#fs_type='fsfs', # Type of repository to create
#bdb_txn_nosync=False, # For bsd repositories, this option can be added
#create_options=['--pre-1.5-compatible'], # Options for "svnadmin create"
)
# Use this output option if you would like cvs2svn to store the
# converted CVS repository into an SVN repository that already exists.
# The argument is the filesystem path of an existing local SVN
# repository (this repository must already exist):
#ctx.output_option = ExistingRepositoryOutputOption(
# r'/path/to/svnrepo', # Path to repository
# )
# Use this type of output option if you want the output of the
# conversion to be written to a SVN dumpfile instead of committing
# them into an actual repository:
#ctx.output_option = DumpfileOutputOption(
# dumpfile_path=r'/path/to/cvs2svn-dump', # Name of dumpfile to create
# )
# Independent of the ctx.output_option selected, the following option
# can be set to True to suppress cvs2svn output altogether:
ctx.dry_run = False
# The following set of options specifies how the revision contents of
# the RCS files should be read.
#
# The default selection is InternalRevisionReader, which uses built-in
# code that reads the RCS deltas while parsing the files in
# CollectRevsPass. This method is very fast but requires lots of
# temporary disk space. The disk space is required for (1) storing
# all of the RCS deltas, and (2) during OutputPass, keeping a copy of
# the full text of every revision that still has a descendant that
# hasn't yet been committed. Since this can includes multiple
# revisions of each file (i.e., on multiple branches), the required
# amount of temporary space can potentially be many times the size of
# a checked out copy of the whole repository. Setting compress=True
# cuts the disk space requirements by about 50% at the price of
# increased CPU usage. Using compression usually speeds up the
# conversion due to the reduced I/O pressure, unless --tmpdir is on a
# RAM disk. This method does not expand CVS's "Log" keywords.
#
# The second possibility is RCSRevisionReader, which uses RCS's "co"
# program to extract the revision contents of the RCS files during
# OutputPass. This option doesn't require any temporary space, but it
# is relatively slow because (1) "co" has to be executed very many
# times; and (2) "co" itself has to assemble many file deltas to
# compute the contents of a particular revision. The constructor
# argument specifies how to invoke the "co" executable.
#
# The third possibility is CVSRevisionReader, which uses the "cvs"
# program to extract the revision contents out of the RCS files during
# OutputPass. This option doesn't require any temporary space, but it
# is the slowest of all, because "cvs" is considerably slower than
# "co". However, it works in some situations where RCSRevisionReader
# fails; see the HTML documentation of the "--use-cvs" option for
# details. The constructor argument specifies how to invoke the "co"
# executable.
#
# Choose one of the following three groups of lines:
ctx.revision_recorder = InternalRevisionRecorder(compress=True)
ctx.revision_excluder = InternalRevisionExcluder()
ctx.revision_reader = InternalRevisionReader(compress=True)
#ctx.revision_recorder = NullRevisionRecorder()
#ctx.revision_excluder = NullRevisionExcluder()
#ctx.revision_reader = RCSRevisionReader(co_executable=r'co')
#ctx.revision_recorder = NullRevisionRecorder()
#ctx.revision_excluder = NullRevisionExcluder()
#ctx.revision_reader = CVSRevisionReader(cvs_executable=r'cvs')
# Set the name (and optionally the path) of some other executables
# required by cvs2svn:
ctx.svnadmin_executable = r'svnadmin'
ctx.sort_executable = r'sort'
# Change the following line to True if the conversion should only
# include the trunk of the repository (i.e., all branches and tags
# should be ignored):
ctx.trunk_only = False
# Change the following line to True if cvs2svn should delete a
# directory once the last file has been deleted from it:
ctx.prune = False
# How to convert author names, log messages, and filenames to unicode.
# The first argument to CVSTextDecoder is a list of encoders that are
# tried in order in 'strict' mode until one of them succeeds. If none
# of those succeeds, then fallback_encoder is used in lossy 'replace'
# mode (if it is specified). Setting a fallback encoder ensures that
# the encoder always succeeds, but it can cause information loss.
ctx.cvs_author_decoder = CVSTextDecoder(
[
'latin1',
#'utf8',
#'ascii',
],
fallback_encoding='ascii'
)
ctx.cvs_log_decoder = CVSTextDecoder(
[
'latin1',
#'utf8',
#'ascii',
],
fallback_encoding='ascii'
)
# You might want to be especially strict when converting filenames to
# unicode (e.g., maybe not specify a fallback_encoding).
ctx.cvs_filename_decoder = CVSTextDecoder(
[
'latin1',
#'utf8',
#'ascii',
],
fallback_encoding='ascii'
)
# Template for the commit message to be used for initial project
# commits.
ctx.initial_project_commit_message = (
'Standard project directories initialized by cvs2svn.'
)
# Template for the commit message to be used for post commits, in
# which modifications to a vendor branch are copied back to trunk.
# This message can use '%(revnum)d' to include the revision number of
# the revision that included the change to the vendor branch.
ctx.post_commit_message = (
'This commit was generated by cvs2svn to compensate for '
'changes in r%(revnum)d, which included commits to RCS files '
'with non-trunk default branches.'
)
# Template for the commit message to be used for commits in which
# symbols are created. This message can use '%(symbol_type)d' to
# include the type of the symbol ('branch' or 'tag') or
# '%(symbol_name)' to include the name of the symbol.
ctx.symbol_commit_message = (
"This commit was manufactured by cvs2svn to create %(symbol_type)s "
"'%(symbol_name)s'."
)
# Some CVS clients for MacOS store resource fork data into CVS along
# with the file contents itself by wrapping it all up in a container
# format called "AppleSingle". Subversion currently does not support
# MacOS resource forks. Nevertheless, sometimes the resource fork
# information is not necessary and can be discarded. Set the
# following option to True if you would like cvs2svn to identify files
# whose contents are encoded in AppleSingle format, and discard all
# but the data fork for such files before committing them to
# Subversion. (Please note that AppleSingle contents are identified
# by the AppleSingle magic number as the first four bytes of the file.
# This check is not failproof, so only set this option if you think
# you need it.)
ctx.decode_apple_single = False
# This option can be set to the name of a filename to which are stored
# statistics and conversion decisions about the CVS symbols.
ctx.symbol_info_filename = None
#ctx.symbol_info_filename = 'symbol-info.txt'
# cvs2svn uses "symbol strategy rules" to help decide how to handle
# CVS symbols. The rules in a project's symbol_strategy_rules are
# applied in order, and each rule is allowed to modify the symbol.
# The result (after each of the rules has been applied) is used for
# the conversion.
#
# 1. A CVS symbol might be used as a tag in one file and as a branch
# in another file. cvs2svn has to decide whether to convert such a
# symbol as a tag or as a branch. cvs2svn uses a series of
# heuristic rules to decide how to convert a symbol. The user can
# override the default rules for specific symbols or symbols
# matching regular expressions.
#
# 2. cvs2svn is also capable of excluding symbols from the conversion
# (provided no other symbols depend on them.
#
# 3. CVS does not record unambiguously the line of development from
# which a symbol sprouted. cvs2svn uses a heuristic to choose a
# symbol's "preferred parents".
#
# The standard branch/tag/exclude StrategyRules do not change a symbol
# that has already been processed by an earlier rule, so in effect the
# first matching rule is the one that is used.
global_symbol_strategy_rules = [
# It is possible to specify manually exactly how symbols should be
# converted and what line of development should be used as the
# preferred parent. To do so, create a file containing the symbol
# hints and enable the following option.
#
# The format of the hints file is described in the documentation
# for the SymbolHintsFileRule class in
# cvs2svn_lib/symbol_strategy.py. The file output by the
# --write-symbol-info (i.e., ctx.symbol_info_filename) option is
# in the same format. The simplest way to use this option is to
# run the conversion through CollateSymbolsPass with
# --write-symbol-info option, copy the symbol info and edit it to
# create a hints file, then re-start the conversion at
# CollateSymbolsPass with this option enabled.
#SymbolHintsFileRule('symbol-hints.txt'),
# To force all symbols matching a regular expression to be
# converted as branches, add rules like the following:
#ForceBranchRegexpStrategyRule(r'branch.*'),
# To force all symbols matching a regular expression to be
# converted as tags, add rules like the following:
#ForceTagRegexpStrategyRule(r'tag.*'),
# To force all symbols matching a regular expression to be
# excluded from the conversion, add rules like the following:
#ExcludeRegexpStrategyRule(r'unknown-.*'),
# Sometimes people use "cvs import" to get their own source code
# into CVS. This practice creates a vendor branch 1.1.1 and
# imports the code onto the vendor branch as 1.1.1.1, then copies
# the same content to the trunk as version 1.1. Normally, such
# vendor branches are useless and they complicate the SVN history
# unnecessarily. The following rule excludes any branches that
# only existed as a vendor branch with a single import (leaving
# only the 1.1 revision). If you want to retain such branches,
# comment out the following line. (Please note that this rule
# does not exclude vendor *tags*, as they are not so easy to
# identify.)
ExcludeTrivialImportBranchRule(),
# To exclude all vendor branches (branches that had "cvs import"s
# on them bug no other kinds of commits), uncomment the following
# line:
#ExcludeVendorBranchRule(),
# Usually you want this rule, to convert unambiguous symbols
# (symbols that were only ever used as tags or only ever used as
# branches in CVS) the same way they were used in CVS:
UnambiguousUsageRule(),
# If there was ever a commit on a symbol, then it cannot be
# converted as a tag. This rule causes all such symbols to be
# converted as branches. If you would like to resolve such
# ambiguities manually, comment out the following line:
BranchIfCommitsRule(),
# Last in the list can be a catch-all rule that is used for
# symbols that were not matched by any of the more specific rules
# above. (Assuming that BranchIfCommitsRule() was included above,
# then the symbols that are still indeterminate at this point can
# sensibly be converted as branches or tags.) Include at most one
# of these lines. If none of these catch-all rules are included,
# then the presence of any ambiguous symbols (that haven't been
# disambiguated above) is an error:
# Convert ambiguous symbols based on whether they were used more
# often as branches or as tags:
HeuristicStrategyRule(),
# Convert all ambiguous symbols as branches:
#AllBranchRule(),
# Convert all ambiguous symbols as tags:
#AllTagRule(),
# The last rule is here to choose the preferred parent of branches
# and tags, that is, the line of development from which the symbol
# sprouts.
HeuristicPreferredParentRule(),
]
# Specify a username to be used for commits generated by cvs2svn. If
# this options is set to None then no username will be used for such
# commits:
ctx.username = None
#ctx.username = 'cvs2svn'
# ctx.svn_property_setters contains a list of rules used to set the
# svn properties on files in the converted archive. For each file,
# the rules are tried one by one. Any rule can add or suppress one or
# more svn properties. Typically the rules will not overwrite
# properties set by a previous rule (though they are free to do so).
ctx.svn_property_setters.extend([
# To read auto-props rules from a file, uncomment the following line
# and specify a filename. The boolean argument specifies whether
# case should be ignored when matching filenames to the filename
# patterns found in the auto-props file:
#AutoPropsPropertySetter(
# r'/home/username/.subversion/config',
# ignore_case=True,
# ),
# To read mime types from a file, uncomment the following line and
# specify a filename:
#MimeMapper(r'/etc/mime.types'),
# Omit the svn:eol-style property from any files that are listed
# as binary (i.e., mode '-kb') in CVS:
CVSBinaryFileEOLStyleSetter(),
# If the file is binary and its svn:mime-type property is not yet
# set, set svn:mime-type to 'application/octet-stream'.
CVSBinaryFileDefaultMimeTypeSetter(),
# To try to determine the eol-style from the mime type, uncomment
# the following line:
#EOLStyleFromMimeTypeSetter(),
# Choose one of the following lines to set the default
# svn:eol-style if none of the above rules applied. The argument
# is the svn:eol-style that should be applied, or None if no
# svn:eol-style should be set (i.e., the file should be treated as
# binary).
#
# The default is to treat all files as binary unless one of the
# previous rules has determined otherwise, because this is the
# safest approach. However, if you have been diligent about
# marking binary files with -kb in CVS and/or you have used the
# above rules to definitely mark binary files as binary, then you
# might prefer to use 'native' as the default, as it is usually
# the most convenient setting for text files. Other possible
# options: 'CRLF', 'CR', 'LF'.
DefaultEOLStyleSetter(None),
#DefaultEOLStyleSetter('native'),
# Prevent svn:keywords from being set on files that have
# svn:eol-style unset.
SVNBinaryFileKeywordsPropertySetter(),
# If svn:keywords has not been set yet, set it based on the file's
# CVS mode:
KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE),
# Set the svn:executable flag on any files that are marked in CVS as
# being executable:
ExecutablePropertySetter(),
# Uncomment the following line to include the original CVS revision
# numbers as file properties in the SVN archive:
#CVSRevisionNumberSetter(),
])
# The directory to use for temporary files:
ctx.tmpdir = r'cvs2svn-tmp'
# To skip the cleanup of temporary files, uncomment the following
# option:
#ctx.skip_cleanup = True
# In CVS, it is perfectly possible to make a single commit that
# affects more than one project or more than one branch of a single
# project. Subversion also allows such commits. Therefore, by
# default, when cvs2svn sees what looks like a cross-project or
# cross-branch CVS commit, it converts it into a
# cross-project/cross-branch Subversion commit.
#
# However, other tools and SCMs have trouble representing
# cross-project or cross-branch commits. (For example, Trac's Revtree
# plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by
# such commits.) Therefore, we provide the following two options to
# allow cross-project/cross-branch commits to be suppressed.
# To prevent CVS commits from different projects from being merged
# into single SVN commits, change this option to False:
ctx.cross_project_commits = True
# To prevent CVS commits on different branches from being merged into
# single SVN commits, change this option to False:
ctx.cross_branch_commits = True
# By default, .cvsignore files are rendered in the output by setting
# corresponding svn:ignore properties on the parent directory, but the
# .cvsignore files themselves are not included in the conversion
# output. If you would like to include the .cvsignore files in the
# output, change this option to True:
ctx.keep_cvsignore = False
# By default, it is a fatal error for a CVS ",v" file to appear both
# inside and outside of an "Attic" subdirectory (this should never
# happen, but frequently occurs due to botched repository
# administration). If you would like to retain both versions of such
# files, change the following option to True, and the attic version of
# the file will be left in an SVN subdirectory called "Attic":
ctx.retain_conflicting_attic_files = False
# Now use stanzas like the following to define CVS projects that
# should be converted. The arguments are:
#
# - The filesystem path of the project within the CVS repository.
#
# - The path that should be used for the "trunk" directory of this
# project within the SVN repository. This is an SVN path, so it
# should always use forward slashes ("/").
#
# - The path that should be used for the "branches" directory of this
# project within the SVN repository. This is an SVN path, so it
# should always use forward slashes ("/").
#
# - The path that should be used for the "tags" directory of this
# project within the SVN repository. This is an SVN path, so it
# should always use forward slashes ("/").
#
# - A list of symbol transformations that can be used to rename
# symbols in this project. Each entry is a tuple (pattern,
# replacement), where pattern is a Python regular expression pattern
# and replacement is the text that should replace the pattern. Each
# pattern is matched against each symbol name. If the pattern
# matches, then it is replaced with the corresponding replacement
# text. The replacement can include substitution patterns (e.g.,
# r'\1' or r'\g<name>'). Typically you will want to use raw strings
# (strings with a preceding 'r', like shown in the examples) for the
# regexp and its replacement to avoid backslash substitution within
# those strings.
# Create the default project (using ctx.trunk, ctx.branches, and ctx.tags):
run_options.add_project(
r'cvsrepo',
trunk_path='trunk',
branches_path='branches',
tags_path='tags',
initial_directories=[
# The project's trunk_path, branches_path, and tags_path
# directories are added to the SVN repository in the project's
# first commit. If you would like additional SVN directories
# to be created in the project's first commit, list them here.
#'releases',
],
symbol_transforms=[
#RegexpSymbolTransform(r'release-(\d+)_(\d+)',
# r'release-\1.\2'),
#RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)',
# r'release-\1.\2.\3'),
# Convert backslashes into forward slashes:
ReplaceSubstringsSymbolTransform('\\','/'),
# Eliminate leading, trailing, and repeated slashes:
NormalizePathsSymbolTransform(),
],
symbol_strategy_rules=[
# Additional, project-specific symbol strategy rules can
# be added here.
] + global_symbol_strategy_rules,
)
# Add a second project, to be stored to projA/trunk, projA/branches,
# and projA/tags:
#run_options.add_project(
# r'my/cvsrepo/projA',
# trunk_path='projA/trunk',
# branches_path='projA/branches',
# tags_path='projA/tags',
# initial_directories=[
# # The project's trunk_path, branches_path, and tags_path
# # directories are added to the SVN repository in the project's
# # first commit. If you would like additional SVN directories
# # to be created in the project's first commit, list them here.
# #'releases',
# ],
# symbol_transforms=[
# #RegexpSymbolTransform(r'release-(\d+)_(\d+)',
# # r'release-\1.\2'),
# #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)',
# # r'release-\1.\2.\3'),
# # Convert backslashes into forward slashes:
# ReplaceSubstringsSymbolTransform('\\','/'),
# # Eliminate leading, trailing, and repeated slashes:
# NormalizePathsSymbolTransform(),
# ],
# symbol_strategy_rules=[
# # Additional, project-specific symbol strategy rules can
# # be added here.
# ] + global_symbol_strategy_rules,
# )
# Change this option to True to turn on profiling of cvs2svn (for
# debugging purposes):
run_options.profiling = False
# Should CVSItem -> Changeset database files be memory mapped? In
# some tests, using memory mapping speeded up the overall conversion
# by about 5%. But this option can cause the conversion to fail with
# an out of memory error if the conversion computer runs out of
# virtual address space (e.g., when running a very large conversion on
# a 32-bit operating system). Therefore it is disabled by default.
# Uncomment the following line to allow these database files to be
# memory mapped.
#changeset_database.use_mmap_for_cvs_item_to_changeset_table = True
# (Be in -*- mode: python; coding: utf-8 -*- mode.)
# This is an example of an options file that can be used to make
# cvs2svn convert to git rather than to Subversion. See
# www/cvs2svn.html for general information, and see the comments in
# this file and in cvs2svn-example.options for information about what
# options are available and how they can be set.
# "cvs2git" is shorthand for "cvs2svn in the mode where it is
# outputting to git instead of Subversion". But the program that
# needs to be run is still called "cvs2svn". Run it with the
# --options option, passing it this file as argument:
#
# cvs2svn --options=cvs2svn-git.options
# Many options are shared with Subversion, so we simply import the
# sample Subversion options file (from the current directory, which
# should be the main cvs2svn directory)...
execfile('cvs2svn-example.options')
# ...and then we overwrite the things that need to be different.
from cvs2svn_lib.fulltext_revision_recorder \
import SimpleFulltextRevisionRecorderAdapter
from cvs2svn_lib.git_revision_recorder import GitRevisionRecorder
from cvs2svn_lib.git_output_option import GitRevisionMarkWriter
from cvs2svn_lib.git_output_option import GitOutputOption
# Set this option to True to convert only the main branch of the CVS
# repository:
ctx.trunk_only = False
# cvs2git only supports single-project conversions (multiple-project
# conversions wouldn't really make sense for git anyway). So these
# options must both be set to False:
ctx.cross_project_commits = False
ctx.cross_branch_commits = False
# Specify a username to be used for commits for which CVS doesn't
# record the original author (for example, the creation of a branch).
# This should be a simple (unix-style) username, but it can be
# translated into a git-style name by the author_transforms map.
ctx.username = 'cvs2svn'
# CVS uses unix login names as author names whereas git requires
# author names to be of the form "foo <bar>". The default is to set
# the git author to "cvsauthor <cvsauthor>". author_transforms can be
# used to map cvsauthor names (e.g., "jrandom") to a true name and
# email address (e.g., "J. Random <jrandom <at> example.com>" for the
# example shown). All values should be either 16-bit strings (i.e.,
# with "u" as a prefix) or 8-bit strings in the utf-8 encoding.
# Please substitute your own project's usernames here to use with the
# author_transforms option of GitOutputOption below.
author_transforms={
'jrandom' : ('J. Random', 'jrandom <at> example.com'),
'mhagger' : ('Michael Haggerty', 'mhagger <at> alum.mit.edu'),
'brane' : (u'Branko Äibej', 'brane <at> xbc.nu'),
'ringstrom' : ('Tobias Ringström', 'tobias <at> ringstrom.mine.nu'),
'dionisos' : (u'Erik Hülsmann', 'e.huelsmann <at> gmx.net'),
# This one will be used for commits for which CVS doesn't record
# the original author, as explained above.
'cvs2svn' : ('cvs2svn', 'admin <at> example.com'),
}
# This is the main option that causes cvs2svn to output to git rather
# than Subversion:
ctx.output_option = GitOutputOption(
# The file in which to write the git-fast-import stream that
# contains the changesets and branch/tag information:
'cvs2svn-tmp/git-dump.dat',
# The blobs will be written via the revision recorder, so in
# OutputPass we only have to emit references to the blob marks:
GitRevisionMarkWriter(),
# This option can be set to an integer to limit the number of
# revisions that are merged with the main parent in any commit.
# For git output, this can be set to None (unlimited), though due
# to the limitations of other tools you might want to set it to a
# smaller number (e.g., 16). For Mercurial output, this should be
# set to 1.
max_merges=None,
#max_merges=1,
# Optional map from CVS author names to git author names:
author_transforms=author_transforms,
)
# During CollectRevsPass, cvs2git records the contents of file
# revisions into a file in git-fast-import format. This option
# configures that process:
ctx.revision_recorder = SimpleFulltextRevisionRecorderAdapter(
# cvs2git uses either RCS's "co" command or CVS's "cvs co -p" to
# extract the content of file revisions. Here you can choose
# whether to use RCS (faster, but fails in some rare
# circumstances) or CVS (much slower, but more reliable).
#RCSRevisionReader(co_executable=r'co'),
CVSRevisionReader(cvs_executable=r'cvs'),
# The file in which to write the git-fast-import stream that
# contains the file revision contents:
GitRevisionRecorder('cvs2svn-tmp/git-blob.dat'),
)
# cvs2git does not need to know what revisions will be excluded, so
# leave this option unchanged:
ctx.revision_excluder = NullRevisionExcluder()
# cvs2git does not need a revision reader, so leave this option
# unchanged.
ctx.revision_reader = None
# Clear the default project that was set in cvs2svn-example.options:
run_options.clear_projects()
# Now add a project to be converted to git:
run_options.add_project(
# The path to the part of the CVS repository (*not* a CVS working
# copy) that should be converted. This may be a subdirectory
# (i.e., a module) within a larger CVS repository.
r'cvsrepo',
# See cvs2svn-example.options for more documention about symbol
# transforms that can be set using this option.
symbol_transforms=[
ReplaceSubstringsSymbolTransform('\\','/'),
NormalizePathsSymbolTransform(),
],
# See www/cvs2svn.html and cvs2svn-example.options for more
# documention about symbol strategy rules. The settings below
# will give pretty much a 1:1 conversion, but you can change the
# settings for more customization (for example, excluding a symbol
# from the conversion, forcing it to be output as a branch
# vs. tag, changing its name, etc).
symbol_strategy_rules=[
# Read from a file how symbols should be converted:
#SymbolHintsFileRule('symbol-hints.txt'),
# Specific rules for symbols matching particular regexps:
#ForceBranchRegexpStrategyRule(r'branch.*'),
#ForceTagRegexpStrategyRule(r'tag.*'),
#ExcludeRegexpStrategyRule(r'unknown-.*'),
# If a symbol is used consistently in CVS, do the same in git:
UnambiguousUsageRule(),
# If there are ever commits on a symbol, force it to be a
# branch:
BranchIfCommitsRule(),
# Uncomment at most one of the following group of
# "catch-all" rules:
HeuristicStrategyRule(),
#AllBranchRule(),
#AllTagRule(),
# This rule should always be included. It determines from
# which parent branch each daughter branch/tag sprouts.
HeuristicPreferredParentRule(),
],
)