Thomas Manson | 6 Oct 22:11

CVS migration help

Hi,
 
  Does anyone have full instructions for migrating from CVS to Bazaar other than bzr cvsps-import?

I've tried
 
bzr cvsps-import cvs/files/ . bazaar --use-cvs
or
bzr cvsps-import cvs/files/ . bazaar
 
but I get an encoding error.

I really want to be able to start coding with Bazaar, and I'm stuck on this encoding bug, which is really frustrating!
 
Thomas.
Thomas Manson | 6 Oct 22:57

Re: CVS migration help

Has anyone ever converted from CVS to Bazaar?

I've now tried fast-import...

The documentation says:
 
bzr init-repo .
front-end | bzr fast-import -
 
As a front-end, there is cvs2svn...

But cvs2svn doesn't have an option to write the conversion to standard output,
so I tried the following:
 
cvs2svn --encoding ISO-8859-1 --dumpfile=test  cvs/files/
mkdir bzr-tst
cd bzr-tst
bzr init-repo .
cat ../test  | bzr fast-import -
 
which fails...
 
bzr: ERROR: line 1: Invalid command 'SVN-fs-dump-format-version: 2'
 
I even tried this :
 
 cvs2svn --encoding ISO-8859-1 --dumpfile=../test  ../cvs/files/ | bzr fast-import -
 
bzr: ERROR: line 1: Invalid command '----- pass 1 (CollectRevsPass) -----'
Traceback (most recent call last):
  File "/usr/bin/cvs2svn", line 31, in <module>
    main(sys.argv[0], sys.argv[1:])
...

 
Maybe I'll stay on CVS... it may be bad, but it works... (yes, I'm bitter)
 

Michael Haggerty | 7 Oct 09:36

Re: CVS migration help

Thomas Manson wrote:
> Has anyone ever converted from CVS to Bazaar?
>
> I've now tried fast-import...
> The documentation says:
>
>     bzr init-repo .
>     front-end | bzr fast-import -
>
> As a front-end, there is cvs2svn...
>
> But cvs2svn doesn't have an option to write the conversion to standard output,
> so I tried the following:

There is no need for cvs2svn to write the conversion to its output.  You
can have cvs2svn write the output to files, then import the files into
bzr using something like

    bzr fast-import <my-dump-filename

But to use cvs2svn to convert to bzr, you need to follow the
instructions for converting to git [1].  The dumpfile created with the
--dumpfile option is an SVN dumpfile that bzr won't understand.
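
A quick way to tell the two formats apart before feeding a file to
bzr fast-import is to look at its first line.  The following is only a
sketch: the function names are made up for illustration and are not part
of cvs2svn or bzr, and the keyword list is the set of top-level commands
of the git-fast-import stream format.

```python
# Hypothetical helper (not part of cvs2svn or bzr): guess the format of a
# dump file from its first non-empty line.

# Top-level commands of the git-fast-import stream format.
FAST_IMPORT_COMMANDS = (
    b"blob", b"commit", b"reset", b"tag", b"checkpoint",
    b"progress", b"feature", b"option", b"done",
)


def guess_dump_format(first_line):
    """Return 'svn-dump', 'fast-import' or 'unknown' for a first line (bytes)."""
    line = first_line.strip()
    if line.startswith(b"SVN-fs-dump-format-version:"):
        return "svn-dump"
    if line.startswith(b"#"):
        # A fast-import stream may begin with a comment line.
        return "fast-import"
    words = line.split(None, 1)
    if words and words[0] in FAST_IMPORT_COMMANDS:
        return "fast-import"
    return "unknown"


def guess_file_format(path):
    """Classify a file by its first non-empty line."""
    with open(path, "rb") as stream:
        for line in stream:
            if line.strip():
                return guess_dump_format(line)
    return "unknown"
```

Both error messages quoted earlier in this thread fall into the
non-fast-import cases: the SVN dumpfile header is recognised as
'svn-dump', and cvs2svn's progress output is reported as 'unknown'.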

As for the problems that you have been having with filename encodings,
it should be possible to have cvs2svn handle them by listing one or more
filename encodings in the ctx.cvs_filename_decoder option in your
options file.  See cvs2svn-example.options and
test-data/main-cvsrepos/cvs2svn-git-inline.options for more information.
The decoder that you need here is the one that your filesystem uses.
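
To illustrate the idea of a decoder with several candidate encodings,
here is a minimal stand-alone sketch.  It is not cvs2svn code:
decode_with_fallback is a hypothetical name, and the behaviour (try each
encoding strictly in order, then optionally fall back to a lossy decode)
is only modelled on the description in the example options file.

```python
# Sketch of "try several encodings strictly, then fall back lossily".
# decode_with_fallback is a hypothetical name, not a cvs2svn function.

def decode_with_fallback(raw, encodings, fallback_encoding=None):
    """Decode bytes with the first encoding that succeeds in strict mode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding, "strict")
        except UnicodeDecodeError:
            continue
    if fallback_encoding is not None:
        # Lossy: undecodable bytes become U+FFFD replacement characters.
        return raw.decode(fallback_encoding, "replace")
    raise UnicodeError("none of %r could decode %r" % (encodings, raw))


# A Latin-1 encoded filename like the one discussed in this thread.
latin1_name = "Sp\u00e9cifications.doc".encode("latin-1")
print(decode_with_fallback(latin1_name, ["utf-8", "latin-1"]))
```

The order of the list matters.  Every byte sequence is valid Latin-1, so
an entry such as 'latin1' always succeeds and anything listed after it is
never tried.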

If you need more help with cvs2svn, please CC your emails to
users <at> cvs2svn.tigris.org (as I have with this email).

Michael

[1] http://cvs2svn.tigris.org/cvs2git.html
Nicholas Allen | 6 Oct 23:31

Re: CVS migration help


Have you tried using tailor?

http://progetti.arstecnica.it/tailor

It can convert to and from many kinds of version control systems
(including Bazaar and CVS).

Cheers,

Nick


Thomas Manson | 7 Oct 00:01

Re: CVS migration help

I've looked at it... but haven't tried it yet...

A straightforward howto is really missing (for all tools except bzr cvsimport)


I guess I have to change this tailor config file:
 
 
[pxlib]
root-directory = /wip/sf.net/pxlib
target = darcs:pxlib
source = cvs:pxlib
subdir = pxlib
 
[darcs:pxlib]
[cvs:pxlib]
repository = :pserver:anonymous@cvs.sf.net:/cvsroot/pxlib
module = pxlib
encoding = iso-8859-1
 
to something with target = bzr:something...


To ease the migration for CVS people like me who barely know anything but CVS and nothing about bzr, a full howto that covers the standard use cases is a must-have...

Really, I'm on holiday now so I've got some time to search, but in the real world (i.e. on work time), I would have tried for one or two hours and then said:

Well, let's not lose time... use CVS.
 
 
Thomas

 

Jelmer Vernooij | 7 Oct 00:34

Re: CVS migration help

On Tuesday, 07.10.2008, at 00:01 +0200, Thomas Manson wrote:
> I've looked at it... but haven't tried it yet...
>
> A straightforward howto is really missing (for all tools except bzr
> cvsimport)
cvsps-import should be the best solution here; we should just fix
that, IMHO. What's blocking you from using it?

Cheers,

Jelmer


Thomas Manson | 7 Oct 00:46

Re: CVS migration help

Hi Jelmer,
 
  Apparently there's an encoding issue with file names (or commit comments).


If you want to give it a try, you can download the CVS repo archive here:

The cvsroot is in cvs/files in the archive.

I've searched the archive for every possible accented character and found only 2 files:
 
paquerette@home:/tmp/cvs/files$ find . | grep é
./crf-irp/Ressources/documentation/Spécifications.doc,v
./crf-irp-monitor/Ressources/documentation/Spécifications.doc,v
paquerette@home:/tmp/cvs/files$ find . | grep ç
paquerette@home:/tmp/cvs/files$ find . | grep è
paquerette@home:/tmp/cvs/files$ find . | grep à
paquerette@home:/tmp/cvs/files$ find . | grep ù
paquerette@home:/tmp/cvs/files$

I tried removing these two files (with rm). That was enough for cvs2svn to work (plus --encoding ISO-8859-1 to get rid of the warning about commit comment encoding), but not for bzr cvsps-import.


I'm a French team lead, and accents are part of my life, so I have to deal with them.
I test 'new' things on my own personal projects before deploying them in my business environment and teaching my engineers to use them.
 
Thanks for your help,
Thomas.

 


Michael Haggerty | 7 Oct 12:41

Re: CVS migration help

Jelmer Vernooij wrote:
> On Tuesday, 07.10.2008, at 00:01 +0200, Thomas Manson wrote:
>> I've looked at it... but haven't tried it yet...
>>
>> A straightforward howto is really missing (for all tools except bzr
>> cvsimport)
> cvsps-import should be the best solution here; we should just fix
> that, IMHO. What's blocking you from using it?

No conversion tool that is based on cvsps will be able to do a truly
reliable job of migrating from CVS.  cvsps, which was written for
another purpose, simply is not robust enough and does not emit enough
information for a complete conversion.  I gave many concrete examples of
its shortcomings on the Mercurial mailing list [1].

Deducing a project's history from CVS's incomplete records is a very
tricky thing; cvs2svn's feature list [2] will give you an idea of the
kinds of things an industrial-strength converter needs to handle.
cvs2svn deduces the CVS changesets itself, using a much more robust
algorithm than that used by cvsps.  (The main disadvantage of cvs2svn is
that it can only be used for one-time conversions, not for tracking a
live CVS repository incrementally.)

cvs2svn/cvs2git can create output in git-fast-import format [3], which
should also be readable by the bzr fast-import tool.  It hasn't gotten
much testing in "cvs2bzr" mode, but given that 90% of the job is
inferring CVS's history, it should not be too much work to fix any
problems in the "2bzr" part.  Therefore, any feedback would be much
appreciated.

(By the way, if you want to use cvs2svn to convert to bzr, I suggest
that you use the trunk version of cvs2svn, which has several
improvements compared to release 2.1.1.)

Michael

[1] http://selenic.com/pipermail/mercurial-devel/2008-February/004975.html

[2] http://cvs2svn.tigris.org/features.html

[3] http://cvs2svn.tigris.org/cvs2git.html

Thomas Manson | 7 Oct 17:03

Re: CVS migration help

Hi Michael,
 
  I've checked out the trunk version (the version in Ubuntu Hardy Heron is quite old: 2.0.1)
  and the cvs2svn conversion succeeded.

Unfortunately, bzr fast-import crashes in the same way that bzr cvsps-import does:
 
 
thomas@home:~/temp/bzr$ cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | bzr fast-import -
bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 834, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 790, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 199, in run
    params, verbose)
  File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 77, in _run
    return proc.process(p.iter_commands)
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 83, in process
    self._process(command_iter)
  File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 317, in _process
    processor.ImportProcessor._process(self, command_iter)
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 105, in _process
    handler(self, cmd)
  File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 486, in commit_handler
    handler.process()
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 164, in process
    for fc in self.command.file_iter():
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 312, in iter_file_commands
    yield self._parse_file_modify(line[2:])
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 365, in _parse_file_modify
    path = self._path(params[2])
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 493, in _path
    return s.decode('utf_8')
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
bzr 1.3.1 on python 2.5.2.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'fast-import', '-']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_US.UTF-8'
plugins:
  bzrtools             /usr/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.3.0]
  cvsps_import         /home/thomas/.bazaar/plugins/cvsps_import [unknown]
  fastimport           /home/thomas/.bazaar/plugins/fastimport [unknown]
  launchpad            /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.
 
I don't think it's related to cvs2svn or cvsps, as it fails in both cases.
It must be a bzr bug.
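
The byte positions in the traceback give a hint about which bytes are
being rejected.  The sketch below rests on two unconfirmed assumptions:
that the fast-import stream contains the path relative to cvs/files, and
that the file name is stored on disk as Latin-1.  Under those
assumptions, the Latin-1 byte 0xE9 of the crf-irp-monitor file sits at
offset 43, which matches the start of the range reported above.  Python
2.5 reports the range 43-45, while a current Python reports only the
start offset.

```python
# Sketch: where does UTF-8 decoding fail for a Latin-1 encoded path?
# Assumption: the path is relative to cvs/files and encoded as Latin-1.

path = "crf-irp-monitor/Ressources/documentation/Sp\u00e9cifications.doc"
latin1_bytes = path.encode("latin-1")


def first_utf8_error(raw):
    """Return the byte offset of the first UTF-8 decoding error, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as error:
        return error.start
    return None


print(first_utf8_error(latin1_bytes))          # offset of the Latin-1 byte
print(first_utf8_error(path.encode("utf-8")))  # None: valid UTF-8
```

If this reading is right, the stream handed to bzr fast-import contains
a path that is not valid UTF-8, and bzr fast-import decodes paths as
UTF-8 (see parser.py, line 493 in the traceback).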
 
I've successfully converted my project to git repository format with this set of commands:
 
export CVSROOT=/home/thomas/temp/cvs2git/cvs/files

git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp           crf-irp
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-model     crf-irp-model
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-monitor   crf-irp-monitor
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-portail   crf-irp-portail
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-utilities crf-irp-utilities
 
 
Is it possible to convert the git version of my sources to bzr? Maybe that would be successful.
 
 
Thomas.
 
 


# (Be in -*- python -*- mode.)
#
# ====================================================================
# Copyright (c) 2006-2007 CollabNet.  All rights reserved.
#
# This software is licensed as described in the file COPYING, which
# you should have received as part of this distribution.  The terms
# are also available at http://subversion.tigris.org/license-1.html.
# If newer versions of this license are posted there, you may use a
# newer version instead, at your option.
#
# This software consists of voluntary contributions made by many
# individuals.  For exact contribution history, see the revision
# history and logs, available at http://cvs2svn.tigris.org/.
# ====================================================================

#                  #####################
#                  ## PLEASE READ ME! ##
#                  #####################
#
# This is a template for an options file that can be used to configure
# cvs2svn.  Many options do not have defaults, so it is easier to copy
# this file and modify what you need rather than creating a new
# options file from scratch.
#
# This file is in Python syntax, but you don't need to know Python to
# modify it.  But if you *do* know Python, then you will be happy to
# know that you can use arbitrary Python constructs to do fancy
# configuration tricks.
#
# But please be aware of the following:
#
# * In many places, leading whitespace is significant in Python (it is
#   used instead of curly braces to group statements together).
#   Therefore, if you don't know what you are doing, it is best to
#   leave the whitespace as it is.
#
# * In normal strings, Python uses backslashes ("\") as an
#   escape character.  Therefore you need to be careful, especially
#   when specifying regular expressions or Windows filenames.  It is
#   recommended that you use "raw strings" for these cases.
#   Backslashes in raw strings are treated literally.  A raw string is
#   written by prefixing an "r" character to a string.  Example:
#
#       ctx.sort_executable = r'c:\windows\system32\sort.exe'
#
# Two identifiers will have been defined before this file is executed,
# and can be used freely within this file:
#
#     ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds
#         many configuration options
#
#     run_options -- an instance of the OptionsFileRunOptions class
#         (see cvs2svn_lib/run_options.py), which holds some variables
#         governing how cvs2svn is run

# Import some modules that are used in setting the options:
import re

from cvs2svn_lib import config
from cvs2svn_lib import changeset_database
from cvs2svn_lib.common import CVSTextDecoder
from cvs2svn_lib.log import Log
from cvs2svn_lib.project import Project
from cvs2svn_lib.svn_output_option import DumpfileOutputOption
from cvs2svn_lib.svn_output_option import ExistingRepositoryOutputOption
from cvs2svn_lib.svn_output_option import NewRepositoryOutputOption
from cvs2svn_lib.revision_manager import NullRevisionRecorder
from cvs2svn_lib.revision_manager import NullRevisionExcluder
from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader
from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader
from cvs2svn_lib.checkout_internal import InternalRevisionRecorder
from cvs2svn_lib.checkout_internal import InternalRevisionExcluder
from cvs2svn_lib.checkout_internal import InternalRevisionReader
from cvs2svn_lib.symbol_strategy import AllBranchRule
from cvs2svn_lib.symbol_strategy import AllTagRule
from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule
from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule
from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule
from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule
from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule
from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule
from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule
from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform
from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform
from cvs2svn_lib.property_setters import AutoPropsPropertySetter
from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter
from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter
from cvs2svn_lib.property_setters import CVSRevisionNumberSetter
from cvs2svn_lib.property_setters import DefaultEOLStyleSetter
from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter
from cvs2svn_lib.property_setters import ExecutablePropertySetter
from cvs2svn_lib.property_setters import KeywordsPropertySetter
from cvs2svn_lib.property_setters import MimeMapper
from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter

# To choose the level of logging output, uncomment one of the
# following lines:
#Log().log_level = Log.WARN
#Log().log_level = Log.QUIET
Log().log_level = Log.NORMAL
#Log().log_level = Log.VERBOSE
#Log().log_level = Log.DEBUG

# There are several possible options for where to put the output of a
# cvs2svn conversion.  Please choose one of the following and adjust
# the parameters as necessary:

# Use this output option if you would like cvs2svn to create a new SVN
# repository and store the converted repository there.  The first
# argument is the path to which the repository should be written (this
# repository must not already exist).  The second (optional) argument
# allows a --fs-type option to be passed to "svnadmin create".  The
# third (optional) argument can be specified to set the
# --bdb-txn-nosync option on a bdb repository.  The fourth (optional)
# argument can be specified to set a list of verbatim options to be passed
# to "svnadmin create":
ctx.output_option = NewRepositoryOutputOption(
    r'/path/to/svnrepo', # Path to repository
    #fs_type='fsfs', # Type of repository to create
    #bdb_txn_nosync=False, # For bdb repositories, this option can be added
    #create_options=['--pre-1.5-compatible'], # Options for "svnadmin create"
    )

# Use this output option if you would like cvs2svn to store the
# converted CVS repository into an SVN repository that already exists.
# The argument is the filesystem path of an existing local SVN
# repository (this repository must already exist):
#ctx.output_option = ExistingRepositoryOutputOption(
#    r'/path/to/svnrepo', # Path to repository
#    )

# Use this type of output option if you want the output of the
# conversion to be written to a SVN dumpfile instead of committing
# them into an actual repository:
#ctx.output_option = DumpfileOutputOption(
#    dumpfile_path=r'/path/to/cvs2svn-dump', # Name of dumpfile to create
#    )

# Independent of the ctx.output_option selected, the following option
# can be set to True to suppress cvs2svn output altogether:
ctx.dry_run = False

# The following set of options specifies how the revision contents of
# the RCS files should be read.
#
# The default selection is InternalRevisionReader, which uses built-in
# code that reads the RCS deltas while parsing the files in
# CollectRevsPass.  This method is very fast but requires lots of
# temporary disk space.  The disk space is required for (1) storing
# all of the RCS deltas, and (2) during OutputPass, keeping a copy of
# the full text of every revision that still has a descendant that
# hasn't yet been committed.  Since this can include multiple
# revisions of each file (i.e., on multiple branches), the required
# amount of temporary space can potentially be many times the size of
# a checked out copy of the whole repository.  Setting compress=True
# cuts the disk space requirements by about 50% at the price of
# increased CPU usage.  Using compression usually speeds up the
# conversion due to the reduced I/O pressure, unless --tmpdir is on a
# RAM disk.  This method does not expand CVS's "Log" keywords.
#
# The second possibility is RCSRevisionReader, which uses RCS's "co"
# program to extract the revision contents of the RCS files during
# OutputPass.  This option doesn't require any temporary space, but it
# is relatively slow because (1) "co" has to be executed very many
# times; and (2) "co" itself has to assemble many file deltas to
# compute the contents of a particular revision.  The constructor
# argument specifies how to invoke the "co" executable.
#
# The third possibility is CVSRevisionReader, which uses the "cvs"
# program to extract the revision contents out of the RCS files during
# OutputPass.  This option doesn't require any temporary space, but it
# is the slowest of all, because "cvs" is considerably slower than
# "co".  However, it works in some situations where RCSRevisionReader
# fails; see the HTML documentation of the "--use-cvs" option for
# details.  The constructor argument specifies how to invoke the "cvs"
# executable.
#
# Choose one of the following three groups of lines:
ctx.revision_recorder = InternalRevisionRecorder(compress=True)
ctx.revision_excluder = InternalRevisionExcluder()
ctx.revision_reader = InternalRevisionReader(compress=True)

#ctx.revision_recorder = NullRevisionRecorder()
#ctx.revision_excluder = NullRevisionExcluder()
#ctx.revision_reader = RCSRevisionReader(co_executable=r'co')

#ctx.revision_recorder = NullRevisionRecorder()
#ctx.revision_excluder = NullRevisionExcluder()
#ctx.revision_reader = CVSRevisionReader(cvs_executable=r'cvs')

# Set the name (and optionally the path) of some other executables
# required by cvs2svn:
ctx.svnadmin_executable = r'svnadmin'
ctx.sort_executable = r'sort'

# Change the following line to True if the conversion should only
# include the trunk of the repository (i.e., all branches and tags
# should be ignored):
ctx.trunk_only = False

# Change the following line to True if cvs2svn should delete a
# directory once the last file has been deleted from it:
ctx.prune = False

# How to convert author names, log messages, and filenames to unicode.
# The first argument to CVSTextDecoder is a list of encodings that are
# tried in order in 'strict' mode until one of them succeeds.  If none
# of those succeeds, then fallback_encoding is used in lossy 'replace'
# mode (if it is specified).  Setting a fallback encoding ensures that
# the decoder always succeeds, but it can cause information loss.
ctx.cvs_author_decoder = CVSTextDecoder(
    [
        'latin1',
        #'utf8',
        #'ascii',
        ],
    fallback_encoding='ascii'
    )
ctx.cvs_log_decoder = CVSTextDecoder(
    [
        'latin1',
        #'utf8',
        #'ascii',
        ],
    fallback_encoding='ascii'
    )
# You might want to be especially strict when converting filenames to
# unicode (e.g., maybe not specify a fallback_encoding).
ctx.cvs_filename_decoder = CVSTextDecoder(
    [
        'latin1',
        #'utf8',
        #'ascii',
        ],
    fallback_encoding='ascii'
    )

# Template for the commit message to be used for initial project
# commits.
ctx.initial_project_commit_message = (
    'Standard project directories initialized by cvs2svn.'
    )

# Template for the commit message to be used for post commits, in
# which modifications to a vendor branch are copied back to trunk.
# This message can use '%(revnum)d' to include the revision number of
# the revision that included the change to the vendor branch.
ctx.post_commit_message = (
    'This commit was generated by cvs2svn to compensate for '
    'changes in r%(revnum)d, which included commits to RCS files '
    'with non-trunk default branches.'
    )

# Template for the commit message to be used for commits in which
# symbols are created.  This message can use '%(symbol_type)s' to
# include the type of the symbol ('branch' or 'tag') or
# '%(symbol_name)s' to include the name of the symbol.
ctx.symbol_commit_message = (
    "This commit was manufactured by cvs2svn to create %(symbol_type)s "
    "'%(symbol_name)s'."
    )

# Some CVS clients for MacOS store resource fork data into CVS along
# with the file contents itself by wrapping it all up in a container
# format called "AppleSingle".  Subversion currently does not support
# MacOS resource forks.  Nevertheless, sometimes the resource fork
# information is not necessary and can be discarded.  Set the
# following option to True if you would like cvs2svn to identify files
# whose contents are encoded in AppleSingle format, and discard all
# but the data fork for such files before committing them to
# Subversion.  (Please note that AppleSingle contents are identified
# by the AppleSingle magic number as the first four bytes of the file.
# This check is not failproof, so only set this option if you think
# you need it.)
ctx.decode_apple_single = False

# This option can be set to the name of a file in which statistics
# and conversion decisions about the CVS symbols are stored.
ctx.symbol_info_filename = None
#ctx.symbol_info_filename = 'symbol-info.txt'

# cvs2svn uses "symbol strategy rules" to help decide how to handle
# CVS symbols.  The rules in a project's symbol_strategy_rules are
# applied in order, and each rule is allowed to modify the symbol.
# The result (after each of the rules has been applied) is used for
# the conversion.
#
# 1. A CVS symbol might be used as a tag in one file and as a branch
#    in another file.  cvs2svn has to decide whether to convert such a
#    symbol as a tag or as a branch.  cvs2svn uses a series of
#    heuristic rules to decide how to convert a symbol.  The user can
#    override the default rules for specific symbols or symbols
#    matching regular expressions.
#
# 2. cvs2svn is also capable of excluding symbols from the conversion
#    (provided no other symbols depend on them).
#
# 3. CVS does not record unambiguously the line of development from
#    which a symbol sprouted.  cvs2svn uses a heuristic to choose a
#    symbol's "preferred parents".
#
# The standard branch/tag/exclude StrategyRules do not change a symbol
# that has already been processed by an earlier rule, so in effect the
# first matching rule is the one that is used.

global_symbol_strategy_rules = [
    # It is possible to specify manually exactly how symbols should be
    # converted and what line of development should be used as the
    # preferred parent.  To do so, create a file containing the symbol
    # hints and enable the following option.
    #
    # The format of the hints file is described in the documentation
    # for the SymbolHintsFileRule class in
    # cvs2svn_lib/symbol_strategy.py.  The file output by the
    # --write-symbol-info (i.e., ctx.symbol_info_filename) option is
    # in the same format.  The simplest way to use this option is to
    # run the conversion through CollateSymbolsPass with
    # --write-symbol-info option, copy the symbol info and edit it to
    # create a hints file, then re-start the conversion at
    # CollateSymbolsPass with this option enabled.
    #SymbolHintsFileRule('symbol-hints.txt'),

    # To force all symbols matching a regular expression to be
    # converted as branches, add rules like the following:
    #ForceBranchRegexpStrategyRule(r'branch.*'),

    # To force all symbols matching a regular expression to be
    # converted as tags, add rules like the following:
    #ForceTagRegexpStrategyRule(r'tag.*'),

    # To force all symbols matching a regular expression to be
    # excluded from the conversion, add rules like the following:
    #ExcludeRegexpStrategyRule(r'unknown-.*'),

    # Sometimes people use "cvs import" to get their own source code
    # into CVS.  This practice creates a vendor branch 1.1.1 and
    # imports the code onto the vendor branch as 1.1.1.1, then copies
    # the same content to the trunk as version 1.1.  Normally, such
    # vendor branches are useless and they complicate the SVN history
    # unnecessarily.  The following rule excludes any branches that
    # only existed as a vendor branch with a single import (leaving
    # only the 1.1 revision).  If you want to retain such branches,
    # comment out the following line.  (Please note that this rule
    # does not exclude vendor *tags*, as they are not so easy to
    # identify.)
    ExcludeTrivialImportBranchRule(),

    # To exclude all vendor branches (branches that had "cvs import"s
    # on them but no other kinds of commits), uncomment the following
    # line:
    #ExcludeVendorBranchRule(),

    # Usually you want this rule, to convert unambiguous symbols
    # (symbols that were only ever used as tags or only ever used as
    # branches in CVS) the same way they were used in CVS:
    UnambiguousUsageRule(),

    # If there was ever a commit on a symbol, then it cannot be
    # converted as a tag.  This rule causes all such symbols to be
    # converted as branches.  If you would like to resolve such
    # ambiguities manually, comment out the following line:
    BranchIfCommitsRule(),

    # Last in the list can be a catch-all rule that is used for
    # symbols that were not matched by any of the more specific rules
    # above.  (Assuming that BranchIfCommitsRule() was included above,
    # then the symbols that are still indeterminate at this point can
    # sensibly be converted as branches or tags.)  Include at most one
    # of these lines.  If none of these catch-all rules are included,
    # then the presence of any ambiguous symbols (that haven't been
    # disambiguated above) is an error:

    # Convert ambiguous symbols based on whether they were used more
    # often as branches or as tags:
    HeuristicStrategyRule(),
    # Convert all ambiguous symbols as branches:
    #AllBranchRule(),
    # Convert all ambiguous symbols as tags:
    #AllTagRule(),

    # The last rule is here to choose the preferred parent of branches
    # and tags, that is, the line of development from which the symbol
    # sprouts.
    HeuristicPreferredParentRule(),
    ]

# Specify a username to be used for commits generated by cvs2svn.  If
# this option is set to None then no username will be used for such
# commits:
ctx.username = None
#ctx.username = 'cvs2svn'

# ctx.svn_property_setters contains a list of rules used to set the
# svn properties on files in the converted archive.  For each file,
# the rules are tried one by one.  Any rule can add or suppress one or
# more svn properties.  Typically the rules will not overwrite
# properties set by a previous rule (though they are free to do so).
ctx.svn_property_setters.extend([
    # To read auto-props rules from a file, uncomment the following line
    # and specify a filename.  The boolean argument specifies whether
    # case should be ignored when matching filenames to the filename
    # patterns found in the auto-props file:
    #AutoPropsPropertySetter(
    #    r'/home/username/.subversion/config',
    #    ignore_case=True,
    #    ),

    # To read mime types from a file, uncomment the following line and
    # specify a filename:
    #MimeMapper(r'/etc/mime.types'),

    # Omit the svn:eol-style property from any files that are listed
    # as binary (i.e., mode '-kb') in CVS:
    CVSBinaryFileEOLStyleSetter(),

    # If the file is binary and its svn:mime-type property is not yet
    # set, set svn:mime-type to 'application/octet-stream'.
    CVSBinaryFileDefaultMimeTypeSetter(),

    # To try to determine the eol-style from the mime type, uncomment
    # the following line:
    #EOLStyleFromMimeTypeSetter(),

    # Choose one of the following lines to set the default
    # svn:eol-style if none of the above rules applied.  The argument
    # is the svn:eol-style that should be applied, or None if no
    # svn:eol-style should be set (i.e., the file should be treated as
    # binary).
    #
    # The default is to treat all files as binary unless one of the
    # previous rules has determined otherwise, because this is the
    # safest approach.  However, if you have been diligent about
    # marking binary files with -kb in CVS and/or you have used the
    # above rules to definitely mark binary files as binary, then you
    # might prefer to use 'native' as the default, as it is usually
    # the most convenient setting for text files.  Other possible
    # options: 'CRLF', 'CR', 'LF'.
    DefaultEOLStyleSetter(None),
    #DefaultEOLStyleSetter('native'),

    # Prevent svn:keywords from being set on files that have
    # svn:eol-style unset.
    SVNBinaryFileKeywordsPropertySetter(),

    # If svn:keywords has not been set yet, set it based on the file's
    # CVS mode:
    KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE),

    # Set the svn:executable flag on any files that are marked in CVS as
    # being executable:
    ExecutablePropertySetter(),

    # Uncomment the following line to include the original CVS revision
    # numbers as file properties in the SVN archive:
    #CVSRevisionNumberSetter(),

    ])

# The directory to use for temporary files:
ctx.tmpdir = r'cvs2svn-tmp'

# To skip the cleanup of temporary files, uncomment the following
# option:
#ctx.skip_cleanup = True

# In CVS, it is perfectly possible to make a single commit that
# affects more than one project or more than one branch of a single
# project.  Subversion also allows such commits.  Therefore, by
# default, when cvs2svn sees what looks like a cross-project or
# cross-branch CVS commit, it converts it into a
# cross-project/cross-branch Subversion commit.
#
# However, other tools and SCMs have trouble representing
# cross-project or cross-branch commits.  (For example, Trac's Revtree
# plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by
# such commits.)  Therefore, we provide the following two options to
# allow cross-project/cross-branch commits to be suppressed.

# To prevent CVS commits from different projects from being merged
# into single SVN commits, change this option to False:
ctx.cross_project_commits = True

# To prevent CVS commits on different branches from being merged into
# single SVN commits, change this option to False:
ctx.cross_branch_commits = True

# By default, .cvsignore files are rendered in the output by setting
# corresponding svn:ignore properties on the parent directory, but the
# .cvsignore files themselves are not included in the conversion
# output.  If you would like to include the .cvsignore files in the
# output, change this option to True:
ctx.keep_cvsignore = False

# By default, it is a fatal error for a CVS ",v" file to appear both
# inside and outside of an "Attic" subdirectory (this should never
# happen, but frequently occurs due to botched repository
# administration).  If you would like to retain both versions of such
# files, change the following option to True, and the attic version of
# the file will be left in an SVN subdirectory called "Attic":
ctx.retain_conflicting_attic_files = False

# Now use stanzas like the following to define CVS projects that
# should be converted.  The arguments are:
#
# - The filesystem path of the project within the CVS repository.
#
# - The path that should be used for the "trunk" directory of this
#   project within the SVN repository.  This is an SVN path, so it
#   should always use forward slashes ("/").
#
# - The path that should be used for the "branches" directory of this
#   project within the SVN repository.  This is an SVN path, so it
#   should always use forward slashes ("/").
#
# - The path that should be used for the "tags" directory of this
#   project within the SVN repository.  This is an SVN path, so it
#   should always use forward slashes ("/").
#
# - A list of symbol transformations that can be used to rename
#   symbols in this project.  Each entry is a tuple (pattern,
#   replacement), where pattern is a Python regular expression pattern
#   and replacement is the text that should replace the pattern.  Each
#   pattern is matched against each symbol name.  If the pattern
#   matches, then it is replaced with the corresponding replacement
#   text.  The replacement can include substitution patterns (e.g.,
#   r'\1' or r'\g<name>').  Typically you will want to use raw strings
#   (strings with a preceding 'r', like shown in the examples) for the
#   regexp and its replacement to avoid backslash substitution within
#   those strings.

# Create the default project (using ctx.trunk, ctx.branches, and ctx.tags):
run_options.add_project(
    r'cvsrepo',
    trunk_path='trunk',
    branches_path='branches',
    tags_path='tags',
    initial_directories=[
        # The project's trunk_path, branches_path, and tags_path
        # directories are added to the SVN repository in the project's
        # first commit.  If you would like additional SVN directories
        # to be created in the project's first commit, list them here.
        #'releases',
        ],
    symbol_transforms=[
        #RegexpSymbolTransform(r'release-(\d+)_(\d+)',
        #                      r'release-\1.\2'),
        #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)',
        #                      r'release-\1.\2.\3'),
        # Convert backslashes into forward slashes:
        ReplaceSubstringsSymbolTransform('\\','/'),
        # Eliminate leading, trailing, and repeated slashes:
        NormalizePathsSymbolTransform(),
        ],
    symbol_strategy_rules=[
        # Additional, project-specific symbol strategy rules can
        # be added here.
        ] + global_symbol_strategy_rules,
    )

# Add a second project, to be stored to projA/trunk, projA/branches,
# and projA/tags:
#run_options.add_project(
#    r'my/cvsrepo/projA',
#    trunk_path='projA/trunk',
#    branches_path='projA/branches',
#    tags_path='projA/tags',
#    initial_directories=[
#        # The project's trunk_path, branches_path, and tags_path
#        # directories are added to the SVN repository in the project's
#        # first commit.  If you would like additional SVN directories
#        # to be created in the project's first commit, list them here.
#        #'releases',
#        ],
#    symbol_transforms=[
#        #RegexpSymbolTransform(r'release-(\d+)_(\d+)',
#        #                      r'release-\1.\2'),
#        #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)',
#        #                      r'release-\1.\2.\3'),
#        # Convert backslashes into forward slashes:
#        ReplaceSubstringsSymbolTransform('\\','/'),
#        # Eliminate leading, trailing, and repeated slashes:
#        NormalizePathsSymbolTransform(),
#        ],
#    symbol_strategy_rules=[
#        # Additional, project-specific symbol strategy rules can
#        # be added here.
#        ] + global_symbol_strategy_rules,
#    )

# Change this option to True to turn on profiling of cvs2svn (for
# debugging purposes):
run_options.profiling = False

# Should CVSItem -> Changeset database files be memory mapped?  In
# some tests, using memory mapping sped up the overall conversion
# by about 5%.  But this option can cause the conversion to fail with
# an out of memory error if the conversion computer runs out of
# virtual address space (e.g., when running a very large conversion on
# a 32-bit operating system).  Therefore it is disabled by default.
# Uncomment the following line to allow these database files to be
# memory mapped.
#changeset_database.use_mmap_for_cvs_item_to_changeset_table = True
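
The symbol transform entries described in the options file above pair a Python regular expression with a replacement text. As a rough illustration only (this is not cvs2svn's own code; `apply_symbol_transforms` is a hypothetical helper, and requiring the pattern to match the whole symbol name is an assumption), the renaming could be sketched with the standard `re` module:

```python
import re

def apply_symbol_transforms(symbol_name, transforms):
    """Apply (pattern, replacement) pairs to a symbol name, in order.

    Hypothetical helper, not part of cvs2svn.  Assumption: a pattern
    must match the entire symbol name before it is replaced.
    """
    for pattern, replacement in transforms:
        match = re.fullmatch(pattern, symbol_name)
        if match:
            # expand() substitutes r'\1', r'\g<name>', etc. from the match.
            symbol_name = match.expand(replacement)
    return symbol_name

# The two commented-out example transforms from the options file.
transforms = [
    (r'release-(\d+)_(\d+)', r'release-\1.\2'),
    (r'release-(\d+)_(\d+)_(\d+)', r'release-\1.\2.\3'),
]

print(apply_symbol_transforms('release-1_2', transforms))
print(apply_symbol_transforms('release-1_2_3', transforms))
print(apply_symbol_transforms('my-branch', transforms))
```

Raw strings keep the backslashes in `\d` and `\1` intact, which is why the options file recommends them for both the pattern and the replacement.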

# (Be in -*- mode: python; coding: utf-8 -*- mode.)

# This is an example of an options file that can be used to make
# cvs2svn convert to git rather than to Subversion.  See
# www/cvs2svn.html for general information, and see the comments in
# this file and in cvs2svn-example.options for information about what
# options are available and how they can be set.

# "cvs2git" is shorthand for "cvs2svn in the mode where it is
# outputting to git instead of Subversion".  But the program that
# needs to be run is still called "cvs2svn".  Run it with the
# --options option, passing it this file as argument:
#
#     cvs2svn --options=cvs2svn-git.options

# Many options are shared with Subversion, so we simply import the
# sample Subversion options file (from the current directory, which
# should be the main cvs2svn directory)...
execfile('cvs2svn-example.options')

# ...and then we overwrite the things that need to be different.

from cvs2svn_lib.fulltext_revision_recorder \
     import SimpleFulltextRevisionRecorderAdapter
from cvs2svn_lib.git_revision_recorder import GitRevisionRecorder
from cvs2svn_lib.git_output_option import GitRevisionMarkWriter
from cvs2svn_lib.git_output_option import GitOutputOption

# Set this option to True to convert only the main branch of the CVS
# repository:
ctx.trunk_only = False

# cvs2git only supports single-project conversions (multiple-project
# conversions wouldn't really make sense for git anyway).  So these
# options must both be set to False:
ctx.cross_project_commits = False
ctx.cross_branch_commits = False

# Specify a username to be used for commits for which CVS doesn't
# record the original author (for example, the creation of a branch).
# This should be a simple (unix-style) username, but it can be
# translated into a git-style name by the author_transforms map.
ctx.username = 'cvs2svn'

# CVS uses unix login names as author names whereas git requires
# author names to be of the form "foo <bar>".  The default is to set
# the git author to "cvsauthor <cvsauthor>".  author_transforms can be
# used to map cvsauthor names (e.g., "jrandom") to a true name and
# email address (e.g., "J. Random <jrandom <at> example.com>" for the
# example shown).  All values should be either unicode strings (i.e.,
# with "u" as a prefix) or 8-bit strings in the utf-8 encoding.
# Please substitute your own project's usernames here to use with the
# author_transforms option of GitOutputOption below.
author_transforms={
    'jrandom' : ('J. Random', 'jrandom <at> example.com'),
    'mhagger' : ('Michael Haggerty', 'mhagger <at> alum.mit.edu'),
    'brane' : (u'Branko Čibej', 'brane <at> xbc.nu'),
    'ringstrom' : ('Tobias Ringström', 'tobias <at> ringstrom.mine.nu'),
    'dionisos' : (u'Erik Hülsmann', 'e.huelsmann <at> gmx.net'),

    # This one will be used for commits for which CVS doesn't record
    # the original author, as explained above.
    'cvs2svn' : ('cvs2svn', 'admin <at> example.com'),
    }

# This is the main option that causes cvs2svn to output to git rather
# than Subversion:
ctx.output_option = GitOutputOption(
    # The file in which to write the git-fast-import stream that
    # contains the changesets and branch/tag information:
    'cvs2svn-tmp/git-dump.dat',

    # The blobs will be written via the revision recorder, so in
    # OutputPass we only have to emit references to the blob marks:
    GitRevisionMarkWriter(),

    # This option can be set to an integer to limit the number of
    # revisions that are merged with the main parent in any commit.
    # For git output, this can be set to None (unlimited), though due
    # to the limitations of other tools you might want to set it to a
    # smaller number (e.g., 16).  For Mercurial output, this should be
    # set to 1.
    max_merges=None,
    #max_merges=1,

    # Optional map from CVS author names to git author names:
    author_transforms=author_transforms,
    )

# During CollectRevsPass, cvs2git records the contents of file
# revisions into a file in git-fast-import format.  This option
# configures that process:
ctx.revision_recorder = SimpleFulltextRevisionRecorderAdapter(
    # cvs2git uses either RCS's "co" command or CVS's "cvs co -p" to
    # extract the content of file revisions.  Here you can choose
    # whether to use RCS (faster, but fails in some rare
    # circumstances) or CVS (much slower, but more reliable).
    #RCSRevisionReader(co_executable=r'co'),
    CVSRevisionReader(cvs_executable=r'cvs'),

    # The file in which to write the git-fast-import stream that
    # contains the file revision contents:
    GitRevisionRecorder('cvs2svn-tmp/git-blob.dat'),
    )
# cvs2git does not need to know what revisions will be excluded, so
# leave this option unchanged:
ctx.revision_excluder = NullRevisionExcluder()

# cvs2git does not need a revision reader, so leave this option
# unchanged.
ctx.revision_reader = None

# Clear the default project that was set in cvs2svn-example.options:
run_options.clear_projects()

# Now add a project to be converted to git:
run_options.add_project(
    # The path to the part of the CVS repository (*not* a CVS working
    # copy) that should be converted.  This may be a subdirectory
    # (i.e., a module) within a larger CVS repository.
    r'cvsrepo',

    # See cvs2svn-example.options for more documentation about symbol
    # transforms that can be set using this option.
    symbol_transforms=[
        ReplaceSubstringsSymbolTransform('\\','/'),
        NormalizePathsSymbolTransform(),
        ],

    # See www/cvs2svn.html and cvs2svn-example.options for more
    # documentation about symbol strategy rules.  The settings below
    # will give pretty much a 1:1 conversion, but you can change the
    # settings for more customization (for example, excluding a symbol
    # from the conversion, forcing it to be output as a branch
    # vs. tag, changing its name, etc).
    symbol_strategy_rules=[
        # Read from a file how symbols should be converted:
        #SymbolHintsFileRule('symbol-hints.txt'),

        # Specific rules for symbols matching particular regexps:
        #ForceBranchRegexpStrategyRule(r'branch.*'),
        #ForceTagRegexpStrategyRule(r'tag.*'),
        #ExcludeRegexpStrategyRule(r'unknown-.*'),

        # If a symbol is used consistently in CVS, do the same in git:
        UnambiguousUsageRule(),

        # If there are ever commits on a symbol, force it to be a
        # branch:
        BranchIfCommitsRule(),

        # Uncomment at most one of the following group of
        # "catch-all" rules:
        HeuristicStrategyRule(),
        #AllBranchRule(),
        #AllTagRule(),

        # This rule should always be included.  It determines from
        # which parent branch each daughter branch/tag sprouts.
        HeuristicPreferredParentRule(),
        ],
    )
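
The author_transforms comment in the git options file above says that CVS login names are turned into git authors of the form "foo <bar>", defaulting to "cvsauthor <cvsauthor>" when no mapping exists. A minimal sketch of that lookup, assuming a hypothetical helper `git_author` (not a cvs2svn function) and a placeholder example.com address, might look like this:

```python
def git_author(cvs_author, author_transforms):
    """Return a git-style 'Name <email>' string for a CVS login name.

    Hypothetical helper, not part of cvs2svn.  Unmapped logins fall back
    to 'cvsauthor <cvsauthor>', the default described in the options file.
    """
    name, email = author_transforms.get(cvs_author, (cvs_author, cvs_author))
    return '%s <%s>' % (name, email)

# Placeholder mapping for illustration; substitute the project's real users.
author_transforms = {
    'jrandom': ('J. Random', 'jrandom@example.com'),
}

print(git_author('jrandom', author_transforms))
print(git_author('thomas', author_transforms))
```

Logins that are missing from the map still produce a syntactically valid author, so an incomplete map does not have to stop a conversion.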

Thomas Manson | 7 Oct 17:04

Re: CVS migration help

By the way, I am using bzr 1.3.1.

Maybe that's the issue: the version packaged for Ubuntu is very old, considering that 1.8 rc1 has already been announced.
 
 
On Tue, Oct 7, 2008 at 17:03, Thomas Manson <dev.mansonthomas <at> gmail.com> wrote:
Hi Michael,
 
  I've checked out the trunk version of cvs2svn (the version on Ubuntu Hardy Heron, 2.0.1, is quite old),
  and the cvs2svn conversion succeeded.
 
Unfortunately, bzr fast-import crashes in the same way that bzr cvsps-import does:
 
 
thomas <at> home:~/temp/bzr$ cat ../cvs2svn-tmp/git-blob.dat  ../cvs2svn-tmp/git-dump.dat |  bzr fast-import -
bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 834, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 790, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 199, in run
    params, verbose)
  File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 77, in _run
    return proc.process(p.iter_commands)
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 83, in process
    self._process(command_iter)
  File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 317, in _process
    processor.ImportProcessor._process(self, command_iter)
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 105, in _process
    handler(self, cmd)
  File "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 486, in commit_handler
    handler.process()
  File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line 164, in process
    for fc in self.command.file_iter():
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 312, in iter_file_commands
    yield self._parse_file_modify(line[2:])
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 365, in _parse_file_modify
    path = self._path(params[2])
  File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 493, in _path
    return s.decode('utf_8')
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
bzr 1.3.1 on python 2.5.2.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'fast-import', '-']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_US.UTF-8'
plugins:
  bzrtools             /usr/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.3.0]
  cvsps_import         /home/thomas/.bazaar/plugins/cvsps_import [unknown]
  fastimport           /home/thomas/.bazaar/plugins/fastimport [unknown]
  launchpad            /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.
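
The traceback above ends in the plugin's `_path` function, which decodes the path of a file-modify command as UTF-8. To find out which paths are affected before importing, the stream could be scanned first. The following is only a sketch under stated assumptions: `find_non_utf8_paths` is a hypothetical helper (not part of bzr or cvs2svn), and it assumes the stream uses the exact byte count form `data <n>` and unquoted paths.

```python
import io

def find_non_utf8_paths(stream):
    """Return the raw paths of 'M' (file-modify) commands that are not UTF-8.

    Hypothetical helper.  Assumptions: payloads use the exact byte count
    form 'data <n>' (not the delimited form) and paths are unquoted.
    """
    bad_paths = []
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith(b'data '):
            # Skip the payload so file or log text is not read as commands.
            stream.read(int(line[5:].strip()))
            continue
        if line.startswith(b'M '):
            # Command layout: M <mode> <dataref> <path>
            parts = line.rstrip(b'\n').split(b' ', 3)
            if len(parts) == 4:
                try:
                    parts[3].decode('utf-8')
                except UnicodeDecodeError:
                    bad_paths.append(parts[3])
    return bad_paths

# Tiny made-up stream: the 4-byte payload 'M a\n' must not be parsed.
sample = io.BytesIO(
    b'commit refs/heads/master\n'
    b'data 4\n'
    b'M a\n'
    b'M 100644 :1 src/ok.txt\n'
    b'M 100644 :2 src/caf\xe9.txt\n'
)
bad = find_non_utf8_paths(sample)
print(bad)
```

For a real check, the same function can be given a file opened in binary mode, for example `open('../cvs2svn-tmp/git-dump.dat', 'rb')`.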
 
I don't think it's related to cvs2svn or cvsps, as it fails in both cases.
It is probably a bzr bug.
 
I've successfully converted my project to git repository format with this set of commands:
 
export CVSROOT=/home/thomas/temp/cvs2git/cvs/files

git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp           crf-irp
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-model     crf-irp-model
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-monitor   crf-irp-monitor
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-portail   crf-irp-portail
git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-utilities crf-irp-utilities
 
 
Is it possible to convert the git version of my sources to bzr? Maybe that would be successful.
 
 
Thomas.
 
 

On Tue, Oct 7, 2008 at 12:41, Michael Haggerty <mhagger <at> alum.mit.edu> wrote:
Jelmer Vernooij wrote:
> Am Dienstag, den 07.10.2008, 00:01 +0200 schrieb Thomas Manson:
>> I've look to it... but didn't tryed yet...
>>
>> It really misses straightforward howto (for all tools except bzr
>> cvsimport)
> cvsps-import should be the best solution here, we should just fixing
> that imho. What's blocking you from using it?

No conversion tool that is based on cvsps will be able to do a truly
reliable job of migrating from CVS.  cvsps, which was written for
another purpose, simply is not robust enough and does not emit enough
information for a complete conversion.  I gave many concrete examples of
its shortcomings on the Mercurial mailing list [1].

Deducing a project's history from CVS's incomplete records is a very
tricky thing; cvs2svn's feature list [2] will give you an idea of the
kinds of things an industrial-strength converter needs to handle.
cvs2svn deduces the CVS changesets itself, using a much more robust
algorithm than that used by cvsps.  (The main disadvantage of cvs2svn is
that it can only be used for one-time conversions, not for tracking a
live CVS repository incrementally.)

cvs2svn/cvs2git can create output in git-fast-import format [3], which
should also be readable by the bzr fast-import tool.  It hasn't gotten
much testing in "cvs2bzr" mode, but given that 90% of the job is
inferring CVS's history, it should not be too much work to fix any
problems in the "2bzr" part.  Therefore, any feedback would be much
appreciated.

(By the way, if you want to use cvs2svn to convert to bzr, I suggest
that you use the trunk version of cvs2svn, which has several
improvements compared to release 2.1.1.)

Michael

[1] http://selenic.com/pipermail/mercurial-devel/2008-February/004975.html

[2] http://cvs2svn.tigris.org/features.html

[3] http://cvs2svn.tigris.org/cvs2git.html


Jelmer Vernooij | 7 Oct 17:22

Re: CVS migration help

Am Dienstag, den 07.10.2008, 17:03 +0200 schrieb Thomas Manson:
> Hi Michael,
>  
>   I've checkout the trunk version (the version on ubuntu hardy heron
> is quite old : 2.0.1)
>   succeed in cvs2svn conversion,
>  
>  unfortunately it crashes in the same way that bzr cvsps-import
> does : 
>  
>  
> thomas <at> home:~/temp/bzr$
> cat ../cvs2svn-tmp/git-blob.dat  ../cvs2svn-tmp/git-dump.dat |  bzr
> fast-import -
> bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode
> bytes in position 43-45: invalid data
The problem seems to be that one of the characters in your CVS
repository is not a valid UTF-8 character. Did you explicitly specify
the locale in which the filenames are encoded somewhere?

Git does not have this problem, since it does not interpret any of the
filenames you store in it. This has advantages (conversion can't fail
since you're not doing any conversion at all), but it also has
disadvantages - checking out the repository on hosts with a different
encoding breaks the filenames.
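
To illustrate the difference: a filename stored in Latin-1 is a perfectly good byte string, but it cannot be decoded as UTF-8 until it has been re-encoded. The sketch below is an assumption-laden illustration, not something bzr or cvs2svn provides: `path_to_utf8` is a hypothetical helper, and the candidate encodings are a guess at what the repository might contain.

```python
def path_to_utf8(raw_path, encodings=('utf-8', 'latin1')):
    """Re-encode a raw path as UTF-8, trying candidate encodings in order.

    Hypothetical helper.  The candidate list is an assumption: 'latin1'
    accepts any byte sequence, so it should come last and is only a
    guess at the encoding the repository really uses.
    """
    for encoding in encodings:
        try:
            return raw_path.decode(encoding).encode('utf-8')
        except UnicodeDecodeError:
            continue
    raise ValueError('cannot decode path %r' % (raw_path,))

latin1_path = b'src/caf\xe9.txt'   # 'e' with acute accent as the byte 0xE9
utf8_path = path_to_utf8(latin1_path)
print(utf8_path)
```

Paths that are already valid UTF-8 pass through unchanged, because 'utf-8' is tried first.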

Cheers,

Jelmer

> Traceback (most recent call last):
>   File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line
> 834, in run_bzr_catch_errors
>     return run_bzr(argv)
>   File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line
> 790, in run_bzr
>     ret = run(*run_argv)
>   File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line
> 492, in run_argv_aliases
>     return self.run(**all_cmd_args)
>   File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line
> 199, in run
>     params, verbose)
>   File "/home/thomas/.bazaar/plugins/fastimport/__init__.py", line 77,
> in _run
>     return proc.process(p.iter_commands)
>   File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line
> 83, in process
>     self._process(command_iter)
>   File
> "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 317, in _process
>     processor.ImportProcessor._process(self, command_iter)
>   File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line
> 105, in _process
>     handler(self, cmd)
>   File
> "/home/thomas/.bazaar/plugins/fastimport/processors/generic_processor.py", line 486, in commit_handler
>     handler.process()
>   File "/home/thomas/.bazaar/plugins/fastimport/processor.py", line
> 164, in process
>     for fc in self.command.file_iter():
>   File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 312,
> in iter_file_commands
>     yield self._parse_file_modify(line[2:])
>   File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 365,
> in _parse_file_modify
>     path = self._path(params[2])
>   File "/home/thomas/.bazaar/plugins/fastimport/parser.py", line 493,
> in _path
>     return s.decode('utf_8')
>   File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>     return codecs.utf_8_decode(input, errors, True)
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45:
> invalid data
> bzr 1.3.1 on python 2.5.2.final.0 (linux2)
> arguments: ['/usr/bin/bzr', 'fast-import', '-']
> encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_US.UTF-8'
> plugins:
> 
> bzrtools             /usr/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.3.0]
>   cvsps_import         /home/thomas/.bazaar/plugins/cvsps_import
> [unknown]
>   fastimport           /home/thomas/.bazaar/plugins/fastimport
> [unknown]
> 
> launchpad            /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
> *** Bazaar has encountered an internal error.
>     Please report a bug at https://bugs.launchpad.net/bzr/+filebug
>     including this traceback, and a description of what you
>     were doing when the error occurred.
> 
>  
> I don't think it's related to cvs2svn or cvsps, as it fails in both
> cases. So it should be a bzr bug.
>
> I've successfully converted my project to the git repository format
> with this set of commands:
>
> export CVSROOT=/home/thomas/temp/cvs2git/cvs/files
>
> git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp crf-irp
> git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-model crf-irp-model
> git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-monitor crf-irp-monitor
> git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-portail crf-irp-portail
> git cvsimport -C /home/thomas/temp/cvs2gitOutput/crf-irp-utilities crf-irp-utilities
>
> Is it possible to convert the git version of my sources to bzr? Maybe
> that would succeed.
>  
>  
> Thomas.
>  
>   
> 
> 
> On Tue, Oct 7, 2008 at 12:41, Michael Haggerty <mhagger <at> alum.mit.edu> wrote:
>         Jelmer Vernooij wrote:
>         > On Tuesday, 07.10.2008, at 00:01 +0200, Thomas Manson wrote:
>         >> I've looked at it... but haven't tried it yet...
>         >>
>         >> It really lacks a straightforward howto (true of all the tools
>         >> except bzr cvsimport)
>         > cvsps-import should be the best solution here; we should just
>         > fix that, imho. What's blocking you from using it?
>
>         No conversion tool that is based on cvsps will be able to do a truly
>         reliable job of migrating from CVS.  cvsps, which was written for
>         another purpose, simply is not robust enough and does not emit enough
>         information for a complete conversion.  I gave many concrete examples
>         of its shortcomings on the Mercurial mailing list [1].
>
>         Deducing a project's history from CVS's incomplete records is a very
>         tricky thing; cvs2svn's feature list [2] will give you an idea of the
>         kinds of things an industrial-strength converter needs to handle.
>         cvs2svn deduces the CVS changesets itself, using a much more robust
>         algorithm than that used by cvsps.  (The main disadvantage of cvs2svn
>         is that it can only be used for one-time conversions, not for
>         tracking a live CVS repository incrementally.)
>
>         cvs2svn/cvs2git can create output in git-fast-import format [3],
>         which should also be readable by the bzr fast-import tool.  It hasn't
>         gotten much testing in "cvs2bzr" mode, but given that 90% of the job
>         is inferring CVS's history, it should not be too much work to fix any
>         problems in the "2bzr" part.  Therefore, any feedback would be much
>         appreciated.
>
>         (By the way, if you want to use cvs2svn to convert to bzr, I suggest
>         that you use the trunk version of cvs2svn, which has several
>         improvements compared to release 2.1.1.)
>
>         Michael
>
>         [1] http://selenic.com/pipermail/mercurial-devel/2008-February/004975.html
>
>         [2] http://cvs2svn.tigris.org/features.html
>
>         [3] http://cvs2svn.tigris.org/cvs2git.html
> 

Thomas Manson | 7 Oct 17:47

Re: CVS migration help

In the cvs2svn options file:
 
# You might want to be especially strict when converting filenames to
# unicode (e.g., maybe not specify a fallback_encoding).
ctx.cvs_filename_decoder = CVSTextDecoder(
    [
        'latin1',
        #'utf8',
        #'ascii',
        ],
    fallback_encoding='ascii'
    )
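
For readers unfamiliar with this kind of setting, here is a minimal sketch of what an ordered list of encodings plus a fallback amounts to. It is not cvs2svn's code: `decode_with_fallback` is a hypothetical helper, and the use of 'replace' error handling for the fallback is an assumption made for illustration.

```python
def decode_with_fallback(raw, encodings, fallback_encoding=None):
    """Decode bytes by trying each encoding strictly, in the order given.

    Hypothetical helper: if every strict attempt fails and a fallback is
    given, decode lossily with it (undecodable bytes become U+FFFD).
    """
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    if fallback_encoding is not None:
        return raw.decode(fallback_encoding, 'replace')
    raise ValueError('none of %r could decode %r' % (encodings, raw))


# A latin1-encoded filename: utf8 is tried first and fails, latin1 succeeds.
print(repr(decode_with_fallback(b'\xe9.txt', ['utf8', 'latin1'])))

# Order matters: with latin1 first, a UTF-8 encoded name is accepted
# byte by byte and comes out as two wrong characters instead of one.
print(repr(decode_with_fallback(b'\xc3\xa9', ['latin1', 'utf8'])))
```

With 'latin1' listed first, as in the setting above, the first strict attempt always succeeds, so the later entries and the fallback are never consulted.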

I've added the PPA repository for Ubuntu's apt-get and updated to 1.7.0.1; I will try again.
Thomas
 
On Tue, Oct 7, 2008 at 17:22, Jelmer Vernooij <jelmer <at> samba.org> wrote:
On Tuesday, 07.10.2008, at 17:03 +0200, Thomas Manson wrote:
> Hi Michael,
>
>   I've checked out the trunk version (the version on Ubuntu Hardy Heron
> is quite old: 2.0.1), and the cvs2svn conversion succeeded.
>
>   Unfortunately, the import crashes in the same way that bzr cvsps-import
> does:
>
> thomas <at> home:~/temp/bzr$ cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | bzr fast-import -
> bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode bytes in position 43-45: invalid data
The problem seems to be that one of the characters in your CVS
repository is not valid UTF-8. Did you explicitly specify the locale in
which the filenames are encoded somehow?

Git does not have this problem, since it does not interpret any of the
filenames you store in it. This has advantages (conversion can't fail
since you're not doing any conversion at all), but it also has
disadvantages: checking out the repository on hosts with a different
encoding breaks the filenames.

Cheers,

Jelmer
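
One way to find which filename is at fault is to scan the fast-import stream for paths that are not valid UTF-8 before handing it to bzr. The following is only a sketch under stated assumptions: `find_non_utf8_paths` is a hypothetical helper, it looks only at `M` (modify) and `D` (delete) file commands, it assumes paths are written unquoted, and it skips `data` payloads so that blob contents are not mistaken for commands.

```python
def find_non_utf8_paths(stream):
    """Yield the raw path bytes of M/D file commands that are not UTF-8.

    `stream` is a binary file object holding a git-fast-import stream.
    """
    while True:
        line = stream.readline()
        if not line:
            break
        line = line.rstrip(b'\n')
        if line.startswith(b'data '):
            arg = line[5:]
            if arg.startswith(b'<<'):
                # Delimited form: skip lines up to the delimiter.
                delimiter = arg[2:]
                while True:
                    skipped = stream.readline()
                    if not skipped or skipped.rstrip(b'\n') == delimiter:
                        break
            else:
                # Counted form: skip exactly that many payload bytes.
                stream.read(int(arg))
            continue
        if line.startswith(b'M '):
            parts = line.split(b' ', 3)  # M <mode> <dataref> <path>
            if len(parts) < 4:
                continue
            path = parts[3]
        elif line.startswith(b'D '):
            path = line[2:]
        else:
            continue
        try:
            path.decode('utf-8')
        except UnicodeDecodeError:
            yield path


if __name__ == '__main__':
    import sys
    # Usage (file name taken from the command quoted above):
    #   python scan_paths.py ../cvs2svn-tmp/git-dump.dat
    for dump_name in sys.argv[1:]:
        with open(dump_name, 'rb') as dump:
            for bad_path in find_non_utf8_paths(dump):
                print(repr(bad_path))
```

Any path this prints is a candidate for the bytes the traceback complains about; whether it is the same one depends on details of the stream that the sketch does not check.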

--
Jelmer Vernooij <jelmer <at> samba.org> - http://samba.org/~jelmer/
Jabber: jelmer <at> jabber.fsfe.org


Thomas Manson | 7 Oct 17:52

Re: CVS migration help

Same result with 1.7.0.1.
 
I've tried with
ctx.cvs_filename_decoder = CVSTextDecoder(
    [
        #'latin1',
        'utf8',
        #'ascii',
        ],
    fallback_encoding='latin1'
    )

Same result.
 
 
 
Where do I change the encoding that bzr uses?
 
bzr 1.7.1 on python 2.5.2 (linux2)

arguments: ['/usr/bin/bzr', 'fast-import', '-']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_US.UTF-8'
plugins:


John Arbash Meinel | 7 Oct 17:52

Re: CVS migration help


Thomas Manson wrote:
> in cvs2svn option file :
>  
> # You might want to be especially strict when converting filenames to
> # unicode (e.g., maybe not specify a fallback_encoding).
> ctx.cvs_filename_decoder = CVSTextDecoder(
>     [
>         'latin1',
>         #'utf8',
>         #'ascii',
>         ],
>     fallback_encoding='ascii'
>     )
> 
> I've added the ppa repositiory for ubuntu apt-get and update to 1.7.0.1
> and will try again.
> Thomas
>  

I'll just mention that 'latin1' can never fail to decode a string. Every
8-bit bytestring has a valid Unicode decoding under 'latin1', though
sometimes the result will contain control characters. (Put another way,
'latin1' looks at one byte at a time, and all 256 byte values have a
mapping to Unicode defined.)

UTF-8 can sometimes fail, because it encodes some characters as
multi-byte sequences. Thus not every possible byte string is valid UTF-8.

ASCII can fail because it cannot map bytes with the 8th bit set
(though I'll mention that anything in ASCII is identical in UTF-8).

John
=:->
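
The three cases John describes can be checked directly in Python. This is a small illustration only; `try_decode` is a hypothetical helper and the sample bytes are made up.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


raw = b'caf\xe9'  # the latin1 encoding of a name ending in e-acute

# latin1 maps each of the 256 byte values to one code point, so it
# cannot fail, whatever the bytes are.
assert try_decode(raw, 'latin1') == 'caf\xe9'
assert try_decode(bytes(range(256)), 'latin1') is not None

# UTF-8 fails here: 0xE9 announces a three-byte sequence, but no
# continuation bytes follow.
assert try_decode(raw, 'utf8') is None

# ASCII fails because 0xE9 has the 8th bit set.
assert try_decode(raw, 'ascii') is None

# Pure ASCII bytes decode identically under all three.
plain = b'plain.txt'
assert (try_decode(plain, 'ascii') == try_decode(plain, 'utf8')
        == try_decode(plain, 'latin1'))
```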

