MERGE 2013 THE PERFORCE CONFERENCE SAN FRANCISCO • APRIL 24−26
Perforce White Paper
To provide a solid foundation for software development
excellence in today's demanding economy, it's critical
to have a software version management solution that
can meet your demands.
Extracting Depot Paths into New
Instances of Their Own
Mark Warren, NVIDIA
INTRODUCTION
As Perforce instances are used over time, they naturally grow in file and metadata size. As new
files are submitted and metadata accumulates, the instance becomes unwieldy. At some point,
normal operations hold table locks for so long that all users are affected. Faster hardware can
mitigate this problem, but only up to a point. A more practical way to relieve the growing
metadata pressure is to move selected datasets to their own instance/depot.
Perfsplit [1] is a tool developed by Perforce that extracts a section of a depot from an existing
Perforce database. It accesses a Perforce server’s database directly and is platform and
release specific. Perfsplit does not remove data from the source Perforce server but does copy
archive files from it. Perfsplit is a good tool for this operation, but it leaves several
problems unresolved:
• The need for zero downtime. Most instances that are in need of splitting have a very
large user base. The need to keep instances up and running is compounded by the
number of users unable to access their instance once this process is initiated.
• Perfsplit does not rename the new instance depot. This is undesirable because it can
be confusing to users having the same depot name across multiple instances.
• The need to use “p4 snap” to copy lazy integrated files to their physical lbr location. p4
snap can considerably increase the size of the original depot depending on the size of
the area we are splitting off.
This white paper is intended to give guidelines on a method to resolve all these issues.
Preparation
To make sure we gather a complete dataset for migration from a live instance, it’s necessary to
prevent users from making changes to the path(s) we are splitting. With super access rights,
this can be done by simply restricting read-write access to this path and only allowing read-
only. This restriction ensures that the metadata structure we are splitting off will be up to date.
Once this is done, we need to create a checkpoint of the instance to gather lbr records and we
need to have a running instance of this checkpoint for Perfsplit use.
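As a minimal sketch of the read-only restriction described above, the helper below prints two protections-table lines for a path: an exclusion for all users, followed by a line granting read access back. The function name readonly_lines is hypothetical, and applying the lines to all users and all hosts is an assumption; the output would be pasted into the Protections field opened by "p4 protect", and entries for administrators who must keep their access would need to come after these lines.

```shell
# Hypothetical helper: print protections-table lines that leave a path
# readable but not writable. $1 is the depot path being split, ending in "/".
readonly_lines() {
  path=$1
  printf 'write user * * -%s...\n' "$path"   # exclude the path for all users
  printf 'read user * * %s...\n' "$path"     # then grant read access back
}

readonly_lines //targeted/path/to/split/
```

Because later lines in the protections table take precedence over earlier ones, the order of the two lines matters.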
Despite the inadequacies Perfsplit has, this process makes use of it; Perfsplit is necessary to
build the foundation of the new instance. The key function of Perfsplit is using a map file to
direct it to the select path(s) to extract. Because we are splitting not only the initial path(s) but
also the integration history, we will need to append this dataset to the splitmap. To get this, we
need to grep the original instance’s newly created checkpoint for the lbrFile field, defined in
db.rev [2], of all files associated with the depot path we are splitting. The lbrFile field specifies
where in the archives the file containing the revision may be found.
[1] http://ftp.perforce.com/perforce/tools/perfsplit/perfsplit.html
[2] http://www.perforce.com/perforce/doc.current/schema/#db.rev
For example:
grep @db.rev@ /checkpoint.XXX | grep //targeted/path/to/split/
This will give you the db.rev entries for the path you want to split. From these entries you pull
the lbrFile column and remove all entries referring to the original path. This will give you the
location of all lazy integrated files.
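The filtering step above could be sketched as follows. This is not the author's script: the function name extract_lazy_sources is hypothetical, and it assumes that in each db.rev record the first @-quoted value starting with // is the depot file and the second is lbrFile. That assumption should be checked against the schema for your server release, and file names containing @ are not handled.

```shell
# Hypothetical helper: read a checkpoint on stdin and print the lbrFile values
# of db.rev records under the split path ($1) whose archive lives elsewhere.
extract_lazy_sources() {
  awk -v prefix="$1" '
    /@db\.rev@/ {
      line = $0; n = 0; depot = ""
      while (match(line, /@\/\/[^@]*@/)) {
        path = substr(line, RSTART + 1, RLENGTH - 2)   # strip the @ quotes
        line = substr(line, RSTART + RLENGTH)
        n++
        if (n == 1) { depot = path; continue }         # first path: depot file
        # second path is taken to be lbrFile (assumption, see above)
        if (index(depot, prefix) == 1 && index(path, prefix) != 1) print path
        break
      }
    }' | sort -u
}
```

It might be invoked as: extract_lazy_sources //targeted/path/to/split/ < /checkpoint.XXX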
Because we are not making use of the p4 snap feature, we will need to add these paths to the
splitmap (mapping) files already containing the path(s) we are splitting from the original depot.
Transition
Once we have this mapping, we can begin our split using Perfsplit with the minimum options
(source, output, and splitmap file), plus an additional (undocumented) option, “-a”, to
skip the Perfsplit archive file copy step. This will build a duplicate instance of the original
metadata for all depots associated with the original split path in the output path. Because we
don’t want two instances with depots of the same name, we need to take another checkpoint of
this new instance.
Conversion
With this new checkpoint, we can shape the metadata into a new data structure. To do this, we
build another instance from the newly created checkpoint, but during creation (replay) we
make some substitutions to point the current data structure to what we want.
For example, to convert file paths from depot “foo” to depot “bar,” use the following commands:
cat <checkpoint_file> | sed -e 's#//foo/path/#//bar/path/#' | p4d -r "$p4root" -f -jr -
Now we have a new instance with the correct metadata.
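Before replaying a full checkpoint, it may help to preview the substitution on a few records. The sketch below wraps the same sed expression in a hypothetical helper, rename_paths, and runs it on a made-up line that only imitates the shape of a db.rev record. It assumes neither path contains a "#" or regular-expression metacharacters.

```shell
# Hypothetical helper: apply the depot-path substitution to a stream.
# $1 = old path prefix, $2 = new path prefix.
rename_paths() {
  sed -e "s#$1#$2#"
}

# Illustrative line only, not real checkpoint output.
sample='@pv@ 9 @db.rev@ @//foo/path/a.c@ 1 0 0 101 0 0 00 0 0 0 @//foo/path/a.c@ @1.101@ 0'
printf '%s\n' "$sample" | rename_paths //foo/path/ //bar/path/
```

Note that sed without the g flag rewrites only the first match on each line, so a preview like this shows which fields of a record actually change.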
Connection
The conversion now points the original metadata to a new depot area. We will need to create
this new depot “bar” to access this area. This new depot needs to be pointed to the split files.
There are a number of options for the depot files. Depending on your situation, you can copy
the files from the original location, leave them in the original location and symlink the new
depot to it, or move them to a new location and then symlink from the original depot location. In
every situation, it is important to make sure the original depot does not have write access to
these files.
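The third option (move the files, then symlink from the original depot location) might look like the sketch below. Every directory here is a hypothetical stand-in created under a temporary directory, and the layout assumes each depot's archives sit in a directory named after the depot under the server root. File permissions are deliberately left out, since how write access is withdrawn from the original instance depends on your environment.

```shell
# All directories are hypothetical stand-ins created under a temp dir.
work=$(mktemp -d)
old_root="$work/old_instance"      # server root of the original instance
new_root="$work/new_instance"      # server root of the new instance
mkdir -p "$old_root/foo/path" "$new_root/bar"
echo 'archive data' > "$old_root/foo/path/a.c,v"

# Move the archive files under the new depot, then leave a symlink behind
# so the original instance can still read them at the old location.
mv "$old_root/foo/path" "$new_root/bar/path"
ln -s "$new_root/bar/path" "$old_root/foo/path"

ls -l "$old_root/foo"
```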
Once this is done, you will have a new instance with a different name containing a complete
data structure of split files.
Verification
Verification of the new instance should be run to test the success of the transfer. Only two
errors can occur from a verify:
• Verification returns a "BAD" error. This is reported when the MD5 digest of a file
revision stored in the Perforce database differs from the one calculated by the “p4
verify” command. This error indicates that the file revision might be corrupted. This is
most likely due to changes to the physical files during transfer. Otherwise, files should
be confirmed by someone familiar with them or by diffing them against the original. [3]
• Verification returns a “MISSING” error. This indicates that Perforce cannot find the
specified revisions within the versioned file tree. Most likely this means the archive file
is missing from the versioned file tree. Check the lbrFile record of this file and make
sure that this file is in its correct location, that the new instance can access this
location, and that this file’s location was part of the splitmap. [4]
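On a large depot, the verify output can be long, so a quick tally of the two error types is a useful first check. The sketch below counts lines ending in BAD! or MISSING! in saved verify output. The helper name summarize_verify is hypothetical, the sample lines are made up for illustration, and the assumption that each error line ends with one of those two markers should be confirmed against your server's actual output.

```shell
# Hypothetical helper: count BAD and MISSING lines in "p4 verify" output
# read from stdin.
summarize_verify() {
  awk '
    / BAD!$/     { bad++ }
    / MISSING!$/ { missing++ }
    END { printf "BAD=%d MISSING=%d\n", bad, missing }
  '
}

# Illustrative log lines only; real input would come from something like
#   p4 verify //bar/... > verify.log 2>&1
printf '%s\n' \
  '//bar/path/a.c#1 - add change 101 (text) 0123456789ABCDEF0123456789ABCDEF BAD!' \
  '//bar/path/b.c#2 - edit change 102 (text) MISSING!' \
  '//bar/path/c.c#1 - add change 103 (text) FEDCBA9876543210FEDCBA9876543210' |
  summarize_verify
# prints: BAD=1 MISSING=1
```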
Cleanup
If you added paths to the splitmap to capture the lazy integrated files, these depots/files will be
accessible in the new instance. These are necessary for the new instance to locate these files
but can make the new instance look cluttered because they are not part of the original
intended split path. Because these paths are only for the instance to locate and not for user
interaction, these extra depots/files can be removed/hidden from user view by restricting their
view in the protection table. This will make the new instance look like it only has the intended
split depot path and still allow the instance access.
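One way to generate the hiding entries is sketched below: a hypothetical helper, hide_lines, prints one exclusion line per extra depot, to be added to the protections table of the new instance. The depot names extra1 and extra2 are placeholders, and excluding all users on all hosts is an assumption; administrators who still need to see these depots would need their own entries after these lines.

```shell
# Hypothetical helper: print one protections-table exclusion line for each
# depot name given as an argument.
hide_lines() {
  for depot in "$@"; do
    printf 'list user * * -//%s/...\n' "$depot"
  done
}

hide_lines extra1 extra2
```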
Completion
By implementing these steps using Perfsplit, the issues regarding zero downtime, duplicate
naming, and integration history are addressed. Resolution of these issues makes Perfsplit a
more desirable tool in a large installation environment.
[3] http://answers.perforce.com/articles/KB_Article/How-to-Handle-p4-verify-BAD-Errors
[4] http://answers.perforce.com/articles/KB_Article/MISSING-errors-from-p4-verify