Convenient tools for working with DVC in HPC environments with shared external caches and SSH remotes.
pip install git+ssh://git@github.com/swarbricklab/dvc_tools.git

# Create a new DVC project
mkdir my-analysis && cd my-analysis
dt init my-analysis
# Or clone an existing project
dt clone git@github.com:myorg/existing-project.git
# Check configuration
dt doctor

The dt command provides subcommands for managing DVC projects:
dt init # Initialize a new DVC project with cache and remote
dt clone # Clone an existing DVC project with local configuration
dt add # Add files to DVC tracking via compute node
dt fetch # Fetch import files into cache from local sources
dt pull # Pull DVC-tracked files, handling imports automatically
dt push # Push files to all configured remotes
dt import # Import data from other repositories using local caches
dt mv # Move/rename files, preserving import metadata
dt cache # Manage external shared caches
dt remote # Manage remote storage
dt config # View and modify configuration settings
dt doctor # Diagnose common setup issues

See the Command Reference for full documentation, or use dt <command> --help.
On HPC systems, dt supports the following pattern:
- Workspaces on fast scratch storage (e.g., /scratch/${PROJECT}/${USER}/)
- Shared caches on scratch for team collaboration (e.g., /scratch/${PROJECT}/dvc/cache/)
- Remotes on persistent storage (e.g., /g/data/${PROJECT}/dvc/)
- SSH access to remotes from external systems
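
As a rough illustration of the layout above, the example paths can be derived from a project code and a username. This is only a sketch: the helper function names and the fallback values ab12 and alice are hypothetical, and nothing here is part of dt itself or required by it.

```shell
#!/bin/sh
# Hypothetical helpers that print the example locations from the list above.
# They only format strings; they do not create directories or call dt.

# Workspace on fast scratch storage: /scratch/<project>/<user>
workspace_path() { printf '/scratch/%s/%s\n' "$1" "$2"; }

# Shared team cache on scratch: /scratch/<project>/dvc/cache
shared_cache_path() { printf '/scratch/%s/dvc/cache\n' "$1"; }

# Remote on persistent storage: /g/data/<project>/dvc
remote_path() { printf '/g/data/%s/dvc\n' "$1"; }

# Fall back to placeholder values (assumptions) when the variables are unset.
project="${PROJECT:-ab12}"
user="${USER:-alice}"

printf 'workspace:    %s\n' "$(workspace_path "$project" "$user")"
printf 'shared cache: %s\n' "$(shared_cache_path "$project")"
printf 'remote:       %s\n' "$(remote_path "$project")"
```

Keeping the workspace and the cache on the same scratch filesystem while the remote sits on persistent storage mirrors the pattern described above; adjust the prefixes to match your own site.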
- Command Reference - All commands and options
- Configuration - Configuration system and scopes
- DVC Basics - Background on DVC concepts