git & git-lfs

Overview

Git is a version control system that lets you keep old versions of files and directories (usually source code) and keep a log of who made each change, when, and why.

Versions 2.43.0, 2.24.0, 2.19.0 (via a modulefile) and 1.8.3.1 (system-wide default) are installed on the CSF.

git-lfs: Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server such as GitHub.com or GitHub Enterprise.

If you require a version other than those installed, we advise that you install it in your home directory.

Restrictions on use

All users of the system can access the software.

Set up procedure

June 2023: The proxy is no longer available – you no longer need to load the proxy modulefile but doing so will do no harm – it no longer modifies your environment.

To download data from external sites using https:// or ssh:// protocols, please do so from a batch job or use an interactive session on a backend node by running qrsh -l short.

Note that you may need to edit your ~/.gitconfig or ~/.ssh/config file to add new settings and remove old, out-of-date ones (see below).

You DO NOT then need to load the proxy modulefiles. Please see the qrsh notes for more information on interactive use.

We recommend using a newer version of git via a modulefile rather than the system-wide default version, which lacks recent git features. To do so, load one of the modulefiles:

# Load one of the following for a more recent version of git
module load tools/gcc/git/2.43.0
module load tools/gcc/git/2.24.0

# A slightly older version is available should you need it (via a different modules path)
module load apps/git/2.19.0

# If you don't load a modulefile, the default version is 1.8.3.1

To use git-lfs, load its modulefile:

module load tools/bintools/git-lfs/2.8.0

Note that if you get an error such as:

error: The requested URL returned error: 403 Forbidden while accessing https://github.com/.../refs

then you are using the older system-wide default version of git and should load one of the git modulefiles listed above instead.
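
To confirm which version of git is active after loading a modulefile, you can compare the output of git --version against the oldest modulefile version. The following is a minimal sketch, not a CSF-provided tool: version_ge is a hypothetical helper, the 2.19.0 threshold is an assumption taken from the list of modulefiles above, and sort -V assumes GNU coreutils.

```shell
#!/bin/bash
# version_ge is a hypothetical helper (not part of git or the CSF).
# It succeeds if version $1 is greater than or equal to version $2.
version_ge() {
    # sort -V orders version strings; if $2 sorts first (or ties), $1 >= $2
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v git >/dev/null 2>&1; then
    # "git --version" prints e.g. "git version 2.43.0"; take the third field
    current=$(git --version | awk '{print $3}')
    if version_ge "$current" "2.19.0"; then
        echo "git $current: a modulefile version is active"
    else
        echo "git $current: system default, load a git modulefile"
    fi
fi
```

If the second message appears, load one of the git modulefiles and run the check again.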

Running the application

On the command line type git.

If you wish to prepare a repository for use with git-lfs run the following inside each repository:

git lfs install
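
Once git-lfs is set up in a repository, you tell it which files to manage with git lfs track. The following sketch uses a throwaway repository and the pattern *.h5; both are illustrative assumptions, so substitute your own repository and file types.

```shell
#!/bin/bash
# Throwaway repository for illustration; use your own repository in practice.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

if git lfs version >/dev/null 2>&1; then
    git lfs install            # set up the git-lfs filters and hooks
    git lfs track '*.h5'       # '*.h5' is only an example pattern
    git add .gitattributes     # the tracking rules live in this file
    cat .gitattributes
else
    echo "git-lfs not found: load the git-lfs modulefile first"
fi
```

The pattern is quoted so that the shell passes it to git-lfs unchanged. Commit .gitattributes so that other clones of the repository use the same rules.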

External Access using https://

Updated June 2023: The proxy is no longer available. Instead, you should run your commands in an interactive session on a compute node, or submit a batch job. Then run your commands as normal. There is NO NEED to load the proxy modulefile.

For example:

# On the login node, request an interactive session
qrsh -l short
  #
  # Wait until you are logged in to a compute node

# Now run your commands as normal
module load tools/gcc/git/2.24.0
git clone https://github.com/username/myproject.git

# Note: If you get an error about the web-proxy, please edit your ~/.gitconfig file:
gedit ~/.gitconfig
  #
  # Remove 'proxy' line from the [http] section
  [http]
  proxy = http://some.proxy.addr:3128/       # REMOVE THIS LINE FROM YOUR ~/.gitconfig FILE
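
As an alternative to editing the file by hand, git itself can remove the setting. This is a minimal sketch that assumes the old proxy entry is the http.proxy key in your per-user (~/.gitconfig) configuration, as in the example above.

```shell
#!/bin/bash
# Show the current per-user proxy setting, if there is one
git config --global --get http.proxy || echo "no http.proxy set"

# Remove every http.proxy entry from the per-user configuration.
# "|| true" is needed because git returns non-zero when the key is absent.
git config --global --unset-all http.proxy || true
```

Running the first command again afterwards should report that no proxy is set.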

Note: You may also need to remove some old config from your ~/.ssh/config file if using the ssh:// protocol (see below)

Once cloned, you may wish to work with a specific tag:

# List all tags (and those matching a pattern)
git tag
git tag --list 'v2*'

# Work with a specific tag
git checkout tagname
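
The tag pattern should be quoted so that the shell passes it to git rather than expanding it against file names in the current directory. The following sketch demonstrates this in a throwaway repository; the tag names, file name and commit identity are all made up for illustration.

```shell
#!/bin/bash
# Throwaway repository with a few example tags
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "first commit"
git tag v1.0
git tag v2.0
git tag v2.1

# A file that an unquoted v2* would be expanded to by the shell
touch v2-notes.txt

# Quoted pattern: git receives v2* and matches it against tag names
matching=$(git tag --list 'v2*')
echo "$matching"

# Check out one of the tags (this leaves you on a detached HEAD)
git checkout -q v2.0
```

Checking out a tag puts the repository in a "detached HEAD" state, which is fine for building or running that version. Create a branch first if you intend to commit changes.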

External Access to github.com using https:// (with Personal Access Tokens)

GitHub no longer accepts account passwords for accessing your repos over https://. Their blog post announcing the change says:

If you are using a password to authenticate Git operations with GitHub.com today, you must begin using a personal access token over HTTPS or SSH key by August 13, 2021, to avoid disruption.

See their documentation on how to create a Personal Access Token (a token works like a randomly generated password that expires).

External Access to github.com using ssh:// (with key)

An alternative to using Personal Access Tokens with https:// is to set up a public-private key-pair for your github.com account, which will give you password-less access to github.com via ssh://.

The following is a quick summary of how to set up ssh:// access to github.com on the CSF using a key-pair:

# If you have not already done so, create a key-pair on the CSF:
ssh-keygen -t ed25519 -C your.email@manchester.ac.uk -f ~/.ssh/id_github
  #
  # We recommend protecting your key with a passphrase when asked, but this is optional.
  # If you intend to run git from batch jobs, do not set a passphrase.

# Now edit your ~/.ssh/config file and add the following:
gedit ~/.ssh/config

# ADD the following config to make ssh:// use your key
Host github.com
  IdentityFile ~/.ssh/id_github
  IdentitiesOnly yes

  # REMOVE any old config if present
  # HostName nataas.itservices.manchester.ac.uk
  # IdentityFile /mnt/iusers01/support/mbexegc2/.ssh/id_ed25519
  # StrictHostKeyChecking  no
  # Port 2022

(save the file)

# Now look at the public key-file contents (we'll need the displayed line shortly):
cat ~/.ssh/id_github.pub
ssh-ed25519 AAA................... your.email@manchester.ac.uk

Now login to your github.com account as normal.

  1. Then go to your Account Settings
  2. Then SSH and GPG Keys.
  3. Press the green New SSH Key button and paste in the above line.

Finally, test your connection from the CSF:

# On the CSF login node, start an interactive session (ssh:// only works from compute nodes)
qrsh -l short
module load tools/gcc/git/2.24.0

# Use the github test method to check your keys
ssh -T git@github.com
  # If you saved your key with a passphrase you'll be prompted to enter it
# You should see
Hi githubusername! You've successfully authenticated, but GitHub does not provide shell access.

# Return to the login node
exit
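
If you already have a repository that was cloned using https://, you can point it at the ssh form of the address so that your key is used. The following is a sketch in a throwaway repository; username/myproject is a placeholder rather than a real repository, and the remote is assumed to be named origin.

```shell
#!/bin/bash
# Throwaway repository standing in for an existing https:// clone
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git remote add origin https://github.com/username/myproject.git

# Switch the remote to the ssh form so the key from ~/.ssh/config is used
git remote set-url origin git@github.com:username/myproject.git

# Confirm the change
git remote -v
new_url=$(git config --get remote.origin.url)
```

In a real repository only the git remote set-url line is needed. Later fetch, pull and push commands then use ssh, so run them from a compute node as described above.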

Don’t clone into scratch

We strongly recommend that you clone repositories into your home area, or other research data storage, instead of your scratch area. This is because the scratch clean-up policy can automatically delete files not accessed for at least 3 months. A cloned repository often contains many files that you might not access, so the scratch clean-up could remove files without you being aware of it.
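
If you already have a clone in scratch, you can get a rough idea of which files are at risk by listing those with an old access time. This is a minimal sketch under stated assumptions: list_stale_files is a hypothetical helper (not a CSF command), 90 days is used as an approximation of "3 months", the clean-up is assumed to go by access time, and touch -d style dates and file access times depend on GNU tools and on how the filesystem is mounted.

```shell
#!/bin/bash
# list_stale_files is a hypothetical helper, not a CSF-provided command.
# It prints regular files under a directory whose last access time is more
# than the given number of days ago (default 90, approximating 3 months).
list_stale_files() {
    local dir="$1"
    local days="${2:-90}"
    # -H makes find follow the starting path if it is a symbolic link
    find -H "$dir" -type f -atime +"$days" -print
}

# Example (the path is an assumption; use the location of your own clone):
# list_stale_files ~/scratch/myproject
```

Treat the output as an indication only. The safer fix is to move the clone to your home area or other research data storage.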

VSCode and github.com

Some VSCode extensions will attempt to contact github.com from the CSF login node. This occurs when you have VSCode installed on your local PC/laptop, and you use the “SSH Extension” in VSCode to log in to the CSF and edit code in your CSF home directory. However, VSCode cannot be made to run its git commands on a CSF compute node, as is normally required.

To allow VSCode to work with github.com, limited access to github.com from the login node is now permitted. We have added github.com to the “allow-list” of sites that can be contacted from the login node. Please note that the Research Infrastructure team DO NOT manage the list of permitted sites that can be contacted from the CSF login nodes. We can only request that access be granted to certain sites.

If your use of VSCode is trying to contact a github service from the CSF login node and is being blocked, please contact us with the domain name (e.g., something.github.com) that is reported in any error message you receive.

Further info

Updates

None

Last modified on April 2, 2024 at 12:33 pm by George Leaver