Skip to content

Git Troubleshooting & Tips

This troubleshooting guide is mostly for data stewards. It is not a git tutorial, but rather a small start.

ARCitect – first aid checklist

Section titled ARCitect – first aid checklist

This checklist should help to identify common issues with an ARC. It may help data stewards and users to identify the overall status of the ARC and a user’s setup.

Check whether git and git-lfs are installed and executable.

  1. Open command line via ARCitect —> Tools —> Command Window
  2. Type and execute git --version and git-lfs --version

If instead of showing the versions for both tools this returns an error, something may be wrong with the Git installation or storage location of the ARC.

Check commits and compare those of the local ARC vs. the ARC in the DataHUB.

  1. Local: ARCitect —> History Menu
  2. DataHUB: ARC —> right sidebar —> Number of Commits

The commits (incl. commit message, date and committer details) in the DataHUB should be the same as in the local ARC. If not, this might help to identify at what time point (i.e. between which commits) something unexpected happened.

Check the ARC size and compare that of the local ARC vs. the ARC in DataHUB.

  1. Local: Open the ARC folder via ARCitect —> Explorer. Right-click the folder name and inspect the size via Properties (Windows) or Get Info (macOS)
  2. DataHUB: ARC —> right sidebar —> Project storage

Note, that the size of your local ARC is approximately double the size of that in the DataHUB. This is due to git’s version control mechanism. If the size is very different, the ARC synchronization was likely not successful.

Check whether large files are properly uploaded into LFS.

  1. Local: Inside ARCitect, large files should be flagged with LFS in the file tree
  2. DataHUB:
    • ARC —> right sidebar —> Project storage: The size of LFS should be high, that of Repository should be low
    • Just like in ARCitect, large files should be flagged with LFS in the file tree

If large files are unexpectedly not flagged as LFS, please check the details on Git-LFS and some trouble shooting in the section below.

Check the remote connection.

  1. ARCitect —> DataHUB sync —> Remote

Make sure, that the remote URL is correct and aligns with that of the ARC in the DataHUB. This may not be the case, if the local ARC’s folder name was changed or if the ARCs URL in the DataHUB has changed to moving the ARC or adapting the URL.

Check the current branch.

  1. ARCitect —> Commit Menu —> Dropdown “Branch”

For most use cases, the main branch should be selected. If the branch dropdown does not display main, something may be wrong with the status of your ARC. Please contact a data steward for help.

Check the current git status.

  1. Open ARC in a command line via ARCitect —> Tools —> Command Window
  2. Run git status

See Git status for details.

Check the git configuration

  1. Open ARC in a command line via ARCitect —> Tools —> Command Window
  2. Run git config --list

See Git config for details.

Check whether a .gitignore file exists in the ARC. If no .gitignore exists, this can lead to unexpected behavior for temporary, hidden files.

This article explains how to add a .gitignore file.

Identify the storage location.

  1. Is the ARC stored on a mounted external hard drive, network or server?
  2. Is the ARC stored in a cloud folder (e.g. Dropbox, iCloud, Sciebo, OneDrive)?

If so, create a new test ARC in the same and another “local-only” location (i.e. non-cloud, non-server) to check whether the issue persists. If this solves the issue, something may be tricky with your cloud or network connection. Please contact a data steward for help.

Identify whether the ARC is intact with all expected files being present.

  1. Was the ARC only partially moved or copied from one location to another (i.e. without the hidden .git folder)?
  2. Was the ARC downloaded from the DataHUB without LFS objects and tried to upload to another remote?

If so, make sure to move or copy the complete ARC folder and make sure to download the ARC including all LFS objects (not recommended for large ARCs).

  1. (if required) Install Git on user machine

  2. navigate to the ARC in trouble (via one of many options below)

  • On macOS: you can drag&drop the ARC folder from Finder into a terminal
  • On macOS: right-click ARC folder—>“Services”—> “New Terminal at Folder”
  • On windows: open folder via Explorer; type “cmd” or “powershell” into the address field on top of Explorer
  • On linux / macOS terminal: cd path/to/ARC
  • From inside ARCitect: Tools -> Command Window
  1. try some of the git commands and debugging below

This is a list of common error messages, if there is an error with the setup or ARC synchronization. The errors are displayed during synchronization via ARCitect (pop-up windows in the menus Commit or DataHUB sync) or during ARC Commander’s arc sync.

error message*possible reasonpossible solution
remote: HTTP Basic: Access denied fatal: Authentication failed for 'https://git.nfdi4plants.org/UserName/ARCname'Your computer is not “linked” to your DataHUB accountAccess Denied
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCname' hint: Your push was rejected due to missing or corrupt local objects.You tried to upload LFS-tracked files that are not present on your computerMissing LFS Objects
remote: GitLab: LFS objects are missing. Ensure LFS is properly set up or try a manual "git lfs push --all"You tried to upload LFS-tracked files that are not present on your computerMissing LFS Objects
LFS: PUT "<https://git.nfdi4plants.org/.../...>" read tcp ... i/o timeoutYou ran into a time out, likely due to very large single filesPrevent LFS time out error
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCname' hint: Updates were rejected because the remote contains work that you do not have locally.Your local ARC is out of sync with the remote.ARC not in sync with the DataHUB
ERROR: Can not sync with remote as no remote repository address was specified.There is no URL specified for your ARC’s remoteGit remote
ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git' not foundThe remote URL does not existGit remote
ERROR: GIT: fatal: detected dubious ownershipThis is an error typically seen when working on mounted network drivesDubious ownership
fatal: credential-cache unavailable; no unix socket supportLikely happens on Windows, if a gitconfig contains credential.helper=cacheAdjust the Git Credential helper setting
fatal: Need to specify how to reconcile divergent branches.Your ARC contains multiple branches that progressed independently and need to be mergedContact a data steward.
error: unable to create file <path/to/file> : Filename too longLikely occurs on Windows, if your ARC or files in your ARC are stored in a deeply nested folder, i.e. a folder in a folder in a folder …Allow very long file names
UNC paths are not supported. Defaulting to Windows directory.Might be due to working on a network drive or server.tbd Please contact a data steward for support.

Your two favorite Git commands: status and log

Section titled Your two favorite Git commands: status and log

To get a good summary of the ARC including

  • the branch you are on
  • files that were committed since last commit
  • files that were modified, but not committed (tracked)
  • typically anything buggy
Terminal window
git status

If everything’s clear and committed, this should prompt something like

Your branch is up to date with … nothing to commit, working tree clean

Now, to compare the status of the local clone vs. that of the remote (i.e. the DataHUB) with a bit more confidence and wording, use

Terminal window
git log

This displays the commit history (messages) of the ARC reverse-chronologically, i.e. top-most = latest. So if the top commit message of the local ARC is different from the last commit message displayed in the DataHUB, the ARC is out of sync.

If you like it prettier, remember “a dog”…

Terminal window
git log --all --decorate --oneline --graph

Hit q to close the log.

The gitconfig is basically the settings and preferences for your git installation. There are three types of gitconfigs. Depending on the tool (ARCitect, ARC Commander) and operating system (macOS, Linux, Windows), different git settings may be received from different config files.

flagmeaning
—globalcurrent user on that computer
—systemsystem-wide (all users)
—localcurrent repository (ARC)

The following command lists all configurations and where they originate (—show-origin) from and what there scope is (—show-scope).

Terminal window
git config --list --show-origin --show-scope

In order to only show e.g. the global gitconfig use

Terminal window
git config --global --list
Section titled Recommended git configurations

When executed inside an ARC folder, the git config --list should contain the following configurations

configurationexplanation
user.nameShould display the user’s DataHUB account name
user.emailShould display the user’s DataHUB account email address
credential.helperWhether and how DataHUB credentials are stored. Should be credential.helper=store (Windows, Linux) or credential.helper=osxkeychain (macOS)
core.longpaths=trueAllows to have very long file names or nested folder structures.
init.defaultbranch=mainProvides that newly created ARCs work on a main branch
filter.lfs.process=git-lfs filter-process, filter.lfs.required=true, filter.lfs.clean=git-lfs clean -- %f, filter.lfs.smudge=git-lfs smudge -- %fThese four settings are required for LFS
lfs.activitytimeout=0Circumvents a time our error, when trying to push ARCs to the DataHUB with very large files.

Editing the respective gitconfig is ideally done via command line.

Terminal window
git config --global user.name "Your Name"
git config --global user.email "Your eMail"
Terminal window
git config --global init.defaultBranch main

The gitconfig contains a setting, whether and how to save git credentials on your machine called credential.helper.

On Windows, you might run into the error fatal: credential-cache unavailable; no unix socket support, if it is set to credential.helper=cache.

This can be solved by either of the following:

  1. Remove “credential.helper=cache” via git config --global --unset credential.helper.
  2. Overwrite the setting with “store” instead of “cache” via git config --global credential.helper store.

Users (especially on windows) run into errors with long overall file names (i.e. full path). This setting should fix it:

Terminal window
git config --global core.longpaths true

For ARCs the “remote” is the DataHUB. The remote address (ARC url) is stored in the git of the local ARC. Display the URL, to which the local ARC is connected via

Terminal window
git remote -v

Adding a remote during arc sync

Section titled Adding a remote during arc sync

A default remote is usually added by ARC Commander or ARCitect. If the ARC does not yet exist in the DataHUB, and you created it via ARC Commander and synced it via arc sync, you will see this error:

Terminal window
ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git/' not found
GIT: warning: redirecting to https://git.nfdi4plants.org/UserName/ARCname.git/
...
GIT: remote: The private project UserName/ARCname was successfully created.

This is not to worry about, the ARC was created in the DataHUB during this process.

If you only see the error ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git/' not found, but not the following lines mentioning that the ARC was created automatically, make sure to use the “force”, i.e. arc sync --force ....

If above command does not display any remote, you can add one via

Terminal window
git remote add origin https://git.nfdi4plants.org/<UserName>/<ARCName>

You can edit a remote via

Terminal window
git remote set-url origin https://git.nfdi4plants.org/<UserName>/<ARCName>

As of now, the DataPLANT tools focus on working on a single branch (main). It can still happen that your ARC has multiple branches e.g. by accident (see git config —> init.defaultbranch) or because some git-affine collaborator knows how to create them. To display the branches of the local ARC, use

Terminal window
git branch

If you also want to display branches that exist on the remote (but not locally), use

Terminal window
git branch --all

Git LFS is basically the system in the back to simplify working with git and (ARCs containing) large data files. ARC commander and ARCitect offer options to download (clone) an ARC without large files; speeding up the process and avoiding waste of data storage, if you are only interested e.g. in the metadata.

In order to properly upload large(r) files to the DataHUB via “pure git” (i.e. on the command line) or via ARC Commander or ARCitect, Git-LFS needs to be initiated on every computer (and user account) before using these tools.

Checking whether LFS (large file storage) works properly for your ARCs

Section titled Checking whether LFS (large file storage) works properly for your ARCs
  • In ARCitect, you can see large files (defined by the threshold in the commit menu) flagged as LFS in the file tree
  • In the DataHUB LFS files are also flagged as LFS. In addition, you can click in the right sidebar of your ARC in the DataHUB on “Project Storage”. Here, the major amount of your data should be stored in “LFS”, while only a minor part is stored in “Repository”.
  • If you have git-lfs installed and know how to use there command line, simply run git lfs install.
  • You can check for the proper configuration via git config --list --show-origin --show-scope. Amongst others, the config should contain the following lines
filter.lfs.process=git-lfs filter-process
filter.lfs.required=true
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f

In your home folder (Windows: C:/Users/<UserName>, macOS: Users/<UserName>), create or edit the file called .gitconfig to include the following lines.

[filter "lfs"]
process = git-lfs filter-process
required = true
clean = git-lfs clean -- %f
smudge = git-lfs smudge -- %f

When users try to upload very large files, i.e. not the overall push size, but single-very-large-files, they might run into a time out error. This setting should fix it:

Terminal window
git config lfs.activitytimeout 0

The following errors are related to missing LFS object:

Terminal window
hint: Your push was rejected due to missing or corrupt local objects.
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCName.git'
Terminal window
remote: GitLab: LFS objects are missing. Ensure LFS is properly set up or try a manual "git lfs push --all".

Possible reasons, why this happens:

  • you have downloaded (cloned) an ARC without the large files (i.e. only the pointer files) and try to upload it to another location on the DataHUB (i.e. new remote due to a transfer to other user, group, etc. or renamed ARC)
  • you moved a pointer file (instead of an actual large file) from one ARC on your computer to another ARC and tried to upload

In this case you would have to download all LFS objects from the original remote first -> ask a data steward for help.

Step-by-step track large file(s) via LFS

Section titled Step-by-step track large file(s) via LFS

Done in small steps plus logging. Note this works on shells like macOS terminal, linux terminal, Git Bash (available for Windows). This likely does not work on Windows Powershell and definitely not in Windows command prompt.

  1. Track files via LFS (this adds them to .gitattributes)

    Terminal window
    git lfs track "assays/RNAseq_RawData/dataset/**"
  2. git track the .gitattributes file first

    Terminal window
    git add .gitattributes
  3. Git add the large files

    Terminal window
    git add assays/RNAseq_RawData/dataset/*
  4. Git commit (and write what’s happening to a log file)

    Terminal window
    GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git commit -m "add rnaseq files to LFS" -v >> git-commit-LFS.log 2>&1 &
  5. Git push (and write what’s happening to a log file)

    Terminal window
    GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git push -v >> git-push-LFS.log 2>&1 &

Check the status of LFS-tracked files

Section titled Check the status of LFS-tracked files
Terminal window
git lfs status

To get a list of LFS-tracked files including the size of the original file, run

Terminal window
git lfs ls-files -ls

This will display the object ID (oid), the relative path to the file and the object size. The oid is also stored in the pointer file at the file’s position.

To get a report of all LFS-tracked files including there status, use

Terminal window
git lfs ls-files -d

Amongst others, this report will print for every LFS file, whether it is downloaded (checkout: true; download: true) to the local ARC or not (checkout: false; download: false).

Common issues and error messages

Section titled Common issues and error messages

ARC files opened in multiple programs

Section titled ARC files opened in multiple programs

A common source for issues are multiple programs that work on the ARC in parallel.

  • In particular, working on the ARC with multiple softwares that have Git integration may lead to confusion. For instance, while you sync the ARC using ARCitect or ARC Commander, the changes may still be displayed as un-committed in VSCode, RStudio, PyCharm or other third-party software.

  • Many softwares produce hidden temporary files. By default these files are not shown or synced by the ARCitect or ARC Commander. They might still sometimes lead to confusion, e.g. not being able to commit changes. This is especially the case for office software (Excel, Word, LibreOffice, etc.), where e.g. one of the ISA files (isa.investigation.xlsx, isa.study.xlsx, isa.assay.xlsx) or another office file stored in the ARC may be open. However, also ARCs opened in Windows Explorer or macOS Finder sometimes led to issues.

  • Before syncing an ARC, close all ARC-files and Explorer / Finder windows

  • Avoid to edit, delete, or move files, while the ARC is being synced to the DataHUB

ARC not in sync with the DataHUB

Section titled ARC not in sync with the DataHUB

Your local ARC is likely out of sync with the remote. This happens, if you or an invited colleague work(s) on the same ARC from a different location (e.g. the DataHUB or another computer). Before working on your ARC, make sure to update the local clone via one of these

  • ARCitect —> Versioning —> Pull
  • arc sync
  • git pull (-> this would also prompt a message if changes need to be merged)

Sometimes you run into permission issues such as

Terminal window
remote: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password.
Terminal window
fatal: Authentication failed for 'https://git.nfdi4plants.org/UserName/ARCName.git/'

This is due to missing or outdated DataHUB credentials on your computer. It usually helps to just retrieve new ones. If not, you might have to remove existing credentials stored on your computer.

Option 1: via ARC Commander

Option 2: “by hand”

  1. Login to the DataHUB
  2. Create a new Personal Access Token (PAT) with scope api
  3. Run a git command (e.g. arc sync, git pull) to trigger being asked for git credentials
    1. Provide your DataHUB username
    2. Use the token instead of your password

If (new) authentication alone does not help, you might need to delete existing tokens or passwords first.

  1. Run git config --get-regexp "credential" to find out whether and where credentials are stored

  2. This typically displays one of the following

    credential.helper store

    credential.helper osxkeychain (only on macOS)

  3. If credential.helper store is displayed, the credentials are typically stored in ~/.git-credentials, a hidden text file stored in the user’s home folder. Edit this file and delete the row(s) containing “git.nfdi4plants.org” (https://<UserName>:<Token>@git.nfdi4plants.org).

  4. On macOS (if credential.helper osxkeychain is displayed) open the app “Keychain Access”, search and delete passwords for “git.nfdi4plants.org”.

The error ERROR: GIT: fatal: detected dubious ownership typically occurs when working on a mounted network drive (Fileshare, File Server, NAS). Very simplified: the user on the computer and the owner of the network drive differ and git tries to safe you from working in a folder you do not own.

You can add the path to the ARC to the list of safe directories via the command

Terminal window
git config --global --add safe.directory %(prefix)///servername/share/path/to/ARC/

You can circumvent this error by adding all directories to your list of safe directories via the command

Terminal window
git config --global --add safe.directory *

To help troubleshooting add (some or all) variables GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 before your git command to get more info, e.g.

Terminal window
GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git push -v >> git-push-LFS.log 2>&1 &