Part 2. Basic Git and GitHub

Git is an open-source version control tool that lets you track changes in your code and collaborate with others. It is very useful for managing code in research projects, especially when working in teams.

GitHub is a web-based platform that hosts Git repositories and provides additional features for collaboration, such as issue tracking, pull requests, and code reviews. GitHub also provides a user-friendly interface for managing repositories and collaborating with others. There are other platforms similar to GitHub, such as GitLab and Bitbucket.

More advanced usage of Git includes branching and merging, which let several people work on the same codebase in parallel without overwriting each other's changes. However, for this course and your MSc project, we will only use the basic commands.

Using Git also helps you develop good coding practices, such as writing clear commit messages, using branches for new features or bug fixes, and regularly pushing changes to the remote repository.

It is also extremely useful for research, as it promotes transparency and reproducibility: others can access the code to replicate results, data analysis pipelines, etc. If you know the FAIR principles (Findable, Accessible, Interoperable, Reusable), using Git with a remote repository hosted on GitHub, GitLab, or Bitbucket can help make your code more FAIR by providing a platform for sharing code, collaborating on it, and reproducing results.

2.1 Basic Git commands

We will first activate our virtual environment (if not already activated):

conda activate pyqm

For a quick reference of the most common Git commands, you can run the following in the terminal:

git --help

If Git is not installed (you can check by running git --version), install it with conda:

conda install git 

Quick question: Why are we not using pip to install Git?

For this course, we will use the following basic Git commands:

  • git init: Initialize a new Git repository in the current directory.

  • git clone <repository-url>: Clone an existing repository from a remote source (e.g., GitHub) to your local machine.

  • git status: Check the status of your working directory and see which files have been modified, added, or deleted.

  • git add <file>: Stage changes to a specific file for the next commit. You can also use git add . to stage all changes in the current directory.

  • git commit -m "commit message": Commit the staged changes with a descriptive message.

  • git push origin <branch>: Push your local commits to the remote repository on the specified branch (e.g., main or master).

  • git pull origin <branch>: Fetch and merge changes from the remote repository to your local branch.

  • git log: View the commit history of the repository.

  • git branch: List all branches in the repository. You can also create a new branch with git branch <branch-name>.

  • git checkout <branch-name>: Switch to a different branch in the repository.

Initially, we won’t use branches, but it is good to know that they exist. A branch lets you commit changes separately from the main codebase, which is useful for testing new features or making changes without risking the stability of the main code. If you create a branch and push changes to it, those changes appear under that branch in the remote repository (GitHub main page of the repository → branch dropdown menu). To bring them into the main branch on GitHub, you then create a pull request.

More information can be found here: https://confluence.atlassian.com/bitbucketserver/basic-git-commands-776639767.html

2.2 Create a GitHub account

To create a GitHub account, follow these steps:

  1. Go to the GitHub website:

  2. Click on the “Sign up” button located at the top right corner of the page.

  3. Fill in the required information, including your username, email address, and password.

  4. Choose a plan (the free plan is sufficient for most users).

  5. Complete the account creation process by following the on-screen instructions, which may include verifying your email address.

  6. Once your account is created, you can log in and start using GitHub.

2.3 Create a new repository on your local machine and in GitHub (remote repository)

For this section we will be using the terminal.

  1. Create a local Git repository:

To begin, open up a terminal window and navigate to the directory where you want to create your new Git repository. Then, run the following commands:

mkdir my-repo
cd my-repo
git init

  2. Create a new repository on GitHub:

    • Go to your GitHub account and click on the “New” button to create a new repository.

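    • Enter a repository name that matches your local repository (my-repo), add an optional description, and set the visibility to public. Do not initialize it with a README, .gitignore, or license: the remote should start empty so that your first push is accepted.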

    • Click on the “Create repository” button to create the new repository.

We will need the URL of the remote repository to link it to our local repository. You can find it on the GitHub page of the repository you just created. It should look something like this: https://github.com/your-username/my-repo.git

  3. Configure your Git username and email (if not done already):

Git records an author name and email address in every commit. These settings identify you in the commit history; they are not your GitHub login credentials. Set them with the following commands:

git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
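To check what Git will record, you can read the settings back. The sketch below is only an illustration: it works in a throwaway repository and sets the values without --global, so your real configuration is left untouched. The name and email are placeholders.

```shell
# Work in a throwaway repository so the real global settings are not modified.
demo_dir=$(mktemp -d)
git init -q "$demo_dir/config-demo"
cd "$demo_dir/config-demo"

# Without --global, a setting applies to this repository only.
git config user.name "Example User"          # placeholder name
git config user.email "user@example.com"     # placeholder email

# Read the values back; this is what Git stamps on each commit made here.
git config --get user.name
git config --get user.email

# List every setting together with the file it comes from.
git config --list --show-origin
```

On your own machine, running git config --get user.name inside any repository shows the name that will be used there, whether it comes from the global or the repository-level configuration.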
  4. Link the local repository to the remote repository:

In the terminal, make sure you are inside the my-repo directory, then run the following command:

git remote add origin https://github.com/your-username/my-repo.git
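It is worth checking that the remote was registered with the right URL before pushing. A minimal sketch, using a throwaway repository and the same placeholder URL as above (nothing is contacted over the network):

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"

# Placeholder URL from the text; replace your-username with your own account.
git remote add origin https://github.com/your-username/my-repo.git

# Show the fetch and push URLs registered for each remote.
git remote -v

# If the URL was mistyped, correct it in place instead of adding a second remote.
git remote set-url origin https://github.com/your-username/my-repo.git
git remote get-url origin
```

The name origin is only a convention for the main remote; Git does not treat it specially beyond being the default name that git clone assigns.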
  5. Add a new file to the local repository:

Using a text editor of your choice, create a new file called README.md in the my-repo directory. Add some content to the file, such as a brief description of your project.

Example using Vim:

vim README.md

Press i to enter insert mode, type your content (e.g. “This is a READ me test”), then press Esc, type :wq, and press Enter to save and exit.

Alternatively, create the file with a single shell command:

echo "# My Repository" >> README.md

After creating the file, you can use git status to see the changes:

git status

The output lists README.md under “Untracked files”: the file exists in the directory, but Git is not tracking it yet.
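For a more compact view, git status accepts a --short option that prints one line per file with a two-character state marker. The sketch below runs in a throwaway directory and also stages the file (the git add command explained in the next step) to show how the marker changes.

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
echo "# My Repository" >> README.md

# Untracked files are marked with "??".
git status --short

# After staging, the marker becomes "A" (added to the staging area).
git add README.md
git status --short
```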

  6. Stage the file for commit:

We then need to add the file to the staging area using the git add command:

git add README.md
git status

  7. Commit the changes:

Committing records the staged changes in the repository's history, together with a message describing what has changed. The more descriptive the message, the more useful it will be for you and your collaborators. This is done using the git commit command:

git commit -m "Initial commit with README file"

  8. Push the changes to GitHub:

Once you have committed your changes, you can push them to the remote repository on GitHub, that is, upload your local commits to it. When pushing over HTTPS, GitHub asks you to authenticate and expects a personal access token in place of your account password.

git push origin main

Note that we are assuming that the default branch is called main, which is GitHub's default for new repositories. Depending on your Git version and configuration, git init may instead name the first branch master. If your default branch is called master, you should use git push origin master instead.
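If git push origin main fails with an error such as "src refspec main does not match any", the local branch probably has a different name. The sketch below shows one way to check the branch name, rename it to main, and push. To stay runnable offline, it uses a local bare repository as a stand-in for GitHub; the paths and the identity are placeholders for illustration only.

```shell
# A local bare repository stands in for GitHub so the sketch runs offline.
demo_dir=$(mktemp -d)
git init -q --bare "$demo_dir/remote.git"

git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
git remote add origin "$demo_dir/remote.git"

echo "# My Repository" >> README.md
git add README.md
git commit -q -m "Initial commit with README file"

# Print the name of the branch you are on (needs Git 2.22 or newer).
git branch --show-current

# Rename the current branch to main; harmless if it is already called main.
git branch -M main

# -u records origin/main as the upstream of the local main branch,
# so later pushes and pulls on this branch can be plain "git push" / "git pull".
git push -q -u origin main
git status --short --branch
```

Renaming the local branch is a design choice: keeping master and pushing with git push origin master works just as well, as long as you use the same name consistently.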

2.3 Branches (optional)

By default, a new repository has a single branch, usually called main (or master, depending on your Git version and configuration). Branches let you develop your code on a secondary branch while keeping a stable version of it on the main branch.

2.3.1 Create a new branch (optional)

You can create a new branch using the git branch command:

git branch my-branch

We can then switch to the new branch using the git checkout command:

git checkout my-branch
git status

From now on, new commits are recorded on my-branch instead of main, and pushing my-branch uploads them to a branch of the same name on GitHub. For instance:

echo "Some changes in my branch" >> README.md
git add README.md
git commit -m "Changes in my branch"
git push origin my-branch

Working on a branch is very useful when you don’t want to change the main branch until you are sure that the changes are correct. It is especially useful in teams, since it keeps each person’s work in progress separate until it is ready to be merged.
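Creating a branch and switching to it can also be done in one step with git checkout -b. The sketch below illustrates this in a throwaway repository (placeholder identity, no remote involved) and then lists which commits exist only on the new branch.

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# My Repository" >> README.md
git add README.md
git commit -q -m "Initial commit with README file"
git branch -M main

# Create my-branch and switch to it in a single step.
# (On Git 2.23 or newer, "git switch -c my-branch" does the same.)
git checkout -q -b my-branch

echo "Some changes in my branch" >> README.md
git add README.md
git commit -q -m "Changes in my branch"

# The asterisk marks the branch you are currently on.
git branch

# Commits that exist on my-branch but not on main.
git log --oneline main..my-branch
```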

2.3.2 Merge branches (optional)

Once you have made changes in a branch and you are satisfied with them, you can merge the changes back into the main branch using the git merge command:

git checkout main
git merge my-branch
git status
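As an illustration, the sketch below runs the whole cycle in a throwaway repository: a commit is made on my-branch, merged into main, and the branch is then deleted. The identity is a placeholder, and no remote is involved.

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# My Repository" >> README.md
git add README.md
git commit -q -m "Initial commit with README file"
git branch -M main

# Make one commit on a separate branch.
git checkout -q -b my-branch
echo "Some changes in my branch" >> README.md
git commit -q -a -m "Changes in my branch"

# Merge it back. main has no new commits of its own, so Git simply
# moves main forward to my-branch (a "fast-forward" merge).
git checkout -q main
git merge -q my-branch

# List the branches whose commits are all contained in the current branch.
git branch --merged

# A merged branch can be deleted; -d refuses if it still has unmerged commits.
git branch -d my-branch
```

If both branches had changed the same lines, git merge would stop and report a conflict, which has to be resolved by editing the affected files and committing the result.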

2.4 Pull changes from GitHub

Whether you are working alone or in a team, it is important to pull regularly from the remote repository so that your local copy has the latest version of the code. The git pull command downloads the new commits and merges them into your current local branch:

git pull origin main
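To see what a pull does, the sketch below simulates two machines sharing one remote. Everything here is an assumption made for illustration: a local bare repository plays the role of GitHub, and the directories laptop and desktop play the role of two computers.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Stand-in for the GitHub repository, with main as its default branch.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# "laptop" creates the first commit and pushes it.
git clone -q remote.git laptop 2>/dev/null   # hide the "empty repository" warning
cd laptop
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# My Repository" >> README.md
git add README.md
git commit -q -m "Initial commit with README file"
git branch -M main
git push -q origin main
cd ..

# "desktop" clones the repository, adds a commit, and pushes it.
git clone -q remote.git desktop
cd desktop
git config user.name "Example User"
git config user.email "user@example.com"
echo "Edited on the desktop" >> README.md
git commit -q -a -m "Edit README on the desktop"
git push -q origin main
cd ..

# Back on "laptop", the new commit only appears after pulling.
cd laptop
git pull -q origin main
git log --oneline
```

Pulling before you start working keeps the local history up to date and makes it less likely that a later push is rejected because the remote has commits you do not have.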

If you have followed the previous steps so far, now you should have a local repository with a README.md file, and the same repository should be available on GitHub. You can verify this by going to your GitHub account and checking if the repository is there.

2.5 Clone a repository from GitHub

Cloning a repository means creating a local copy of a remote repository on your machine. For instance, there is a public program that you want to use (or contribute to), and it is hosted on GitHub. You can clone the repository to your local machine and use it locally.

Using packages and code from public repositories is common practice in research, especially in fields that are at an early stage or developing very fast, such as machine learning, quantum computation, and computational chemistry, among many others.

However, a few tips and warnings:

  • Always make sure to read the license of the repository before using it, as some repositories may have restrictions on how the code can be used.

  • Be aware that some repositories may not be well-maintained or may contain bugs, so it is important to test the code thoroughly before using it in your own research.

  • Public repositories may not be peer-reviewed, so use them with caution and verify their validity.

  • And finally, if you are making your work public, always remember to cite the original authors if you use their code in your research! (You will appreciate it if one day someone cites your code :D )

To clone an existing repository from GitHub to your local machine, you can use the git clone command:

cd /path/to/clone/directory 
git clone <repository-url>

This downloads a copy of the repository hosted at <repository-url> to your local machine. You can find the repository URL on the GitHub page of the repository you want to clone.
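The sketch below shows what a clone gives you. So that it runs without network access, it first builds a small stand-in repository locally and clones from that path; with GitHub you would pass the repository URL instead. The directory names seed, remote.git, and my-copy are made up for this example.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Build a small stand-in "remote" containing one commit.
git init -q seed
git -C seed config user.name "Example User"      # placeholder identity
git -C seed config user.email "user@example.com"
echo "# Example project" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git

# An optional second argument chooses the name of the target directory.
git clone -q remote.git my-copy
cd my-copy

# The clone contains the files and the full history...
ls
git log --oneline

# ...and the source is already registered as the remote called origin.
git remote -v
```

Because origin is configured automatically, there is no need to run git init or git remote add in a cloned repository.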

In addition, Git gives you reliable version control, since it tracks every change made to the codebase. If you make a mistake, you can revert to a previous version of the code. Writing meaningful commit messages also helps you and your collaborators understand how the code has changed over time.
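One way to undo a mistake, among several that Git offers, is git revert, which adds a new commit that cancels the changes of an earlier one while keeping the history intact. A minimal sketch in a throwaway repository (placeholder identity and file contents):

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "# My Repository" > README.md
git add README.md
git commit -q -m "Initial commit with README file"

# A second commit that we later decide was a mistake.
echo "A line added by mistake" >> README.md
git commit -q -a -m "Add a line by mistake"

# Create a new commit that undoes the most recent one (HEAD).
# --no-edit accepts the default message instead of opening an editor.
git revert --no-edit HEAD

# The file is back to its earlier content, and all three commits remain visible.
cat README.md
git log --oneline
```

Since nothing is deleted from the history, this approach is also safe for commits that have already been pushed to a shared remote.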

Another advantage of using Git is portability: you can work on the same codebase from different machines. Clone the repository on each machine and work on it locally. Just make sure to push your changes to the remote repository so that they are available on all machines!

2.6 View commit history

To view the commit history of a repository, you can use the git log command:

git log

This will display a list of all the commits in the repository, along with their commit messages, author information, and timestamps. You can use this information to track changes to the code over time and to identify specific commits that introduced bugs or other issues.
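The full output of git log becomes long quickly, so a few options are handy for narrowing it down. The sketch below builds a throwaway repository with three commits (file names and messages are made up) and shows some common variants.

```shell
demo_dir=$(mktemp -d)
git init -q "$demo_dir/my-repo"
cd "$demo_dir/my-repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "# My Repository" > README.md
git add README.md
git commit -q -m "Initial commit with README file"

echo "Some notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes file"

echo "More details" >> README.md
git commit -q -a -m "Extend README"

# One line per commit: abbreviated hash and message.
git log --oneline

# Only the two most recent commits, in the full format.
git log -n 2

# Only the commits that touched a given file.
git log --oneline -- README.md
```

The abbreviated hash shown by --oneline can be passed to other commands, for example git show <hash> to display the changes introduced by that commit.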

Additional resources