Git Tutorial

This document provides a brief introduction to Git and some common operations:
- Git is an open source distributed version control system created by Linus Torvalds in 2005 as an alternative to centralized systems like CVS or Subversion.
- The document explains how to install Git, configure a user, and connect to a GitLab instance for source control.
- Basic Git commands such as init, add, commit and status are demonstrated to initialize a repository and track changes to files.

Uploaded by

Roberto Martinez
Copyright
© All Rights Reserved

Big Data Management and Analytics

The purpose of this document is to give a brief introduction to Git and to some of its most
commonly used operations. It is not intended as a complete guide to the software but as an
introductory one, centered on what will be used in the lab sessions. Experience with
Source Control Systems is a big plus for a full understanding, but it is not necessary.

Table of contents

Git as Source Control System
    Introduction
    Configuration and installation
        Git Client
        Gitlab Instance
First Steps
Travelling Across Time and Realities
Working With Remotes
    Creating a remote repository
    Pull, Fetch and Push
    Additional Knowledge
        Merge Requests
        Rebase
Appendix
    Lab session Git cheat sheet
    Using Gitlab for deploying to the cluster

Git as Source Control System

Introduction
Git is a Source Control System created by Linus Torvalds in 2005. It is a decentralized,
distributed system, as opposed to popular centralized source control systems such as CVS or Subversion.
It was created to replace BitKeeper as the Source Control System for the Linux kernel
project, and it was swiftly adopted by many projects all across the globe.

One of the key advantages Git had over older source control software is that it can easily be
set up for local use on a single machine, and a remote repository can be added afterwards without
much effort.

Configuration and installation

Git Client
The Git client is a command line tool used to work with repositories. Depending on your
system, it may or may not be installed already, and the way of installing or updating it will differ.
Usually it can be installed from a binary package, through a package manager, or by compiling it
yourself. The official documentation covers the different systems: Installing Git

We will explain how to use git on the command line. On Linux or macOS, any of the typical
shells will work (bash, zsh, etc). On Windows, you can use PowerShell or the Windows
Subsystem for Linux (WSL) shell. Git can also be used from a GUI frontend, and most IDEs
include a git plugin, making the command line unnecessary. If you plan to use one of these
tools, please refer to their documentation. The first step is to check whether git is installed, for
example by running git --version. On Linux and macOS it is usually already available on the system.
On Windows, it can be installed as a stand-alone command line tool, or it may come bundled with
some GUI frontends.

Once we have checked that git is installed, it is time to configure our user. Open the shell and
configure your username and email in the following way:

git config --global user.name "your_name"
git config --global user.email "your_name@domain"
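
The configuration can be read back to confirm it was stored. The following is a minimal sketch, assuming git is on the PATH; the throwaway HOME directory and the placeholder name and email are assumptions made only so the example does not touch your real settings.

```shell
set -eu
# Assumption: use a throwaway HOME so the real ~/.gitconfig is left untouched.
export HOME="$(mktemp -d)"

git --version                                   # fails if git is not installed

git config --global user.name "Your Name"       # placeholder values
git config --global user.email "your_name@example.com"

# Read the values back to confirm they were stored.
configured_name="$(git config --global --get user.name)"
configured_email="$(git config --global --get user.email)"
echo "$configured_name <$configured_email>"
```

On your own machine you would skip the HOME line and simply run the two `git config --global --get` commands.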

Gitlab Instance
We have created our own GitLab instance (more on what GitLab is later), which is hosted at
https://fpc-git.upc.es/ . You should have received an invitation to log in and change your password.
Just click on the following link and define a new password:

Once you have changed your password, log in, and you will see an empty screen with several
options (Create Project, Create Group…). Move to your profile settings, in the upper
right section of the screen.

Once there, move to the SSH keys option on the left menu:

Here, you can add your public SSH key, which lets you push to the repository without
having to enter your password each time. If you are on a Linux or macOS system, your public key
is usually located at ~/.ssh/id_rsa.pub, and if you are on Windows, at
/Users/you/.ssh/id_rsa.pub. If you have not generated a key yet, you can follow the SSH
Tutorial to create one for your system.

Open the id_rsa.pub file with your editor of choice, and paste its contents into the field for the
key. Then you can assign a name to this specific key and set an expiration date (not necessary for
now). Once you are done, press the “Add Key” button to have the key included.

Once the key is included in your profile, you will be able to work with the Gitlab
instance directly without authenticating at each step. Take into account that, depending on your
system configuration, you may be asked for the passphrase of your SSH key every time it is used. If
that is the case on macOS, you can fix it by editing (or creating, if it does not exist) the file
~/.ssh/config and adding the following content (UseKeychain is a macOS-specific option):

Host *
UseKeychain yes

First Steps
We will start this tutorial with the simplest exercise. Go to the shell, create a directory and
initialize it as a git repository:
mkdir myrepo
cd myrepo
git init

This initializes the directory as a Git repository. We can check the current status of the
repository using the following command:
git status

On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)

We can see here that we are on branch master (more on that later), there are no
commits yet (more on that later as well), and there is nothing to commit. It is time to add something
to this repository:
touch myfile.txt
git add ./myfile.txt
git status

On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: myfile.txt

We have created a file called myfile.txt and we have added it (in Git this is called staging).
Staging prepares changes that we have not yet “committed”. If we make additional changes
afterwards, we can discard just those, or discard the staged ones as well. Once we commit, the
changes become a specific point in history that we can go back to whenever needed (it stays there
unless history is rewritten, for example with a rebase, which we do not need to worry about for
now). Let’s do it:
git commit -m "Added myfile.txt"

[master (root-commit) 0d1b717] Added myfile.txt
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 myfile.txt

We can check that there are no pending changes to be committed by using git status again. We
should get something like the following:
git status
On branch master
nothing to commit, working tree clean
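
The untracked, staged and committed states described above can also be observed in a script. This is a minimal sketch, assuming a scratch directory created with mktemp and a placeholder identity; `git status --porcelain` is used instead of the long output because its short format is stable and easy to compare.

```shell
set -eu
# Assumption: work in a throwaway directory so nothing existing is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name used in this tutorial
git config user.name "Example User"          # placeholder identity, local to this repository
git config user.email "user@example.com"

touch myfile.txt
untracked="$(git status --porcelain)"        # the file exists but is not tracked yet
git add myfile.txt
staged="$(git status --porcelain)"           # the file is staged for the next commit
git commit -q -m "Added myfile.txt"
committed="$(git status --porcelain)"        # empty output means a clean working tree

echo "untracked: $untracked"
echo "staged:    $staged"
echo "committed: ${committed:-<clean>}"
```

In the short format, `??` marks an untracked file and `A` in the first column marks a newly staged one.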

But the main purpose of a source control system is to keep track of the changes that you
make to the source files. Open myfile.txt with your favourite text editor and
add a line with the following content:
This is a new line of content for the file

If we try now to check the current status of the repository, we will get the following message
describing the existence of changes on the file:
git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")

Now we could stage the file and then commit it, as we did before, using git add and git commit,
but for files that are already tracked both steps can be done at once by passing the -a flag to git commit:
git commit -a -m "Added first line to myfile.txt"
[master b55b0d5] Added first line to myfile.txt
1 file changed, 1 insertion(+)
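
One detail worth seeing in practice is that `git commit -a` only picks up files Git already tracks. The sketch below, run in a scratch repository with hypothetical file names (tracked.txt, untracked.txt), shows a brand-new file being left out of such a commit.

```shell
set -eu
# Assumption: scratch repository and placeholder identity, as in a throwaway test.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "second" >> tracked.txt                  # modify a file that is already tracked
echo "new" > untracked.txt                    # create a file Git has never seen

git commit -q -a -m "Update tracked.txt"      # -a stages tracked files only
remaining="$(git status --porcelain)"         # what is still pending after the commit
echo "$remaining"
```

The new file still needs an explicit `git add` before it can be committed.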

We can create another file and add it to the repository, just to have a bit more fun. And to
make it real fun, we will do it in a subdirectory. So, we start by doing the following:
mkdir subdir1
touch subdir1/asubdirfile.txt

Then, we edit the file to add the following content:

This is the content of the subdir file

We can check the status to see that the new subdirectory is reported as untracked. Then
we “stage” the file using the git add command and commit the change using the git commit
command that we have used previously, so we get this output:
git status
On branch master
Untracked files:
(use "git add <file>..." to include in what will be committed)
subdir1/

nothing added to commit but untracked files present (use "git add" to track)

git add ./subdir1/asubdirfile.txt
git commit -m "Added first line to asubdirfile.txt"
[master 0c2280f] Added first line to asubdirfile.txt
1 file changed, 1 insertion(+)
create mode 100644 subdir1/asubdirfile.txt

So, what is the real use of all this? Now that we have made several changes, we
can check the history of the repository by doing the following:
git log
commit 0c2280fe8f625c986fea6ef8a7066ff4b5cb363c (HEAD -> master)
Author: Jordi Montornés Solé <[email protected]>
Date: Tue Sep 15 15:40:05 2020 +0200

Added first line to asubdirfile.txt

commit b55b0d5e29d30391225a05fefe251a7fa912ccda
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:55:53 2020 +0200

Added first line to myfile.txt

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt

This is a linear history of the repository, listing all the different commits we have done on the
repository. We can see that on the latest one we have a message telling us it’s the current HEAD for
the master branch (more on that later). So we can represent that history as:

Not only is the history of the full repository available; we can also check the history of a
single file. This is done using the same git log command we used previously:
git log myfile.txt
commit b55b0d5e29d30391225a05fefe251a7fa912ccda
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:55:53 2020 +0200

Added first line to myfile.txt

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt

This enables us to get a precise list of the changes that have affected a single file, when they
were applied, and who is the author of the change. We can even see the diff of the different commits
using the following command:
git log -p myfile.txt
commit b55b0d5e29d30391225a05fefe251a7fa912ccda
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:55:53 2020 +0200

Added first line to myfile.txt

diff --git a/myfile.txt b/myfile.txt
index e69de29..7c55baa 100644
--- a/myfile.txt
+++ b/myfile.txt
@@ -0,0 +1 @@
+This is a new line of content for the file

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt

diff --git a/myfile.txt b/myfile.txt
new file mode 100644
index 0000000..e69de29
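
For longer histories, a more compact view is often easier to read. The following sketch rebuilds the three commits of this section in a scratch repository (the identity is a placeholder) and shows `git log --oneline`, the same command limited to one file, and `git rev-list --count` as a way of counting commits.

```shell
set -eu
# Assumption: scratch repository that replays the commits made so far.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

touch myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"
echo "This is a new line of content for the file" >> myfile.txt
git commit -q -a -m "Added first line to myfile.txt"
mkdir subdir1
echo "This is the content of the subdir file" > subdir1/asubdirfile.txt
git add subdir1/asubdirfile.txt
git commit -q -m "Added first line to asubdirfile.txt"

git --no-pager log --oneline                  # one line per commit, newest first
git --no-pager log --oneline -- myfile.txt    # only the commits that touched myfile.txt
git --no-pager log -p -n 1 -- myfile.txt      # latest change to the file, with its diff

all_commits="$(git rev-list --count HEAD)"
file_commits="$(git rev-list --count HEAD -- myfile.txt)"
echo "commits in total: $all_commits, touching myfile.txt: $file_commits"
```

The `--` before the path tells git that what follows is a file name and not a branch or commit.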

Changes that we have not yet committed are quite easy to get rid of. Open
myfile.txt with a text editor and add a new line with the following content:
Another change

If we check the status of the repository, we see that the changes to the file have been detected,
and we get some hints about how we can get rid of them:
git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")

So, if we use the git restore command on the file, we discard the changes (git restore itself
prints nothing, so we check the result with git status):
git restore myfile.txt
git status
On branch master
nothing to commit, working tree clean

If the changes are already staged, we first have to unstage them with the git reset or git restore
--staged commands. Repeat the same steps, this time staging the change: add some content to
myfile.txt and stage it using git add:
git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")
git add ./myfile.txt
git status
On branch master
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: myfile.txt

Now we unstage the changes with git reset, and then we discard them using the git restore command:
git reset HEAD ./myfile.txt
git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt
git restore myfile.txt
git status
On branch master
nothing to commit, working tree clean
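
The same round trip can be done with `git restore` alone, which needs Git 2.23 or later. This is a sketch in a scratch repository with a placeholder identity; it records the short status after each step so the staged, unstaged and clean states can be compared.

```shell
set -eu
# Assumption: scratch repository with one committed file.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "This is a new line of content for the file" > myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"

echo "Another change" >> myfile.txt
git add myfile.txt
staged="$(git status --porcelain)"            # change is in the index

git restore --staged myfile.txt               # unstage, keep the edit in the file
unstaged="$(git status --porcelain)"          # change is only in the working tree

git restore myfile.txt                        # discard the edit itself
clean="$(git status --porcelain)"             # nothing left

echo "staged:   [$staged]"
echo "unstaged: [$unstaged]"
echo "clean:    [$clean]"
```

In the short format the position of the `M` tells the two states apart: first column for staged, second column for not staged.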

Other really useful operations are moving and deleting files. We will practice both of
them using the git mv and git rm commands. Create a new file and play with it:
touch anotherfile.txt
git add anotherfile.txt
git status
On branch master
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: anotherfile.txt
git commit -m "Added anotherfile"
[master 245a152] Added anotherfile
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 anotherfile.txt
git mv anotherfile.txt newnamefile.txt
git status
On branch master
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
renamed: anotherfile.txt -> newnamefile.txt

git commit -m "Renamed anotherfile"
[master cdbe683] Renamed anotherfile
1 file changed, 0 insertions(+), 0 deletions(-)
rename anotherfile.txt => newnamefile.txt (100%)

git rm newnamefile.txt
rm 'newnamefile.txt'

git status
On branch master
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: newnamefile.txt

git commit -m "deleted newnamefile"
[master 66d9c89] deleted newnamefile
1 file changed, 0 insertions(+), 0 deletions(-)
delete mode 100644 newnamefile.txt

Travelling Across Time and Realities
Another useful perk we get by using a source control system is the ability to “go back in time”
and return to a previous state of the repository. If we do a simple listing of the current directory, we’ll
get something similar to this:
ls -lrt
total 8
-rw-r--r--@ 1 jordimontornes staff 43 Sep 14 09:43 myfile.txt
drwxr-xr-x@ 3 jordimontornes staff 96 Sep 14 10:03 subdir1

Now we will try to travel to a previous state. This can be done using the git checkout
command with the right commit hash. Let’s use the hash of the first commit (use your own; the one
shown here is the example from the tutorial):
git checkout 512cd0ede2a7d421a739b234981d48ac49221a60
Note: switching to '512cd0ede2a7d421a739b234981d48ac49221a60'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

git switch -c <new-branch-name>

Or undo this operation with:

git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 512cd0e Added myfile.txt

We are now in a previous state of the repository with a detached HEAD (more on that later).
So, if we do a simple listing of the repository we get the following:
ls -lrt
total 0
-rw-r--r--@ 1 jordimontornes staff 0 Sep 16 12:13 myfile.txt

We are in the state we were in after the first commit, with only myfile.txt present in the
repository. If we check the status, we can confirm that we are currently on a
detached HEAD:
git status
HEAD detached at 512cd0e
nothing to commit, working tree clean

This situation can be represented by the following drawing:

We can return to the “normal” situation by switching back to the master branch. You can do
that using git checkout with the master branch:
git checkout master
Switched to branch 'master'
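
Instead of copying the hash by hand, the first commit can be looked up with `git rev-list --max-parents=0 HEAD`, and `git symbolic-ref` can tell whether HEAD is detached. The sketch below is an illustration in a scratch repository with a placeholder identity, not part of the lab material.

```shell
set -eu
# Assumption: scratch repository with two commits on master.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

touch myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"
echo "This is a new line of content for the file" >> myfile.txt
git commit -q -a -m "Added first line to myfile.txt"

first_commit="$(git rev-list --max-parents=0 HEAD)"   # the root commit of the history
git checkout -q "$first_commit"                        # travel back: HEAD is now detached

# symbolic-ref fails when HEAD points at a commit instead of a branch.
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi
# At the first commit the file was still empty.
if [ -s myfile.txt ]; then past_file=non-empty; else past_file=empty; fi

git checkout -q master                                 # back to the present
current_branch="$(git symbolic-ref --short HEAD)"
echo "detached: $detached, file in the past: $past_file, now on: $current_branch"
```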

So, what is all this “branch” thing about? Git allows us to “branch” our repository and have
several different versions of the source code. This is especially useful when you have several
developers working on the same code base and several features are being created at the same time. So,
we’re always working on a “branch”, and by default, the initial branch is called master. We can create
new branches, work on them, merge them back to master afterwards, etc. Play a bit in order to
understand the different concepts involved. We’ll start by creating a branch with the git branch
command:
git branch first_branch
git branch
  first_branch
* master
git checkout first_branch
Switched to branch 'first_branch'
git branch
* first_branch
  master

In these last commands, we have created a new branch called first_branch, listed the
existing branches and switched to the new branch. Let’s make some changes to the contents of
myfile.txt and commit them. Edit the file and add a line with the following content:
This line only exists on the first_branch

And now we commit this change:
git status
On branch first_branch
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")
git commit -a -m "Added line on myfile.txt on the first_branch"
[first_branch 77b503f] Added line on myfile.txt on the first_branch
1 file changed, 1 insertion(+)
git status
On branch first_branch
nothing to commit, working tree clean

The interesting thing happens when we display the history of the repository as we did in
previous steps. We get this response:
git log
commit 77b503f9155ac74b134b20de9d0befec6c21f0c1 (HEAD -> first_branch)
Author: Jordi Montornés Solé <[email protected]>
Date: Fri Sep 18 11:21:05 2020 +0200

Added line on myfile.txt on the first_branch

commit 66d9c89fc6bfdc660fa4c1614ac64b394ecde3cf (master)
Author: Jordi Montornés Solé <[email protected]>
Date: Thu Sep 17 11:50:59 2020 +0200

...
...
...

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt
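
The relationship between the two branches can also be drawn by git itself with `git log --graph --all`. The following sketch recreates a shortened version of the situation (one commit on master, one more on first_branch) in a scratch repository; the identity and the shortened history are assumptions for illustration.

```shell
set -eu
# Assumption: scratch repository with a shortened history.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "This is a new line of content for the file" > myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"

git branch first_branch
git checkout -q first_branch
echo "This line only exists on the first_branch" >> myfile.txt
git commit -q -a -m "Added line on myfile.txt on the first_branch"

# Text drawing of every branch and the commits they point to.
git --no-pager log --oneline --graph --all --decorate

# Commits reachable from one branch but not from the other.
ahead="$(git rev-list --count master..first_branch)"
behind="$(git rev-list --count first_branch..master)"
echo "first_branch is $ahead ahead of and $behind behind master"
```

The `A..B` range reads as "commits in B that are not in A", which is why the two counts differ.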

The current repository situation can be understood using the following figure:

As you can see, the head of the master branch is still on the 6th commit, and the current
HEAD is on the head of first_branch, that is, on the 7th commit. We can do even more interesting
things, so we’ll simulate another developer starting to work on the codebase. Switch back to master,
create a new branch and switch to it:

git checkout master
Switched to branch 'master'
git branch second_branch
git checkout second_branch
Switched to branch 'second_branch'
git branch
first_branch
master
* second_branch

Now we have moved to second_branch. If we open
myfile.txt, we can see that the changes we made on first_branch are not there. Make some different
changes to the contents of myfile.txt and commit them. Edit the file and add a line with the following content:
This line only exists on the second_branch

And now we commit the change:


git status
On branch second_branch
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")
git commit -a -m "Adding line to on myfile.txt on the second_branch"
[second_branch cfbf946] Adding line to on myfile.txt on the second_branch
1 file changed, 1 insertion(+)
git status
On branch second_branch
nothing to commit, working tree clean

Displaying the history of the repository gives us an insight into the current situation of
the repository:
git log
commit cfbf9469ff62c6db48ea95ab387adeb9b6a76cee (HEAD -> second_branch)
Author: Jordi Montornés Solé <[email protected]>
Date: Fri Sep 18 12:16:37 2020 +0200

Adding line to on myfile.txt on the second_branch

commit 66d9c89fc6bfdc660fa4c1614ac64b394ecde3cf (master)
Author: Jordi Montornés Solé <[email protected]>
Date: Thu Sep 17 11:50:59 2020 +0200

deleted newnamefile



commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt

Note that there is no mention of first_branch here, as it is not a
predecessor of the current branch, but it still exists, as we have seen when using the git branch command.
The current situation can be represented as this:

So, imagine that the developer working on second_branch decides that their work is
done. We need a way to integrate the changes back into master. This is done using the git merge
command. So we will go back to master and merge the changes into it:

git checkout master
Switched to branch 'master'
git merge second_branch
Updating 66d9c89..cfbf946
Fast-forward
myfile.txt | 1 +
1 file changed, 1 insertion(+)
git log
commit cfbf9469ff62c6db48ea95ab387adeb9b6a76cee (HEAD -> master, second_branch)
Author: Jordi Montornés Solé <[email protected]>
Date: Fri Sep 18 12:16:37 2020 +0200

Adding line to on myfile.txt on the second_branch



...

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt

Looking at the log output, we can see that master now points to the last commit, just as
second_branch still does. This merge has been quite clean (note the “Fast-forward” message)
because there were no further changes on master since the branch diverged. It won’t be as nice if
additional changes have already been integrated into master. Let’s take a look at how the situation
of the repository can be represented:
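
Whether a merge can be a fast-forward can be checked before merging: it is possible when the current branch is an ancestor of the branch being merged. This sketch, in a scratch repository with a placeholder identity, uses `git merge-base --is-ancestor` for the check and `git merge --ff-only`, which refuses to do anything other than a fast-forward.

```shell
set -eu
# Assumption: scratch repository where master has no commits of its own after branching.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "This is a new line of content for the file" > myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"

git checkout -q -b second_branch
echo "This line only exists on the second_branch" >> myfile.txt
git commit -q -a -m "Adding line on myfile.txt on the second_branch"
git checkout -q master

# master is an ancestor of second_branch, so the merge only has to move the pointer.
if git merge-base --is-ancestor master second_branch; then can_ff=yes; else can_ff=no; fi

git merge -q --ff-only second_branch
master_tip="$(git rev-parse master)"
branch_tip="$(git rev-parse second_branch)"
echo "fast-forward possible: $can_ff"
echo "master: $master_tip"
echo "second_branch: $branch_tip"
```

After a fast-forward both branch names point to the same commit and no merge commit is created.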

We can clean up the repository by deleting second_branch. This can be done using the git
branch -d command. This is a nice thing to do in order to have a tidy repository, maintaining only the
master branch in the long term and deleting the branches already merged (although other ways of
working keep several permanent branches, such as git-flow with its master branch and development
branch):
git branch
first_branch
* master
second_branch
git branch -d second_branch
Deleted branch second_branch (was cfbf946).
git branch
  first_branch
* master

And now, the topology of the repository can be described as in the next figure:

So now we will move to a more complicated situation. We now want to merge
first_branch into master. This will be more difficult because master already contains conflicting
changes in the same file. Let’s see what happens:
git branch
  first_branch
* master
git merge first_branch
Auto-merging myfile.txt
CONFLICT (content): Merge conflict in myfile.txt
Automatic merge failed; fix conflicts and then commit the result.
git status
On branch master
You have unmerged paths.
(fix conflicts and run "git commit")
(use "git merge --abort" to abort the merge)

Unmerged paths:
(use "git add <file>..." to mark resolution)
both modified: myfile.txt

no changes added to commit (use "git add" and/or "git commit -a")

We have a merge conflict because the changes that have been applied to myfile.txt on
different branches overlap, so git is not able to resolve it by itself. We now need to fix the
resulting file manually. Open the file in a text editor, and we’ll see the following content:
This is a new line of content for the file
<<<<<<< HEAD
This line only exists on the second_branch
=======
This line only exists on the first_branch
>>>>>>> first_branch

What does it mean? The conflict markers delimit the change coming from HEAD and
the one coming from first_branch. We can decide what we want to keep, and then we can
commit the final result. Let’s modify the content to have this:
This is a new line of content for the file
This line comes from the second_branch
This line comes from the first_branch

Now we stage and commit the changes we have made (we are using “.”, which
stages all the changes in the current directory and its subdirectories). Then the merge is completed.
git add .
git commit -m "Solved conflicts on myfile.txt"
[master b29101d] Solved conflicts on myfile.txt
git status
On branch master
nothing to commit, working tree clean
git log
commit b29101deaaa1b12235a44222b5212d37ab3e1299 (HEAD -> master)
Merge: cfbf946 77b503f
Author: Jordi Montornés Solé <[email protected]>
Date: Thu Sep 24 12:54:38 2020 +0200

Solved conflicts on myfile.txt

commit cfbf9469ff62c6db48ea95ab387adeb9b6a76cee
Author: Jordi Montornés Solé <[email protected]>
Date: Fri Sep 18 12:16:37 2020 +0200

Adding line to on myfile.txt on the second_branch

commit 77b503f9155ac74b134b20de9d0befec6c21f0c1 (first_branch)
Author: Jordi Montornés Solé <[email protected]>
Date: Fri Sep 18 11:21:05 2020 +0200

Added line on myfile.txt on the first_branch

.
.
.

commit 512cd0ede2a7d421a739b234981d48ac49221a60
Author: Jordi Montornés Solé <[email protected]>
Date: Mon Sep 14 09:28:03 2020 +0200

Added myfile.txt
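
The whole conflict scenario can be reproduced in a few lines, which is handy for practising. This is a sketch in a scratch repository with a placeholder identity and a shortened history; the resolution simply keeps both lines, as was done by hand above.

```shell
set -eu
# Assumption: scratch repository where master and first_branch edit the same place.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "This is a new line of content for the file" > myfile.txt
git add myfile.txt
git commit -q -m "Added myfile.txt"
git branch first_branch

echo "This line only exists on the second_branch" >> myfile.txt
git commit -q -a -m "Change on master"
git checkout -q first_branch
echo "This line only exists on the first_branch" >> myfile.txt
git commit -q -a -m "Change on first_branch"
git checkout -q master

# git merge exits with a non-zero status when it stops on a conflict.
if git merge first_branch >/dev/null 2>&1; then merge_result=clean; else merge_result=conflict; fi
conflicted="$(git diff --name-only --diff-filter=U)"   # files with unresolved conflicts

# Resolve by writing the content we want to keep, then stage and commit.
printf '%s\n' \
  "This is a new line of content for the file" \
  "This line comes from the second_branch" \
  "This line comes from the first_branch" > myfile.txt
git add myfile.txt
git commit -q -m "Solved conflicts on myfile.txt"

# A merge commit records two parents.
parents="$(git cat-file -p HEAD | grep -c '^parent ')"
echo "merge: $merge_result, conflicted: $conflicted, parents: $parents"
```

While a merge is stopped on a conflict, `git merge --abort` returns the repository to the state before the merge.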

And we can describe the current state of the repository using the following graph:

Working With Remotes
Creating a remote repository
Until now, we have been working locally on a single machine. But people usually work with
Source Control Systems in a context that involves several machines in different
locations. A repository can point to several different “remotes”, but in a typical use case we will
have a single remote for each repository.

Usually these remotes are hosted on a server, and in most cases they are managed by
repository hosting software such as GitHub, GitLab or Bitbucket. All of them work in quite a similar
way, but we will use GitLab in this tutorial, as it is the platform that will be used in the lab sessions.

You should have received an account for our GitLab instance. Once you log in, you will see
a list of the repositories you currently have access to or, if you do not have any, an empty screen
with several options. Click on the “Create Project” button.

On the next screen you can define the name of the project, whether the
repository URL will be under your username or under one of your groups, the description of the project,
the visibility level (private, internal or public) and whether you want to initialize the repository with a
README (we do not). Fill in the form with the following data:

We press the “Create Project” button, and we land on the empty project page, with some
instructions for pushing data to our repository. We’ll now push the repository we used in the previous
section to this project and set this GitLab instance as its remote.

First of all, open the shell and navigate to the base directory of the repository. Once there, we will
use the git remote add command to point it to the project we created:
git status
On branch master
nothing to commit, working tree clean
git remote add origin [email protected]:jordi.montornes/gittutorial.git
git push -u origin --all
The authenticity of host 'fpc-git.upc.es (147.83.53.37)' can't be established.
ECDSA key fingerprint is SHA256:03pZZF6zPFseESxo0ybP2/GoJGI0dGMeaF6ZgOKbjMw.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'fpc-git.upc.es,147.83.53.37' (ECDSA) to the list of known
hosts.
Enter passphrase for key '/Users/jordimontornes/.ssh/id_rsa':
Enumerating objects: 24, done.
Counting objects: 100% (24/24), done.
Delta compression using up to 8 threads
Compressing objects: 100% (18/18), done.
Writing objects: 100% (24/24), 2.50 KiB | 1.25 MiB/s, done.
Total 24 (delta 3), reused 0 (delta 0)
remote:
remote: To create a merge request for first_branch, visit:
remote:
https://fpc-git.upc.es/jordi.montornes/gittutorial/-/merge_requests/new?merge_request%5Bsource_branch%5D=first_branch
remote:
To fpc-git.upc.es:jordi.montornes/gittutorial.git
* [new branch] first_branch -> first_branch
* [new branch] master -> master
Branch 'first_branch' set up to track remote branch 'first_branch' from 'origin'.
Branch 'master' set up to track remote branch 'master' from 'origin'.
git push -u origin --tags
Enter passphrase for key '/Users/jordimontornes/.ssh/id_rsa':
Everything up-to-date
git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

Do not worry about all the output we got from the previous commands. Most of it is
only information about this being the first time the SSH key has been used to connect to that machine,
plus some additional information about how to create merge requests, etc. If you go back to your
project homepage and reload it in the browser, you will see that the contents of the repository are now
listed there:

Pull, Fetch and Push
When working with remote repositories we have to deal with additional operations that
move the changes to and from the remotes. These are the Pull, Fetch and Push operations.
● Fetch: This operation downloads the missing commits and metadata from the remote and
incorporates them into the local repository. This is really useful to have up-to-date references to
remote branches locally. This action does NOT change the source code in the working
directory, so remote changes are not yet applied to our local branches.
● Pull: Similar to fetch. When doing a Pull, git fetches the remote data and
afterwards tries to merge any remote changes that have been received into the current branch. If
there is any conflict, it has to be resolved manually.
● Push: This operation sends the local commits from our repository to the
remote one. If the local repository has not been synchronized with the remote one, the
operation can fail when there are additional changes on the remote that we have not
retrieved.
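
The difference between Fetch and Pull can be tried out without a server. In the sketch below a local bare repository stands in for the GitLab remote, and two working copies named alice and bob (hypothetical names, as are the identities) play the role of two developers.

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

# Assumption: a bare repository on disk plays the role of the remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# First working copy: create a commit and publish it.
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice config user.name "Alice Example"
git -C alice config user.email "alice@example.com"
echo "first" > alice/myfile.txt
git -C alice add myfile.txt
git -C alice commit -q -m "Added myfile.txt"
git -C alice remote add origin "$work/remote.git"
git -C alice push -q -u origin master

# Second working copy: clone, add a file and push it.
git clone -q "$work/remote.git" bob
git -C bob config user.name "Bob Example"
git -C bob config user.email "bob@example.com"
echo "my content" > bob/thirdfile.txt
git -C bob add thirdfile.txt
git -C bob commit -q -m "Added thirdfile"
git -C bob push -q origin master

# Fetch: origin/master moves forward, local master and the files do not.
git -C alice fetch -q origin
behind_after_fetch="$(git -C alice rev-list --count master..origin/master)"
if [ -e alice/thirdfile.txt ]; then file_after_fetch=present; else file_after_fetch=absent; fi

# Pull: fetch plus merge into the current branch.
git -C alice pull -q --ff-only origin master
behind_after_pull="$(git -C alice rev-list --count master..origin/master)"
if [ -e alice/thirdfile.txt ]; then file_after_pull=present; else file_after_pull=absent; fi

echo "after fetch: $behind_after_fetch behind, thirdfile.txt $file_after_fetch"
echo "after pull:  $behind_after_pull behind, thirdfile.txt $file_after_pull"
```

Fetching first and inspecting `origin/master` is a safe way to see what a pull would bring in before it touches any files.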

Play a bit with it to understand it better. Open the shell and navigate to our repository. We
will create a new file, stage it and commit it. Looking at the status we will notice that we have moved
“ahead” of the remote one:

git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean


echo "my content" > thirdfile.txt
git add .
git commit -m "Added thirdfile"
[master e23cbf2] Added thirdfile
1 file changed, 1 insertion(+)
create mode 100644 thirdfile.txt
git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)

nothing to commit, working tree clean

As you will notice, we are getting the message that we are ahead of the remote. We can see
the current situation with the following diagram:

If we want to sync our state with the remote, we have to push the changes. However, as a
good practice, it is always better to do a Pull before the Push, in order to retrieve any change coming
from the remote. This is really important when you are working with other colleagues, as additional
changes may have been pushed in the meantime. So, we will do:
git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)

nothing to commit, working tree clean


git pull origin master
From fpc-git.upc.es:jordi.montornes/gittutorial
* branch master -> FETCH_HEAD
Already up to date.
git push origin master
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 327 bytes | 327.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To fpc-git.upc.es:jordi.montornes/gittutorial.git
b29101d..e23cbf2 master -> master
git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean

Now the situation is as shown in the following diagram:

We are now going to simulate what happens when other people make changes on the remote. We
could do that with another local repository on our machine, but we will do it in an easier way,
editing and committing from the web interface. Just open the project homepage in your browser,
and click on thirdfile.txt.

Then you can click on the edit button and start editing the content of the file:

We will add a new line to the file with the content "This line has been edited on the gitlab
frontend.", write a commit message, and commit to the target branch master:

If we open the shell and navigate to the repository, we will get the following output when
using the git status command:
git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

We do not see any reference to the new commit we made remotely, because we have not retrieved
the remote data yet. Now open thirdfile.txt in an editor and add the following line: "This line
has been edited on local repository". Afterwards, check the status, then stage and commit the file:
git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: thirdfile.txt

no changes added to commit (use "git add" and/or "git commit -a")
git add .
git commit -m "Added new line to third file.txt locally"
[master 248c1bf] Added new line to third file.txt locally
1 file changed, 1 insertion(+)
git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)

nothing to commit, working tree clean

git log
commit 248c1bffb867b507abad8b0d37d5699ed1bc2370 (HEAD -> master)
Author: Jordi Montornes <[email protected]>
Date: Sun Sep 27 11:07:31 2020 +0200

Added new line to third file.txt locally

commit e23cbf2ceb17c05f37e2d01dca53dd5ef5148fd7 (origin/master)
Author: Jordi Montornes <[email protected]>
Date: Sat Sep 26 13:02:49 2020 +0200

Added thirdfile

commit b29101deaaa1b12235a44222b5212d37ab3e1299
Merge: cfbf946 77b503f
Author: Jordi Montornés Solé <[email protected]>
Date: Thu Sep 24 12:54:38 2020 +0200

Solved conflicts on myfile.txt

.
.
.

Now we will run the git fetch command to retrieve the remote data, and then check the status
to see the current situation:
git fetch origin master
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From fpc-git.upc.es:jordi.montornes/gittutorial
* branch master -> FETCH_HEAD
e23cbf2..ed49ccf master -> origin/master
git status
On branch master
Your branch and 'origin/master' have diverged,
and have 1 and 1 different commits each, respectively.
(use "git pull" to merge the remote branch into yours)

nothing to commit, working tree clean

We are now getting a message that our branch master and the one on the remote, origin/master,
have diverged. We can see the current status with the following drawing:
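
The same divergence can also be inspected in text form. The sketch below is an illustration under assumptions: it rebuilds the situation offline with a bare repository standing in for the remote, and a second clone (named web here, a hypothetical name) standing in for the edit made on the web interface.

```shell
set -eu
# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME="Lab User" GIT_AUTHOR_EMAIL="lab@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q local
cd local
git remote add origin "$work/remote.git"
echo "my content" > thirdfile.txt
git add .
git commit -q -m "Added thirdfile"
git push -q -u origin master

# Stand-in for the web editor: a second clone pushes one commit.
git clone -q "$work/remote.git" "$work/web"
cd "$work/web"
echo "This line has been edited on the gitlab frontend." >> thirdfile.txt
git commit -q -a -m "Update thirdfile.txt with a new line"
git push -q origin master

# Meanwhile one commit is created locally, then the remote is fetched.
cd "$work/local"
echo "This line has been edited on local repository" >> thirdfile.txt
git commit -q -a -m "Added new line to thirdfile.txt locally"
git fetch -q origin

# First number: commits only on master. Second: commits only on origin/master.
counts=$(git rev-list --left-right --count master...origin/master)
ahead=$(echo "$counts" | awk '{print $1}')
behind=$(echo "$counts" | awk '{print $2}')
echo "ahead by $ahead, behind by $behind"

# Both branches drawn as a graph; the two tips share a common parent.
git --no-pager log --oneline --graph master origin/master
```

The two counts correspond to the "1 and 1 different commits each" reported by git status.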

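As you can see, the 11th commit is different on the remote branch and the local one. So, what
will happen if we try to perform the pull action using the git pull command? (Recent Git versions
may first ask you to choose how to reconcile diverged branches, for example by running
git pull --no-rebase to merge as shown here.)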
git pull origin master
From fpc-git.upc.es:jordi.montornes/gittutorial
* branch master -> FETCH_HEAD
Auto-merging thirdfile.txt
CONFLICT (content): Merge conflict in thirdfile.txt
Automatic merge failed; fix conflicts and then commit the result.
git status
On branch master
Your branch and 'origin/master' have diverged,
and have 1 and 1 different commits each, respectively.
(use "git pull" to merge the remote branch into yours)

You have unmerged paths.
(fix conflicts and run "git commit")
(use "git merge --abort" to abort the merge)

Unmerged paths:
(use "git add <file>..." to mark resolution)
both modified: thirdfile.txt

no changes added to commit (use "git add" and/or "git commit -a")

As with local branches, changing the same lines of a file both locally and remotely produces
a conflict. We will solve it in the same way as before. Opening the file in an editor, we see
the following content:
my content
<<<<<<< HEAD
This line has been edited on the local repository
=======
This line has been edited on the gitlab frontend.
>>>>>>> ed49ccfddd7a26d4f293c823d2c33e1032aaeeb8

We will solve the conflict by editing the file with our favorite editor so that its content
becomes the following:
my content
This line has been added on the local repository
This line has been added on the gitlab frontend.

Now we will stage and commit the resolved file (the -a option stages modified tracked files
before committing):


git commit -a -m "Solved merge conflicts"
[master e0d9464] Solved merge conflicts
git status
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
(use "git push" to publish your local commits)

nothing to commit, working tree clean
git log
commit e0d9464b372e5d821701b03bb1abad76f9417082 (HEAD -> master)
Merge: 248c1bf ed49ccf
Author: Jordi Montornes <[email protected]>
Date: Sun Oct 4 11:40:45 2020 +0200

Solved merge conflicts

commit 248c1bffb867b507abad8b0d37d5699ed1bc2370
Author: Jordi Montornes <[email protected]>
Date: Sun Sep 27 11:07:31 2020 +0200

Added new line to third file.txt locally

commit ed49ccfddd7a26d4f293c823d2c33e1032aaeeb8 (origin/master)
Author: Jordi Montornés <[email protected]>
Date: Sun Sep 27 10:54:46 2020 +0200

Update thirdfile.txt with a new line

commit e23cbf2ceb17c05f37e2d01dca53dd5ef5148fd7
Author: Jordi Montornes <[email protected]>
Date: Sat Sep 26 13:02:49 2020 +0200

Added thirdfile
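
The whole cycle above (diverge, pull, conflict, resolve, commit) can be reproduced offline. This is a minimal sketch under assumptions, not the lab environment: a bare repository stands in for the remote, the clone named web is a hypothetical stand-in for the web editor, and --no-rebase is passed explicitly so that the pull merges on any Git version.

```shell
set -eu
# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME="Lab User" GIT_AUTHOR_EMAIL="lab@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q local
cd local
git remote add origin "$work/remote.git"
echo "my content" > thirdfile.txt
git add .
git commit -q -m "Added thirdfile"
git push -q -u origin master

# Remote side: a second clone appends a line and pushes it.
git clone -q "$work/remote.git" "$work/web"
cd "$work/web"
echo "This line has been edited on the gitlab frontend." >> thirdfile.txt
git commit -q -a -m "Update thirdfile.txt with a new line"
git push -q origin master

# Local side: a different line is appended at the same position.
cd "$work/local"
echo "This line has been edited on the local repository" >> thirdfile.txt
git commit -q -a -m "Added new line to thirdfile.txt locally"

# The pull stops with a conflict, so a non-zero exit status is expected here.
git pull -q --no-rebase origin master >/dev/null 2>&1 || true

# Files that are still in conflict are selected by the "unmerged" filter.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by writing the desired final content, then stage and commit.
printf '%s\n' "my content" \
  "This line has been added on the local repository" \
  "This line has been added on the gitlab frontend." > thirdfile.txt
git add thirdfile.txt
git commit -q -m "Solved merge conflicts"

# The new commit is a merge commit: it has two parents.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
remaining=$(git diff --name-only --diff-filter=U)
echo "parents of the merge commit: $parent_count"
```

Staging the file explicitly with git add is equivalent here to the commit -a used above, since thirdfile.txt is the only modified file.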

We now have a new commit in our local repository that resolves the conflicts we had. We can
represent the current situation with the following graph:

Finally, we can just push our changes to the remote repository to have both of them
synchronized.
git status
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
(use "git push" to publish your local commits)

nothing to commit, working tree clean


git push origin master
Enumerating objects: 10, done.
Counting objects: 100% (10/10), done.
Delta compression using up to 8 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 665 bytes | 665.00 KiB/s, done.
Total 6 (delta 2), reused 0 (delta 0)
To fpc-git.upc.es:jordi.montornes/gittutorial.git
ed49ccf..e0d9464 master -> master
git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

The final status of our repository can be represented in the following figure:

Additional Knowledge
When working with remotes, there are many additional concepts you should probably know in
a real working environment, but we will not cover them in depth here, as they are not needed
for the lab sessions.

Merge Requests
A widely used concept is the “Merge Request” (called “Pull Request” on some platforms). This
way of working has become a staple for any sizable team working on the same codebase. It is
usually done through Gitlab, GitHub, Bitbucket or another similar platform.

The idea is to have the master branch configured as protected, so that no direct pushes to it
can be made from a local repository. Instead, work is pushed to a different branch and, when it
is complete, a merge request is opened. Then, other members of the team can check the changes,
make comments and suggestions, and finally approve the request. Once approved, it can be merged
into the master branch.

This way of working is really useful for maintaining code quality, as all code is reviewed by
your peers before being integrated into the master branch. Moreover, as all partial development
is done in its own branch, developers can commit their daily work, keeping track of the whole
development process, and later on this work can be integrated into the codebase in a single step.
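
On the command line, the part of this workflow that precedes the merge request looks roughly as follows. This is a sketch under assumptions: a bare repository in a temporary directory stands in for the platform, the branch name feature/readme is hypothetical, and the request itself is opened and approved in the web interface, which is not shown.

```shell
set -eu
# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME="Lab User" GIT_AUTHOR_EMAIL="lab@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q dev
cd dev
git remote add origin "$work/remote.git"
echo "my content" > thirdfile.txt
git add .
git commit -q -m "Added thirdfile"
git push -q -u origin master

# Work happens on its own branch instead of on master.
git checkout -q -b feature/readme
echo "Project notes" > README.txt
git add README.txt
git commit -q -m "Add README"

# Publish the branch; -u records the remote branch as its upstream.
git push -q -u origin feature/readme

# The changes a reviewer would see: what the branch adds on top of master.
git --no-pager diff --stat master...feature/readme

# The remote master is untouched until the request is approved and merged.
remote_master=$(git rev-parse origin/master)
local_master=$(git rev-parse master)
pushed_ref=$(git ls-remote --heads origin feature/readme | awk '{print $2}')
echo "published: $pushed_ref"
```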

An example diff screen from a Merge Request

Rebase
Rebase can be used instead of merge when updating from remote repositories. Instead of merging
the changes coming from the remote with the local ones, Git first moves the branch to the updated
remote state, setting your local commits aside, and then re-applies those commits one by one on
top of it.

One characteristic of rebase is that it “rewrites” the history of the branch, so you have to
be careful when rebasing branches that have already been published remotely. Also, if the local
history has diverged a lot from the remote one, the rebase process can be long and cumbersome.
On the other hand, as an advantage, the history of the repository stays linear, which makes it
easier to follow and to revert a single change. You can get more information about the pros and
cons here:
https://www.atlassian.com/git/tutorials/merging-vs-rebasing
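
The effect of rebasing while pulling can be observed with a small offline experiment. The sketch below is illustrative only: it assumes a bare repository as the remote and a hypothetical second clone named web that publishes a commit. The local and remote commits touch different files, so no conflict occurs.

```shell
set -eu
# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME="Lab User" GIT_AUTHOR_EMAIL="lab@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q local
cd local
git remote add origin "$work/remote.git"
echo "my content" > thirdfile.txt
git add .
git commit -q -m "Added thirdfile"
git push -q -u origin master

# Someone else publishes a commit on the remote.
git clone -q "$work/remote.git" "$work/web"
cd "$work/web"
echo "remote line" >> thirdfile.txt
git commit -q -a -m "Remote commit"
git push -q origin master

# A local commit is created on top of the old state.
cd "$work/local"
echo "local work" > local.txt
git add .
git commit -q -m "Local commit"
before=$(git rev-parse HEAD)

# Fetch, then replay the local commit on top of the updated origin/master.
git pull -q --rebase origin master
after=$(git rev-parse HEAD)

merges=$(git rev-list --merges --count HEAD)   # merge commits in the history
total=$(git rev-list --count HEAD)             # all commits in the history
top=$(git log -1 --format=%s)                  # subject of the newest commit

echo "merge commits: $merges, total commits: $total, newest: $top"
```

The local commit keeps its message but gets a new identifier (before and after differ), which is the history rewriting mentioned above, and no merge commit is created.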

Appendix

Lab session Git cheat sheet


This is a useful cheat sheet to follow in our weekly lab sessions. It summarizes the process
you should follow on a typical lab session:
1. Clone the repository for the current session using git clone [email protected]:......git
2. Make a branch with the group name using git branch group_name and change to it using git
checkout group_name
3. While working, follow this process each time you want to consolidate some changes:
a. Stage the changes using git add .
b. Commit the changes using git commit -m "commit message"
c. Retrieve possible remote changes (if the group is working simultaneously and remotely)
with git pull origin group_name
d. Solve conflicts if needed and commit them.
e. Publish the changes to the remote using git push origin group_name
4. The work will be checked on the corresponding branch by the lecturers.
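
Step 3 of the cheat sheet can be wrapped in a small helper. The following is a sketch, not an official lab script: the function name lab_sync, the branch name group_01 and the temporary bare repository used as remote are all hypothetical, and the helper simply stops if the pull reports conflicts, leaving step d to be done by hand.

```shell
set -eu
# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME="Lab User" GIT_AUTHOR_EMAIL="lab@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# lab_sync BRANCH MESSAGE: stage, commit, pull and push (steps 3a, 3b, 3c, 3e).
lab_sync() {
  branch=$1
  message=$2
  git add .
  # Commit only when something is staged, so the helper can be re-run safely.
  if ! git diff --cached --quiet; then
    git commit -q -m "$message"
  fi
  # Pull only when the branch already exists on the remote.
  if git ls-remote --exit-code --heads origin "$branch" >/dev/null; then
    git pull -q --no-rebase origin "$branch"
  fi
  git push -q origin "$branch"
}

# Demo setup: a bare repository stands in for the session repository.
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q seed
cd seed
git remote add origin "$work/remote.git"
echo "exercise statement" > README.txt
git add .
git commit -q -m "Session material"
git push -q -u origin master

# Steps 1 and 2: clone, create the group branch and switch to it.
git clone -q "$work/remote.git" "$work/lab"
cd "$work/lab"
git branch group_01
git checkout -q group_01

# Step 3, twice: consolidate two rounds of work.
echo "first answer" > answers.txt
lab_sync group_01 "First exercise"
echo "second answer" >> answers.txt
lab_sync group_01 "Second exercise"

local_tip=$(git rev-parse group_01)
remote_tip=$(git ls-remote --heads origin group_01 | awk '{print $1}')
commits_on_branch=$(git rev-list --count master..group_01)
echo "commits published on group_01: $commits_on_branch"
```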

Using Gitlab for deploying to the cluster
For the exercises that need deployment to the cluster, we have created several Pipelines in
Gitlab that will help you upload your code to the Master node. These Pipelines consist of two
stages: build and deploy. The build stage is triggered automatically when code is pushed to the
repository, and the deploy stage can be triggered manually when we want to deploy to the Master
node in the cluster. Be aware that the cluster must be running to be able to deploy to the
Master node, and you will also need the port remapping information you receive by email when the
cluster starts.

1. The first step is to edit the file .gitlab-ci.yml and, in the variables section of the
deploy stage, set our cluster user, password and the port to which port 22 has been
redirected:

2. Once we have that data included, commit and push the change to the repository. If
you go to the CI/CD section on Gitlab, you’ll see that the build stage is running:

3. Once the build stage has finished successfully, you will see the build stage with a
green check and the deploy one with a grey icon.

4. Clicking on the icon, a dropdown list will appear with the steps you can trigger. Click
on the grey play icon to trigger the deploy one. Take into account that you need to
have the cluster running in order to be able to deploy the jar file in the master node.

5. If everything goes right, the stage will also end successfully, and you’ll see a green
check icon.

6. We can open the log of the deploy step and see that the copy to the master node has been
performed successfully:

Once you have the code in the master node, you can execute it as explained in the documentation
of each session. If for any reason you prefer to test your program without pushing code to the
repository, you can still build it locally and copy it to the cluster using any of the available
tools (WinSCP, Nautilus, etc.).
