Version Control Systems

Version control systems track changes to files over time. Centralized systems have a single server repository to which users commit their changes directly. Distributed systems give each user a complete repository of their own: changes are committed locally and then pushed to a shared central repository. The key differences are that distributed systems allow offline work and avoid a single point of failure, while centralized systems provide tighter access control and simpler collaboration workflows. Overall, distributed systems like Git provide more flexibility, while centralized systems like SVN are generally easier to use for smaller teams.

Uploaded by

Ayesha Fayyaz
Copyright
© All Rights Reserved

What is a “version control system”?

 
Version control systems are a category of software tools that record the
changes made to files by keeping track of every modification to the code.
Why is a version control system so important?
A software product is usually developed collaboratively by a group of
developers, who may be in different locations and who each contribute
specific functionality or features. To contribute to the product, they
modify the source code by adding or removing parts of it. A version control
system helps the development team communicate efficiently and track all the
changes made to the source code, along with information such as who made
each change and what was changed. Typically, each contributor works on a
separate branch, and the changes are not merged into the main source code
until they have been reviewed; once they are approved, they are merged.
This not only keeps the source code organized but also improves
productivity by making the development process smoother.
Benefits of the version control system:
a) Speeds up project development by enabling efficient collaboration,
b) Improves productivity and expedites product delivery through better
communication and assistance among team members,
c) Reduces the possibility of errors and conflicts during project development
by making every small change traceable,
d) Lets employees or contributors work on the project from anywhere,
irrespective of their geographical location,
e) Maintains a separate working copy for each contributor, which is not
merged into the main files until it has been validated. Popular examples of
such systems are Git, Helix Core, and Microsoft TFS,
f) Helps in recovery in case of a disaster or other contingency,
g) Tells us who made a change, what was changed, when, and why.
Use of Version Control System:

 A repository: It can be thought of as a database of changes. It contains all
the edits and historical versions (snapshots) of the project.
 Working copy (sometimes called a checkout): It is your personal copy of
all the files in a project. You can edit this copy without affecting the work
of others, and you can commit your changes to the repository when you
are done.
Types of Version Control Systems: 
 
 Local Version Control Systems
 Centralized Version Control Systems
 Distributed Version Control Systems
Local Version Control Systems: This is one of the simplest forms of version
control. It has a database that keeps all the changes to files under revision
control. RCS is one of the most common local VCS tools. It keeps patch sets
(differences between files) in a special format on disk. By adding up all the
patches, it can then re-create what any file looked like at any point in time.
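The idea of re-creating a file by adding up patches can be sketched in a few lines of Python. This is only an illustration: the patch format below (a list of line-range replacements) and the helper names make_patch, apply_patch and checkout are assumptions made for this example, not the on-disk format or commands that RCS actually uses.

```python
import difflib


def make_patch(old_lines, new_lines):
    """Record only what changed between two versions of a file."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    patch = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Meaning: replace old_lines[i1:i2] with new_lines[j1:j2].
            patch.append((i1, i2, new_lines[j1:j2]))
    return patch


def apply_patch(old_lines, patch):
    """Rebuild the newer version from the older one plus a patch."""
    result = list(old_lines)
    # Apply from the end of the file so earlier indices stay valid.
    for i1, i2, replacement in reversed(patch):
        result[i1:i2] = replacement
    return result


def checkout(base, patches, revision):
    """Re-create the file as it looked at the given revision number."""
    lines = list(base)
    for patch in patches[:revision]:
        lines = apply_patch(lines, patch)
    return lines


# Hypothetical history of one small file (revision 0, 1 and 2).
v0 = ["hello"]
v1 = ["hello", "world"]
v2 = ["hello there", "world"]
patches = [make_patch(v0, v1), make_patch(v1, v2)]
```

Only v0 and the two patches need to be stored; checkout(v0, patches, 2) rebuilds the latest version by applying both patches in order.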
Centralized Version Control Systems: Centralized version control systems
contain just one repository, and each user gets their own working copy. You
need to commit to reflect your changes in the repository. Others can then
see your changes by updating.
Two things are required to make your changes visible to others which are: 
 
 You commit
 They update
 

The benefit of CVCS (Centralized Version Control Systems) is that it enables
collaboration amongst developers and gives everyone some insight into what
the others are doing on the project. It also gives administrators
fine-grained control over who can do what.
It has some downsides as well, which led to the development of DVCS. The most
obvious is the single point of failure that the centralized repository
represents: if it goes down, nobody can collaborate or save versioned changes
during that period. And if the hard disk of the central database becomes
corrupted and proper backups haven’t been kept, you lose absolutely everything.
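The two-step visibility rule for centralized systems (you commit, they update) can be modelled with a toy example. The classes CentralRepository and WorkingCopy are hypothetical names invented for this sketch; they do not correspond to the API of SVN or any other real tool.

```python
class CentralRepository:
    """The single shared repository: an ordered list of revisions."""

    def __init__(self):
        self.revisions = []

    def commit(self, author, change):
        # In a centralized system a commit goes straight to the server.
        self.revisions.append((author, change))
        return len(self.revisions)  # the new revision number


class WorkingCopy:
    """One user's checkout; it only knows what it last updated to."""

    def __init__(self, user, server):
        self.user = user
        self.server = server
        self.seen = []

    def commit(self, change):
        return self.server.commit(self.user, change)

    def update(self):
        # Fetch everything committed to the server so far.
        self.seen = list(self.server.revisions)


server = CentralRepository()
alice = WorkingCopy("alice", server)
bob = WorkingCopy("bob", server)

alice.commit("fix typo")         # step 1: you commit
before_update = list(bob.seen)   # Bob has not updated yet
bob.update()                     # step 2: they update
after_update = list(bob.seen)
```

The sketch also shows the single point of failure: every commit and update goes through the one server object, so nothing works if it is unavailable.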
Distributed Version Control Systems: Distributed version control systems
contain multiple repositories. Each user has their own repository and working
copy. Just committing your changes will not give others access to them,
because a commit only records those changes in your local
repository; you need to push them to make them visible in the
central repository. Similarly, when you update, you do not get others’ changes
unless you have first pulled those changes into your repository.
To make your changes visible to others, 4 things are required: 
 
 You commit
 You push
 They pull
 They update
The most popular distributed version control systems are Git and Mercurial. They
help us overcome the problem of a single point of failure.
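The four steps listed above (commit, push, pull, update) can be traced with a small model. Repository and Developer are hypothetical classes written for this sketch, and a change is reduced to a plain string; real tools such as Git or Mercurial track far more than this.

```python
class Repository:
    """A full repository: every clone holds its own complete history."""

    def __init__(self):
        self.history = []

    def commit(self, change):
        # Recorded locally only; no network access is needed.
        self.history.append(change)

    def push(self, remote):
        # Send the changes that the remote does not have yet.
        for change in self.history:
            if change not in remote.history:
                remote.history.append(change)

    def pull(self, remote):
        # Fetch the changes that this repository does not have yet.
        for change in remote.history:
            if change not in self.history:
                self.history.append(change)


class Developer:
    """A user with a private repository and a working copy."""

    def __init__(self):
        self.repo = Repository()
        self.working_copy = []

    def update(self):
        # Bring the working copy in line with the local repository.
        self.working_copy = list(self.repo.history)


central = Repository()
alice, bob = Developer(), Developer()

alice.repo.commit("add login page")      # 1. you commit (local only)
central_after_commit = list(central.history)
alice.repo.push(central)                 # 2. you push
bob.repo.pull(central)                   # 3. they pull
bob_before_update = list(bob.working_copy)
bob.update()                             # 4. they update
```

Because alice.repo and bob.repo each end up holding the full history, losing the central object in this model would not lose the change.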
 
Purpose of Version Control: 
 
 Multiple people can work simultaneously on a single project. Everyone works
on and edits their own copy of the files and it is up to them when they wish to
share the changes made by them with the rest of the team.
 It also enables one person to use multiple computers to work on a project, so
it is valuable even if you are working by yourself.
 It integrates the work that is done simultaneously by different members of
the team. In the rare case where two people make conflicting edits
to the same line of a file, the version control system asks for human help
in deciding what should be done.
 Version control provides access to the historical versions of a project. This is
insurance against computer crashes or data loss. If a mistake is made,
you can easily roll back to a previous version. It is also possible to undo
specific edits without losing the work done in the meantime. You can
easily find out when, why, and by whom any part of a file was edited.
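How a tool can integrate simultaneous edits and still detect a conflict on the same line can be illustrated with a simplified three-way merge. The function merge_lines is a hypothetical helper that assumes all three versions have the same number of lines; real merge algorithms also handle inserted and deleted lines.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of files that have the same number of lines.

    Returns (merged, conflicts), where conflicts lists the line numbers
    that both sides changed differently and that a human must resolve.
    """
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only they changed this line
            merged.append(t)
        elif t == b:      # only I changed this line
            merged.append(m)
        else:             # both changed it in different ways
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


base = ["title", "intro", "body"]
mine = ["Title", "intro", "my body"]
theirs = ["title", "Intro", "their body"]
merged, conflicts = merge_lines(base, mine, theirs)
```

Lines 1 and 2 were each changed by only one person, so they merge automatically; line 3 was changed by both, so it is reported as a conflict and the common ancestor's text is kept until someone decides.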
Centralized vs Distributed Version Control: Which One Should We Choose?
Last Updated: 13 Sep, 2021
Many of us are familiar with version control from working with multiple developers
on a single project. There is no doubt that version control makes developers’ work
easier and faster. In most organizations, developers use either a Centralized Version
Control System (CVCS) like Subversion (SVN) or Concurrent Versions System (CVS), or a
Distributed Version Control System (DVCS) like Git (written in C), Mercurial (written
in Python), or Bazaar (written in Python).
Which one is best, and which one should we choose? We will compare each one’s
workflow, learning curve, security, popularity, and other aspects.
First we need to break a myth that most beginners have about DVCS: that “there is
no central version of the code or no master branch.” That’s not true. A DVCS also
has a master branch or central version of the code, but it works in a different way
than in centralized source control.
 

Let’s go through the overview of both version control systems.  

Centralized Version Control System

In centralized source control, there is a server and a client. The server is the master
repository that contains all of the versions of the code. To work on a project, the
user (the client) first needs to get the code from the master repository. The client
communicates with the server and pulls the current version of the code from
the server to their local machine. In other words, you take an update
from the master repository and get a local copy of the code on your system.
Once you have the latest version of the code, you make your own changes
and then simply commit those changes directly into the
master repository. Committing a change means merging your own code into the
master repository, creating a new version of the source code. Everything is
centralized in this model.
There is just one repository, and it contains all the history (versions) of the
code and its different branches. So the basic workflow in
centralized source control is: get the latest version of the code from the central
repository, which contains other people’s code as well; make your own changes to the
code; and then commit or merge those changes into the central repository.

Distributed Version Control System

In distributed version control, most of the model is the same as in
centralized. The major difference is that, instead of one single
repository on the server, every developer (client) has their own repository,
with a copy of the entire history of the code and all of its
branches on their local machine. Every user can therefore work locally
and disconnected, which is more convenient than centralized source control, and that’s
why it is called distributed.
You don’t need to rely on the central server for everyday work; you clone the entire
history of the code to your hard drive. When you start working on a project, you clone
the code from the master repository to your own hard drive, check the code out of your
own repository to make changes, and then commit your changes to your
local repository. At this point your local repository has ‘change sets’, but it is
still disconnected from the master repository (which will receive different ‘sets
of changes’ from each individual developer’s repository). To communicate
with it, you issue a request to the master repository and push your local change sets
to it. Getting new changes from a repository is called “pulling”, and
sending your local repository’s ‘set of changes’ to another repository is called “pushing”.
Unlike the centralized model, you do not merge your code directly into the
master repository after making changes. You first commit all the changes to your own
repository, and then the ‘set of changes’ is merged into the master repository.
Below is the diagram to understand the difference between these two in a better way: 
 
Basic Difference with Pros and Cons 

 Centralized version control is easier to learn than distributed. As a beginner,
you have more commands and operations to remember in a DVCS, and
working with one might be confusing initially. A CVCS is easy to learn and easy to set
up.
 The biggest advantage of a DVCS is that it allows you to work offline and gives
flexibility. You have the entire history of the code on your own hard drive, so all the
changes you make go to your own repository, which
doesn’t require an internet connection. This is not the case with a CVCS.
 A DVCS is faster than a CVCS because you don’t need to communicate with the remote
server for each and every command. You do most things locally, which lets you
work faster than with a CVCS.
 Working on branches is easy in a DVCS. Every developer has the entire history of the
code, so developers can share their changes with each other before merging all the ‘sets of
changes’ into the remote server. In a CVCS it’s difficult and time-consuming to work on
branches because it requires communicating with the server directly.
 If the project has a long history or contains large binary files,
downloading the entire project in a DVCS can take more time and space than usual,
whereas in a CVCS you only need to get the current version of the files; you don’t
save the entire history on your own machine, so there is no
requirement for additional space.
 If the main server goes down or crashes in a DVCS, you can still recover the
entire history of the code from a local repository, where the full revision history
is already saved. This is not the case with a CVCS, where there is just a single
remote server that has the entire code history.
 Merge conflicts with other developers’ code are less frequent in a DVCS, because every
developer works on their own piece of code. Merge conflicts are more frequent in a CVCS
by comparison.
 In a DVCS, developers sometimes take advantage of having the entire history of the
code and work for too long in isolation, which is not a good thing. This is
not the case with a CVCS.

Comparison – Centralized, Decentralized and Distributed Systems
Last Updated: 13 Sep, 2021
In this article, we will try to understand and compare different aspects of centralized,
decentralized, and distributed systems. 

1. CENTRALIZED SYSTEMS:

We start with centralized systems because they are the most intuitive and easy to
understand and define. 
Centralized systems are systems that use client/server architecture where one or more
client nodes are directly connected to a central server. This is the most commonly used
type of system in many organizations where a client sends a request to a company server
and receives the response. 
 

Figure – Centralized system visualization 


Example – 
Wikipedia. Consider a massive server to which we send our requests and the server
responds with the article that we requested. Suppose we enter the search term ‘junk food’
in the Wikipedia search bar. This search term is sent as a request to the Wikipedia servers
(mostly located in Virginia, U.S.A) which then responds back with the articles based on
relevance. In this situation, we are the client node, Wikipedia servers are the central
server. 
Characteristics of Centralized System – 
 Presence of a global clock: As the entire system consists of a central node(a server/ a
master) and many client nodes(a computer/ a slave), all client nodes sync up with the
global clock(the clock of the central node). 
 One single central unit: One single central unit which serves/coordinates all the
other nodes in the system. 
 Dependent failure of components: Central node failure causes the entire system to
fail. This makes sense because when the server is down, no other entity is there to
send/receive responses/requests. 
 
Scaling – 
Only vertical scaling of the central server is possible. Horizontal scaling would
contradict the defining characteristic of this system: a single central unit. 
Components of Centralized System – 
Components of Centralized System are, 
 Node (Computer, Mobile, etc.). 
 Server. 
 Communication link (Cables, Wi-Fi, etc.). 
 
Architecture of Centralized System – 
Client-Server architecture. The central node that serves the other nodes in the system is
the server node and all the other nodes are the client nodes. 
Limitations of Centralized System – 
 Can’t scale up vertically after a certain limit – Beyond that limit, even if you increase the
hardware and software capabilities of the server node, the performance will not
increase appreciably, so the benefit/cost ratio drops below 1. 
 
 Bottlenecks can appear when the traffic spikes – the server can only keep a finite
number of connections from client nodes open at once. So, when traffic is high, for
example during a shopping sale, the server can be overwhelmed in much the same way as
under a Denial-of-Service or Distributed Denial-of-Service attack, even though the
traffic is legitimate. 
 
Advantages of Centralized System – 
 Easy to physically secure. It is easy to secure and service the server and client nodes
by virtue of their location
 Smooth and elegant personal experience – A client has a dedicated system which he
uses(for example, a personal computer) and the company has a similar system which
can be modified to suit custom needs
 Dedicated resources (memory, CPU cores, etc)
 More cost-efficient for small systems up to a certain limit – As the central systems
take fewer funds to set up, they have an edge when small systems have to be built
 Quick updates are possible – Only one machine to update.
 Easy detachment of a node from the system. Just remove the connection of the client
node from the server and voila! Node detached.
Disadvantages of Centralized System – 
 Highly dependent on the network connectivity – The system can fail if the nodes lose
connectivity as there is only one central node.
 No graceful degradation of the system – abrupt failure of the entire system
 Less possibility of data backup. If the server node fails and there is no backup, you
lose the data straight away
 Difficult server maintenance – There is only one server node and due to availability
reasons, it is inefficient and unprofessional to take the server down for maintenance.
So, updates have to be done on-the-fly(hot updates) which is difficult and the system
could break.
Applications of Centralized System – 
 Application development – It is very easy to set up a central server and send client
requests to it. Many modern frameworks come with default test servers that can
be launched with a couple of commands, for example the Express server or the Django
development server.
 Data analysis – Easy to do data analysis when all the data is in one place and available
for analysis
 Personal computing
Use Cases – 
 Centralized databases – all the data in one server for use.
 Single-player games like Need For Speed, GTA Vice City – an entire game in one
system(commonly, a Personal Computer)
 Application development by deploying test servers leading to easy debugging, easy
deployment, easy simulation
 Personal Computers
Organizations Using – 
National Informatics Center (India), IBM 
 

2. DECENTRALIZED SYSTEMS:

This is another type of system that has been gaining a lot of popularity, primarily
because of the massive hype around Bitcoin. Many organizations are now trying to find
applications for such systems. 
In decentralized systems, every node makes its own decision. The final behavior of the
system is the aggregate of the decisions of the individual nodes. Note that there is no
single entity that receives and responds to the request. 
 

Figure – Decentralized system visualization 


Example – 
Bitcoin. Let’s take Bitcoin for example because it is the most popular use case of
decentralized systems. No single entity/organization owns the bitcoin network. The
network is a sum of all the nodes who talk to each other for maintaining the amount of
bitcoin every account holder has. 
Characteristics of Decentralized System – 
 Lack of a global clock: Every node is independent of each other and hence, has
different clocks that they run and follow.
 Multiple central units (Computers/Nodes/Servers): More than one central unit
which can listen for connections from other nodes
 Dependent failure of components: one central node failure causes a part of the
system to fail; not the whole system
Scaling – 
Vertical scaling is possible. Each node can add resources(hardware, software) to itself to
increase the performance leading to an increase in the performance of the entire system. 
Components – 
Components of Decentralized System are, 
 Node (Computer, Mobile, etc.)
 Communication link (Cables, Wi-Fi, etc.)
Architecture of Decentralized System – 
 peer-to-peer architecture – all nodes are peers of each other. No one node has
supremacy over other nodes
 master-slave architecture – One node can become a master by voting and help in
coordinating of a part of the system but this does not mean the node has supremacy
over the other node which it is coordinating
Limitations of Decentralized System – 
 May lead to the problem of coordination at the enterprise level – When every node is
the owner of its own behavior, it’s difficult to achieve collective tasks
 Not suitable for small systems – Not beneficial to build and operate small
decentralized systems because of the low benefit/cost ratio
 No way to regulate a node on the system – no superior node overseeing the behavior
of subordinate nodes
Advantages of Decentralized System – 
 Minimal problem of performance bottlenecks occurring – The entire load gets
balanced on all the nodes; leading to minimal to no bottleneck situations
 High availability – Some nodes(computers, mobiles, servers) are always
available/online for work, leading to high availability
 More autonomy and control over resources – As each node controls its own behavior,
it has better autonomy leading to more control over resources
Disadvantages of Decentralized System – 
 Difficult to achieve global big tasks – No chain of command to command others to
perform certain tasks
 No regulatory oversight
 Difficult to know which node failed – Each node must be pinged for availability
checking and partitioning of work has to be done to actually find out which node
failed by checking the expected output with what the node generated
 Difficult to know which node responded – When a request is served by a
decentralized system, the request is actually served by one of the nodes in the system
but it is actually difficult to find out which node indeed served the request.
Applications of Decentralized System – 
 Private networks – peer nodes joined with each other to make a private network.
 Cryptocurrency – Nodes joined to become a part of a system in which digital currency
is exchanged without revealing the identity or location of the sender and receiver. In
bitcoin, the public addresses and the amounts transferred are visible to everyone, but
those addresses are pseudonymous and hence difficult to link to a person.
Use Cases – 
 Blockchain
 Decentralized databases – Entire databases split into parts and distributed to different
nodes for storage and use. For example, records with names starting from ‘A’ to ‘K’
in one node, ‘L’ to ‘N’ in the second node, and ‘O’ to ‘Z’ in the third node
 Cryptocurrency
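The decentralized database example above, where records are split across three nodes by the first letter of the name, amounts to a small routing function. The node labels and the function name node_for are assumptions made for this sketch.

```python
def node_for(name):
    """Return the node that stores the record for this name."""
    first = name[0].upper()
    if "A" <= first <= "K":
        return "node-1"
    if "L" <= first <= "N":
        return "node-2"
    if "O" <= first <= "Z":
        return "node-3"
    # Names that do not start with a letter A-Z have no home in this scheme.
    raise ValueError(f"no node handles names starting with {first!r}")


placements = {name: node_for(name) for name in ["alice", "Mallory", "Zed"]}
```

Each node only needs to store and answer queries for its own range, so no single machine has to hold the whole database.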
Organizations Using – 
Bitcoin, Tor network 

3. DISTRIBUTED SYSTEMS:

This is the last type of system that we are going to discuss. Let’s head right into it! 
In distributed systems, multiple nodes work together to accomplish a single task. The
nodes coordinate with each other, and to the user the whole system appears to be one
single machine. 
 
Figure – Distributed system visualization 
Example – 
Google search system. Each request is worked upon by hundreds of computers, which
search an index built by crawling the web and return the relevant results. To the user,
Google appears to be one system, but it actually is multiple computers working together
to accomplish one single task (return the results to the search query). 
Characteristics of Distributed System –
 Concurrency of components: Nodes apply consensus protocols to agree on the same
values/transactions/commands/logs.
 Lack of a global clock: All nodes maintain their own clock.
 Independent failure of components: In a distributed system, nodes fail
independently without having a significant effect on the entire system. If one node
fails, the entire system sans the failed node continues to work.
Scaling – 
Horizontal and vertical scaling is possible. 
Components of Distributed System – 
Components of Distributed System are, 
 Node (Computer, Mobile, etc.)
 Communication link (Cables, Wi-Fi, etc.)
Architecture of Distributed System – 
 peer-to-peer – all nodes are peers of each other and work towards a common goal
 client-server – some nodes become server nodes for the role of coordinator, arbiter,
etc.
 n-tier architecture – different parts of an application are distributed in different nodes
of the systems and these nodes work together to function as an application for the
user/client
Limitations of Distributed System – 
 Difficult to design and debug algorithms for the system. These algorithms are difficult
because of the absence of a common clock; so no temporal ordering of
commands/logs can take place. Nodes can have different latencies which have to be
kept in mind while designing such algorithms. The complexity increases with the
increase in the number of nodes.
 No common clock causes difficulty in the temporal ordering of events/transactions
 Difficult for a node to get the global view of the system and hence take informed
decisions based on the state of other nodes in the system
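The ordering problem caused by the missing common clock is often worked around with logical clocks. The sketch below uses a Lamport clock, a well-known technique that the text above does not itself mention; the Node class and its method names are assumptions made for this example.

```python
class Node:
    """A node that keeps a Lamport logical clock instead of wall time."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        # Every local event advances the node's own counter.
        self.clock += 1
        return self.clock

    def send(self):
        # Sending is an event; the timestamp travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, timestamp):
        # Jump past the sender's timestamp so that the receive event
        # is always ordered after the corresponding send event.
        self.clock = max(self.clock, timestamp) + 1
        return self.clock


a, b = Node("a"), Node("b")
a.local_event()               # a's clock becomes 1
stamp = a.send()              # a's clock becomes 2
received = b.receive(stamp)   # b's clock becomes max(0, 2) + 1
```

The counters say nothing about absolute time, but they guarantee that a message is never logged as received before it was sent, which is enough to order causally related events.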
Advantages of Distributed System – 
 Lower latency than a centralized system – Distributed systems can have lower latency
because their nodes are geographically spread out, so a request can be served by a
nearby node in less time
Disadvantages of Distributed System – 
 Difficult to achieve consensus
 The conventional way of logging events by absolute time they occur is not possible
here
Applications of Distributed System –  
 Cluster computing – a technique in which many computers are coupled together to
work towards a common goal. The computer cluster acts as if it were a
single computer
 Grid computing – the resources of many computers are pooled together and shared,
essentially turning the systems into a powerful supercomputer.
Use Cases – 
 SOA-based systems
 Multiplayer online games
Organizations Using – 
Apple, Google, Facebook.
