Version Control Systems - High-Level Overview
Keeping track of what changed, who changed it, and why they did it.
- Motivations
- Repositories
- Trunk
- Branches
- File vs. Repository Versioning
- Centralized vs. Distributed Systems
- Examples
- Conclusions
Motivations
There are a variety of motivations for developers to use version control systems:
- Archiving
- Synchronizing
- Repair
- Accountability
- Experimentation
Archiving
The need to clearly see the state of the project at a given point in time, and to compare versions.
Synchronizing
The need for multiple developers, working independently, to stay in sync with the same version of a project’s codebase.
Repair
The need to sometimes undo changes that have been made and ‘roll back’ to an earlier version.
Accountability
The need to see who did what and when they did it.
Experimentation
The need to allow developers to try out new directions or ideas without concern for any permanent damage should they fail.
Repositories
A repository (or ‘repo’) is the database where files are stored and tracked by the version control system.
In a version control workflow, a repository can either be:
- Local
- Remote
Local
A local repository resides on a developer’s own machine.
When multiple developers are collaborating in a distributed fashion, a local repository may be used to track each developer’s own work.
Remote
A remote repository resides on some other machine. This could be another developer’s machine or a shared server.
When multiple developers are collaborating using a centralized version control system, a remote repository on a shared server may be the only repository where changes are tracked.
Trunk
The trunk (also called the ‘main’ or ‘master’ line) of a repository is the authoritative version of the code in the repository at any given point in time.
Benefits
Maintaining code in a trunk lets each developer know unambiguously where to find the authoritative version of the code in the repository at any given time.
Drawbacks
If developers work directly on the code in the trunk (known as ‘trunk-based development’) without a very careful workflow, bugs may land in the trunk and break the version that everyone else relies on.
Branches
Branches represent variants of the code at any given point in time. They are departures from the main line in the repository.
Benefits
Those wary of working directly on the trunk often make changes in isolated branches, separate from the trunk, and then ‘merge’ the changes into the code in the trunk once they are sure they have no bugs.
Drawbacks
If multiple developers work independently in separate branches for long periods of time, they may end up in ‘merge hell’: the trunk has changed so drastically since they started their branch that merging their changes back in becomes difficult.
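To make the merge problem concrete, here is a minimal sketch of a three-way merge. It assumes a snapshot is simply a dictionary mapping file paths to contents; the function name `three_way_merge` and the sample files are hypothetical, and real systems merge at the level of lines rather than whole files.

```python
def three_way_merge(base, trunk, branch):
    """Merge two snapshots (dicts of path -> content) that share a common base."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(trunk) | set(branch)):
        b, t, r = base.get(path), trunk.get(path), branch.get(path)
        if t == r:        # both sides agree, or neither side changed the file
            result = t
        elif t == b:      # only the branch changed this file
            result = r
        elif r == b:      # only the trunk changed this file
            result = t
        else:             # both sides changed it differently: a conflict
            conflicts.append(path)
            result = t    # keep the trunk content until someone resolves it
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"app.py": "v1", "util.py": "v1"}
trunk = {"app.py": "v2-trunk", "util.py": "v1"}
branch = {"app.py": "v2-branch", "util.py": "v2-branch", "new.py": "v1"}

merged, conflicts = three_way_merge(base, trunk, branch)
print(conflicts)  # ['app.py']
```

The longer a branch lives, the more files fall into the last case, which is one way to picture ‘merge hell’.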
File vs. Repository Versioning
When a change is made to a file, a new version number is issued.
- In file versioning systems, a new version number is issued only for the file that has been changed.
- In repository versioning systems, a new version number is issued for the repository as a whole, even when only one file has changed.
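The difference between the two numbering schemes can be sketched in a few lines. This is a toy model, not how any particular system is implemented; the class names `FileVersionedRepo` and `RepoVersionedRepo` and the file names are made up for illustration.

```python
class FileVersionedRepo:
    """File versioning: each file carries its own version counter."""

    def __init__(self):
        self.versions = {}  # path -> version number of that file

    def commit(self, changed_paths):
        for path in changed_paths:
            self.versions[path] = self.versions.get(path, 0) + 1


class RepoVersionedRepo:
    """Repository versioning: one counter covers the whole repository."""

    def __init__(self):
        self.revision = 0
        self.last_changed = {}  # path -> repository revision that last touched it

    def commit(self, changed_paths):
        self.revision += 1  # bumped even when only one file changed
        for path in changed_paths:
            self.last_changed[path] = self.revision


file_repo = FileVersionedRepo()
whole_repo = RepoVersionedRepo()
for change in (["main.py", "README"], ["main.py"], ["main.py"]):
    file_repo.commit(change)
    whole_repo.commit(change)

print(file_repo.versions)   # {'main.py': 3, 'README': 1}
print(whole_repo.revision)  # 3
```

After the same three changes, the frequently edited file has a higher number under file versioning, while repository versioning reports a single revision for everything.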
File Versioning
Each file in the repository has its own version number - so frequently changed files will typically have higher version numbers than stable files.
Benefit
- Easy to find the history of all versions of a given file.
Drawback
- Possibly difficult to determine all the individual files that comprise one logical release version.
Repository Versioning
Any changes trigger a new version number of the entire repository.
Benefit
- Easy to find all the individual files comprising a given system release version.
Drawback
- Possibly difficult to determine the change history of one individual file.
Centralized vs. Distributed Systems
Version control systems can allow for either centralized or distributed control:
- In centralized version control systems, one repository maintains the single authoritative copy of the code at all times.
- In distributed version control systems, all developers maintain and share the code in their own local repositories, with no single central authority.
Centralized Systems
Benefits
- Often avoid conflicts by requiring developers to “check out” a file before changing it and “check in” the file when done.
Drawbacks
- Don’t allow individual developers to track their own experiments before sharing them.
- Don’t allow tracking while offline and disconnected from the central server.
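The check-out / check-in idea for centralized systems can be sketched as a simple lock table. This is an illustrative model only: the class `LockingRepo`, its methods, and the developer names are assumptions, not the interface of any real tool.

```python
class LockingRepo:
    """Toy central repository where a file must be checked out before editing."""

    def __init__(self, files):
        self.files = dict(files)  # path -> current content
        self.locks = {}           # path -> developer holding the lock

    def check_out(self, path, developer):
        holder = self.locks.get(path)
        if holder is not None and holder != developer:
            raise PermissionError(f"{path} is checked out by {holder}")
        self.locks[path] = developer
        return self.files[path]

    def check_in(self, path, developer, content):
        if self.locks.get(path) != developer:
            raise PermissionError(f"{developer} has not checked out {path}")
        self.files[path] = content
        del self.locks[path]      # release the lock for the next developer


repo = LockingRepo({"report.txt": "draft"})
repo.check_out("report.txt", "alice")

try:
    repo.check_out("report.txt", "bob")
    blocked = False
except PermissionError:
    blocked = True                # bob must wait until alice checks in

repo.check_in("report.txt", "alice", "final")
bob_copy = repo.check_out("report.txt", "bob")
```

Because only one developer can hold a file at a time, two people never edit the same file concurrently, which is how this style of workflow sidesteps conflicts.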
Distributed Systems
Benefits
- Allow each developer to track their own work with all the features of version control on their local machine.
- Allow tracking while offline and disconnected from any shared server.
- Can be used in a centralized version control workflow, but doesn’t require it.
Drawbacks
- Without a careful workflow, developers may fall out of sync with one another, leading to ‘merge hell’.
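A rough sketch of how local repositories drift apart and then resynchronize follows. It assumes history is just an ordered list of commit ids; the class `LocalRepo` and its methods `commit`, `missing_from`, and `pull` are hypothetical names, and real distributed systems track much more than this.

```python
class LocalRepo:
    """Toy local repository: an ordered list of commit ids."""

    def __init__(self, owner):
        self.owner = owner
        self.commits = []
        self._count = 0  # number of commits made by this owner

    def commit(self):
        # Works offline: nothing outside this object is touched.
        self._count += 1
        commit_id = f"{self.owner}-{self._count}"
        self.commits.append(commit_id)
        return commit_id

    def missing_from(self, other):
        """Commits the other repository has that this one lacks."""
        return [c for c in other.commits if c not in self.commits]

    def pull(self, other):
        """Bring in whatever the other repository has that we do not."""
        self.commits.extend(self.missing_from(other))


alice, bob = LocalRepo("alice"), LocalRepo("bob")
alice.commit()
bob.commit()
bob.commit()

behind = alice.missing_from(bob)
print(behind)          # ['bob-1', 'bob-2']
alice.pull(bob)
print(alice.commits)   # ['alice-1', 'bob-1', 'bob-2']
```

Each developer can commit without any server, but the list returned by `missing_from` grows the longer they go without pulling, which is the out-of-sync risk noted above.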
Examples
Centralized Systems
- CVS - a centralized file versioning system
- Subversion - a centralized repository versioning system
Distributed Systems
- Git - a distributed repository versioning system
Conclusions
Thank you. Bye.