Git and GitHub - Enabling Distributed Cooperation on Software Projects
- Introductions
- Three-Tier Model
- Git Terminology
- Creating a Local Repository
- Tracking Changes with the Local Repository
- Committing Changes to the Local Repository
- Sharing Changes with Other Developers
- Merge Conflicts
- Configuration
- Logs
- Conclusions
Introductions
Git
The world’s most popular open source version control system.
- Distributed: all developers maintain and share the code in their own local repositories, with no single central authority.
- Repository versioning: each commit records a new version, identified by a unique hash, of the repository as a whole, even when only one file has changed.
- Created by Linus Torvalds to help Linux kernel developers collaborate.
GitHub
A commercial web app used for hosting remote git repositories.
- Free hosting of repositories.
- Allows control over who can access each repository.
- Provides a web app and mobile apps for viewing repository files.
- Acquired by Microsoft for $7.5 billion in 2018.
Three-Tier Model
Introduction
Git version control depends upon three tiers, or three places where files are stored.
- Working copy
- Staging index
- Repository
Working copy
The copy of the files that a developer works on.
- Changes made here are not recorded by git until they are staged and committed.
- The ‘normal’ files you are already familiar with.
Staging index
A holding area for changes that might be included in the project’s official history.
- Allows developers to pick and choose which changed files to include in the next commit to the repository.
- Read more about some benefits of this.
# add any changed files within the working directory to the staging index
git add .
# add only one file to the staging index
git add my_staging_index_slide.md
Repository
The official history of the project, including the history of all committed files.
- With each commit, changed files in the staging index are moved to the repository.
- A short one-line message describes the changes included in each commit.
# commit any files in the staging index to the repository
git commit -m "Updating slide about git repositories"
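The three tiers can be observed directly with git status. The following is a minimal sketch, assuming git is installed; the temporary directory, the file name notes.txt, and the placeholder identity are made up for illustration.

```shell
#!/bin/sh
# Sketch: follow one file through the working copy, staging index, and repository.
set -eu

repo=$(mktemp -d)                           # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity, this repository only
git config user.email "user@example.com"

# 1. Working copy: the file exists on disk, but git has not recorded it.
echo "hello" > notes.txt
untracked=$(git status --short)

# 2. Staging index: the file is queued for the next commit.
git add notes.txt
staged=$(git status --short)

# 3. Repository: the commit makes the change part of the project history.
git commit -q -m "Add notes"
clean=$(git status --short)                 # empty when nothing is left to stage or commit
commits=$(git rev-list --count HEAD)

printf 'working copy: %s\nstaged: %s\ncommits: %s\n' "$untracked" "$staged" "$commits"
```

In the short status format, ?? marks an untracked file and A marks a file that has been added to the staging index.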
Git Terminology
Introduction
A few terms we must familiarize ourselves with.
- Clone
- Fork
- Main
- Branch
- Remote
- Pull
- Push
- Merge
- Fetch
- Pull Request
Clone
A new repository made by copying files from an existing repository.
- Typically involves creating a new local repository that is a copy of an existing remote repository.
- The new repository automatically keeps a link to the origin remote repository from which it came.
# clone a remote repository onto the local machine
git clone https://github.com/your-github-handle/your-repository-name.git
Fork
Not a Git term, but rather a term used by remote repository hosting providers such as GitHub.
- A fancy word for creating a remote clone of a remote repository.
- A fork of a GitHub repository creates a clone of that repository in the active user’s own GitHub account.
Main
The conventional name for the trunk line of code.
- The default line of code that all commits are placed into unless other branch lines are created.
- PS: historically, the trunk was called master, and git may still use that name for new repositories, depending on its version and configuration. You will still see this name used in many places.
Branch
The term for an offshoot of the trunk or main line of code.
- Creating a branch allows developers to work on new code in isolation with no effect on the main trunk line.
- Code in a branch can be merged into the trunk branch at any time, if desired.
# see a list of all branches in the local repository
git branch
# create a new branch
git branch experiment1
# switch to the new branch
git checkout experiment1
# switch back to the main branch
git checkout main
Remote
A term used to mean a remote repository that the local repository is linked to.
- Typically, this refers to the remote repository from which a local repository was cloned.
- The nickname origin is usually used to refer to the original remote repository from which the local repository was cloned.
- The nickname upstream is also sometimes used to refer to a remote repository from which the origin repository was originally forked.
# view a list of all currently known remotes
git remote -v
# inform git of a remote and give it the nickname 'origin'
git remote add origin https://github.com/your-github-handle/your-repository-name.git
# inform git of a remote and give it the nickname 'upstream'
git remote add upstream https://github.com/some-other-github-handle/some-other-repository-name.git
Pull
Download and merge any changes in the current branch from a remote repository to the local repository.
- It’s good practice to pull before you push.
# download and merge any changes in the main branch from the 'origin' remote repository to the local repository
git pull origin main
# download and merge any changes in the main branch from the 'upstream' remote repository to the local repository
git pull upstream main
Push
Upload changes from the local repository to a remote repository.
# Upload any changes in the main branch from the local repository to the 'origin' remote repository
git push origin main
# Upload any changes in the main branch from the local repository to the 'upstream' remote repository
git push upstream main
Fetch
Download but do not automatically merge any changes in the current branch from a remote repository to the local repository.
- This is similar to git pull, but pull automatically merges any changes into the current branch, while fetch does not.
# fetch any changes from the remote repository
git fetch origin
# fetch any changes from the 'upstream' remote repository
git fetch upstream
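The difference between fetch and pull can be seen by counting commits before and after each step. This is a rough sketch, assuming git is installed: a bare repository on the local disk stands in for a hosted remote, and the directory names (first, second), the file name, and the placeholder identity are hypothetical.

```shell
#!/bin/sh
# Sketch: fetch downloads new commits but leaves the local branch where it was.
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"                      # stand-in for a hosted remote
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/main

# The first repository publishes an initial commit on main.
git init -q "$work/first"
cd "$work/first"
echo "v1" > file.txt
git add file.txt
git commit -q -m "First version"
git branch -M main                                         # make sure the branch is named main
git remote add origin "$work/remote.git"
git push -q origin main

# A second developer clones at this point.
git clone -q "$work/remote.git" "$work/second"

# The first repository then publishes another commit.
echo "v2" > file.txt
git commit -q -am "Second version"
git push -q origin main

# In the second clone, fetch updates origin/main but not the local main branch.
cd "$work/second"
git fetch -q origin
after_fetch_local=$(git rev-list --count main)
after_fetch_remote=$(git rev-list --count origin/main)

# Merging the fetched branch is the step that pull would have done automatically.
git merge -q origin/main
after_merge_local=$(git rev-list --count main)

echo "after fetch: local=$after_fetch_local origin/main=$after_fetch_remote"
echo "after merge: local=$after_merge_local"
```

Fetching first gives you a chance to inspect the incoming commits, for example with git log origin/main, before deciding to merge them.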
Merge
Take the changes from one branch or repository and incorporate them into another branch or repository.
- If both branches include recent changes to the same lines of code, git will inform you of a conflict.
- Any such merge conflicts must be manually resolved using your preferred editor.
- A typical workflow:
git pull origin main # get the latest code for the current branch from the remote origin repository
# switch to a branch where you will make changes in isolation
# (either create it first with `git branch some-branch-name`, or use `git checkout -b some-branch-name` to create it and check it out at once)
git checkout some-branch-name
# make some changes to the code in this branch, using your preferred editor
# stage and commit your changes, i.e. `git add .`, `git commit -m "some message."`
git checkout main # switch back to the main branch to prepare to merge
git pull origin main # download any recent changes from the origin server
git merge some-branch-name # incorporate your changes in the branch into the code in the main branch
git branch -d some-branch-name # delete the branch now that we're done with it
git push origin main # upload your changes to the remote origin repository
Pull Request
A request to another GitHub user to merge changes from one branch or repository into another.
- A pull request is a GitHub feature, not a git feature.
- One user requests another user to review the code and approve its merger.
- The reviewer can approve or reject the request outright, or add comments that ask the initiator of the pull request to make changes to the code before the pull request can be approved.
- The idea is to enforce a peer review of the code before it is merged into the main branch. All team members must share the responsibility of reviewing each other’s code.
Creating a Local Repository
Introduction
A local repository can be created in one of two ways:
- from scratch
- as a clone of an existing repository
From scratch
mkdir project0
cd project0
git init
If you later decide to link a local repository to a remote repository:
git remote add origin git@github.com:YOUR-USERNAME/YOUR-REPOSITORY.git
As a clone of an existing repository
git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
Tracking Changes with the Local Repository
Introduction
Changes in the Working Copy (adding, modifying, or deleting files) are not tracked.
Files must be added to the staging index in order to be tracked.
Add files to the staging index
Make any changes you desire.
Then add all the modified files in the working copy to the staging index:
git add .
Committing Changes to the Local Repository
Introduction
The staging index does not maintain the history of all changes.
Changes must be moved to the repository in order to be fully archived.
Commit all changes in the staging index to the repository:
git commit -m 'resolving issue #23'
Sharing Changes with Other Developers
Introduction
Any changes to the local repository must usually be shared with other developers.
For this reason, they should frequently be uploaded to a shared remote repository.
Pushing changes to the remote repository
Pull any new changes first, then push the local ‘main’ branch to the branch named ‘main’ on the remote repository called ‘origin’:
git pull origin main
git push origin main
Optionally, you can set origin as the default place to push by running this command once:
git push --set-upstream origin main
From then on, you can use just git push to push changes to origin.
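To see what --set-upstream records, the sketch below pushes to a bare repository on the local disk, which stands in for a remote such as GitHub. It assumes git is installed; the directory names, file name, and placeholder identity are made up for illustration.

```shell
#!/bin/sh
# Sketch: after one push with --set-upstream, a bare "git push" knows where to go.
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"        # stand-in for a hosted remote
git init -q "$work/local"
cd "$work/local"
echo "draft" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main                           # make sure the branch is named main
git remote add origin "$work/remote.git"

# Push once with --set-upstream so that main remembers origin/main.
git push -q --set-upstream origin main
upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')

# Later commits can then be shared with a plain "git push".
echo "more" >> README.md
git commit -q -am "Expand README"
git push -q
remote_commits=$(git --git-dir="$work/remote.git" rev-list --count main)

echo "main tracks: $upstream"
echo "commits on the remote: $remote_commits"
```

The same tracking information is what lets a plain git pull work without naming the remote and branch each time.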
Merge Conflicts
Introduction
Conflicts may occur in a number of scenarios:
- When pulling code from a remote repository to a local repository.
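- When pushing code from a local repository to a remote repository: git rejects the push if the remote has changes you have not yet pulled, and any conflict appears when you pull them.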
- When merging code from one branch into another.
- When merging code from one repository into another.
Git will merge code automatically, if possible, and ask the developer to do it themselves, if not.
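A conflict is easiest to understand by producing one on purpose. The sketch below, assuming git is installed, changes the same line on two branches and then merges them; the file settings.txt, its contents, and the placeholder identity are hypothetical.

```shell
#!/bin/sh
# Sketch: create a merge conflict, inspect the markers, and resolve it.
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "color: red" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"
git branch -M main                           # make sure the branch is named main

# Change the same line differently on a branch and on main.
git checkout -q -b experiment1
echo "color: blue" > settings.txt
git commit -q -am "Use blue"

git checkout -q main
echo "color: green" > settings.txt
git commit -q -am "Use green"

# git cannot merge these automatically, so the merge stops with a conflict.
if git merge experiment1 >/dev/null 2>&1; then
    merge_result="merged"
else
    merge_result="conflict"
fi

# The file now holds both versions, separated by conflict markers.
cat settings.txt
markers=$(grep -c '^<<<<<<< ' settings.txt)

# Resolve by writing the desired content, then stage and commit to finish the merge.
echo "color: teal" > settings.txt
git add settings.txt
git commit -q -m "Merge experiment1, resolving the color conflict"
second_parent=$(git rev-parse HEAD^2)        # a merge commit has a second parent
branch_tip=$(git rev-parse experiment1)
```

In practice the resolution step is done in your editor: keep the lines you want, delete the marker lines, and then stage and commit the file.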
Configuration
Introduction
Several important configurations must be completed prior to versioning:
- configure git username and email address to match those of GitHub
- turn off rebasing
- use .gitignore to exclude files from versioning
Git username and email address
Local git user.name and user.email settings must match exactly your login name and email set up on GitHub:
git config --global user.name "monalisa"
git config --global user.email "mona.lisa@mlouvre.org"
GitHub name
Leave the Name field in your GitHub account settings blank to be sure that your username is used instead for all commits. This is not optional.
Turn off rebasing
git has two ways it can handle pull events, where changes from a remote repository are downloaded and integrated into the local repository: merge and rebase. We want to use merge because this maintains the full history of the project, whereas rebase rewrites the history to make it simpler, and you thus may not receive credit for your full contribution to a project.
To turn off rebasing and use regular merging, you must run this command:
git config --global pull.rebase false
Read more about rebase vs. merge if interested - note the last section of the document.
.gitignore
A file in the project directory named .gitignore instructs Git to ignore certain files or directories.
This absolutely must include instructions to ignore any platform code and 3rd party modules.
View an example
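As a rough illustration, the sketch below writes a small .gitignore and checks which files git is willing to stage. It assumes git is installed; the patterns (node_modules/ as a third-party module directory, .DS_Store as a platform file) and the file names are hypothetical examples, not a complete list for any particular project.

```shell
#!/bin/sh
# Sketch: a .gitignore keeps third-party and platform files out of the repository.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical patterns: a third-party module directory and an OS metadata file.
cat > .gitignore <<'EOF'
node_modules/
.DS_Store
EOF

mkdir -p node_modules/some-package src
echo "third-party code" > node_modules/some-package/index.js
echo "our code" > src/app.js

# git check-ignore exits with status 0 when a path is ignored, 1 when it is not.
if git check-ignore -q node_modules/some-package/index.js; then
    ignored_module=yes
else
    ignored_module=no
fi
if git check-ignore -q src/app.js; then
    ignored_source=yes
else
    ignored_source=no
fi

# "git add ." stages everything except the ignored paths.
git add .
tracked=$(git ls-files)
echo "$tracked"
```

The .gitignore file itself is staged along with the source files, so every collaborator gets the same ignore rules.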
Logs
Introduction
Git maintains a log of all commits to the repository. To see this:
git log
Hit the SPACE bar to scroll through the logs and hit q to quit when you’re done.
Options
There are a few useful options when viewing logs.
- show a list of all commits in a compact format
git log --oneline
- show all commits with a few statistics of what was changed in each
git log --shortstat
… and many more
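To try these options without touching a real project, the following sketch builds a throwaway repository with two commits and prints its log in both formats. It assumes git is installed; the file name, commit messages, and placeholder identity are made up, and --no-pager is used only so the output prints directly instead of opening the scrolling view.

```shell
#!/bin/sh
# Sketch: compare the compact and statistics views of the same history.
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
echo "two" >> a.txt
git commit -q -am "Extend a.txt"

# One line per commit: abbreviated hash followed by the commit message.
git --no-pager log --oneline

# Full entries, each followed by a summary of files and lines changed.
git --no-pager log --shortstat

oneline_count=$(git log --oneline | wc -l | tr -d ' ')
latest_subject=$(git log -1 --format=%s)
```

The most recent commit is listed first in both views.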
gitlogstats
A famous developer has created the gitlogstats tool to generate a summary of the logs, with many additional options.
e.g., find all contributions to a repository over a specific time period:
gitlogstats -s 11/15/2024 -e 12/15/2024 -r https://github.com/some-repo.git
Conclusions
You now know the basics of using Git and its relationship to GitHub.
- Thank you. Bye.