Git and GitHub Basics

Git and GitHub are indispensable tools in modern software development, used in nearly every significant project worldwide.

Although their core concepts are straightforward, their extensive features can sometimes feel overwhelming. Mastering these tools is essential for contributing to real-world software projects, and we will use them extensively throughout this course. To begin, we will explore Git and GitHub.

Initially, we will focus on using these tools as an individual developer. Later, we will delve into team-based collaboration workflows.

Git and GitHub are distinct tools, so we will examine each separately, starting with Git.

What is Git?

Git is a version control system (VCS) designed to track changes to a set of files over time, allowing you to access and manage previous versions.

Using a VCS provides several key benefits:

  • Reverting files to earlier versions.
  • Comparing changes made over time.
  • Synchronizing version history across multiple locations or collaborators.
  • Identifying when and by whom changes were made.

While many VCSs exist, Git stands out in how it stores version history. Git represents version history as a series of snapshots of the project.

When you create a new version (known as a “commit”), Git saves a snapshot of all project files. This version history is essentially a stream of these snapshots.

Git optimizes storage by avoiding duplication. If a file remains unchanged, Git doesn’t store a new copy; instead, it links to the previous version.
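One way to see why unchanged files cost nothing extra is to look at how Git names stored content. The sketch below is only an illustration: the throwaway directory and the file names a.js, b.js, and c.js are assumptions made for the demo. It uses git hash-object, which prints the ID Git derives from a file’s contents.

```shell
# Work in a throwaway directory so no real project is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Two files with identical contents, and one with different contents.
printf 'console.log("Meow!");\n' > a.js
printf 'console.log("Meow!");\n' > b.js
printf 'console.log("Woof!");\n' > c.js

# git hash-object prints the object ID computed from each file's contents.
id_a=$(git hash-object a.js)
id_b=$(git hash-object b.js)
id_c=$(git hash-object c.js)

echo "a.js: $id_a"
echo "b.js: $id_b"
echo "c.js: $id_c"
```

Identical contents produce the same ID, so Git only needs to keep one stored copy and can refer to it from as many snapshots as necessary.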

This stream of snapshots is stored in a Git repository (repo) located within the project directory:

Git Repository Structure

Representing data as snapshots enables powerful features like branches, which we will explore later.

Another defining feature of Git is that it is a distributed VCS. Every location containing the version history is a complete mirror of the entire history, including every version, commit, and snapshot.

Distributed Version Control

This distributed nature has both advantages and challenges. A significant advantage is that every clone of the repository acts as a complete backup, increasing resilience to machine failures.

However, because every clone contains the full version history, care must be taken to prevent the snapshot streams from diverging.

For example, if developers A and B make commits in their respective clones simultaneously, they create different versions of the project’s history. While these versions can be reconciled, the process can become difficult if the changes diverge significantly or conflict.

To address this, workflows are often used in Git projects with collaborators to ensure smooth reconciliation of work. We will discuss these workflows in detail later.

Finally, most Git operations are local, meaning you can work on a Git project without an internet connection.

Remote operations, such as synchronizing copies, require a network connection, but these are only necessary when explicitly executed.

We will soon explore how to use Git in practice.

What is GitHub?

GitHub is a cloud-based hosting service for Git repositories, widely used to back up code and facilitate collaboration. Developers use Git commands to synchronize their code with a remote repository on GitHub as they make commits.

However, GitHub offers much more than just hosting. It provides a suite of tools that build upon hosted Git repositories to serve as a central hub for collaboration.

These tools, accessible through GitHub’s web client, include:

  • Bug and issue tracking.
  • Code review via pull requests.
  • Task management.
  • Continuous integration.
  • Social features for developers.

GitHub is especially popular for team-based projects. A code project is hosted in a repository on GitHub, which acts as the central point for collaboration.

Each team member works on their own clone of the central GitHub repository. As they write code and make commits to their local clone, they push those commits back to the central repository. Other collaborators can then pull these commits from the central repository into their own clones, ensuring everyone stays up to date and can incorporate each other’s changes.

Later in this course, we will explore GitHub’s collaboration tools and workflows in greater detail, focusing on how they support team-based projects.

Basic Workflow

Let’s walk through a basic workflow for an individual developer using Git and GitHub, focusing on Git operations for making commits and synchronizing them.

Git is primarily a command-line tool, so we will emphasize the command-line version of Git operations. While I encourage you to become comfortable with the command-line interface to access Git’s full functionality, you are free to use GUI Git clients like GitHub Desktop or GitKraken. Many code editors, such as VS Code, also include built-in Git clients.

If you plan to use GitKraken for this course, note that working with private repositories on GitHub requires GitKraken Pro. While GitKraken Pro is typically a paid service, students can access it for free through the GitHub Student Developer Pack, which also includes other useful tools.

Installing Git

To use Git, you need to have a Git client installed on your development machine. The Pro Git book provides detailed installation instructions for various platforms. Here are some platform-specific notes:

  • Windows: Install Git for Windows, which includes a command-line version of Git (called Git Bash) and a simple GUI.
  • Mac: Use Homebrew to install Git with the following command: brew install git.
  • Linux: The command-line version of Git is typically pre-installed (including on the OSU ENGR servers).

Configuring Git

If you’re using Git for the first time on your development machine, you’ll need to configure some basic settings before you can start. At a minimum, you must set your name and email address, which Git uses to identify you as the author of your commits. Use the following commands to configure these settings:

Terminal window
git config --global user.name "John Doe"
git config --global user.email "john.doe@example.com"

If you already have a GitHub account, you can use your GitHub email address here.
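If you want a different identity for one particular project, the same settings can be applied to a single repository by leaving out --global. The following is a minimal sketch that assumes a throwaway repository and a made-up name and email address:

```shell
# Throwaway repository for the demonstration.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# Without --global, the values are stored in this repository's .git/config only.
git config user.name "Jane Roe"
git config user.email "jane.roe@example.com"

# --get prints the value Git will use for commits in this repository.
effective_name=$(git config --get user.name)
echo "Name used for commits here: $effective_name"

# --show-origin reveals which configuration file each value comes from.
git config --show-origin --get-regexp '^user\.'
```

Repository-level values take precedence over global ones, so this does not disturb the identity you configured for the rest of your machine.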

Starting a Git Project

Now that Git is installed and configured, you’re ready to start using it. There are two main ways to begin working on a Git-based project:

  • Create a new repository from scratch and eventually mirror it on GitHub
  • Clone an existing repository, such as one hosted on GitHub

We’ll explore both methods in detail.

Creating a Git Repository from Scratch

A Git repository is always tied to a directory, which aligns well with the practice of keeping code projects in their own dedicated directories. Git allows you to selectively manage which files and subdirectories are placed under version control.

To create a new Git repository, follow these steps:

Terminal window
mkdir my-project # Create a new project directory
cd my-project # Navigate into the directory
git init # Initialize a Git repository

The git init command creates an initial “database” to store your project’s version history and other Git-related information. This database resides in a hidden .git/ directory within your project folder.

Hidden directories (those starting with a .) are not visible by default, but you can view them using:

  • Unix Terminals: ls -a
  • Windows Command Prompt: dir /a:hd
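You can also ask Git itself where its database lives. This is a small sketch, assuming a fresh throwaway directory rather than your real project:

```shell
# Create and initialize a throwaway project directory.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# The hidden .git directory now sits inside the project folder.
ls -a

# Ask Git for the location of its database and whether we are in a work tree.
git_dir=$(git rev-parse --git-dir)
inside=$(git rev-parse --is-inside-work-tree)
echo "Git directory: $git_dir"
echo "Inside a work tree: $inside"
```

Running git rev-parse --git-dir from a subdirectory of a project is a quick way to confirm which repository that directory belongs to.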

States in a Git Repository

Files in a Git repository can exist in different states:

  1. Not Staged for Commit:
    • untracked: the file is not tracked by Git and is not included in the version history.
    • modified: the file is tracked by Git, but its contents have changed since the last commit.
    • deleted: the file is tracked by Git, but it has been deleted from the working directory.
  2. Staged: The file is marked to be included in the next commit. Git labels a staged file as new file, modified, or deleted, mirroring the three unstaged states above.
  3. Committed: The file’s changes are saved in the repository, and the file has not changed since the last commit.

This means that before a file can be committed, it must be staged.
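To watch a single file move through these states, here is a minimal sketch. It assumes a throwaway repository, a placeholder identity, and a hypothetical file called notes.txt, and it uses git status --short, a compact form of the git status command covered below.

```shell
# Throwaway repository with a placeholder identity (assumed for the demo).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'first line' > notes.txt
untracked=$(git status --short)     # "?? notes.txt": untracked

git add notes.txt
staged=$(git status --short)        # "A  notes.txt": staged as a new file

git commit -q -m "Add notes"
committed=$(git status --short)     # empty: committed, nothing to report

echo 'second line' >> notes.txt
modified=$(git status --short)      # " M notes.txt": modified, not staged

printf 'untracked: [%s]\n' "$untracked"
printf 'staged:    [%s]\n' "$staged"
printf 'committed: [%s]\n' "$committed"
printf 'modified:  [%s]\n' "$modified"
```

In the short format, the first column describes the staging area and the second column describes the working directory, which is why a staged file and an unstaged modification look different.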

In general, we can think of a Git project as having three separate parts: the working directory, the staging area, and the repository itself (i.e. the .git/ directory).

The working directory is where your files live. This is where you write code, edit files, and make changes. The staging area is a temporary holding area for files that are ready to be committed. The repository is where the version history of your project is stored.

In particular, as we work on a project, we may make changes to many files at once. At some point, we may decide that we are ready to commit some of those changes but not ready to commit others. This is what the staging area allows us to do. The working directory, the staging area, and the Git repository interact like this:

sequenceDiagram
.git Directory ->> Working Directory: Checkout the latest commit
Working Directory ->> Staging Area: Add files to staging area
Staging Area ->> .git Directory: Commit staged changes

Let’s run through an example of this workflow.

Staging Files

With the repository initialized, you can now add files to version control. Start by creating two files in your project directory, cat.js and dog.js, and add some content to them:

Terminal window
echo 'console.log("Meow!");' > cat.js
echo 'console.log("Woof!");' > dog.js

Next, check the current state of your working directory using the git status command:

Terminal window
git status

The output will look something like this:

On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
cat.js
dog.js
nothing added to commit but untracked files present (use "git add" to track)

The “untracked files” section indicates that Git sees these files but is not yet tracking them. To start tracking the files, use the git add command:

Terminal window
git add cat.js dog.js

Alternatively, to track all files, you can use:

Terminal window
git add .

Run git status again, and you’ll see the updated output:

On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: cat.js
new file: dog.js

Committing Files

Now the files are staged for a commit. To save this snapshot, use the git commit command:

Terminal window
git commit

Git will open a text editor for you to write a commit message. Enter a brief description, such as:

Add cat and dog sounds

Save and close the editor to complete the commit. If you check the status again, you’ll see:

nothing to commit, working tree clean

Alternatively, you can skip the editor by using the -m option to specify the commit message directly:

Terminal window
git commit -m "Add cat and dog sounds"

To view the commit history, use:

Terminal window
git log

The output will include details like the commit hash, author, timestamp, and message:

commit c4f570c3ce30fcf50f8f8a7736306030667e9337 (HEAD -> master)
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:04:30 2025 -0800
Add cat and dog sounds

The long hexadecimal value associated with the commit is known as the commit hash. Here, the commit hash is:

c4f570c3ce30fcf50f8f8a7736306030667e9337

There are a couple of things to know about the commit hash:

  • The commit hash is a checksum computed based on the contents of the commit itself. This means that the commit hash verifies the integrity of the commit it’s associated with.
  • The commit hash also serves as a unique identifier for the commit it’s associated with. When we need to refer to a specific commit in certain Git operations, we will typically do so using the hash of that commit.
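In practice you rarely type all 40 characters. The sketch below, which assumes a throwaway repository with one commit and a placeholder identity, shows that an abbreviated hash resolves to the same commit as the full one:

```shell
# Throwaway repository with a single commit (names are placeholders).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

# Full and abbreviated forms of the latest commit's hash.
full_hash=$(git rev-parse HEAD)
short_hash=$(git rev-parse --short HEAD)
echo "full:  $full_hash"
echo "short: $short_hash"

# The abbreviated form resolves back to the same commit object.
resolved=$(git rev-parse "$short_hash")
object_type=$(git cat-file -t "$short_hash")
echo "type:  $object_type"

# Use the hash to inspect that specific commit.
git --no-pager show --stat --oneline "$short_hash"
```

Any prefix works as long as it is unambiguous within the repository, which is why tools such as git log --oneline display shortened hashes.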

Reviewing Changes with git diff

Now, let’s modify the files. Update cat.js and dog.js as follows:

cat.js
console.log("Purr...");
dog.js
console.log("Woof!");
console.log("Bark!");

Before committing these changes, check the status again:

Terminal window
git status

The output will indicate “changes not staged for commit.”

On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: cat.js
modified: dog.js
no changes added to commit (use "git add" and/or "git commit -a")

To see the exact changes, use the git diff command:

Terminal window
git diff

This will display a unified diff showing the differences between the working directory and the staging area. Because nothing is staged yet, this is the same as comparing against the last commit:

diff --git a/cat.js b/cat.js
index ab949dc..45819c5 100644
--- a/cat.js
+++ b/cat.js
@@ -1 +1 @@
-console.log("Meow!");
+console.log("Purr...");
diff --git a/dog.js b/dog.js
index 9260ad6..5b62dbb 100644
--- a/dog.js
+++ b/dog.js
@@ -1 +1,2 @@
 console.log("Woof!");
+console.log("Bark!");

Git uses a specific format called unified format to produce diffs. This format can be a bit difficult to read at first because it’s designed for both human and machine consumption. Here’s what you should know about Git diffs:

A diff will represent each file under version control that has been modified since the last commit. The changes for each file are indicated by a 4-line header, such as:

diff --git a/dog.js b/dog.js
index 9260ad6..5b62dbb 100644
--- a/dog.js
+++ b/dog.js

The most important part of this header is the file name, in this case, dog.js.

Following the header, you’ll find one or more change hunks detailing the specific modifications made to the file. Each hunk starts with a line that specifies the line numbers involved in the change, like this:

@@ -1 +1,2 @@

The range information reads as follows: -1 refers to line 1 of dog.js in the last commit (the - sign marks the old version), and +1,2 refers to a two-line range starting at line 1 in the current working directory (the + sign marks the new version).

Finally, the hunk contains the actual changes, represented by three types of lines (note that the coloring may vary depending on the Git implementation):

  • Lines starting with a - indicate a deletion, meaning the line existed in the last commit but is no longer present in the working directory.
  • Lines starting with a + indicate an addition, meaning the line is present in the current working directory but was not in the last commit.
  • Lines that don’t start with either - or + are contextual lines, providing surrounding context for the changes.

For example, the change hunk for dog.js shows the one line added since the last commit, along with a contextual line:

 console.log("Woof!");
+console.log("Bark!");

In the change hunk for cat.js, a modified line is represented as both a deletion and an addition:

-console.log("Meow!");
+console.log("Purr...");

Staging and Committing Changes

Once you have made your changes, stage and commit them. You can do this piecewise. For example, to stage only dog.js, run:

Terminal window
git add dog.js

Check the status again:

On branch master
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: dog.js
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: cat.js

Now, commit the staged changes:

Terminal window
git commit -m "Make dog bark too"

You can repeat this process for cat.js or any other files. Use git log to verify the commit history and git status to check the working directory’s state.
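While changes are split between the staging area and the working directory, git diff can show either half. Plain git diff lists what is not yet staged, and git diff --staged lists what the next commit will contain. Here is a minimal sketch that rebuilds the cat.js and dog.js situation in a throwaway repository (the identity is a placeholder):

```shell
# Recreate the two-file project in a throwaway repository.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
echo 'console.log("Woof!");' > dog.js
git add .
git commit -q -m "Add cat and dog sounds"

# Modify both files, but stage only dog.js.
echo 'console.log("Purr...");' > cat.js
echo 'console.log("Bark!");' >> dog.js
git add dog.js

# Working directory vs. staging area: what is NOT staged yet.
unstaged=$(git diff --name-only)
# Staging area vs. last commit: what the next commit will contain.
staged=$(git diff --staged --name-only)

echo "Not staged: $unstaged"
echo "Staged:     $staged"
```

Dropping --name-only prints the full unified diff for each side, which is a handy final check before running git commit.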

Connecting your Local Machine with GitHub

Currently, the repository we’ve been working on exists only on our development machine, with no remote copies. Let’s change that by allowing our local machine to connect to a remote one, in this case GitHub.

Start by visiting github.com in your web browser. If you don’t already have a GitHub account, sign up for one and log in.

Once logged in, you can create a new repository on GitHub. Follow these steps to ensure the new repository is set up correctly for mirroring an existing local repository:

  1. Navigate to the “create a new repository” page on GitHub.
  2. Provide the following details:
    • Repository name: Match the name of your project directory, e.g., my-project.
    • Description: Optional, add if desired.
    • Visibility: Choose between public (visible to everyone) or private (visible only to you and collaborators you invite).
    • Add a README file: Leave this unchecked.
    • Add .gitignore: Select “None.”
    • Choose a license: Select “None.”

After completing these steps, GitHub will create the repository and redirect you to its page.

Before pushing commits from your local repository to the newly created GitHub repository, you need to configure how your development machine will communicate with GitHub. There are two options for this: SSH and HTTPS.

At the top of your new repository’s page on GitHub, you’ll find a box with buttons to toggle between SSH and HTTPS. Choose your preferred method, as each requires specific setup steps on both GitHub and your development machine. We’ll briefly outline these setups in the following sections.

HTTPS: Creating a Personal Access Token (PAT)

When using HTTPS to communicate with GitHub, Git will prompt you to authenticate during operations that interact with GitHub. However, GitHub does not support simple password-based authentication. Instead, you’ll need to generate a Personal Access Token (PAT), which acts as a secure alternative to a password and includes specific permissions.

To create a PAT, follow the instructions in the GitHub documentation. Here are the recommended settings for this course:

  • Type: Personal Access Token (classic)
  • Scopes: repo

Once generated, the PAT will be a long string. Copy it and store it securely, as you would a password. You’ll need this token when pushing commits from your local repository to GitHub.

Depending on your environment, Git may prompt you for the PAT each time you perform an operation requiring authentication. In some cases, Git will remember the token after the first use.

SSH: Setting up SSH Keys

To communicate with GitHub via SSH, you’ll need to create an SSH key and register it with GitHub. An SSH key is a secure authentication credential that replaces the need for a username and password. Once set up, you’ll typically only need to enter the SSH key’s passphrase once, enabling “passwordless” authentication for future operations.

Follow these steps to set up an SSH key and register it with GitHub:

  1. Check for Existing SSH Keys: Start by verifying if you already have SSH keys on your machine. Refer to GitHub’s guide for instructions.
  2. Generate a New SSH Key: If no keys exist, create one and configure your machine to use it. Detailed steps are available in GitHub’s documentation.
  3. Register the SSH Key with GitHub: Add your newly created SSH key to your GitHub account by following these instructions.
  4. Test the SSH Connection: Confirm that your SSH setup works by testing the connection. See GitHub’s guide for details.

Mirroring a Local Git Repo on GitHub

To mirror your local repository with a GitHub repository, you need to establish a connection using either HTTPS or SSH.

  1. Copy the Repository URL

    On your GitHub repository page, choose either HTTPS or SSH as your preferred communication method. Copy the corresponding URL, which will look like one of these:

    Terminal window
    https://github.com/<username>/<repository>.git # HTTPS
    git@github.com:<username>/<repository>.git # SSH
  2. Add the Remote Repository

    Register the GitHub repository as a remote in your local repository. A Git remote is simply a reference to a repository hosted elsewhere. Use the following command, replacing <HTTPS_or_SSH_URL> with the URL you copied:

    Terminal window
    git remote add origin <HTTPS_or_SSH_URL>
  3. Verify the Remote

    Confirm that the remote was added successfully by running:

    Terminal window
    git remote -v

    This will display a list of remotes and their URLs, which should look like this:

    origin https://github.com/<username>/<repository>.git (fetch)
    origin https://github.com/<username>/<repository>.git (push)
  4. Set the Default Branch

    Rename your current local branch to main, which is the default branch name GitHub uses for new repositories:

    Terminal window
    git branch -M main
  5. Push to GitHub

    Use the git push command to upload your local commit history to the GitHub repository:

    Terminal window
    git push -u origin main

    The -u option establishes a tracking relationship between your local branch and the remote branch, allowing you to use commands like git push and git pull without additional arguments in the future.
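If you would like to rehearse these steps without touching GitHub, a local bare repository can stand in for the remote. The sketch below is only a rehearsal under that assumption: the directory names remote.git and my-project and the identity are placeholders, and the remote URL is a local path instead of an HTTPS or SSH URL.

```shell
base_dir=$(mktemp -d)
cd "$base_dir"

# A bare repository stands in for the GitHub repository in this sketch.
git init -q --bare remote.git

# A local project with one commit.
git init -q my-project
cd my-project
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

# Same steps as above: add the remote, rename the branch, push with -u.
git remote add origin "$base_dir/remote.git"
git branch -M main
git push -q -u origin main

# Confirm the tracking relationship that -u established.
upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "main tracks: $upstream"
git remote -v
```

With the upstream recorded, a plain git push or git pull on main knows which remote branch to talk to.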

Once the push is complete, your files and commit history will be visible on GitHub. Refresh the repository page in your browser to explore its features, such as viewing the commit history.

We will cover additional GitHub repository features as the course progresses.

Starting with an Existing Repo on GitHub

Another common way to begin working on a Git project is by cloning an existing remote repository, such as one hosted on GitHub.

To create a working copy of the remote repository on your development machine, follow these steps:

  1. Choose a Location for the Clone

    If you’re cloning the repository on the same machine where you created the original repo, navigate to a different directory. Alternatively, you can use a different machine (e.g., by connecting via ssh to one of the ENGR servers). We’ll refer to the new location as “location #2” and the original location as “location #1”.

  2. Copy the Repository URL

    On your GitHub repository page, click the green Code button. This opens a dropdown with options for cloning the repository. Select either HTTPS or SSH as your preferred communication method and copy the provided URL. This URL should match the one used when setting up the remote earlier.

  3. Clone the Repository

    At “location #2”, run the following command, replacing <HTTPS_or_SSH_URL> with the URL you copied:

    Terminal window
    git clone <HTTPS_or_SSH_URL>

    This command creates a local copy of the repository in a new directory named after the repository (e.g., my-project/). The directory will contain the complete version history of the project.

  4. Verify the Clone

    Navigate into the cloned directory and list its contents. The files should match the most recent commit pushed from “location #1”. Uncommitted changes there, such as the modifications to cat.js, will not appear.

  5. Check the Remote Connection

    When cloning a repository, Git automatically sets up a connection to the remote repository, naming it origin. You can verify this by running:

    Terminal window
    git remote -v

    The output will display the remote URLs for fetching and pushing, similar to this:

    origin https://github.com/<username>/<repository>.git (fetch)
    origin https://github.com/<username>/<repository>.git (push)

    Additionally, Git configures the main branch of the local repository to track the main branch of the remote repository. This allows you to use commands like git push and git pull without specifying additional arguments.
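The automatic setup performed by git clone can be checked directly. This sketch again assumes a local bare repository standing in for GitHub; the directory names seed and location2 and the identity are placeholders.

```shell
base_dir=$(mktemp -d)
cd "$base_dir"

# Build a small stand-in remote whose default branch is main.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Seed it with one commit.
git init -q seed
cd seed
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
git branch -M main
git push -q "$base_dir/remote.git" main
cd "$base_dir"

# Clone it, as you would at location #2.
git clone -q "$base_dir/remote.git" location2
cd location2

# The remote named origin and the tracking branch were set up by the clone.
git remote -v
tracking=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "main tracks: $tracking"
git --no-pager branch -vv
```

The git branch -vv listing shows the upstream of each local branch in square brackets, which is a quick way to confirm tracking in any clone.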

Once the repository is cloned, you can work on it at “location #2” just as you would at “location #1.” If you plan to work in both locations, there are additional considerations to ensure the commit history remains consistent. These will be covered in the next section.

Working on a Git Repo in Two Different Locations

When working on two separate instances of the same Git repository (e.g., on two different machines), it’s important to take extra care to keep the commit history consistent between them.

In this scenario, we have two clones of the same repository: one at “location #1” and another at “location #2”. Both are connected to the same GitHub repository.

To illustrate, let’s make a new commit at “location #2” and then handle that commit at “location #1”. At “location #2”, modify the dog.js file as follows (adding additional “bark” lines):

dog.js
console.log("Woof!");
console.log("Bark!");
console.log("Bark!");
console.log("Bark!");

After making this change, stage and commit it as before, then push it to the GitHub repository:

Terminal window
git push

Now, suppose you want to resume work at “location #1”. Since changes were made and pushed from “location #2”, you must first ensure that “location #1” has the latest version of the code. To synchronize the repository at “location #1” with the GitHub repository, use the git pull command:

Terminal window
git pull

This command fetches the new commits from GitHub and merges them into the current branch at “location #1”, updating the working directory to match.
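Because git pull combines a fetch with a merge, the two halves can also be run separately when you want to see what is incoming first. The sketch below simulates both locations on one machine, assuming a local bare repository in place of GitHub and placeholder directory names and identity:

```shell
base_dir=$(mktemp -d)
cd "$base_dir"
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Location #1: first commit, pushed to the stand-in remote.
git init -q location1
cd location1
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Woof!");' > dog.js
git add dog.js
git commit -q -m "Add dog sound"
git branch -M main
git remote add origin "$base_dir/remote.git"
git push -q -u origin main

# Location #2: clone, commit, and push.
cd "$base_dir"
git clone -q remote.git location2
cd location2
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Bark!");' >> dog.js
git commit -q -a -m "Make dog bark too"
git push -q

# Back at location #1: fetch, inspect what is new, then integrate it.
cd "$base_dir/location1"
git fetch -q origin
incoming=$(git rev-list --count HEAD..origin/main)
echo "Commits waiting on origin/main: $incoming"
git --no-pager log --oneline HEAD..origin/main
git merge -q --ff-only origin/main
```

The --ff-only option makes the merge step refuse to do anything unless the local branch can simply be moved forward, which is the situation you are in when you pull before starting new work.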

As a rule, pull before you start working in a location. Next, let’s explore what happens if this rule is not followed.

Dealing with Merge Conflicts

Let’s revisit “location #1”. Recall that we had made a modification to cat.js but hadn’t committed it yet. You can verify this change using git diff:

diff --git a/cat.js b/cat.js
index ab949dc..45819c5 100644
--- a/cat.js
+++ b/cat.js
@@ -1 +1 @@
-console.log("Meow!");
+console.log("Purr...");

Commit and push this change at location #1:

Terminal window
git add cat.js
git commit -m "Cat purrs now"
git push

Now, switch to “location #2”. Suppose we forget the rule of pulling changes before starting work. We modify cat.js to add a new line for a hissing sound, without reflecting the changes pushed from “location #1”:

cat.js
console.log("Meow!");
console.log("Hiss!");

At this point, we remember to pull from GitHub. Running git pull results in an error:

error: Your local changes to the following files would be overwritten by merge:
cat.js
Please commit your changes or stash them before you merge.
Aborting

sequenceDiagram
participant Developer
participant Local Repository
participant Remote Repository
Developer->>Local Repository: Modify files
Developer-xRemote Repository: git pull
Remote Repository-->>Local Repository: Error

The error suggests two options: commit the changes or stash them. Let’s explore both.

Option 1: Stashing Changes (Painless)

Stashing temporarily saves your changes and reverts the working directory to the last commit:

Terminal window
git stash

After stashing, the working directory reflects the last committed state. Verify this with git status, which should show a clean working directory. You can also view stashed changes:

Terminal window
git stash list

At this point, we could run git pull and it would successfully pull the most recent commits from GitHub. Then, we could reapply the changes we stashed and keep going from there. However, let’s not do this yet. Instead, let’s explore the more painful option we could have taken when we first tried to pull changes from GitHub.

To reapply the stashed changes (with or without pulling), use:

Terminal window
git stash pop

Note that the subcommand here is called pop because Git’s stash functionality stores stashed changes in a stack.

Here’s a visual representation of the resolution process using the stash:

sequenceDiagram
participant Developer
participant Local Repository
participant Remote Repository
Developer->>Local Repository: Modify files
Developer->>Local Repository: git stash
Local Repository-->>Local Repository: Save changes to stash
Developer->>Remote Repository: git pull
Remote Repository->>Local Repository: Fetch and merge changes
Developer->>Local Repository: git stash pop
Local Repository-->>Local Repository: Reapply stashed changes
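The stash round trip can be tried in isolation. This is a minimal sketch, assuming a throwaway repository with a placeholder identity and no remote involved:

```shell
# Throwaway repository with one committed file.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

# An uncommitted edit, like the "Hiss!" line above.
echo 'console.log("Hiss!");' >> cat.js

# Stash it: the working directory returns to the last commit.
git stash -q
after_stash=$(git status --short)
stash_count=$(git stash list | wc -l)
echo "Entries on the stash stack: $stash_count"

# Pop it: the edit is reapplied and removed from the stash stack.
git stash pop -q
after_pop=$(git status --short)
printf 'status after pop: [%s]\n' "$after_pop"
```

In the real scenario, git pull would run between git stash and git stash pop. If the pulled commits touch the same lines as the stashed edit, git stash pop reports a conflict that has to be resolved by hand.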

Option 2: Committing Changes (Leads to Trouble)

Instead of stashing then pulling, let’s commit the changes directly:

Terminal window
git add cat.js
git commit -m "Cat hisses"

Attempting to push this commit results in an error:

To https://github.com/<username>/my-project.git
! [rejected] main -> main (non-fast-forward)
error: failed to push some refs to 'https://github.com/<username>/my-project.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. If you want to integrate the remote changes,
hint: use 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

This happens because the local and remote branches now have divergent histories. Running git pull produces another, more specific message:

hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint: git config pull.rebase false # merge
hint: git config pull.rebase true # rebase
hint: git config pull.ff only # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.

Again, we have a couple of different options for dealing with the issue. We’ll resolve it by merging the two commit histories together (we’ll discuss rebase later):

Terminal window
git config pull.rebase false # sets pull to merge
git pull

Alternatively, merge the remote-tracking branch explicitly (the earlier git pull attempt has already fetched it):

Terminal window
git merge origin/main

During the merge, Git reports a conflict:

Auto-merging cat.js
CONFLICT (content): Merge conflict in cat.js
Automatic merge failed; fix conflicts and then commit the result.

Trying to run these again will not help:

error: Merging is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm <file>'
hint: as appropriate to mark resolution and make a commit.
fatal: Exiting because of an unresolved conflict.

The underlying problem is that the latest commits in the two commit histories modify the same region of cat.js in different ways.

This is known as a conflict, and it is preventing Git from being able to automatically merge the two commit histories together (which it can normally do if there are no conflicts). This means we must manually resolve the conflict to complete the merge.

Inspecting cat.js reveals the conflict:

<<<<<<< HEAD
console.log("Meow!");
console.log("Hiss!");
=======
console.log("Purr...");
>>>>>>> 1817ae01cb389bd5f48282e8b09f56a84e7f474e

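The lines between <<<<<<< HEAD and ======= show the local changes, while the lines between ======= and >>>>>>> show the remote changes (the hash after >>>>>>> identifies the remote commit). Resolve the conflict by editing the file and removing the marker lines. For example, merge both changes: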

cat.js
console.log("Meow!");
console.log("Hiss!");
console.log("Purr...");

Mark the conflict as resolved:

Terminal window
git add cat.js
git commit

Alternatively, instead of using git commit, you can use git merge --continue (which was introduced in Git 2.12 and aligns with the git rebase command).
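To practice conflict resolution without two locations, the same conflict can be reproduced inside one throwaway repository. In this sketch a second branch named other (a hypothetical name) plays the role of the remote history, and the identity is a placeholder:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
git branch -M main

# The branch "other" stands in for the commit pushed from location #1.
git checkout -q -b other
echo 'console.log("Purr...");' > cat.js
git commit -q -a -m "Cat purrs now"

# Back on main, make the competing change from location #2.
git checkout -q main
echo 'console.log("Hiss!");' >> cat.js
git commit -q -a -m "Cat hisses"

# The merge stops; list the files that still contain conflict markers.
git merge other || echo "merge stopped with conflicts"
conflicted=$(git diff --name-only --diff-filter=U)
echo "Conflicted files: $conflicted"

# Resolve by writing the intended contents, then stage and conclude the merge.
printf '%s\n' 'console.log("Meow!");' 'console.log("Hiss!");' \
  'console.log("Purr...");' > cat.js
git add cat.js
git commit -q --no-edit

# A merge commit has two parents: the hash itself plus two parent hashes.
parents=$(git rev-list --parents -n 1 HEAD | wc -w)
echo "Fields in the merge commit record: $parents"
```

If a merge turns out to be more tangled than expected, git merge --abort returns the repository to the state it was in before the merge started, as long as the merge has not been concluded yet.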

Finally, push the resolved changes:

Terminal window
git push

Return to “location #1” and pull the latest changes to ensure both locations have the same commit history:

Terminal window
git pull

Check the commit history with git log to confirm consistency across locations and GitHub.

Here’s a visual representation of the merge conflict resolution process:

sequenceDiagram
participant Developer
participant Local Repository
participant Remote Repository
Developer->>Local Repository: Modify files
Developer->>Local Repository: git commit
Developer-xRemote Repository: git push
Remote Repository-->>Local Repository: Reject push (divergent branches)
Developer->>Remote Repository: git pull
Remote Repository->>Local Repository: Fetch and attempt merge
Local Repository-->>Developer: Report merge conflict
Developer->>Local Repository: Resolve conflict manually
Developer->>Local Repository: git commit
Developer->>Remote Repository: git push

Cleaning Up Your Commit History

Git is designed to help you maintain a clean and organized commit history. Each commit should represent a single, coherent change to your project. This approach makes it easier to understand your project’s history, collaborate with others, and even showcase your work to potential employers.

A common way to tidy up your commit history is by using Git’s rebase operation. This allows you to rewrite your commit history by combining, splitting, or reordering commits.

Reviewing the Current Commit History

Let’s start by examining the current commit history using git log:

commit da7e2e995e86828a46779fb3ae2c06efa9e43ea3 (HEAD -> main, origin/main)
Merge: 673e6b8 1817ae0
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:51:26 2025 -0800

    Merge branch 'main' of https://github.com/adulbrich/my-project

commit 673e6b830f9adb1cbce9aa200afe9dd40dfc0a4b
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:38:06 2025 -0800

    Cat hisses

commit 1817ae01cb389bd5f48282e8b09f56a84e7f474e
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:35:05 2025 -0800

    Cat purrs now

commit 3b0a84cd3767da40a9e9ed8d51722d589871a443
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:06:39 2025 -0800

    Make dog bark too

commit c4f570c3ce30fcf50f8f8a7736306030667e9337
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:04:30 2025

    Add cat and dog sounds

This history is somewhat messy. It includes a merge commit that doesn’t represent a meaningful change and multiple commits related to the same feature. Let’s clean it up using rebase.
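A more compact view can make merge commits easier to spot than the full log. The sketch below builds a small scratch history containing one merge (the names `demo` and `other-location` are hypothetical), then prints it as a graph and counts the merge commits.

```shell
# Sketch: spot merge commits with a compact graph view.
# "demo" and "other-location" are hypothetical names.
set -eu

work="$(mktemp -d)"
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

git checkout -q -b other-location
echo 'console.log("Woof!");' > dog.js
git add dog.js
git commit -q -m "Add dog sound"

git checkout -q -
echo 'console.log("Hiss!");' >> cat.js
git commit -q -am "Cat hisses"

# Each side changed a different file, so this merge completes on its own.
git merge -q --no-edit other-location

# One line per commit, with the branch structure drawn on the left.
git log --oneline --graph

# Count the merge commits reachable from HEAD.
merges="$(git rev-list --count --merges HEAD)"
echo "Merge commits: $merges"
```

In your own project, `git log --oneline --graph` on its own is usually enough to see where the history forked and rejoined.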

Starting an Interactive Rebase

To begin, run the following command:

Terminal window
git rebase -i HEAD~3

This starts an interactive rebase of the commits that come after HEAD~3, that is, the commit three steps back from HEAD when following first parents. Git will open an editor displaying those commits:

pick 3b0a84c Make dog bark too
pick 673e6b8 Cat hisses
pick 1817ae0 Cat purrs now
# Rebase c4f570c..da7e2e9 onto c4f570c (3 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
# commit's log message, unless -C is used, in which case
# keep only this commit's message; -c is same as -C but
# opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# create a merge commit using the original merge commit's
# message (or the oneline, if no original merge commit was
# specified); use -c <commit> to reword the commit message
# u, update-ref <ref> = track a placeholder for the <ref> to be updated
# to this position in the new commits. The <ref> is
# updated at the end of the rebase
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#

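Notice that the merge commit does not show up. This is by design: by default, git rebase drops merge commits and replays the remaining commits as a linear history (pass --rebase-merges if you want to keep them). When you use git rebase, it:

  • Temporarily sets aside your commits
  • Moves your branch to the new base (here, commit c4f570c)
  • Reapplies your commits one by one on top of that base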

If we wanted to include our initial commit, we could have used git rebase -i --root instead.

For now, let’s clean up our commit history by squashing the last two commits into the first one. To do this, we’ll change the file to look like this:

pick 3b0a84c Make dog bark too
squash 673e6b8 Cat hisses
squash 1817ae0 Cat purrs now

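Save and close the editor. Git will reapply the commits in the specified order. Because both cat commits edit the same lines of cat.js, Git may pause on the same conflict you resolved earlier; if it does, resolve it again and continue the rebase as described below. Git then opens another editor for the new commit message. You can keep the default message or write a more descriptive one.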
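To try a squash without risking a real project, here is a minimal sketch that runs unattended in a scratch repository. Several things are assumptions made for the demo: the commits touch separate files so nothing conflicts, `GIT_SEQUENCE_EDITOR` is set to a `sed` command that stands in for editing the todo list by hand (GNU `sed -i` syntax), and `GIT_EDITOR=true` accepts the combined commit message as-is.

```shell
# Sketch: squash the last three commits into one, non-interactively.
# The repository name and file names are hypothetical.
set -eu

work="$(mktemp -d)"
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Four commits that touch different files, so replaying them cannot conflict.
for name in base dog hiss purr; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name"
done

# Keep the first "pick" and turn every later "pick" into "squash".
# This replaces the manual edit of the todo list (assumes GNU sed).
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$ s/^pick /squash /'" \
GIT_EDITOR=true \
git rebase -i HEAD~3

git log --oneline
count="$(git rev-list --count HEAD)"
echo "Commits after squashing: $count"
```

The three squashed commits end up as a single commit whose message concatenates the originals; in interactive use you would rewrite that message in the editor.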

Verifying the Updated History

After completing the rebase, check the updated commit history with git log:

commit 600f9a073e36a932b870c9886229d02b2376ae21 (HEAD -> main)
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:06:39 2025 -0800

    dog barks too, cat hisses and purrs

commit c4f570c3ce30fcf50f8f8a7736306030667e9337
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date: Thu Feb 27 21:04:30 2025

    Add cat and dog sounds

The history is now cleaner, with related changes grouped into a single commit.

Interactive rebase also allows you to:

  • Reorder commits: Rearrange the order of commits in the editor.
  • Split commits: Use the edit command to break a commit into smaller parts.
  • Remove commits: Use the drop command to delete a commit.
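Splitting is the least obvious of these, so here is a rough sketch of one common approach: mark the commit as `edit`, undo it with `git reset HEAD^` while keeping its changes in the working tree, and commit the pieces separately. The repository and file names are hypothetical, and the `sed` command (GNU syntax) only stands in for changing `pick` to `edit` by hand.

```shell
# Sketch: split one commit into two using the "edit" command.
# Repository and file names are hypothetical.
set -eu

work="$(mktemp -d)"
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Add base"

# One commit that mixes two unrelated changes.
echo 'console.log("Meow!");' > cat.js
echo 'console.log("Woof!");' > dog.js
git add cat.js dog.js
git commit -q -m "Add cat and dog sounds"

# Mark that commit as "edit"; the rebase pauses right after applying it.
GIT_SEQUENCE_EDITOR="sed -i -e '1 s/^pick /edit /'" git rebase -i HEAD~1

# Undo the commit but keep its changes in the working tree...
git reset -q HEAD^

# ...then commit each piece on its own.
git add cat.js
git commit -q -m "Add cat sound"
git add dog.js
git commit -q -m "Add dog sound"

# Nothing is left to replay, so this simply finishes the rebase.
GIT_EDITOR=true git rebase --continue

git log --oneline
```

The original mixed commit is gone from the branch, replaced by two smaller commits that each describe one change.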

If conflicts arise during the rebase, resolve them manually, stage the resolved files with git add, then continue with:

Terminal window
git rebase --continue

To abort the rebase and restore the original history, use:

Terminal window
git rebase --abort
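Both paths can be tried safely in a scratch repository. The sketch below forces a conflict by having two branches change the same line, first backs out with `--abort`, then repeats the rebase and finishes it with `--continue`. The names `demo` and `purr` are hypothetical, and `GIT_EDITOR=true` is an assumption so the script does not open an editor.

```shell
# Sketch: abort a conflicted rebase, then redo it and continue.
# "demo" and "purr" are hypothetical names.
set -eu

work="$(mktemp -d)"
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
target="$(git symbolic-ref --short HEAD)"   # name of the default branch

# Two branches change the same line, which guarantees a conflict.
git checkout -q -b purr
echo 'console.log("Purr...");' > cat.js
git commit -q -am "Cat purrs now"
git checkout -q "$target"
echo 'console.log("Hiss!");' > cat.js
git commit -q -am "Cat hisses"
git checkout -q purr
before="$(git rev-parse HEAD)"

# Attempt 1: the rebase stops on the conflict, and we back out.
git rebase "$target" || true
git rebase --abort
after_abort="$(git rev-parse HEAD)"
echo "Before: $before"
echo "After abort: $after_abort"

# Attempt 2: resolve the conflict, stage the file, and continue.
git rebase "$target" || true
printf '%s\n' 'console.log("Hiss!");' 'console.log("Purr...");' > cat.js
git add cat.js
GIT_EDITOR=true git rebase --continue

git log --oneline
```

After the abort, the branch points at exactly the commit it started from; after the continue, the `purr` commit sits on top of the other branch's latest commit.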

Updating the Remote Repository

After rewriting history locally, you’ll need to update the remote repository. A regular git push will be rejected because your rewritten branch no longer contains the commits that are on the remote branch. Instead, use:

Terminal window
git push --force-with-lease

This overwrites the remote branch only if it still points to the commit you last fetched, so you do not silently discard changes that someone else pushed in the meantime.
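The difference between the push variants is easiest to see side by side. In this sketch a local bare repository named `remote.git` stands in for GitHub, and `mine` and `teammate` are hypothetical clones; it records whether each push is accepted or rejected.

```shell
# Sketch: plain push vs. --force-with-lease after rewriting history.
# "remote.git", "mine", and "teammate" are hypothetical names.
set -eu

work="$(mktemp -d)"
cd "$work"

git init -q seed
cd seed
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git mine
git clone -q remote.git teammate

cd mine
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'console.log("Hiss!");' >> cat.js
git commit -q -am "Cat hiss"
git push -q

# Rewrite a commit that was already pushed (here: fix its message).
git commit -q --amend -m "Cat hisses"

# A plain push is refused: the remote commit is not an ancestor of ours.
if git push -q 2>/dev/null; then plain=accepted; else plain=rejected; fi

# Nobody else has pushed, so the lease holds and the overwrite goes through.
if git push -q --force-with-lease 2>/dev/null; then first=accepted; else first=rejected; fi

# A teammate now pushes work that we have not fetched.
cd ../teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
git pull -q
echo 'console.log("Purr...");' >> cat.js
git commit -q -am "Cat purrs now"
git push -q

# We rewrite history again; this time the lease fails and nothing is lost.
cd ../mine
git commit -q --amend -m "Cat hisses loudly"
if git push -q --force-with-lease 2>/dev/null; then second=accepted; else second=rejected; fi

echo "plain push: $plain"
echo "force-with-lease, remote unchanged: $first"
echo "force-with-lease, teammate pushed: $second"
```

When the lease fails, fetch and integrate the teammate's commits before pushing again. Note that running `git fetch` updates your remote-tracking branch, so a later `--force-with-lease` would then overwrite those commits unless you integrate them first.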

Additional Readings and Resources

A Git cheat sheet, which summarizes the most important Git commands, is one of the handiest resources to keep around as you continue learning Git and GitHub. You can find a couple of nice Git cheat sheets at these locations:

In addition, Julia Evans does a great job at explaining Git’s quirks in her blog posts, such as:

And of course, there’s the Pro Git book, which is a great resource for learning more about Git. Most visuals on this page are sourced from the book. Check out Chapters 1 and 2 of the book for a more in-depth introduction to Git.

Summary

In this lecture, we covered the basics of using Git and GitHub for version control. We learned how to:

  • Initialize a Git repository
  • Stage and commit changes
  • Push and pull changes between local and remote repositories
  • Handle merge conflicts
  • Clean up commit history using interactive rebase
  • Work with multiple locations and branches
  • Use SSH and HTTPS for remote connections
  • Clone an existing repository from GitHub
  • Use Git commands to manage and review changes

Here’s a table summarizing key git commands:

| Command | Description |
| --- | --- |
| git init | Initialize a new Git repository |
| git add | Stage changes for commit |
| git diff | Show changes between commits, commit and working tree, etc. |
| git commit | Commit staged changes with a message |
| git status | Show the status of the working directory |
| git log | View commit history |
| git merge | Join two or more development histories together |
| git rebase | Reapply commits on top of another base |
| git pull | Fetch and merge changes from a remote repository |
| git push | Push local commits to a remote repository |
| git stash | Temporarily save changes |
| git stash pop | Reapply stashed changes |
| git clone <url> | Clone a remote repository |
| git config | Configure Git settings |
| git remote add origin <url> | Add a remote repository |
| git remote -v | Show remote repository URLs |