Git and GitHub Basics

Git and GitHub are indispensable tools in modern software development, used in nearly every significant project worldwide.

Although their core concepts are straightforward, their extensive features can sometimes feel overwhelming. Mastering these tools is essential for contributing to real-world software projects, and we will use them extensively throughout this course. To begin, we will explore Git and GitHub.

Initially, we will focus on using these tools as an individual developer. Later, we will delve into team-based collaboration workflows.

Git and GitHub are distinct tools, so we will examine each separately, starting with Git.

What is Git?

Git is a version control system (VCS) designed to track changes to a set of files over time, allowing you to access and manage previous versions.

Using a VCS provides several key benefits:

Reverting files to earlier versions.
Comparing changes made over time.
Synchronizing version history across multiple locations or collaborators.
Identifying when and by whom changes were made.

While many VCSs exist, Git stands out in how it stores version history. Git represents version history as a series of snapshots of the project.

When you create a new version (known as a “commit”), Git saves a snapshot of all project files. This version history is essentially a stream of these snapshots.

Git optimizes storage by avoiding duplication. If a file remains unchanged, Git doesn’t store a new copy; instead, it links to the previous version.
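As a rough illustration of this idea (to try once Git is installed, as covered later in this section), Git identifies file contents by a hash, so identical contents map to the same stored object. The file names below are made up for the sketch, and a temporary directory is used so nothing real is touched:

```shell
# Work in a throwaway directory.
cd "$(mktemp -d)"
git init -q

printf 'console.log("Meow!");\n' > a.js
cp a.js b.js                               # identical contents, different name
printf 'console.log("Woof!");\n' > c.js

# git hash-object prints the ID Git would use to store each file's contents.
id_a=$(git hash-object a.js)
id_b=$(git hash-object b.js)
id_c=$(git hash-object c.js)

echo "a.js: $id_a"
echo "b.js: $id_b"   # same ID as a.js, so the contents need to be stored only once
echo "c.js: $id_c"   # different contents, different ID
```

Because the ID depends only on the contents, an unchanged file keeps the same ID from one snapshot to the next, and Git can reuse the object it already has.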

This stream of snapshots is stored in a Git repository (repo) located within the project directory:

Git Repository Structure

Representing data as snapshots enables powerful features like branches, which we will explore later.

Another defining feature of Git is that it is a distributed VCS. Every location containing the version history is a complete mirror of the entire history, including every version, commit, and snapshot.

Distributed Version Control

This distributed nature has both advantages and challenges. A significant advantage is that every clone of the repository acts as a complete backup, increasing resilience to machine failures.

However, because every clone contains the full version history, care must be taken to prevent the snapshot streams from diverging.

For example, if developers A and B make commits in their respective clones simultaneously, they create different versions of the project’s history. While these versions can be reconciled, the process can become difficult if the changes diverge significantly or conflict.

To address this, workflows are often used in Git projects with collaborators to ensure smooth reconciliation of work. We will discuss these workflows in detail later.

Finally, most Git operations are local, meaning you can work on a Git project without an internet connection.

Remote operations, such as synchronizing copies, require a network connection, but these are only necessary when explicitly executed.

We will soon explore how to use Git in practice.

What is GitHub?

GitHub is a cloud-based hosting service for Git repositories, widely used to back up code and facilitate collaboration. Developers use Git commands to synchronize their code with a remote repository on GitHub as they make commits.

However, GitHub offers much more than just hosting. It provides a suite of tools that build upon hosted Git repositories to serve as a central hub for collaboration.

These tools, accessible through GitHub’s web client, include:

Bug and issue tracking.
Code review via pull requests.
Task management.
Continuous integration.
Social features for developers.

GitHub is especially popular for team-based projects. A code project is hosted in a repository on GitHub, which acts as the central point for collaboration.

Each team member works on their own clone of the central GitHub repository. As they write code and make commits to their local clone, they push those commits back to the central repository. Other collaborators can then pull these commits from the central repository into their own clones, ensuring everyone stays up to date and can incorporate each other’s changes.

Later in this course, we will explore GitHub’s collaboration tools and workflows in greater detail, focusing on how they support team-based projects.

Basic Workflow

Let’s walk through a basic workflow for an individual developer using Git and GitHub, focusing on Git operations for making commits and synchronizing them.

Git is primarily a command-line tool, so we will emphasize the command-line version of Git operations. While I encourage you to become comfortable with the command-line interface to access Git’s full functionality, you are free to use GUI Git clients like GitHub Desktop or GitKraken. Many code editors, such as VS Code, also include built-in Git clients.

If you plan to use GitKraken for this course, note that working with private repositories on GitHub requires GitKraken Pro. While GitKraken Pro is typically a paid service, students can access it for free through the GitHub Student Developer Pack, which also includes other useful tools.

Installing Git

To use Git, you need to have a Git client installed on your development machine. The Pro Git book provides detailed installation instructions for various platforms. Here are some platform-specific notes:

Windows: Install Git for Windows, which includes a command-line version of Git (called Git Bash) and a simple GUI.
Mac: Use Homebrew to install Git with the following command: brew install git
Linux: The command-line version of Git is typically pre-installed (including on the OSU ENGR servers).

Configuring Git

If you’re using Git for the first time on your development machine, you’ll need to configure some basic settings before you can start. At a minimum, you must set your name and email address, which Git uses to identify you as the author of your commits. Use the following commands to configure these settings:

git config --global user.name "John Doe"
git config --global user.email "john.doe@example.com"

If you already have a GitHub account, you can use your GitHub email address here.

On the OSU ENGR servers, Git’s output might look a little strange because of a color setting. You can fix this by running the following command:

git config --global color.ui false
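To confirm that the settings were saved, you can read them back. The following is a minimal sketch; the first line points HOME at a temporary directory purely so the demonstration does not modify your real configuration, and you would omit it to inspect your actual settings:

```shell
# Assumption for the demo only: use a throwaway HOME so real settings stay untouched.
export HOME="$(mktemp -d)"

git config --global user.name "John Doe"
git config --global user.email "john.doe@example.com"

# Read individual values back.
git config --global --get user.name
git config --global --get user.email

# Or list everything stored in the global configuration file.
git config --global --list
```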

Starting a Git Project

Now that Git is installed and configured, you’re ready to start using it. There are two main ways to begin working on a Git-based project:

Create a new repository from scratch and eventually mirror it on GitHub
Clone an existing repository, such as one hosted on GitHub

We’ll explore both methods in detail.

Creating a Git Repository from Scratch

A Git repository is always tied to a directory, which aligns well with the practice of keeping code projects in their own dedicated directories. Git allows you to selectively manage which files and subdirectories are placed under version control.

To create a new Git repository, follow these steps:

mkdir my-project # Create a new project directory
cd my-project    # Navigate into the directory
git init         # Initialize a Git repository

The git init command creates an initial “database” to store your project’s version history and other Git-related information. This database resides in a hidden .git/ directory within your project folder.

Hidden directories (those starting with a .) are not visible by default, but you can view them using:

Unix Terminals: ls -a
Windows Command Prompt: dir /a:hd
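On a Unix-style terminal, the whole sequence can be checked as follows. This is a small sketch that runs in a temporary directory (an assumption made so the demonstration leaves no files behind):

```shell
cd "$(mktemp -d)"          # throwaway location for the demonstration
mkdir my-project
cd my-project
git init -q                # -q suppresses the informational message

ls -a                      # the listing includes the hidden .git directory
git rev-parse --git-dir    # prints the path to the repository database: .git
```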

States in a Git Repository

Files in a Git repository can exist in different states:

Not Staged for Commit:
- untracked: the file is not tracked by Git and is not included in the version history.
- modified: the file is tracked by Git, but its contents have changed since the last commit.
- deleted: the file is tracked by Git, but it has been deleted from the working directory.
Staged: The file is marked to be included in the next commit. It can be in state new file, modified, or deleted, respectively.
Committed: The file’s changes are saved in the repository, and the file has no changes since the last commit.

This means that before a file can be committed, it must be staged.
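These states can be observed side by side with the compact form of git status. The sketch below uses made-up file names and a temporary repository; in the short format, the first column describes the staging area and the second describes the working directory:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # identity for the demo commits (assumed values)
git config user.email "user@example.com"

echo one > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo two >> tracked.txt                    # tracked file modified, not staged
echo new > untracked.txt                   # file Git is not tracking
echo staged > staged.txt
git add staged.txt                         # new file staged for the next commit

# Short status codes: "A " staged new file, " M" modified but not staged, "??" untracked.
git status --short
```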

In general, we can think of a Git project as having three separate parts: the working directory, the staging area, and the repository itself (i.e. the .git/ directory).

The working directory is where your files live. This is where you write code, edit files, and make changes. The staging area is a temporary holding area for files that are ready to be committed. The repository is where the version history of your project is stored.

In particular, as we work on a project, we may make changes to many files at once. At some point, we may decide that we are ready to commit some of those changes but not ready to commit others. This is what the staging area allows us to do. The working directory, the staging area, and the Git repository interact like this:

sequenceDiagram
  .git Directory ->> Working Directory: Checkout the latest commit
  Working Directory ->> Staging Area: Add files to staging area
  Staging Area ->> .git Directory: Commit staged changes

Let’s run through an example of this workflow.

Staging Files

With the repository initialized, you can now add files to version control. Start by creating two files in your project directory, cat.js and dog.js, and add some content to them:

echo 'console.log("Meow!");' > cat.js
echo 'console.log("Woof!");' > dog.js

Next, check the current state of your working directory using the git status command:

git status

The output will look something like this:

On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        cat.js
        dog.js

nothing added to commit but untracked files present (use "git add" to track)

The “untracked files” section indicates that Git sees these files but is not yet tracking them. To start tracking the files, use the git add command:

git add cat.js dog.js

Alternatively, to track all files, you can use:

git add .

Run git status again, and you’ll see the updated output:

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   cat.js
        new file:   dog.js

Committing Files

Now the files are staged for a commit. To save this snapshot, use the git commit command:

git commit

Git will open a text editor for you to write a commit message. Enter a brief description, such as:

Add cat and dog sounds

Save and close the editor to complete the commit. If you check the status again, you’ll see:

nothing to commit, working tree clean

Alternatively, you can skip the editor by using the -m option to specify the commit message directly:

git commit -m "Add cat and dog sounds"

We can use git config to set the editor Git opens when we run git commit. For example, to make vim the default editor for your user account, run this command (replace vim with the command that launches your preferred editor):

git config --global core.editor vim

To view the commit history, use:

git log

The output will include details like the commit hash, author, timestamp, and message:

commit c4f570c3ce30fcf50f8f8a7736306030667e9337 (HEAD -> master)
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:04:30 2025 -0800

    Add cat and dog sounds

The long hexadecimal value associated with the commit is known as the commit hash. Here, the commit hash is:

c4f570c3ce30fcf50f8f8a7736306030667e9337

There are a couple of things to know about the commit hash:

The commit hash is a checksum computed based on the contents of the commit itself. This means that the commit hash verifies the integrity of the commit it’s associated with.
The commit hash also serves as a unique identifier for the commit it’s associated with. When we need to refer to a specific commit in certain Git operations, we will typically do so using the hash of that commit.
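As a small sketch of how a hash is used to refer to a commit, the commands below look up the latest commit in a temporary repository by its full and abbreviated hash. The repository contents and identity values are assumptions for the demonstration:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

full=$(git rev-parse HEAD)            # full hash of the latest commit
short=$(git rev-parse --short HEAD)   # abbreviated form of the same hash

echo "full:  $full"
echo "short: $short"

# Both forms name the same commit, so either can be passed to Git commands.
git show --no-patch --format='%H %s' "$short"
git cat-file -t "$full"               # prints the object type: commit
```

In practice, the abbreviated hash is usually enough, as long as it is unambiguous within the repository.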

Reviewing Changes with `git diff`

Now, let’s modify the files. Update cat.js and dog.js so that they contain the following, respectively:

console.log("Purr...");

console.log("Woof!");
console.log("Bark!");

Before committing these changes, check the status again:

git status

The output will indicate “changes not staged for commit.”

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   cat.js
        modified:   dog.js

no changes added to commit (use "git add" and/or "git commit -a")

To see the exact changes, use the git diff command:

git diff

This will display a unified diff showing the differences between the current working directory and the last commit:

diff --git a/cat.js b/cat.js
index ab949dc..45819c5 100644
--- a/cat.js
+++ b/cat.js
@@ -1 +1 @@
-console.log("Meow!");
+console.log("Purr...");
diff --git a/dog.js b/dog.js
index 9260ad6..5b62dbb 100644
--- a/dog.js
+++ b/dog.js
@@ -1 +1,2 @@
 console.log("Woof!");
+console.log("Bark!");

Git uses a specific format called unified format to produce diffs. This format can be a bit difficult to read at first because it’s designed for both human and machine consumption. Here’s what you should know about Git diffs:

A diff will represent each file under version control that has been modified since the last commit. The changes for each file are indicated by a 4-line header, such as:

diff --git a/dog.js b/dog.js
index 9260ad6..5b62dbb 100644
--- a/dog.js
+++ b/dog.js

The most important part of this header is the file name, in this case, dog.js.

Following the header, you’ll find one or more change hunks detailing the specific modifications made to the file. Each hunk starts with a line that specifies the line numbers involved in the change, like this:

@@ -1 +1,2 @@

The range information describes where the hunk applies in dog.js: -1 refers to line 1 in the last commit (represented by the - sign), and +1,2 refers to a range that starts at line 1 and spans 2 lines in the current working directory (represented by the + sign).

Finally, the hunk contains the actual changes, represented by three types of lines (note that the coloring may vary depending on the Git implementation):

Lines starting with a - indicate a deletion, meaning the line existed in the last commit but is no longer present in the working directory.
Lines starting with a + indicate an addition, meaning the line is present in the current working directory but was not in the last commit.
Lines that don’t start with either - or + are contextual lines, providing surrounding context for the changes.

For example, the change hunk for dog.js shows the one line added since the last commit, along with a contextual line:

 console.log("Woof!");
+console.log("Bark!");

In the change hunk for cat.js, a modified line is represented as both a deletion and an addition:

-console.log("Meow!");
+console.log("Purr...");

Staging and Committing Changes

Once you have made your changes, you should stage and commit again. You can do this piecewise. For example, to stage only dog.js, run:

git add dog.js

Check the status again:

On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   dog.js

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   cat.js

Now, commit the staged changes:

git commit -m "Make dog bark too"

You can repeat this process for cat.js or any other files. Use git log to verify the commit history and git status to check the working directory’s state.
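When staging piecewise, it can help to check what is staged and what is not before committing. The sketch below replays the example in a temporary repository (identity values are assumed) and uses git diff --staged, which compares the staging area with the last commit, alongside plain git diff:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo 'console.log("Meow!");' > cat.js
echo 'console.log("Woof!");' > dog.js
git add cat.js dog.js
git commit -q -m "Add cat and dog sounds"

# Change both files, but stage only dog.js.
echo 'console.log("Purr...");' > cat.js
echo 'console.log("Bark!");' >> dog.js
git add dog.js

git diff --name-only            # unstaged changes: cat.js
git diff --staged --name-only   # staged changes:   dog.js

git commit -q -m "Make dog bark too"
git status --short              # cat.js is still modified and unstaged
```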

Connecting your Local Machine with GitHub

Currently, the repository we’ve been working on exists only on our development machine, with no remote copies. Let’s change that by allowing our local machine to connect to a remote one, in this case GitHub.

Start by visiting github.com in your web browser. If you don’t already have a GitHub account, sign up for one and log in.

Once logged in, you can create a new repository on GitHub. Follow these steps to ensure the new repository is set up correctly for mirroring an existing local repository:

Navigate to the “create a new repository” page on GitHub.
Provide the following details:
- Repository name: Match the name of your project directory, e.g., my-project.
- Description: Optional, add if desired.
- Visibility: Choose between public (visible to everyone) or private (visible only to you and collaborators you invite).
- Add a README file: Leave this unchecked.
- Add .gitignore: Select “None.”
- Choose a license: Select “None.”

After completing these steps, GitHub will create the repository and redirect you to its page.

Before pushing commits from your local repository to the newly created GitHub repository, you need to configure how your development machine will communicate with GitHub. There are two options for this: SSH and HTTPS.

At the top of your new repository’s page on GitHub, you’ll find a box with buttons to toggle between SSH and HTTPS. Choose your preferred method, as each requires specific setup steps on both GitHub and your development machine. We’ll briefly outline these setups in the following sections.

HTTPS: Creating a Personal Access Token (PAT)

When using HTTPS to communicate with GitHub, Git will prompt you to authenticate during operations that interact with GitHub. However, GitHub does not support simple password-based authentication. Instead, you’ll need to generate a Personal Access Token (PAT), which acts as a secure alternative to a password and includes specific permissions.

To create a PAT, follow the instructions in the GitHub documentation. Here are the recommended settings for this course:

Type: Personal Access Token (classic)
Scopes: repo

Once generated, the PAT will be a long string. Copy it and store it securely, as you would a password. You’ll need this token when pushing commits from your local repository to GitHub.

Depending on your environment, Git may prompt you for the PAT each time you perform an operation requiring authentication. In some cases, Git will remember the token after the first use.
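Whether Git remembers the token is controlled by a credential helper. As one possible setup, the sketch below enables Git’s built-in cache helper, which keeps credentials in memory for a limited time. This helper is not available on every platform (Git for Windows, for example, typically ships its own credential manager), so treat it as an assumption about your environment. The first line redirects HOME to a temporary directory only so the demonstration leaves your real configuration alone:

```shell
# Assumption for the demo only: use a throwaway HOME so real settings stay untouched.
export HOME="$(mktemp -d)"

# Keep credentials in memory for a limited time (the default is 15 minutes).
git config --global credential.helper cache

# Or choose the timeout explicitly, here one hour (3600 seconds).
git config --global credential.helper 'cache --timeout=3600'

git config --global --get credential.helper
```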

SSH: Setting up SSH Keys

To communicate with GitHub via SSH, you’ll need to create an SSH key and register it with GitHub. An SSH key is a secure authentication credential that replaces the need for a username and password. Once set up, you’ll typically only need to enter the SSH key’s passphrase once, enabling “passwordless” authentication for future operations.

Follow these steps to set up an SSH key and register it with GitHub:

Check for Existing SSH Keys: Start by verifying if you already have SSH keys on your machine. Refer to GitHub’s guide for instructions.
Generate a New SSH Key: If no keys exist, create one and configure your machine to use it. Detailed steps are available in GitHub’s documentation.
Register the SSH Key with GitHub: Add your newly created SSH key to your GitHub account by following these instructions.
Test the SSH Connection: Confirm that your SSH setup works by testing the connection. See GitHub’s guide for details.
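For orientation, key generation typically looks something like the sketch below. The key location and the empty passphrase are assumptions chosen so the example can run unattended in a temporary directory; for real use, follow GitHub’s guides, keep the key in the default location under ~/.ssh, and choose a passphrase:

```shell
# Hypothetical key location for this sketch; keys normally live in ~/.ssh.
keydir="$(mktemp -d)"

# Generate an Ed25519 key pair. -N "" sets an empty passphrase, which is only
# reasonable for a throwaway demo.
ssh-keygen -q -t ed25519 -C "john.doe@example.com" -f "$keydir/id_ed25519" -N ""

# The .pub file holds the public key, which is the part you register with GitHub.
cat "$keydir/id_ed25519.pub"

# After registering the key, test the connection (requires network access):
#   ssh -T git@github.com
```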

Mirroring a Local Git Repo on GitHub

To mirror your local repository with a GitHub repository, you need to establish a connection using either HTTPS or SSH.

Copy the Repository URL

On your GitHub repository page, choose either HTTPS or SSH as your preferred communication method. Copy the corresponding URL, which will look like one of these:
```
https://github.com/<username>/<repository>.git # HTTPS
git@github.com:<username>/<repository>.git     # SSH
```
Add the Remote Repository

Register the GitHub repository as a remote in your local repository. A Git remote is simply a reference to a repository hosted elsewhere. Use the following command, replacing <HTTPS_or_SSH_URL> with the URL you copied:
```
git remote add origin <HTTPS_or_SSH_URL>
```
Verify the Remote

Confirm that the remote was added successfully by running:
```
git remote -v
```
This will display a list of remotes and their URLs, which should look like this:
```
origin  https://github.com/<username>/<repository>.git (fetch)
origin  https://github.com/<username>/<repository>.git (push)
```
Set the Default Branch

Ensure your local repository uses main as the default branch name:
```
git branch -M main
```
Push to GitHub

Use the git push command to upload your local commit history to the GitHub repository:
```
git push -u origin main
```
The -u option establishes a tracking relationship between your local branch and the remote branch, allowing you to use commands like git push and git pull without additional arguments in the future.
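The tracking relationship can be inspected directly. In the sketch below, a bare repository on the local disk stands in for the GitHub repository (an assumption that lets the example run without a network connection or an account); the commands are otherwise the same as above:

```shell
base="$(mktemp -d)"

# A bare repository on disk plays the role of the GitHub repository.
git init -q --bare "$base/remote.git"

# A local repository with one commit.
git init -q "$base/my-project"
cd "$base/my-project"
git config user.name "Example User"
git config user.email "user@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
git branch -M main

git remote add origin "$base/remote.git"
git push -q -u origin main

# The local main branch now tracks origin/main.
git rev-parse --abbrev-ref 'main@{upstream}'   # prints origin/main
git status -sb                                 # first line: ## main...origin/main
```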

Once the push is complete, your files and commit history will be visible on GitHub. Refresh the repository page in your browser to explore its features, such as viewing the commit history.

We will cover additional GitHub repository features as the course progresses.

Starting with an Existing Repo on GitHub

Another common way to begin working on a Git project is by cloning an existing remote repository, such as one hosted on GitHub.

To create a working copy of the remote repository on your development machine, follow these steps:

Choose a Location for the Clone

If you’re cloning the repository on the same machine where you created the original repo, navigate to a different directory. Alternatively, you can use a different machine (e.g., by connecting via ssh to one of the ENGR servers). We’ll refer to the new location as “location #2” and the original location as “location #1”.
Copy the Repository URL

On your GitHub repository page, click the green Code button. This opens a dropdown with options for cloning the repository. Select either HTTPS or SSH as your preferred communication method and copy the provided URL. This URL should match the one used when setting up the remote earlier.
Clone the Repository

At “location #2”, run the following command, replacing <HTTPS_or_SSH_URL> with the URL you copied:
```
git clone <HTTPS_or_SSH_URL>
```
This command creates a local copy of the repository in a new directory named after the repository (e.g., my-project/). The directory will contain the complete version history of the project.
Verify the Clone

Navigate into the cloned directory and list its contents. The files should match the most recent commit pushed from “location #1”. Any uncommitted changes, like the modifications to cat.js, will not appear.
Check the Remote Connection

When cloning a repository, Git automatically sets up a connection to the remote repository, naming it origin. You can verify this by running:
```
git remote -v
```
The output will display the remote URLs for fetching and pushing, similar to this:
```
origin  https://github.com/<username>/<repository>.git (fetch)
origin  https://github.com/<username>/<repository>.git (push)
```
Additionally, Git configures the main branch of the local repository to track the main branch of the remote repository. This allows you to use commands like git push and git pull without specifying additional arguments.

Once the repository is cloned, you can work on it at “location #2” just as you would at “location #1.” If you plan to work in both locations, there are additional considerations to ensure the commit history remains consistent. These will be covered in the next section.

Working on a Git Repo in Two Different Locations

When working on two separate instances of the same Git repository (e.g., on two different machines), it’s important to take extra care to keep the commit history consistent between them.

In this scenario, we have two clones of the same repository: one at “location #1” and another at “location #2”. Both are connected to the same GitHub repository.

To illustrate, let’s make a new commit at “location #2” and then handle that commit at “location #1”. At “location #2”, modify the dog.js file as follows (adding additional “bark” lines):

console.log("Woof!");
console.log("Bark!");
console.log("Bark!");
console.log("Bark!");

After making this change, stage and commit it, and push it to the GitHub repository:

git push

Now, suppose you want to resume work at “location #1”. Since changes were made and pushed from “location #2”, you must first ensure that “location #1” has the latest version of the code. To synchronize the repository at “location #1” with the GitHub repository, use the git pull command:

git pull

This command will fetch the new commits from GitHub and merge them into the current branch at “location #1”, updating the files in the working directory to match.
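The push-then-pull round trip can be rehearsed entirely on one machine. In this sketch, a bare repository on disk stands in for GitHub and two clones play the roles of the two locations; the directory names and identity values are assumptions for the demonstration:

```shell
# Identity for the demo commits (assumed values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

base="$(mktemp -d)"

# Seed repository with one commit, then a bare copy that plays the role of GitHub.
git init -q "$base/seed"
cd "$base/seed"
echo 'console.log("Woof!");' > dog.js
git add dog.js
git commit -q -m "Add dog sound"
git branch -M main
git clone -q --bare "$base/seed" "$base/remote.git"

# Two working clones of the same remote.
git clone -q "$base/remote.git" "$base/location1"
git clone -q "$base/remote.git" "$base/location2"

# Location #2: commit and push.
cd "$base/location2"
echo 'console.log("Bark!");' >> dog.js
git add dog.js
git commit -q -m "Make dog bark too"
git push -q

# Location #1: pull before resuming work.
cd "$base/location1"
git pull -q
git log --oneline    # now includes "Make dog bark too"
```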

Next, let’s explore what happens if we forget to pull before starting work.

Dealing with Merge Conflicts

Let’s revisit “location #1”. Recall that we had made a modification to cat.js but hadn’t committed it yet. You can verify this change using git diff:

diff --git a/cat.js b/cat.js
index ab949dc..45819c5 100644
--- a/cat.js
+++ b/cat.js
@@ -1 +1 @@
-console.log("Meow!");
+console.log("Purr...");

Commit and push this change at location #1:

git add cat.js
git commit
git push

Now, switch to “location #2”. Suppose we forget the rule of pulling changes before starting work. We modify cat.js to add a new line for a hissing sound, without reflecting the changes pushed from “location #1”:

console.log("Meow!");
console.log("Hiss!");

At this point, we remember to pull from GitHub. Running git pull results in an error:

error: Your local changes to the following files would be overwritten by merge:
    cat.js
Please commit your changes or stash them before you merge.
Aborting

sequenceDiagram
    participant Developer
    participant Local Repository
    participant Remote Repository

    Developer->>Local Repository: Modify files
    Developer-xRemote Repository: git pull
    Remote Repository-->>Local Repository: Error

The error suggests two options: commit the changes or stash them. Let’s explore both.

Option 1: Stashing Changes (Painless)

Stashing temporarily saves your changes and reverts the working directory to the last commit:

git stash

After stashing, the working directory reflects the last committed state. Verify this with git status, which should show a clean working directory. You can also view stashed changes:

git stash list

At this point, we could run git pull and it would successfully pull the most recent commits from GitHub. Then, we could reapply the changes we stashed and keep going from there. However, let’s not do this yet. Instead, let’s explore the more painful option we could have taken when we first tried to pull changes from GitHub.

To reapply the stashed changes (with or without pulling), use:

git stash pop

Note that the subcommand here is called pop because Git’s stash functionality stores stashed changes in a stack.
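The stash round trip on its own, without the pull step, can be sketched as follows. The example runs in a temporary repository with assumed identity values:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"

# Make an uncommitted change, then stash it.
echo 'console.log("Hiss!");' >> cat.js
git stash -q

git status --short   # prints nothing: the working directory is clean again
git stash list       # one entry, stash@{0}, at the top of the stack

# Pop applies the top entry and removes it from the stack.
git stash pop -q
git status --short   # cat.js shows as modified again
git stash list       # prints nothing: the stack is empty
```

Keep in mind that if the stashed edits touch the same lines as commits pulled in the meantime, git stash pop can itself report a conflict that has to be resolved by hand.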

Here’s a visual representation of the resolution process using the stash:

sequenceDiagram
    participant Developer
    participant Local Repository
    participant Remote Repository

    Developer->>Local Repository: Modify files
    Developer->>Local Repository: git stash
    Local Repository-->>Local Repository: Save changes to stash
    Developer->>Remote Repository: git pull
    Remote Repository->>Local Repository: Fetch and merge changes
    Developer->>Local Repository: git stash pop
    Local Repository-->>Local Repository: Reapply stashed changes

Option 2: Committing Changes (Leads to Trouble)

Instead of stashing then pulling, let’s commit the changes directly:

git add cat.js
git commit -m "Cat hisses"

Attempting to push this commit results in an error:

To https://github.com/<username>/my-project.git
 ! [rejected]        main -> main (non-fast-forward)
error: failed to push some refs to 'https://github.com/<username>/my-project.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. If you want to integrate the remote changes,
hint: use 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

This happens because the local and remote branches now have divergent histories. Running git pull produces another, more informative message:

hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.

Again, we have a couple of different options for dealing with the issue. We’ll try to resolve it by merging the two commit histories together (we’ll discuss rebase later):

git config pull.rebase false # sets pull to merge
git pull

Alternatively, explicitly merge:

git merge origin/main

During the merge, Git reports a conflict:

Auto-merging cat.js
CONFLICT (content): Merge conflict in cat.js
Automatic merge failed; fix conflicts and then commit the result.

Trying to run these again will not help:

error: Merging is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm <file>'
hint: as appropriate to mark resolution and make a commit.
fatal: Exiting because of an unresolved conflict.

The underlying problem is that the latest commit in each of the two commit histories contains a different version of the same line of code.

This is known as a conflict, and it is preventing Git from being able to automatically merge the two commit histories together (which it can normally do if there are no conflicts). This means we must manually resolve the conflict to complete the merge.

Inspecting cat.js reveals the conflict:

<<<<<<< HEAD
console.log("Meow!");
console.log("Hiss!");
=======
console.log("Purr...");
>>>>>>> 1817ae01cb389bd5f48282e8b09f56a84e7f474e

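The lines between <<<<<<< HEAD and ======= show the local changes, while the lines between ======= and >>>>>>> show the remote changes. Resolve the conflict by editing the file so that it contains the content you want and none of the conflict markers. For example, merge both changes: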

console.log("Meow!");
console.log("Hiss!");
console.log("Purr...");

Mark the conflict as resolved:

git add cat.js
git commit

Alternatively, instead of using git commit, you can use git merge --continue (which was introduced in Git 2.12 and aligns with the git rebase command).
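The whole sequence, from divergent commits to a resolved merge, can be reproduced on one machine. In the sketch below, a bare repository on disk stands in for GitHub, the directory names and identity values are assumptions, and --no-rebase is passed to git pull so the example does not depend on any configured default:

```shell
# Identity for the demo commits (assumed values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

base="$(mktemp -d)"
git init -q "$base/seed"
cd "$base/seed"
echo 'console.log("Meow!");' > cat.js
git add cat.js
git commit -q -m "Add cat sound"
git branch -M main
git clone -q --bare "$base/seed" "$base/remote.git"   # stand-in for GitHub
git clone -q "$base/remote.git" "$base/location1"
git clone -q "$base/remote.git" "$base/location2"

# Location #1 changes the line and pushes.
cd "$base/location1"
echo 'console.log("Purr...");' > cat.js
git add cat.js
git commit -q -m "Cat purrs now"
git push -q

# Location #2 commits a different edit to the same region without pulling first.
cd "$base/location2"
echo 'console.log("Hiss!");' >> cat.js
git add cat.js
git commit -q -m "Cat hisses"

# Merge-style pull; it stops with a conflict, so tolerate the non-zero exit status.
git pull --no-rebase || true
git status --short               # "UU cat.js" marks the unmerged file

# Resolve by writing the intended final contents, with no conflict markers left.
printf '%s\n' 'console.log("Meow!");' 'console.log("Hiss!");' 'console.log("Purr...");' > cat.js
git add cat.js
git commit -q --no-edit          # keep the default merge commit message
git push -q
git log --oneline --graph
```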

Finally, push the resolved changes:

```shell
git push
```

Return to “location #1” and pull the latest changes to ensure both locations have the same commit history:

git pull

Check the commit history with git log to confirm consistency across locations and GitHub.

Here’s a visual representation of the merge conflict resolution process:

sequenceDiagram
    participant Developer
    participant Local Repository
    participant Remote Repository

    Developer->>Local Repository: Modify files
    Developer->>Local Repository: git commit
    Developer-xRemote Repository: git push
    Remote Repository-->>Local Repository: Reject push (divergent branches)
    Developer->>Remote Repository: git pull
    Remote Repository->>Local Repository: Fetch and attempt merge
    Local Repository-->>Developer: Report merge conflict
    Developer->>Local Repository: Resolve conflict manually
    Developer->>Local Repository: git commit
    Developer->>Remote Repository: git push

Cleaning Up Your Commit History

Git is designed to help you maintain a clean and organized commit history. Each commit should represent a single, coherent change to your project. This approach makes it easier to understand your project’s history, collaborate with others, and even showcase your work to potential employers.

A common way to tidy up your commit history is by using Git’s rebase operation. This allows you to rewrite your commit history by combining, splitting, or reordering commits.

Reviewing the Current Commit History

Let’s start by examining the current commit history using git log:

commit da7e2e995e86828a46779fb3ae2c06efa9e43ea3 (HEAD -> main, origin/main)
Merge: 673e6b8 1817ae0
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:51:26 2025 -0800

    Merge branch 'main' of https://github.com/adulbrich/my-project

commit 673e6b830f9adb1cbce9aa200afe9dd40dfc0a4b
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:38:06 2025 -0800

    Cat hisses

commit 1817ae01cb389bd5f48282e8b09f56a84e7f474e
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:35:05 2025 -0800

    Cat purrs now

commit 3b0a84cd3767da40a9e9ed8d51722d589871a443
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:06:39 2025 -0800

    Make dog bark too

commit c4f570c3ce30fcf50f8f8a7736306030667e9337
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:04:30 2025

    Add cat and dog sounds

This history is somewhat messy. It includes a merge commit that doesn’t represent a meaningful change and multiple commits related to the same feature. Let’s clean it up using rebase.

Starting an Interactive Rebase

To begin, run the following command:

git rebase -i HEAD~3

This starts an interactive rebase covering every commit made after HEAD~3, the third ancestor of HEAD (here, the initial commit c4f570c). Git will open an editor listing those commits, oldest first:

pick 3b0a84c Make dog bark too
pick 673e6b8 Cat hisses
pick 1817ae0 Cat purrs now

# Rebase c4f570c..da7e2e9 onto c4f570c (3 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
#                    commit's log message, unless -C is used, in which case
#                    keep only this commit's message; -c is same as -C but
#                    opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
#         create a merge commit using the original merge commit's
#         message (or the oneline, if no original merge commit was
#         specified); use -c <commit> to reword the commit message
# u, update-ref <ref> = track a placeholder for the <ref> to be updated
#                       to this position in the new commits. The <ref> is
#                       updated at the end of the rebase
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#

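Notice that the merge commit does not appear. This is by design: by default, git rebase drops merge commits and replays the remaining commits as a linear history. When you use git rebase, it: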

Temporarily sets aside your commits
Updates your branch to the latest version of the target branch
Reapplies your commits one by one on top of that updated base

If we wanted to include our initial commit, we could have used git rebase -i --root instead.

For now, let’s clean up our commit history by squashing the last two commits into the first one. To do this, we’ll change the file to look like this:

pick 3b0a84c Make dog bark too
squash 673e6b8 Cat hisses
squash 1817ae0 Cat purrs now

Save and close the editor. Git will reapply the commits in the specified order and open another editor for the new commit message. You can keep the default message or write a more descriptive one.
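To practice squashing without risking a real project, the same edit can be scripted in a throwaway repository. This is a rough sketch, not the course’s own procedure: the file sounds.js and the identity are placeholders, the commits are linear (no merge commit), and the two editors are replaced through Git’s GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables so that sed rewrites the todo list and the default combined message is accepted as-is.

```shell
#!/usr/bin/env bash
# Sketch only: sounds.js and the identity below are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

# Four small commits, each appending one line to the same file.
for msg in "Add cat and dog sounds" "Make dog bark too" \
           "Cat hisses" "Cat purrs now"; do
  echo "// $msg" >> sounds.js
  git add sounds.js
  git commit -q -m "$msg"
done

# Stand-in for editing the todo list by hand: keep "pick" on the first
# line and change every later "pick" to "squash". GIT_EDITOR=true then
# accepts the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
  git rebase -q -i HEAD~3

# Two commits remain: the initial one and the squashed one.
git log --oneline
```

In day-to-day work you would normally make this change in the editor yourself; scripting it is mainly useful for experiments like this one.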

Verifying the Updated History

After completing the rebase, check the updated commit history with git log:

commit 600f9a073e36a932b870c9886229d02b2376ae21 (HEAD -> main)
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:06:39 2025 -0800

    dog barks too, cat hisses and purrs

commit c4f570c3ce30fcf50f8f8a7736306030667e9337
Author: Alexander Ulbrich <adulbrich@users.noreply.github.com>
Date:   Thu Feb 27 21:04:30 2025

    Add cat and dog sounds

The history is now cleaner, with related changes grouped into a single commit.

Interactive rebase also allows you to:

Reorder commits: Rearrange the order of commits in the editor.
Split commits: Use the edit command to break a commit into smaller parts.
Remove commits: Use the drop command to delete a commit.

If conflicts arise during the rebase, resolve them manually, then continue with:

git rebase --continue

To abort the rebase and restore the original history, use:

git rebase --abort
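The following sketch shows what aborting looks like in practice. It assumes a throwaway repository with a placeholder file (notes.txt) and identity, and it provokes a conflict on purpose by dropping a commit that a later commit builds on. The todo list is edited with sed through Git’s GIT_SEQUENCE_EDITOR environment variable rather than by hand.

```shell
#!/usr/bin/env bash
# Sketch only: notes.txt and the identity below are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits that each append a line to the same file.
for line in A B C; do
  echo "$line" >> notes.txt
  git add notes.txt
  git commit -q -m "Add line $line"
done
before=$(git rev-parse HEAD)

# Delete the first todo line, which drops "Add line B". Replaying
# "Add line C" without it is expected to conflict, so the rebase stops.
if ! GIT_SEQUENCE_EDITOR="sed -i.bak -e '1d'" \
     git rebase -i HEAD~2 >/dev/null 2>&1; then
  git status --short    # notes.txt is listed as unmerged
  git rebase --abort    # give up and restore the original history
fi

after=$(git rev-parse HEAD)
if [ "$before" = "$after" ]; then
  echo "history restored"
fi
```

Had we wanted to keep going instead, we would have edited notes.txt, staged it with git add, and run git rebase --continue.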

Updating the Remote Repository

After rewriting history locally, you’ll need to update the remote repository. A regular git push will be rejected because the history has changed. Instead, use:

git push --force-with-lease

This overwrites the remote history only if the remote branch is still where your local repository last saw it, that is, only if no one else has pushed changes in the meantime.
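To see the safety check at work, the sketch below simulates a collaborator pushing while we rewrite history. It rests on several assumptions: a local bare repository (remote.git) plays the role of GitHub, the clones mine and theirs and the identity are placeholders, and the variable lease_result exists only to report the outcome.

```shell
#!/usr/bin/env bash
# Sketch only: remote.git, mine, theirs and the identity are hypothetical.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

new_clone() {
  git clone -q remote.git "$1" 2>/dev/null
  git -C "$1" config user.name "Example User"
  git -C "$1" config user.email "user@example.com"
}

# Our clone publishes the first commit.
new_clone mine
cd mine
git symbolic-ref HEAD refs/heads/main
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First draft"
git push -q -u origin main
cd ..

# A collaborator clones the project and pushes a new commit.
new_clone theirs
( cd theirs && echo "two" >> notes.txt \
    && git commit -q -am "Collaborator change" && git push -q )

# Meanwhile we rewrite our last commit without fetching first.
cd mine
git commit -q --amend -m "First draft (reworded)"

# The lease compares origin/main as we last saw it with the actual
# remote branch. They differ, so the push is refused.
if git push -q --force-with-lease 2>/dev/null; then
  lease_result="accepted"
else
  lease_result="rejected"
fi
echo "force-with-lease: $lease_result"
```

A plain git push --force in the same situation would have discarded the collaborator’s commit, which is why the lease variant is the safer default.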

Additional Readings and Resources

A Git cheat sheet, which summarizes the most important Git commands, is one of the handiest resources to keep nearby as you continue learning Git and GitHub.

In addition, Julia Evans does a great job of explaining Git’s quirks in her blog posts.

And of course, there’s the Pro Git book, which is a great resource for learning more about Git. Most visuals on this page are sourced from the book. Check out Chapters 1 and 2 of the book for a more in-depth introduction to Git.

Summary

In this lecture, we covered the basics of using Git and GitHub for version control. We learned how to:

Initialize a Git repository
Stage and commit changes
Push and pull changes between local and remote repositories
Handle merge conflicts
Clean up commit history using interactive rebase
Work with multiple locations and branches
Use SSH and HTTPS for remote connections
Clone an existing repository from GitHub
Use Git commands to manage and review changes

Here’s a table summarizing key Git commands:

Command	Description
`git init`	Initialize a new Git repository
`git add`	Stage changes for commit
`git diff`	Show changes between commits, commit and working tree, etc.
`git commit`	Commit staged changes with a message
`git status`	Show the status of the working directory
`git log`	View commit history
`git merge`	Join two or more development histories together
`git rebase`	Reapply commits on top of another base
`git pull`	Fetch and merge changes from a remote repository
`git push`	Push local commits to a remote repository
`git stash`	Temporarily shelve uncommitted changes
`git stash pop`	Reapply the most recently stashed changes and remove them from the stash
`git clone <url>`	Clone a remote repository
`git config`	Configure Git settings
`git remote add origin <url>`	Add a remote repository
`git remote -v`	Show remote repository URLs
