Module 03 · Version Control · ~1 hour

Version Control and Git Basics

Understand why version control exists, how Git models history, and master the core commands you will use every single working day.

Covers: MLO4

The cost of not tracking changes

Before version control, developers backed up work by copying folders. The result was chaos:

plain — the classic folder mess
project/
├── app_v1.py
├── app_FINAL.py
├── app_FINAL_real.py
├── app_FINAL_v2_FIXED.py
└── app_backup_2024-01-09.py

This system has fatal flaws: there is no record of what changed between versions, no explanation of why, no way to work on two things at once without one overwriting the other, and no safe way to experiment without risking the working copy.

Git solves all of this. But understanding Git properly requires understanding the model it uses to store history.

How Git stores history

Git's mental model is snapshots, not differences. When you commit, Git takes a complete picture of every tracked file at that moment in time. It stores this efficiently — unchanged files are referenced rather than duplicated — but the conceptual model is complete snapshots.

Each snapshot is called a commit. Every commit contains:

  • The complete state of all tracked files
  • A reference to the previous commit (the parent)
  • The author's name and email
  • The timestamp
  • A message you wrote describing the change
  • A unique identifier (a long hex string called a hash)

The sequence of commits forms a history: a chain going back to the very first commit. You can view any snapshot, compare any two, or return to any previous state. Once work is committed, it is very hard to lose by accident.
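
The chain of snapshots can be sketched as a toy model in Python. This is an illustration only, not how Git is implemented: the Commit class and the history function are hypothetical names, the timestamp and hash are left out, and each snapshot is a plain dictionary of file contents.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Toy commit: a full snapshot plus a link to its parent."""
    snapshot: dict                      # file path -> contents at this commit
    message: str
    author: str
    parent: Optional["Commit"] = None   # None marks the very first commit


def history(tip):
    """Walk from the newest commit back to the first one."""
    commit = tip
    while commit is not None:
        yield commit
        commit = commit.parent


first = Commit({"app.py": "print('hi')\n"}, "Initial commit", "Alice")
second = Commit(
    {"app.py": "print('hi')\n", "auth.py": "def login(): ...\n"},
    "Add auth module",
    "Alice",
    parent=first,
)

messages = [c.message for c in history(second)]
print(messages)                 # newest first, like git log
print(sorted(first.snapshot))   # the old snapshot is still fully available
```

Because every commit holds a complete snapshot, going back to an earlier state means reading that commit's snapshot; no differences have to be replayed.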

★ Git is a content-addressable filesystem

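Every commit is identified by a hash computed from its content: the snapshot, the metadata (author, timestamp, message), and the hash of its parent. Two commits with the same hash therefore have exactly the same content and the same history behind them. This makes Git's history tamper-evident: editing an old commit changes its hash, which in turn changes the hash of every commit that follows it, so the change cannot go unnoticed. This is one reason Git is trusted for auditing and compliance.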
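
To see content addressing in action, here is a minimal Python sketch that reproduces the ID Git gives to a file's contents (a "blob" object). It assumes the default SHA-1 object format; git_blob_id is a name made up for this example. Commits are hashed in a similar way, over their own content rather than a file's.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git would give a file with these contents.

    Assumes the SHA-1 object format: the hash covers a short header
    (the word "blob", the size in bytes, a NUL byte) and then the raw bytes.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_id(b"return True\n")
edited = git_blob_id(b"return False\n")

print(original == git_blob_id(b"return True\n"))  # same content, same ID
print(original == edited)                         # any change gives a new ID
```

You can compare the result with Git itself by running git hash-object on a file with the same contents.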

The three areas

Understanding Git requires understanding three places where your code can live:

Area                 | Description                                     | Updated by
Working directory    | The actual files on your disk that you edit     | You, editing files
Staging area (index) | A holding area for changes you intend to commit | git add
Repository (.git)    | The permanent history of committed snapshots    | git commit

The staging area is the key to understanding Git. It lets you be selective: if you changed three files but only two of the changes are ready, you can stage just those two and commit them. The third change stays in your working directory until you are ready.
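
Selective staging can be sketched with three dictionaries. This is a toy model under simplifying assumptions, not Git's real data structures: add and commit are stand-ins for the Git commands, file contents are short labels, and the staging area is modelled as a full copy of what the next commit will contain.

```python
# Toy model only: each area maps a file path to its contents.
last_commit = {"app.py": "v1", "auth.py": "v1", "utils.py": "v1"}
staging = dict(last_commit)   # starts out matching the last commit
working = {"app.py": "v2", "auth.py": "v2", "utils.py": "v2-unfinished"}


def add(path: str) -> None:
    """Like git add <path>: copy the working version into the staging area."""
    staging[path] = working[path]


def commit() -> dict:
    """Like git commit: the new snapshot is whatever is staged right now."""
    return dict(staging)


# Three files changed, but only two are ready.
add("app.py")
add("auth.py")
new_commit = commit()

print(new_commit)
# utils.py keeps its unfinished edit in the working directory only.
print(working["utils.py"], "vs committed", new_commit["utils.py"])
```

The commit records the two staged files at their new version and utils.py at its old one, which is the behaviour the staging area exists to provide.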

bash — visualising the three areas
# Changes exist only in working directory
[working dir]  app.py (modified)
[staging]      (empty)
[repository]   last commit: ...

# After: git add app.py
[working dir]  app.py (staged)
[staging]      app.py
[repository]   last commit: ...

# After: git commit -m 'Fix login bug'
[working dir]  app.py (clean)
[staging]      (empty)
[repository]   new commit: Fix login bug

Setting up Git

bash — first-time configuration
# Required: your identity appears in every commit
git config --global user.name "Your Name"
git config --global user.email "you@lancaster.ac.uk"

# Set your preferred editor for commit messages
git config --global core.editor nano

# Set default branch name to 'main' (modern convention)
git config --global init.defaultBranch main

# Verify your settings
git config --list

Core workflow commands

  1. git init

    Create a new repository in the current directory.

    bash
    mkdir my-project && cd my-project
    git init
    Initialized empty Git repository in .git/
    

    This creates a hidden .git directory. Your project is now a repository.

  2. git status

    Your most-used command. Shows what has changed, what is staged, and what is untracked.

    bash
    git status
    On branch main
    Changes not staged for commit:
      modified:   app.py
    
    Untracked files:
      tests/
    
    no changes added to commit
    
  3. git add

    Stage changes for the next commit.

    bash
    # Stage a specific file
    git add app.py
    
    # Stage all changes in current directory
    git add .
    
    # Stage parts of a file interactively
    git add -p app.py  # choose individual hunks
    
  4. git commit

    Take a snapshot of the staged changes.

    bash
    # Commit with inline message
    git commit -m "Add user validation"
    [main a1b2c3d] Add user validation
     2 files changed, 34 insertions(+), 5 deletions(-)
    
    # Open editor for a longer message
    git commit
    
    # Stage every modified tracked file (new, untracked files are not included) and commit in one step
    git commit -am "Quick fix"
    
  5. git log

    View the history of commits.

    bash
    # Default: full log
    git log
    
    # One line per commit
    git log --oneline
    a1b2c3d Add user validation
    9f3e2a1 Initial commit
    
    # Show last N commits
    git log --oneline -5
    
    # Show commits by a specific author
    git log --author="Alice"
    
    # Show commits affecting a specific file
    git log --oneline -- auth.py
    
    # Graphical representation (useful with branches)
    git log --oneline --graph --all
    
git diff in depth

git diff compares versions of your files. Which versions it compares depends on what arguments you give it:

bash — git diff variations
# Unstaged changes: working dir vs staging area
git diff

# Staged changes: staging area vs last commit
git diff --staged

# All changes: working dir vs last commit
git diff HEAD

# Compare two commits
git diff a1b2c3d f4e5d6c

# Show only the names of changed files
git diff --name-only

# Show statistics (lines added/removed per file)
git diff --stat

Reading a diff:

plain — reading a diff
diff --git a/app.py b/app.py
--- a/app.py          # old version
+++ b/app.py          # new version
@@ -12,7 +12,8 @@    # location in file
 def login(user, pw): # unchanged line
-    return True      # removed line (red)
+    if not user or not pw:  # added line (green)
+        return False
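
If you want to experiment with the format, Python's standard difflib module produces the same unified diff layout. The sketch below rebuilds the login example; the file names a/app.py and b/app.py are passed in only to mimic Git's header lines, and the line numbers differ from the excerpt above because this toy file has only two lines.

```python
import difflib

old = [
    "def login(user, pw):\n",
    "    return True\n",
]
new = [
    "def login(user, pw):\n",
    "    if not user or not pw:\n",
    "        return False\n",
]

diff_lines = list(
    difflib.unified_diff(old, new, fromfile="a/app.py", tofile="b/app.py")
)
print("".join(diff_lines), end="")

# Count changes, skipping the '+++' and '---' file header lines.
added = [l for l in diff_lines if l.startswith("+") and not l.startswith("+++")]
removed = [l for l in diff_lines if l.startswith("-") and not l.startswith("---")]
print(f"{len(added)} added, {len(removed)} removed")  # 2 added, 1 removed
```

The hunk header it prints, @@ -1,2 +1,3 @@, reads as: starting at line 1, 2 lines of the old file are shown; starting at line 1, 3 lines of the new file are shown.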

The .gitignore file

Some files should never be committed: compiled outputs, local configuration, secrets, and large data files. A .gitignore file tells Git to ignore them.

plain — .gitignore
# Python
__pycache__/
*.pyc
*.pyo
.pytest_cache/
htmlcov/

# Secrets — NEVER commit these
.env
*.key
secrets.yml

# IDE files
.vscode/
.idea/

# OS files
.DS_Store
Thumbs.db

# Dependency directories
node_modules/
venv/
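
To build intuition for how these patterns match, here is a deliberately simplified Python approximation. It is not Git's matching algorithm: it handles only plain names, * wildcards, and trailing-slash directory patterns, and it ignores negation (!), anchored patterns, and **. The is_ignored function and the PATTERNS list are made up for this sketch; for real answers use git check-ignore -v.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few patterns taken from the example .gitignore above.
PATTERNS = ["__pycache__/", "*.pyc", ".env", "venv/"]


def is_ignored(path: str, patterns=PATTERNS) -> bool:
    """Rough approximation of .gitignore matching for a file path."""
    parts = PurePosixPath(path).parts
    directories = parts[:-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: the file is ignored if any parent
            # directory has a matching name.
            name = pattern[:-1]
            if any(fnmatchcase(d, name) for d in directories):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # A pattern without a slash matches a name at any depth.
            return True
    return False


for candidate in ["app.py", ".env", "src/__pycache__/app.cpython-312.pyc",
                  "venv/lib/site.py", "environment.py"]:
    print(f"{candidate}: {'ignored' if is_ignored(candidate) else 'tracked'}")
```

Note that environment.py is not ignored: the .env pattern matches the whole file name, not part of it.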
✕ Never commit secrets

If you accidentally commit a password, API key, or private key, it stays in the repository's history even if you delete the file in the next commit, and anyone who has already cloned or pulled the repository has a copy. The correct response is to rotate the secret immediately (generate a new one and revoke the old one), then clean the history with git filter-repo, a separate tool that is not installed with Git by default. Prevention is far easier: add .env and any key files to .gitignore before creating them.

Use environment variables (Module 1) to pass secrets to your application. The .env file stores them locally and is excluded from version control.
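
In application code, reading a secret from the environment might look like the sketch below. The require_env helper and the variable name EXAMPLE_API_KEY are hypothetical, chosen only for illustration; in real use the variable is set outside the program, for example by your shell or by whatever loads your .env file.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail loudly if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo only: set a dummy value so this sketch runs on its own.
# Never hard-code a real secret like this.
os.environ.setdefault("EXAMPLE_API_KEY", "dummy-value-for-demo")

api_key = require_env("EXAMPLE_API_KEY")
print(f"Loaded a key of length {len(api_key)}")  # never print the secret itself
```

Failing at startup when a secret is missing is a design choice: it surfaces a misconfigured environment immediately instead of partway through a request.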

Undoing things

Knowing how to undo is as important as knowing how to commit. Here are the main options:

git restore app.py
Discard unstaged changes: restores the working directory file from the staging area (which is the last committed state if nothing is staged)
git restore --staged app.py
Unstage a file: removes the change from the staging area but keeps your edits in the working directory
git commit --amend
Replace the most recent commit, to edit its message or add forgotten files. This rewrites the commit, so avoid it on commits you have already pushed
git revert HEAD
Create a new commit that undoes the last commit (safe for shared branches)
git stash
Set aside uncommitted changes to tracked files without committing them
git stash pop
Restore the most recently stashed changes and remove them from the stash
⚠ git reset --hard is dangerous

git reset --hard moves the current branch to the commit you name (HEAD by default) and resets the staging area and working directory to match it. Any uncommitted changes to tracked files are lost permanently. Use it only on local branches you have not shared with others.

Writing good commit messages

A commit message is a gift to your future self and your teammates. Reading the log should tell the story of the project.

plain — anatomy of a good commit message
Add rate limiting to the login endpoint

Login attempts were unlimited, making brute-force attacks trivial.
Implements a sliding window of 5 attempts per IP per 15 minutes.
Exceeding the limit returns 429 Too Many Requests.

Fixes: JIRA-1234

Rule                                | Good example                      | Bad example
Imperative mood in subject          | Add login rate limiting           | Added login rate limit
50 characters or less in subject    | Fix null check in user parser     | Fixed the null pointer bug that was crashing the parser in some edge cases
Blank line between subject and body | (correct)                         | (skipped)
Explain why, not what               | Bypass slow DNS lookup on startup | Change DNS call
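
The mechanical rules (subject length and the blank line) are easy to check automatically. The following is a minimal sketch, not a standard tool: check_commit_message is a hypothetical helper, and it cannot judge the rules that need human judgement, such as imperative mood or explaining why.

```python
def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    if not subject.strip():
        problems.append("subject line is empty")
    if len(subject) > 50:
        problems.append(f"subject is {len(subject)} characters (limit is 50)")
    # If there is a body, the second line must be blank.
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line between subject and body")
    return problems


good = (
    "Add rate limiting to the login endpoint\n"
    "\n"
    "Login attempts were unlimited, making brute-force attacks trivial.\n"
)
bad = (
    "Fixed the null pointer bug that was crashing the parser in some edge cases\n"
    "It happened when the input was empty.\n"
)

print(check_commit_message(good))  # []
print(check_commit_message(bad))
```

A check like this could run before each commit, but treat it as a reminder rather than a substitute for writing a message that explains the change.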

Key terms

repository
A project folder tracked by Git, containing a .git directory.
commit
A snapshot of all tracked files at a point in time, with a message and unique hash.
working directory
The files on your disk that you edit directly.
staging area
The holding area for changes prepared for the next commit.
hash
A unique identifier for each commit, derived from its content.
HEAD
A pointer to the current commit — usually the tip of the current branch.
.gitignore
A file listing patterns that Git should not track.
git diff
Shows differences between versions of files.
git stash
Temporarily saves uncommitted changes without committing them.
git log
Shows the commit history.

Exercises

✎ Lab exercises — approximately 50 minutes

Part A: Build a history

  1. Create a new directory and initialise a Git repository.
  2. Create a Python file with a function that adds two numbers. Commit it.
  3. Add a docstring to the function. Stage and commit that change separately.
  4. Add a second function. Commit it with a meaningful message.
  5. Use git log --oneline to see all three commits.
  6. Use git show followed by a commit hash to inspect each commit in detail.

Part B: Working with diffs

  1. Modify the file without staging. Run git diff. What do you see?
  2. Stage the change with git add. Run git diff again. Why is there no output?
  3. Run git diff --staged. Now you should see the staged changes.
  4. Commit. Run git diff HEAD~1 HEAD to compare the last two commits.

Part C: Undoing

  1. Make a change to the file but do not stage it. Use git restore to discard it.
  2. Stage a change then use git restore --staged to unstage it.
  3. Make and commit a change with a typo in the message. Use git commit --amend to fix it.
  4. Create a commit you want to reverse. Use git revert HEAD to undo it safely.

Part D: .gitignore

  1. Create a .gitignore file that ignores __pycache__/, *.pyc, and .env.
  2. Create a file called .env with fake credentials. Verify git status does not list it.
  3. Run git check-ignore -v .env to confirm .gitignore is working.