Version Control and Git Basics
Understand why version control exists, how Git models history, and master the core commands you will use every single working day.
The cost of not tracking changes
Before version control, developers backed up work by copying folders. The result was chaos:
project/
├── app_v1.py
├── app_FINAL.py
├── app_FINAL_real.py
├── app_FINAL_v2_FIXED.py
└── app_backup_2024-01-09.py
This system has fatal flaws: there is no record of what changed between versions, no explanation of why, no way to work on two things at once without one set of changes overwriting the other, and no safe way to experiment without risking the working copy.
Git solves all of this. But understanding Git properly requires understanding the model it uses to store history.
How Git stores history
Git's mental model is snapshots, not differences. When you commit, Git takes a complete picture of every tracked file at that moment in time. It stores this efficiently — unchanged files are referenced rather than duplicated — but the conceptual model is complete snapshots.
Each snapshot is called a commit. Every commit contains:
- The complete state of all tracked files
- A reference to the previous commit (the parent)
- The author's name and email
- The timestamp
- A message you wrote describing the change
- A unique identifier (a long hex string called a hash)
The sequence of commits forms a history: a chain going back to the very first commit. You can view any snapshot, compare any two, or return to any previous state. Once work is committed, it is very hard to lose.
Every commit is identified by a hash computed from its content: the snapshot, the metadata (author, timestamp, message), and the hash of its parent. Two commits with the same hash have exactly the same content and the same history behind them. This makes Git's history tamper-evident: you cannot quietly edit an old commit without changing its hash and the hash of every commit that follows it. This is one reason Git is trusted for auditing and compliance.
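The idea of naming things by a hash of their content can be sketched in a few lines of Python. The first function follows the way Git names a file's contents (a "blob") under its default SHA-1 object format. The second is a deliberately simplified toy, not Git's real commit format, that shows why changing an old commit changes every hash after it. The names git_blob_hash and toy_commit_hash are illustrative only.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash file contents the way Git names a blob (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_hash(snapshot_hash: str, parent_hash: str, message: str) -> str:
    """Toy commit ID: NOT Git's real format, just a hash over snapshot, parent and message."""
    payload = f"{snapshot_hash}\n{parent_hash}\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()


# Build a two-commit chain. The first commit has no parent.
first = toy_commit_hash(git_blob_hash(b"print('v1')\n"), "", "Initial commit")
second = toy_commit_hash(git_blob_hash(b"print('v2')\n"), first, "Update greeting")

# Tamper with the first commit's content. Its hash changes, and because the
# second commit records its parent's hash, the second hash changes as well.
tampered_first = toy_commit_hash(git_blob_hash(b"print('evil')\n"), "", "Initial commit")
tampered_second = toy_commit_hash(git_blob_hash(b"print('v2')\n"), tampered_first, "Update greeting")

print("first commit changed: ", first != tampered_first)
print("second commit changed:", second != tampered_second)
```

In a repository that uses the default SHA-1 format, git hash-object on a file should print the same value that git_blob_hash returns for that file's bytes.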
The three areas
Understanding Git requires understanding three places where your code can live:
| Area | Description | Moved by |
|---|---|---|
| Working directory | The actual files on your disk that you edit | You, editing files |
| Staging area (index) | A holding area for changes you intend to commit | git add |
| Repository (.git) | The permanent history of committed snapshots | git commit |
The staging area is the key to understanding Git. It lets you be selective: if you changed three files but only two of the changes are ready, you can stage just those two and commit them. The third change stays in your working directory until you are ready.
# Changes exist only in working directory
[working dir] app.py (modified)
[staging] (empty)
[repository] last commit: ...
# After: git add app.py
[working dir] app.py (staged)
[staging] app.py
[repository] last commit: ...
# After: git commit -m 'Fix login bug'
[working dir] app.py (clean)
[staging] (empty)
[repository] new commit: Fix login bug
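The movement between the three areas can also be modelled as a small Python class. This is a toy for building intuition, assuming each area is just a dictionary of file contents; it is not how Git is implemented, and ToyRepo and its methods are hypothetical names.

```python
class ToyRepo:
    """Toy model of Git's three areas. Illustrative only."""

    def __init__(self):
        self.working = {}   # filename -> content on disk
        self.staging = {}   # filename -> content staged for the next commit
        self.commits = []   # list of (message, snapshot) pairs, oldest first

    def head(self):
        """Return the snapshot of the latest commit, or an empty one."""
        return self.commits[-1][1] if self.commits else {}

    def add(self, name):
        """Like `git add <name>`: copy the working version into staging."""
        self.staging[name] = self.working[name]

    def commit(self, message):
        """Like `git commit -m`: record the staged state as a new snapshot."""
        self.commits.append((message, dict(self.staging)))

    def status(self):
        """List files changed but not staged, and files staged but not committed."""
        unstaged = sorted(n for n, c in self.working.items() if self.staging.get(n) != c)
        staged = sorted(n for n, c in self.staging.items() if self.head().get(n) != c)
        return {"unstaged": unstaged, "staged": staged}


repo = ToyRepo()
repo.working["app.py"] = "def login(): ..."
repo.working["notes.txt"] = "half-finished idea"
print(repo.status())   # both files are waiting in the working directory

repo.add("app.py")     # stage only the change that is ready
print(repo.status())

repo.commit("Fix login bug")
print(repo.status())   # notes.txt is still uncommitted
```

The point of the model is the selectivity: the commit contains app.py only, while notes.txt stays in the working directory until it is staged.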
Setting up Git
# Required: your identity appears in every commit
git config --global user.name "Your Name"
git config --global user.email "you@lancaster.ac.uk"
# Set your preferred editor for commit messages
git config --global core.editor nano
# Set default branch name to 'main' (modern convention)
git config --global init.defaultBranch main
# Verify your settings
git config --list
Core workflow commands
git init
Create a new repository in the current directory.
mkdir my-project && cd my-project
git init
Initialized empty Git repository in .git/
This creates a hidden .git directory. Your project is now a repository.
git status
Your most-used command. Shows what has changed, what is staged, and what is untracked.
git status
On branch main
Changes not staged for commit:
  modified:   app.py
Untracked files:
  tests/
no changes added to commit
git add
Stage changes for the next commit.
# Stage a specific file
git add app.py
# Stage all changes in current directory
git add .
# Stage parts of a file interactively
git add -p app.py # choose individual hunks
git commit
Take a snapshot of the staged changes.
# Commit with inline message
git commit -m "Add user validation"
[main a1b2c3d] Add user validation
 2 files changed, 34 insertions(+), 5 deletions(-)
# Open editor for a longer message
git commit
# Stage all modified tracked files and commit in one step (new, untracked files are not included)
git commit -am "Quick fix"
git log
View the history of commits.
# Default: full log
git log
# One line per commit
git log --oneline
a1b2c3d Add user validation
9f3e2a1 Initial commit
# Show last N commits
git log --oneline -5
# Show commits by a specific author
git log --author="Alice"
# Show commits affecting a specific file
git log --oneline -- auth.py
# Graphical representation (useful with branches)
git log --oneline --graph --all
git diff in depth
git diff compares versions of your files. Which versions it compares depends on what arguments you give it:
# Unstaged changes: working dir vs staging area
git diff
# Staged changes: staging area vs last commit
git diff --staged
# All changes: working dir vs last commit
git diff HEAD
# Compare two commits
git diff a1b2c3d f4e5d6c
# Show only the names of changed files
git diff --name-only
# Show statistics (lines added/removed per file)
git diff --stat
Reading a diff:
diff --git a/app.py b/app.py
--- a/app.py # old version
+++ b/app.py # new version
@@ -12,7 +12,8 @@ # location in file
def login(user, pw): # unchanged line
- return True # removed line (red)
+ if not user or not pw: # added line (green)
+ return False
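To get a feel for the format, you can generate a unified diff yourself with Python's standard difflib module. This is a sketch of the same diff notation, not of how Git computes its diffs, and check_password is a hypothetical function name that appears only inside the sample text.

```python
import difflib

# Two versions of the same file, as lists of lines.
old = [
    "def login(user, pw):\n",
    "    return True\n",
]
new = [
    "def login(user, pw):\n",
    "    if not user or not pw:\n",
    "        return False\n",
    "    return check_password(user, pw)\n",  # hypothetical helper, never called here
]

diff = list(difflib.unified_diff(old, new, fromfile="a/app.py", tofile="b/app.py"))
print("".join(diff), end="")

# Count changed lines, skipping the '---' and '+++' file headers.
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]
removed = [line for line in diff if line.startswith("-") and not line.startswith("---")]
print(f"{len(added)} added, {len(removed)} removed")
```

Lines starting with a space are unchanged context, lines starting with - exist only in the old version, and lines starting with + exist only in the new one.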
The .gitignore file
Some files should never be committed: compiled outputs, local configuration, secrets, and large data files. A .gitignore file tells Git to ignore them.
# Python
__pycache__/
*.pyc
*.pyo
.pytest_cache/
htmlcov/
# Secrets — NEVER commit these
.env
*.key
secrets.yml
# IDE files
.vscode/
.idea/
# OS files
.DS_Store
Thumbs.db
# Dependency directories
node_modules/
venv/
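As a rough illustration of how these patterns behave, the sketch below approximates two of the simplest rule types in Python: a name pattern such as *.pyc matches at any depth, and a pattern ending in / matches only directories. It is an approximation under those assumptions and ignores the rest of the .gitignore syntax (negation with !, anchored paths, **). The function is_ignored is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERNS = ["__pycache__/", "*.pyc", ".env", "*.key", "node_modules/", "venv/"]


def is_ignored(path: str, patterns=PATTERNS) -> bool:
    """Approximate check that handles only simple name and directory patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything inside a directory with that name.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Name pattern without a slash: matches at any depth.
            return True
    return False


for candidate in ["app.py", ".env", "config/server.key", "venv/bin/python",
                  "src/__pycache__/app.cpython-312.pyc"]:
    print(f"{candidate:40} ignored={is_ignored(candidate)}")
```

For real repositories, git check-ignore -v followed by a path is the authoritative check, since it applies Git's full matching rules.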
If you accidentally commit a password, API key, or private key, it stays in the repository's history even if you delete the file in the next commit. The correct response is to rotate the secret immediately (generate a new one and revoke the old one), then clean the history using git filter-repo. Prevention is far easier: add .env and any key files to .gitignore before creating them.
Use environment variables (Module 1) to pass secrets to your application. The .env file stores them locally and is excluded from version control.
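On the application side, reading a secret from the environment might look like the sketch below. The variable name API_KEY and the helper get_secret are hypothetical, and the demo sets a dummy value only so the example runs on its own. Note that Python does not read a .env file automatically; the values must already be in the environment, for example exported by your shell or loaded by a helper library.

```python
import os


def get_secret(name: str) -> str:
    """Read a required secret from the environment, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# API_KEY is a hypothetical name. The dummy value is set here only for the demo;
# in real use the variable comes from outside the program.
os.environ.setdefault("API_KEY", "dummy-value-for-demo")
api_key = get_secret("API_KEY")
print(f"Loaded a key of length {len(api_key)}")  # never print the secret itself
```

Failing at startup when a secret is missing is usually easier to debug than failing later with an authentication error.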
Undoing things
Knowing how to undo is as important as knowing how to commit. Here are the main options:
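| Command | What it does |
|---|---|
| git restore app.py | Discard unstaged changes to app.py in the working directory |
| git restore --staged app.py | Unstage app.py, keeping the changes in the working directory |
| git commit --amend | Replace the most recent commit, for example to fix its message |
| git revert HEAD | Create a new commit that reverses the most recent commit |
| git stash | Set aside uncommitted changes and return to a clean working directory |
| git stash pop | Reapply the most recently stashed changes and remove them from the stash |
git reset --hard moves the current branch to a different commit and resets the staging area and working directory to match it. Any uncommitted changes to tracked files are lost permanently. Use it only on local branches you have not shared with others.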
Writing good commit messages
A commit message is a gift to your future self and your teammates. Reading the log should tell the story of the project.
Add rate limiting to the login endpoint
Login attempts were unlimited, making brute-force attacks trivial.
Implements a sliding window of 5 attempts per IP per 15 minutes.
Exceeding the limit returns 429 Too Many Requests.
Fixes: JIRA-1234
| Rule | Good example | Bad example |
|---|---|---|
| Imperative mood in subject | Add login rate limiting | Added login rate limit |
| 50 characters or fewer in subject | Fix null check in user parser | Fixed the null pointer bug that was crashing the parser in some edge cases |
| Blank line between subject and body | — (correct) | — (skipped) |
| Explain why, not what | Bypass slow DNS lookup on startup | Change DNS call |
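The mechanical rules in the table can be checked automatically. The sketch below is a minimal, hypothetical checker (check_commit_message is not part of Git); its imperative-mood test is a crude heuristic that only flags a first word ending in "ed", and it cannot judge whether a message explains why.

```python
def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found; an empty list means the message passes."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    if not subject.strip():
        problems.append("subject line is empty")
    if len(subject) > 50:
        problems.append(f"subject is {len(subject)} characters (limit 50)")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Crude imperative-mood heuristic: flag a first word ending in 'ed'.
    first_word = subject.split(" ", 1)[0].lower()
    if first_word.endswith("ed"):
        problems.append(f"'{first_word}' looks like past tense; use the imperative")
    return problems


good = "Add rate limiting to the login endpoint\n\nLogin attempts were unlimited."
bad = "Added login rate limit\nNo blank line before this body."

print(check_commit_message(good))
print(check_commit_message(bad))
```

A check like this could run from a commit-msg hook, though whether that is worthwhile depends on the team.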
Exercises
Part A: Build a history
- Create a new directory and initialise a Git repository.
- Create a Python file with a function that adds two numbers. Commit it.
- Add a docstring to the function. Stage and commit that change separately.
- Add a second function. Commit it with a meaningful message.
- Use git log --oneline to see all three commits.
- Use git show followed by a commit hash to inspect each commit in detail.
Part B: Working with diffs
- Modify the file without staging. Run git diff. What do you see?
- Stage the change with git add. Run git diff again. Why is there no output?
- Run git diff --staged. Now you should see the staged changes.
- Commit. Run git diff HEAD~1 HEAD to compare the last two commits.
Part C: Undoing
- Make a change to the file but do not stage it. Use git restore to discard it.
- Stage a change then use git restore --staged to unstage it.
- Make and commit a change with a typo in the message. Use git commit --amend to fix it.
- Create a commit you want to reverse. Use git revert HEAD to undo it safely.
Part D: .gitignore
- Create a .gitignore file that ignores __pycache__/, *.pyc, and .env.
- Create a file called .env with fake credentials. Verify git status does not list it.
- Run git check-ignore -v .env to confirm .gitignore is working.