Git and GitHub Best Practices for Professional Developers

Q: How do I undo the last commit without losing changes?

Use git reset --soft HEAD~1. This moves HEAD back one commit but keeps all the changes from that commit staged. If you've already pushed the commit, use git revert HEAD instead to create a new commit that undoes the changes.

Q: What's the difference between git fetch and git pull?

git fetch downloads new data from the remote but doesn't change your working directory. git pull is git fetch followed by git merge (or git rebase, if pull.rebase is set). Using git fetch first lets you inspect changes before integrating them, which is safer.

Q: How should I handle large binary files in Git?

Use Git LFS (Large File Storage) to handle large binary files. Git LFS replaces large files with text pointers in the repository while storing the actual file content on a separate server.

Q: How many approvals should be required for a pull request?

For most teams, one approval is the sweet spot. For critical paths like security-sensitive code or database migrations, consider requiring two approvals. Use GitHub's CODEOWNERS file to automatically assign reviewers.

Last updated: May 28, 2026

By kongastral

Published April 11, 2026

Summary

What this post covers: A professional-grade reference for Git and GitHub workflows, including branching strategies, commit conventions, pull-request and code-review practices, CI/CD with GitHub Actions, Git hooks, advanced recovery commands, repository security, and the monorepo-versus-polyrepo trade-off.

Key insights:

Trunk-based development with short-lived branches outperforms Git Flow for most teams that ship multiple times per day. Git Flow’s long-lived develop and release branches impose overhead that only versioned-software teams genuinely require.
Conventional Commits combined with a clear PR template convert project history into an auditable narrative and enable automated changelogs, semantic versioning, and faster debugging with git bisect.
Branch protection rules and required reviews are not optional ceremony. They are the most direct defense against the kind of accidental force-push that destroys weeks of work, and signed commits add assurance about who authored each change.
Most “Git emergencies” (lost commits, bad merges, detached HEAD) are recoverable through git reflog. Understanding Git as a directed acyclic graph of snapshots rather than as a save button distinguishes senior from junior engineers.
Pre-commit hooks (linting, formatting, secret scanning) catch problems before they reach the remote and represent the lowest-cost quality investment a team can make.

Main topics: Why Git Mastery Matters More Than Is Commonly Recognized, Branching Strategies That Scale, Commit Conventions That Tell a Story, Pull Request Best Practices, Code Review Workflow and Standards, GitHub Actions and CI/CD Integration, Git Hooks for Quality Enforcement, Advanced Git Techniques, Security: Protecting Your Repository, Monorepo versus Polyrepo.

In 2017, a developer at a major financial institution accidentally force-pushed to the main branch on a Friday afternoon. The push overwrote three weeks of work from a team of twelve engineers. No branch protection rules were in place, no reviews were required, and the backup strategy amounted to a general instruction to be careful. The team spent the entire weekend reconstructing commits from local copies scattered across developer machines, Slack messages containing code snippets, and memory. The estimated cost, accounting for overtime, delayed releases, and lost client confidence, exceeded $300,000.

The incident was not isolated. A 2023 survey by GitLab found that 40 percent of developers had experienced significant code loss or merge conflicts requiring more than a full day to resolve. Stack Overflow’s developer survey consistently shows that, although over 95 percent of professional developers use Git, the majority rely on fewer than ten commands. They are familiar with git add, git commit, git push, and git pull. When something goes wrong, they typically panic, copy the working directory to the desktop as a precaution, and consult a search engine.

The uncomfortable reality is that most developers use approximately 10 percent of Git’s capabilities. They treat it as a save button rather than as the distributed version control system it is. In an era of collaborative, fast-moving software development in which teams ship dozens of times per day through automated pipelines, this knowledge gap is not merely inconvenient; it is hazardous.

This guide is designed to close that gap. It covers branching strategies used by teams at Google, Meta, and Stripe; commit conventions that render project history genuinely useful; and advanced techniques such as interactive rebase and bisect that can save hours of debugging. The intended audience includes both junior developers seeking to develop their skills and senior engineers who wish to formalize what they already know.

Why Git Mastery Matters More Than Is Commonly Recognized

Git is the most widely used version control system in the world. As of 2025, GitHub alone hosts more than 400 million repositories and has over 100 million developers. GitLab and Bitbucket add tens of millions more. Every Fortune 500 company uses Git in some form. It is not a tool that can be used casually.

Git mastery, however, is not merely a matter of knowing commands. It is a matter of understanding workflows: the patterns and conventions that allow teams of five, fifty, or five thousand developers to work on the same codebase without disruption. A developer who understands Git deeply can perform the following tasks.

Resolve merge conflicts in minutes rather than hours, because they understand what Git is actually tracking.
Navigate project history to determine when and why a defect was introduced, using tools such as git bisect and git log.
Recover from mistakes—accidental commits, bad merges, and even deleted branches—using git reflog.
Collaborate effectively through well-structured pull requests and meaningful commit messages.
Automate quality checks using Git hooks that run before code reaches the remote repository.

The difference between a developer who “uses Git” and one who “understands Git” becomes especially apparent during incidents. When production is down and the team must identify the commit that introduced the regression, revert it cleanly, and deploy a fix within minutes, Git proficiency directly affects the team’s mean time to recovery (MTTR).

Key Takeaway: Git proficiency is a force multiplier. Time invested in learning Git deeply yields daily returns in faster debugging, smoother collaboration, and fewer catastrophic errors.

Building the Correct Mental Model

Before discussing specific practices, it helps to establish a mental model that simplifies the material that follows.

Git is fundamentally a directed acyclic graph (DAG) of snapshots. Every commit is a complete snapshot of the project at a point in time, linked to its parent commits. Branches are movable pointers to commits. Tags are fixed pointers. HEAD is a pointer to the branch or commit currently in use.

Internalizing this model removes much of Git’s apparent mystery. A merge creates a new commit with two parents. A rebase replays commits on top of a new base. A cherry-pick copies a single commit to a new location. These are graph operations, not arcane procedures.
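
The graph model can be made concrete with a toy sketch. The Commit and Repo classes below are hypothetical and model only messages and parent links, not Git's real object format; they are meant to show that a branch is a movable name and that a merge is simply a commit with two parents.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A snapshot label plus links to parent commits (toy model, not Git's format)."""
    message: str
    parents: tuple = ()

    @property
    def sha(self) -> str:
        # The id depends on the parents, so the same change on a new base gets a new id.
        data = self.message + "".join(p.sha for p in self.parents)
        return hashlib.sha1(data.encode()).hexdigest()[:7]


class Repo:
    def __init__(self):
        self.branches = {"main": Commit("initial commit")}  # name -> commit pointer
        self.head = "main"                                  # HEAD names the current branch

    def commit(self, message):
        new = Commit(message, (self.branches[self.head],))
        self.branches[self.head] = new  # the branch pointer moves forward
        return new

    def branch(self, name):
        self.branches[name] = self.branches[self.head]  # a new pointer, no copying

    def checkout(self, name):
        self.head = name

    def merge(self, other):
        ours, theirs = self.branches[self.head], self.branches[other]
        merged = Commit(f"Merge {other} into {self.head}", (ours, theirs))
        self.branches[self.head] = merged
        return merged


repo = Repo()
repo.branch("feature")
repo.checkout("feature")
repo.commit("feat: add login")
repo.checkout("main")
repo.commit("fix: typo")
merge_commit = repo.merge("feature")
print(len(merge_commit.parents), [p.message for p in merge_commit.parents])
```

Because the id is derived from the parents as well as the message, replaying a commit on a different base (as rebase and cherry-pick do) necessarily produces a new id in this model, which mirrors why rebased commits get new hashes.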

Understanding this graph model is particularly important when working with the same repository across Docker-based development environments, where multiple containers may interact with the same codebase, or when a CI/CD pipeline must decide on actions based on what changed between commits.

Branching Strategies That Scale

Choosing the appropriate branching strategy is one of the most consequential decisions a team makes. The wrong strategy creates bottlenecks, increases merge conflicts, and slows delivery. The right one makes collaboration feel effortless.

Three branching strategies dominate professional software development, each optimized for different team sizes and release cadences.

Git Flow

Introduced by Vincent Driessen in 2010, Git Flow uses two long-lived branches—main (production) and develop (integration)—along with short-lived feature, release, and hotfix branches. It is the most structured of the three strategies.

The workflow proceeds as follows.

Developers create feature branches from develop.
Completed features merge back into develop.
When enough features have accumulated, a release branch is cut from develop.
The release branch receives final testing and bug fixes.
The release merges into both main (tagged with a version) and back into develop.
Hotfix branches are created from main for critical production bugs and then merged into both main and develop.

When to use Git Flow: teams with scheduled releases, such as mobile apps subject to App Store review cycles, products that must maintain multiple versions simultaneously, or organizations with strict release-management processes.

When to avoid it: for teams that deploy continuously (multiple times per day), Git Flow imposes unnecessary ceremony. The release-branch process becomes a bottleneck when fast shipping is the priority.

GitHub Flow

GitHub Flow is substantially simpler. There is one long-lived branch, main; everything else is a feature branch.

Create a branch from main.
Make commits on that branch.
Open a pull request.
Discuss and review the code.
Merge to main and deploy.

This is the complete workflow. There is no develop branch, no release branches, and no hotfix branches. The simplicity is intentional. Every merge to main triggers a deployment, which means that main must always be deployable.

When to use GitHub Flow: web applications with continuous deployment, SaaS products, open-source projects, and any team that deploys frequently and wishes to minimize process overhead.

Trunk-Based Development

Trunk-Based Development (TBD) simplifies the workflow further. Developers commit directly to the trunk (main) or use very short-lived feature branches that last no more than a day or two. This is the strategy used by Google, where thousands of engineers commit to a single monorepo.

The key enablers for trunk-based development are listed below.

Feature flags: incomplete features are hidden behind toggles so that they can reside in the codebase without being visible to users.
Comprehensive automated testing: with no release branch available for manual QA, automated tests must be thorough.
Small, incremental changes: large features are decomposed into small, independently deployable pieces.
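
As a minimal sketch of the feature-flag idea, the snippet below reads flags from environment variables. The FEATURE_ prefix, the flag name, and the checkout_page function are all hypothetical; real teams often use a dedicated flag service instead.

```python
import os


def flag_enabled(name: str, env=os.environ) -> bool:
    """Return True when FEATURE_<NAME> is set to a truthy value (assumed naming scheme)."""
    value = env.get(f"FEATURE_{name.upper()}", "")
    return value.strip().lower() in {"1", "true", "on", "yes"}


def checkout_page(env=os.environ) -> str:
    # The new flow is merged to trunk but stays hidden until the flag is switched on.
    if flag_enabled("new_checkout", env):
        return "new checkout flow"
    return "legacy checkout flow"


print(checkout_page({}))
print(checkout_page({"FEATURE_NEW_CHECKOUT": "true"}))
```

The incomplete code path ships with every deployment, yet users only see it once the flag is enabled, which is what allows small increments to land on the trunk daily.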

When to use TBD: high-velocity teams with strong CI/CD pipelines, experienced developers who can work in small increments, and organizations that prioritize deployment speed over release ceremony.

Aspect	Git Flow	GitHub Flow	Trunk-Based
Long-lived branches	main + develop	main only	main only
Feature branch lifespan	Days to weeks	Hours to days	Hours (max 1-2 days)
Release process	Release branches	Merge to main = deploy	Continuous from trunk
Complexity	High	Low	Low
Best for	Scheduled releases	Continuous deployment	High-velocity teams
Team size	Medium to large	Any size	Senior/experienced teams

Tip: Teams beginning to formalize a Git workflow should start with GitHub Flow. It is simple enough that everyone can learn it quickly and flexible enough to scale. Migration to trunk-based development is straightforward once CI/CD maturity has improved.

Commit Conventions That Tell a Story

The commit history is a narrative of a project’s evolution. A well-maintained history allows any developer to understand what changed, why it changed, and when it changed without reading every line of code. A poorly maintained history is noise.

The following two commit histories from real projects illustrate the contrast.

# Bad history — tells you nothing
fix stuff
updates
WIP
more changes
asdfasdf
final fix (for real this time)
oops

# Good history — tells a story
feat(auth): add JWT refresh token rotation
fix(api): handle race condition in concurrent order processing
docs(readme): add deployment instructions for AWS
refactor(db): extract connection pooling into shared module
test(auth): add integration tests for OAuth2 flow

The difference is substantial. The remainder of this section discusses how to achieve the second style consistently.

The Conventional Commits Specification

Conventional Commits is a lightweight convention for commit messages that provides structure without imposing significant overhead. The format is as follows.

<type>(<scope>): <description>

[optional body]

[optional footer(s)]

The type describes the category of change.

Type	Purpose	Example
`feat`	New feature	feat(cart): add quantity selector to checkout
`fix`	Bug fix	fix(auth): prevent session hijacking on token refresh
`docs`	Documentation only	docs(api): update rate limiting section
`style`	Formatting, no code change	style: apply prettier to all JS files
`refactor`	Code change that’s not a fix or feature	refactor(db): simplify query builder interface
`perf`	Performance improvement	perf(search): add index for full-text queries
`test`	Adding or fixing tests	test(payments): add edge cases for currency conversion
`chore`	Maintenance tasks	chore(deps): upgrade React from 18.2 to 18.3
`ci`	CI/CD configuration changes	ci: add Node.js 20 to test matrix

The scope (optional but recommended) identifies the module, component, or area of the codebase affected. The description is a short, imperative statement of what the commit does: “add,” not “added” or “adds.”
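
The header format lends itself to automated checking. The following is a minimal sketch, assuming that only the types from the table above are allowed (teams commonly extend this list); parse_header is a hypothetical helper, not part of any official tooling.

```python
import re

# Types taken from the table above; the allowed set is a team-level choice.
HEADER = re.compile(
    r"^(?P<type>feat|fix|docs|style|refactor|perf|test|chore|ci)"
    r"(?:\((?P<scope>[^()\s]+)\))?"   # optional scope in parentheses
    r": (?P<description>\S.*)$"       # colon, one space, non-empty description
)


def parse_header(line: str):
    """Return the header's parts as a dict, or None if it does not follow the format."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


print(parse_header("feat(auth): add JWT refresh token rotation"))
print(parse_header("style: apply prettier to all JS files"))
print(parse_header("fix stuff"))
```

A check like this could run in a commit-msg hook or in CI so that non-conforming messages are rejected before they reach the shared history.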

The Discipline of Atomic Commits

An atomic commit contains exactly one logical change. Not two; not half of one; exactly one.

This is more difficult than it sounds. Developers naturally work on multiple things simultaneously. They begin to fix a bug and notice a typo in a comment. They refactor a function and recognize that the tests should also be updated. Within a short time, the working directory contains changes spanning five files and three unrelated concerns.

The discipline of atomic commits involves using git add -p (patch mode) to stage only the hunks related to one change, committing, and then staging and committing the next change. This approach is fundamental to clean code principles: a commit history should be as well-organized as the code itself.

# Stage specific parts of a file interactively
git add -p src/auth/login.py

# Git will show each "hunk" (changed section) and ask:
# Stage this hunk [y,n,q,a,d,s,e,?]?
# y = yes, n = no, s = split into smaller hunks, e = edit manually

# After staging the relevant hunks, commit
git commit -m "fix(auth): validate email format before database lookup"

# Now stage and commit the next logical change
git add -p src/auth/login.py
git commit -m "refactor(auth): extract validation logic into separate module"

The reason this matters is practical. Six months later, when a specific change must be reverted with git revert or a fix cherry-picked to a release branch, atomic commits enable a clean operation. If a single commit combines a bug fix and an unrelated refactor, reverting the buggy part also reverts the good refactor.

Caution: Work-in-progress (WIP) commits should never be pushed to shared branches. When work must be saved before a context switch, git stash or a personal branch prefixed with WIP is preferable. The history should be cleaned up before a pull request is opened.

Writing Commit Messages of Lasting Value

The commit description answers “what.” The commit body answers “why.” A template for non-trivial commits is shown below.

fix(api): return 429 status when rate limit is exceeded

Previously, the API returned a generic 500 error when a client
exceeded the rate limit. This made it impossible for clients to
distinguish between server errors and rate limiting, leading to
incorrect retry behavior.

Now returns 429 Too Many Requests with a Retry-After header,
conforming to RFC 6585. Clients can use this header to implement
proper exponential backoff.

Fixes #1234
See also: https://datatracker.ietf.org/doc/html/rfc6585

The structure is straightforward: an imperative subject line (ideally around 50 characters and never more than 72), a blank line, and then a body wrapped at 72 characters that explains the state before, the state after, and why the change was required. This pattern, sometimes called the “50/72 rule,” is widely adopted because git log does not wrap text itself and many tools truncate long subject lines.
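
The length and blank-line rules can be checked mechanically. The sketch below is one possible checker; check_message and its default limits are assumptions chosen to match the 72-character boundary described above, and a team preferring a stricter subject limit could pass subject_max=50.

```python
def check_message(message: str, subject_max: int = 72, body_width: int = 72) -> list:
    """Return a list of human-readable problems; an empty list means the message passes."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["subject line is empty"]

    problems = []
    if len(lines[0]) > subject_max:
        problems.append(f"subject is {len(lines[0])} characters (max {subject_max})")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    for number, line in enumerate(lines[2:], start=3):
        # Lines containing URLs are exempt because URLs cannot be wrapped.
        if len(line) > body_width and "://" not in line:
            problems.append(f"line {number} is {len(line)} characters (max {body_width})")
    return problems


good = (
    "fix(api): return 429 status when rate limit is exceeded\n"
    "\n"
    "Previously, the API returned a generic 500 error.\n"
    "\n"
    "Fixes #1234"
)
print(check_message(good))
print(check_message("x" * 80 + "\nno blank line here"))
```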

Pull Request Best Practices

Pull requests (PRs) are where individual work becomes team work. A good PR makes the reviewer’s task straightforward. A poor PR—a 3,000-line submission with the description “some updates”—leaves everyone frustrated and typically results in a rubber-stamp approval, which defeats the entire purpose of code review.

The Primary Rule: Keep PRs Small

Research from Google’s engineering practices indicates a clear correlation: larger PRs are less effective to review. Reviewer attention degrades sharply after approximately 200 to 400 lines of changes. A 2,000-line PR almost guarantees that subtle bugs will slip through because no reviewer can sustain focused attention across that much code.

The ideal PR exhibits the following properties.

Under 400 lines of changed code, excluding generated files, lock files, and test fixtures.
Focused on a single concern: one feature, one bug fix, or one refactor.
Self-contained: it does not leave the codebase in a broken state if no subsequent PRs are merged.
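
The size guideline can be approximated automatically. The sketch below counts changed lines from the text that git diff --numstat prints (added, deleted, and path separated by tabs, with "-" for binary files); the EXCLUDED patterns and the sample data are illustrative assumptions, not a standard list.

```python
from fnmatch import fnmatch

# Hypothetical patterns for generated files, lock files, and test fixtures.
EXCLUDED = ("*.lock", "package-lock.json", "*/fixtures/*", "*_generated.*")


def changed_lines(numstat: str, excluded=EXCLUDED) -> int:
    """Sum added and deleted lines from `git diff --numstat` output, skipping exclusions."""
    total = 0
    for row in numstat.strip().splitlines():
        added, deleted, path = row.split("\t", 2)
        if added == "-":  # binary files report "-" instead of line counts
            continue
        basename = path.rsplit("/", 1)[-1]
        if any(fnmatch(path, p) or fnmatch(basename, p) for p in excluded):
            continue
        total += int(added) + int(deleted)
    return total


sample = (
    "120\t30\tsrc/api/limits.py\n"
    "4000\t3900\tpackage-lock.json\n"
    "-\t-\tdocs/diagram.png\n"
    "80\t0\ttests/fixtures/users.json\n"
    "45\t5\ttests/test_limits.py"
)
size = changed_lines(sample)
print(size, "lines changed;", "OK" if size <= 400 else "consider splitting this PR")
```

In practice the input would come from a command such as git diff --numstat main...HEAD, and the result could be posted as a warning rather than a hard failure, since some large changes are legitimate.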

If a feature requires 2,000 lines of code, it should be decomposed into a stack of four or five smaller PRs that build on one another. Many teams use tools such as Graphite or ghstack to manage stacked PRs.

Writing PR Descriptions That Accelerate Review

A good PR description follows a template that answers three questions: what was changed, why the change was made, and how the reviewer can verify it.

## What

Add rate limiting to the public API endpoints using a
token bucket algorithm. Limits are configurable per
endpoint and per API key tier.

## Why

We've been experiencing abuse from scrapers hitting our
search endpoint at 1000+ requests/minute, degrading
performance for legitimate users. This was flagged in
incident INC-2847.

## How to Test

1. Run `make test-integration` to execute the new rate
   limiting tests
2. For manual testing:
   - Start the server: `docker compose up`
   - Hit the endpoint rapidly: `for i in {1..100}; do
     curl -s -o /dev/null -w "%{http_code}\n"
     http://localhost:8000/api/search; done`
   - Verify you get 429 responses after exceeding the limit

## Screenshots

[Before/after screenshots if applicable]

## Checklist

- [x] Tests pass locally
- [x] Documentation updated
- [x] No breaking API changes
- [x] Retry-After header added per RFC 6585

A description of this kind reduces a thirty-minute review to ten minutes. The reviewer does not need to infer why the change exists or how to test it; the information is provided directly.

PR Etiquette That Builds Team Trust

Pull requests involve human interaction as much as they involve code. The following conventions help sustain a healthy PR culture.

For authors:

Respond to every review comment, even briefly with “Done” or “Good point, fixed.”
Treat review feedback as a critique of code, not of the author personally.
Where there is disagreement with feedback, explain the reasoning rather than ignoring the comment.
Self-review the PR before requesting reviews; many obvious issues can be caught this way.
Add inline comments to complex sections to proactively explain the reasoning.

For reviewers:

Review within twenty-four hours; blocking a colleague’s PR for days disregards their time.
Distinguish between blocking concerns and minor suggestions; prefix optional remarks with “nit:” or “optional:”.
Explain why something should change, not only what should change.
Approve with comments where appropriate; not every suggestion needs to block the merge.
Acknowledge good work. A brief “nice approach here” carries weight.

Tip: The GitHub repository should be configured with branch protection rules that require at least one approving review, passing CI checks, and up-to-date branches before a merge. This prevents accidental merges of broken code and ensures that the review process is followed consistently.

Code Review Workflow and Standards

Code review is among the highest-value activities in software engineering. Google’s data indicate that code review catches approximately 15 percent of defects before they reach production. The benefits extend well beyond defect detection.

Knowledge sharing: reviews distribute awareness of the codebase across the team, reducing single-person dependency.
Mentorship: senior developers can guide juniors through real-world coding decisions.
Consistency: reviews enforce coding standards and architectural patterns across the team.
Documentation: the PR discussion becomes a record of why decisions were made.

What to Examine in a Code Review

A thorough code review examines several dimensions.

Correctness: does the code do what it claims to do? Are edge cases handled? Are off-by-one errors, null-pointer risks, or race conditions present?

Design: is the approach appropriate? Could it be simpler? Does it follow existing patterns in the codebase? Will it scale?

Readability: can another developer understand the code six months from now? Are variable names descriptive? Is the logic clear rather than unnecessarily clever?

Testing: are tests present? Do they cover the important cases? Do they test behavior (preferred) or implementation details (fragile)?

Security: is user input validated? Are SQL-injection or XSS vulnerabilities present? Are secrets hard-coded? This is especially important when building REST APIs with frameworks such as FastAPI, where input validation must be rigorous.

Performance: are there N+1 queries, unbounded loops, memory leaks, or large allocations in hot paths?

Automating the Routine Parts

Human reviewers should focus on design, logic, and architecture rather than formatting, style, or obvious errors. Everything that can be automated should be automated.

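# .github/workflows/code-quality.yml
name: Code Quality
on: [pull_request]

# Each job installs the project's dev dependencies first, so that npx runs the
# versions pinned in package-lock.json (assumes a lockfile is committed).
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Run linter
        run: npx eslint . --format=json --output-file=lint-results.json

  format-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Check formatting
        run: npx prettier --check "src/**/*.{ts,tsx,json}"

  type-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: TypeScript type check
        run: npx tsc --noEmit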

When linting, formatting, and type checking are handled by CI, reviewers can omit “missing semicolon” comments and focus on substantive issues.

GitHub Actions and CI/CD Integration

GitHub Actions has become the de facto CI/CD platform for projects hosted on GitHub. It integrates seamlessly with pull requests, branch protection rules, and the wider GitHub ecosystem. Effective use of Actions is a core professional skill.

Anatomy of a GitHub Actions Workflow

A workflow is defined in a YAML file under .github/workflows/. The following is a representative example for a Python project of the kind one might use when building a FastAPI application; the deploy step is a placeholder.

# .github/workflows/ci.yml
name: CI Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: testpass
          POSTGRES_DB: testdb
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Cache dependencies
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
          restore-keys: ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run linting
        run: |
          ruff check .
          ruff format --check .

      - name: Run tests with coverage
        run: |
          pytest --cov=src --cov-report=xml --cov-report=term-missing
        env:
          DATABASE_URL: postgresql://postgres:testpass@localhost:5432/testdb

      - name: Upload coverage
        if: matrix.python-version == '3.12'
        uses: codecov/codecov-action@v4
        with:
          file: ./coverage.xml

  security:
    runs-on: ubuntu-latest
    steps:
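      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so the secret scan can inspect earlier commits
      - name: Run security scan
        uses: pyupio/safety-action@v1
      - name: Check for secrets
        # Pin to a release tag or commit SHA in real use; @main tracks a moving branch.
        uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified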

  deploy:
    needs: [test, security]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        run: echo "Deploy step here"
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}

This workflow demonstrates several best practices: matrix testing across Python versions, service containers for database tests, dependency caching for faster builds, security scanning as a separate job, and conditional deployment that only runs on main branch pushes after all checks pass.

Protecting the Main Branch

Branch protection rules are the safeguards that prevent accidents. At a minimum, the following should be configured for the main branch.

# Configure via GitHub UI: Settings > Branches > Branch protection rules
# Or via GitHub CLI. -F sends typed values (true, 1, null); -f sends plain strings.
# Context names must match the check names GitHub reports; a matrix job reports
# one check per entry (for example "test (3.12)"), so adjust "test" accordingly.
gh api repos/{owner}/{repo}/branches/main/protection -X PUT \
  -F "required_status_checks[strict]=true" \
  -f "required_status_checks[contexts][]=test" \
  -f "required_status_checks[contexts][]=security" \
  -F "required_pull_request_reviews[required_approving_review_count]=1" \
  -F "required_pull_request_reviews[dismiss_stale_reviews]=true" \
  -F "enforce_admins=true" \
  -F "restrictions=null"

These rules ensure that:

No one can push directly to main (all changes go through PRs)
At least one team member must approve the PR
All CI checks must pass before merging
Stale approvals are dismissed when new commits are pushed (preventing approval bypass)
Even repository admins must follow the rules
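
The same settings can also be expressed as a JSON request body, which makes the value types explicit (booleans, an integer, and null rather than strings). The sketch below builds that body with the fields used in the command above; protection_payload is a hypothetical helper, and the check names are placeholders that must match the checks the repository actually reports.

```python
import json


def protection_payload(required_checks, approvals: int = 1) -> dict:
    """Build a branch-protection request body mirroring the CLI fields above."""
    return {
        "required_status_checks": {
            "strict": True,                     # branch must be up to date before merging
            "contexts": list(required_checks),  # names of checks that must pass
        },
        "enforce_admins": True,
        "required_pull_request_reviews": {
            "required_approving_review_count": approvals,
            "dismiss_stale_reviews": True,
        },
        "restrictions": None,                   # serialized as JSON null
    }


body = json.dumps(protection_payload(["test", "security"]), indent=2)
print(body)
```

One option is to save the output to a file and pass it to gh api with the --input flag, which avoids quoting nested fields on the command line.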

Git Hooks for Quality Enforcement

Git hooks are scripts that run automatically at specific points in the Git workflow. They serve as a first line of defense, catching issues on the developer’s machine before code even reaches the remote repository.

Essential Git Hooks

The two most useful client-side hooks are pre-commit and pre-push.

Pre-commit runs before every commit and is suited to fast checks such as linting, formatting, and static analysis. If the hook fails, the commit is rejected.

Pre-push runs before every push to a remote and is suited to slower checks such as running the test suite, type checking, or security scanning. It is the last gate before code leaves the developer’s machine.

#!/bin/sh
# .git/hooks/pre-commit

echo "Running pre-commit checks..."

# Check for formatting issues
if ! npx prettier --check "src/**/*.{ts,tsx,json}" 2>/dev/null; then
    echo "ERROR: Formatting issues found. Run 'npx prettier --write .' to fix."
    exit 1
fi

# Run linter
if ! npx eslint src/ --quiet; then
    echo "ERROR: Linting errors found. Fix them before committing."
    exit 1
fi

# Check for console.log statements in newly staged lines only
if git diff --cached -U0 | grep -E '^\+.*console\.log' | grep -v '^+++'; then
    echo "ERROR: Found console.log statements in staged changes."
    echo "Remove them or use a proper logger before committing."
    exit 1
fi

# Check for secrets (basic check on added lines; not a substitute for a real scanner)
if git diff --cached -U0 | grep -E '^\+' | grep -iE '(api_key|secret|password|token)[[:space:]]*=' | grep -v '#' | grep -v '//'; then
    echo "ERROR: Possible secrets detected in staged changes!"
    exit 1
fi

echo "All pre-commit checks passed."
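
The basic secret check in the script above can be tightened by looking only at added lines and requiring a quoted value. The sketch below takes unified diff text (such as the output of git diff --cached) and reports suspicious additions; the pattern and the eight-character minimum are illustrative assumptions, and a dedicated scanner will catch far more than this.

```python
import re

# Assumed heuristic: a secret-like name assigned a quoted value of 8+ characters.
SECRET_PATTERN = re.compile(
    r"(api_key|secret|password|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]",
    re.IGNORECASE,
)


def find_secrets(diff_text: str) -> list:
    """Return (file, line) pairs for added lines that look like hard-coded secrets."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # File header, e.g. "+++ b/config.py"
            current_file = line[6:] if line.startswith("+++ b/") else line[4:]
        elif line.startswith("+") and SECRET_PATTERN.search(line):
            findings.append((current_file, line[1:].strip()))
    return findings


sample_diff = """diff --git a/config.py b/config.py
--- a/config.py
+++ b/config.py
@@ -1,2 +1,3 @@
 DEBUG = False
+API_KEY = "sk_live_1234567890"
-password = "oldpassword123"
+timeout = 30
"""
for path, text in find_secrets(sample_diff):
    print(f"{path}: {text}")
```

Removed lines are ignored on purpose: deleting a hard-coded password should not block the commit, although the old value still needs rotating because it remains in history.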

Using Husky and lint-staged for JavaScript/TypeScript Projects

Managing Git hooks manually is tedious. Husky automates hook installation, and lint-staged runs tools only on staged files (not the entire project), making hooks fast even in large codebases.

# Install Husky and lint-staged
npm install --save-dev husky lint-staged

# Initialize Husky
npx husky init

# Create pre-commit hook
echo "npx lint-staged" > .husky/pre-commit

Configure lint-staged in package.json:

{
  "lint-staged": {
    "*.{ts,tsx}": [
      "eslint --fix",
      "prettier --write"
    ],
    "*.{json,md}": [
      "prettier --write"
    ],
    "*.py": [
      "ruff check --fix",
      "ruff format"
    ]
  }
}

For Python projects, the equivalent tool is pre-commit (confusingly named the same as the Git hook). It supports hooks for any language and manages tool versions automatically:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
        args: ['--maxkb=500']
      - id: detect-private-key

Key Takeaway: Git hooks shift quality enforcement left, catching issues on the developer’s machine rather than in CI. This creates a faster feedback loop and reduces wasted CI minutes. Combine local hooks for fast checks with CI for comprehensive checks.

Advanced Git Techniques

The techniques in this section separate competent Git users from Git power users. These commands can save a developer hours of debugging and make complex code-history operations feel routine.

Interactive Rebase: Rewriting History Carefully

Interactive rebase (git rebase -i) allows a developer to rewrite commit history before sharing it. This is particularly powerful for consolidating a disorganized development history into a clean, logical sequence of commits before opening a PR.

# Rebase the last 5 commits interactively
git rebase -i HEAD~5

# Your editor will show something like:
pick a1b2c3d feat(auth): add login endpoint
pick d4e5f6a WIP: working on validation
pick b7c8d9e fix typo
pick f0a1b2c add input validation
pick c3d4e5f feat(auth): add password reset flow

# Change to:
pick a1b2c3d feat(auth): add login endpoint
fixup d4e5f6a WIP: working on validation    # merge into previous, discard message
fixup b7c8d9e fix typo                      # merge into previous, discard message
squash f0a1b2c add input validation          # merge into previous, edit message
pick c3d4e5f feat(auth): add password reset flow

# Result: 3 messy commits become part of the first commit
# with a clean, combined message

The commands available in interactive rebase are listed below.

Command	What It Does
`pick`	Keep the commit as-is
`reword`	Keep changes but edit the commit message
`squash`	Merge into the previous commit, combine messages
`fixup`	Merge into previous commit, discard this commit’s message
`edit`	Pause rebase to amend the commit (add/remove files, split it)
`drop`	Delete the commit entirely
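
To make the difference between these commands concrete, the sketch below simulates how a todo list shapes the resulting commit messages. It is a toy model under stated assumptions: apply_todo is hypothetical, it tracks messages only (not file changes), and it covers just pick, squash, fixup, and drop.

```python
def apply_todo(todo):
    """Apply (command, message) pairs in order and return the resulting commit messages."""
    result = []  # each entry is the list of messages kept for one resulting commit
    for command, message in todo:
        if command == "pick":
            result.append([message])
        elif command in ("squash", "fixup"):
            if not result:
                raise ValueError(f"cannot {command} without a previous commit")
            if command == "squash":
                result[-1].append(message)  # squash keeps the message for editing
            # fixup folds the changes in but discards the message
        elif command == "drop":
            continue
        else:
            raise ValueError(f"unsupported command: {command}")
    return ["\n\n".join(messages) for messages in result]


todo = [
    ("pick", "feat(auth): add login endpoint"),
    ("fixup", "WIP: working on validation"),
    ("fixup", "fix typo"),
    ("squash", "add input validation"),
    ("pick", "feat(auth): add password reset flow"),
]
for message in apply_todo(todo):
    print(repr(message))
```

Five commits collapse into two: the first carries its own message plus the squashed one (which Git would open in an editor for cleanup), and the password reset commit is untouched.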

Caution: Never rebase commits that have been pushed to a shared branch. Rebasing rewrites commit hashes, which means anyone else who has pulled those commits will find that their history has diverged from the remote. The golden rule: rebase local commits before pushing; never rebase shared history.

Git Bisect: Finding Bugs with Binary Search

git bisect uses binary search to identify which commit introduced a bug. Instead of checking every commit one by one, it narrows down the responsible commit in logarithmic time, examining roughly 10 commits to search through 1,000.

# Start bisecting
git bisect start

# Mark the current commit as bad (has the bug)
git bisect bad

# Mark a known good commit (before the bug existed)
git bisect good v2.1.0

# Git checks out a commit halfway between good and bad
# Test it, then tell Git:
git bisect good  # if this commit doesn't have the bug
# or
git bisect bad   # if this commit has the bug

# Git narrows the range and checks out the next commit to test
# Repeat until Git identifies the exact commit

# When done:
git bisect reset

# Pro tip: Automate bisect with a test script
git bisect start HEAD v2.1.0
git bisect run python -m pytest tests/test_auth.py::test_login -x

The automated version (git bisect run) is especially powerful. When supplied with a command that exits with code 0 for “good” and a code between 1 and 127 (except 125) for “bad,” it finds the offending commit without any manual intervention. Exit code 125 tells Git to skip a commit that cannot be tested, for example one that does not build. This is a valuable technique when tracking down regressions in complex systems, whether the work involves Python or Rust codebases.
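The logarithmic behaviour can be checked with a short Python sketch of the underlying binary search. It is a simplification under stated assumptions: the history is linear and ordered oldest to newest, the newest commit is known to be bad, and every commit after the first bad one is also bad. Real Git bisects over a commit graph, and find_first_bad and the commit names here are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def find_first_bad(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumes commits are ordered oldest to newest and the last one is bad.
    """
    low, high = 0, len(commits) - 1  # the first bad commit lies in [low, high]
    tested = 0
    while low < high:
        mid = (low + high) // 2
        tested += 1
        if is_bad(commits[mid]):
            high = mid       # first bad commit is mid or earlier
        else:
            low = mid + 1    # first bad commit is after mid
    return commits[low], tested


# 1,000 hypothetical commits; the bug appears at c0637 and persists afterwards.
commits = [f"c{i:04d}" for i in range(1000)]
first_bad, tested = find_first_bad(commits, lambda commit: commit >= "c0637")
print(first_bad, tested)
```

For 1,000 candidate commits the loop never needs more than 10 tests, which matches the estimate given earlier in this section.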

Cherry-Pick: Surgical Commit Transplanting

git cherry-pick copies a specific commit from one branch to another. It is essential for backporting fixes to release branches or for selectively applying changes.

# Apply a specific commit to the current branch
git cherry-pick a1b2c3d

# Cherry-pick without committing (stage the changes instead)
git cherry-pick --no-commit a1b2c3d

# Cherry-pick a range of commits (a1b2c3d itself is excluded)
git cherry-pick a1b2c3d..f4e5d6c

# Include a1b2c3d as well
git cherry-pick a1b2c3d^..f4e5d6c

# If there are conflicts during cherry-pick:
# Fix the conflicts, then:
git cherry-pick --continue
# Or abort:
git cherry-pick --abort

A common use case arises after an important bug has been fixed on main and the same fix is also required on a release branch. Instead of merging all of main into the release branch, which would include unfinished features, a developer can cherry-pick only the fix commit.

Reflog: The Git Safety Net

The reflog (reference log) is Git’s undo history. It records every time HEAD moves, including commits, merges, rebases, resets, and checkouts. Even when commits appear to have been lost through a bad rebase or a hard reset, the reflog usually retains them.

# View the reflog
git reflog

# Output looks like:
# a1b2c3d HEAD@{0}: commit: feat(api): add rate limiting
# d4e5f6g HEAD@{1}: rebase: finishing
# h7i8j9k HEAD@{2}: rebase: starting
# l0m1n2o HEAD@{3}: commit: fix(db): close connection on error
# p3q4r5s HEAD@{4}: checkout: moving from feature-x to main

# Recover a commit lost during rebase
git checkout -b recovery-branch HEAD@{3}

# Or reset to a previous state (--hard discards uncommitted changes)
git reset --hard HEAD@{4}

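The reflog functions as a time machine. It is the reason that, in Git, it is almost impossible to truly lose committed work: the data is still present and only needs to be located. By default, reflog entries are kept for 90 days, while entries for commits no longer reachable from the current branch tip (the usual case after a bad rebase or reset) are kept for 30 days, which still provides a generous window for recovery. The reflog is local to each clone and is never pushed, so it can only recover work that existed in that clone.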

Tip: If a branch is accidentally deleted or a reset targets the wrong commit, recovery is straightforward. Run git reflog, find the required commit hash, and create a new branch pointing to it: git checkout -b rescue HEAD@{n}.
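Finding the right entry by eye gets tedious in a long reflog. The following Python sketch parses reflog text and picks the entry that precedes the most recent rebase. It assumes the line format shown in the sample output above; parse_reflog and state_before_last_rebase are hypothetical helpers, not part of Git.

```python
import re
from typing import List, NamedTuple, Optional


class ReflogEntry(NamedTuple):
    commit: str
    selector: str
    action: str
    detail: str


# Matches lines such as: a1b2c3d HEAD@{0}: commit: feat(api): add rate limiting
LINE = re.compile(r"^(\S+) (HEAD@\{\d+\}): ([^:]+): (.*)$")


def parse_reflog(text: str) -> List[ReflogEntry]:
    """Parse reflog output (newest entry first) into structured entries."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            entries.append(ReflogEntry(*match.groups()))
    return entries


def state_before_last_rebase(entries: List[ReflogEntry]) -> Optional[ReflogEntry]:
    """Return the newest entry that is older than the most recent rebase."""
    seen_rebase = False
    for entry in entries:  # newest first
        if entry.action.startswith("rebase"):
            seen_rebase = True
        elif seen_rebase:
            return entry
    return None


sample = """\
a1b2c3d HEAD@{0}: commit: feat(api): add rate limiting
d4e5f6g HEAD@{1}: rebase: finishing
h7i8j9k HEAD@{2}: rebase: starting
l0m1n2o HEAD@{3}: commit: fix(db): close connection on error
p3q4r5s HEAD@{4}: checkout: moving from feature-x to main
"""
target = state_before_last_rebase(parse_reflog(sample))
print(target.selector, target.commit)
```

The selector it reports can be passed straight to git checkout -b or git reset, as in the commands above.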

Git Worktree: Multiple Working Directories

A developer often needs to work on a hotfix while a feature branch still has uncommitted changes. Instead of stashing, which can become disorganized, git worktree creates a separate working directory for the same repository.

# Create a new worktree for an existing hotfix branch
# (add -b <new-branch> to create the branch at the same time)
git worktree add ../hotfix-branch hotfix/critical-bug

# Work in the new directory
cd ../hotfix-branch
# Make changes, commit, push

# When done, remove the worktree
git worktree remove ../hotfix-branch

# List all worktrees
git worktree list

Each worktree is a fully functional checkout with its own staging area and working directory. A developer can maintain as many as required, all sharing the same repository history and objects. This is especially useful for those who frequently context-switch between tasks.

Security: Protecting the Repository

Security in Git extends beyond writing secure code. It also requires ensuring that the repository itself does not become a vulnerability vector. A single committed secret can compromise an entire infrastructure.

A Comprehensive .gitignore

The .gitignore file is the first line of defense against accidentally committing sensitive files. A comprehensive template should be used as a starting point and then customized for the specific technology stack.

# Environment and secrets
.env
.env.*
!.env.example
*.pem
*.key
*.p12
credentials.json
service-account.json

# Dependencies
node_modules/
vendor/
__pycache__/
*.pyc
.venv/
venv/

# Build output
dist/
build/
*.egg-info/
target/

# IDE files
.idea/
.vscode/settings.json
*.swp
*.swo
.DS_Store

# Logs and databases
*.log
*.sqlite3
*.db

# Test and coverage
coverage/
.coverage
htmlcov/
.pytest_cache/
.nyc_output/

When an application is containerized with Docker for production deployments, the .dockerignore file should mirror the .gitignore to avoid baking secrets into Docker images.
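The !.env.example line in the template above works because, in a .gitignore, the last matching pattern decides the outcome. The Python sketch below illustrates that rule for plain file names only. It is a deliberate simplification built on fnmatch: real .gitignore matching also handles directories, anchoring, and ** patterns, and is_ignored is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_ignored(filename: str, patterns: Iterable[str]) -> bool:
    """Simplified check for basename-only patterns: the last match wins."""
    ignored = False
    for pattern in patterns:
        pattern = pattern.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if fnmatchcase(filename, pattern):
            ignored = not negated  # a later match overrides an earlier one
    return ignored


SECRET_PATTERNS = [".env", ".env.*", "!.env.example", "*.pem", "*.key", "*.p12"]

for name in [".env", ".env.production", ".env.example", "server.key", "README.md"]:
    print(name, is_ignored(name, SECRET_PATTERNS))
```

Order matters: moving !.env.example above .env.* would cause the example file to be ignored again, because the broader pattern would then be the last one to match.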

Secrets Scanning

Even with a well-configured .gitignore, developers sometimes commit secrets accidentally. GitGuardian’s 2024 State of Secrets Sprawl report found that over 12 million new secrets were detected in public GitHub commits in a single year.

Multiple layers of protection are advisable.

Pre-commit hook: tools such as detect-secrets or trufflehog scan changes before they are committed.

GitHub’s built-in secret scanning: available for public repositories at no cost and for private repositories through GitHub Advanced Security. It scans for known secret patterns from over 200 service providers.

CI pipeline scanning: a secrets scan added to the CI workflow serves as a final safety net.

# Install detect-secrets
pip install detect-secrets

# Create a baseline of existing secrets (to handle legacy code)
detect-secrets scan > .secrets.baseline

# Scan for new secrets
detect-secrets scan --baseline .secrets.baseline

# Add to pre-commit config
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
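To give a feel for what such scanners look for, here is a toy Python sketch that combines one known-format pattern with an entropy heuristic. It is an illustration of the general idea only and is not how detect-secrets or trufflehog are implemented. The regular expressions, the 3.5-bit threshold, and the find_secrets helper are assumptions chosen for this example, and the AWS key shown is the placeholder value from AWS documentation.

```python
import math
import re
from collections import Counter
from typing import List

# Known-format detector: AWS access key IDs start with AKIA plus 16 characters.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Keyword detector: a quoted value assigned to a sensitive-looking name.
ASSIGNMENT = re.compile(
    r"""(?i)\b(?:password|secret|token|api_key)\s*=\s*["']([^"']+)["']"""
)


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character; random-looking strings score higher."""
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_secrets(text: str, min_entropy: float = 3.5) -> List[str]:
    """Return one finding per suspicious line (assumed threshold: 3.5 bits)."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if AWS_KEY_ID.search(line):
            findings.append(f"line {number}: AWS access key ID pattern")
        match = ASSIGNMENT.search(line)
        if match and shannon_entropy(match.group(1)) >= min_entropy:
            findings.append(f"line {number}: high-entropy value for a sensitive name")
    return findings


sample = "\n".join([
    'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"',
    'password = "xxxxxxxx"',
    'api_key = "9f8Zk2LpQ7xV1mB4tR6yWc3N"',
])
findings = find_secrets(sample)
for finding in findings:
    print(finding)
```

The placeholder on the second line is not flagged because its entropy is zero, which also shows the limits of the heuristic: a weak but real password would slip through, so production tools layer many more detectors on top.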

Caution: If a secret is accidentally committed, simply removing it in a new commit is not enough. The secret remains in Git history permanently. Three steps are required: (1) immediately rotate the compromised credential, (2) use git filter-repo or BFG Repo-Cleaner to purge the secret from history, and (3) force-push the cleaned history. GitHub also provides a guide for removing sensitive data.

Signed Commits: Verifying Identity

Git commits include an author field, but nothing prevents someone from setting it to any name or email address. Signed commits use GPG or SSH keys to cryptographically verify that a commit genuinely originated from the claimed author.

# Option 1: Sign with SSH key (simpler; supported since Git 2.34)
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true

# Option 2: Sign with GPG key (traditional approach)
# First, generate a GPG key:
gpg --full-generate-key

# Get your key ID:
gpg --list-secret-keys --keyid-format=long

# Configure Git to use it:
git config --global user.signingkey YOUR_KEY_ID
git config --global commit.gpgsign true

# Verify a signed commit
git log --show-signature

# On GitHub, signed commits show a "Verified" badge

Many organizations now require signed commits as a matter of security policy. GitHub, GitLab, and Bitbucket all display verification badges on signed commits, giving the team confidence that commits have not been tampered with.

Monorepo vs Polyrepo

As an organization grows, it faces a fundamental architectural decision: whether to keep all code in a single repository (monorepo) or to split it across multiple repositories (polyrepo).

The Monorepo Approach

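Google, Meta, Microsoft, and Twitter/X all use monorepos: single repositories containing multiple projects, services, and libraries. Google’s monorepo is legendary: figures published in 2016 put it at roughly 2 billion lines of code and 86 terabytes, with about 25,000 developers working in it. At that scale Google relies on its in-house version control system, Piper, rather than Git.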

Advantages:

Atomic cross-project changes: Refactor a shared library and update all consumers in a single commit
Code sharing: Easy to extract common code into shared packages
Unified tooling: One CI/CD pipeline, one set of linting rules, one testing framework
Simplified dependency management: No version matrix across repos

Challenges:

Scale: Git slows down considerably with very large repositories (hundreds of GB), requiring tools such as VFS for Git, sparse checkouts, or git clone --filter
CI complexity: Requires intelligent CI that tests only what changed, not the entire repository
Access control: Harder to restrict access to specific directories (GitHub’s CODEOWNERS governs who must review changes to a path, not who can read it; GitLab has more granular permissions)

Popular monorepo tooling includes Nx (JavaScript/TypeScript), Bazel (multi-language, used by Google), Turborepo (JavaScript), and Pants (Python). These tools understand the dependency graph of a monorepo and can determine which projects are affected by a change, running only the necessary tests and builds.
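The core of that "affected projects" calculation is a walk over reversed dependency edges. The Python sketch below shows the idea under simple assumptions: the graph is given explicitly as a dictionary, and affected_projects and the project names are hypothetical. Tools such as Nx or Bazel derive the graph from build metadata and do considerably more.

```python
from collections import deque
from typing import Dict, List, Set


def affected_projects(dependencies: Dict[str, List[str]], changed: Set[str]) -> Set[str]:
    """Return the changed projects plus everything that depends on them."""
    # Invert the graph: for each project, who depends on it?
    dependents: Dict[str, Set[str]] = {}
    for project, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(project)

    affected = set(changed)
    queue = deque(changed)
    while queue:  # breadth-first walk over reverse dependencies
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical monorepo: each project lists the projects it depends on.
deps = {
    "shared-utils": [],
    "auth-lib": ["shared-utils"],
    "billing-service": ["auth-lib"],
    "web-app": ["auth-lib", "shared-utils"],
    "docs-site": [],
}
print(sorted(affected_projects(deps, {"auth-lib"})))
```

A change to auth-lib triggers builds for auth-lib, billing-service, and web-app, while docs-site and shared-utils are left alone; a change to shared-utils would pull in everything except docs-site.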

The Polyrepo Approach

Most organizations use polyrepos—separate repositories for each service, library, or application. This is the default pattern on GitHub and maps naturally to microservices architectures where each service lives in its own Docker container.

Advantages:

Clear ownership: Each repo has a defined team, README, and set of maintainers
Independent deployment: Each service can be built, tested, and deployed independently
Access control: Simple and granular—each repo has its own permissions
Git performance: Rarely an issue; repos stay small

Challenges:

Cross-repo changes: Updating a shared library requires PRs to every consuming repo
Version conflicts: Service A depends on library v1.2, Service B depends on v1.5, and the two are incompatible
Inconsistent tooling: Each repo might use different linters, test frameworks, or CI configurations
Discovery: Hard for new developers to find relevant code across dozens of repos

Factor	Monorepo	Polyrepo
Cross-project refactoring	Easy, single commit	Hard, multiple PRs
Git performance	Degrades at scale	Always fast
Access control	Complex (CODEOWNERS)	Simple per-repo
CI/CD	Needs smart build tools	Standard per-repo
Code sharing	Direct imports	Via package registries
Team independence	Less, shared rules	More, full autonomy
Best for	Tightly coupled services	Independent microservices

Key Takeaway: There is no universally correct answer. Many successful organizations use a hybrid approach: a monorepo for closely related services and shared libraries, with separate repositories for truly independent applications. The choice should be based on team size, the degree of coupling between projects, and tooling maturity.

Frequently Asked Questions

Should I use merge or rebase to integrate changes from the main branch?

It depends on your team’s preference and the context. Merge preserves the exact history of how development happened: you can see when branches diverged and reconnected. Rebase creates a linear history that’s easier to read and bisect. A common best practice is to rebase your feature branch onto main before merging (to stay up to date and resolve conflicts early), then use a merge commit to integrate the feature into main. This gives you the best of both worlds: a clean branch history with an explicit record of when the feature was integrated. Teams that prefer a strictly linear main branch instead use GitHub’s “Squash and merge” or “Rebase and merge” options, often enforced with the “Require linear history” branch protection rule, which blocks merge commits.

How do I undo the last commit without losing changes?

Use git reset --soft HEAD~1. This moves HEAD back one commit but keeps all the changes from that commit staged and ready to be recommitted. If you also want to unstage the changes (keep them as working directory modifications), use git reset --mixed HEAD~1 (or simply git reset HEAD~1 since mixed is the default). If you’ve already pushed the commit, use git revert HEAD instead—this creates a new commit that undoes the changes, preserving shared history.

What’s the difference between git fetch and git pull?

git fetch downloads new data from the remote repository (new commits, branches, tags) but doesn’t change your working directory or current branch. It updates your remote-tracking branches (like origin/main) so you can see what’s changed. git pull is essentially git fetch followed by git merge (or git rebase if configured). Using git fetch first gives you the opportunity to inspect changes before integrating them, which is safer. Many experienced developers prefer git fetch + git merge (or rebase) over git pull for this reason.

How should I handle large binary files in Git?

Git is designed for text files. Large binary files (images, videos, compiled assets, ML models) bloat the repository because Git stores every version. Use Git LFS (Large File Storage) to handle binaries. Git LFS replaces large files with text pointers in the repository while storing the actual file content on a separate server. Set it up with git lfs install and git lfs track "*.psd". GitHub includes a free LFS storage and bandwidth quota that is counted per account rather than per repository, with additional capacity available for purchase; check GitHub’s billing documentation for the current limits.
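The pointer that takes the file's place is only a few lines of text. This Python sketch builds one by following the published Git LFS pointer format (a version line, a SHA-256 object ID, and the size in bytes); lfs_pointer is a hypothetical helper for illustration, and in practice git lfs generates these files for you.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS pointer file for the given file content."""
    digest = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{digest}\n"
        f"size {len(content)}\n"
    )


# Stand-in for a large binary asset.
asset = b"\x00" * 5_000_000
pointer = lfs_pointer(asset)
print(pointer)
print(f"{len(asset)} bytes of content -> {len(pointer)} bytes in the repository")
```

However large the asset grows, the repository only ever stores a pointer of roughly this size for each version.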

How many approvals should be required for a pull request?

For most teams, one approval is the sweet spot. It ensures that at least one other person has reviewed the code without creating a bottleneck. For critical paths (security-sensitive code, database migrations, infrastructure changes), consider requiring two approvals. Use GitHub’s CODEOWNERS file to automatically assign reviewers based on which files are changed. Avoid requiring more than two approvals; doing so creates delays without proportionally increasing quality. If you have concerns about a specific change, escalate through conversation rather than adding more required reviewers.

Concluding Remarks

Git mastery is not a matter of memorizing obscure commands. It rests on understanding the mental model—the DAG of snapshots, the pointers, the graph operations—and on building on that foundation with disciplined practices that improve team productivity, codebase maintainability, and deployment reliability.

The most consequential practices covered in this guide are summarized below.

Choose a branching strategy deliberately. GitHub Flow offers simplicity and speed. Git Flow offers structure and release management. Trunk-Based Development offers velocity at the cost of requiring greater discipline and mature CI/CD. The appropriate choice is the one that matches the team’s circumstances rather than the one that sounds most sophisticated.

Write atomic commits with meaningful messages. A commit history is a communication tool. Conventional Commits provides structure. git add -p helps maintain focus. Messages should explain why, not only what.

Keep pull requests small and well-described. Under 400 lines. One logical change per PR. Include context, testing instructions, and screenshots. Reviewers will reciprocate with faster and more thorough reviews.

Automate quality enforcement. Use pre-commit hooks for fast local checks, GitHub Actions for comprehensive CI, and branch protection rules to prevent accidents. The most effective teams structure their tooling so that doing the wrong thing is harder than doing the right thing.

Learn the advanced tools. Interactive rebase for cleaning up history. Bisect for finding bugs efficiently. Reflog for recovering from mistakes. These are not esoteric tricks but routine instruments for professional developers.

Take security seriously. Use a comprehensive .gitignore. Scan for secrets in pre-commit hooks and CI. Sign commits. Remember that Git history is permanent: a committed secret is a compromised secret, even if it is removed in the next commit.

The investment in learning these practices yields compound returns. Each clean commit, well-structured PR, and automated check accumulates into a codebase that is a pleasure to work with rather than a hazard to navigate. In an industry where the ability to ship reliable software quickly is a core competitive advantage, this matters more than any framework or language choice.

A practical next step is to make one change this week, whether it is adopting Conventional Commits, adding a pre-commit hook, or configuring branch protection rules on the main repository. Small, consistent improvements compound over time, in Git practices as in any other long-term discipline.

References

Git Official Documentation: comprehensive reference for all Git commands and concepts
GitHub Docs: official documentation for GitHub features including Actions, branch protection, and code review
Conventional Commits Specification v1.0.0: the standard for structured commit messages
Trunk Based Development: comprehensive resource on the trunk-based branching strategy
Google Engineering Practices, Code Review: Google’s code review standards and best practices
