How Git Works - Internal Structure of Version Control

15 min read | 2025.12.08

Git Fundamentals

Git is a distributed version control system. Each developer has a complete copy of the repository and can work offline.

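Git’s key characteristic: it stores snapshots. Each commit records the entire state of the project rather than a list of differences, and files that have not changed are reused by reference instead of being stored again.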

Types of Git Objects

Git manages data with four types of objects:

1. Blob (Binary Large Object)

An object that stores file contents. It does not include the filename, only pure data.

# View blob object
git cat-file -p <blob-hash>

2. Tree

An object representing directory structure. Contains references to blobs or other trees along with filenames.
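
As a rough illustration of what a tree contains, the sketch below serializes directory entries and hashes them the way Git's loose-object format does. The function name git_tree_hash and the sample entry are made up for this example, and the name sort is simplified (real Git sorts directory names as if they ended in "/").

```python
import hashlib

def git_tree_hash(entries):
    """Hash one directory level.

    entries: list of (mode, name, object_id_hex) tuples, e.g.
    ("100644", "README.md", "<blob id>") for a regular file or
    ("40000", "src", "<tree id>") for a subdirectory.
    """
    body = b""
    # Simplified ordering: plain byte-wise sort by entry name.
    for mode, name, object_id in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry is "<mode> <name>\0" followed by the 20 raw hash bytes.
        body += f"{mode} {name}\0".encode() + bytes.fromhex(object_id)
    header = f"tree {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# ID of a zero-byte blob, used here only as a placeholder file.
EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
print(git_tree_hash([("100644", "README.md", EMPTY_BLOB)]))
```

Note that the filename lives in the tree entry, not in the blob, which is why renaming a file creates a new tree but reuses the same blob.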

3. Commit

An object that records a snapshot:

  • Reference to the top-level tree
  • Reference to parent commit(s)
  • Author and committer information
  • Commit message
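
The fields listed above map directly onto the text of a commit object. The following is a minimal sketch of that layout, assuming the plain (unsigned) commit format; the helper names and the identity "Alice Example" are hypothetical.

```python
import hashlib

def build_commit_body(tree, parents, author, committer, message):
    """Assemble the text of a commit object from its four kinds of fields."""
    lines = [f"tree {tree}"]                      # top-level tree
    lines += [f"parent {p}" for p in parents]     # zero, one, or several parents
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the headers from the commit message.
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")

def git_object_hash(kind, body):
    """SHA-1 over "<kind> <size>\\0" plus the body, as for every Git object."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# Hypothetical identity: "Name <email> <unix timestamp> <timezone>".
ident = "Alice Example <alice@example.com> 1700000000 +0000"
empty_tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of an empty tree
body = build_commit_body(empty_tree, [], ident, ident, "Initial commit\n")
print(body.decode())
print(git_object_hash("commit", body))
```

Because the parent hashes are part of the hashed text, changing any earlier commit changes the ID of every commit after it.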

4. Tag

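An object that attaches a name to a specific commit, together with the tagger, a date, and a message. This is what an annotated tag (git tag -a) creates; a lightweight tag is only a ref pointing at a commit and creates no object. Tags are typically used to mark release versions.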

SHA-1 Hash

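All objects are identified by SHA-1 hashes, computed from a short header (object type and size) plus the object's content. Since identical content always produces the same hash, identical files are stored only once, no matter how many commits or paths refer to them. Recent Git versions can also create repositories that use SHA-256 instead.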

# Check object hash
git hash-object README.md

# List commits with their abbreviated hashes
git log --oneline
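
To see why identical content always maps to the same ID, here is a small sketch that reproduces the blob hash outside of Git, assuming a SHA-1 repository and no line-ending or filter conversion. The function name git_blob_hash is made up for this example.

```python
import hashlib

def git_blob_hash(data: bytes) -> str:
    """Return the SHA-1 ID that Git assigns to a blob with this content."""
    # Git hashes a header ("blob <size in bytes>\0") followed by the raw bytes.
    header = f"blob {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()

# Same bytes, same ID, regardless of filename or location.
print(git_blob_hash(b"hello\n"))
print(git_blob_hash(b"hello\n") == git_blob_hash(b"hello\n"))
```

For a file with no conversion filters applied, the result should match what git hash-object prints for the same bytes.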

How Branches Work

A branch is simply a “pointer to a commit.” Creating a new branch just adds one pointer; no data is copied.

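# The actual branch file (.git/refs/heads/main)
cat .git/refs/heads/main
# → Shows the commit hash
# If the file is missing, the ref may be stored in .git/packed-refs;
# this command works in either case
git rev-parse main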

What is HEAD: A pointer to the currently checked-out branch, recorded in the .git/HEAD file. When a commit is checked out directly (a "detached HEAD"), the file holds the commit hash instead of a branch name.
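
The chain from HEAD to a branch to a commit can be followed by reading a few small text files. The sketch below is a simplified reader, not how Git itself is implemented: the function name resolve_head is hypothetical, and it ignores linked worktrees, chained symbolic refs, and the newer reftable backend.

```python
from pathlib import Path

def resolve_head(git_dir):
    """Return (ref_name_or_None, commit_hash_or_None) for a .git directory."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        # HEAD holds a raw commit hash rather than a branch reference.
        return None, head
    ref = head[len("ref: "):]              # e.g. "refs/heads/main"
    ref_file = git_dir / ref
    if ref_file.is_file():
        return ref, ref_file.read_text().strip()
    # The branch may have been moved into the packed-refs file.
    packed = git_dir / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            if not line.strip() or line.startswith(("#", "^")):
                continue                   # skip comments and peeled-tag lines
            commit, name = line.split(" ", 1)
            if name == ref:
                return ref, commit
    return ref, None                       # branch exists in name only (no commits yet)

if __name__ == "__main__" and Path(".git").is_dir():
    print(resolve_head(".git"))
```

Run from the top of a repository, this prints the current ref and the commit it points to; switching branches only rewrites the one line in .git/HEAD.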

Types of Merges

Fast-forward Merge

When the current branch has not diverged from the branch being merged, Git simply moves the branch pointer forward. No merge commit is created.

3-way Merge

When the branches have diverged, Git compares the common ancestor with the tips of both branches and creates a new merge commit with two parents.
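
Which kind of merge applies depends only on how the two branch tips relate in the commit graph. The sketch below models that decision on a toy history, assuming a hypothetical dictionary that maps each commit name to its parents; it does not perform the content merge itself.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` via parent links, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def merge_kind(parents, ours, theirs):
    """Classify what merging `theirs` into `ours` would require."""
    if theirs in ancestors(parents, ours):
        return "already up to date"    # nothing new to bring in
    if ours in ancestors(parents, theirs):
        return "fast-forward"          # just move our pointer to `theirs`
    return "3-way merge"               # histories diverged; a merge commit is needed

# Toy history: A <- B <- C (main) and B <- D (feature)
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
print(merge_kind(history, "B", "C"))
print(merge_kind(history, "C", "D"))
```

In the toy history, moving from B to C needs no new commit because B is an ancestor of C, while C and D share only the ancestor B and therefore need a merge commit.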

Summary

Git achieves efficient and reliable version control using four types of objects (blob, tree, commit, tag) and SHA-1 hashes. Understanding that branches are just pointers makes Git operations more intuitive.
