Most people think of Git as a version control tool. CTF players think of it as a crime scene.
In this challenge — “Git 2” from PicoCTF — a flag is buried inside a disk image. There are no commits. No branches. No obvious trail. Just raw objects sitting quietly in .git/objects, waiting for someone who knows where to look.
This article walks through the full forensic methodology: from cracking open a disk image, to mounting partitions, to understanding Git’s internal object database well enough to recover data that was never meant to be found.
gunzip disk.img.gz
You start with a .gz compressed file. gunzip simply decompresses it into disk.img, a raw binary snapshot of an entire hard drive.
file disk.img
Output:
disk.img: DOS/MBR boot sector; partition 1 : ID=0x83, active, start-CHS (0x2,0,33),
end-CHS (0x263,8,56), startsector 2048, 614400 sectors;
partition 2 : ID=0x82, start-CHS (0x263,8,57) ...
partition 3 : ID=0x83, start-CHS (0x3ff,15,63) ...
file reads the magic bytes at the start of the file and identifies it. What you see here is a DOS/MBR boot sector: a real disk layout with a Master Boot Record at the very beginning, which holds the partition table. There are three partitions on this disk.
fdisk -l disk.img
Output:
Disk disk.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes

Device     Boot   Start     End Sectors  Size Id Type
disk.img1  *       2048  616447  614400  300M 83 Linux
disk.img2        616448 1140735  524288  256M 82 Linux swap / Solaris
disk.img3       1140736 2097151  956416  467M 83 Linux
fdisk -l reads the partition table and gives you a human-readable layout. Think of it like a table of contents for the disk.
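To make the "table of contents" idea concrete, here is a minimal Python sketch of the data fdisk -l reads. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, and a 0x55AA signature at byte 510). The names parse_mbr and build_demo_mbr are illustrative, and the demo sector is synthetic: it is rebuilt from the start and size values in the listing above, not read from the real disk.img.

```python
import struct

def parse_mbr(sector: bytes):
    """Return (bootable, type_id, start_sector, num_sectors) for each used entry."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i : 446 + 16 * (i + 1)]
        # status byte, 3 CHS bytes (skipped), type byte, 3 CHS bytes (skipped),
        # then little-endian 32-bit start sector and sector count
        status, type_id, start, count = struct.unpack("<B3xB3xII", raw)
        if type_id != 0:  # type 0 marks an unused slot
            entries.append((status == 0x80, type_id, start, count))
    return entries

def build_demo_mbr() -> bytes:
    """Synthetic sector holding the three partitions listed above (CHS fields zeroed)."""
    table = b"".join(
        struct.pack("<B3xB3xII", status, type_id, start, count)
        for status, type_id, start, count in [
            (0x80, 0x83, 2048, 614400),
            (0x00, 0x82, 616448, 524288),
            (0x00, 0x83, 1140736, 956416),
        ]
    )
    # 446 bytes of boot code, 3 used entries, 1 empty entry, 2-byte signature
    return bytes(446) + table + bytes(16) + b"\x55\xaa"

for bootable, type_id, start, count in parse_mbr(build_demo_mbr()):
    print(f"boot={bootable} id=0x{type_id:02x} start={start} sectors={count}")
```

On a real image you would pass the first 512 bytes of disk.img to parse_mbr instead of the synthetic sector.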
Why partition 3? Partition 2 is swap and partition 1 is the small bootable partition, so partition 3, the largest Linux partition, is the most likely place for user files, home directories, and application code.
To mount partition 3, we need to tell Linux exactly where in the file it starts. Partitions are measured in sectors, and each sector is 512 bytes.
Start sector of partition 3 = 1140736
Byte offset = 1140736 × 512 = 584,056,832 bytes
sudo mkdir -p /mnt/git2
sudo mount -o loop,offset=$((1140736 * 512)) disk.img /mnt/git2
Breaking down the mount command:
-o loop: treat the file as a loop device (a virtual block device)
offset=$((1140736 * 512)): start reading from this byte offset
disk.img: the source file
/mnt/git2: where to mount it
After this, /mnt/git2 behaves like a real mounted filesystem. You can ls, cat, and find files just like a normal drive.
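The offset arithmetic is easy to get wrong by hand, so here is a small Python sketch that does it for you. The helper names partition_offset and mount_command are made up for illustration, and the 512-byte sector size is taken from the fdisk output.

```python
SECTOR_SIZE = 512  # bytes per sector, from the "Units" line of fdisk -l

def partition_offset(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset of a partition inside the raw image."""
    return start_sector * sector_size

def mount_command(image: str, start_sector: int, mountpoint: str) -> str:
    """Build the mount invocation as a string; nothing is executed here."""
    offset = partition_offset(start_sector)
    return f"sudo mount -o loop,offset={offset} {image} {mountpoint}"

print(partition_offset(1140736))  # 584056832
print(mount_command("disk.img", 1140736, "/mnt/git2"))
```

Passing the precomputed number to mount is equivalent to letting the shell evaluate $((1140736 * 512)).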
find /mnt/git2 -name ".git" -type d 2>/dev/null
This recursively searches the entire partition for directories named .git. The 2>/dev/null suppresses permission errors.
Result:
/mnt/git2/home/ctf-player/Code/killer-chat-app/.git
Found it. A Git repository lives inside the home directory of a user called ctf-player.
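If you prefer to script the search, the same idea can be sketched in Python with os.walk. The function name find_git_dirs is illustrative. By default os.walk silently skips directories it cannot read, which plays the role of 2>/dev/null here.

```python
import os

def find_git_dirs(root: str) -> list:
    """Return the path of every directory named .git below root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".git" in dirnames:
            found.append(os.path.join(dirpath, ".git"))
            # Do not descend into the repository's own internals.
            dirnames.remove(".git")
    return found

# Example call, assuming the partition is mounted at /mnt/git2:
#   print(find_git_dirs("/mnt/git2"))
```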
cd /mnt/git2/home/ctf-player/Code/killer-chat-app/
git log --oneline
Output:
fatal: your current branch 'master' does not have any commits yet
git status
Output:
On branch master

No commits yet

Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: client
new file: logs/1.txt
new file: logs/2.txt
new file: logs/4.txt
new file: server
Two things immediately stand out:
git log is useless here
logs/3.txt is missing: we have 1, 2, and 4 staged, but not 3
This is your first investigative clue. Something happened to logs/3.txt. One possibility is that it was staged and later removed from the index with git rm --cached. Regardless of which scenario applies, Git’s object database may still hold the answer.
.git/objects: Git's Internal Database
This is the heart of the investigation, and the most important concept in this entire article.
What is .git/objects?
Git uses a content-addressable storage system. Every piece of data Git stores (every file version, every directory snapshot, every commit) is kept as an “object” in .git/objects.
Each object is named by the SHA-1 hash of its content (prefixed with a short type-and-size header):
.git/objects/
  66/
    273877d2ff3f51a14473b7200aae5a798ff64f
Full hash = 66273877d2ff3f51a14473b7200aae5a798ff64f
The first 2 characters become the folder name. The remaining 38 characters become the filename. This is purely an optimization — it prevents a single directory from containing hundreds of thousands of files.
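The folder/filename split can be sketched in a few lines of Python. The helper name loose_object_path is hypothetical, and the sketch only covers loose objects (objects that Git has packed live in .git/objects/pack instead).

```python
import os

def loose_object_path(git_dir: str, sha1_hex: str) -> str:
    """Map a 40-character object hash to its loose-object path."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a full 40-character SHA-1 hash")
    # First 2 hex characters: folder. Remaining 38: filename.
    return os.path.join(git_dir, "objects", sha1_hex[:2], sha1_hex[2:])

print(loose_object_path(".git", "66273877d2ff3f51a14473b7200aae5a798ff64f"))
```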
blob: Raw File Content
A blob is nothing more than the raw bytes of a file. No filename. No metadata. Just content.
git cat-file -p 66273877d2ff3f51a14473b7200aae5a798ff64f
# → (raw contents of whatever file this is)
Key insight: if two files have identical content, they share one blob. Git deduplicates automatically.
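You can reproduce a blob's name yourself. The sketch below assumes the standard loose-object framing, where Git hashes the header "blob <size>" plus a NUL byte followed by the file content. The function name git_blob_hash is illustrative.

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """SHA-1 that Git assigns to a blob with this exact content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_hash(b""))         # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_hash(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a

# Identical content always yields the identical name, whatever the filename was.
print(git_blob_hash(b"same bytes") == git_blob_hash(b"same bytes"))  # True
```

Because the filename never enters the hash, two files with the same bytes collapse into a single object, which is the deduplication described above.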
tree: Directory Snapshot
A tree maps filenames to blob hashes (and to other tree hashes for subdirectories). Each entry name is a single path component, so the tree for logs/ lists 1.txt rather than logs/1.txt.
git cat-file -p a0c13fe974d95661f24e32bc0d79f54f05ea13c5
100644 blob 66273877... 1.txt
100644 blob 7178644... 2.txt
100644 blob f150f0b... 3.txt ← could be here even if not staged
100644 blob aa1cc01... 4.txt
040000 tree 6b1ebe1... src
This is critical: a tree object can still reference 3.txt even though the file is missing from the current index. And once the blob for 3.txt has been created (by git add), it stays in .git/objects until garbage collection prunes it.
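The pretty-printed listing hides how a tree is actually stored. As a rough sketch, assuming the standard raw format (mode, a space, the entry name, a NUL byte, then the 20-byte binary hash), a tree can be decoded like this. The names parse_tree and build_entry are illustrative, and the hashes in the demo are placeholders, not objects from the challenge.

```python
def parse_tree(raw: bytes) -> list:
    """Decode a raw tree body into (mode, name, sha1_hex) tuples."""
    entries = []
    pos = 0
    while pos < len(raw):
        space = raw.index(b" ", pos)
        nul = raw.index(b"\x00", space)
        mode = raw[pos:space].decode()
        name = raw[space + 1 : nul].decode()
        sha1_hex = raw[nul + 1 : nul + 21].hex()  # 20 raw bytes, not hex text
        entries.append((mode, name, sha1_hex))
        pos = nul + 21
    return entries

def build_entry(mode: str, name: str, sha1_hex: str) -> bytes:
    """Encode one entry; used here only to build demo input."""
    return f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(sha1_hex)

demo = (
    build_entry("100644", "3.txt", "11" * 20)  # placeholder hash
    + build_entry("40000", "src", "22" * 20)   # directories use mode 40000 in raw form
)
for entry in parse_tree(demo):
    print(entry)
```

Note that the mode and the name are stored as plain text while the hash is binary, so a filename stays readable even in a raw dump of the object.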
commit: Snapshot in Time
A commit points to a tree and stores metadata.
git cat-file -p 01533f718556a0e59f1467dae4fa462eed82c2a1
tree 22f7d0c9bd045563ae33bfacfbe46fe406a5b318
parent 2c0a9b2b15dce92f800393d5030c7454efc278ae
author ctf-player <[email protected]> 1693000000 +0000
committer ctf-player <[email protected]> 1693000000 +0000

initial commit
Even though git log shows no commits on master, commit objects can exist in the database if they were created on another branch, or if the branch pointer was reset.
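When you find a stray commit object, the first thing you want from it is its tree hash. Here is a minimal sketch of pulling that out of the text that git cat-file -p prints. The function name parse_commit is illustrative, and the email address in the sample is a placeholder because the real one is not visible in the article.

```python
def parse_commit(text: str):
    """Return (tree_hash, parent_hashes, message) from a commit's text."""
    headers, _, message = text.partition("\n\n")  # blank line ends the headers
    tree = None
    parents = []
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value
        elif key == "parent":
            parents.append(value)  # merge commits have several parent lines
    return tree, parents, message.strip()

SAMPLE = """tree 22f7d0c9bd045563ae33bfacfbe46fe406a5b318
parent 2c0a9b2b15dce92f800393d5030c7454efc278ae
author ctf-player <placeholder@example.invalid> 1693000000 +0000
committer ctf-player <placeholder@example.invalid> 1693000000 +0000

initial commit
"""

tree, parents, message = parse_commit(SAMPLE)
print(tree)
print(parents)
print(message)
```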
tag: Named Reference
Annotated tags usually point to commits and add a name and message. Less relevant for forensics in this case.
Git stores everything as objects:
commit ──→ tree ──→ blob (logs/1.txt)
                ├──→ blob (logs/2.txt)
                ├──→ blob (logs/3.txt) ← orphaned, the flag is here
                ├──→ blob (logs/4.txt)
                └──→ blob (client)
Every commit is a complete snapshot, not a diff. Git’s diffs are computed on-the-fly by comparing blobs between commits.
Git normally starts from refs:
HEAD
branch refs
tags
and walks the graph.
Anything connected to those refs is reachable.
main → commit A → tree → blob(flag.txt)
Git can reach everything.
Suppose you delete a branch:
main → commit A
old_branch → commit B → tree → blob(flag)
After deleting old_branch, nothing references commit B.
Now:
commit B = unreachable
tree from B = unreachable
blob from tree = unreachable
But the objects still physically exist in .git/objects.
That’s why forensic recovery works.
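The branch-deletion example above can be modelled as a tiny graph walk. This is a toy sketch, not Git's implementation: the object names are labels rather than hashes, and the function name reachable_from is made up for illustration.

```python
def reachable_from(refs: dict, edges: dict) -> set:
    """Walk the object graph from every ref and collect what can be reached."""
    seen = set()
    stack = list(refs.values())
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(edges.get(obj, ()))
    return seen

# Each object maps to the objects it points at.
edges = {
    "commit A": ["tree A"],
    "tree A": ["blob(flag.txt)"],
    "commit B": ["tree B"],
    "tree B": ["blob(flag)"],
}
all_objects = set(edges) | {child for kids in edges.values() for child in kids}

before = {"main": "commit A", "old_branch": "commit B"}
after = {"main": "commit A"}  # old_branch deleted

print(sorted(all_objects - reachable_from(before, edges)))  # []
print(sorted(all_objects - reachable_from(after, edges)))
```

Deleting the ref changes only the starting points of the walk. The edges dictionary, which stands in for the object database, is untouched.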
git cat-file --batch-all-objects --batch-check
This command iterates over every object in .git/objects and prints a one-line summary:
01533f718556a0e59f1467dae4fa462eed82c2a1 commit 238
201c707b43219a63c1d3499b29c7d539af079861 tree 99
2151ef0ccc15aed1ab88e1afdc7484aaeff211c4 commit 244
66273877d2ff3f51a14473b7200aae5a798ff64f blob 140
7178644433e7cb6da3adf028f1c80d382a18e7b6 blob 188
...
Format: <hash> <type> <size in bytes>
--batch-check vs --batch
Flag            What it outputs                         Use case
--batch-check   Header only (hash + type + size)        Survey: understand what exists
--batch         Header + raw content of every object    Extract: dump everything for searching
Think of --batch-check as the table of contents and --batch as reading every page.
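Once you have the survey output, it helps to summarise it before dumping everything. The sketch below parses the three-column format and counts objects per type. The function name parse_batch_check is illustrative, and the sample text is the listing shown above, including its elided last line.

```python
from collections import Counter

def parse_batch_check(text: str) -> list:
    """Turn '<hash> <type> <size>' lines into (hash, type, size) tuples."""
    objects = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or elided lines such as "..."
        sha1_hex, obj_type, size = parts
        objects.append((sha1_hex, obj_type, int(size)))
    return objects

SAMPLE = """01533f718556a0e59f1467dae4fa462eed82c2a1 commit 238
201c707b43219a63c1d3499b29c7d539af079861 tree 99
2151ef0ccc15aed1ab88e1afdc7484aaeff211c4 commit 244
66273877d2ff3f51a14473b7200aae5a798ff64f blob 140
7178644433e7cb6da3adf028f1c80d382a18e7b6 blob 188
...
"""

objects = parse_batch_check(SAMPLE)
print(Counter(obj_type for _, obj_type, _ in objects))
print([sha for sha, obj_type, _ in objects if obj_type == "commit"])
```

Seeing commit objects in this summary while git log reports no commits is itself a strong hint that something is unreachable.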
git cat-file --batch-all-objects --batch | strings | grep -i "picoCTF\|3.txt"
Breaking this down:
Part                        What it does
--batch-all-objects         Iterate over every object
--batch                     Output raw content of each object
strings                     Extract printable ASCII strings from binary data
grep -i "picoCTF\|3.txt"    Search for the flag format OR the missing filename
Output:
.100644 3.txt
Jay: Ask Rusty at the door and use password picoCTF{g17_r35cu3_********}.
The flag was inside the blob for logs/3.txt. The file is missing from the current index and from the history of master, but its content was added with git add at some point, creating a blob object that persisted.
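If Git is not installed on your analysis machine, a similar search can be sketched directly against the files. Loose objects are zlib-compressed on disk, which is why grepping .git/objects as-is finds nothing. The function name scan_loose_objects is illustrative, and this sketch deliberately ignores packed objects, so it sees less than git cat-file does.

```python
import os
import re
import zlib

def scan_loose_objects(git_dir: str, pattern: bytes) -> list:
    """Return (sha1_hex, type) for each loose object whose body matches pattern."""
    regex = re.compile(pattern)
    objects_dir = os.path.join(git_dir, "objects")
    hits = []
    for sub in sorted(os.listdir(objects_dir)):
        subdir = os.path.join(objects_dir, sub)
        if len(sub) != 2 or not os.path.isdir(subdir):
            continue  # skips pack/ and info/
        for name in sorted(os.listdir(subdir)):
            with open(os.path.join(subdir, name), "rb") as fh:
                try:
                    data = zlib.decompress(fh.read())
                except zlib.error:
                    continue  # not a valid loose object
            # Decompressed layout: b"<type> <size>\x00<body>"
            header, _, body = data.partition(b"\x00")
            if header and regex.search(body):
                obj_type = header.decode(errors="replace").split()[0]
                hits.append((sub + name, obj_type))
    return hits

# Example call, assuming you are inside the recovered repository:
#   print(scan_loose_objects(".git", rb"picoCTF\{"))
```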
git fsck --unreachable
fsck stands for File System Check. It walks the entire object graph starting from known references (HEAD, branches, tags) and identifies objects that cannot be reached.
Example output:
unreachable blob 66273877d2ff3f51a14473b7200aae5a798ff64f
unreachable commit 2151ef0ccc15aed1ab88e1afdc7484aaeff211c4
dangling commit 01533f718556a0e59f1467dae4fa462eed82c2a1
dangling vs unreachable
dangling: unreachable, and no other object points to it either. Truly orphaned.
unreachable: not reachable from HEAD, branches, or tags. Another unreachable object may still point to it.
dangling commit → unreachable tree → unreachable blob ← (flag lives here)
Then follow the chain:
# Read the dangling commit → get the tree hash
git cat-file -p 01533f718556a0e59f1467dae4fa462eed82c2a1
# Read the tree → see all files including 3.txt
git cat-file -p <tree-hash>
# Read the blob → get the flag
git cat-file -p <blob-hash>
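The dangling versus unreachable distinction comes down to one set operation. In this toy sketch the labels stand in for hashes and the function name dangling is illustrative. It relies on the fact that a reachable object never points at an unreachable one, so only other unreachable objects need to be checked as possible parents.

```python
def dangling(unreachable: set, edges: dict) -> set:
    """Unreachable objects that no other object points to."""
    pointed_to = {
        child
        for parent in unreachable
        for child in edges.get(parent, ())
    }
    return unreachable - pointed_to

edges = {
    "commit": ["tree"],
    "tree": ["blob 3.txt"],
}
unreachable = {"commit", "tree", "blob 3.txt"}

print(dangling(unreachable, edges))  # {'commit'}
```

The dangling objects are the entry points of a lost chain, which is why the walk above starts from the dangling commit.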
This is the principle that makes Git forensics possible:
Git objects are immutable and persist until git gc prunes them.
Common scenarios that create orphaned objects include deleting a branch, resetting a branch pointer, and staging a file with git add and then removing it with git rm --cached.
What eventually cleans this up is git gc (garbage collection), which prunes objects that are not reachable from any reference, by default only once they are older than a two-week grace period. Until then, the data is fully recoverable.
This is by design. Git prioritizes data safety over storage efficiency. It would rather keep a “deleted” file than risk losing something the user might need.
1. Disk images are layered. MBR → partition table → filesystem → files. Each layer requires a different tool: file, fdisk, mount, then standard shell commands.
2. Partition byte offsets matter. offset = start_sector × sector_size. Getting this wrong means mounting nothing, or the wrong partition.
3. Git never truly deletes. git log showing "no commits" is not the whole story. The object database is the whole story.
4. .git/objects is a content-addressed database. Every blob, tree, and commit has a SHA-1 hash name. Objects are immutable. The database only grows until git gc is run.
5. Two forensic strategies, different tradeoffs. --batch | strings | grep is fast and broad. git fsck --unreachable is slow and structured. Use both depending on what you need.
6. The missing file is the clue. logs/1.txt, logs/2.txt, logs/4.txt were staged. logs/3.txt was not. Gaps in sequences are almost always intentional. Always look for what's missing.
7. Another useful command. Find unreachable/dangling objects, ignoring reflog entries:
`git fsck --full --no-reflogs`
The Git 2 challenge demonstrates how powerful forensic analysis becomes when low-level system knowledge is combined with an understanding of application internals. What initially appeared to be an empty repository with no commit history ultimately revealed recoverable evidence hidden inside Git’s object database.