Inside Git: How it works and the role of the .git folder

Under the Hood Of Git

Let’s look under the hood of Git.

Look at this image:

[Image: the .git folder]

This is what the inside of the .git folder looks like.

Let us take an example:

Imagine our project has just a single file named “Myfile.txt”. When we run the command git init, Git initializes the project and turns it into a Git repository.

This creates the hidden .git folder, which stores the version history and keeps track of branches, tags and HEAD.

HEAD:

It is a file that contains a reference (a pointer) to the branch we are currently on.
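As a rough illustration of what that pointer looks like, here is a minimal sketch that interprets the text of the HEAD file. The function name parse_head and the branch name main are assumptions made for this example. When a branch is checked out, HEAD holds a line starting with "ref: "; when a commit is checked out directly (a "detached HEAD"), it holds a commit hash instead.

```python
def parse_head(head_text: str) -> dict:
    """Interpret the text stored in .git/HEAD (illustrative helper)."""
    text = head_text.strip()
    if text.startswith("ref: "):
        # Symbolic reference: HEAD names a branch file under refs/.
        return {"type": "branch", "ref": text[len("ref: "):]}
    # Otherwise HEAD holds a commit hash directly (detached HEAD).
    return {"type": "detached", "commit": text}


# "main" is an assumed branch name; yours may differ.
print(parse_head("ref: refs/heads/main\n"))
```

In a real repository the input would come from reading the .git/HEAD file itself.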

refs folder:

It has two main folders, named heads and tags.

heads: contains one file per branch, named after the branch. Each file holds a 40-character hexadecimal hash, which is the hash of the latest commit on that branch.
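To make the branch files concrete, the sketch below builds a throwaway folder layout and reads a branch tip from it. The helper read_branch_tip, the branch name main and the hash value are all made up for illustration. Note also that a real repository may keep some refs in a single packed-refs file instead, which this sketch does not handle.

```python
import tempfile
from pathlib import Path


def read_branch_tip(git_dir: Path, branch: str) -> str:
    """Return the commit hash stored in refs/heads/<branch>."""
    ref_file = git_dir / "refs" / "heads" / branch
    return ref_file.read_text().strip()


# Build a temporary layout so the sketch runs without a real repository.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    fake_hash = "4f5a6b" + "0" * 34  # made-up 40-character value
    (git_dir / "refs" / "heads" / "main").write_text(fake_hash + "\n")
    tip = read_branch_tip(git_dir, "main")
    print(len(tip))  # 40
```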

Now let’s move to the main thing inside the .git folder: the database (the objects folder).

The objects folder contains mainly three types of objects:

  • Blob: a Binary Large Object. It holds the actual content of a file in our project, but not its filename. If two files have exactly the same data, Git creates a single blob instead of two and points to it twice.

  • Tree: it is also a file, but it works as a “virtual folder”. It lists all the sub-trees (sub-folders) and blobs inside it, storing the filename of each blob and the folder name of each sub-tree along with their hashes.

  • Commit: it is a text file that points to a specific tree. It includes the author and committer, the commit message, and a pointer to the parent commit (the very first commit has no parent).
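The "single blob for identical data" behaviour falls out of how object ids are computed: the id depends only on the content, never on the filename. A minimal sketch, assuming Git's blob header format of "blob", a space, the size in bytes and a null byte; the filenames and contents here are invented for the example.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Compute the id a blob with this content would get."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical files: two share the same content, one differs.
files = {
    "a.txt": b"same data\n",
    "copy_of_a.txt": b"same data\n",
    "b.txt": b"other data\n",
}
ids = {name: blob_id(data) for name, data in files.items()}
unique_blobs = set(ids.values())
print(len(files), "files ->", len(unique_blobs), "blobs")  # 3 files -> 2 blobs
```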

How does Git track changes?

Let us continue with our example of a project containing a single file named “Myfile.txt”. We have initialized the project with git init, so the project folder now contains the .git folder alongside the “Myfile.txt” file.

Step 1: Staging the file

  • When we stage this file by running the git add command, Git prepends a header to the file’s content. The header contains the type of object (blob, tree or commit) and the size of the content in bytes, followed by a null character ‘\0’. Header format: type size\0.

  • For example, suppose our file “Myfile.txt” contains “Hello World” (11 bytes, with no trailing newline).

  • Then the header is blob 11\0, and it is placed before the content.

  • Git then runs the SHA-1 algorithm on (header + content) and generates a 40-character hexadecimal hash (for example, 82fe31....).

  • Then Git compresses (header + content) using zlib.

  • After that, Git splits the 40-character hash into two parts (82 and fe31....).

  • Finally, it creates a folder (a “bucket”) inside the objects folder named after the first two characters of the hash, 82, and saves the compressed data in a file named after the remaining 38 characters, fe31.... For example: .git/objects/82/fe31…
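The steps above can be sketched in a few lines of Python using only the standard library. This is an illustration of the described procedure, not Git's actual source code, and make_blob is a name invented for this example. The hash it prints depends on the exact bytes of the content, so it need not match the illustrative 82fe31… used above.

```python
import hashlib
import zlib


def make_blob(content: bytes):
    """Return (object id, storage path, compressed bytes) for a blob."""
    # 1. Prepend the header: type, space, size in bytes, null byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    store = header + content
    # 2. Hash the uncompressed header + content with SHA-1.
    object_id = hashlib.sha1(store).hexdigest()
    # 3. Compress the same bytes with zlib.
    compressed = zlib.compress(store)
    # 4. First 2 hex characters name the folder, the other 38 the file.
    path = f".git/objects/{object_id[:2]}/{object_id[2:]}"
    return object_id, path, compressed


object_id, path, compressed = make_blob(b"Hello World")
print(path)
```

If the file was saved by an editor that adds a trailing newline, the content is 12 bytes rather than 11, and both the header and the hash change accordingly.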

Step 2: Creating the tree (The Map)

  • When we run git commit, Git creates a tree object that lists all the blobs and sub-trees. In this tree object, Git maps each actual file or folder name to the hash of the corresponding blob or tree.

  • Git adds a header to the content of the tree object too, which looks like tree [size]\0.

  • Git runs the SHA-1 algorithm and generates a 40-character hexadecimal hash for this tree object. Let’s say the tree’s hash is d5e6f7.... .

  • Then Git compresses the data using zlib and stores it inside the objects folder, in a folder named d5 and a file named e6f7…. For example: .git/objects/d5/e6f7…

  • Internally, the tree looks like: tree 38\0100644 Myfile.txt\0[20-byte-binary-hash-of-blob]. The size is 38 because the entry is 18 bytes (the mode 100644, a space, the 10-character filename and a null character) plus the 20-byte binary hash.
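To check the layout of a tree entry, here is a minimal sketch that builds the bytes for a tree with one file and hashes them. The helper names tree_entry and make_tree are invented for this example. With several entries, real Git also sorts them by name, which this sketch skips because there is only one.

```python
import hashlib


def tree_entry(mode: str, name: str, object_id_hex: str) -> bytes:
    """One entry: mode, space, name, null byte, 20-byte binary hash."""
    return (mode.encode() + b" " + name.encode() + b"\0"
            + bytes.fromhex(object_id_hex))


def make_tree(entries):
    """Return (tree id, header, body) for a list of (mode, name, id)."""
    body = b"".join(tree_entry(*entry) for entry in entries)
    header = b"tree " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest(), header, body


# Id of the blob holding "Hello World" (11 bytes, no trailing newline).
blob_hex = hashlib.sha1(b"blob 11\0Hello World").hexdigest()
tree_id, header, body = make_tree([("100644", "Myfile.txt", blob_hex)])
print(header)  # b'tree 38\x00'
```

The body is 38 bytes for this single entry: 18 bytes for the mode, space, filename and null byte, plus 20 bytes for the binary hash.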

Step 3: Committing (The History)

  • During the same git commit command, right after creating the tree object, Git creates a commit object. It records the commit’s metadata along with the tree’s hash.

  • It contains the commit info and points to the tree by the tree’s hash, like tree [hash-of-tree]\nauthor Me...\n\nInitial commit.

  • Git adds the header (commit [size]\0), then runs the SHA-1 algorithm on this uncompressed data and generates a 40-character hexadecimal hash, for example 4f5a6b... .

  • Git then compresses the data using zlib and stores it inside the objects folder, in a folder named 4f and a file named 5a6b…. For example: .git/objects/4f/5a6b…

  • Internally, the commit looks like this, where [size] is the number of bytes after the null character and the tree hash is written out as 40 hexadecimal characters:

commit [size]\0

tree d5e6f7…\n

author Student <student@university.edu> 1706432400 +0530\n

committer Student <student@university.edu> 1706432400 +0530\n\nInitial commit
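The commit layout can be reproduced with a short sketch. The function make_commit and the zero-padded tree hash are placeholders invented for this example, so the resulting commit id is not meaningful. A first commit has no parent line; later commits would add one after the tree line.

```python
import hashlib


def make_commit(tree_id, identity, timestamp, tz, message):
    """Return (commit id, uncompressed header + body) for a root commit."""
    lines = [
        f"tree {tree_id}",
        f"author {identity} {timestamp} {tz}",
        f"committer {identity} {timestamp} {tz}",
        "",  # blank line separates the metadata from the message
        message,
    ]
    body = ("\n".join(lines) + "\n").encode()
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    store = header + body
    return hashlib.sha1(store).hexdigest(), store


placeholder_tree = "d5e6f7" + "0" * 34  # made-up 40-character tree hash
commit_id, store = make_commit(
    placeholder_tree,
    "Student <student@university.edu>",
    1706432400,
    "+0530",
    "Initial commit",
)
print(store.split(b"\0", 1)[0])  # the header, e.g. b'commit <size>'
```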

Now our objects/ folder contains three folders, named 82, d5 and 4f, and each of them contains its respective file: fe31…, e6f7… and 5a6b…. Here the file named fe31… is the blob, the file named e6f7… is the tree object and the file named 5a6b… is the commit object.
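Going the other way, any of these files can be read back by decompressing it and splitting at the first null byte. The sketch below writes one object into a temporary objects folder and reads it back. The helpers write_object and read_object are invented for this example and only handle loose objects stored as described above.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_object(objects_dir: Path, obj_type: str, body: bytes) -> str:
    """Store header + body, zlib-compressed, under <2 chars>/<38 chars>."""
    store = obj_type.encode() + b" " + str(len(body)).encode() + b"\0" + body
    object_id = hashlib.sha1(store).hexdigest()
    bucket = objects_dir / object_id[:2]
    bucket.mkdir(parents=True, exist_ok=True)
    (bucket / object_id[2:]).write_bytes(zlib.compress(store))
    return object_id


def read_object(objects_dir: Path, object_id: str):
    """Decompress an object file and return (type, body)."""
    path = objects_dir / object_id[:2] / object_id[2:]
    raw = zlib.decompress(path.read_bytes())
    header, body = raw.split(b"\0", 1)
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


with tempfile.TemporaryDirectory() as tmp:
    objects_dir = Path(tmp) / "objects"
    stored_id = write_object(objects_dir, "blob", b"Hello World")
    obj_type, body = read_object(objects_dir, stored_id)
    print(obj_type, body)  # blob b'Hello World'
```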

That’s all under the hood of Git.