Blobs
As you can see, on the right of the previous command output we have README.md and shoppingList.txt, which suggests that Git blobs represent the files. As before, we can verify this by looking at the contents; let's see what's inside 637a0:
[17] ~/grocery (master) $ git cat-file -p 637a0
banana
Wow! Its content is exactly the content of our shoppingList.txt file.
To confirm, we can use the cat command, which on *nix systems allows you to see the contents of a file:
[18] ~/grocery (master) $ cat shoppingList.txt
banana
As you can see, the result is the same.
Blobs are binary files, nothing more and nothing less. These byte sequences, which are not human-readable once stored, hold the contents of any file, whether text or binary: images, source code, archives, and so on. Every file is compressed and turned into a blob before being stored in a Git repository.
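To make the "compressed and transformed" step a little more concrete, here is a minimal Python sketch of a round trip through the format Git uses for loose objects. It assumes that format is a "blob <size>" header, a NUL byte, and then the raw content, all compressed with zlib; the function names encode_loose_object and decode_loose_object are hypothetical and exist only for this illustration.

```python
import zlib
from typing import Tuple


def encode_loose_object(content: bytes) -> bytes:
    """Build the bytes assumed to be stored for a blob:
    a 'blob <size>' header, a NUL byte, the content, all zlib-compressed."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return zlib.compress(header + content)


def decode_loose_object(stored: bytes) -> Tuple[str, bytes]:
    """Reverse the encoding: decompress, then split header from content."""
    raw = zlib.decompress(stored)
    # The header never contains a NUL byte, so the first one ends it.
    header, _, content = raw.partition(b"\x00")
    kind, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return kind.decode("ascii"), content


stored = encode_loose_object(b"banana\n")
print(stored)                       # compressed bytes, not readable as text
print(decode_loose_object(stored))  # the type and the original content
```

The content itself is never altered, only wrapped and compressed, which is why git cat-file -p can always give the original file back.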
As mentioned earlier, each file is marked with a hash; this hash uniquely identifies the file's content within our repository, and it is thanks to this ID that Git can retrieve it when needed and detect any changes when the same file is altered (files with different content will have different hashes).
We said SHA-1 hashes are unique; but what does that mean?
Let's try to understand it better with an example.
Open a shell and try to play a bit with another plumbing command, git hash-object:
[19] ~/grocery (master) $ echo "banana" | git hash-object --stdin
637a09b86af61897fb72f26bfb874f2ae726db82
The git hash-object command is the plumbing command that calculates the hash of any object. In this example, we used the --stdin option to make it read the content from standard input, which the pipe connects to the output of the preceding command, echo "banana". In a few words, we calculated the hash of the string "banana" (plus the trailing newline that echo appends), and it came out as 637a09b86af61897fb72f26bfb874f2ae726db82.
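The same calculation can be sketched outside Git. The Python snippet below assumes the repository uses the default SHA-1 object format and that a blob's hash is the SHA-1 of a "blob <size>" header, a NUL byte, and the content; git_blob_hash is a hypothetical helper name, not a Git API. If those assumptions hold, the printed value should match the one produced by git hash-object above.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash content the way Git is assumed to hash a blob:
    SHA-1 over 'blob <size>' + NUL + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# echo "banana" writes the word followed by a newline, so the newline
# is part of the hashed content.
print(git_blob_hash(b"banana\n"))
```

Note that the header is part of what gets hashed, so the result differs from a plain SHA-1 of the file's bytes.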
And on your computer, did you try it? What is the result?
A bit of suspense... That's incredible, it's the same!
You can rerun the command as many times as you want; the resulting hash will always be the same (if it is not, the cause may be different line endings in your operating system or shell).
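The line-ending caveat is easy to see with a short Python sketch. It assumes the SHA-1 blob hashing scheme (a "blob <size>" header, a NUL byte, then the content); blob_hash is a hypothetical helper written only for this comparison.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """SHA-1 over 'blob <size>' + NUL + content (assumed Git blob scheme)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


unix_style = blob_hash(b"banana\n")       # LF line ending
windows_style = blob_hash(b"banana\r\n")  # CRLF line ending

print(unix_style)
print(windows_style)
print(unix_style == windows_style)  # False: one extra byte changes the hash
```

A single invisible carriage return is enough to produce a completely different hash, which is why a shell that emits CRLF gives a different result for the same visible text.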
This teaches us something very important: an object with a given content will always have the same hash in any repository, on any computer, on the face of the Earth.
The experienced and the smart ones have probably smelled a rat for some time now, but I hope I have stirred in the rest of the readers the same amazement that caught me when I did this for the first time. This behavior has some interesting implications, as we will see soon.
Last, but not least, I want to highlight that Git calculates the hash from the content of the file, not from the file itself; in fact, the 637a09b86af61897fb72f26bfb874f2ae726db82 hash calculated using git hash-object is the same as that of the blob we inspected previously using git cat-file -p. This teaches us an important lesson: if you have two different files with the same content, even if they have different names and paths, in Git you will end up having only one blob.
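As a rough illustration of that lesson, the Python sketch below writes the same content to two files with different names and paths, then groups them by blob hash. It assumes the SHA-1 blob scheme (a "blob <size>" header, a NUL byte, then the content); the helper blob_hash_of_file, the backup directory, and fruits.txt are hypothetical names made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def blob_hash_of_file(path: Path) -> str:
    """Hash only the file's bytes; its name and location play no part."""
    content = path.read_bytes()
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


blobs = {}  # maps each blob hash to the file names that share it
with tempfile.TemporaryDirectory() as workdir:
    root = Path(workdir)
    first = root / "shoppingList.txt"
    second = root / "backup" / "fruits.txt"  # different name, different path
    second.parent.mkdir()
    for path in (first, second):
        path.write_bytes(b"banana\n")
        blobs.setdefault(blob_hash_of_file(path), []).append(path.name)

print(len(blobs))  # 1: both files map to a single blob hash
print(blobs)
```

Because the name and path never enter the calculation, both files collapse onto one key, just as they would collapse onto one blob in a repository.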