# PyGit - A Simple Git Clone in Python

> **Watch the full tutorial on YouTube!**  
> https://youtu.be/g2cfjDENSyw

---

## 📖 What is PyGit?

PyGit is a **Python implementation of Git** that demonstrates the core concepts and internals of version control systems. It is an educational project: implementing Git's fundamental data structures and operations shows how Git works under the hood.

## Core Components

### 1. **GitObject Class**

- Base class for all Git objects (Blob, Tree, Commit)
- Handles serialization/deserialization with zlib compression
- Generates SHA-1 hashes for object identification (real Git also defaults to SHA-1; SHA-256 is available as an opt-in object format)
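
The hashing and compression described above can be sketched as follows. This is a minimal illustration, not the code in `main.py`: the function names are hypothetical, and it assumes PyGit follows real Git's `<type> <size>\0<content>` object layout.

```python
import hashlib
import zlib
from typing import Tuple


def serialize(obj_type: str, content: bytes) -> bytes:
    """Prefix the content with a '<type> <size>' header and a NUL byte."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return header + content


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 of the uncompressed header + content, as 40 hex characters."""
    return hashlib.sha1(serialize(obj_type, content)).hexdigest()


def compress(obj_type: str, content: bytes) -> bytes:
    """zlib-compress the serialized object for storage."""
    return zlib.compress(serialize(obj_type, content))


def decompress(data: bytes) -> Tuple[str, bytes]:
    """Inverse of compress(): returns (object type, content)."""
    raw = zlib.decompress(data)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("object size does not match its header")
    return obj_type, content
```

Because the header is part of the hashed bytes, `object_id("blob", b"hello world\n")` yields the same ID that `git hash-object` prints for a file with that content.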

### 2. **Blob Objects**

- Store actual file contents
- Represent individual files in the repository

### 3. **Tree Objects**

- Represent directory structures
- Store references to blobs and other trees
- Maintain file permissions and names
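
As a rough sketch of how a tree can hold modes, names, and references, the snippet below uses real Git's binary entry layout (`<mode> <name>\0` followed by the 20 raw hash bytes). Whether PyGit stores trees this way is an assumption, and the function names are hypothetical.

```python
import hashlib
from typing import List, Tuple

# (mode, name, hex object id). Mode "100644" is a regular file, "40000" a directory.
Entry = Tuple[str, str, str]


def serialize_tree(entries: List[Entry]) -> bytes:
    """Concatenate entries sorted by name (a simplification of Git's sort order)."""
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return out


def parse_tree(data: bytes) -> List[Entry]:
    """Inverse of serialize_tree()."""
    entries = []
    pos = 0
    while pos < len(data):
        nul = data.index(b"\x00", pos)
        mode, name = data[pos:nul].decode().split(" ", 1)
        hex_id = data[nul + 1:nul + 21].hex()  # SHA-1 digests are 20 bytes
        entries.append((mode, name, hex_id))
        pos = nul + 21
    return entries


def tree_id(entries: List[Entry]) -> str:
    """Hash a tree the same way as any other object: header + body."""
    body = serialize_tree(entries)
    return hashlib.sha1(f"tree {len(body)}".encode() + b"\x00" + body).hexdigest()
```

An entry whose mode is `40000` points at another tree, which is how nested directories are represented.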

### 4. **Commit Objects**

- Store metadata about commits (author, timestamp, message)
- Reference tree objects and parent commits
- Form the commit history chain
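
To make the commit layout concrete, here is a hedged sketch of the plain-text body real Git uses for commits. The helper names are hypothetical, and the fixed `+0000` timezone is an assumption made to keep the example short.

```python
import time
from typing import List, Optional


def serialize_commit(tree: str, parents: List[str], author: str,
                     message: str, timestamp: Optional[int] = None) -> bytes:
    """Build a commit body: headers, a blank line, then the message."""
    ts = int(time.time()) if timestamp is None else timestamp
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # no parent line for a root commit
    lines.append(f"author {author} {ts} +0000")     # timezone fixed (assumption)
    lines.append(f"committer {author} {ts} +0000")
    lines.append("")
    lines.append(message)
    return ("\n".join(lines) + "\n").encode()


def parents_of(commit_body: bytes) -> List[str]:
    """Read the parent hashes back out of a serialized commit."""
    headers = commit_body.decode().split("\n\n", 1)[0]
    return [line.split(" ", 1)[1]
            for line in headers.splitlines()
            if line.startswith("parent ")]
```

Following `parents_of` from one commit to the next is what walks the history chain that `log` displays.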

### 5. **Repository Class**

- Manages the `.git` directory structure
- Handles object storage and retrieval
- Implements Git commands (init, add, commit, checkout, etc.)
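
A minimal sketch of what `init` might do is shown below. `init_repo` is a hypothetical helper, and the default branch name `main` is an assumption based on the usage examples in this README.

```python
from pathlib import Path


def init_repo(root: Path, default_branch: str = "main") -> Path:
    """Create the .git skeleton and point HEAD at a not-yet-existing branch."""
    git_dir = root / ".git"
    if git_dir.exists():
        raise FileExistsError(f"{git_dir} already exists")
    (git_dir / "objects").mkdir(parents=True)          # object database
    (git_dir / "refs" / "heads").mkdir(parents=True)   # one file per branch
    # HEAD is a symbolic reference; the branch file appears on first commit.
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{default_branch}\n")
    return git_dir
```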

---

## Features

- **Repository Initialization** - Create new Git repositories
- **File Staging** - Add files to the staging area
- **Commit Creation** - Create commits with messages and metadata
- **Branch Management** - Create, switch, and delete branches
- **Commit History** - View commit logs and history
- **Status Checking** - Monitor repository state
- **Object Storage** - Content-addressed storage using SHA-1 hashing and zlib compression

---

## 📦 Installation & Setup

### Prerequisites

- Python 3.7+
- No external dependencies (uses the Python standard library only)

### Quick Start

```bash
# Clone the repository
git clone <this-repo-url>
cd git_clone

# Run PyGit commands
python3 main.py init
python3 main.py add README.md
python3 main.py commit -m "Initial commit"
```

---

## 💻 Usage Examples

### Initialize a Repository

```bash
python3 main.py init
# Output: Initialized empty Git repository in ./.git
```

### Add Files to Staging

```bash
# Add single file
python3 main.py add main.py

# Add entire directory
python3 main.py add src/

# Add multiple files
python3 main.py add file1.py file2.py src/
```

### Create Commits

```bash
python3 main.py commit -m "Add new feature"
python3 main.py commit -m "Fix bug" --author "Rivaan Ranawat <rivaan@rivaan.com>"
```

### Branch Operations

```bash
# List branches
python3 main.py branch

# Create new branch
python3 main.py checkout -b feature-branch

# Switch to existing branch
python3 main.py checkout main

# Delete branch
python3 main.py branch feature-branch -d
```

### View Repository Status

```bash
# Check working directory status
python3 main.py status

# View commit history
python3 main.py log -n 5
```

---

## 🗂️ Project Structure

```
git_clone/
├── main.py          # Main PyGit implementation
├── README.md        # This file
└── .git/            # Git repository (created after init)
    ├── objects/     # Git objects database
    ├── refs/        # References and branches
    ├── HEAD         # Current branch pointer
    └── index        # Staging area
```

---

## 🔍 How It Works

### 1. **Object Storage**

- Files are stored as **Blob objects** with compressed content
- Directories are represented as **Tree objects** with file references
- Each object gets a unique SHA-1 hash
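
The on-disk side of object storage can be sketched like this. It assumes PyGit mirrors real Git's layout, where the first two hex characters of the hash name a subdirectory of `objects/`. The helper names are hypothetical.

```python
import hashlib
import zlib
from pathlib import Path
from typing import Tuple


def store_object(objects_dir: Path, obj_type: str, content: bytes) -> str:
    """Write an object under objects/<first 2 hex>/<remaining 38 hex>."""
    raw = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    hex_id = hashlib.sha1(raw).hexdigest()
    path = objects_dir / hex_id[:2] / hex_id[2:]
    if not path.exists():  # identical content is only ever stored once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(raw))
    return hex_id


def load_object(objects_dir: Path, hex_id: str) -> Tuple[str, bytes]:
    """Read an object back and split it into (type, content)."""
    raw = zlib.decompress((objects_dir / hex_id[:2] / hex_id[2:]).read_bytes())
    header, _, content = raw.partition(b"\x00")
    return header.decode().split(" ")[0], content
```

Since the file name is derived from the content, storing the same file twice is a no-op, which is where the deduplication comes from.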

### 2. **Staging Process**

- Files are read and converted to Blob objects
- Object hashes are stored in the index (staging area)
- Index tracks which files are ready for commit
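
The staging steps above might look roughly like the sketch below. It assumes the index is a JSON file mapping paths to blob hashes, which is a guess at a common choice for teaching implementations (real Git uses a binary index format). `blob_id` and `stage_file` are hypothetical names, and writing the blob to the object database is omitted.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def blob_id(content: bytes) -> str:
    """SHA-1 over the 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()


def stage_file(repo_root: Path, rel_path: str) -> Dict[str, str]:
    """Record the file's current blob hash in the index and return the index."""
    index_path = repo_root / ".git" / "index"
    index = json.loads(index_path.read_text()) if index_path.exists() else {}
    index[rel_path] = blob_id((repo_root / rel_path).read_bytes())
    index_path.write_text(json.dumps(index, indent=2, sort_keys=True))
    return index
```

Staging the same path again after editing the file simply overwrites its entry with the new hash.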

### 3. **Commit Process**

- Creates a Tree object from the current index
- Generates a Commit object with metadata
- Updates branch reference to point to new commit
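
Turning a flat index into nested trees is the least obvious of these steps, so here is a sketch under stated assumptions: the index maps `/`-separated paths to blob hashes, and `store` is a hypothetical callable that serializes a list of `(mode, name, hash)` entries, saves it, and returns the new tree's hash.

```python
from typing import Any, Callable, Dict, List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, object hash)


def nest_index(index: Dict[str, str]) -> Dict[str, Any]:
    """Group flat 'dir/sub/file' paths into nested dictionaries."""
    root: Dict[str, Any] = {}
    for path, blob_hash in index.items():
        *dirs, filename = path.split("/")
        node = root
        for d in dirs:
            node = node.setdefault(d, {})
        node[filename] = blob_hash
    return root


def write_trees(tree: Dict[str, Any],
                store: Callable[[List[Entry]], str]) -> str:
    """Store subtrees first (bottom-up) so each parent can reference their hashes."""
    entries: List[Entry] = []
    for name, child in sorted(tree.items()):
        if isinstance(child, dict):
            entries.append(("40000", name, write_trees(child, store)))
        else:
            entries.append(("100644", name, child))
    return store(entries)
```

The hash returned for the root tree is what the new commit object records. Writing the commit's own hash into the current branch file completes the commit.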

### 4. **Branch Management**

- Branches are just files pointing to commit hashes
- Checkout updates HEAD and restores working directory
- Branch creation copies current commit reference
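
Because a branch is only a small file, the reference handling fits in a few lines. The sketch below assumes the `refs/heads/<name>` layout and a `ref: ...` line in `HEAD`, as in real Git. The function names are hypothetical, and restoring the working directory on checkout is left out.

```python
from pathlib import Path
from typing import Optional


def current_commit(git_dir: Path) -> Optional[str]:
    """Resolve HEAD to a commit hash, or None if the branch has no commits."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref_file = git_dir / head[len("ref: "):]
        return ref_file.read_text().strip() if ref_file.exists() else None
    return head  # detached HEAD: the file holds a hash directly


def create_branch(git_dir: Path, name: str) -> None:
    """A new branch is a copy of the current commit hash under refs/heads/."""
    commit = current_commit(git_dir)
    if commit is None:
        raise ValueError("cannot branch before the first commit")
    (git_dir / "refs" / "heads" / name).write_text(commit + "\n")


def switch_branch(git_dir: Path, name: str) -> None:
    """Point HEAD at an existing branch."""
    if not (git_dir / "refs" / "heads" / name).exists():
        raise ValueError(f"branch {name!r} does not exist")
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{name}\n")
```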