# Furgit

[![builds.sr.ht status](https://builds.sr.ht/~runxiyu/furgit.svg)](https://builds.sr.ht/~runxiyu/furgit)
[![Go Reference](https://pkg.go.dev/badge/codeberg.org/lindenii/furgit.svg)](https://pkg.go.dev/codeberg.org/lindenii/furgit)

Furgit is a low-level Git library in Go.

## Project status

* Several years away from stable
* Do not use in production
* Mature alternative: [go-git](https://github.com/go-git/go-git)
* Will use [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0.html) starting at 1.0.0

## Goals

* General-purpose Git plumbing library for UNIX-like systems
* Aim for clear architecture first, then high performance
* Expect familiarity with Git internals

## Finding your way around

If you are working with an on-disk repository, start with
`repository.Open(...)`. It opens the repository and wires together the refs storage, object
storage, and resolver.

That gives you a repository handle with a few different entry points, but they
serve different purposes:

* `repo.Refs()` is for branch names, tags, `HEAD`, and ref updates.
  * Use it when you are starting from names rather than object IDs.
  * A common pattern is to resolve a ref first, then pass the resulting object
    ID to the resolver.

* `repo.Resolver()` is the main object-facing API for most callers.
  * Use it when you want commits, trees, blobs, or tags as typed values.
  * It also handles peeling through annotated tags, resolving objects to the
    type you actually want, and walking paths inside trees.
  * It even allows you to access a tree as an `io/fs.FS`.
  * If your goal is "show me this commit", "read this tree", "follow this tag",
    or "get me the file at this path", this is usually the right layer.

* `repo.Objects()` is the storage layer underneath resolution.
  * Use it when you need to read object headers, read raw object contents,
    stream object data, or otherwise look up objects directly by ID.
  * Most callers who want to work with Git objects as commits, trees, blobs, or
    tags should prefer the resolver instead.
  * However, checking an object's size or type by ID is a fairly common
    operation, and it should be done here.

Some object concepts are kept separate:

* `object` contains parsed Git object values such as blobs, trees, commits, and
  tags. These are the decoded contents of Git objects and do not tell you
  anything about the object's identity.

* `object/stored` wraps a parsed object together with the object ID it was
  loaded from. This is used when you need both the parsed value and the
  identity it was loaded under.

As a rule of thumb:

* If you have a ref name, start with `repo.Refs()`.
* If you want typed objects or path-based access, use `repo.Resolver()`.
* If you need raw object lookup by ID, object headers, or object streams, use
  `repo.Objects()`.

Some useful operations are built separately and are meant to be constructed
over the stores that `Repository` already exposes:

* To check whether one revision is an ancestor of another, or to compute merge
  bases, construct a `commitquery.Query` over `repo.Objects()`.
  * This is the tool to reach for when you already have object IDs and want to
    ask commit-history questions.
  * If you already have a commit-graph reader, pass it in as well for
    performance.

* To walk commits or all reachable objects from a set of starting points,
  construct a `reachability.Reachability` over `repo.Objects()`.
  * Use commit traversal when you only care about history, and full object
    traversal when you care about the complete reachable object set.
  * This is useful for tasks such as connectivity checks and computing the
    object set that a fetch or push needs to account for.

* To accept pushes on the server side, construct `receivepack` or
  `receivepack/service` with the repository's ref store, object store, and
  object ID algorithm.
  * Push handling also needs the repository's object storage root so incoming
    objects can be quarantined and later promoted.
  * `Repository` does not currently expose that root directly (we'll consider
    possible solutions sometime later), so a push server usually keeps the
    repository path or object root handle alongside the `Repository` value.
  * Hook-based checks are just Go functions, so a fast-forward check can use
    `commitquery` over the existing and quarantined object stores. Some hooks
    are provided.
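
To make the ancestry question concrete, here is a minimal sketch of an ancestor check over a toy in-memory commit graph. It does not use `commitquery` or any other Furgit API: the `graph` type and `isAncestor` function are hypothetical, and a real implementation would read parents from the object store and could use a commit-graph to prune the walk.

```go
package main

import "fmt"

// graph maps a commit ID to the IDs of its parents.
// Toy stand-in for an object store; not a Furgit type.
type graph map[string][]string

// isAncestor reports whether anc is reachable from desc by following
// parent links. A commit counts as its own ancestor, matching the
// convention of `git merge-base --is-ancestor`.
func isAncestor(g graph, anc, desc string) bool {
	seen := map[string]bool{desc: true}
	queue := []string{desc}
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		if id == anc {
			return true
		}
		for _, parent := range g[id] {
			if !seen[parent] {
				seen[parent] = true
				queue = append(queue, parent)
			}
		}
	}
	return false
}

func main() {
	// History: a <- b <- c on one line, and a <- d on a side branch.
	g := graph{
		"a": nil,
		"b": {"a"},
		"c": {"b"},
		"d": {"a"},
	}

	// Updating a ref from b to c is a fast-forward: b is an ancestor of c.
	fmt.Println("b -> c fast-forward:", isAncestor(g, "b", "c"))
	// Updating a ref from d to c is not: d is not an ancestor of c.
	fmt.Println("d -> c fast-forward:", isAncestor(g, "d", "c"))
}
```

A fast-forward check in a push hook is the same question: the ref's current tip must be an ancestor of the proposed new tip.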

## Benchmarks

* See [gitbench](https://git.sr.ht/~runxiyu/gitbench).
* Furgit on the `legacy` branch is slightly faster due to buffer reuse and a
  custom zlib implementation. These will be re-added.
* Alpine edge, i5-10210U, `performance` governor, `linux.git`.
* go-git may become much faster once
  [#1894](https://github.com/go-git/go-git/pull/1894)
  and similar changes are fully in use.
* These lone tests do not represent all workloads. Test your usage
  pattern yourself (and contribute to gitbench).

### Traversing all trees in `HEAD` and fetching each file size

Mainly tests the packfile object reader.

| Implementation | Total  | User   | System |
| -              | -      | -      | -      |
| Git            | 337 ms | 226 ms | 108 ms |
| libgit2        | 391 ms | 269 ms | 120 ms |
| Furgit         | 487 ms | 457 ms | 49 ms  |
| go-git         | 37 s   | 35 s   | 2 s    |

## Repos and mirrors

* [Codeberg](https://codeberg.org/lindenii/furgit) (with the canonical issue tracker)
* [SourceHut mirror](https://git.sr.ht/~runxiyu/furgit)
* [tangled mirror](https://tangled.org/@runxiyu.tngl.sh/furgit)
* [GitHub mirror](https://github.com/runxiyu/furgit)

## Community

* [#lindenii](https://webirc.runxiyu.org/kiwiirc/#lindenii)
  on [irc.runxiyu.org](https://irc.runxiyu.org)
* [#lindenii](https://web.libera.chat/#lindenii)
  on [Libera.Chat](https://libera.chat)

## History and lineage

* Lindenii Forge
* [hare-git](https://codeberg.org/lindenii/hare-git)
* Faster Git library needed for
  [Lindenii Villosa](https://codeberg.org/lindenii/villosa),
  the next generation of Lindenii Forge
* Translated hare-git and put it into `internal/common/git` in Villosa
* Extracted it out into this general-purpose library
* "Fur" is "git" left-shifted by 1 on QWERTY
* Some architectural elements inspired by [upstream Git](https://git-scm.com),
  OpenBSD's [Game of Trees](https://gameoftrees.org), and
  [9front Git](https://git.9front.org/plan9front/9front/HEAD/sys/src/cmd/git/f.html).

## Reporting bugs

Bug reports ideally include a reproduction recipe: a Go program which starts
out with an empty repository and calls Furgit and/or Git commands to trigger
undesirable behavior.

Please ask for help with writing your regression test before asking for your
problem to be fixed. Time invested in writing a regression test saves time
wasted on back-and-forth discussion about how the problem can be reproduced. A
regression test will need to be written in any case to verify a fix and prevent
the problem from resurfacing.

If writing an automated test really turns out to be impossible, please explain
in very clear terms how the problem can be reproduced.

## License

This project is licensed under the GNU Affero General Public License,
Version 3.0 only.

Pursuant to Section 14 of the GNU Affero General Public License, Version 3.0,
[Runxi Yu](https://runxiyu.org) is hereby designated as the proxy who is
authorized to issue a public statement accepting any future version of the
GNU Affero General Public License for use with this Program.

Therefore, notwithstanding the specification that this Program is licensed
under the GNU Affero General Public License, Version 3.0 only, a public
acceptance by the Designated Proxy of any subsequent version of the GNU Affero
General Public License shall permanently authorize the use of that accepted
version for this Program.

For the purposes of the Developer Certificate of Origin, the "open source
license" refers to the GNU Affero General Public License, Version 3.0, with the
above proxy designation pursuant to Section 14.

All contributors are required to "sign-off" their commits (using `git commit
-s`) to indicate that they have agreed to the [Developer Certificate of
Origin](https://developercertificate.org), reproduced below.

```
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.
```