# Furgit

[![builds.sr.ht status](https://builds.sr.ht/~runxiyu/furgit.svg)](https://builds.sr.ht/~runxiyu/furgit)
[![Go Reference](https://pkg.go.dev/badge/codeberg.org/lindenii/furgit.svg)](https://pkg.go.dev/codeberg.org/lindenii/furgit)

Furgit is a low-level Git library in Go.

## Project status

* Several years away from stable
* Do not use in production
* Mature alternative: [go-git](https://github.com/go-git/go-git)
* Will use [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0.html) starting at 1.0.0

## Goals

* General-purpose Git plumbing library for UNIX-like systems
* Aim for clear architecture then high performance
* Expect familiarity with Git internals

## Finding your way around

If you are working with an on-disk repository, start with `repository.Open(...)`. It opens the repository and wires together the refs storage, object storage, and resolver.

That gives you a repository handle with a few different entry points, but they serve different purposes:

* `repo.Refs()` is for branch names, tags, `HEAD`, and ref updates.
  * Use it when you are starting from names rather than object IDs.
  * A common pattern is to resolve a ref first, then pass the resulting object ID to the resolver.
* `repo.Resolver()` is the main object-facing API for most callers.
  * Use it when you want commits, trees, blobs, or tags as typed values.
  * It also handles peeling through annotated tags, resolving objects to the type you actually want, and walking paths inside trees.
  * It even allows you to access a tree as an `io/fs.FS`.
  * If your goal is "show me this commit", "read this tree", "follow this tag", or "get me the file at this path", this is usually the right layer.
* `repo.Objects()` is the storage layer underneath resolution.
  * Use it when you need to read object headers, read raw object contents, stream object data, or look up objects directly by ID.
  * Most callers who want to work with Git objects as commits, trees, blobs, or tags should prefer the resolver instead.
  * However, looking up the size and type of an object by its ID is a fairly common operation, and it belongs here.

Some object concepts are kept separate:

* `object` contains parsed Git object values such as blobs, trees, commits, and tags. These are the decoded contents of Git objects and do not tell you anything about the object's identity.
* `object/stored` wraps a parsed object together with the object ID it was loaded from. This is used when you need both the parsed value and the identity it was loaded under.

As a rule of thumb:

* If you have a ref name, start with `repo.Refs()`.
* If you want typed objects or path-based access, use `repo.Resolver()`.
* If you need raw object lookup by ID, object headers, or object streams, use `repo.Objects()`.

Some useful operations are built separately and are meant to be constructed over the stores that `Repository` already exposes:

* To check whether one revision is an ancestor of another, or to compute merge bases, construct a `commitquery.Query` over `repo.Objects()`.
  * This is the tool to reach for when you already have object IDs and want to ask commit-history questions.
  * If you already have a commit-graph reader, pass it in as well for performance.
* To walk commits or all reachable objects from a set of starting points, construct a `reachability.Reachability` over `repo.Objects()`.
  * Use commit traversal when you only care about history, and full object traversal when you care about the complete reachable object set.
  * This is useful for tasks such as connectivity checks and computing the object set that a fetch or push needs to account for.
* To accept pushes on the server side, construct `receivepack` or `receivepack/service` with the repository's ref store, object store, and object ID algorithm.
  * Push handling also needs the repository's object storage root so incoming objects can be quarantined and later promoted.
  * `Repository` does not currently expose that root directly (possible solutions will be considered later), so a push server usually keeps the repository path or object root handle alongside the `Repository` value.
  * Hook-based checks are plain Go functions; a fast-forward check, for example, can use `commitquery` over the existing and quarantined object stores. Some hooks are provided.

## Features

* Configuration
  * [X] Parsing
  * [ ] Includes
  * [ ] Writing
* [X] Object IDs
  * [X] SHA-256
  * [X] SHA-1
* [X] Object model (incl. parse, serialize)
  * [X] Blobs
  * [X] Trees
    * [X] File mode definitions
    * [X] Entry insertion ordering
    * [X] Traversal
    * [ ] Pathspec
  * [X] Commits
  * [X] Annotated tags
  * [X] Stored objects
* Further cryptography
  * [ ] OpenPGP signatures
  * [ ] SSH signatures
* [X] Reading object stores
  * [X] Pluggable interface
  * [X] Chain lookup store
  * [X] Bundle store
  * [X] MRU lookup store
  * [X] Reading loose objects
  * [ ] Promisor remotes
  * [ ] Alternates
  * [X] Reading packed objects
    * [X] Pack index lookups
    * [X] Delta caching
    * [X] Delta application
    * [ ] Pack-wide bloom filters
    * [ ] Multi pack indexes
* [ ] Writing objects
  * [X] Loose object writing
* Misc bundle features
  * [ ] Writing bundles
* Misc packfile features
  * [X] Writing pack indexes
  * [X] Writing reverse pack indexes
  * [ ] Writing packfiles
  * [ ] Writing thin packs
  * [ ] Compressing deltas
  * [ ] Delta islands
  * [ ] Pack verification
* Compression
  * [ ] Pluggable compression algorithms
  * [X] ZLIB support
  * [ ] DEFLATE optimizations
  * [X] Adler-32 SIMD optimizations
* [X] References
  * [X] Detached references
  * [X] Symbolic references
  * [X] Name verification/resolution
  * [X] Annotated tag ref peeling
  * [ ] Describe
  * [ ] Revision syntax
  * [ ] Namespaces
  * [ ] Replace refs, grafts
* [X] Reference stores
  * [X] Chain lookup store
  * [X] Files reference store
    * [X] Reading loose refs
    * [X] Reading packed refs
    * [X] Atomic writes
    * [X] Batched writes
    * [ ] Packing refs
    * [ ] Reflogs
  * [ ] Reftable
* Reachability
  * [X] Have/wants walks
  * [X] Is ancestor
  * [X] Merge bases
  * [X] Commit graph
    * [X] Changed path bloom filters
    * [X] Chained graphs
    * [ ] Writing
  * [ ] Reachability bitmaps
    * [ ] For a single packfile
    * [ ] For multi pack indexes
* Misc repository
  * [X] Opening relevant stores
  * [ ] Creating repositories
  * [ ] Filter branch/repo
  * [ ] Fast import/export
  * [ ] Git notes
  * [ ] Git attributes
  * [ ] Pseudorefs
* Integrity and maintenance
  * [ ] Fsck
  * [ ] Repacking
  * [ ] Garbage collection
  * [ ] Cruft packing
  * [ ] Expiration
* [ ] Grep
* [ ] Submodules
* [ ] Worktrees
* [ ] Archive
* [ ] LFS
* [ ] Revision log walk
  * [ ] Topological ordering
  * [ ] Date ordering
  * [ ] Path-limited
* [ ] Diffing
  * [ ] Blame
  * [ ] Annotate
  * [X] Tree diffing
    * [ ] Similarity/rename/copy detection
    * [ ] Multi-way diffs
  * [ ] Patch-id
  * [ ] Range-diff
  * Blob diffing
    * [ ] Word diffs
    * [X] Myers
    * [ ] Patience
    * [ ] Histogram
    * [ ] Three-way
  * [ ] Format patch
  * [ ] Apply/amend patch
* Branch integration/rewrite/etc methods
  * [ ] Merge
    * [ ] Recursive
    * [ ] ORT
  * [ ] Rebase
  * [ ] Cherry pick
  * [ ] Revert
  * [ ] Rerere
* Network protocols and related features
  * [X] pkt-line
  * [X] side-band-64k
  * [X] Ingesting packfiles
    * [X] Quarantine areas
    * [X] Un-thinning thin packs
  * Version 0, version 1 protocols
    * [X] Server side
      * [X] Reference advertisement
      * [X] Capability negotiation
      * [X] Receive
      * [ ] "Upload"
    * [ ] Client side
      * [ ] Send
      * [ ] Fetch
  * Version 2 protocol
    * [ ] Server side
      * [ ] "Upload"
    * [ ] Client side
      * [ ] Fetch
  * Protocol-independent logic
    * Common
      * [X] Progress meters
    * Client side
      * [ ] Refspec
      * [ ] Fetch
        * [ ] Partial clones
        * [ ] Object filtering
        * [ ] Bundle URI
        * [ ] Packfile URI
        * [ ] Shallow clones
      * [ ] Send
    * Server side
      * [ ] Upload
        * [ ] Object filtering
      * [X] Receive
        * [ ] Signed push
        * Hooks
          * Slots
            * [ ] After ref negotiation
            * [X] After object unpacking
          * Provided samples
            * [X] Chain
            * [X] Force push rejection
* [ ] Working trees
  * [ ] Stashing
  * [ ] Ignore rules
  * [ ] Checkouts
    * [ ] Sparse checkouts
    * [ ] CR/LF conversions
    * [ ] File mode conversions
  * [ ] Indexes
    * [ ] Conflict resolution
    * [ ] Split index
    * [ ] Sparse index
    * [ ] Untracked cache
  * [ ] Status listing
  * [ ] Filesystem monitor
  * [ ] Worktree
    * [ ] Common directory
    * [ ] Worktree-specific references
* Research
  * [ ] Dynamic packfiles
    * [ ] Compaction; page-sized hole punching
  * [ ] Dynamic indexing
    * [ ] Linear/extendible/spiral hashing
  * [ ] Dynamic reachability bitmaps

## Not planned

* CLI tools
* Clone
* Anything reasonably considered "porcelain"
* Credential helper
* Transports
* Auth
* Remote management
* Bisect
* Any use of env vars
* Repository discovery walking

I might make a second project that supports these. Furgit will probably not, and will remain sans-IO.

## Benchmarks

* See [gitbench](https://git.sr.ht/~runxiyu/gitbench).
* The `legacy` branch of Furgit is slightly faster due to buffer reuse and a custom ZLIB implementation. These will be re-added.
* Alpine edge, i5-10210U, `performance` governor, `linux.git`.
* go-git may become much faster when [#1894](https://github.com/go-git/go-git/pull/1894) and such are fully in use.
* These isolated tests do not represent all workloads. Test your usage pattern yourself (and contribute to gitbench).

### Traversing all trees in `HEAD` and fetching each file size

Mainly tests the packfile object reader.

| Implementation | Total | User | System |
| - | - | - | - |
| Git | 337 ms | 226 ms | 108 ms |
| libgit2 | 391 ms | 269 ms | 120 ms |
| Furgit | 487 ms | 457 ms | 49 ms |
| go-git | 37 s | 35 s | 2 s |

## Repos and mirrors

* [Codeberg](https://codeberg.org/lindenii/furgit) (with the canonical issue tracker)
* [SourceHut mirror](https://git.sr.ht/~runxiyu/furgit)
* [tangled mirror](https://tangled.org/@runxiyu.tngl.sh/furgit)
* [GitHub mirror](https://github.com/runxiyu/furgit)

## Community

* [#lindenii](https://webirc.runxiyu.org/kiwiirc/#lindenii) on [irc.runxiyu.org](https://irc.runxiyu.org)
* [#lindenii](https://web.libera.chat/#lindenii) on [Libera.Chat](https://libera.chat)

## History and lineage

* Lindenii Forge
* [hare-git](https://codeberg.org/lindenii/hare-git)
* Faster Git library needed for [Lindenii Villosa](https://codeberg.org/lindenii/villosa), the next generation of Lindenii Forge
* Translated hare-git and put it into `internal/common/git` in Villosa
* Extracted it out into this general-purpose library
* "Fur" is "git" left-shifted by 1 on QWERTY
* Some architectural elements inspired by [upstream Git](https://git-scm.com), OpenBSD's [Game of Trees](https://gameoftrees.org), and [9front Git](https://git.9front.org/plan9front/9front/HEAD/sys/src/cmd/git/f.html).

## Reporting bugs

Bug reports ideally include a reproduction recipe: a Go program which starts out with an empty repository and calls Furgit and/or Git commands to trigger undesirable behavior.

Please ask for help with writing your regression test before asking for your problem to be fixed. Time invested in writing a regression test saves time wasted on back-and-forth discussion about how the problem can be reproduced. A regression test will need to be written in any case to verify a fix and prevent the problem from resurfacing.

If writing an automated test really turns out to be impossible, please explain in very clear terms how the problem can be reproduced.

## License

This project is licensed under the GNU Affero General Public License, Version 3.0 only.

Pursuant to Section 14 of the GNU Affero General Public License, Version 3.0, [Runxi Yu](https://runxiyu.org) is hereby designated as the proxy who is authorized to issue a public statement accepting any future version of the GNU Affero General Public License for use with this Program.

Therefore, notwithstanding the specification that this Program is licensed under the GNU Affero General Public License, Version 3.0 only, a public acceptance by the Designated Proxy of any subsequent version of the GNU Affero General Public License shall permanently authorize the use of that accepted version for this Program.

For the purposes of the Developer Certificate of Origin, the "open source license" refers to the GNU Affero General Public License, Version 3.0, with the above proxy designation pursuant to Section 14.

All contributors are required to "sign-off" their commits (using `git commit -s`) to indicate that they have agreed to the [Developer Certificate of Origin](https://developercertificate.org), reproduced below.

```
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.
```