From 6be60560864cdd58d571cafa0eeee134c9ae4f70 Mon Sep 17 00:00:00 2001
From: Runxi Yu
Date: Wed, 11 Mar 2026 19:10:00 +0800
Subject: research: Maybe drop mmap in packfile_bloom

---
 research/packfile_bloom.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/research/packfile_bloom.txt b/research/packfile_bloom.txt
index 7a3aae02..2b695ad8 100644
--- a/research/packfile_bloom.txt
+++ b/research/packfile_bloom.txt
@@ -128,3 +128,14 @@ relevant object format hash algorithm could be used to fill up the bloom
 filters, rendering some buckets useless. In the worst case, if they somehow
 fill all filters, this proposal's optimizations become useless, but would not
 be a significant DoS vector.
+
+TODOs
+-----
+
+* Consider dropping mmap (page read vs cacheline read)
+* How should B and K be chosen?
+* How does creation/insert work? Note that packfiles and `.idx`es are immutable.
+* What are the sizes?
+* What are the false positive rates?
+* Is there a way to make this SIMD friendly?
+* How are benchmarks?
-- 
cgit v1.3.1-10-gc9f91