Commit Graph

2486 Commits

Author SHA1 Message Date
406fbe785e gbmap: Reverted attempt at limiting in-use zeroing to unknown window
See previous commit for motivation
2025-10-23 23:58:49 -05:00
d8f3346f13 gbmap: Attempted to limit in-use zeroing to unknown window
Unfortunately this doesn't work and will need to be ripped-out/reverted.

---

The goal was to limit in-use -> free zeroing to the unknown window, which
would allow the gbmap to be updated in-place, saving the extra RAM we
need to maintain the extra gbmap snapshot during traversals and
lfs3_alloc_zerogbmap.

Unfortunately this doesn't seem to work. If we limit zeroing to the
unknown window, blocks can get stuck in the in-use state as long as they
stay in the known window. Since the gbmap's known window encompasses
most of the disk, this can cause the allocators to lock up and be unable
to make progress.

So will revert, but committing the current implementation in case we
revisit the idea.

As a plus, reverting avoids needing to maintain this unknown window
logic, which is tricky and error-prone.
2025-10-23 23:57:53 -05:00
12874bff76 gbmap: Added gc_repoplookahead_thresh and gc_repopgbmap_thresh
To allow relaxing when LFS3_I_REPOPLOOKAHEAD and LFS3_I_REPOPGBMAP will
be set, potentially reducing gc workload after allocating only a couple
blocks.

The relevant cfg comments have quite a bit more info.

Note -1 (not the default, 0; maybe we should explicitly flip this?)
restores the previous behavior of setting these flags on the first
block allocation.
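
Roughly, I expect usage to look like this (a sketch; the struct name,
units, and surrounding config are my best guess, not authoritative):

  const struct lfs3_cfg cfg = {
      // ... other config ...

      // don't set LFS3_I_REPOPLOOKAHEAD/LFS3_I_REPOPGBMAP until this
      // many blocks have been allocated, -1 restores the old
      // first-allocation behavior
      .gc_repoplookahead_thresh = 128,
      .gc_repopgbmap_thresh = 128,
  };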

---

Also tweaked gbmap repops during gc/traversals to _not_ try to repop
unless LFS3_I_REPOPGBMAP is set. We probably should have done this from
the beginning since repopulating the gbmap writes to disk and is
potentially destructive.

Adds code, though hopefully we can claw this back with future config
rework:

                 code          stack          ctx
  before:       37176           2352          684
  after:        37208 (+0.1%)   2352 (+0.0%)  688 (+0.6%)

                 code          stack          ctx
  gbmap before: 40024           2368          848
  gbmap after:  40120 (+0.2%)   2368 (+0.0%)  856 (+0.9%)
2025-10-23 23:56:50 -05:00
1dc1a26f11 gc: Added LFS3_GC_ALL to make running all gc work easier
This is an alias for all possible gc work, which is a bit more
complicated than you might think due to compile-time features (example:
LFS3_GC_REPOPGBMAP).

The intention is to make loops like the following easy to write:

  struct lfs3_fsinfo fsinfo;
  lfs3_fs_stat(&lfs3, &fsinfo) => 0;

  lfs3_trv_t trv;
  lfs3_trv_open(&lfs3, &trv, fsinfo.flags & LFS3_GC_ALL) => 0;
  ...

It's possible to do this by explicitly setting all gc flags, but that
requires quite a bit of knowledge from the user.
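
For reference, the explicit version looks roughly like this (a sketch;
I'm probably forgetting a flag or two, and the exact set depends on
compile-time features):

  uint32_t gc_flags = LFS3_GC_COMPACTMETA | LFS3_GC_REPOPLOOKAHEAD;
  #ifdef LFS3_GBMAP
  gc_flags |= LFS3_GC_REPOPGBMAP;
  #endif
  lfs3_trv_open(&lfs3, &trv, fsinfo.flags & gc_flags) => 0;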

Another option is allowing -1 for gc/traversal flags, but that loses
assert protection against unknown/misplaced flags.

---

This raises more questions about the prefix naming: it feels a bit weird
to take LFS3_I_* flags, mask with LFS3_GC_* flags, and pass them as
LFS3_T_* flags, but it gets the job done.

Limiting LFS3_GC_ALL to the LFS3_GC_* namespace avoids issues with
opt-out/mode flags such as LFS3_T_RDONLY, LFS3_T_MTREEONLY, etc. For
this reason it probably doesn't make sense to add something similar to
the other namespaces.
2025-10-23 23:55:54 -05:00
1f824a029b Renamed LFS3_T_COMPACT -> LFS3_T_COMPACTMETA (and gc_compactmeta_thresh)
- LFS3_T_COMPACT -> LFS3_T_COMPACTMETA
- gc_compact_thresh -> gc_compactmeta_thresh

And friends:

  LFS3_M_COMPACTMETA   0x00000800  Compact metadata logs
  LFS3_GC_COMPACTMETA  0x00000800  Compact metadata logs
  LFS3_I_COMPACTMETA   0x00000800  Filesystem may have uncompacted metadata
  LFS3_T_COMPACTMETA   0x00000800  Compact metadata logs

---

This does two things:

1. Highlights that LFS3_T_COMPACTMETA only interacts with metadata logs,
   and has no effect on data blocks.

2. Better matches the verb+noun names used for other gc/traversal flags
   (REPOPGBMAP, CKMETA, etc).

It is a bit more of a mouthful, but I'm not sure that's entirely a bad
thing. These are pretty low-level flags.
2025-10-23 23:54:57 -05:00
9bdfb25a09 Renamed LFS3_T_LOOKAHEAD -> LFS3_T_REPOPLOOKAHEAD
And friends:

  LFS3_M_REPOPLOOKAHEAD   0x00000200  Repopulate lookahead buffer
  LFS3_GC_REPOPLOOKAHEAD  0x00000200  Repopulate lookahead buffer
  LFS3_I_REPOPLOOKAHEAD   0x00000200  Lookahead buffer is not full
  LFS3_T_REPOPLOOKAHEAD   0x00000200  Repopulate lookahead buffer

To match LFS3_T_REPOPGBMAP, which is more-or-less the same operation.
Though this does turn into quite the mouthful...
2025-10-23 23:54:02 -05:00
ced63a4c73 Renamed inline_size -> shrub_size
There's a strong argument for naming this inline_size as that's more
likely what users expect, but shrub_size is just the more correct name
and avoids confusion around having multiple names for the same thing.

It also highlights that shrubs in littlefs3 are a bit different than
inline files in littlefs2, and that this config also affects large files
with a shrubbed root.

May rerevert this in the future, but probably only if there is
significant user confusion.
2025-10-23 23:53:02 -05:00
d58205d621 Renamed lfs3_fs_flushgdelta -> lfs3_fs_zerogdelta
This really didn't match the use of "flush" elsewhere in the system.
2025-10-23 23:52:09 -05:00
3b4e1e9e0b gbmap: Renamed gbmap_rebuild_thresh -> gbmap_repop_thresh
And tweaked a few related comments.

I'm still on the fence with this name; I don't think it's great, but it
at least better describes the "repopulation" operation than
"rebuilding" does. The important distinction is that we don't throw away
information. Bad/erased block info (future) is still carried over into
the new gbmap snapshot, and persists unless you explicitly call
rmgbmap + mkgbmap.

So, adopting gbmap_repop_thresh for now to see if it's just a habit
thing, but may adopt a different name in the future.

As a plus, gbmap_repop_thresh is two characters shorter.
2025-10-23 23:51:18 -05:00
fb90bf976c trv: Split lfs3_trv_t -> lfs3_trv_t, lfs3_mgc_t, and lfs3_mtrv_t
A big downside of LFS3_T_REBUILDGBMAP is the addition of an lfs3_btree_t
struct to _every_ traversal object.

Unfortunately, I don't see a way around this. We need to track the new
gbmap snapshot _somewhere_, and other options (such as a global gbmap.b_
snapshot) just move the RAM around without actually saving anything.

To at least mitigate this internally, this splits lfs3_trv_t into
distinct lfs3_trv_t, lfs3_mgc_t, and lfs3_mtrv_t structs that capture
only the relevant state for internal traversal layers:

- lfs3_mtree_traverse <- lfs3_mtrv_t
- lfs3_mtree_gc       <- lfs3_mgc_t (contains lfs3_mtrv_t)
- lfs3_trv_read       <- lfs3_trv_t (contains lfs3_mgc_t)
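
Roughly, the nesting looks like this (just a sketch of the containment;
real members are elided and the member names are guesses):

  typedef struct lfs3_mtrv {
      uint32_t flags;             // placeholder, real state elided
  } lfs3_mtrv_t;                  // lfs3_mtree_traverse state

  typedef struct lfs3_mgc {
      lfs3_mtrv_t t;              // contains the mtree traversal state
  } lfs3_mgc_t;                   // lfs3_mtree_gc state

  typedef struct lfs3_trv {
      lfs3_mgc_t gc;              // contains the gc state
  } lfs3_trv_t;                   // lfs3_trv_read state, + 2-block queue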

This minimizes the impact of the gbmap rebuild snapshots, and saves a
big chunk of RAM. As a plus it also saves RAM in the default build by
limiting the 2-block block queue to the high-level lfs3_trv_read API:

                 code          stack          ctx
  before:       37176           2360          684
  after:        37176 (+0.0%)   2352 (-0.3%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 40060           2432          848
  gbmap after:  40024 (-0.1%)   2368 (-2.6%)  848 (+0.0%)

The main downside? Our field names are continuing in their
ridiculousness:

  lfs3.gc.gc.t.b.h.flags // where else would the global gc flags be?
2025-10-23 23:49:58 -05:00
06bc4dff04 trv: Simplified MUTATED/DIRTY flags, no more swapping
This is a bit less simplified than I hoped. We don't _strictly_ need
both LFS3_t_DIRTY + LFS3_t_MUTATED if we're ok with either (1) making
multiple passes to confirm fixorphans succeeded or (2) clearing the
COMPACT flag after one pass (which may introduce new uncompacted
metadata). But both of these have downsides, and we're not _that_
stressed for flag space yet...

So keeping all three of:

  LFS3_t_DIRTY      0x04000000  Filesystem modified outside traversal
  LFS3_t_MUTATED    0x02000000  Filesystem modified during traversal
  LFS3_t_CKPOINTED  0x01000000  Filesystem ckpointed during traversal

But I did manage to get rid of the bit swapping by tweaking LFS3_t_DIRTY
to imply LFS3_t_MUTATED instead of being exclusive. This removes the
"failed" gotos in lfs3_mtree_gc and makes things a bit more readable.

---

I also split lfs3_fs/handle_clobber into separate lfs3_fs/handle_clobber
and lfs3_fs/handle_mutate functions. This added a bit of code, but I
think it's worth it for a simpler internal API. A confusing internal API
is no good.

In total these simplifications saved a bit of code:

                 code          stack          ctx
  before:       37208           2360          684
  after:        37176 (-0.1%)   2360 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 40100           2432          848
  gbmap after:  40060 (-0.1%)   2432 (+0.0%)  848 (+0.0%)
2025-10-23 23:41:43 -05:00
5a7e0c2b58 gbmap: Renamed a couple gbmap/lookahead things to be more consistent
- lfs3_gbmap_set* -> lfs3_gbmap_mark*
- lfs3_alloc_markfree -> lfs3_alloc_adopt
- lfs3_alloc_mark* -> lfs3_alloc_markinuse*

Mainly for consistency, since the gbmap and lookahead buffer are more or
less the same algorithm, ignoring nuances (lookahead only ors inuse
bits, gbmap rebuilding can result in multiple snapshots, etc).

The rename lfs3_gbmap_set* -> lfs3_gbmap_mark* also makes space for
lfs3_gbmap_set* to be used for range assignments with a payload, which
may be useful for erased ranges (gbmap-tracked ecksums?).
2025-10-23 23:39:59 -05:00
f5508a1b6c gbmap: Added LFS3_T_REBUILDGBMAP and friends
This adds LFS3_T_REBUILDGBMAP and friends, and enables incremental gbmap
rebuilds as a part of gc/traversal work:

  LFS3_M_REBUILDGBMAP   0x00000400  Rebuild the gbmap
  LFS3_GC_REBUILDGBMAP  0x00000400  Rebuild the gbmap
  LFS3_I_REBUILDGBMAP   0x00000400  The gbmap is not full
  LFS3_T_REBUILDGBMAP   0x00000400  Rebuild the gbmap

On paper, this is more or less identical to repopulating the lookahead
buffer -- traverse the filesystem, mark blocks as in-use, adopt the new
gbmap/lookahead buffer on success -- but a couple nuances make
rebuilding the gbmap a bit trickier:

- Unlike the lookahead buffer, which eagerly zeros in allocation, we
  need an explicit zeroing pass before we start marking blocks as
  in-use. This means multiple traversals can potentially conflict with
  each other, risking the adoption of a clobbered gbmap.

- The gbmap, which stores information on disk, relies on block
  allocation and the temporary "in-flight window" defined by allocator
  ckpoints to avoid circular block states during gbmap rebuilds. This
  makes gbmap rebuilds sensitive to allocator ckpoints, which we
  consider more-or-less a noop in other parts of the system.

  Though now that I'm writing this, it might have been possible to
  instead include gbmap rebuild snapshots in fs traversals... but that
  would probably have been much more complicated.

- Rebuilding the gbmap requires writing to disk and is generally much
  more expensive/destructive. We want to avoid trying to rebuild the
  gbmap when it's not possible to actually make progress.

On top of this, the current trv-clobber system is a delicate,
error-prone mess.

---

To simplify everything related to gbmap rebuilds, I added a new
internal traversal flag: LFS3_t_CKPOINTED:

  LFS3_t_CKPOINTED  0x04000000  Filesystem ckpointed during traversal

LFS3_t_CKPOINTED is set, unconditionally, on all open traversals in
lfs3_alloc_ckpoint, and provides a simple, robust mechanism for checking
if _any_ allocator checkpoints have occurred since a traversal was
started. Since lfs3_alloc_ckpoint is required before any block
allocation, this provides a strong guarantee that nothing funny happened
to any allocator state during a traversal.
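
In pseudo-C, the new part of lfs3_alloc_ckpoint is just the following
(the handle-list iteration and names here are stand-ins, not the actual
code):

  // mark every open traversal, so in-progress lookahead/gbmap passes
  // know allocator state may have shifted underneath them
  for (lfs3_handle_t *h = lfs3->handles; h; h = h->next) {
      if (lfs3_handle_istrv(h)) {
          h->flags |= LFS3_t_CKPOINTED;
      }
  }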

This makes lfs3_alloc_ckpoint a bit less cheap, but the strong
guarantees that allocator state is unmodified during traversal are well
worth it.

This makes both lookahead and gbmap passes simpler, safer, and easier to
reason about.

I'd like to adopt something similar+stronger for LFS3_t_MUTATED, and
reduce this back to two flags, but that can be a future commit.

---

Unfortunately due to the potential for recursion, this ended up reusing
less logic between lfs3_alloc_rebuildgbmap and lfs3_mtree_gc than I had
hoped, but at least the main chunks (lfs3_alloc_remap,
lfs3_gbmap_setbptr, lfs3_alloc_adoptgbmap) could be split out into
common functions.

The result is a decent chunk of code and stack, but the value is high as
incremental gbmap rebuilds are the only option to reduce the latency
spikes introduced by the gbmap allocator (it's not significantly worse
than the lookahead buffer, but both do require traversing the entire
filesystem):

                 code          stack          ctx
  before:       37164           2352          684
  after:        37208 (+0.1%)   2360 (+0.3%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 39708           2376          848
  gbmap after:  40100 (+1.0%)   2432 (+2.4%)  848 (+0.0%)

Note the gbmap build is now measured with LFS3_GBMAP=1 (maybe-gbmap),
instead of LFS3_YES_GBMAP=1 as before. This includes the cost of
mkgbmap, lfs3_f_isgbmap, etc.
2025-10-23 23:39:55 -05:00
5bfa2a1071 gbmap: Added an lfs3_alloc_ckpoint to lfs3_fs_mkconsistent
lfs3_fs_mkconsistent is already limited to call sites where
lfs3_alloc_ckpoint is valid (lfs3_fs_mkconsistent internally relies on
lfs3_mdir_commit), so might as well include an unconditional
lfs3_alloc_ckpoint to populate allocators and save some code:

                       code          stack          ctx
  no-gbmap before:    37168           2352          684
  no-gbmap after:     37164 (-0.0%)   2352 (+0.0%)  684 (+0.0%)

                       code          stack          ctx
  maybe-gbmap before: 39720           2376          848
  maybe-gbmap after:  39708 (-0.0%)   2376 (+0.0%)  848 (+0.0%)

                       code          stack          ctx
  yes-gbmap before:   39208           2376          848
  yes-gbmap after:    39204 (-0.0%)   2376 (+0.0%)  848 (+0.0%)
2025-10-17 14:03:14 -05:00
61dc21ccb7 gbmap: Renamed/moved lookahead.bmapped -> gbmap.known
And:

- Tweaked the behavior of gbmap.window/known to _not_ match disk.
  gbmap.known matching disk is what required a separate
  lookahead.bmapped in the first place, but we never use both fields.

- _Don't_ revert gbmap on failed mdir commits!

  This was broken! If we reverted we risked inheriting outdated
  in-flight block information.

  This could be fixed by also zeroing lookahead.bmapped, but would force
  a gbmap rebuild. And why? The only interaction between mdir commit and
  the gbmap is block allocation, which is intentionally allowed to go
  out-of-sync to relax issues like this.

  Note we still revert in lfs3_fs_grow, since the new gbmap we create
  there is incompatible with the previous disk size.

As a part of these changes, gbmap.window now behaves roughly the same as
gbmap.known and updates eagerly on block allocation.

This makes lookahead.window and gbmap.window somewhat redundant, but
simplifies the relevant logic (especially due to how lookahead.window
lags behind lookahead.off).

---

A bunch of bugs fell out of this, the interactions with lfs3_fs_mkgbmap
and lfs3_fs_grow being especially tricky, but fortunately our testing is
doing a good job.

At least the code changes were minimal, saves a bit of RAM:

                       code          stack          ctx
  no-gbmap before:    37168           2352          684
  no-gbmap after:     37168 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                       code          stack          ctx
  maybe-gbmap before: 39688           2392          852
  maybe-gbmap after:  39720 (+0.1%)   2376 (-0.7%)  848 (-0.5%)

                       code          stack          ctx
  yes-gbmap before:   39156           2392          852
  yes-gbmap after:    39208 (+0.1%)   2376 (-0.7%)  848 (-0.5%)
2025-10-17 14:02:47 -05:00
67d3c6ea69 scripts: Ignore errors with compat-disabled gstate
The gbmap introduces quite a bit of complexity with how it interacts
with config: block_count => gbmap weight, and wcompat => gbmap enabled.
On one hand this means fewer sources of truth; on the other hand it
makes the gbmap logic cross subsystems and a bit messy.

To avoid trying to parse a bunch of disabled/garbage gstate, this adds
wcompat/rcompat checks to our Gstate class, exposed via __bool__.

This also means we actually need to parse wcompat/rcompat/ocompat flags,
but that wasn't too difficult (though it currently only supports 32 bits).

---

I added conditional repr logic for the grm and gbmap, but didn't bother
with the gcksum. The gcksum is used too many other places in these
scripts to expect a nice rendering when disabled.
2025-10-17 14:02:46 -05:00
b5a94f3397 gbmap: Added mkgbmap and rmgbmap for enabling/disabling the gbmap
These two functions allow changing whether or not the gbmap is in use
after format:

  // Enable the global on-disk block-map
  //
  // Returns a negative error code on failure. Does nothing if a gbmap
  // already exists.
  int lfs3_fs_mkgbmap(lfs3_t *lfs3);

  // Disable the global on-disk block-map
  //
  // Returns a negative error code on failure. Does nothing if no gbmap
  // is found.
  int lfs3_fs_rmgbmap(lfs3_t *lfs3);
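
So converting an image back and forth, in test terms, is just:

  lfs3_fs_mkgbmap(&lfs3) => 0;   // enable the on-disk block-map
  lfs3_fs_rmgbmap(&lfs3) => 0;   // and disable it again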

rmgbmap was easy enough, but implementing mkgbmap turned out to be
surprisingly tricky due to how gstate permeates the system:

- Even if we zero gstate when removing the gbmap, mounting the
  image on a driver that doesn't understand the gbmap results in garbage
  gstate over time as mdir compacts drop unknown gdeltas.

  I think this sort of implicit gdelta cleanup is a good thing, but the
  possibility of garbage gstate is a bit annoying.

  Example A: the dbg scripts are currently printing a bunch of warnings
  for corrupt gstate that can be safely ignored.

  To support recovering from garbage gstate in mkgbmap, I changed
  lfs3_fs_commitgdelta to _always_ track p state even when disabled. We
  already needed to do this in lfs3_fs_flush/consumegdelta anyways,
  since we don't know if the gbmap is used until parsing wcompat flags.

- The commit that enables the gbmap is tricky. We need the gbmap enabled
  to calculate the new gdelta, but we also need it disabled so we don't
  traverse the existing gbmap_p (which may be garbage).

  As a workaround I added gbmap.b_p, which is in theory redundant with
  gbmap_p, but (1) avoids needing to decode gbmap_p during traversals,
  and (2) allows the two to temporarily fall out-of-sync in mkgbmap.

  This means we potentially have 5 (!) snapshots flying around when
  rebuilding the gbmap, which is starting to get a bit silly. But this
  was also motivated by gbmap_p decoding adding roughly the same amount
  of RAM to lfs3_mtree_traverse_, so the total RAM usage should in
  theory be roughly the same.

  There might be a better solution, but this at least gets mkgbmap
  working. The gbmap builds are not our most RAM-sensitive configurations
  anyways.

---

Also added a couple more tests in test_gbmap to test these:

- test_gbmap_files
- test_gbmap_rmgbmap
- test_gbmap_mkgbmap
- test_gbmap_rmmkgbmap
- test_gbmap_mkrmgbmap

And an explicit wraparound test to test_alloc. This was loosely implied
by the nospc tests, but it's probably better to have an explicit test.
The only downside is that this implementation is limited to files:

- test_alloc_wraparound_files

---

Note we are currently dealing with three different configurations:
no-gbmap (the default), yes-gbmap (LFS3_YES_GBMAP), and maybe-gbmap
(LFS3_GBMAP + LFS3_F_GBMAP at runtime).

It only makes sense to include these in maybe-gbmap mode, so this is the
only mode with a notable code increase. However these functions are
relatively cheap. The stack/ctx changes also affect yes-gbmap, but
should mostly cancel out, see above:

                       code          stack          ctx
  no-gbmap before:    37168           2352          684
  no-gbmap after:     37168 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                       code          stack          ctx
  maybe-gbmap before: 39292           2456          800
  maybe-gbmap after:  39688 (+1.0%)   2392 (-2.6%)  852 (+6.5%)

                       code          stack          ctx
  yes-gbmap before:   39116           2456          800
  yes-gbmap after:    39156 (+0.1%)   2392 (-2.6%)  852 (+6.5%)
2025-10-17 14:02:05 -05:00
9e45249b29 gbmap: Added support for gbmap in lfs3_fs_grow
In lfs3_fs_grow, we need to update any gbmaps to match the new disk
size. The actual patch to the gbmap is easy, but it does get a bit
delicate since we need to feed the gbmap with an allocator in the new
disk size.

Fortunately, the opportunism of the gbmap allocator avoids any
catch-22 issues, as long as we make sure to not trigger any gbmap
rebuilds.

Adds a bit of code, but not much:

                 code          stack          ctx
  before:       37168           2352          684
  after:        37168 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 39000           2456          800
  gbmap after:  39116 (+0.3%)   2456 (+0.0%)  800 (+0.0%)
2025-10-12 14:24:32 -05:00
24d75a24c5 btree: Moved most btree claims into lfs3_btree_commit_
Highlighted by the gbmap work, the need for every btree commit to claim
(mark as unfetched, forcing erased-state to be rechecked) every possible
btree snapshot is tedious and error-prone.

Unfortunately we can't avoid this for in-flight/stack allocated btrees,
but we can at least automatically claim the global/tracked btrees
(mtree, gbmap, and file btrees) in lfs3_btree_commit_. This makes most
btree commits just do the right thing, and hopefully minimizes the
risk of forgetting a necessary btree claim.

It also cleans up the various btree-specific claims we were doing, and
makes the codebase a bit less of a mess.

---

Also fixed bshrubs never claiming cached leaves. We now also claim
bshrubs (not just btrees), but avoid clobbering erased-state with
is-shrub checks in lfs3_btree_claim.

Code changes minor, btree claims are at least a cheap operation:

                 code          stack          ctx
  before:       37172           2352          684
  after:        37168 (-0.0%)   2352 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 38996           2456          800
  gbmap after:  39000 (+0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:27 -05:00
7bb7d93c9f gbmap: Minimized commits in lfs3_gbmap_set_
This rearranges lfs3_gbmap_set_ a bit to try to minimize the number of
commits necessary for gbmap updates.

By combining the split and range creation, we can reduce the common
no-merge case to a single commit.

This matters quite a bit because rebuilding the gbmap requires a ton of
lfs3_gbmap_set_ calls (~2d).

---

The original idea was to see if adopting a builder pattern (see
lfs3_file_graft_) here would reduce the commits necessary, but I don't
think it can. Worst case we need to delete 3 ranges, and since they can
reside in different btree leaves, this requires 3 separate commits.

And the current implementation uses no worse than 3 commits.

---

Code changes minimal:

                 code          stack          ctx
  before:       37172           2352          684
  after:        37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 38992           2456          800
  gbmap after:  38996 (+0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:27 -05:00
633cbe8fd6 gbmap: Reuse old gbmap during rebuilds
This changes the gbmap rebuild strategy to clear in-use ranges from a
snapshot of the old gbmap instead of building a new gbmap from scratch.

The theory of building a new gbmap from scratch is it skips the cost of
clearing in-use ranges, but:

1. This potentially misses out on erased-state still in the gbmap.

2. We would need to copy over any erased/bad state (not yet implemented)
   before traversing, and reusing the old gbmap makes this a bit
   simpler.

To make this a little bit more efficient, I extended lfs3_gbmap_set_ to
accept a weight, however this is limited to modifying only a single
range. Cross-range sets would be quite a bit more complicated (see file
grafting).

We're probably dominated by the per-block set operation during traversal
anyways.

---

Costs a bit of code, but in theory makes erased/bad block tracking
cheaper:

                 code          stack          ctx
  before:       37172           2352          684
  after:        37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 38852           2456          800
  gbmap after:  38992 (+0.4%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:27 -05:00
cb9bda5a94 gbmap: Renamed gbmap_scan_thresh -> gbmap_rebuild_thresh
I think a good rule of thumb is: if you refer to some variable/config/
field by a different name in comments/writing/etc more often than not,
you should just rename it to match.

So yeah, gbmap_rebuild_thresh controls when the gbmap is rebuilt.

Also touched up the doc comment a bit.
2025-10-09 14:33:27 -05:00
ea05ad04b9 gbmap: Cleanup of gbmap comments, TODOs, code formatting, etc
Just cleaning up a bunch of outdated TODOs and commented out code, as
well as a little bit of code formatting, and scrubbing airspace/gbatc
names as these are no longer used and will just confuse new users.
2025-10-09 14:33:27 -05:00
9b4ee982bc gbmap: Tried to adopt the gbmap name more consistently
Having gbmap/bmap used in different places for the same thing was
confusing. Preferring gbmap as it is consistent with other gstate (grm
queue, gcksums), even if it is a bit noisy.

It's interesting to note what didn't change:

- The BM* range tags: LFS3_TAG_BMFREE, etc. These already differ from
  the GBMAP* prefix enough, and adopting GBM* would risk confusion for
  actual gstate.

- The gbmap revdbg string: "bb~r". We don't have enough characters for
  anything else!

- dbgbmap.py/dbgbmapsvg.py. These aren't actually related to the gbmap,
  so the name difference is a good thing.
2025-10-09 14:33:27 -05:00
9d322741ca bmap: Simplified bmap configs, reduced to one LFS3_F_GBMAP flag
TLDR: This drops the idea of different bmap strategies/modes, and sorts
out most of the compile-time/runtime conditional bmap interactions.

---

Motivation: Benchmarking (at least up to the 32-bit word limit) has
shown the bmap is unlikely to be a significant bottleneck, even on large
disks. The largest disks tend to be NAND, and NAND's ridiculous block
size limits pressure on block allocation.

There are still concerns for areas I haven't measured yet:

- SD/eMMC/FTL - Small blocks, so more pressure on block allocation. In
  theory the logical block size can be artificially increased, but this
  comes with a granularity tradeoff.

- I've only measured throughput, latency is a whole other story.

  However, users have reported lfs3_fs_gc is useful for mitigating this,
  so maybe latency is less of a concern now?

But while there may still be room for improvement via alternative bmap
strategies, they risk a concerning amount of complexity. Yes,
configuration gets more complicated, but the real issue is that any bmap
strategies that try to track _deallocations_ (the original idea being
treediffing) risk leaking blocks if all cases aren't covered.

The current "bmap cache" strategy strikes a really nice balance where it
reduces _amortized_ block allocation -> ~O(log n) without RAM, while
retaining the safe, bug-resistant, single-source-of-truth properties
that come with lookahead-based allocation.

---

So, long story short, dropping other strategies, and now the presence of
the bmap is a boolean flag.

This is also the first format-specific flag:

- Define LFS3_BMAP to enable the bmap logic, but note by default the
  bmap will still not be used.

- Define LFS3_YES_BMAP to force the bmap to be used.

- With LFS3_BMAP, passing LFS3_F_GBMAP to lfs3_format will include the
  on-disk block-map.

- No flag is needed during mount, the presence of the bmap is determined
  by the on-disk wcompat flags (LFS3_WCOMPAT_GBMAP). This also prevents
  rw mounting if the bmap is not supported, but rdonly mounting is
  allowed.

- Users can check if the bmap is in use via lfs3_fs_stat, which reports
  LFS3_I_GBMAP in the flags field.
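
Put together, the maybe-gbmap flow looks roughly like this (the
lfs3_format argument order is a guess; the flag and field names are from
the list above):

  // format with the on-disk block-map, requires LFS3_BMAP at compile time
  lfs3_format(&lfs3, LFS3_F_GBMAP, &cfg) => 0;

  // later, check if the mounted filesystem is using the bmap
  struct lfs3_fsinfo fsinfo;
  lfs3_fs_stat(&lfs3, &fsinfo) => 0;
  assert(fsinfo.flags & LFS3_I_GBMAP);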

There are still some missing pieces, but these will be a bit more
involved:

- lfs3_fs_grow needs to be made bmap aware!

- We probably want something like lfs3_fs_mkgbmap and lfs3_fs_rmgbmap to
  allow converting between bmap backed/not-backed filesystem images.

Code changes minimal:

                code          stack          ctx
  before:      37172           2352          684
  after:       37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38844           2456          800
  bmap after:  38852 (+0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:27 -05:00
38cfa5cc5e scripts: dbgtag.py: Fixed overlooked LFSR -> LFS3 prefix
Not sure how this was missed, but tags should start with LFS3_ now.
2025-10-09 14:33:27 -05:00
4b2bd11393 bmap: Finally fixed embedded directive macro warning with bmap format
It makes sense: nesting directives (#ifdef) in macro arguments invites
all sorts of weird parse errors. Unfortunately, this doesn't leave us
with many options for conditionally including rattrs in LFS3_RATTRS
lists... This is especially important for lfs3_format, where we can
expect many rattrs to depend on compile-time configurations.

To fix the warning, I went ahead and adopted a conditionally predefined
LFS3_RATTR_IFDEF_BMAP before the rattr list. I'm not super happy with
this fix (ugh, missing comma), but it at least avoids a warning and
non-portable behavior.
2025-10-09 14:33:27 -05:00
982394305e emubd/kiwibd: Fixed unused path param, dropped disk_path
For some reason emubd had both a path argument to lfs3_emubd_create, and
a disk_path config option, with only the disk_path actually being used.

But the real curiosity is why GCC only started warning about it
when copied to kiwibd? path is clearly unused in lfs3_emubd_createcfg,
but no warning...

---

Anyways, not sure which one is a better API, but we definitely don't
need two APIs, so eeny meeny miny moe...

Went ahead and chose the lfs3_emubd_create path param for some
consistency with filebd.
2025-10-09 14:33:27 -05:00
e622656538 bmap: Tweaked bmap ranges, dropped in-flight tag for now
New bmap range tags:

  LFS3_TAG_BMRANGE      0x033u  v--- --11 --11 uuuu
  LFS3_TAG_BMFREE       0x0330  v--- --11 --11 ----
  LFS3_TAG_BMINUSE      0x0331  v--- --11 --11 ---1
  LFS3_TAG_BMERASED     0x0332  v--- --11 --11 --1-
  LFS3_TAG_BMBAD        0x0333  v--- --11 --11 --11

Note 0x334-0x33f are still reserved for future bmap tags, but the new
encoding fits in the surprisingly common 2-bit subfield that may
deduplicate some decoding code.

Fitting in 2 bits is the main reason for this, now that in-flight ranges
look like they won't be worth exploring further. Worst case we can
always add more bm tags in the future. And it may even make sense to use
an entire bit for in-flight tags, since in theory the concept can apply
to more than just in-use blocks.

---

Another benefit of this encoding: In-use vs free is a bit check, and I
like the implication that an in-use + erased block can only be a bad
block.
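
E.g., with hypothetical helper macros:

  // assuming tag is one of the BMRANGE tags above:
  // bit 0 => in-use, bit 1 => erased, both => bad
  #define LFS3_TAG_ISBMINUSE(tag)  ((tag) & 0x1)   // BMINUSE or BMBAD
  #define LFS3_TAG_ISBMERASED(tag) ((tag) & 0x2)   // BMERASED or BMBAD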

No code changes:

                code          stack          ctx
  before:      37172           2352          684
  after:       37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38844           2456          800
  bmap after:  38844 (+0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:24 -05:00
43a6053d5e alloc: Tried to simplify alloc info statements
So now:

  lfs3.c:11482:info: Rebuilding bmap (bmap 37/256)
  lfs3.c:11246:error: No more free space (lookahead 0/256)

Instead of the previously somewhat confusing:

  lfs3.c:11484:info: Rebuilding bmap (bmap 62/256/256)
  lfs3.c:11247:error: No more free space (lookahead 0/0/256)

While the previous info statements did have more info (window +
ckpoint + block count), usually one of these ended up redundant
(window == ckpoint == 0 during ENOSPC, for example).
2025-10-04 13:33:08 -05:00
92620d386f bmap: Recheckpoint the allocator after rebuilding the bmap
If the state before rebuilding the bmap is a valid checkpoint, the state
after is too.

This lets us realloc any blocks that may have been temporarily allocated
when rebuilding the bmap. This probably doesn't matter much except for
low-storage states when blocks are extremely scarce, but allocator
checkpoints are cheap so better safe than sorry.

Code changes minimal (negative?):

                code          stack          ctx
  before:      37172           2352          684
  after:       37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38852           2456          800
  bmap after:  38844 (-0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-04 13:30:52 -05:00
052fc200c8 util: More parens in LFS3_MIN/MAX
Previously this had a very naive number of parens, which led to a very
confusing night trying to debug some code that looked roughly like this:

  LFS3_MAX(1, (false) ? 64 : 256) => 1 ???

Fixed by adding more parens.
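
For posterity, the failure mode looks something like this (both
definitions here are reconstructions, not the exact before/after):

  // naive: arguments not parenthesized inside the condition
  #define NAIVE_MAX(a, b) ((a > b) ? (a) : (b))

  // NAIVE_MAX(1, (false) ? 64 : 256) expands to
  //   ((1 > (false) ? 64 : 256) ? (1) : ((false) ? 64 : 256))
  // the condition parses as (1 > false) ? 64 : 256 == 64, truthy, so
  // the whole expression "helpfully" evaluates to 1

  // fixed: parens around the arguments everywhere
  #define LFS3_MAX(a, b) (((a) > (b)) ? (a) : (b))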
2025-10-01 17:58:09 -05:00
2a2d3173ce btree: Implemented quick-fetches to try to speed up btree commits
I've had this trick in my back pocket for a while, but didn't think it
would be worth the code cost. Benchmarks suggested this was a
bottleneck, so gave it an impl...

But it turned out to be a red herring...

At least the code cost is ridiculously cheap?

           code          stack          ctx
  before: 37156           2352          684
  after:  37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

Oh, sidenote, this also removes shrub trunk fetching, repurposing that
bit as an internal flag for quick-fetches. I don't think fetching shrubs
makes sense anymore? This code was probably leftover from a less-correct
traversal implementation.

---

The basic idea: the most recent trunk contains all the info we need to
fetch a btree node for committing:

- We can infer the rbyd weight from one trunk: The total weight is just
  the sum of alt pointer weights + the leaf weight.

- The checksum tags provide the perturb bit, ecksum, etc.

The only thing we can't find from the most recent trunk is the checksum,
but this is already implicit in our CoW branch pointers! (Technically the
weight is as well, but we have to scan the alts anyways.)

So we don't need to scan the entire rbyd if we know the checksum, just
the most recent trunk + checksum tags.

In theory, quick-fetches drop our btree commit runtime from
O(b log_b n + (log b)(log_b^2 n)) -> O((log b)(log_b^2 n)).

---

In practice, this doesn't seem to matter, even on NAND with 128KiB
blocks. We're still dominated by compaction costs, perhaps due to the
poor granularity of NAND's read size?

I'm going to keep this for now just for the peace-of-mind while
benchmarking, but it may be worth removing in the future (or maybe not?
the code size is much less than I was expecting).

At least it simplifies the runtime complexity...
2025-10-01 17:58:06 -05:00
b94f9fe071 runners: Fixed 64-bit overflow when size_t < bench_io_t
Long story short: %zd != %jd!

This was a simple oversight when writing the bench printing code, and
easy to miss on x86_64 and other modern PCs, but the mistake becomes
very apparent when trying to bench under qemu in thumb mode!
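
For the record, the difference, with bench_io_t being 64 bits while
size_t is only 32 bits under qemu/thumb:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      int64_t amount = 5000000000;        // doesn't fit in 32 bits

      // the bench code was effectively doing this, truncating the value
      printf("%zu\n", (size_t)amount);    // 705032704 with 32-bit size_t

      // casting to intmax_t and using %jd is portable
      printf("%jd\n", (intmax_t)amount);  // 5000000000 everywhere
      return 0;
  }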
2025-10-01 17:58:05 -05:00
0698c49e1b Allow crystal_thresh to go below prog_size
After sleeping on it, allowing crystal_thresh < prog_size makes more
sense than I initially thought, if only to better support the case where
prog_size = block_size (SD/eMMC).

It's true fragments + crystal_thresh were intended to avoid needing to
write padding to raw data blocks, but this only makes sense up until
block padding is cheaper than rbyd overheads. At ~block_size/4, rbyds vs
padded data blocks have roughly the same cost, and at ~block_size/2
rbyds use ~2x the storage due to logging/splitting. At ~block_size/2 we
definitely want to crystallize even if this is still below prog_size.

And it turns out allowing crystal_thresh < prog_size fixes the 512B
block size issues on SD/eMMC we were running into earlier!
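
The motivating geometry, roughly (a sketch; field names other than
crystal_thresh are from memory and may be off):

  const struct lfs3_cfg cfg = {
      // SD/eMMC-ish geometry, prog_size == block_size
      .read_size      = 512,
      .prog_size      = 512,
      .block_size     = 512,

      // now allowed to be < prog_size, crystallize at ~block_size/4
      .crystal_thresh = 128,

      // ... other config ...
  };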

---

Implementing this required some tweaks to lfs3_file_crystallize_:

1. We intentionally do not align down partial crystallizations if we
   can't satisfy prog alignment, as we risk making no progress
   in this case.

2. If we can't satisfy prog alignment, don't mark the bptr as erased.
   Resuming crystallization to an unaligned block is an error.

Unaligned progs should already be implicitly padded by the lower bd
caching logic, so not aligning should be all we need to do to pad data
blocks.

Oh, and also relax the crystal_thresh >= prog_size constraints.

Adds a bit of code, but the improved block usage on SD/eMMC will
hopefully be valuable:

           code          stack          ctx
  before: 37112           2352          684
  after:  37156 (+0.1%)   2352 (+0.0%)  684 (+0.0%)
2025-10-01 17:58:03 -05:00
2f6f7705f1 Limit crystal_thresh to >=prog_size
I confused myself a bit while benchmarking because crystal_thresh <
prog_size was showing some very confusing results. But it turns out the
relevant code was just not written well enough to support this
configuration.

And, to be fair, this configuration really doesn't make sense. The whole
point of the fragment + crystallization system is so we never have to
write unaligned data to blocks. I mean, we could explicitly write
padding in this case, but why?

---

This should probably eventually be either an assert or mutable limit,
but in the meantime I'm just adjusting crystal_thresh at runtime, which
adds a bit of code:

           code          stack          ctx
  before: 37076           2352          684
  after:  37112 (+0.1%)   2352 (+0.0%)  684 (+0.0%)

On the plus side, this prevents crystal_thresh=0 issues much more
elegantly.
2025-10-01 17:58:01 -05:00
8cc91ffa9e Prevent oscillation when crystal_thresh < fragment_size
When crystal_thresh < fragment_size, there was a risk that repeated
write operations would oscillate between crystallizing and fragmenting
every operation. Not only would this wreck performance, it would also
violently wear down blocks as each crystallization would trigger an
erase.

Fortunately all we need to do to prevent this is check both
fragment_size and crystal_thresh before fragmenting. Note this also
affects the fragment checks in truncate/fruncate.
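
The guard ends up being roughly the following (names illustrative, this
is not the literal code):

  // only fragment if we're below _both_ thresholds, otherwise
  // crystallize; this is what prevents the fragment<->crystal
  // oscillation when crystal_thresh < fragment_size
  if (size <= lfs3->cfg->fragment_size
          && size < lfs3->cfg->crystal_thresh) {
      // write as a fragment
  } else {
      // crystallize
  }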

---

crystal_thresh < fragment_size is kind of a weird configuration; to be
honest we should probably just assert if configured this way (we never
write fragments > crystal_thresh, because at that point we would just
crystallize).

But at the moment the extra leniency is useful for benchmarking.

Adds a bit of code, but will probably either assert or mutably limit in
the future:

           code          stack          ctx
  before: 37028           2352          684
  after:  37076 (+0.1%)   2352 (+0.0%)  684 (+0.0%)
2025-10-01 17:57:58 -05:00
eab526ad9f Fixed crystal_thresh=0 bugs
There was a mismatch between the lfs3_cfg comment and the actual
crystal_thresh math where crystal_thresh=0 would break things:

- In lfs3_file_flush_, crystal_thresh=0 meant we would never resume
  crystallization, leading to terrible, _terrible_, linear write
  performance.

- In lfs3_file_sync and lfs3_set, it's unclear if small file commit
  optimizations were working properly. I went ahead and added a
  lfs3_max(lfs3->cfg->crystal_thresh, 1) just to be safe.

The other references to crystal_thresh all check for >= crystal_thresh
conditions, so shouldn't be broken (except for an unrelated bug in
lfs3_file_flushset_).

The reason for this is that crystal_thresh=1 is technically the lower
bound for this math. Allowing crystal_thresh=0 is just a convenience,
and honestly allowing it may have been a bad idea. Maybe we should
require crystal_thresh=1 at minimum? I added a TODO.

All the new v3 config needs revisiting anyways, for defaults, etc.

---

Curiously, this actually saved code? My best guess is maybe some weird
code path in lfs3_file_flush_ was eliminated:

           code          stack          ctx
  before: 37036           2352          684
  after:  37028 (-0.0%)   2352 (+0.0%)  684 (+0.0%)
2025-10-01 17:57:54 -05:00
2c67fb1ea2 scripts: Dropped -e/--exec shortform flag, now just --exec
Too much room for confusion, and potential flag conflicts in the future.
Note it already conflicted with -e/--error-* flags.

--exec is a rather technical flag anyways, and will probably be wrapped
in other ci/script scaffolding most of the time.
2025-10-01 17:57:52 -05:00
be118ab93d scripts: Fixed -s/-S sorting of .csv/.json outputs
I'm not sure if this was ever implemented, or broken during a refactor,
but we were ignoring -s/-S flags when writing .csv/.json output with
-o/-O.

Curious, because the functionality _was_ implemented in fold, just
unused. All this required was passing -s/-S to fold correctly.

Note we _don't_ sort diff_results, because these are never written to
.csv/.json output.

At some point this behavior may have been a bit more questionable, since
we used to allow mixing -o/-O and table rendering. But now that -o/-O is
considered an exclusive operation, ignoring -s/-S doesn't really make
sense.

---

Why did this come up? Well imagine my frustration when:

1. In tikz/pgfplots, \addplot table only really works with sorted data

2. csv.py has a -s/-S flag for sorting!

3. -s/-S doesn't work!
2025-10-01 17:57:49 -05:00
c33182b49b Relax recrystallization when fruncating/logging
This was a nasty performance hole found while benchmarking.

Basically, any time crystallization is triggered, the crystallization
algorithm tries to pack as much data into as few blocks as possible.
When fruncating (the common, and performance sensitive, use case being
logging), this can lead to the algorithm rewriting fruncated blocks.

What the crystallization algorithm doesn't realize, however, is that
when fruncating/logging, we're probably going to fruncate again on the
next call, so rewriting the block is a waste of effort.

Worst case -- a 1 block file -- this can cause littlefs to rewrite the
entire file on every append.

---

The solution implemented here, which is a bit of a hack, is to use the
actual block start for block alignment instead of the logical
start-of-block referenced by our btree/bshrub.

This solves the fruncating/logging performance hole, with the tradeoff
of using more storage than is strictly necessary. This tradeoff is
probably expected with logging however.

Code changes minimal:

           code          stack          ctx
  before: 37024           2352          684
  after:  37036 (+0.0%)   2352 (+0.0%)  684 (+0.0%)
2025-10-01 17:57:47 -05:00
14d0c4121c bmap: Dropped treediff buffers for now
We're not currently using these (at the moment it's unclear if the
original intention behind the treediff algorithms is worth pursuing),
and they are showing up in our heap benchmarks.

The good news is that means our heap benchmarks are working.

Also saves a bit of code/ctx in bmap mode:

                code          stack          ctx
  before:      37024           2352          684
  after:       37024 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38752           2456          812
  bmap after:  38704 (-0.1%)   2456 (+0.0%)  800 (-1.5%)
2025-10-01 17:57:42 -05:00
232f039ccc kiwibd: Added kiwibd, a lighter-weight variant of emubd
Useful for emulating much larger disks in a file (or in RAM). kiwibd
doesn't have all the features of emubd, but this allows it to prioritize
disk size and speed for benchmarking.

kiwibd still keeps some features useful for benchmarking/emulation:

- Optional erase value emulation, including nor-masking

- Read/prog/erase trackers for measuring bd operations

- Read/prog/erase sleeps for slowing down the simulation to a human
  viewable speed
2025-10-01 17:57:39 -05:00
6ba3204816 scripts: Some csv script tweaks to better interact with other scripts
- Added --small-total. Like --small-header, this omits the first column
  which usually just has the informative text TOTAL.

- Tweaked -Q/--small-table so it renders with --small-total if
  -Y/--summary is provided.

- Added --total as an alias for --summary + --no-header + --small-total,
  i.e. printing only the totals (which may be multiple columns) and no
  other decoration.

  This is useful for scripting; now it's possible to extract just, say,
  the sum of some csv and embed with $():

    echo $(./scripts/code.py lfs3.o --total)

- Tweaked total to always output a number (0) instead of a dash (-),
  even if we have no results.

  This relies on Result() with no args, which risks breaking scripts
  where the Result type expects an argument. To hopefully catch this
  early, the table renderer currently creates a Result() before trying
  to fold the total result.

- If first column is empty (--small-total + --small-header, --no-header,
  etc) collapse width to zero. This avoids a bunch of extra whitespace,
  but still includes the two spaces normally used to separate names from
  fields.

  But I think those spaces are a good thing. It makes it hard to miss
  the implicit padding in the table renderer that risks breaking
  dependent scripts.
2025-10-01 17:57:37 -05:00
3e8f304138 scripts: ctx.py/structs.py: Worked around incomplete structs/unions
Found when trying to measure ctx of yaffs2, which relies on incomplete
structs to hide some internal state (yaffs_summary_tags, yaffs_DIR).
This is less common in microcontroller filesystems since almost all
structs end up statically/stack allocated, and you can't statically
allocate incomplete structs.

It's not too surprising, but incomplete structs have no associated
DW_AT_byte_size in the relevant dwarf info, which broke ctx.py and
structs.py...

As a workaround, I'm now defaulting to size=0 if DW_AT_byte_size is
missing.

---

With this fix, at least structs.py is able to pick up the later internal
definition of yaffs_summary_tags. ctx.py doesn't because it only looks
at the unique dwarf offset referenced by the function definition, but
I'm hesitant to try anything more clever here.

yaffs_DIR is noteworthy in that there is simply no complete definition.
Internally, yaffs_DIR pointers alias yaffsfs_DirSearchContext structs.
In this case I think returning size=0 is the only reasonable option.
2025-10-01 17:57:35 -05:00
c9691503bc scripts: plot[mpl].py: Added --x/ylim-ratio for simpler limits
I've been struggling to keep plots readable with --x/ylim-stddev; it may
have been the wrong tool for the job.

This adds --x/ylim-ratio as an alternative, which just sets the limit to
include x-percent of the data (I avoided "percen"t in the name because
it should be --x/ylim-ratio=0.98, not 98, though I'm not sure "ratio" is
great either...).

Like --x/ylim-stddev, this can be used in both one and two argument
forms:

  $ ./scripts/plot.py --ylim-ratio=0.98
  $ ./scripts/plot.py --ylim-ratio=-0.98,+0.98

So far, --x/ylim-ratio has proven much easier to use, maybe because our
amortized results don't follow a normal distribution? --x/ylim-ratio
seems to do a good job of clipping runaway amortized results without too
much information loss.
2025-10-01 17:57:32 -05:00
92af5de3ca emubd: Added optional nor-masking emulation
This adds NOR-style masking emulation to emubd when erase_value is set
to -2:

  erase     => 0xff
  prog 0xf0 => 0xf0
  prog 0xcc => 0xc0

We do _not_ rely on this property in littlefs, and so this feature will
probably go unused in our tests, but it's useful for running other
filesystems (SPIFFS) on top of emubd.

It may be a bit of a scope violation to merge this into littlefs's core
repo, but it's useful to centralize emubd's features somewhere...
2025-10-01 17:57:28 -05:00
6a57258558 make: Adopted lowercase for foreach variables
This seems to be the common style in other Makefiles, and avoids
confusion with global/env variables.
2025-10-01 17:57:23 -05:00
a1b75497d6 bmap: rdonly: Got LFS3_RDONLY + LFS3_BMAP compiling
Counterintuitively, LFS3_RDONLY + LFS3_BMAP _does_ make sense for cases
where you want to include the bmap in things like ckmeta/ckdata scans.

Though this is another argument for a LFS3_RDONLY + LFS3_NO_TRV build.
Traversals add quite a bit of code to the rdonly build that is probably
not always needed.

---

This just required another bunch of ifdefs.

Current bmap rdonly code size:

                code          stack          ctx
  rdonly:      10616            896          532
  rdonly+bmap: 10892 (+2.6%)    896 (+0.0%)  636 (+19.5%)
2025-10-01 17:57:15 -05:00
60ef118dcd rdonly: Got LFS3_RDONLY compiling again
Just a few alloc/eoff references slipped through in the bmap work.

Current rdonly code size:

            code           stack           ctx
  default: 37024            2352           684
  rdonly:  10616 (-71.3%)    896 (-61.9%)  532 (-22.2%)

The biggest change was tweaking our mtortoise again to use the unused
trunk field for the power-of-two bound. The original intention of using
eoff was an extra precaution to avoid the mtortoise looking like a valid
shrub at any point, but eoff is not available in LFS3_RDONLY.

And we definitely want our mtortoise in LFS3_RDONLY!

---

Note I haven't actually tested LFS3_RDONLY + LFS3_BMAP. Does this config
even make sense? I guess ckmeta/ckdata will need to traverse the bmap,
so, counterintuitively, yes?
2025-10-01 17:57:14 -05:00