littlefs

mirror of https://github.com/littlefs-project/littlefs.git synced 2025-12-01 12:20:02 +00:00

Author	SHA1	Message	Date
Christopher Haster	58c5506e85	Brought back lazy grafting, but not too lazy Continued benchmarking efforts are indicating this isn't really an optional optimization. This brings back lazy grafting, where the file leaf is allowed to fall out-of-date to minimize bshrub/btree updates. This is controlled by LFS3_o_UNGRAFT, which is similar, but independent from LFS3_o_UNCRYST: - LFS3_o_UNCRYST - File's leaf not fully crystallized - LFS3_o_UNGRAFT - File's leaf does not match disk Note it makes sense for files to be UNGRAFT only, in the case where the current crystal terminates at the end-of-file but future appends are likely. And it makes sense for files to be UNCRYST only, in cases where we graft uncrystallized blocks so the bshrub/btree makes sense. Which brings us to the main change from the previous lazy-grafting implementation: lfs3_file_lookupnext no longer includes ungrafted leaves. Instead, functions should call lfs3_file_graft if they need lfs3_file_lookupnext to make sense. This significantly reduces the code cost of lazy grafting, at the risk of needing to graft more frequently. Fortunately we don't actually need to call lfs3_file_graft all that often: - lfs3_file_read already flushes caches/leaves before attempting any bshrub/btree reads for simplicity (heavy are not currently considered a priority, if you need this consider opening two file handles). - lfs3_file_flush_ _does_ need to call lfs3_file_graft before the crystallization heuristic pokes, but if we can't resume crystallization, we would probably need to graft the crystal to satisfy the flush anyways. --- Lazy grafting, i.e. procrastinating on bshrub/btree updates during block appends, is an optimization previously dropped due to perceived nicheness: - We can only lazily graft blocks, inlined data fragments always require bshrub/btree updates since they live in the bshrub/btree. - Sync forces bshrub/btree updates anyways, so lazy grafting has no benefit for most logging applications. 
- The performance penalty of eagerly grafting goes away if your caches are large enough. Note that the last argument is a non-argument in littlefs's case. The whole point of littlefs is that you _don't_ need RAM to fix things. However these arguments are all moot when you consider that the "niche use case" -- linear file writes -- is the default bottleneck for most applications. Any file operation becomes a linear write bottleneck when the arguments are large enough. And this becomes a noticeable issue when benchmarking. So... This brings back lazy grafting. But with a more limited scope w.r.t. internal file operations (the above lfs3_file_lookupnext/ lfs3_file_graft changes). --- Long story short, lazy grafting is back again, reverting the ~3x performance regression for linear file writes. But now with quite a bit less code/stack cost: code stack ctx before: 36820 2368 684 after: 37032 (+0.6%) 2352 (-0.7%) 684 (+0.0%)	2025-10-01 17:57:01 -05:00
Christopher Haster	27a722456e	scripts: Added support for SI-prefixes as iI punescape modifiers This adds %i and %I as punescape modifiers for limited printing of integers with SI prefixes: - %(field)i - base-10 SI prefixes - 100 => 100 - 10000 => 10K - 0.01 => 10m - %(field)I - base-2 SI prefixes - 128 => 128 - 10240 => 10Ki - 0.125 => 128mi These can also easily include units as a part of the punescape string: - %(field)iops/s => 10Kops/s - %(field)IB => 10KiB This is particularly useful in plotmpl.py for adding explicit x/yticklabels without sacrificing the automatic SI-prefixes.	2025-10-01 17:56:51 -05:00
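The SI-prefix behavior described above can be sketched in Python roughly as follows. This is only an illustration and not the code from the littlefs scripts: the names `si`, `si2`, and `_scale` are hypothetical, and the set of supported prefixes beyond `m`/`K` and `mi`/`Ki` is an assumption.

```python
# Hypothetical sketch of base-10 and base-2 SI-prefix formatting.
# Only the m/K and mi/Ki prefixes appear in the commit message; the
# others are assumed for illustration.
SI_PREFIXES = {-1: 'm', 0: '', 1: 'K', 2: 'M', 3: 'G'}
SI2_PREFIXES = {-1: 'mi', 0: '', 1: 'Ki', 2: 'Mi', 3: 'Gi'}

def _scale(x, base, prefixes):
    """Scale x by powers of base until 1 <= |x| < base, if a prefix exists."""
    p = 0
    # large values: divide down while a bigger prefix is available
    while abs(x) >= base and p < max(prefixes):
        x /= base
        p += 1
    # small values: multiply up while a smaller prefix is available
    while 0 < abs(x) < 1 and p > min(prefixes):
        x *= base
        p -= 1
    return '%g%s' % (x, prefixes[p])

def si(x):
    """Base-10 SI prefixes, e.g. 10000 -> '10K'."""
    return _scale(x, 1000, SI_PREFIXES)

def si2(x):
    """Base-2 SI prefixes, e.g. 10240 -> '10Ki'."""
    return _scale(x, 1024, SI2_PREFIXES)

if __name__ == '__main__':
    # units are simply appended after the prefix
    print(si(10000) + 'ops/s')
    print(si2(10240) + 'B')
```

Using repeated multiplication/division rather than a logarithm sidesteps floating-point rounding at exact powers of the base.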
Christopher Haster	2a4e0496b6	scripts: csv.py: Fixed lexing of signed float exponents So now these lex correctly: - 1e9 => 1000000000 - 1e+9 => 1000000000 - 1e-9 => 0.000000001 A bit tricky when you think about how these could be confused for binary addition/subtraction. To fix we just eagerly grab any signs after the e. These are particularly useful for manipulating simulated benchmarks, where we need to convert things to/from nanoseconds.	2025-10-01 17:56:29 -05:00
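One way to picture the "eagerly grab any signs after the e" fix is a small regex-based lexer. This is a sketch under assumptions, not csv.py's actual lexer: the token names, the `lex` function, and the supported operator set are all hypothetical.

```python
import re

# A number: digits, optional fraction, and an optional exponent whose
# sign is consumed eagerly as part of the number token.
NUM = r'\d+(?:\.\d*)?(?:[eE][+-]?\d+)?'
TOKEN = re.compile(r'\s*(?:(?P<num>%s)|(?P<op>[-+*/()]))' % NUM)

def lex(s):
    """Split an expression into ('num', float) and ('op', str) tokens."""
    s = s.strip()
    tokens = []
    i = 0
    while i < len(s):
        m = TOKEN.match(s, i)
        if not m:
            raise ValueError('unexpected character at %d: %r' % (i, s[i]))
        if m.group('num') is not None:
            tokens.append(('num', float(m.group('num'))))
        else:
            tokens.append(('op', m.group('op')))
        i = m.end()
    return tokens

if __name__ == '__main__':
    print(lex('1e-9'))   # a single number token
    print(lex('1 - 9'))  # number, operator, number
```

Because the sign is only consumed when it directly follows `e` and is itself followed by digits, an expression such as `1e9-9` still lexes as a subtraction.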
Christopher Haster	8666830515	bmap: scripts: Fixed missing geometry race condition The gbmap's weight is defined by the block count stored in the geometry config field, which should always be present in valid littlefs3 images. But our scripts routinely try to parse _invalid_ littlefs3 images when running in parallel with benchmarks/tests (littlefs3 does _not_ support multiple read/writers), so this was causing exceptions to be thrown. The fix is to just assume weight=0 when the geometry field is missing. The image isn't valid, and the gbmap is optional anyways.	2025-10-01 17:56:13 -05:00
Christopher Haster	ebae43898e	bmap: Changing direction, store bmap mode in wcompat flags The idea behind separate ctrled+unctrled airspaces was to try to avoid multiple interpretations of the on-disk bmap, but I'm starting to think this adds more complexity than it solves. The main conflict is the meaning of "in-flight" blocks. When using the "uncontrolled" bmap algorithm, in-flight blocks need to be double-checked by traversing the filesystem. But in the "controlled" bmap algorithm, blocks are only marked as "in-flight" while they are truly in-flight (in-use in RAM, but not yet in use on disk). Representing these both with the same "in-flight" state risks incompatible algorithms misinterpreting the bmap across different mounts. In theory the separate airspaces solve this, but now all the algorithms need to know how to convert the bmap from different modes, adding complexity and code cost. Well, in theory at least. I'm unsure separate airspaces actually solves this due to subtleties between what "in-flight" means in the different algorithms (note both in-use and free blocks are "in-flight" in the unknown airspace!). It really depends on how the "controlled" algorithm actually works, which isn't implemented/fully designed yet. --- Long story short, due to a time crunch, I'm ripping this out for now and just storing the current algorithm in the wcompat flags: LFS3_WCOMPAT_GBMAP 0x00006000 Global block-map in use LFS3_WCOMPAT_GBMAPNONE 0x00000000 Gbmap not in use LFS3_WCOMPAT_GBMAPCACHE 0x00002000 Gbmap in cache mode LFS3_WCOMPAT_GBMAPVFR 0x00004000 Gbmap in VFR mode LFS3_WCOMPAT_GBMAPIFR 0x00006000 Gbmap in IFR mode Note GBMAPVFR/IFR != BMAPSLOW/FAST! At least BMAPSLOW/FAST can share bmap representations: - GBMAPVFR => Uncontrolled airspace, i.e. in-flight blocks may or may not be in use, need to traverse open files. - GBMAPIFR => Controlled airspace, i.e. in-flight blocks are in use, at least until powerloss, no traversal needed, but requires more bmap writes. 
- BMAPSLOW => Treediff by checking what blocks are in B but not in A, and what blocks are in A but not in B, O(n^2), but minimizes bmap updates. Can be optimized with a bloom filter. - BMAPFAST => Treediff by clearing all blocks in A, and then setting all blocks in B, O(n), but also writes all blocks to the bmap twice even on small changes. Can be optimized with a sliding bitmap window (or a block hashtable, though a bitmap converges to the same thing in both algorithms when >=disk_size). It will probably be worth unifying the bmap representation later (the more algorithm-specific flags there are, the harder interop becomes for users), but for now this opens a path to implementing/experimenting with bmap algorithms without dealing with this headache.	2025-10-01 17:56:08 -05:00
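The BMAPSLOW/BMAPFAST tradeoff can be illustrated with a toy model. This is a sketch only: the bmap is modeled as a plain dict, the `'free'`/`'inuse'` state names and both function names are hypothetical, and nothing here reflects the on-disk format or the real implementation.

```python
def treediff_slow(bmap, a, b):
    """Update only blocks that differ between trees a and b.

    With plain lists, each membership test is a linear scan, hence
    O(n^2) overall, but the number of bmap writes is minimal.
    """
    writes = 0
    for block in a:
        if block not in b:       # in A but not in B => now free
            bmap[block] = 'free'
            writes += 1
    for block in b:
        if block not in a:       # in B but not in A => now in use
            bmap[block] = 'inuse'
            writes += 1
    return writes

def treediff_fast(bmap, a, b):
    """Clear everything in a, then set everything in b.

    O(n), but every block is written even if it did not change.
    """
    writes = 0
    for block in a:
        bmap[block] = 'free'
        writes += 1
    for block in b:
        bmap[block] = 'inuse'
        writes += 1
    return writes

if __name__ == '__main__':
    old, new = [1, 2, 3], [2, 3, 4]
    start = {1: 'inuse', 2: 'inuse', 3: 'inuse', 4: 'free'}
    slow, fast = dict(start), dict(start)
    print(treediff_slow(slow, old, new), treediff_fast(fast, old, new))
    print(slow == fast)
```

Both variants converge on the same final bmap; they differ only in lookup cost versus the number of bmap writes.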
Christopher Haster	e7c3755e21	bmap: Split known into ctrled+unctrled	2025-10-01 17:56:05 -05:00
Christopher Haster	98f016b07e	bmap: Added initial gbatc interactions, up until out-of-known or remount This only works immediately after format, and only for one pass of the disk, but it's a good way to test bmap lookups/allocation without worrying about more complicated filesystem-wide interactions.	2025-10-01 17:55:31 -05:00
Christopher Haster	59a4ae6f61	bmap: Taught littlefs how to traverse the gbmap Fortunately the btree traversal logic is pretty reusable, so this just required an additional tstate (LFS3_TSTATE_BMAP). This raises an interesting question: _when_ do we traverse the bmap? We need to wait until at least mtree traversal completes for gstate to be reconstructed during lfs3_mount, but I think traversing before file btrees makes sense.	2025-10-01 17:55:27 -05:00
Christopher Haster	5f65b49ef8	bmap: scripts: Added on-disk bmap traversal to dbgbmap and friends And yes, dbgbmapsvg.py's parents are working, thanks to a hacky blocks @property (Python to the rescue!)	2025-10-01 17:55:16 -05:00
Christopher Haster	88180b6081	bmap: Initial scaffolding for on-disk block map This is pretty exploratory work, so I'm going to try to be less thorough in commit messages until the dust settles. --- New tag for gbmapdelta: LFS3_TAG_GBMAPDELTA 0x0104 v--- ---1 ---- -1rr New tags for in-bmap block types: LFS3_TAG_BMRANGE 0x033u v--- --11 --11 uuuu LFS3_TAG_BMFREE 0x0330 v--- --11 --11 ---- LFS3_TAG_BMINFLIGHT 0x0331 v--- --11 --11 ---1 LFS3_TAG_BMINUSE 0x0332 v--- --11 --11 --1- LFS3_TAG_BMBAD 0x0333 v--- --11 --11 --11 LFS3_TAG_BMERASED 0x0334 v--- --11 --11 -1-- New gstate decoding for gbmap: .---+- -+- -+- -+- -. cursor: 1 leb128 <=5 bytes \| cursor \| known: 1 leb128 <=5 bytes +---+- -+- -+- -+- -+ block: 1 leb128 <=5 bytes \| known \| trunk: 1 leb128 <=4 bytes +---+- -+- -+- -+- -+ cksum: 1 le32 4 bytes \| block \| total: 23 bytes +---+- -+- -+- -+- -' \| trunk \| +---+- -+- -+- -+ \| cksum \| '---+---+---+---' New bmap node revdbg string: vvv---- -111111- -11---1- -11---1- (62 62 7e v0 bb~r) bmap node New mount/format/info flags (still unsure about these): LFS3_M_BMAPMODE 0x03000000 On-disk block map mode LFS3_M_BMAPNONE 0x00000000 Don't use the bmap LFS3_M_BMAPCACHE 0x01000000 Use the bmap to cache lookahead scans LFS3_M_BMAPSLOW 0x02000000 Use the slow bmap algorithm LFS3_M_BMAPFAST 0x03000000 Use the fast bmap algorithm New gbmap wcompat flag: LFS3_WCOMPAT_GBMAP 0x00002000 Global block-map in use	2025-10-01 17:55:13 -05:00
Christopher Haster	4b7a5c9201	trv: Renamed OMDIRS -> HANDLES, OBTREE -> HBTREE Looks like these traversal states were missed in the omdir -> handle rename. I think HANDLES and HBTREE states make sense: - LFS3_TSTATE_OMDIRS -> LFS3_TSTATE_HANDLES - LFS3_TSTATE_OBTREE -> LFS3_TSTATE_HBTREE	2025-07-21 16:47:24 -05:00
Christopher Haster	c87361508b	scripts: test.py/bench.py: Added --no-internal to skip internal tests The --no-internal flag avoids building any internal tests/benches (tests/benches with in="lfs3.c"), which can be useful for quickly testing high-level things while refactoring. Refactors tend to break all the internal tests, and it can be a real pain to update everything. Note that --no-internal can be injected into the build with TESTCFLAGS: TESTCFLAGS=--no-internal make test-runner -j \ && ./scripts/test.py -j -b For a curious data point, here's the current number of internal/non-internal tests: suites cases perms total: 24 808 633968/776298 internal: 22 (91.7%) 532 (65.8%) 220316/310247 (34.8%) non-internal: 2 ( 8.3%) 276 (34.2%) 413652/466051 (65.2%) It's interesting to note that while internal tests have more test cases, the non-internal tests generate a larger number of test permutations. This is probably because internal tests tend to target specific corner cases/known failure points, and don't invite many variants. --- While --no-internal may be useful for high-level testing during a refactor, I'm not sure it's a good idea to rely on it for _debugging_ a refactor. The whole point of internal testing is to catch low-level bugs early, with as little unnecessary state as possible. Skipping these to debug integration tests is a bit counterproductive!	2025-07-20 09:53:53 -05:00
Christopher Haster	7b330d67eb	Renamed config -> cfg Note this includes both the lfs3_config -> lfs3_cfg structs as well as the LFS3_CONFIG -> LFS3_CFG include define: - LFS3_CONFIG -> LFS3_CFG - struct lfs3_config -> struct lfs3_cfg - struct lfs3_file_config -> struct lfs3_file_cfg - struct lfs3_bd_config -> struct lfs3_bd_cfg - cfg -> cfg We were already using cfg as the variable name everywhere. The fact that these names were different was an inconsistency that should be fixed since we're committing to an API break. LFS3_CFG is already out-of-date from upstream, and there's plans for a config rework, but I figured I'd go ahead and change it as well to lower the chances it gets overlooked. --- Note this does _not_ affect LFS3_TAG_CONFIG. Having the on-disk vs driver-level config take slightly different names is not a bad thing.	2025-07-18 18:29:41 -05:00
Christopher Haster	457a0c0487	alloc: Added the concept of block allocator flags Currently this just has one flag that replaces the previous `erase` argument: LFS3_ALLOC_ERASE 0x00000001 Please erase the block Benefits include: - Slightly better readability at lfs3_alloc call sites. - Possibility of more allocator flags in the future: - LFS3_ALLOC_EMERGENCY - Use reserved blocks - Uh, that's all I can think of right now No code changes.	2025-07-18 16:42:45 -05:00
Christopher Haster	0828fd9bf3	Reverted LFS3_CKDATACKSUMREADS -> LFS3_CKDATACKSUMS LFS3_CKDATACKSUMREADS is just too much. The downside is it may not be clear how LFS3_CKDATACKSUMREADS interacts with the future planned LFS3_CKREADS (LFS3_CKREADS implies LFS3_CKDATACKSUMS + LFS3_CKMETAREDUND), but on the flip side you may actually be able to type LFS3_CKDATACKSUMS on the first try.	2025-07-16 14:25:20 -05:00
Christopher Haster	29e1701964	scripts: gdb: Globbed all dbg scripts into dbg.gdb.py This goes ahead and makes all dbg scripts available in dbg.gdb.py, via the magic of globbing __file__ relative, and dynamic python class generation. Probably one of the more evil scripts I've written, but this means we don't need to worry about dbg.gdb.py falling out-of-date when adding new dbg scripts. Not all of the dbg scripts are useful inside gdb, but most of them are. After all, what's cooler than this! (gdb) dbgrbyd -b4096 "disk" -t \ file->b.shrub.blocks[0] \ --trunk lfs3_rbyd_trunk(&file->b.shrub) rbyd 0x46.23a w2048, rev 00000000, size 629, cksum 8f5169e1 00000004: .-> 0-334 data w335 0 00000009: .-+-> 335 data w1 1 71 0000000e: \| .-> 336 data w1 1 67 00000013: .-+-+-> 337 data w1 1 66 ... 00000144: \| \| \| \| .-> 350 data w1 1 74 0000019a: \| \| \| \| .-+-> 351 data w1 1 78 000001f5: \| \| \| \| \| .-> 352-739 data w388 1 76 00000258: +-+-+-+-+-+-> 740-2047 data w1308 1 6c Note some tricks to help interact with bash and gdb: - Flags are passed as is (-b4096, -t, --trunk) - All non-flags are parsed as expressions (file->b.shrub.blocks[0]) - String expressions may be useful for paths and stuff ("./disk")	2025-07-04 18:55:46 -05:00
Christopher Haster	090611af14	scripts: dbgflags.py: Tweaked internals for readability Mainly just using 'P_NAME' instead of 'P', 'NAME' in the FLAGS table, every bit of horizontal spacing helps with these definitions.	2025-07-04 18:08:11 -05:00
Christopher Haster	19747f691e	scripts: dbgflags.py: Reimplemented filters as flags So instead of: $ ./scripts/dbgflags.py o 0x10000003 The filter is now specified as a normal(ish) argparse flag: $ ./scripts/dbgflags.py --o 0x10000003 This is a bit easier to interop with in dbg.gdb.py, and I think a bit more readable. Though -a and --a now do _very_ different things. I'm sure that won't confuse anyone...	2025-07-04 18:08:11 -05:00
Christopher Haster	0c19a68536	scripts: test.py/bench.py: Added support for multiple header files Like test.py --gdb-script, being able to specify multiple header files seems useful and is easy enough to add. --- Note that the default is only used if no other header files are specified, so this _replaces_ the default header file: $ ./scripts/test.py --include=my_header.h If you don't want to replace the default header file, you currently need to specify it explicitly: $ ./scripts/test.py \ --include=runners/test_runner.h \ --include=my_header.h	2025-07-04 18:08:11 -05:00
Christopher Haster	0b804c092b	scripts: gdb: Added some useful GDB scripts to test.py --gdb These just invoke the existing dbg*.py python scripts, but allow quick references to variables in the debugged process: (gdb) dbgflags o file->b.o.flags LFS3_O_RDWR 0x00000002 Open a file as read and write LFS3_o_REG 0x10000000 Type = regular-file LFS3_o_UNSYNC 0x01000000 File's metadata does not match disk Quite neat and useful! This works by injecting dbg.gdb.py via gdb -x, which includes the necessary python hooks to add these commands to gdb. This can be overridden/extended with test.py/bench.py's --gdb-script flag. Currently limited to scripts that seem the most useful for process internals: - dbgerr - Decode littlefs error codes - dbgflags - Decode littlefs flags - dbgtag - Decode littlefs tags	2025-07-04 18:08:04 -05:00
Christopher Haster	a85f08cfe3	Dropped lazy grafting, but kept lazy crystallization This merges LFS3_o_GRAFT into LFS3_o_UNCRYST, simplifying the file write path and avoiding the mess that is ungrafted leaves. --- This goes for a different lazy crystallization/grafting strategy that was overlooked before. Instead of requiring all leaves to be both crystallized and grafted, we allow leaves to be uncrystallized, but they _must_ be grafted (in-tree) at all times. This gets us most of the rewrite performance of lazy-crystallization, without needing to worry about out-of-date file leaves. Out-of-date file leaves were a headache for both code cost and concerns around confusing filesystem states and related bugs. Note LFS3_o_UNCRYST gets some extra behavior here: - LFS3_o_UNCRYST indicates when crystallization is _necessary_, and no longer when crystallization is _possible_. We already keep track of when crystallization is _possible_ via bptr's erased-state, and this lets us control recrystallization in lfs3_file_flush_ without erased-state-clearing hacks (which probably wouldn't work with the future ddtree). - We opportunistically clear the UNCRYST flag if it's not possible for future lfs3_file_crystallize_ calls to make progress: - When we crystallize a full block - When we hit the end of the file - When we hit a hole - When we hit an unaligned block --- Note this does impact performance! Unlike true lazy grafting, eagerly grafting means we're always committing to the bshrub/btree more than is strictly necessary, and this translates to more frequent btree node erases/compactions. Current simulated benchmarks show a ~3x increase (~20us -> ~60us) in write times for linear file writes on NOR flash. However: - The moment you need unaligned progs, this performance optimization goes out the window, as we need to graft bptrs before any padding fragments. - This only kicks in once we start crystallizing. 
So any writes < crystal_thresh (both in new files and in between blocks) are forced to commit to the bshrub/btree every flush. This risks a difficult to predict performance characteristic. - If you sync frequently (logging), we're forced to crystallize/graft anyways. - The performance hit can be alleviated with either larger writes or larger caches, though I realize this goes against littlefs's "RAM-not-required" mantra. Worst case, we can always bring back "lazy grafting" as a high-performance option in the future. Though note the above concerns around in-between/pre crystallization performance. This may only make sense when cache_size >= both prog_size and crystal_thresh. And of course, there's a significant code tradeoff! code stack ctx before: 38020 2456 656 after: 37588 (-1.1%) 2472 (+0.7%) 656 (+0.0%) Uh, ignore that stack cost. The simplified logic leads to more functions being inlined, which makes a mess of our stack measurements because we don't take shrinkwrapping into account.	2025-07-03 18:04:18 -05:00
Christopher Haster	8cc81aef7d	scripts: Adopt __get__ binding for write/writeln methods This actually binds our custom write/writeln functions as methods to the file object: def writeln(self, s=''): self.write(s) self.write('\n') f.writeln = writeln.__get__(f) This doesn't really gain us anything, but is a bit more correct and may be safer if other code messes with the file's internals.	2025-06-27 12:56:03 -05:00
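The `__get__` binding trick above can be tried in isolation. The following is a minimal runnable sketch using `io.StringIO` as a stand-in for the file object; the choice of `io.StringIO` and the sample strings are assumptions for illustration, not taken from the scripts.

```python
import io

def writeln(self, s=''):
    """Write s followed by a newline."""
    self.write(s)
    self.write('\n')

f = io.StringIO()

# functions are descriptors, so __get__ returns a method bound to f,
# the same kind of object normal attribute lookup on a class produces
f.writeln = writeln.__get__(f)

f.writeln('hello')
f.writeln()

if __name__ == '__main__':
    print(repr(f.getvalue()))
    print(f.writeln.__self__ is f)
```

Compared to assigning a closure or lambda, the bound method carries a real `__self__`, so it behaves like any other method of the object.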
Christopher Haster	213dba6f6d	scripts: test.py/bench.py: Added ifndef attribute for tests/benches As you might expect, this is the inverse of ifdef, and is useful for supporting opt-out flags. I don't think ifdef + ifndef is powerful enough to handle _all_ compile-time corner cases, but they at least provide convenient handling for the most common flags. Worst case, tests/benches can always include explicit #if/#ifdef/#ifndef statements in the code itself.	2025-06-24 15:17:04 -05:00
Christopher Haster	f967cad907	kv: Adopted LFS3_o_WRSET for better key-value API integration This adds LFS3_o_WRSET as an internal-only 3rd file open mode (I knew that missing open mode would come in handy) that has some _very_ interesting behavior: - Do _not_ clear the configured file cache. The file cache is prefilled with the file's data. - If the file does _not_ exist and is small, create it immediately in lfs3_file_open using the provided file cache. - If the file _does_ exist or is not small, do nothing and open the file normally. lfs3_file_close/sync can do the rest of the work in one commit. This makes it possible to implement one-commit lfs3_set on top of the file APIs with minimal code impact: - All of the metadata commit logic can be handled by lfs3_file_sync_, we just call lfs3_file_sync_ with the found did+name in lfs3_file_opencfg when WRSET. - The invariant that lfs3_file_opencfg always reserves an mid remains intact, since we go ahead and write the full file if necessary, minimizing the impact on lfs3_file_opencfg's internals. This claws back most of the code cost of the one-commit key-value API: code stack ctx before: 38232 2400 636 after: 37856 (-1.0%) 2416 (+0.7%) 636 (+0.0%) before kv: 37352 2280 636 after kv: 37856 (+1.3%) 2416 (+6.0%) 636 (+0.0%) --- I'm quite happy how this turned out. I was worried there for a bit the key-value API was going to end up an ugly wart for the internals, but with LFS3_o_WRSET this integrates quite nicely. It also raises a really interesting question, should LFS3_o_WRSET be exposed to users? For now I'm going to play it safe and say no. While potentially useful, it's still a pretty unintuitive API. Another thing worth mentioning is that this does have a negative impact on compile-time gc. Duplication adds code cost when viewing the system as a whole, but tighter integration can backfire if the user never calls half the APIs. 
Oh well, compile-time opt-out is always an option in the future, and users seem to care more about pre-linked measurements, probably because it's an easier thing to find. Still, it's funny how measuring code can have a negative impact on code. Something something Goodhart's law.	2025-06-22 15:37:07 -05:00
Christopher Haster	6eba1180c8	Big rename! Renamed lfs -> lfs3 and lfsr -> lfs3	2025-05-28 15:00:04 -05:00
Christopher Haster	9f1d6cf1db	scripts: Big script cleanup! Kinda. It's actually only 3 scripts. These have been replaced with the new dbg*.py scripts: - readblock.py -> dbgblock.py - readmdir.py -> dbgrbyd.py - readtree.py -> dbglfs.py	2025-05-27 21:05:56 -05:00
Christopher Haster	bce8f45a64	scripts: Tried to better document ansi color codes	2025-05-25 13:00:11 -05:00
Christopher Haster	a991c39f29	scripts: Dropped max-width from generated svgs Not sure what the point of this was, I think it was copied from a d3 example svg at some point. But it forces the svg to always fit in the window, even if this makes the svg unreadable. These svgs tend to end up questionably large in order to fit in the most info, so the unreadableness ends up a real problem for even modest window sizes.	2025-05-25 12:56:16 -05:00
Christopher Haster	f7e17c8aad	Added LFS_T_RDONLY, LFS_T_RDWR, etc These mimic the relevant LFS_O_* flags, and allow users to assert whether or not a traversal will mutate the filesystem: LFS_T_MODE 0x00000001 The traversal's access mode LFS_T_RDWR 0x00000000 Open traversal as read and write LFS_T_RDONLY 0x00000001 Open traversal as read only In theory, these could also change internal allocations, but littlefs doesn't really work that way. Note we _don't_ add related LFS_GC_RDONLY, LFS_GC_RDWR, etc flags. These are sort of implied by the relevant LFS_M_* flags. Adds a bit more code, probably because of the slightly more complicated internal constants for the internal traversals. But I think the self-documentingness is worth it: code stack ctx before: 37200 2288 636 after: 37220 (+0.1%) 2288 (+0.0%) 636 (+0.0%)	2025-05-24 23:27:10 -05:00
Christopher Haster	5b74aafa17	Reworked the flag encoding again This time to account for the new LFS_o_UNCRYST and LFS_o_UNGRAFT flags. This required moving the T flags out of the way, which of course conflicted with TSTATE, so that had to move... One thing that helped was shoving LFS_O_DESYNC up with the internal state flags. It's definitely more a state flag than the other public flags, it just also happens to be user toggleable. Here's the new jenga: 8 8 8 8 .----++----++----++----. .-..----..-..-..-------. o_flags: \|t\|\| f \|\|o\|\|t\|\| o \| \|-\|\|-.--':-:\|-\|'--.-.--' \|-\|\|-\|.----.\|-'--------. t_flags: \|t\|\|f\|\|tstt\|\| t \| '-''-''----'\|----.-----' .----..-.:-:\|----\|:-:.-. m_flags: \| m \|\|c\|\|o\|\| t \|\|o\|\|m\| \|----\|\|-\|'-'\|-.--''-''-' \|----\|\|-\|---\|-\|.-------. f_flags: \| m \|\|c\| \|t\|\| f \| '----''-'---'-''-------' This adds a bit of code, but that's not the end of the world: code stack ctx before: 37172 2288 636 after: 37200 (+0.1%) 2288 (+0.0%) 636 (+0.0%)	2025-05-24 22:21:39 -05:00
Christopher Haster	f5dd6f69e8	Renamed LFS_CKMETAPARITY and LFS_CKDATACKSUMREADS - LFS_CKPARITY -> LFS_CKMETAPARITY - LFS_CKDATACKSUMS -> LFS_CKDATACKSUMREADS The goal here is to provide hints for 1. what is being checked (META, DATA, etc), and 2. on what operation (FETCHES, PROGS, READS, etc). Note that LFS_CKDATACKSUMREADS is intended to eventually be a part of a set of flags that can pull off closed fully-checked reads: - LFS_CKMETAREDUNDREADS - Check metadata redund blocks on reads - LFS_CKDATACKSUMREADS - Check data checksums on reads - LFS_CKREADS - LFS_CKMETAREDUNDREADS + LFS_CKDATACKSUMREADS Also it's probably not a bad idea for LFS_CKMETAPARITY to be harder to use. It's really not worth enabling unless you understand its limitations (<1 bit of error detection, yay). No code changes.	2025-05-24 21:55:45 -05:00
Christopher Haster	6d9c077261	Reordered LFSR_TAG_NAMELIMIT/FILELIMIT Not sure why, but this just seems more intuitive/correct. Maybe because LFSR_TAG_NAME is always the first tag in a file's attr set: LFSR_TAG_NAMELIMIT 0x0039 v--- ---- --11 1--1 LFSR_TAG_FILELIMIT 0x003a v--- ---- --11 1-1- Seeing as several parts of the codebase still use the previous order, it seems reasonable to switch back to that. No code changes.	2025-05-24 21:51:06 -05:00
Christopher Haster	1cce0dab5c	Reverted limiting file->leaf to reads + erased-state caching Still on the fence about this, but in hindsight the code/stack difference is not _that_ much: code stack ctx before: 36460 2280 636 after: 37092 (+1.7%) 2304 (+1.1%) 636 (+0.0%) Especially with the potential to significantly speed up linear file writes/rewrites, which are usually the most common file operation. You ever just, you know, write a whole file at once? Note we can still add the previous behavior as an opt-in write strategy to save code/stack when preferred over linear write/rewrite speed. This is actually the main reason I think we should prefer lazy-crystallization by default. Of the theoretical/future write strategies, lazy-crystallization was the only one trading performance for code/stack and not vice versa (global-alignment, linear-only, fully-fragmented, etc). If we default to a small, but less performant filesystem, it risks users thinking littlefs is slow when they just haven't turned on the right flags. That being said there's a balance here. Users will probably judge littlefs based on its default code size for the same reason. --- Note this includes the generalized lfsr_file_crystallize_ API, which adds a bit of code: code stack ctx before gen-cryst: 37084 2304 636 after gen-cryst: 37092 (+0.0%) 2304 (+0.0%) 636 (+0.0%)	2025-05-23 19:48:56 -05:00
Christopher Haster	22c43124de	Limited file->leaf to reads + erased-state caching This reverts most of the lazy-grafting/crystallization logic, but keeps the general crystallization algorithm rewrite and file->leaf for caching read operations and erased-state. Unfortunately lazy-grafting/crystallization is both a code and stack heavy feature for a relatively specific write pattern. It doesn't even help if we're forced to write fragments due to prog alignment. Dropping lazy-grafting/crystallization trades off linear write/rewrite performance for code and stack savings: code stack ctx before: 37084 2304 636 after: 36428 (-1.8%) 2248 (-2.4%) 636 (+0.0%) But with file->leaf we still keep the improvements to linear read performance! Compared to pre-file->leaf: code stack ctx before file->leaf: 36016 2296 636 after lazy file->leaf: 37084 (+3.0%) 2304 (+0.3%) 636 (+0.0%) after eager file->leaf: 36428 (+1.1%) 2248 (-2.1%) 636 (+0.0%) I'm still on the fence about this, but lazy-grafting/crystallization is just a lot of code... And the first 6 letters of littlefs don't spell "speedy" last time I checked... At the very least we can always add lazy-grafting/crystallization as an opt-in write strategy later.	2025-05-23 15:22:33 -05:00
Christopher Haster	9c3a866508	Reworked crystallization to better use erased-state on rewrites This adopts lazy crystallization in _addition_ to lazy grafting, managed by separate LFS_o_UNCRYST and LFS_o_UNGRAFT flags: LFS_o_UNCRYST 0x00400000 File's leaf not fully crystallized LFS_o_UNGRAFT 0x00800000 File's leaf does not match bshrub/btree This lets us graft not-fully-crystallized blocks into the tree without needing to fully crystallize, avoiding repeated recrystallizations when linearly rewriting a file. Long story short, this gives file rewrites roughly the same performance as linear file writes. --- In theory you could also have fully crystallized but ungrafted blocks (UNGRAFT + ~UNCRYST), but this doesn't happen with the current logic. lfsr_file_crystallize eagerly grafts blocks once they're crystallized. Internally, lfsr_file_crystallize replaces lfsr_file_graft for the "don't care, gimme file->leaf" operation. This is analogous to lfsr_file_flush for file->cache. Note we do _not_ use LFS_o_UNCRYST to track erased-state! If we did, erased-state wouldn't survive lfsr_file_flush! --- Of course, this adds even more code. Fortunately not _that_ much considering how many lines of code changed: code stack ctx before: 37012 2304 636 after 37084 (+0.2%) 2304 (+0.0%) 636 (+0.0%) There is another downside however, and that's that our benchmarked disk usage is slightly worse during random writes. I haven't fully investigated this, but I think it's due to more temporary fragments/blocks in the B-tree before flushing. This can cause B-tree inner nodes to split earlier than when eagerly recrystallizing. This also leads to higher disk usage pre-flush since we keep both the old and new blocks around while uncrystallized, but since most rewrites are probably going to be CoW on top of committed files, I don't think this will be a big deal. Note the disk usage ends up the same after lfsr_file_flush.	2025-05-23 15:13:56 -05:00
Christopher Haster	9ed326f3d3	Adopted file->leaf, reworked how we track crystallization TLDR: Added file->leaf, which can track file fragments (read only) and blocks independently from file->b.shrub. This speeds up linear read/write performance at a heavy code/stack cost. The jury is still out on if this ends up reverted. --- This is another change motivated by benchmarking, specifically the significant regression in linear reads. The problem is that CTZ skip-lists are actually _really_ good at appending blocks! (but only appending blocks) The entire state of the file is contained in the last block, so file writes can resume without any reads. With B-trees, we need at least 1 B-tree lookup to resume appending, and this really adds up when writing extremely blocks. To try to mitigate this, I added file->leaf, a single in-RAM bptr for tracking the most recent leaf we've operated on. This avoids B-tree lookups during linear reads, and allowing the leaf to fall out-of-sync with the B-tree avoids both B-tree lookups and commits during writes. Unfortunately this isn't a complete win for writes. If we write fragments, i.e. cache_size < prog_size, we still need to incrementally commit to the B-tree. Fragments are a bit annoying for caching as any B-tree commit can discard the block they reside on. For reading, however, this brings read performance back to roughly the same as CTZ skip-lists. --- This also turned into more-or-less a full rewrite of the lfsr_file_flush -> lfsr_file_crystallize code path, which is probably a good thing. This code needed some TLC. file->leaf also replaces the previous eblock/eoff mechanism for erased-state tracking via the new LFSR_BPTR_ISERASED flag. This should be useful when exploring more erased-state tracking mechanisms (ddtree). Unfortunately, all of this additional in-RAM state is very costly. 
I think there's some cleanup that can be done (the current impl is a bit of a mess/proof-of-concept), but this does add a significant chunk of both code and stack: code stack ctx before: 36016 2296 636 after: 37228 (+3.4%) 2328 (+1.4%) 636 (+0.0%) file->leaf also increases the size of lfsr_file_t, but this doesn't show up in ctx because struct lfs_info dominates: lfsr_file_t before: 116 lfsr_file_t after: 136 (+17.2%) Hm... Maybe ctx measurements should use a lower LFS_NAME_MAX?	2025-05-23 12:15:13 -05:00
Christopher Haster	c44c43ac74	scripts: Renamed d3.py -> svg.py - codemapd3.py -> codemapsvg.py - dbgbmapd3.py -> dbgbmapsvg.py - treemapd3.py -> treemapsvg.py Originally these were named this way to match plotmpl.py, but these names were misleading. These scripts don't actually use the d3 library, they're just piles of Python, SVG, and Javascript, modelled after the excellent d3 treemap examples. Keeping the d3.py names around also felt a bit unfair to brendangregg's flamegraph SVGs, which were the inspiration for the interactive component. With d3 you would normally expect a rich HTML page, which is how you even include the d3 library. plotmpl.py is also an outlier in that it supports both .svg and .png output. So having a different naming convention in this case makes sense to me. So, renaming d3.py -> *svg.py. The inspiration from d3 is still mentioned in the top-level comments in the relevant files.	2025-05-15 19:09:09 -05:00
Christopher Haster	651c3e1eb4	scripts: Renamed Attr -> CsvAttr Mainly to avoid confusion with littlefs's attrs, uattrs, rattrs, etc. This risked things getting _really_ confusing as the scripts evolve.	2025-05-15 18:48:46 -05:00
Christopher Haster	aebe5b1d1b	scripts: plot[mpl].py: Added --x/ylim-stddev for data-dependent limits This adds --xlim-stddev and --ylim-stddev as alternatives to -X/--xlim and -Y/--ylim that define the plot limits in terms of standard deviations from the mean, instead of in absolute values. So want to only plot data within +-1 standard deviation? Use: $ ./scripts/plot.py --ylim-stddev=-1,+1 Want to ignore outliers >3 standard deviations? Use: $ ./scripts/plot.py --ylim-stddev=3 This is very useful for plotting the amortized/per-byte benchmarks, which have a tendency to run off towards infinity near zero. Before, we could truncate data explicitly with -Y/--ylim, but this was getting very tedious and doesn't work well when you don't know what the data is going to look like beforehand.	2025-05-15 18:23:09 -05:00
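The idea of limits defined in standard deviations from the mean can be sketched as follows. This is an assumption-laden illustration, not the code in plot.py: the function name `stddev_limits` is hypothetical, it uses the population standard deviation, and it only covers the two-value form (`--ylim-stddev=-1,+1`).

```python
import math

def stddev_limits(values, lo, hi):
    """Hypothetical sketch: return (mean + lo*stddev, mean + hi*stddev).

    Assumes the population standard deviation; the real scripts may
    compute this differently.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean)**2 for v in values) / len(values)
    stddev = math.sqrt(var)
    return mean + lo*stddev, mean + hi*stddev

# roughly what --ylim-stddev=-1,+1 asks for
print(stddev_limits([2, 4, 4, 4, 5, 5, 7, 9], -1, +1))
```

Because the limits follow the data, the same flag works across benchmarks whose scale is not known beforehand.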
Christopher Haster	275ca0e0ec	scripts: bench.py: Fixed issue where cumul results were mixed together Whoops, looks like cumulative results were overlooked when multiple bench measurements per bench were added. We were just adding all cumulative results together! This led to some very confusing bench results. The solution here is to keep track of per-measurement cumulative results via a Python dict. Which adds some memory usage, but definitely not enough to be noticeable in the context of the bench-runner.	2025-05-15 16:16:41 -05:00
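A minimal sketch of the fix described above, keeping one running total per measurement in a dict instead of a single shared total. The function name and the `(name, value)` input shape are assumptions for illustration, not bench.py's actual data structures.

```python
from collections import defaultdict

def accumulate(results):
    """Hypothetical sketch: results is an iterable of
    (measurement_name, value) pairs; returns a list of
    (name, value, cumulative) with a separate running total per name."""
    totals = defaultdict(int)
    out = []
    for name, value in results:
        totals[name] += value
        out.append((name, value, totals[name]))
    return out

print(accumulate([("a", 1), ("b", 10), ("a", 2), ("b", 20)]))
```

With a single shared total, the interleaved "a" and "b" measurements would have been summed together, which is the confusion the commit describes.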
Christopher Haster	48daeed509	scripts: Fixed rounding-towards-zero issue in si/si2 prefixes This should be floor (rounds towards -inf), not int (rounds towards zero), otherwise sub-integer results get funky: - floor si(0.00001) => 10u - int si(0.00001) => 0.01m - floor si(0.000001) => 1u - int si(0.000001) => m (???)	2025-05-15 16:08:04 -05:00
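The difference between floor and int when picking an SI prefix can be sketched like this. The prefix table, clamping, and formatting here are assumptions for illustration; only the floor-vs-int behavior is taken from the commit message.

```python
import math

# Hypothetical prefix table, indexed by power of 1000.
SI_PREFIXES = {-3: 'n', -2: 'u', -1: 'm', 0: '', 1: 'K', 2: 'M', 3: 'G'}

def si(x, rounder=math.floor):
    """Format x with an SI prefix; rounder picks the power of 1000."""
    if x == 0:
        return '0'
    # floor rounds towards -inf, int rounds towards zero
    p = rounder(math.log(abs(x), 10**3))
    p = max(min(p, 3), -3)
    return '%g%s' % (round(x / 1000.0**p, 3), SI_PREFIXES[p])

print(si(0.00001, math.floor))
print(si(0.00001, int))
```

For values below one the logarithm is negative, so rounding towards zero selects a prefix one step too large and leaves a sub-integer mantissa.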
Christopher Haster	e606e82ecb	scripts: plotmpl.py: Fixed -X/--xlim not considering all datasets This was a simple typo. Unfortunately went unnoticed because the lingering dataset assigned in the above for loop made the results look mostly correct. Yay.	2025-05-15 16:07:40 -05:00
Christopher Haster	c04f36ead4	scripts: plot[mpl].py: Adopted -s/--sort and -S for legend sorting Before this, the only option for ordering the legend was by specifying explicit -L/--add-label labels. This works for the most part, but doesn't cover the case where you don't know the parameterization of the input data. And we already have -s/-S flags in other csv scripts, so it makes sense to adopt them in plot.py/plotmpl.py to allow sorting by one or more explicit fields. Note that -s/-S can be combined with explicit -L/--add-labels to order datasets with the same sort field: $ ./scripts/plot.py bench.csv \ -bBLOCK_SIZE \ -xn \ -ybench_readed \ -ybench_proged \ -ybench_erased \ --legend \ -sBLOCK_SIZE \ -L',bench_readed=bs=%(BLOCK_SIZE)s' \ -L',bench_proged=' \ -L'*,bench_erased=' --- Unfortunately this conflicted with -s/--sleep, which is a common flag in the ascii-art scripts. This was bound to conflict with -s/--sort eventually, so I came up with some alternatives: - -s/--sleep -> -~/--sleep - -S/--coalesce -> -+/--coalesce But I'll admit I'm not the happiest about these...	2025-05-15 15:51:49 -05:00
Christopher Haster	d4c772907d	scripts: csv.py: Fixed completely broken float parsing Whoops! A missing splat repetition here meant we only ever accepted floats with a single digit of precision and no e/E exponents. Humorously this went unnoticed because our scripts were only _outputting_ single digit floats, but now that that's fixed, float parsing also needs a fix. Fixed by allowing >1 digit of precision in our CsvFloat regex.	2025-05-15 15:44:30 -05:00
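To illustrate the kind of bug described above, here is a sketch with two hypothetical patterns. Neither is the actual CsvFloat regex; they only show how a missing repetition limits a float pattern to one digit of precision and no exponent.

```python
import re

# Hypothetical "broken" pattern: exactly one digit after the point,
# no e/E exponent.
BROKEN = re.compile(r'[+-]?[0-9]+(?:\.[0-9])?')

# Hypothetical "fixed" pattern: any number of fractional digits and an
# optional exponent.
FIXED = re.compile(r'[+-]?[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]+)?')

for s in ['0.1', '0.12', '1.5e-3']:
    print(s, bool(BROKEN.fullmatch(s)), bool(FIXED.fullmatch(s)))
```

A writer that only ever emits single-digit floats round-trips through the broken pattern without complaint, which matches how the commit says the bug went unnoticed.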
Christopher Haster	d5b28df33a	scripts: Fixed excessive rounding when writing floats to csv/json files This adds __csv__ methods to all Csv* classes to indicate how to write csv/json output, and adopts Python's default float repr. As a plus, this also lets us use "inf" for infinity in csv/json files, avoiding potential unicode issues. Before this we were reusing __str__ for both table rendering and csv/json writing, which rounded to a single decimal digit! This made float output pretty much useless outside of trivial cases. --- Note Python apparently does some of its own rounding (1/10 -> 0.1?), so the result may still not be round-trippable, but this is probably fine for our somewhat hack-infested csv scripts.	2025-05-15 15:44:30 -05:00
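The split between table rendering and csv/json writing might look roughly like the sketch below. `ExampleFloat` is a hypothetical stand-in, not one of the real Csv* classes; only the `__csv__` method name and the use of Python's default float repr come from the commit message.

```python
class ExampleFloat:
    """Hypothetical stand-in for a Csv* class."""
    def __init__(self, x):
        self.x = float(x)

    def __str__(self):
        # table rendering: rounded to a single decimal digit
        return '%.1f' % self.x

    def __csv__(self):
        # csv/json writing: Python's default float repr
        return repr(self.x)

v = ExampleFloat(0.123456)
print(str(v), v.__csv__())
print(ExampleFloat(float('inf')).__csv__())
```

Keeping the two methods separate means table output can stay compact without truncating what gets written to disk.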
Christopher Haster	43c2330edc	scripts: csv.py: Tweaked hidden fields to not imply -b/--by defaults So now the hidden variants of field specifiers can be used to manipulate by fields and field fields without implying a complete field set: $ ./scripts/csv.py lfs.code.csv \ -Bsubsystem=lfsr_file -Dfunction='lfsr_file_' \ -fcode_size Is the same as: $ ./scripts/csv.py lfs.code.csv \ -bfile -bsubsystem=lfsr_file -Dfunction='lfsr_file_' \ -fcode_size Attempting to use -b/--by here would delete/merge the file field, as csv.py assumes -b/-f specify all of the relevant field type. Note that fields can also be explicitly deleted with -D/--define's new glob support: $ ./scripts/csv.py lfs.code.csv -Dfile='*' -fcode_size --- This solves an annoying problem specific to csv.py, where manipulating by fields and field fields would often force you to specify all relevant -b/-f fields. With how benchmarks are parameterized, this list ends up _looong_. It's a bit of a hack/abuse of the hidden flags, but the alternative would be field globbing, which 1. would be a real pain-in-the-ass to implement, and 2. affect almost all of the scripts. Reusing the hidden flags for this keeps the complexity limited to csv.py.	2025-05-15 15:44:14 -05:00
Christopher Haster	7526b469b9	scripts: Adopted globs in all field matchers (-D/--define, -c/--compare) Globs in CLI attrs (-L'=bs=%(bs)s' for example), have been remarkably useful. It makes sense to extend this to the other flags that match against CSV fields, though this does add complexity to a large number of smaller scripts. - -D/--define can now use globs when filtering: $ ./scripts/code.py lfs.o -Dfunction='lfsr_file_' -D/--define already accepted a comma-separated list of options, so extending this to globs makes sense. Note this differs from test.py/bench.py's -D/--define. Globbing in test.py/bench.py wouldn't really work since -D/--define is generative, not matching. But there's already other differences such as integer parsing, range, etc. It's not worth making these perfectly consistent as they are really two different tools that just happen to look the same. - -c/--compare now matches with globs when finding the compare entry: $ ./scripts/code.py lfs.o -c'lfs_file_sync' This is quite a bit less useful than -D/--define, but makes sense for consistency. Note -c/--compare just chooses the first match. It doesn't really make sense to compare against multiple entries. This raised the question of globs in the field specifiers themselves (-f'bench_' for example), but I'm rejecting this for now as I need to draw the complexity/scope _somewhere_, and I'm worried it's already way over on the too-complex side. So, for now, field names must always be specified explicitly. Globbing field names would add too much complexity. Especially considering how many flags accept field names in these scripts.	2025-05-15 14:28:57 -05:00
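Glob-based filtering and first-match comparison, as described above, can be sketched with Python's standard `fnmatch` module. The function names, the dict-based row shape, and the example patterns are assumptions; the real scripts may implement matching differently.

```python
import fnmatch

def matches(defines, row):
    """Hypothetical -D/--define style filter.

    defines maps a field name to a list of glob patterns (the
    comma-separated options); a row passes if every defined field
    matches at least one of its patterns."""
    return all(
        any(fnmatch.fnmatchcase(row.get(field, ''), pat) for pat in pats)
        for field, pats in defines.items())

def compare_entry(pattern, names):
    """Hypothetical -c/--compare style lookup: first match wins."""
    for name in names:
        if fnmatch.fnmatchcase(name, pattern):
            return name
    return None

rows = [{'function': 'lfsr_file_read'}, {'function': 'lfs_alloc'}]
print([r for r in rows if matches({'function': ['lfsr_file_*']}, r)])
print(compare_entry('lfs_file_*', ['lfs_alloc', 'lfs_file_sync']))
```

`fnmatchcase` is used rather than `fnmatch` so that matching stays case-sensitive on every platform.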
Christopher Haster	55ea13b994	scripts: Reverted del to resolve shadowed builtins I don't know how I completely missed that this doesn't actually work! Using del _does_ work in Python's repl, but it makes sense the repl may differ from actual function execution in this case. The problem is Python still thinks the relevant builtin is a local variable after deletion, raising an UnboundLocalError instead of performing a global lookup. In theory this would work if the variable could be made global, but since global/nonlocal statements are lifted, Python complains with "SyntaxError: name 'list' is parameter and global". And that's A-Ok! Intentionally shadowing language builtins already puts this code deep into ugly hacks territory.	2025-05-15 14:10:42 -05:00
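The behavior described above can be reproduced with a small standalone example. The function names are hypothetical, and the `builtins`-module workaround is shown only as one possible alternative; it is not necessarily what the scripts reverted to.

```python
import builtins

def broken(list):
    n = len(list)
    del list
    # 'list' is still compiled as a local name in this function, so the
    # lookup below raises UnboundLocalError instead of finding the builtin
    return list(range(n))

def works(list):
    n = len(list)
    # one possible workaround (an assumption, not the scripts' approach):
    # reach the builtin explicitly through the builtins module
    return builtins.list(range(n))

try:
    broken([1, 2, 3])
except UnboundLocalError as e:
    print('UnboundLocalError:', e)

print(works([1, 2, 3]))

# declaring a parameter global is rejected at compile time
try:
    compile("def f(list):\n    global list\n", "<example>", "exec")
except SyntaxError as e:
    print('SyntaxError:', e.msg)
```

Whether a name is local is decided when the function is compiled, which is why deleting it at runtime does not restore the builtin lookup.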
Christopher Haster	48c1a016a0	scripts: Fixed missing tuple unpack in glob-all CLI attrs This was broken: $ ./scripts/plotmpl.py -L'*=bs=%(bs)s' There may be a better way to organize this logic, but spamming if statements works well enough.	2025-05-15 13:47:09 -05:00
Christopher Haster	4a50c5c9ce	scripts: dbgbmap[d3].py: Adopted slightly different row prioritization This still forces the block_rows_ <= height invariant, but also prevents ceiling errors from introducing blank rows. I guess the simplest solution is the best one, eh?	2025-04-30 02:30:31 -05:00


672 Commits