Commit Graph

50 Commits

Author SHA1 Message Date
3b4e1e9e0b gbmap: Renamed gbmap_rebuild_thresh -> gbmap_repop_thresh
And tweaked a few related comments.

I'm still on the fence with this name; I don't think it's great, but it
at least describes the "repopulation" operation better than
"rebuilding". The important distinction is that we don't throw away
information. Bad/erased block info (future) is still carried over into
the new gbmap snapshot, and persists unless you explicitly call
rmgbmap + mkgbmap.

So, adopting gbmap_repop_thresh for now to see if it's just a habit
thing, but may adopt a different name in the future.

As a plus, gbmap_repop_thresh is two characters shorter.
2025-10-23 23:51:18 -05:00
cb9bda5a94 gbmap: Renamed gbmap_scan_thresh -> gbmap_rebuild_thresh
I think a good rule of thumb is that if you refer to some variable/
config/field with a different name in comments/writing/etc more often
than not, you should just rename the variable/config/field to match.

So yeah, gbmap_rebuild_thresh controls when the gbmap is rebuilt.

Also touched up the doc comment a bit.
2025-10-09 14:33:27 -05:00
9b4ee982bc gbmap: Tried to adopt the gbmap name more consistently
Having gbmap/bmap used in different places for the same thing was
confusing. Preferring gbmap as it is consistent with other gstate (grm
queue, gcksums), even if it is a bit noisy.

It's interesting to note what didn't change:

- The BM* range tags: LFS3_TAG_BMFREE, etc. These already differ from
  the GBMAP* prefix enough, and adopting GBM* would risk confusion with
  actual gstate.

- The gbmap revdbg string: "bb~r". We don't have enough characters for
  anything else!

- dbgbmap.py/dbgbmapsvg.py. These aren't actually related to the gbmap,
  so the name difference is a good thing.
2025-10-09 14:33:27 -05:00
14d0c4121c bmap: Dropped treediff buffers for now
We're not currently using these (at the moment it's unclear if the
original intention behind the treediff algorithms is worth pursuing),
and they are showing up in our heap benchmarks.

The good news is that means our heap benchmarks are working.

Also saves a bit of code/ctx in bmap mode:

                code          stack          ctx
  before:      37024           2352          684
  after:       37024 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38752           2456          812
  bmap after:  38704 (-0.1%)   2456 (+0.0%)  800 (-1.5%)
2025-10-01 17:57:42 -05:00
316ca1cc05 bmap: The initial bmapcache algorithm seems to be working
At least at a proof-of-concept level. There's still a lot of cleanup
needed.

To make things work, lfs3_alloc_ckpoint now takes an mdir, which
provides the target for gbmap gstate updates.

When the bmap is close to empty (configurable via bmap_scan_thresh), we
opportunistically rebuild it during lfs3_alloc_ckpoint. The nice thing
about lfs3_alloc_ckpoint is that we know the state of all in-flight
blocks, so rebuilding the bmap just requires traversing the filesystem +
in-RAM state.

We might still fall back to the lookahead buffer, but in theory a well
tuned bmap_scan_thresh can prevent this from becoming a bottleneck (at
the cost of more frequent bmap rebuilds).
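
As a rough sketch of the checkpoint-time check (function and field names
here are hypothetical, not the actual implementation):

  // hedged sketch: opportunistically repopulate the bmap during an
  // allocator checkpoint when too few free blocks are known
  int lfs3_alloc_ckpoint(lfs3_t *lfs3, lfs3_mdir_t *mdir) {
      // hypothetical field/config names
      if (lfs3->bmap.free_known < lfs3->cfg->bmap_scan_thresh) {
          // we know the state of all in-flight blocks here, so a
          // filesystem traversal is enough to rebuild the block map
          int err = lfs3_bmap_rebuild(lfs3, mdir);
          if (err) {
              return err;
          }
      }
      return 0;
  }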

---

This is also probably a good time to resume measuring code/ram costs,
though it's worth repeating the above note about the bmap work still
needing cleanup:

             code          stack          ctx
  before:   36840           2368          684
  after:    36920 (+0.2%)   2368 (+0.0%)  684 (+0.0%)

Haha, no, the bmap isn't basically free; it's just an opt-in feature.
With -DLFS3_YES_BMAP=1:

             code          stack          ctx
  no bmap:  36920           2368          684
  yes bmap: 38552 (+4.4%)   2472 (+4.4%)  812 (+18.7%)
2025-10-01 17:56:14 -05:00
88180b6081 bmap: Initial scaffolding for on-disk block map
This is pretty exploratory work, so I'm going to try to be less thorough
in commit messages until the dust settles.

---

New tag for gbmapdelta:

  LFS3_TAG_GBMAPDELTA   0x0104  v--- ---1 ---- -1rr

New tags for in-bmap block types:

  LFS3_TAG_BMRANGE      0x033u  v--- --11 --11 uuuu
  LFS3_TAG_BMFREE       0x0330  v--- --11 --11 ----
  LFS3_TAG_BMINFLIGHT   0x0331  v--- --11 --11 ---1
  LFS3_TAG_BMINUSE      0x0332  v--- --11 --11 --1-
  LFS3_TAG_BMBAD        0x0333  v--- --11 --11 --11
  LFS3_TAG_BMERASED     0x0334  v--- --11 --11 -1--

New gstate decoding for gbmap:

  .---+- -+- -+- -+- -. cursor: 1 leb128  <=5 bytes
  | cursor            | known:  1 leb128  <=5 bytes
  +---+- -+- -+- -+- -+ block:  1 leb128  <=5 bytes
  | known             | trunk:  1 leb128  <=4 bytes
  +---+- -+- -+- -+- -+ cksum:  1 le32    4 bytes
  | block             | total:            23 bytes
  +---+- -+- -+- -+- -'
  | trunk         |
  +---+- -+- -+- -+
  |     cksum     |
  '---+---+---+---'
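
In struct form (a sketch; the actual decoded representation may differ):

  // hedged sketch of the decoded gbmap gstate described above
  struct lfs3_gbmap {
      uint32_t cursor;  // leb128, <=5 bytes
      uint32_t known;   // leb128, <=5 bytes
      uint32_t block;   // leb128, <=5 bytes
      uint32_t trunk;   // leb128, <=4 bytes
      uint32_t cksum;   // le32, 4 bytes
  };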

New bmap node revdbg string:

  vvv---- -111111- -11---1- -11---1-  (62 62 7e v0  bb~r)  bmap node

New mount/format/info flags (still unsure about these):

  LFS3_M_BMAPMODE     0x03000000  On-disk block map mode
  LFS3_M_BMAPNONE     0x00000000  Don't use the bmap
  LFS3_M_BMAPCACHE    0x01000000  Use the bmap to cache lookahead scans
  LFS3_M_BMAPSLOW     0x02000000  Use the slow bmap algorithm
  LFS3_M_BMAPFAST     0x03000000  Use the fast bmap algorithm

New gbmap wcompat flag:

  LFS3_WCOMPAT_GBMAP  0x00002000  Global block-map in use
2025-10-01 17:55:13 -05:00
7b330d67eb Renamed config -> cfg
Note this includes both the lfs3_config -> lfs3_cfg structs as well as
the LFS3_CONFIG -> LFS3_CFG include define:

- LFS3_CONFIG -> LFS3_CFG
- struct lfs3_config -> struct lfs3_cfg
- struct lfs3_file_config -> struct lfs3_file_cfg
- struct lfs3_*bd_config -> struct lfs3_*bd_cfg
- cfg -> cfg

We were already using cfg as the variable name everywhere. The fact that
these names were different was an inconsistency that should be fixed
since we're committing to an API break.

LFS3_CFG is already out-of-date from upstream, and there are plans for a
config rework, but I figured I'd go ahead and change it as well to lower
the chances it gets overlooked.

---

Note this does _not_ affect LFS3_TAG_CONFIG. Having the on-disk vs
driver-level config take slightly different names is not a bad thing.
2025-07-18 18:29:41 -05:00
b700c8c819 Dropped fragmenting blocks > 1 fragment
So we now keep blocks around until they can be replaced with a single
fragment. This is simpler, cheaper, and reduces the number of commits
needed to graft (though note arbitrary range removals still keep this
unbounded).

---

So, this is a delicate tradeoff.

On one hand, not fully fragmenting blocks risks keeping around bptrs
containing very little data, depending on fragment_size.

On the other hand:

- It's expensive, and disk utilization during random _deletes_ is not
  the biggest of concerns.

  Note our crystallization algorithm should still clean up partial
  blocks _eventually_, so this doesn't really impact random writes.
  The main concerns are lfs3_file_truncate/fruncate, and in the future
  collapserange/punchhole.

- Fragmenting bptrs introduces more commits, which have their own
  prog/erase cost, and it's unclear how this impacts logging operations.

  There's no point in fragmenting blocks at the head of a log if we're
  going to fruncate them eventually.

I figure let's err on minimizing complexity/code size for now, and if
this turns out to be a mistake, we can always revert or introduce
fragmenting >1 fragment blocks as an optional feature in the future.

---

Saves a big chunk of code, stack, and even some ctx (no more
fragment_thresh):

           code          stack          ctx
  before: 37504           2448          656
  after:  37024 (-1.3%)   2416 (-1.3%)  652 (-0.6%)
2025-07-03 19:46:18 -05:00
6eba1180c8 Big rename! Renamed lfs -> lfs3 and lfsr -> lfs3 2025-05-28 15:00:04 -05:00
bf00c4d427 Limited FRAGMENT_SIZE to 512 bytes in the test/bench runners
This prevents runaway O(n^2) behavior on devices with extremely large
block sizes (NAND, bs=~128KiB - ~1MiB).

The whole point of shrubs is to avoid this O(n^2) runaway when inline
files necessarily become large. Setting FRAGMENT_SIZE to a fraction of
the BLOCK_SIZE humorously defeats this.

The 512 byte cutoff is somewhat arbitrary; it's the natural BLOCK_SIZE/8
FRAGMENT_SIZE on most NOR flash (bs=4096), but it's probably worth
tuning based on actual device performance.
2025-05-15 17:19:35 -05:00
c5efe35ab2 Split crystal_thresh into crystal_thresh + fragment_thresh
So now crystal_thresh only controls when fragments are compacted into
blocks, while fragment_thresh controls when blocks are broken into
fragments. Setting fragment_thresh=-1 follows crystal_thresh and keeps
the previous behavior.

These were already two separate pieces of logic, so it makes sense to
provide two separate knobs for tuning.
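
For example, a configuration along these lines (field placement in the
config struct is assumed here):

  // hedged example: compact fragments into blocks at >= block_size/4,
  // but only break blocks back into fragments below block_size/8
  .crystal_thresh  = BLOCK_SIZE/4,
  .fragment_thresh = BLOCK_SIZE/8,
  // fragment_thresh=-1 would follow crystal_thresh (previous behavior)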

Setting fragment_thresh lower than crystal_thresh has some potential to
reduce hysteresis in cases where random writes push blocks close to
crystal_thresh. It will be interesting to explore this more when
benchmarking.

---

The additional config option adds a bit of code/ctx, but hopefully that
will go away in the future config rework:

           code          stack          ctx
  before: 35584           2480          636
  after:  35600 (+0.0%)   2480 (+0.0%)  640 (+0.6%)
2025-04-20 15:53:18 -05:00
19a23c7788 Renamed/reverted file->buffer -> file->cache
And the related config options:

- cfg->file_buffer_size -> cfg->file_cache_size
- file->cfg->buffer_size -> file->cfg->cache_size
- file->cfg->buffer -> file->cfg->cache_buffer

The original motivation to rename this to file->buffer was to better
align with what other filesystems call this, but I think this is a case
where internal consistency is more important than external consistency.

file->cache better matches lfs->pcache and lfs->rcache, and makes it
easier to read code involving both file->cache and other user-provided
buffers.

Keeping the upstream name also helps with continuity.
2025-02-13 16:02:46 -06:00
bac2464b8f Renamed lfs->cfg->shrub_size -> lfs->cfg->inline_size
While I think shrub_size is probably the more correct name at a
technical level, inline_size is probably more what users expect and
doesn't require a deeper understanding of filesystem details.

The only risk is that users may think inline_size has no effect on large
files, when in fact it still controls how much of the btree root can be
inlined.

There's also the point that sticking with inline_size maintains
compatibility with both the upstream version and any future version that
has other file representations.

May revisit this, but renaming to lfs->cfg->inline_size for now.
2025-02-11 02:50:38 -06:00
6cd29bede2 Dropped lfs->cfg->inline_size
Now that we no longer have bmoss files, inline_size and shrub_size are
effectively the same thing.

We weren't using this, so no code change, but it does save a word of
ctx:

           code          stack          ctx
  before: 36280           2576          640
  after:  36280 (+0.0%)   2576 (+0.0%)  636 (-0.6%)
2025-02-11 02:50:38 -06:00
66f5fa152a emubd: Renamed LFS_EMUBD_POWERLOSS_NOOP -> LFS_EMUBD_POWERLOSS_ATOMIC
Mainly to avoid ambiguity with PROGNOOP/ERASENOOP and make it clear
emubd still simulates powerloss, but also because I think the name
sounds cooler.
2025-02-08 14:53:47 -06:00
1b3054db89 gc: Moved incremental gc behind ifdef LFS_GC
Incremental gc, being stateful and not gc-able (ironic), was always
going to need to be conditionally compilable.

This moves incremental gc behind the LFS_GC define, so that we can focus
on the "default" costs. This cuts lfs_t in nearly half!

  lfs_t with LFS_GC:   308
  lfs_t without LFS_C: 168 (-45.5%)
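
Roughly, the shape of the change (a sketch, not the exact fields):

  typedef struct lfs {
      // ... state always needed for allocation/orphan cleanup ...
  #ifdef LFS_GC
      // incremental gc state, only paid for when LFS_GC is defined
      lfsr_traversal_t gc_t;
      uint32_t gc_flags;
      lfs_soff_t gc_steps;
  #endif
  } lfs_t;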

This does save less code than one might expect though. We still need
most of the internal traversal/gc logic for things like block allocation
and orphan cleanup, so most of the savings is limited to the RAM storing
the incremental state:

                          code          stack          ctx
  before:                37916           2608          768
  after with LFS_GC:     37944 (+0.1%)   2608 (+0.0%)  768 (+0.0%)
  after without LFS_GC:  37796 (-0.3%)   2608 (+0.0%)  620 (-19.3%)

On the flip side, this does mean most of the incremental gc
functionality is still available in the lfsr_traversal_t APIs.

Applications with more advanced gc use-cases may actually benefit from
_not_ enabling the incremental gc APIs, and instead use the
lfsr_traversal_t APIs directly.
2025-01-28 14:41:45 -06:00
5d756fe698 gc: Tweaked lfsr_gc API to be more stateful
Before:

  int lfsr_fs_gc(lfs_t *lfs, lfs_soff_t steps, uint32_t flags);

After:

  int lfsr_gc(lfs_t *lfs);
  int lfsr_gc_setflags(lfs_t *lfs, uint32_t flags);
  int lfsr_gc_setsteps(lfs_t *lfs, lfs_soff_t steps);

---

The interesting thing about the lfsr_gc API is that the caller will
often be very different from whoever configures the system. One example
being an OS calling lfsr_gc in a background loop, while leaving
configuration up to the user.

The idea here is that, instead of forcing the OS to come up with its own
stateful system to pass flags to lfsr_gc, we just embed this state in
littlefs directly. The whole point of lfsr_gc is that it's a stateful
system anyways.
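
For example, the OS-background-loop case might look something like this
(a usage sketch, not taken from the source):

  // configuration, done once by whoever owns the system's policy
  lfsr_gc_setflags(&lfs, flags);
  lfsr_gc_setsteps(&lfs, 1);

  // OS background loop, no knowledge of the configuration needed
  while (running) {
      int err = lfsr_gc(&lfs);
      if (err) {
          // handle/log the error
      }
  }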

Unfortunately this state does require a bit more logic to maintain,
which adds code/ctx cost:

           code          stack          ctx
  before: 37812           2608          752
  after:  37916 (+0.3%)   2608 (+0.0%)  768 (+2.1%)
2025-01-28 14:41:45 -06:00
f385f8f778 bench: Tweaked bench.py to include cumulative measurements
This was the one piece needed to be able to replace amor.py with csv.py.
The missing feature in csv.py is the ability to keep track of a
running-sum, but this is a bit of a hack in amor.py considering we
otherwise view csv entries as unordered.

We could add a running-sum to csv.py, or instead, just include a running
sum as a part of our bench output. We have all the information there
anyways, and if it simplifies the mess that is our csv scripts, that's a
win.

---

This also replaces the bench "meas", "iter", and "size" fields with the
slightly simpler "m" (measurement? metric?) and "n" fields. It's up to
the specific benchmark exactly how to interpret "n", but one field is
sufficient for existing scripts.
2024-11-16 17:29:05 -06:00
eced943685 Changed gc_steps into a runtime parameter, better dedup mount gc
So instead of configuring gc_steps at mount time (or eventually compile
time), lfsr_fs_gc now takes a steps parameter that controls how much gc
work to attempt:

  int lfsr_fs_gc(lfs_t *lfs, lfs_soff_t steps, uint32_t flags);

This API was needed internally to better deduplicate on-mount gc, and I
figured it might also be useful for users to be able to easily change
gc_steps per lfsr_fs_gc call.

I realize this could also be accomplished with the theoretical
lfsr_fs_gccfg, but it's a bit easier to not need a struct every call.

Most likely, depending on project/system, users will always call
lfsr_fs_gc with either 1 (minimal work) or -1 (maximal work), or, worst
case, can define a system-wide GC_STEPS somewhere.

---

Better deduplicating the on-mount gc work saved some code, though it's
worth noting this could have been done internally and not exposed to
users:

           code          stack
  before: 36476           2680
  after:  36316 (-0.4%)   2680 (+0.0%)
2024-07-18 20:46:58 -05:00
acfae9e072 Extended lfsr_mount to accept mount flags
This has been a long time coming; mount flags are just too useful for
configuring a filesystem at runtime.

Currently this is limited to LFS_M_RDONLY and LFS_M_CKPROGS, but there
are a few more planned in the future:

  LFS_M_RDWR     = 0x0000, // Mount the filesystem as read and write
  LFS_M_RDONLY   = 0x0001, // Mount the filesystem as readonly
  LFS_M_STRICT*  = 0x0002, // Error if on-disk config does not match
  LFS_M_FORCE*   = 0x0004, // Ignore compat flags, mount readonly
  LFS_M_FORCEWITHRECKLESSABANDON*
                 = 0x0008, // Ignore compat flags, mount read write

  LFS_M_CKPROGS  = 0x0010, // Check progs by reading back progged data
  LFS_M_CKREADS* = 0x0020, // Check reads via checksums

  * Hypothetical
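
As a usage sketch (the exact lfsr_mount signature is assumed here, only
the flags themselves come from this commit):

  // hedged sketch: mount readonly with prog checking enabled
  int err = lfsr_mount(&lfs, LFS_M_RDONLY | LFS_M_CKPROGS, &cfg);
  if (err) {
      // fall back to formatting, reporting the error, etc
  }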

As a convenience, we also return mount flags in the struct lfs_fsinfo's
flags field as their relevant LFS_I_* variants. Though this is only to
match statvfs, and only because it's cheap; littlefs's API is low-level
and we should expect users to know what flags they passed to lfsr_mount.

As for the new mount flags:

- LFS_M_RDONLY - For consistency with existing APIs, this just asserts
  on write operations, which makes it a bit useless... But the info flag
  LFS_I_RDONLY may be useful for falling back to a readonly mode if
  we encounter on-disk compat issues.

  At least if we implement the theoretical LFS_UNTRUSTED_USER mode,
  LFS_M_RDONLY could become a runtime error.

- LFS_M_RDWR - This really just exists to complement LFS_M_RDONLY and to
  match LFS_O_RDONLY/LFS_O_RDWR. It's just an alias for 0, and I don't
  think there will ever be a reason to make it non-0 (but I can always
  be wrong!).

- LFS_M_CKPROGS - This replaces the check_progs config option and avoids
  using a full byte to store a bool.

  We should probably also have a compile-time option to compile this out
  (LFS_NO_CKPROGS?), but that's a future thing to do.

This ended up adding a surprising bit of code, considering we're just
moving flags around, and noise in lfs_alloc added a bit of stack again:

           code          stack
  before: 35880           2672
  after:  35932 (+0.1%)   2680 (+0.3%)
2024-07-17 20:39:31 -05:00
fc486ca4f7 Reworked lfsr_fs_gc to be incremental
Thinking about the use case a bit, most lfsr_fs_gc calls will be to
perform background work, and can benefit from being incremental.

We already support incremental gc and all the mess associated with
traversal invalidation via the traversal API, so we might as well expose
this through lfsr_fs_gc.

The main downside is that we need to store an lfsr_traversal_t object
somewhere, which is not exactly a cheap struct. I was originally
considering limiting incremental gc to the traversal API for this
reason, but I think the value add of an incremental lfsr_fs_gc is too
compelling... Though we really should add a compile-time option
(LFS_NO_GC? LFS_NO_INCRGC?) to allow users to opt-out of this RAM cost
if they're never going to call this function.

Oh, and lfs_t also becomes self-referential, which might become a
problem for higher-level language users...

---

The incremental behavior of lfsr_fs_gc can be controlled by the new
gc_steps config option. This allows more than one step to be performed
at a time, which may allow for more progress when intermixed with
write-heavy filesystem operations. Setting gc_steps=-1 performs a full
traversal every call, which guarantees always making some amount of
progress.

This adds a bit of code, since we now need to check for/resume existing
traversals. But the real cost is the added RAM to lfs_t, which is
unfortunately wasted if you never call lfsr_fs_gc:

          code           stack          lfs_t
  before: 35708           2672            164
  after:  35756 (+0.1%)   2672 (+0.0%)    296 (+80.5%)
2024-07-17 18:08:32 -05:00
4d06fc2e0e t: (Re)implemented gc_compact_thresh, at least over mdirs
lfs_fs_gc is still not reimplemented, but this is accessible through the
traversal API with LFS_T_COMPACT.

This is also the first traversal operation that can mutate the
filesystem, which brings its own set of problems:

- We need to set LFS_F_DIRTY in lfsr_mtree_gc now, which really
  highlights how much of a mess having two flag fields is...

  We do _not_ clobber in this case, since we assume lfsr_mtree_gc knows
  what it's doing.

- We can now commit to an mroot in the mroot chain outside of the normal
  mroot chain update logic.

  This is a bit scary, but should just work.

  The only issue so far is that we need to allow mdirs to follow the
  mroot during mroot splits if mid=-1, even if they aren't lfs_t's mroot
  mdir.

  This should now be decently tested with the new
  test_traversal_compact_* tests.

- It's easy for mtraversal's mdir and mtinfo's mdir to fall out of sync
  when mutating... Why do we have two of these?

The actual compaction itself is pretty straightforward: just mark as
unerased, eoff=-1, and call lfsr_mdir_commit with an empty commit. This
is now wrapped up in lfsr_mdir_compact.
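
Which, in spirit, looks something like this (a sketch; the real
lfsr_mdir_commit signature and fields may differ):

  // hedged sketch of lfsr_mdir_compact
  static int lfsr_mdir_compact(lfs_t *lfs, lfsr_mdir_t *mdir) {
      // mark as unerased so the next commit is forced to compact
      mdir->eoff = -1;
      // an empty commit then goes through the normal compaction path
      return lfsr_mdir_commit(lfs, mdir, NULL, 0);
  }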

Code changes:

           code          stack
  before: 34528           2640
  after:  34652 (+0.4%)   2640 (+0.0%)

Though the real hard part will be implementing gc_compact_thresh over
btree nodes...
2024-06-24 21:09:54 -05:00
b4af52bc72 Implemented SOMEBITS/MOSTBITS emubd powerloss behavior
These emulate powerloss behavior where only some of the bits being
progged are actually progged if there is a powerloss. This behavior was
the original motivation for our ecksums/fcrcs, so it's good to have this
tested.

As a simplification, these only test the extremes:

- LFS_EMUBD_POWERLOSS_SOMEBITS => one bit progged
- LFS_EMUBD_POWERLOSS_MOSTBITS => all-but-one bit progged

Also, they flip bits instead of preserving exact partial prog behavior,
but this is allowed (progs can have any intermediate value), has the
same effect as partial progs, and should encourage failed progs.
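
Conceptually, the emulation is something like this (a sketch, not
emubd's actual code):

  if (powerloss_behavior == LFS_EMUBD_POWERLOSS_SOMEBITS) {
      // one bit progged: keep the old data, but flip a single bit
      memcpy(block, old_data, size);
      block[0] ^= 0x1;
  } else if (powerloss_behavior == LFS_EMUBD_POWERLOSS_MOSTBITS) {
      // all-but-one bit progged: take the new data, but flip a single bit
      memcpy(block, new_data, size);
      block[0] ^= 0x1;
  }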

This required a number of tweaks in emubd: moved powerloss before prog,
moved mutate after powerloss, etc, but these shouldn't affect other
powerloss behaviors. Handling powerloss after prog was only to avoid
power_cycles=1 being useless; it's not strictly required.

Good news is testing so far suggests our ecksum design is sound.
2024-06-06 16:58:18 -05:00
b5370d6001 Cherry-picked upstream out-of-order emubd testing
More information upstream (f2a6f45, fc2aa33, 7873d81), but this adds
LFS_EMUBD_POWERLOSS_OOO for testing out-of-order block devices that
require sync to be called for things to serialize. It's a simple
implementation that just reverts the first write since the last sync on
powerloss, but it gets the job done.

Cherry-picking these changes required reverting emubd's scratch buffer,
but carrying around an extra ~block_size of memory isn't a big deal
here.
2024-05-30 13:26:16 -05:00
1ecb346cec Renamed fbuffer_size -> file_buffer_size 2024-05-30 11:52:07 -05:00
c648f96dc5 Added check_progs for immediate prog validation
This configuration option enables the previous behavior of reading back
every prog to check that the data was written correctly.

Unfortunately, this brings a bit of baggage, thanks to our cache
interactions being more complicated now:

- We really want to reuse the rcache for prog validation, despite the
  cache performance implications. Unfortunately, we simply can't, thanks
  to the new bd utility functions tying up the rcache. lfsr_bd_cpy, for
  example, does not expect rcache to be invalidated between a read and
  prog, and if it is, things break (I may or may not have found this by
  experience).

  These bd utilities are valuable, so we really need some other way to
  validate our progs.

- Since we can't rely on the rcache, this leaves checksumming as the
  only option for validating progs. Checksumming isn't perfect, as there
  is a decent chance of false negatives, but to be honest it's probably
  good enough for anything that's not malicious.

- This also adds the new constraint that we need to be able to read back
  any prog into the pcache, which implies read_size <= prog_size. This
  constraint didn't exist when we could clobber our rcache, but this is
  not worth throwing away the new bd utilities. Not to mention
  clobbering our rcache could hurt cache performance.

  Why not make read_size <= prog_size conditional on check_progs?

  The main reason is convenience. One very compelling use case for
  check_progs is to help debug unknown filesystem/integration failures,
  buf if you can't enable check_progs without changing the filesystem
  configuration, you can't really rely on check_progs for debugging.

  This helps future proof what we expect from block devices, in case
  future error detection/correction mechanisms can benefit from our
  prog_size always being readable.
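
A rough sketch of the checksum-based validation described above (helper
names here are assumptions, not the actual bd API):

  // checksum what we intend to prog...
  uint32_t cksum = lfs_crc32c(0, buffer, size);

  int err = lfsr_bd_prog(lfs, block, off, buffer, size);
  if (err) {
      return err;
  }

  // ...then read back and re-checksum, instead of comparing bytes
  // against a (possibly clobbered) rcache
  uint32_t cksum_ = 0;
  err = lfsr_bd_cksum(lfs, block, off, size, &cksum_);
  if (err) {
      return err;
  }
  if (cksum_ != cksum) {
      // bad prog, treat as a corrupt/bad block
      return LFS_ERR_CORRUPT;
  }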

Code changes were not that significant; however, there was a surprising
stack cost. This seems to be because lfsr_bd_read__ can now be called
from multiple places, causing it to no longer be inlined in
lfsr_bd_read_, costing a bit of stack for the additional function call:

           code          stack
  before: 33566           2624
  after:  33682 (+0.3%)   2640 (+0.6%)
2024-05-29 23:09:41 -05:00
56b18dfd9a Reworked revision count logic a bit, block_cycles -> block_recycles
The original goal here was to restore all of the revision count/
wear-leveling features that were intentionally ignored during
refactoring, but over time a few other ideas to better leverage our
revision count bits crept in, so this is sort of the amalgamation of
that...

Note! None of these changes affect reading. mdir fetch strictly needs
only to look at the revision count as a big 32-bit counter to determine
which block is the most recent.

The interesting thing about the original definition of the revision
count, a simple 32-bit counter, is that it actually only needs 2 bits to
work. Well, three states really: 1. most recent, 2. less recent, 3.
future most recent. This means the remaining bits are sort of up for
grabs for other things.

Previously, we've used the extra revision count bits as a heuristic for
wear-leveling. Here we reintroduce that, a bit more rigorously, while
also carving out space for a nonce to help with commit collisions.

Here's the new revision count breakdown:

  vvvvrrrr rrrrrrnn nnnnnnnn nnnnnnnn
  '-.''----.----''---------.--------'
    '------|---------------|---------- 4-bit relocation revision
           '---------------|---------- recycle-bits recycle counter
                           '---------- pseudorandom nonce

- 4-bit relocation revision

  We technically only need 2-bits to tell which block is the most
  recent, but I've bumped it up to 4-bits just to be safe and to make
  it a bit more readable in hex form.

- recycle-bits recycle counter

  A user configurable counter, this counter tracks how many times a
  metadata block has been erased. When it overflows we return the block
  to the allocator to participate in block-level wear-leveling again.
  This implements our copy-on-bounded-write strategy.

- pseudorandom nonce

  The remaining bits we fill with a pseudorandom nonce derived from the
  filesystem's prng. Note this prng isn't the greatest (it's just the
  xor of all mdir cksums), but it gets the job done. It should also be
  reproducible, which can be a good thing.

  Suggested by ithinuel, the addition of a nonce should help with the
  commit collision issue caused by noop erases. It doesn't completely
  solve things, since we're only using crc32c cksums, not
  collision-resistant cryptographic hashes, but we still have the
  existing valid/perturb bit system to fall back on.

When we allocate a new mdir, we want to zero the recycle counter. This
is where our relocation revision is useful for indicating which block is
the most recent:

  initial state: 10101010 10101010 10101010 10101010
                 '-.'
                  +1     zero           random
                   v .----'----..---------'--------.
  lfsr_rev_init: 10110000 00000011 01110010 11101111

When we increment, we increment the recycle counter and xor in a new nonce:

  initial state: 10110000 00000011 01110010 11101111
                 '--------.----''---------.--------'
                         +1              xor <-- random
                          v               v
  lfsr_rev_init: 10110000 00000111 01010100 01000000

And when the recycle counter overflows, we relocate the mdir.

If we aren't wear-leveling, we just increment the relocation revision to
maximize the nonce.
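
In sketch form, the increment might look something like this (names and
exact bit twiddling are assumptions, recycle_bits derived from
block_recycles):

  // hedged sketch of incrementing a revision count
  static uint32_t lfsr_rev_inc(lfs_t *lfs, uint32_t rev) {
      uint32_t nonce_bits = 28 - lfs->recycle_bits;
      uint32_t nonce_mask = ((uint32_t)1 << nonce_bits) - 1;

      if (lfs->recycle_bits > 0) {
          // bump the recycle counter, which sits just above the nonce;
          // overflow carries into the 4-bit relocation revision
          rev += nonce_mask + 1;
      } else {
          // not wear-leveling: just bump the relocation revision
          rev += (uint32_t)1 << 28;
      }

      // xor in a fresh pseudorandom nonce
      rev ^= lfsr_prng(lfs) & nonce_mask;
      return rev;
  }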

---

Some other notes:

- Renamed block_cycles -> block_recycles.

  This is intended to help avoid confusing block_cycles with the actual
  physical number of erase cycles supported by the device.

  I've noticed this happening a few times, and it's unfortunately
  equivalent to disabling wear-leveling completely. This can be improved
  with better documentation, but also changing the name doesn't hurt.

- We now relocate both blocks in the mdir at the same time.

  Previously we only relocated one block in the mdir per recycle. This
  was necessary to keep our threaded linked-list in sync, but the
  threaded linked-list is now no more!

  Relocating both blocks is simpler, updates the mtree less often, is
  compatible with metadata redundancy, and avoids aliasing issues that
  were a problem when relocating one block.

  Note that block_recycles is internally multiplied by 2 so each block
  sees the correct number of erase cycles.

- block_recycles is now rounded down to a power-of-2.

  This makes the counter logic easier to work with and takes up less RAM
  in lfs_t. This is a rough heuristic anyways.

- Moved the lfs->seed updates into lfsr_mountinited + lfsr_mdir_commit.

  This avoids readonly operations affecting the seed and should help
  reproducibility.

- Changed rev count in dbg scripts to render as hex, similar to cksums.

  Now that we're using most of the bits in the revision count, the decimal
  version is, uh, not helpful...

Code changes:

           code          stack
  before: 33342           2640
  after:  33434 (+0.3%)   2640 (+0.0%)
2024-05-22 18:49:05 -05:00
5c70013c11 Adopted compile-time LFS_MIN/LFS_MAX in test defines
These seem fitting here, even if the test defines aren't "real defines".
The duplicate expressions should still be side-effect free and easy to
optimize out.

This should also avoid future lfs_min32 vs intmax_t issues.
2024-05-22 15:43:46 -05:00
186fd1b5f2 Separated cache_size out into rcache_size/pcache_size/fbuffer_size
A much requested feature, this allows much finer control of how RAM is
allocated for the system.

It was difficult to introduce this in previous versions of littlefs due
to how we steal caches during certain file operations, but now we don't
do that and treat the caches much more transparently.

Managing separate cache sizes does add a bit of code, but this is well
worth the potential for RAM savings due to increased flexibility:

           code          stack
  before: 33656           2632
  after:  33714 (+0.2%)   2640 (+0.3%)

Also interesting to note this reduces alignment requirements for the
rcache/pcache, since they don't need to share alignment, and completely
removes any alignment requirement from the file buffers.
2024-05-22 15:43:10 -05:00
f5beacf6ee Added some comments over lfs_config's fragment_size/crystal_thresh/etc
Also added related asserts to lfs_init.

Note the fragment_size <= block_size/8 limit is to avoid wasteful corner
cases where only one fragment can fit in a block. The shrub_size <=
block_size/4 limit is looser because of how shrubs temporarily
overcommit.

As for the other limits, inline_size is bounded by shrub_size, and
crystal_thresh technically doesn't have a limit, though values >
block_size stop having an effect.
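
The asserts presumably look something like this in lfs_init (a sketch,
with the field names assumed from the surrounding text):

  // hedged sketch of the new config asserts
  LFS_ASSERT(lfs->cfg->fragment_size <= lfs->cfg->block_size/8);
  LFS_ASSERT(lfs->cfg->shrub_size <= lfs->cfg->block_size/4);
  LFS_ASSERT(lfs->cfg->inline_size <= lfs->cfg->shrub_size);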
2024-05-18 13:00:15 -05:00
a124ee54e7 Reworked test/bench defines to map to global variables
Motivation:

- Debuggability. Accessing the current test/bench defines from inside
  gdb was basically impossible for some dumb macro-debug-info reason I
  can't figure out.

  In theory, GCC provides a .debug_macro section when compiled with -g3.
  I can see this section with objdump --dwarf=macro, but somehow gdb
  can't seem to find any definitions? I'm guessing the #line source
  remapping is causing things to break somehow...

  Though even if macro-debugging gets fixed, which would be valuable,
  accessing defines in the current test/bench runner can trigger quite
  a bit of hidden machinery. This risks side-effects, which is never
  great when debugging.

  All of this is quite annoying because the test/bench defines are
  usually the most important piece of information when debugging!

  This replaces the previous hidden define machinery with simple global
  variables, which gdb can access no problem.

- Also when debugging we no longer awkwardly step into the test_define
  function all the time!

- In theory, global variables, being a simple memory access, should be
  quite a bit faster than the hidden define machinery. This does matter
  because running tests _is_ a dev bottleneck.

  In practice though, any performance benefit is below the noise floor,
  which isn't too surprising (~630s +-~20s).

- Using global variables for defines simplifies the test/bench runner
  quite a bit.

  Though some of the previous complexity was due to a whole internal
  define caching system, which was supposed to lazily evaluate test
  defines to avoid evaluating defines we don't use. This all proved to
  be useless because the first thing we do when running each test is
  evaluate all defines to generate the test id (lol).

So now, instead of lazily evaluating and caching defines, we just
generate global variables during compilation and evaluate all defines
for each test permutation immediately before running.

This relies heavily on __attribute__((weak)) symbols, and lets the
linker really shine.

As a funny perk this also effectively interns all test/bench defines by
the address of the resulting global variable. So we don't even need to
do string comparisons when mapping suite-level defines to the
runner-level defines.

---

Perhaps the more interesting thing to note is the change in strategy in
how we actually evaluate the test defines.

This ends up being a surprisingly tricky problem, due to the potential
of mutual recursion between our defines.

Previously, because our define machinery was lazy, we could just
evaluate each define on demand. If a define required another define, it
would lazily trigger another evaluation, implicitly recursing through
C's stack. If cyclic, this would eventually lead to a stack overflow,
but that's ok because it's a user error to let this happen.

The "correct" way, at least in terms of being computationally optimal,
would be to topologically sort the defines and evaluate the resulting
tree from the leaves up.

But I ain't got time for that, so the solution here is equal parts
hacky, simple, and effective.

Basically, we just evaluate the defines repeatedly until they stop
changing:

- Initially, mutually recursive defines may read the uninitialized
  values of their dependencies, and end up with some arbitrarily wrong
  result. But as the defines are repeatedly evaluated, assuming no
  cycles, the correct results should eventually bubble up the tree until
  all defines converge to the correct value.

- This is O(n*e) vs O(n+e), but our define graph is usually quite
  shallow.

- To prevent non-halting, we error after an arbitrary 1000 iterations.
  If you hit this, it's likely because there is a cycle in the define
  graph.

  This is runtime configurable via the new --define-depth flag.

- To keep things consistent and reproducible, we zero initialize all
  defines before the first evaluation.

  I don't think this is strictly necessary, but it's important for the
  test runner to have the exact same results on every run. No one wants
  a "works on my machine" situation when the tests are involved.

Experimentation shows we only need an evaluation depth of 2 to
successfully evaluate the current set of defines:

  $ ./runners/test_runner --list-defines --define-depth=2

And any performance impact is negligible (~630s +-~20s).
2024-02-13 18:59:58 -06:00
8f2a6a3095 Implemented file sync broadcasting
Now, when files are synced, they broadcast their disk changes to any other
opened file handles. In effect, all open files match disk after a sync
call to any opened file handle pointing to that file.

This was a much requested feature, as the previous behavior (multiple
opened file handles maintain independent snapshots) is pretty different
from other filesystems. It's also quite difficult to implement outside
of the filesystem, since you need to track all opened files, requiring
either unbounded RAM or a known upper limit.

---

A bit unrelated, but this commit also changes bshrub estimate
calculation to include all opened file handles. This adds some annoying
complexity, but is necessary to prevent sporadic ERANGE errors when
the same file is opened multiple times.

The current implementation just refetches on-disk metadata. This adds
some maybe unnecessary metadata lookups, but simplifies things by
avoiding the tracking of on-disk sprout/shrub size, which risks falling
out of date. Keep in mind we only recalculate the estimate every
~inline_size/2 bytes written.

Just like lfsr_mdir_estimate, this scales O(n^2) with the number of
opened files (these are basically the same function... hmmm... can they
be deduplicated?). This is unlikely to be a problem for littlefs's use
case, but just something to be aware of.

Code changes:

            code          stack
  before:  32920           3032
  after:   33192 (+0.8%)   3048 (+0.5%)
2024-02-03 18:14:28 -06:00
c2e3a391ff Renamed/tweaked crystal_size -> crystal_thresh
Our crystallization threshold doesn't really describe the bounds of an
object, and I think it's a bit easier to think of it as a threshold for
block compaction.

Heck I've already been calling this the crystallization threshold all
over the code base.

An important change is that this bumps the value by 1 byte, so
crystal_thresh now describes the smallest size of a block our write
strategy will attempt to write.

Heuristically:
- data >= crystal_thresh => compacted into blocks
- data <  crystal_thresh => stored as fragments
2023-12-14 12:49:43 -06:00
d485795336 Removed concept of geometries from test/bench runners
This turned out to not be all that useful.

Tests already take quite a while to run, which is a good thing! We have a
lot of tests! 942.68s or ~15 minutes of tests at the time of writing to
be exact. But simply multiplying the number of tests by some number of
geometries is heavy handed and not a great use of testing time.

Instead, tests where different geometries are relevant can parameterize
READ_SIZE/PROG_SIZE/BLOCK_SIZE at the suite level where needed. The
geometry system was just another define parameterization layer anyways.

Testing different geometries can still be done in CI by overriding the
relevant defines anyways, and it _might_ be interesting there.
2023-12-06 22:23:41 -06:00
c94b5f4767 Redesigned the inlined topology of files, now using geoxylic btrees
As a part of the general redesign of files, all files, not just small
files, can inline some data directly in the metadata log. Originally,
this was a single piece of inlined data or an inlined tree (shrub) that
effectively acted as an overlay over the block/btree data.

This is now changed so that when we have a block/btree, the root of the
btree is inlined. In effect making a full btree a sort of extended
shrub.

I'm currently calling this a "geoxylic btree", since that seems to be a
somewhat related botanical term. Geoxylic btrees have, at least on
paper, a number of benefits:

- There is a single lookup path instead of two, which simplifies code a
  bit and decreases lookup costs.

- One data structure instead of two also means lfsr_file_t requires
  less RAM, since all of the on-disk variants can go into one big union.
  Though I'm not sure this is very significant vs stack/buffer costs.

- The write path is much simpler and has less duplication (it was
  difficult to deduplicate the shrub/btree code because of how the
  shrub goes through the mdir).

  In this redesign, lfsr_btree_commit_ leaves root attrs uncommitted,
  allowing lfsr_bshrub_commit to finish the job via lfsr_mdir_commit.

- We don't need to maintain a shrub estimate; we just lazily evict trees
  during mdir compaction. This has a side-effect of allowing shrubs to
  temporarily grow larger than shrub_size before eviction.

  NOTE THIS (fundamentally?) DOESN'T WORK

- There is no awkwardly high overhead for small btrees. The btree root
  for two-block files should be able to comfortably fit in the shrub
  portion of the btree, for example.

- It may be possible to also make the mtree geoxylic, which should
  reduce storage overhead of small mtrees and make better use of the
  mroot.

All of this being said, things aren't working yet. Shrub eviction during
compaction runs into a problem with a single pcache -- how do we write
the new btrees without dropping the compaction pcache? We can't evict
btrees in a separate pass because their number is unbounded...
2023-11-20 23:23:58 -06:00
e8bdd4d381 Reworked bench.py/bench_runner/how bench measurements are recorded
This is based on how bench.py/bench_runners have actually been used in
practice. The main changes have been to make the output of bench.py more
readibly consumable by plot.py/plotmpl.py without needing a bunch of
hacky intermediary scripts.

Now instead of a single per-bench BENCH_START/BENCH_STOP, benches can
have multiple named BENCH_START/BENCH_STOP invocations to measure
multiple things in one run:

  BENCH_START("fetch", i, STEP);
  lfsr_rbyd_fetch(&lfs, &rbyd_, rbyd.block, CFG->block_size) => 0;
  BENCH_STOP("fetch");

Benches can also now report explicit results, for non-io measurements:

  BENCH_RESULT("usage", i, STEP, rbyd.eoff);

The extra iter/size parameters to BENCH_START/BENCH_RESULT also allow
some extra information to be calculated post-bench. This information gets
tagged with an extra bench_agg field to help organize results in
plot.py/plotmpl.py:

  - bench_meas=<meas>+amor, bench_agg=raw - amortized results
  - bench_meas=<meas>+div,  bench_agg=raw - per-byte results
  - bench_meas=<meas>+avg,  bench_agg=avg - average over BENCH_SEED
  - bench_meas=<meas>+min,  bench_agg=min - minimum over BENCH_SEED
  - bench_meas=<meas>+max,  bench_agg=max - maximum over BENCH_SEED

---

Also removed all bench.tomls for now. This may seem counterproductive in
a commit to improve benchmarking, but I'm not sure there's actual value
to keeping bench cases committed in tree.

These were always quick to fall out of date (at the time of this commit
most of the low-level bench.tomls, rbyd, btree, etc, no longer
compiled), and most benchmarks were one-off collections of scripts/data
with results too large/cumbersome to commit and keep updated in tree.

I think the better way to approach benchmarking is a separate repo
(multiple repos?) with all related scripts/state/code and results
committed into a hopefully reproducible snapshot. Keeping the
bench.tomls in that repo makes more sense in this model.

There may be some value to having benchmarks in CI in the future, but
for that to make sense they would need to actually fail on performance
regression. How to do that isn't so clear. Anyways we can always address
this in the future rather than now.
2023-11-03 10:27:17 -05:00
d1e79bffc7 Renamed crystallize_size -> crystal_size
The original name was a bit of a mouthful.

Also dropped the default crystal_size in the test/bench runners from
block_size/4 -> block_size/8. I'm already noticing large amounts of
inflation when blocks are fragmented, though I am experimenting with a
rather small fragment_size right now.

Future benchmarks/experimentation is required to figure out good values
for these.
2023-10-23 12:27:44 -05:00
c815c19c20 New "fragmenting" write strategy
The attempt to implement in-rbyd data slicing, being lazily coalesced
during rbyd compaction, failed pretty much completely.

Slicing is a very enticing write strategy, getting both minimal overhead
post-compaction and fast random write speeds, but the idea has some
fundamental conflicts with how we play out attrs post-compaction.

This idea might work in a more powerful filesystem, but brings back the
need to simulate rbyds in RAM, which is something I really don't want to
do (complex, bug-prone, likely adds code cost, may not even be tractable).

So, third time's the charm?

---

This new write strategy writes only datas and bptrs, and avoids dagging
by completely rewriting any regions of data larger than a configurable
crystallization threshold.

This loses most of the benefits of data crystallization, random writes
will now usually need to rewrite a full block, but as a tradeoff our
data at rest is always stored with optimal overhead.

And at least data crystallization still saves space when our data isn't
block aligned, or in sparse files. From reading up on some other
filesystem designs it seems this is a desirable optimization sometimes
referred to as "tail-packing" or "block suballocation".

Some other changes from just having more time to think about the
problem:

1. Instead of scanning to figure out our current crystal size, we can
   use a simple heuristic of 1. look up left block, 2. look up right
   block, 3. assume any data between these blocks contributes to our
   current crystal.

   This is just a heuristic, so worst case you write the first and last
   byte of a block which is enough to trigger compaction into a block.
   But on the plus side this avoids issues with small holes preventing
   blocks from being formed.

   This approach brings the number of btree lookups down from
   O(crystallize_size) to 2.

2. I've gone ahead and dropped the previous scheme of coalesce_size
   + fragment_size and instead adopted a single fragment_size that
   controls the size of, well, fragments, i.e. data elements stored
   directly in trees.

   This affects both the inlined shrub as well as fragments stored in
   the inner nodes of the btree. I believe it's very similar to what is
   often called "pages" in logging filesystems, though I'm going to
   avoid that term for now because it's a bit overloaded.

   Previously, neighboring writes that, when combined, would exceed our
   coalesce_size just weren't combined. Now they are combined up to our
   fragment size, potentially splitting the right fragment.

   Before (fragment_size=8):

     .---+---+---+---+---+---+---+---.
     |            8 bytes            |
     '---+---+---+---+---+---+---+---'
                         +
                         .---+---+---+---+---.
                         |      5 bytes      |
                         '---+---+---+---+---'
                         =
     .---+---+---+---+---+---+---+---+---+---.
     |      5 bytes      |      5 bytes      |
     '---+---+---+---+---+---+---+---+---+---'

   After:

     .---+---+---+---+---+---+---+---.
     |            8 bytes            |
     '---+---+---+---+---+---+---+---'
                         +
                         .---+---+---+---+---.
                         |      5 bytes      |
                         '---+---+---+---+---'
                         =
     .---+---+---+---+---+---+---+---+---+---.
     |            8 bytes            |2 bytes|
     '---+---+---+---+---+---+---+---+---+---'

   This leads to better fragment alignment (much like our block
   strategy), and minimizes tree overhead.

   Any neighboring data to the right is only coalesced if it fits in the
   current fragment, or would be rewritten (carved) anyways, to avoid
   unnecessary data rewriting.

   For example (fragment_size=8):

     .---+---+---+---+---+---+---+---+---+---+---+---+---+---.
     |        6 bytes        |        6 bytes        |2 bytes|
     '---+---+---+---+---+---+---+---+---+---+---+---+---+---'
                                 +
                         .---+---+---+---+---.
                         |      5 bytes      |
                         '---+---+---+---+---'
                                 =
     .---+---+---+---+---+---+---+---+---+---+---+---+---+---.
     |            8 bytes            |    4 bytes    |2 bytes|
     '---+---+---+---+---+---+---+---+---+---+---+---+---+---'

Other than these changes this commit is mostly a bunch of carveshrub
rewriting again, which continues to be nuanced and annoying to get
bug free.
2023-10-21 22:05:46 -05:00
dc8dce8f0c Introduced coalesce_size and crystallize_size, deduplicated test cfg
- coalesce_size - The amount of data allowed to coalesce into single
  data entries.

- crystallize_size - How much data is allowed to be written to btree
  inner nodes before needing to be compacted into a block.

Also deduplicated the test config, which is something I've been wanting
to do for a while. It doesn't make sense to need to modify several
different instantiations of lfs_config every time a config option is
added or removed...
2023-10-13 23:56:33 -05:00
1d92169e5b Tweaked cache size to temporarily avoid pathological shrub overflows
This will stop being a problem when we actually have btrees, but for now
the fragmentation caused by byte-level syncs was easily enough to
overflow an mdir when cache size is big.

A smaller cache size is also nicer for debugging, since smaller cache
sizes result in data getting flushed to disk earlier, which is easier
to inspect than in-device buffers. And a 16-byte cache still provides
decent test coverage over cache interactions.

---

Also dropped inline_size to block_size/8. I realized while debugging
that opened shrubs take up additional space until we sync, so we need to
expect up to 2 temporary copies of shrubs when writing files.
2023-10-13 23:35:24 -05:00
c74ec1c133 Initial commit of basic file creation
Currently limited to inlined files and only simpler truncate-writes.

But still this lets us test file creation/deletion.

This is also enough logic to make it clear that, even though we have
some powerful high-level primitives, mapping file operations onto these
is still going to be non-trivial.
2023-09-17 11:04:44 -05:00
1c128afc90 Renamed internal runner field filter -> if_
This makes it more consistent with the actual test field, at the cost of
the symbol collision.
2023-08-04 13:54:10 -05:00
5be7bae518 Replaced tn/bn prefixes with an actual dependency system in tests/benches
The previous system of relying on test name prefixes for ordering was
simple, but organizing tests by dependencies and topologically sorting
during compilation 1. is more flexible and 2. simplifies test names,
which get typed a lot.

Note these are not "hard" dependencies; each test suite should work fine
in isolation. These "after" dependencies just hint at an ordering when
all tests are run.

As such, it's worth noting the tests should NOT error if a dependency is
missing. This unfortunately makes it a bit hard to catch typos, but
allows faster compilation of a subset of tests.

---

To make this work, the way tests are linked has changed from using a custom
linker section (fun linker magic!) to a weakly linked array appended to
every source file (also fun linker magic!).

At least with this method test.py has strict control over the test
ordering, and doesn't depend on 1. the order in which the linker merges
sections, and 2. the order tests are passed to test.py. I didn't realize
the previous system was so fragile.
2023-08-04 13:33:00 -05:00
07244fb2d4 In test/bench.py, added "internal" flag
This marks internal tests/benches (case.in="lfs.c") with an otherwise-unused
flag that is printed during --summary/--list-*. This just helps identify which
tests/benches are internal.
2023-06-01 17:40:48 -05:00
67826159fd Added TEST_PERMUTATION, made it easier to reproduce perm/fuzz failures
TEST_PERMUTATION/BENCH_PERMUTATION make it possible to map an integer to
a specific permutation efficiently. This is helpful since our testing
framework really only parameterizes single integers.

The exact implementation took a bit of trial and error. It's based on
https://stackoverflow.com/a/7919887 and
https://stackoverflow.com/a/24257996, but modified to run in O(n) with
no extra memory. In the discussion it seemed like this may not actually
be possible for lexicographic ordering of permutations, but fortunately
we don't care about the specific ordering, only the reproducibility.

Here's how it works:

1. First populate an array with all numbers 0-n.

2. Iterate through each index, selecting only from the remaining
   numbers based on our current permutation.

          .- i%rem --.
          v     .----+----.
     [p0 p1 |-> r0 r1 r2 r3]

   Normally to maintain lexicographic ordering you would have to do an O(n)
   shift at this step as you remove each number. But instead we can just swap
   the removed number and the number under the index. This effectively
   shrinks the remaining part of the array, but permutes the numbers
   a bit. Fortunately, since each successive permutation swaps
   at the same location, the resulting permutations will be both
   exhaustive and reproducible, if unintuitive.
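
As a sketch of the resulting mapping (names illustrative, the real
TEST_PERMUTATION plumbing differs):

  // hedged sketch: map an integer p to a permutation of 0..n-1 in O(n)
  // with no memory beyond the output array
  void test_permutation(size_t p, uint32_t *buffer, size_t n) {
      // 1. populate with 0..n-1
      for (size_t i = 0; i < n; i++) {
          buffer[i] = i;
      }

      // 2. select from the remaining numbers based on p, swapping
      //    instead of shifting to stay O(n)
      for (size_t i = 0; i < n; i++) {
          size_t rem = n - i;
          size_t j = i + (p % rem);
          p /= rem;

          uint32_t t = buffer[i];
          buffer[i] = buffer[j];
          buffer[j] = t;
      }
  }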

Now permutation/fuzz tests can reproduce specific failures by defining
either -DPERMUTATION=x or -DSEED=x.
2023-03-19 01:21:31 -05:00
59a57cb767 Reworked test_runner/bench_runner to evaluate define permutations lazily
I wondered if walking in Python 2's footsteps was going to run into the
same issues, and sure enough, memory-backed iterators became unwieldy.

The motivation for this change is that large ranges in tests, such as
iterators over seeds or permutations, became prohibitively expensive to
compile. This meant more iteration moving into tests with more steps to
reproduce failures. This sort of defeats the purpose of the test
framework.

The solution here is to move test permutation generation out of test.py
and into the test runner itself. The allows defines to generate their
values programmatically.

This does conflict with the test frameworks support of sets of explicit
permutations, but this is fixed by also moving these "permutation sets"
down into the test runner.

I guess it turns out the closer your representation matches your
implementation, the better everything works.

Additionally the define caching layer got a bit of tweaking. We can't
precalculate the defines because of mutual recursion, but we can
precalculate which define/permutation each define id maps to. This is
necessary as otherwise figuring out each define's define-specific
permutation would be prohibitively expensive.
2023-03-17 15:06:56 -05:00
f7dbaf7707 Changed rbyd testing to ignore block_size, now testing with all geometries
This turned out to be a bit tricky, and the scheme in bench_rbyd is
broken.

The core issue is that we don't have a distinction between physical and
logical block sizes, so we can't use a block device configured for one
geometry with a littlefs instance operating on a different geometry. For
this and other reasons we should probably have two configuration
variables in the future, but at the moment that is out of scope.

The problem with the approach in bench_rbyd, which changes the
lfs_config at runtime, is that this breaks emubd which also depends on
lfs_config due to a leaky abstraction. This causes unnoticed memory
corruption.

---

To get something working, the tests now change the underlying BLOCK_SIZE
test define before the tests are run. This starts the test with a block
device configured with a large block_size. To keep this from breaking
things the geometry definitions in the test and bench runners no longer
use default dependent definitions, instead defining everything
explicitly.

With block_size being so large, this makes some of the emubd operations
less performant, notably the --disk option for exposing block device
state during testing.

It would also be nice to use the copy-on-write backend of emubd for some
of the permutation testing, but since it operates on a block-by-block
basis, it doesn't really work when the block device is just one big
block.
2023-02-12 17:15:18 -06:00
b0382fa891 Added BENCH/TEST_PRNG, replacing other ad-hoc sources of randomness
When you add a function to every benchmark suite, you know it should
probably be provided by the benchmark runner itself. That being said,
randomness in tests/benchmarks is a bit tricky because it needs to be
strictly controlled and reproducible.

No global state is used, allowing tests/benches to maintain multiple
randomness streams, which can be useful for checking results during a run.

There's an argument for having global prng state in that the prng could
be preserved across power-loss, but I have yet to see a use for this,
and it would add a significant requirement to any future test/bench runner.
2022-12-06 23:09:07 -06:00
1a07c2ce0d A number of small script fixes/tweaks from usage
- Fixed prettyasserts.py parsing when '->' is in expr

- Made prettyasserts.py failures not crash (yay dynamic typing)

- Fixed the initial state of the emubd disk file to match the internal
  state in RAM

- Fixed true/false getting changed to True/False in test.py/bench.py
  defines

- Fixed accidental substring matching in plot.py's --by comparison

- Fixed a missed LFS_BLOCK_CYCLES in test_superblocks.toml

- Changed test.py/bench.py -v to only show commands being run

  Including the test output is still possible with test.py -v -O-, making
  the implicit inclusion redundant and noisy.

- Added license comments to bench_runner/test_runner
2022-11-15 13:42:07 -06:00
4fe0738ff4 Added bench.py and bench_runner.c for benchmarking
These are really just different flavors of test.py and test_runner.c
without support for power-loss testing, but with support for measuring
the cumulative number of bytes read, programmed, and erased.

Note that the existing define parameterization should work perfectly
fine for running benchmarks across various dimensions:

  ./scripts/bench.py \
      runners/bench_runner \
      bench_file_read \
      -gnor \
      -DSIZE='range(0,131072,1024)'

Also added a couple basic benchmarks as a starting point.
2022-11-15 13:33:34 -06:00