Commit Graph

122 Commits

Author SHA1 Message Date
9bc41099f0 scripts: Changed -~/--sleep -> -w/--wait to sleep after -k/--keep-open
This changes -w/--wait to sleep _after_ -k/--keep-open, instead of
including the time spent waiting on inotifywait in the sleep time.

1. It's easier, no need to keep track of when we started waiting.

2. It's simpler to reason about.

3. It trivially avoids the multiple wakeup noise that plagued
   watch.py + vim (vim likes to do a bunch of renaming and stuff when
   saving files, including the file 4913 randomly?)

   Avoiding this was previously impossible because -~/--sleep was
   effectively a noop when combined with -k/--keep-open.
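
A minimal sketch of the new ordering, assuming a generic watch loop.
The names watch_loop, render, and wait_for_change are hypothetical and
stand in for whatever the scripts actually do around inotifywait:

```python
import time

def watch_loop(render, wait_for_change, wait=0.0,
        sleep=time.sleep, iterations=None):
    """Redraw on every change, sleeping *after* each wakeup."""
    n = 0
    while iterations is None or n < iterations:
        render()
        # block until something changes (inotifywait in the scripts)
        wait_for_change()
        # sleep after the wakeup, so a burst of events (renames, temp
        # files, etc) is followed by a pause before the next redraw
        if wait:
            sleep(wait)
        n += 1
```

Since the sleep is unconditional, there is no start time to track, and
any events that arrive during the sleep fold into the next redraw.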

---

Also renamed from -~/--sleep -> -w/--wait, which is a bit more intuitive
and avoids possible shell issues with -~.

To make this work, dropped the -w/--block-cycles shortform flag in
dbgtrace.py. It's not like this flag is ever used anyways.

Though at the moment this is ignoring the possible conflict with
-w/--word-bits...
2025-11-18 00:58:27 -06:00
7da44f12ae Added redund hints to more tags
Well, kinda. At the moment we don't have any redund support (it's a
TODO), so arguably redund=0 and this is just a comment tweak.

Though our mdirs _are_ already redund=1... so maybe these should
actually set redund=1?

It's unclear, so for now I've just tweaked the comment, and we should
probably revisit when _actually_ implementing meta/data redundancy.

---

Note this only really affects struct tags:

  LFS3_TAG_STRUCT         0x04tt  v--- -1-- +ttt tttt
  LFS3_TAG_BRANCH         0x040r  v--- -1-- +--- --rr
  LFS3_TAG_DATA           0x0404  v--- -1-- +--- -1rr
  LFS3_TAG_BLOCK          0x0408  v--- -1-- +--- 1err
  LFS3_TAG_DDKEY*         0x0410  v--- -1-- +--1 --rr
  LFS3_TAG_DID            0x0420  v--- -1-- +-1- ----
  LFS3_TAG_BSHRUB         0x0428  v--- -1-- +-1- 1-rr
  LFS3_TAG_BTREE          0x042c  v--- -1-- +-1- 11rr
  LFS3_TAG_MROOT          0x0431  v--- -1-- +-11 --rr
  LFS3_TAG_MDIR           0x0435  v--- -1-- +-11 -1rr
  LFS3_TAG_MSHRUB+        0x0438  v--- -1-- +-11 1-rr
  LFS3_TAG_MTREE          0x043c  v--- -1-- +-11 11rr
  LFS3_TAG_BMRANGE        0x044u  v--- -1-- +1-- ++uu
  LFS3_TAG_BMFREE         0x0440  v--- -1-- +1-- ----
  LFS3_TAG_BMINUSE        0x0441  v--- -1-- +1-- ---1
  LFS3_TAG_BMERASED       0x0442  v--- -1-- +1-- --1-
  LFS3_TAG_BMBAD          0x0443  v--- -1-- +1-- --11
  LFS3_TAG_DDRC*          0x0450  v--- -1-- +1-1 ----
  LFS3_TAG_DDPCOEFF*      0x0451  v--- -1-- +1-1 ---1
  LFS3_TAG_PCOEFFMAP*     0x0460  v--- -1-- +11- ----

This redund hint may be useful for debugging and the theoretical
CKMETAREDUND feature.
2025-11-18 00:58:18 -06:00
cf34ba9aca Rearranged tag encodings, reserved suptype=0 for internal tags
This was motivated by a discussion with a gh user, in which it was noted
that not having a reserved suptype for internal tags risks potential
issues with long-term future tag compatibility.

I think the risk is low, but, without a reserved suptype, it _is_
possible for a future tag to conflict with an internal tag in an older
driver version, potentially and unintentionally breaking compatibility.
Note this is especially concerning during mdir compactions, where we
copy tags we may not understand otherwise.

In littlefs2 we reserved suptype=0x100, though this was mostly an
accident due to saturating the 3-bit suptype space. With the larger tag
space in littlefs3, the reserved suptype=0x100 was dropped.

---

Long story short, this reserves suptype=0 for internal flags (well, and
null, which is _mostly_ internal only, but does get written to disk as
unreachable tags).

Unfortunately, adding a new suptype _did_ require moving a bunch of
stuff around:

  LFS3_TAG_NULL           0x0000  v--- ---- +--- ----
  LFS3_TAG_INTERNAL       0x00tt  v--- ---- +ttt tttt

  LFS3_TAG_CONFIG         0x01tt  v--- ---1 +ttt tttt
  LFS3_TAG_MAGIC          0x0131  v--- ---1 +-11 --rr
  LFS3_TAG_VERSION        0x0134  v--- ---1 +-11 -1--
  LFS3_TAG_RCOMPAT        0x0135  v--- ---1 +-11 -1-1
  LFS3_TAG_WCOMPAT        0x0136  v--- ---1 +-11 -11-
  LFS3_TAG_OCOMPAT        0x0137  v--- ---1 +-11 -111
  LFS3_TAG_GEOMETRY       0x0138  v--- ---1 +-11 1---
  LFS3_TAG_NAMELIMIT      0x0139  v--- ---1 +-11 1--1
  LFS3_TAG_FILELIMIT      0x013a  v--- ---1 +-11 1-1-
  LFS3_TAG_ATTRLIMIT?     0x013b  v--- ---1 +-11 1-11

  LFS3_TAG_GDELTA         0x02tt  v--- --1- +ttt tttt
  LFS3_TAG_GRMDELTA       0x0230  v--- --1- +-11 ----
  LFS3_TAG_GBMAPDELTA     0x0234  v--- --1- +-11 -1rr
  LFS3_TAG_GDDTREEDELTA*  0x0238  v--- --1- +-11 1-rr
  LFS3_TAG_GPTREEDELTA*   0x023c  v--- --1- +-11 11rr

  LFS3_TAG_NAME           0x03tt  v--- --11 +ttt tttt
  LFS3_TAG_BNAME          0x0300  v--- --11 +--- ----
  LFS3_TAG_REG            0x0301  v--- --11 +--- ---1
  LFS3_TAG_DIR            0x0302  v--- --11 +--- --1-
  LFS3_TAG_STICKYNOTE     0x0303  v--- --11 +--- --11
  LFS3_TAG_BOOKMARK       0x0304  v--- --11 +--- -1--
  LFS3_TAG_SYMLINK?       0x0305  v--- --11 +--- -1-1
  LFS3_TAG_SNAPSHOT?      0x0306  v--- --11 +--- -11-
  LFS3_TAG_MNAME          0x0330  v--- --11 +-11 ----
  LFS3_TAG_DDNAME*        0x0350  v--- --11 +1-1 ----
  LFS3_TAG_DDTOMB*        0x0351  v--- --11 +1-1 ---1

  LFS3_TAG_STRUCT         0x04tt  v--- -1-- +ttt tttt
  LFS3_TAG_BRANCH         0x040r  v--- -1-- +--- --rr
  LFS3_TAG_DATA           0x0404  v--- -1-- +--- -1--
  LFS3_TAG_BLOCK          0x0408  v--- -1-- +--- 1err
  LFS3_TAG_DDKEY*         0x0410  v--- -1-- +--1 ----
  LFS3_TAG_DID            0x0420  v--- -1-- +-1- ----
  LFS3_TAG_BSHRUB         0x0428  v--- -1-- +-1- 1---
  LFS3_TAG_BTREE          0x042c  v--- -1-- +-1- 11rr
  LFS3_TAG_MROOT          0x0431  v--- -1-- +-11 --rr
  LFS3_TAG_MDIR           0x0435  v--- -1-- +-11 -1rr
  LFS3_TAG_MSHRUB+        0x0438  v--- -1-- +-11 1---
  LFS3_TAG_MTREE          0x043c  v--- -1-- +-11 11rr
  LFS3_TAG_BMRANGE        0x044u  v--- -1-- +1-- ++uu
  LFS3_TAG_BMFREE         0x0440  v--- -1-- +1-- ----
  LFS3_TAG_BMINUSE        0x0441  v--- -1-- +1-- ---1
  LFS3_TAG_BMERASED       0x0442  v--- -1-- +1-- --1-
  LFS3_TAG_BMBAD          0x0443  v--- -1-- +1-- --11
  LFS3_TAG_DDRC*          0x0450  v--- -1-- +1-1 ----
  LFS3_TAG_DDPCOEFF*      0x0451  v--- -1-- +1-1 ---1
  LFS3_TAG_PCOEFFMAP*     0x0460  v--- -1-- +11- ----

  LFS3_TAG_ATTR           0x06aa  v--- -11a +aaa aaaa
  LFS3_TAG_UATTR          0x06aa  v--- -11- +aaa aaaa
  LFS3_TAG_SATTR          0x07aa  v--- -111 +aaa aaaa

  LFS3_TAG_SHRUB          0x1kkk  v--1 kkkk +kkk kkkk
  LFS3_TAG_ALT            0x4kkk  v1cd kkkk +kkk kkkk

  LFS3_TAG_CKSUM          0x300p  v-11 ---- ++++ +pqq
  LFS3_TAG_NOTE           0x3100  v-11 ---1 ++++ ++++
  LFS3_TAG_ECKSUM         0x3200  v-11 --1- ++++ ++++
  LFS3_TAG_GCKSUMDELTA    0x3300  v-11 --11 ++++ ++++

  * Planned
  + Reserved
  ? Hypothetical

Some additional notes:

- I was on the fence on keeping the 0x30 prefix on config tags now that
  it is no longer needed to differentiate from null, but ultimately
  decided to keep it because: 1. it's fun, 2. it decreases the chance
  of false positives, 3. it keeps the redund bits readable in hexdumps,
  and 4. it reserves some tags < config, which is useful since order
  matters.

  Instead, I pushed the 0x30 prefix to _more_ tags, mainly gstate.

  As a coincidence, meta related tags (MNAME, MROOT, MTREE) all shifted
  to also have the 0x30 prefix, which is a nice bit of unexpected
  consistency.

- I also considered reserving the redund bits across the config tags
  similarly to what we've done in struct/gstate tags, but decided
  against it as 1. it significantly reduces the config tag space
  available, and 2. makes alignment with VERSION + R/W/OCOMPAT a bit
  awkward.

  Instead I think we should relax the redund bit alignment in other
  suptypes, though in practice the intermixing of non-redund and redund
  tags makes this a bit difficult.

  Maybe we should consider including redund bits as a hint for things
  like DATA? DDKEY? BSHRUB? etc?

- I created a bit more space for file btree struct tags, allowing for
  both the future planned DDKEY, and BLOCK with optional erased-bit. We
  don't currently use this, but it may be useful for the future planned
  gddtree, which in-theory can track erased-state in partially written
  file blocks.

  Currently tracking erased-state in file blocks is difficult due to
  the potential of multiple references, and inability to prevent ecksum
  conflicts in raw data blocks.

- UATTR/SATTR bumped up to 0x600/0x700 to keep the 1-bit alignment,
  leaving the suptype 0x500 unused. Though this may be useful if we ever
  run out of struct tags (suptype=0x400), which is likely where most new
  tags will go.

---

Code changes were minimal, but with a bunch of noise:

                 code          stack          ctx
  before:       35912           2280          660
  after:        35920 (+0.0%)   2280 (+0.0%)  660 (+0.0%)

                 code          stack          ctx
  gbmap before: 38800           2296          772
  gbmap after:  38812 (+0.0%)   2296 (+0.0%)  772 (+0.0%)
2025-11-18 00:56:48 -06:00
2d68db965b Rearranged on-disk compat flags
Other than moving things around to make space for planned features, this
also adopts the idea of allowing compat flags to be ored into a single
32-bit integer, at least in the short-term.

Note though that these are still stored in separate wcompat/rcompat
tags, to make compat tests easier, and we may introduce conflicting
flags in the future if we run out of 32-bits. This is just an indulgence
to potentially make tooling/debugging easier until that happens.

Rcompat flags:

  RCOMPAT_NONSTANDARD+
                     0x00000001  ---- ---- ---- ---- ---- ---- ---- ---1
  RCOMPAT_WRONLY+    0x00000004  ---- ---- ---- ---- ---- ---- ---- -1--
  RCOMPAT_MMOSS      0x00000010  ---- ---- ---- ---- ---- ---- ---1 ----
  RCOMPAT_MSPROUT+   0x00000020  ---- ---- ---- ---- ---- ---- --1- ----
  RCOMPAT_MSHRUB+    0x00000040  ---- ---- ---- ---- ---- ---- -1-- ----
  RCOMPAT_MTREE      0x00000080  ---- ---- ---- ---- ---- ---- 1--- ----
  RCOMPAT_BMOSS+     0x00000100  ---- ---- ---- ---- ---- ---1 ---- ----
  RCOMPAT_BSPROUT+   0x00000200  ---- ---- ---- ---- ---- --1- ---- ----
  RCOMPAT_BSHRUB     0x00000400  ---- ---- ---- ---- ---- -1-- ---- ----
  RCOMPAT_BTREE      0x00000800  ---- ---- ---- ---- ---- 1--- ---- ----
  RCOMPAT_MDIRR1*    0x00001000  ---- ---- ---- ---- ---1 ---- ---- ----
  RCOMPAT_MDIRR2*    0x00002000  ---- ---- ---- ---- --1- ---- ---- ----
  RCOMPAT_MDIRR3*    0x00003000  ---- ---- ---- ---- --11 ---- ---- ----
  RCOMPAT_BTREER1*   0x00004000  ---- ---- ---- ---- -1-- ---- ---- ----
  RCOMPAT_BTREER2*   0x00008000  ---- ---- ---- ---- 1--- ---- ---- ----
  RCOMPAT_BTREER3*   0x0000c000  ---- ---- ---- ---- 11-- ---- ---- ----
  RCOMPAT_GRM        0x00010000  ---- ---- ---- ---1 ---- ---- ---- ----
  RCOMPAT_GMV?       0x00020000  ---- ---- ---- --1- ---- ---- ---- ----
  RCOMPAT_GDDTREE*   0x00100000  ---- ---- ---1 ---- ---- ---- ---- ----
  RCOMPAT_GPTREE*    0x00200000  ---- ---- --1- ---- ---- ---- ---- ----
  RCOMPAT_DATAR1*    0x00400000  ---- ---- -1-- ---- ---- ---- ---- ----
  RCOMPAT_DATAR2*    0x00800000  ---- ---- 1--- ---- ---- ---- ---- ----
  RCOMPAT_DATAR3*    0x00c00000  ---- ---- 11-- ---- ---- ---- ---- ----
  rcompat_OVERFLOW+  0x80000000  1--- ---- ---- ---- ---- ---- ---- ----

  * Planned
  + Reserved
  ? Hypothetical

Wcompat flags:

  WCOMPAT_NONSTANDARD+
                     0x00000001  ---- ---- ---- ---- ---- ---- ---- ---1
  WCOMPAT_RDONLY+    0x00000002  ---- ---- ---- ---- ---- ---- ---- --1-
  WCOMPAT_GCKSUM     0x00040000  ---- ---- ---- -1-- ---- ---- ---- ----
  WCOMPAT_GBMAP      0x00080000  ---- ---- ---- 1--- ---- ---- ---- ----
  WCOMPAT_DIR        0x01000000  ---- ---1 ---- ---- ---- ---- ---- ----
  WCOMPAT_SYMLINK?   0x02000000  ---- --1- ---- ---- ---- ---- ---- ----
  WCOMPAT_SNAPSHOT?  0x04000000  ---- -1-- ---- ---- ---- ---- ---- ----
  wcompat_OVERFLOW+  0x80000000  1--- ---- ---- ---- ---- ---- ---- ----

  + Reserved
  ? Hypothetical

Ocompat flags:

  OCOMPAT_NONSTANDARD+
                     0x00000001  ---- ---- ---- ---- ---- ---- ---- ---1
  ocompat_OVERFLOW+  0x80000000  1--- ---- ---- ---- ---- ---- ---- ----

  + Reserved
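
As a rough illustration of why oring the compat flags into one 32-bit
integer can help tooling, here is a sketch using a subset of the
single-bit flags from the tables above. The helper names are
hypothetical, and the shared NONSTANDARD/OVERFLOW bits and the planned
2-bit redund fields are left out on purpose:

```python
RCOMPAT = {
    'MMOSS':  0x00000010,
    'MTREE':  0x00000080,
    'BSHRUB': 0x00000400,
    'BTREE':  0x00000800,
    'GRM':    0x00010000,
}
WCOMPAT = {
    'GCKSUM': 0x00040000,
    'GBMAP':  0x00080000,
    'DIR':    0x01000000,
}

def check_disjoint(*tables):
    """True if no two flags in the given tables share a bit."""
    seen = 0
    for table in tables:
        for flag in table.values():
            if seen & flag:
                return False
            seen |= flag
    return True

def describe(flags, table):
    """Split flags into known names and leftover unknown bits."""
    known = 0
    for flag in table.values():
        known |= flag
    names = [name for name, flag in table.items() if flags & flag]
    return names, flags & ~known & 0xffffffff
```

A nonzero second result from describe is the interesting case for a
compat check: bits this table does not know about.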

Other notes:

- M* and B* struct flags were reordered to match META -> DATA order
  elsewhere. This no longer matches the tag ordering, but there's an
  argument the B* tags apply more generally (all btrees) than the B*
  compat flag (only file btrees).

- MDIR/BTREE/DATA redund flags were moved near relevant flags, rather
  than sticking them in the higher-order bits as we are planning to do
  in the M_*/F_* flags. The compat flags already won't match because of
  the mdir/btree split (which is IMO too much detail to include in
  M_*/F_* flags, but hard to argue against in the compat flags), and
  this keeps the highest bit free for OVERFLOW, which is useful
  internally.

- Moving DIR to the current-highest bit makes it easy to add 6 more file
  types (7 if you ignore OVERFLOW), before things start getting cramped.

No code changes.
2025-11-13 16:14:56 -06:00
ee519f43b5 scripts: Renamed lookupleaf -> lookupnext_ to match lfs3.c
- lookupleaf -> lookupnext_
- namelookupleaf -> namelookup_

I want to move away from lookupleaf usage in general in the dbg scripts,
like we have in lfs3.c, but I also just really don't want to touch these
scripts again unless I need to. They've been useful, but also a big time
sink.

Maybe I should actually learn Python's new type system. That would
probably help here...
2025-10-26 15:34:45 -05:00
ffc40da878 scripts: Reworked tagrepr -> Tag.repr to rely more on self-parsing
This should make tag editing less tedious/error-prone. We already used
self-parsing to generate -l/--list in dbgtag.py, but this extends the
idea to tagrepr (now Tag.repr), which is used in quite a few more
scripts.

To make this work the little tag encoding spec had to become a bit more
rigorous, fortunately the only real change was the addition of '+'
characters to mark reserved-but-expected-zero bits.

Example:

  TAG_CKSUM = 0x3000  ## v-11 ---- ++++ +pqq
                         ^--^----^----^--^-^-- valid bit, unmatched
                            '----|----|--|-|-- matches 1
                                 '----|--|-|-- matches 0
                                      '--|-|-- reserved 0, unmatched
                                         '-|-- perturb bit, unmatched
                                           '-- phase bits, unmatched

  dbgtag.py 0x3000  =>  cksumq0
  dbgtag.py 0x3007  =>  cksumq3p
  dbgtag.py 0x3017  =>  cksumq3p 0x10
  dbgtag.py 0x3417  =>  0x3417

Though Tag.repr still does a bit of manual formatting for the
differences between shrub/normal/null/alt tags.

Still, this should reduce the number of things that need to be changed
from 2 -> 1 when adding/editing most new tags.
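
To give a feel for the self-parsing idea, here is a small sketch that
turns the encoding comment into masks and reproduces the dbgtag.py
examples above for the cksum tag. This is not the actual Tag.repr;
parse_pattern, field, and cksum_repr are hypothetical names:

```python
def parse_pattern(pattern):
    """Parse e.g. 'v-11 ---- ++++ +pqq' (msb first) into masks."""
    bits = pattern.replace(' ', '')
    assert len(bits) == 16
    mask = match = reserved = 0
    fields = {}
    for i, c in enumerate(bits):
        bit = 1 << (15 - i)
        if c == '1':      # must be 1
            mask |= bit
            match |= bit
        elif c == '-':    # must be 0
            mask |= bit
        elif c == '+':    # reserved, expected 0, unmatched
            reserved |= bit
        elif c == 'v':    # valid bit, unmatched
            pass
        else:             # named field bits, unmatched
            fields[c] = fields.get(c, 0) | bit
    return mask, match, reserved, fields

def field(tag, fmask):
    """Extract a field, shifted down to bit 0."""
    shift = (fmask & -fmask).bit_length() - 1
    return (tag & fmask) >> shift

def cksum_repr(tag, pattern='v-11 ---- ++++ +pqq'):
    mask, match, reserved, fields = parse_pattern(pattern)
    if (tag & mask) != match:
        return '0x%04x' % tag
    s = 'cksumq%d' % field(tag, fields['q'])
    if tag & fields['p']:
        s += 'p'
    if tag & reserved:
        s += ' 0x%02x' % (tag & reserved)
    return s
```

With this shape, changing a tag's encoding only means editing the
pattern string, which is the 2 -> 1 reduction mentioned above.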
2025-10-24 00:15:21 -05:00
67d3c6ea69 scripts: Ignore errors with compat-disabled gstate
The gbmap introduces quite a bit of complexity with how it interacts
with config: block_count => gbmap weight, and wcompat => gbmap enabled.
On one hand this means fewer sources of truth, on the other hand it
makes the gbmap logic cross subsystems and a bit messy.

To avoid trying to parse a bunch of disabled/garbage gstate, this adds
wcompat/rcompat checks to our Gstate class, exposed via __bool__.

This also means we actually need to parse wcompat/rcompat/ocompat flags,
but that wasn't too difficult (though currently only supports 32-bits).

---

I added conditional repr logic for the grm and gbmap, but didn't bother
with the gcksum. The gcksum is used in too many other places in these
scripts to expect a nice rendering when disabled.
2025-10-17 14:02:46 -05:00
9e45249b29 gbmap: Added support for gbmap in lfs3_fs_grow
In lfs3_fs_grow, we need to update any gbmaps to match the new disk
size. The actual patch to the gbmap is easy, but it does get a bit
delicate since we need to feed the gbmap with an allocator in the new
disk size.

Fortunately, the opportunism of the gbmap allocator avoids any
catch-22 issues, as long as we make sure to not trigger any gbmap
rebuilds.

Adds a bit of code, but not much:

                 code          stack          ctx
  before:       37168           2352          684
  after:        37168 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                 code          stack          ctx
  gbmap before: 39000           2456          800
  gbmap after:  39116 (+0.3%)   2456 (+0.0%)  800 (+0.0%)
2025-10-12 14:24:32 -05:00
e622656538 bmap: Tweaked bmap ranges, dropped in-flight tag for now
New bmap range tags:

  LFS3_TAG_BMRANGE      0x033u  v--- --11 --11 uuuu
  LFS3_TAG_BMFREE       0x0330  v--- --11 --11 ----
  LFS3_TAG_BMINUSE      0x0331  v--- --11 --11 ---1
  LFS3_TAG_BMERASED     0x0332  v--- --11 --11 --1-
  LFS3_TAG_BMBAD        0x0333  v--- --11 --11 --11

Note 0x334-0x33f are still reserved for future bmap tags, but the new
encoding fits in the surprisingly common 2-bit subfield that may
deduplicate some decoding code.

Fitting in 2-bits is the main reason for this, now that in-flight ranges
look like they won't be worth exploring further. Worst case we can
always add more bm tags in the future. And it may even make sense to use
an entire bit for in-flight tags, since in theory the concept can apply
to more than just in-use blocks.

---

Another benefit of this encoding: In-use vs free is a bit check, and I
like the implication that an in-use + erased block can only be a bad
block.
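
Read off the table above, the low 2 bits can be treated as an in-use
bit and an erased bit. A small sketch of that reading (the helper names
are hypothetical, and the bit meanings are an interpretation of the
table, not something lfs3.c is confirmed to do):

```python
BMFREE   = 0x0330
BMINUSE  = 0x0331
BMERASED = 0x0332
BMBAD    = 0x0333

def bm_isinuse(tag):
    # bit 0 set => not available for allocation
    return bool(tag & 0x1)

def bm_iserased(tag):
    # bit 1 set => erased
    return bool(tag & 0x2)

def bm_isbad(tag):
    # in-use + erased makes no sense for a real block, so this
    # combination is what marks a bad block
    return (tag & 0x3) == 0x3
```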

No code changes:

                code          stack          ctx
  before:      37172           2352          684
  after:       37172 (+0.0%)   2352 (+0.0%)  684 (+0.0%)

                code          stack          ctx
  bmap before: 38844           2456          800
  bmap after:  38844 (+0.0%)   2456 (+0.0%)  800 (+0.0%)
2025-10-09 14:33:24 -05:00
27a722456e scripts: Added support for SI-prefixes as iI punescape modifiers
This adds %i and %I as punescape modifiers for limited printing of
integers with SI prefixes:

- %(field)i - base-10 SI prefixes
  - 100   => 100
  - 10000 => 10K
  - 0.01  => 10m

- %(field)I - base-2 SI prefixes
  - 128   => 128
  - 10240 => 10Ki
  - 0.125 => 128mi

These can also easily include units as a part of the punescape string:

- %(field)iops/s => 10Kops/s
- %(field)IB => 10KiB

This is particularly useful in plotmpl.py for adding explicit
x/yticklabels without sacrificing the automatic SI-prefixes.
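
For reference, a standalone sketch of the kind of formatting %i/%I
perform. The function name si and the prefix lists (including 'u' for
micro) are assumptions, and the real punescape code may round or clamp
differently:

```python
def si(x, base2=False):
    """Format x with base-10 (K, M, ...) or base-2 (Ki, Mi, ...) prefixes."""
    step = 1024 if base2 else 1000
    suffix = 'i' if base2 else ''
    if x == 0:
        return '0'
    sign, mag = ('-' if x < 0 else ''), abs(x)
    prefix = ''
    if mag >= 1:
        # scale down until we fit under one step
        for p in ['K', 'M', 'G', 'T']:
            if mag < step:
                break
            mag /= step
            prefix = p
    else:
        # scale up until we reach at least 1
        for p in ['m', 'u', 'n', 'p']:
            if mag >= 1:
                break
            mag *= step
            prefix = p
    return '%s%g%s' % (sign, mag, prefix + suffix if prefix else '')
```

Units are just string concatenation on top, e.g. si(10240, True) + 'B'.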
2025-10-01 17:56:51 -05:00
8666830515 bmap: scripts: Fixed missing geometry race condition
The gbmap's weight is defined by the block count stored in the geometry
config field, which should always be present in valid littlefs3 images.

But our scripts routinely try to parse _invalid_ littlefs3 images when
running in parallel with benchmarks/tests (littlefs3 does _not_ support
multiple read/writers), so this was causing exceptions to be thrown.

The fix is to just assume weight=0 when the geometry field is missing.
The image isn't valid, and the gbmap is optional anyways.
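
In sketch form, with a hypothetical dict-shaped config rather than the
scripts' real classes:

```python
def gbmap_weight(config):
    """Block count from the geometry field, or 0 if it is missing."""
    geometry = config.get('geometry')
    if geometry is None:
        # invalid/in-progress image, treat the gbmap as empty
        return 0
    return geometry['block_count']
```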
2025-10-01 17:56:13 -05:00
ebae43898e bmap: Changing direction, store bmap mode in wcompat flags
The idea behind separate ctrled+unctrled airspaces was to try to avoid
multiple interpretations of the on-disk bmap, but I'm starting to think
this adds more complexity than it solves.

The main conflict is the meaning of "in-flight" blocks. When using the
"uncontrolled" bmap algorithm, in-flight blocks need to be
double-checked by traversing the filesystem. But in the "controlled"
bmap algorithm, blocks are only marked as "in-flight" while they are
truly in-flight (in-use in RAM, but not yet in use on disk).
Representing these both with the same "in-flight" state risks
incompatible algorithms misinterpreting the bmap across different
mounts.

In theory the separate airspaces solve this, but now all the algorithms
need to know how to convert the bmap from different modes, adding
complexity and code cost.

Well, in theory at least. I'm unsure separate airspaces actually solves
this due to subtleties between what "in-flight" means in the different
algorithms (note both in-use and free blocks are "in-flight" in the
unknown airspace!). It really depends on how the "controlled" algorithm
actually works, which isn't implemented/fully designed yet.

---

Long story short, due to a time crunch, I'm ripping this out for now and
just storing the current algorithm in the wcompat flags:

  LFS3_WCOMPAT_GBMAP       0x00006000  Global block-map in use
  LFS3_WCOMPAT_GBMAPNONE   0x00000000  Gbmap not in use
  LFS3_WCOMPAT_GBMAPCACHE  0x00002000  Gbmap in cache mode
  LFS3_WCOMPAT_GBMAPVFR    0x00004000  Gbmap in VFR mode
  LFS3_WCOMPAT_GBMAPIFR    0x00006000  Gbmap in IFR mode
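
Since the mode is a 2-bit field rather than independent flags, decoding
is a mask and a lookup. A sketch using the values from this commit (the
helper name and the mode strings are hypothetical):

```python
WCOMPAT_GBMAP = 0x00006000

GBMAP_MODES = {
    0x00000000: 'none',
    0x00002000: 'cache',
    0x00004000: 'vfr',
    0x00006000: 'ifr',
}

def gbmap_mode(wcompat):
    # mask out the 2-bit mode field, ignoring unrelated wcompat bits
    return GBMAP_MODES[wcompat & WCOMPAT_GBMAP]
```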

Note GBMAPVFR/IFR != BMAPSLOW/FAST! At least BMAPSLOW/FAST can share
bmap representations:

- GBMAPVFR => Uncontrolled airspace, i.e. in-flight blocks may or may
  not be in use, need to traverse open files.

- GBMAPIFR => Controlled airspace, i.e. in-flight blocks are in use,
  at least until powerloss, no traversal needed, but requires more bmap
  writes.

- BMAPSLOW => Treediff by checking what blocks are in B but not in A,
  and what blocks are in A but not in B, O(n^2), but minimizes bmap
  updates.

  Can be optimized with a bloom filter.

- BMAPFAST => Treediff by clearing all blocks in A, and then setting all
  blocks in B, O(n), but also writes all blocks to the bmap twice even
  on small changes.

  Can be optimized with a sliding bitmap window (or a block hashtable,
  though a bitmap converges to the same thing in both algorithms when
  >=disk_size).
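
A toy comparison of the two treediff strategies, counting bmap writes.
This models trees A and B as plain Python sets of block addresses, so
it only illustrates the write counts, not the lookup cost (set lookups
hide the O(n^2) part). All names here are hypothetical:

```python
def treediff_slow(a, b):
    """Only write blocks that actually changed state."""
    writes = []
    for block in sorted(a):
        if block not in b:
            writes.append((block, 'free'))
    for block in sorted(b):
        if block not in a:
            writes.append((block, 'inuse'))
    return writes

def treediff_fast(a, b):
    """Clear everything in A, then set everything in B."""
    writes = [(block, 'free') for block in sorted(a)]
    writes += [(block, 'inuse') for block in sorted(b)]
    return writes

def apply_writes(bmap, writes):
    bmap = dict(bmap)
    for block, state in writes:
        bmap[block] = state
    return bmap
```

Both produce the same final bmap. With A={1,2,3} and B={2,3,4}, the
slow variant issues 2 writes and the fast variant issues 6.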

It will probably be worth unifying the bmap representation later (the
more algorithm-specific flags there are, the harder interop becomes for
users), but for now this opens a path to implementing/experimenting with
bmap algorithms without dealing with this headache.
2025-10-01 17:56:08 -05:00
e7c3755e21 bmap: Split known into ctrled+unctrled 2025-10-01 17:56:05 -05:00
5f65b49ef8 bmap: scripts: Added on-disk bmap traversal to dbgbmap and friends
And yes, dbgbmapsvg.py's parents are working, thanks to a hacky blocks
@property (Python to the rescue!)
2025-10-01 17:55:16 -05:00
88180b6081 bmap: Initial scaffolding for on-disk block map
This is pretty exploratory work, so I'm going to try to be less thorough
in commit messages until the dust settles.

---

New tag for gbmapdelta:

  LFS3_TAG_GBMAPDELTA   0x0104  v--- ---1 ---- -1rr

New tags for in-bmap block types:

  LFS3_TAG_BMRANGE      0x033u  v--- --11 --11 uuuu
  LFS3_TAG_BMFREE       0x0330  v--- --11 --11 ----
  LFS3_TAG_BMINFLIGHT   0x0331  v--- --11 --11 ---1
  LFS3_TAG_BMINUSE      0x0332  v--- --11 --11 --1-
  LFS3_TAG_BMBAD        0x0333  v--- --11 --11 --11
  LFS3_TAG_BMERASED     0x0334  v--- --11 --11 -1--

New gstate decoding for gbmap:

  .---+- -+- -+- -+- -. cursor: 1 leb128  <=5 bytes
  | cursor            | known:  1 leb128  <=5 bytes
  +---+- -+- -+- -+- -+ block:  1 leb128  <=5 bytes
  | known             | trunk:  1 leb128  <=4 bytes
  +---+- -+- -+- -+- -+ cksum:  1 le32    4 bytes
  | block             | total:            23 bytes
  +---+- -+- -+- -+- -'
  | trunk         |
  +---+- -+- -+- -+
  |     cksum     |
  '---+---+---+---'
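
A sketch of decoding that layout, assuming standard unsigned leb128 and
the field order shown in the diagram. The function names are
hypothetical, and no size limits or cksum validation are enforced here:

```python
import struct

def fromleb128(data, off=0):
    """Decode one unsigned leb128, returns (word, new offset)."""
    word = 0
    shift = 0
    while True:
        b = data[off]
        off += 1
        word |= (b & 0x7f) << shift
        shift += 7
        if not (b & 0x80):
            return word, off

def decode_gbmap(data):
    """Decode cursor, known, block, trunk (leb128s) + cksum (le32)."""
    off = 0
    fields = {}
    for name in ['cursor', 'known', 'block', 'trunk']:
        fields[name], off = fromleb128(data, off)
    fields['cksum'], = struct.unpack_from('<I', data, off)
    return fields
```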

New bmap node revdbg string:

  vvv---- -111111- -11---1- -11---1-  (62 62 7e v0  bb~r)  bmap node

New mount/format/info flags (still unsure about these):

  LFS3_M_BMAPMODE     0x03000000  On-disk block map mode
  LFS3_M_BMAPNONE     0x00000000  Don't use the bmap
  LFS3_M_BMAPCACHE    0x01000000  Use the bmap to cache lookahead scans
  LFS3_M_BMAPSLOW     0x02000000  Use the slow bmap algorithm
  LFS3_M_BMAPFAST     0x03000000  Use the fast bmap algorithm

New gbmap wcompat flag:

  LFS3_WCOMPAT_GBMAP  0x00002000  Global block-map in use
2025-10-01 17:55:13 -05:00
8cc81aef7d scripts: Adopt __get__ binding for write/writeln methods
This actually binds our custom write/writeln functions as methods to the
file object:

  def writeln(self, s=''):
      self.write(s)
      self.write('\n')
  f.writeln = writeln.__get__(f)

This doesn't really gain us anything, but is a bit more correct and may
be safer if other code messes with the file's internals.
2025-06-27 12:56:03 -05:00
f967cad907 kv: Adopted LFS3_o_WRSET for better key-value API integration
This adds LFS3_o_WRSET as an internal-only 3rd file open mode (I knew
that missing open mode would come in handy) that has some _very_
interesting behavior:

- Do _not_ clear the configured file cache. The file cache is prefilled
  with the file's data.

- If the file does _not_ exist and is small, create it immediately in
  lfs3_file_open using the provided file cache.

- If the file _does_ exist or is not small, do nothing and open the file
  normally. lfs3_file_close/sync can do the rest of the work in one
  commit.

This makes it possible to implement one-commit lfs3_set on top of the
file APIs with minimal code impact:

- All of the metadata commit logic can be handled by lfs3_file_sync_, we
  just call lfs3_file_sync_ with the found did+name in lfs3_file_opencfg
  when WRSET.

- The invariant that lfs3_file_opencfg always reserves an mid remains
  intact, since we go ahead and write the full file if necessary,
  minimizing the impact on lfs3_file_opencfg's internals.

This claws back most of the code cost of the one-commit key-value API:

              code          stack          ctx
  before:    38232           2400          636
  after:     37856 (-1.0%)   2416 (+0.7%)  636 (+0.0%)

  before kv: 37352           2280          636
  after kv:  37856 (+1.3%)   2416 (+6.0%)  636 (+0.0%)

---

I'm quite happy with how this turned out. I was worried there for a bit the
key-value API was going to end up an ugly wart for the internals, but
with LFS3_o_WRSET this integrates quite nicely.

It also raises a really interesting question, should LFS3_o_WRSET be
exposed to users?

For now I'm going to play it safe and say no. While potentially useful,
it's still a pretty unintuitive API.

Another thing worth mentioning is that this does have a negative impact
on compile-time gc. Duplication adds code cost when viewing the system
as a whole, but tighter integration can backfire if the user never calls
half the APIs.

Oh well, compile-time opt-out is always an option in the future, and
users seem to care more about pre-linked measurements, probably because
it's an easier thing to find. Still, it's funny how measuring code can
have a negative impact on code. Something something Goodhart's law.
2025-06-22 15:37:07 -05:00
6eba1180c8 Big rename! Renamed lfs -> lfs3 and lfsr -> lfs3 2025-05-28 15:00:04 -05:00
bce8f45a64 scripts: Tried to better document ansi color codes 2025-05-25 13:00:11 -05:00
6d9c077261 Reordered LFSR_TAG_NAMELIMIT/FILELIMIT
Not sure why, but this just seems more intuitive/correct. Maybe because
LFSR_TAG_NAME is always the first tag in a file's attr set:

  LFSR_TAG_NAMELIMIT    0x0039  v--- ---- --11 1--1
  LFSR_TAG_FILELIMIT    0x003a  v--- ---- --11 1-1-

Seeing as several parts of the codebase still use the previous order,
it seems reasonable to switch back to that.

No code changes.
2025-05-24 21:51:06 -05:00
651c3e1eb4 scripts: Renamed Attr -> CsvAttr
Mainly to avoid confusion with littlefs's attrs, uattrs, rattrs, etc.

This risked things getting _really_ confusing as the scripts evolve.
2025-05-15 18:48:46 -05:00
c04f36ead4 scripts: plot[mpl].py: Adopted -s/--sort and -S for legend sorting
Before this, the only option for ordering the legend was by specifying
explicit -L/--add-label labels. This works for the most part, but
doesn't cover the case where you don't know the parameterization of the
input data.

And we already have -s/-S flags in other csv scripts, so it makes sense
to adopt them in plot.py/plotmpl.py to allow sorting by one or more
explicit fields.

Note that -s/-S can be combined with explicit -L/--add-labels to order
datasets with the same sort field:

  $ ./scripts/plot.py bench.csv \
          -bBLOCK_SIZE \
          -xn \
          -ybench_readed \
          -ybench_proged \
          -ybench_erased \
          --legend \
          -sBLOCK_SIZE \
          -L'*,bench_readed=bs=%(BLOCK_SIZE)s' \
          -L'*,bench_proged=' \
          -L'*,bench_erased='

---

Unfortunately this conflicted with -s/--sleep, which is a common flag in
the ascii-art scripts. This was bound to conflict with -s/--sort
eventually, so I came up with some alternatives:

- -s/--sleep -> -~/--sleep
- -S/--coalesce -> -+/--coalesce

But I'll admit I'm not the happiest about these...
2025-05-15 15:51:49 -05:00
55ea13b994 scripts: Reverted del to resolve shadowed builtins
I don't know how I completely missed that this doesn't actually work!

Using del _does_ work in Python's repl, but it makes sense the repl may
differ from actual function execution in this case.

The problem is Python still thinks the relevant builtin is a local
variable after deletion, raising an UnboundLocalError instead of
performing a global lookup. In theory this would work if the variable
could be made global, but since global/nonlocal statements are lifted,
Python complains with "SyntaxError: name 'list' is parameter and
global".
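
A small reproduction of both failure modes. The builtins-module
workaround at the end is only one possible alternative, not necessarily
what the scripts ended up doing:

```python
import builtins

def broken(list):
    # 'list' is a local here (the parameter), and del only unbinds it,
    # the name stays local so the builtin is never looked up
    del list
    return list((1, 2, 3))

def global_is_rejected():
    # declaring a parameter global is rejected at compile time
    src = "def f(list):\n    global list\n"
    try:
        compile(src, '<sketch>', 'exec')
    except SyntaxError:
        return True
    return False

def workaround(list):
    # reach the shadowed builtin explicitly instead
    return builtins.list(list)
```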

And that's A-Ok! Intentionally shadowing language builtins already puts
this code deep into ugly hacks territory.
2025-05-15 14:10:42 -05:00
48c1a016a0 scripts: Fixed missing tuple unpack in glob-all CLI attrs
This was broken:

  $ ./scripts/plotmpl.py -L'*=bs=%(bs)s'

There may be a better way to organize this logic, but spamming if
statements works well enough.
2025-05-15 13:47:09 -05:00
4a50c5c9ce scripts: dbgbmap[d3].py: Adopted slightly different row prioritization
This still forces the block_rows_ <= height invariant, but also prevents
ceiling errors from introducing blank rows.

I guess the simplest solution is the best one, eh?
2025-04-30 02:30:31 -05:00
de7564e448 Added phase bits to cksum tags
This carves out two more bits in cksum tags to store the "phase" of the
rbyd block (maybe the name is too fancy, this is just the lowest 2 bits
of the block address):

  LFSR_TAG_CKSUM        0x300p  v-11 ---- ---- -pqq
                                                ^ ^
                                                | '-- phase bits
                                                '---- perturb bit

The intention here is to catch mrootanchors that are "out-of-phase",
i.e. they've been shifted by a small number of blocks.

This can happen if we find the wrong mrootanchor (after, say, a magic
scan), and risks filesystem corruption:

                formatted
  .-----------------'-----------------.
                          mounted
           .-----------------'-----------------.
  .--------+--------+--------+--------+ ...
  |(erased)| mroot  |
  |        | anchor |                   ...
  |        |        |
  '--------+--------+--------+--------+ ...

Including the lower 2 bits of the block address in cksum tags avoids
this, for up to a 3 block shift (the maximum number of redund
mrootanchors).
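
A sketch of the check, using the bit layout from the diagram above
(perturb in bit 2, phase in bits 0-1). The helper names are
hypothetical and the valid bit is ignored:

```python
TAG_CKSUM = 0x3000

def cksum_tag(block, perturb=False):
    """Build a cksum tag carrying the low 2 bits of the block address."""
    return TAG_CKSUM | (0x4 if perturb else 0) | (block & 0x3)

def in_phase(tag, block):
    """Does the tag's phase match the block we read it from?"""
    return (tag & 0x3) == (block & 0x3)
```

A tag written at block 1 fails the check when read back at blocks 2,
3, or 4, but passes again at block 5, which is why the protection only
covers shifts of up to 3 blocks.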

---

Note that cksum tags really are the only place we could put these bits.
Anywhere else and they would interfere with the canonical cksum, which
would break error correction. By definition these need to be different
per block.

We include these phase bits in every cksum tag (because it's easier),
but these don't really say much about mdirs that are not the
mrootanchor. Non-anchor mdirs can have arbitrary block addresses,
therefore arbitrary phase bits.

You _might_ be able to do something interesting if you sort the rbyd
addresses and use the index as the phase bits, but that would add quite
a bit of code for questionable benefit...

You could argue this adds noise to our cksums, but:

1. 2 bits seems like a really small amount of noise
2. our cksums are just crc32cs
3. the phase bits humorously never change when you rewrite a block

---

As with any feature this adds code, but only a small amount. I think
it's worth the extra protection:

           code          stack          ctx
  before: 35792           2368          636
  after:  35824 (+0.1%)   2368 (+0.0%)  636 (+0.0%)

Also added test_mount_incompat_out_of_phase to test this.

The dbg scripts _don't_ error (block mismatch seems likely when
debugging), but dbgrbyd.py at least adds phase mismatch notes in
-l/--log mode.
2025-04-30 00:57:17 -05:00
f2e6b60f36 Reworked grm encoding a bit
This drops the leading count/mode byte, and instead uses mid=0 to
terminate grms. This shaves off 1 byte from grmdeltas.

Previously, we needed the count/mode byte for a couple reasons:

- We needed to know the number of grm entries somehow, and there wasn't
  always an obvious sentinel value. mid=-1, for example, is
  unrepresentable with our unsigned leb128 encoding.

  But now that development has settled, we can use mid=0.0 to figure out
  the end-of-queue. mid=0.0 should always map to the root bookmark,
  which doesn't make sense to delete, so it makes for a reasonable null
  terminator here.

- It provided a route for future grm extensions, which could use the >2
  count/mode encodings.

  But I think we can use additional grm tag encodings for this.

  There's only one gdelta tag so far, but the current plan for future
  gdelta tags is to carve out the bottom 2 bits for redund like we do
  with the struct tags:

    LFSR_TAG_GDELTA        0x01tt  v--- ---1 -ttt ttrr
    LFSR_TAG_GRMDELTA      0x0100  v--- ---1 ---- ----
    LFSR_TAG_GBMAPDELTA    0x0104  v--- ---1 ---- -1rr
    LFSR_TAG_GDDTREEDELTA  0x0108  v--- ---1 ---- 1-rr
    LFSR_TAG_GPTREEDELTA   0x010c  v--- ---1 ---- 11rr
    ...

  Decoding is a bit more complicated for gstate, since we will need to
  xor those bits if mutable, but this avoids needing a full byte just
  for redund in every auxiliary tree.

  Long story short, we can leverage the lower 2 bits of the grm tag for
  future extensions using the same mechanism.

This may seem like a lot of effort for only a handful of bytes, but keep
in mind each gdelta lives in more-or-less every mdir in the filesystem.
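
To make the null-terminator idea concrete, here is a small Python sketch. It
is an assumption-heavy illustration, not the on-disk format: mids are treated
as plain integers, the terminator is always written explicitly, and the
helper names (grm_encode, grm_decode) are made up for this example:

```python
def toleb128(word):
    """Encode a non-negative int as unsigned leb128."""
    out = bytearray()
    while True:
        byte = word & 0x7f
        word >>= 7
        if word:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def fromleb128(data, j=0):
    """Decode one unsigned leb128 at index j, returns (word, size)."""
    word, d = 0, 0
    while j+d < len(data):
        b = data[j+d]
        word |= (b & 0x7f) << 7*d
        d += 1
        if not b & 0x80:
            break
    return word, d

def grm_encode(mids):
    """Hypothetical: queue of mids terminated by mid=0, no count byte."""
    # mid=0 is the terminator, so it can't appear in the queue
    assert all(mid != 0 for mid in mids)
    return b''.join(toleb128(mid) for mid in mids) + toleb128(0)

def grm_decode(data):
    """Hypothetical: read mids until we hit mid=0 or run out of data."""
    mids = []
    j = 0
    while j < len(data):
        mid, d = fromleb128(data, j)
        j += d
        if mid == 0:
            break
        mids.append(mid)
    return mids
```

For example, grm_decode(grm_encode([5, 300])) round-trips to [5, 300], and an
empty queue encodes to a single zero byte.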

Also saves a bit of code/ctx:

           code          stack          ctx
  before: 35772           2368          640
  after:  35768 (-0.0%)   2368 (+0.0%)  636 (-0.6%)
2025-04-30 00:53:33 -05:00
dc2d58d28e scripts: dbgbmap[d3].py: Prioritize rows at low resolution
This prevents some pretty unintuitive behavior with dbgbmap.py -H2 (the
default) in the terminal.

Consider before:

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  mm--------b-----mm--mm--mm--mmmmmmm--mm--mmmm-----------------------

Vs after:

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  m-----------------------------------b-mmmmmmmm----------------------

Compared to the original bmap (-H5):

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  mm------------------------------------------------------------------
  --------------------------------------------------------------------
  ----------b-----mm--mm--mm--mmmmmmm--mm--mmmm-----------------------
  --------------------------------------------------------------------

What's happening is dbgbmap.py is prioritizing aspect ratio over pixel
boundaries, so it's happy drawing a 4-row bmap to a 1-row Canvas. But of
course we can't see subpixels, so the result is quite confusing.

Prioritizing rows while tiling avoids this.
2025-04-30 00:44:26 -05:00
1f4d7b3b7e scripts: dbgmtree.py: Dropped Mtree.lookupnext
I was toying with making this look more like the mtree API in lfs.c (so
no lookupleaf/namelookupleaf, only lookup/namelookup), but dropped the
idea:

- It would be tedious

- The Mtree class's lookupleaf/namelookupleaf are also helpful for
  returning inner btree nodes when printing debug info

- Not embedding mids in the Mdir class would complicate things

It's ok for these classes to not match littlefs's internal API
_exactly_. The goal is easy access for debug info, not to port the
filesystem to Python.

At least dropped Mtree.lookupnext, because that function really makes no
sense.
2025-04-30 00:44:16 -05:00
677c078b50 Added LFSR_TAG_BNAME/MNAME, stop btree lookups at first tag
Now that we don't have to worry about name tag conflicts as much, we
can add name tags for things that aren't files.

This adds LFSR_TAG_BNAME for branch names, and LFSR_TAG_MNAME for mtree
names. Note that the upper 4 bits of the subtype match LFSR_TAG_BRANCH
and LFSR_TAG_MDIR respectively:

  LFSR_TAG_BNAME        0x0200  v--- --1- ---- ----
  LFSR_TAG_MNAME        0x0220  v--- --1- --1- ----

  LFSR_TAG_BRANCH       0x030r  v--- --11 ---- --rr
  LFSR_TAG_MDIR         0x0324  v--- --11 --1- -1rr

The encoding is somewhat arbitrary, but I figured reserving ~31 types
for files is probably going to be plenty for littlefs. POSIX seems to
do just fine with only ~7 all these years, and I think custom attributes
will be more enticing for "niche" file types (symlinks, compressed
files, etc), given the easy backwards compatibility.

---

In addition to the debugging benefits, the new name tags let us stop
btree lookups on the first non-bname/branch tag. Previously we always
had to fetch the first struct tag as well to check if it was a branch.

In theory this saves one rbyd lookup, but in practice it's a bit muddy.

The problem is that there are two ways to use named btrees:

1. As buckets: mtree -> mdir -> mid
2. As a table: ddtree -> ddid

The only named btree we _currently_ have is the mtree. And the mtree
operates in bucket mode, with each mdir acting more-or-less as an
extension to the btree. So we end up needing to do the second tag lookup
anyways, and all we've done is complicate the code.

But we will _eventually_ need the table mode for the ddtree, where we
care if the ddname is an exact match.

And returning the first tag is arguably the more "correct" internal API,
vs arbitrarily the first struct tag.

But then again this change is pretty pricey...

           code          stack          ctx
  before: 35732           2440          640
  after:  35888 (+0.4%)   2480 (+1.6%)  640 (+0.0%)

---

It's worth noting the new BNAME/MNAME tags don't _require_ the btree
lookup changes (which is why we can get away with not touching the dbg
scripts). The previous algorithm of always checking for branch tags
still works.

Maybe there's an argument for conditionally using the previous API when
compiling without the ddtree, but that sounds horrendously messy...
2025-04-30 00:25:30 -05:00
5eb194c215 scripts: dbgbmap[d3].py: Limited block conflicts to mismatched types
Block conflict detection was originally implemented with non-dags in
mind. But now that dags are allowed, we shouldn't treat them as errors!

Instead, we only report blocks as conflicts if multiple references have
mismatching types.
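
Roughly, the new rule looks like the following Python sketch. The function
name and the (block, type) pair representation are assumptions made for this
example, not the actual dbgbmap.py internals:

```python
from collections import defaultdict

def find_conflicts(refs):
    """Given (block, type) references, return {block: types} for blocks
    that are referenced with more than one distinct type.

    Multiple references with the same type are fine (dags are allowed).
    """
    types = defaultdict(set)
    for block, type_ in refs:
        types[block].add(type_)
    return {block: ts for block, ts in types.items() if len(ts) > 1}

refs = [
    (1, 'mdir'),
    (7, 'data'), (7, 'data'),   # shared block, same type => ok
    (9, 'btree'), (9, 'data'),  # mismatched types => conflict
]
conflicts = find_conflicts(refs)
```

Here only block 9 is reported, since block 7 is referenced twice but always
as data.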

This should still be very useful for debugging the upcoming bmap work.
2025-04-29 16:25:45 -05:00
d308ec8322 Reworked tag encoding a little bit
Mainly to make room for some future planned stuff:

- Moved the mroot's redund bits from LFSR_TAG_GEOMETRY to
  LFSR_TAG_MAGIC:

    LFSR_TAG_MAGIC        0x003r  v--- ---- --11 --rr

  This has the benefit of living in a fixed location (off=0x5), which
  may make mounting/debugging easier. It also makes LFSR_TAG_GEOMETRY
  less of a special case (LFSR_TAG_MAGIC is already a _very_ special
  case).

  Unfortunately, this does get in the way of our previous magic=0x3
  encoding. To compensate (and to avoid conflicts with LFSR_TAG_NULL),
  I've added the 0x3_ prefix. This has the funny side-effect of
  rendering redunds 0-3 as ascii 0-3 (0x30-0x33), which is a complete
  accident but may actually be useful when debugging.

  Currently all config tags fit in the 0x3_ prefix, which is nice for
  debugging but not a hard requirement.

- Flipped LFSR_TAG_FILELIMIT/NAMELIMIT:

    LFSR_TAG_FILELIMIT    0x0039  v--- ---- --11 1--1
    LFSR_TAG_NAMELIMIT    0x003a  v--- ---- --11 1-1-

  The file limit is a _bit_ more fundamental. It's effectively the
  required integer size for the filesystem.

  These may also be followed by LFSR_TAG_ATTRLIMIT based on how future
  attr revisits go.

- Rearranged struct tags so that LFSR_TAG_BRANCH = 0x300:

    LFSR_TAG_BRANCH       0x030r  v--- --11 ---- --rr
    LFSR_TAG_DATA         0x0304  v--- --11 ---- -1--
    LFSR_TAG_BLOCK        0x0308  v--- --11 ---- 1err
    LFSR_TAG_DDKEY*       0x0310  v--- --11 ---1 ----
    LFSR_TAG_DID          0x0314  v--- --11 ---1 -1--
    LFSR_TAG_BSHRUB       0x0318  v--- --11 ---1 1---
    LFSR_TAG_BTREE        0x031c  v--- --11 ---1 11rr
    LFSR_TAG_MROOT        0x032r  v--- --11 --1- --rr
    LFSR_TAG_MDIR         0x0324  v--- --11 --1- -1rr
    LFSR_TAG_MTREE        0x032c  v--- --11 --1- 11rr

    *Planned

  LFSR_TAG_BRANCH is a very special tag when it comes to bshrub/btree
  traversal, so I think it deserves the subtype=0 slot.

  This also just makes everything fit together better, and makes room
  for the future planned ddkey tag.

Code changes minimal:

           code          stack          ctx
  before: 35728           2440          640
  after:  35732 (+0.0%)   2440 (+0.0%)  640 (+0.0%)
2025-04-29 16:25:00 -05:00
7dd473df82 Tweaked LFSR_TAG_STICKYNOTE encoding 0x205 -> 0x203
Now that LFS_TYPE_STICKYNOTE is a real type users can interact with, it
makes sense to group it with REG/DIR. This also has the side-effect of
making these contiguous.

---

LFSR_TAG_BOOKMARKs, however, are still hidden from the user. This
unfortunately means there will be a bit of a jump if we ever add
LFS_TYPE_SYMLINK in the future, but I'm starting to wonder if that's the
best way to approach symlinks in littlefs...

If instead LFS_TYPE_SYMLINKS were implied via custom attribute, you
could avoid the headache that comes with adding a new tag encoding, and
allow perfect compatibility with non-symlink drivers. Win win.

This seems like a better approach for _all_ of the theoretical future
types (compressed files, device files, etc), and avoids the risk of
oversaturating the type space.

---

This had a surprising impact on code for just a minor encoding tweak. I
guess the contiguousness pushed the compiler to use tables/ranges for
more things? Or maybe 3 vs 5 is just an easier constant to encode?

           code          stack          ctx
  before: 35952           2440          640
  after:  35928 (-0.1%)   2440 (+0.0%)  640 (+0.0%)
2025-04-24 14:35:52 -05:00
a73f221317 scripts: Fixed issue where rbyd lookups rejected shrub tags
This was caused by including the shrub bit in the tag comparison in
Rbyd.lookup.

Fixed by adding an extra key mask (0xfff). Note this is already how
lfsr_rbyd_lookup works in lfs.c.
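
A minimal sketch of the idea in Python, assuming the shrub bit sits just
above the 12-bit key (0x1000). The constant and function names here are
hypothetical, and the real lookup does more than a simple equality check:

```python
TAG_SHRUB = 0x1000    # assumed position of the shrub bit
TAG_KEY_MASK = 0xfff  # the key mask from this fix

def tag_matches(tag, key):
    """Compare tags while ignoring the shrub bit (and anything above it)."""
    return (tag & TAG_KEY_MASK) == (key & TAG_KEY_MASK)

# a shrub tag no longer fails to match its non-shrub key
shrub_tag = TAG_SHRUB | 0x202
matched = tag_matches(shrub_tag, 0x202)
```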
2025-04-23 23:19:37 -05:00
6d97398efc scripts: dbglfs.py: Fixed a couple mid=-1 issues
- Fixed Mtree.lookupleaf accepting mbid=0, which caused dbglfs.py to
  double print all files with mbid=-1

- Fixed grm mids not being mapped to mbid=-1 and related orphan false
  positives
2025-04-23 23:19:05 -05:00
8f1ccf089e Adopted lookupleaf, reworked internal btree APIs
This was a surprising side-effect of the script rework: Realizing the
internal btree/rbyd lookup APIs were awkwardly inconsistent and could be
improved with a couple tweaks:

- Adopted lookupleaf name for functions that return leaf rbyds/mdirs.

  There's an argument this should be called lookupnextleaf, since it
  returns the next bid, unlike lookup, but I'm going to ignore that
  argument because:

  1. A non-next lookupleaf doesn't really make sense for trees where
     you don't have to fetch the leaf (the mtree)

  2. It would be a bit too verbose

- Adopted commitleaf name for functions that accept leaf rbyds.

  This makes the lfsr_bshrub_commit -> lfsr_btree_commit__ mess a bit
  more readable.

- Strictly limited lookup and lookupnext to return rattrs, even in
  complex trees like the mtree.

  Most use cases will probably stick to the lookupleaf variants, but at
  least the behavior will be consistent.

- Strictly limited lookup to expect a known bid/rid.

  This only really matters for lfsr_btree/bshrub_lookup, which as a
  quirk of their implementation _can_ lookup both bid + rattr at the
  same time. But I don't think we'll need this functionality, and
  limiting the behavior may allow for future optimizations.

  Note there is no lfsr_file_lookup. File btrees currently only ever
  have a single leaf rattr, so this API doesn't really make sense.

Internal API changes:

- lfsr_btree_lookupnext_ -> lfsr_btree_lookupleaf
- lfsr_btree_lookupnext  -> lfsr_btree_lookupnext
- lfsr_btree_lookup      -> lfsr_btree_lookup
- added                     lfsr_btree_namelookupleaf
- lfsr_btree_namelookup  -> lfsr_btree_namelookup
- lfsr_btree_commit__    -> lfsr_btree_commit_
- lfsr_btree_commit_     -> lfsr_btree_commitleaf
- lfsr_btree_commit      -> lfsr_btree_commit

- added                     lfsr_bshrub_lookupleaf
- lfsr_bshrub_lookupnext -> lfsr_bshrub_lookupnext
- lfsr_bshrub_lookup     -> lfsr_bshrub_lookup
- lfsr_bshrub_commit_    -> lfsr_bshrub_commitleaf
- lfsr_bshrub_commit     -> lfsr_bshrub_commit

- lfsr_mtree_lookup      -> lfsr_mtree_lookupleaf
- added                     lfsr_mtree_lookupnext
- added                     lfsr_mtree_lookup
- added                     lfsr_mtree_namelookupleaf
- lfsr_mtree_namelookup  -> lfsr_mtree_namelookup

- added                     lfsr_file_lookupleaf
- lfsr_file_lookupnext   -> lfsr_file_lookupnext
- added                     lfsr_file_commitleaf
- lfsr_file_commit       -> lfsr_file_commit

Also added lookupnext to Mdir/Mtree in the dbg scripts.

Unfortunately this did add both code and stack, but only because of the
optional mdir returns in the mtree lookups:

           code          stack          ctx
  before: 35520           2440          636
  after:  35548 (+0.1%)   2472 (+1.3%)  636 (+0.0%)
2025-04-20 15:53:18 -05:00
3ca6670dcd Always log mbid=-1 for mroots and inlined mdirs
So mbid=0 now implies the mdir is not inlined.

Downsides:

- A bit more work to calculate
- May lose information due to masking everything when mtree.weight==0
- Risk of confusion when in-lfs.c state doesn't match (mbid=-1 is
  implied by mtree.weight==0)

Upsides:

- Includes more information about the topology of the mtree
- Avoids multiple dbgmbids for the same physical mdir

Also added lfsr_dbgmbid and lfsr_dbgmrid to help make logging
easier/more consistent.

And updated dbg scripts.
2025-04-20 15:53:18 -05:00
04d3002f3a Adopted ceiling division in mbits formula
So now:
               (block_size)
  mbits = nlog2(----------) = nlog2(block_size) - 3
               (     8    )

Instead of:

               (     (block_size))
  mbits = nlog2(floor(----------)) = nlog2(block_size & ~0x7) - 3
               (     (     8    ))

This makes the post-log - 3 formula simpler, which we probably want to
prefer as it avoids a division. And ceiling is arguably more intuitive
corner case behavior.
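
The difference only shows up when block_size is not a multiple of 8. A quick
Python sketch, assuming nlog2(x) means ceil(log2(x)) (the function names are
made up for this example):

```python
def nlog2(x):
    """ceil(log2(x)) for x >= 1 (assumed definition of nlog2)."""
    return (x-1).bit_length()

def mbits_ceil(block_size):
    """New formula: equivalent to nlog2(ceil(block_size / 8))."""
    return nlog2(block_size) - 3

def mbits_floor(block_size):
    """Old formula: equivalent to nlog2(floor(block_size / 8))."""
    return nlog2(block_size & ~0x7) - 3

# identical for the usual power-of-two block sizes...
same = (mbits_ceil(4096), mbits_floor(4096))
# ...but not for odd ones
diff = (mbits_ceil(4100), mbits_floor(4100))
```

With block_size=4096 both formulas give 9, while block_size=4100 gives 10
with ceiling division and 9 with floor division.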

This may seem like a minor detail, but because mbits is purely
block_size derived and not configurable, any quirks here will become
a permanent compatibility requirement.

And hey, it saves a couple bytes (I'm not really sure why, the division
should've been optimized to a shift):

           code          stack          ctx
  before: 35528           2440          636
  after:  35520 (-0.0%)   2440 (+0.0%)  636 (+0.0%)
2025-04-20 15:53:18 -05:00
bd70270e11 scripts: Added -w/--word-bits to bound dbgleb128/dbgle32 parsing
This is limited to dbgle32.py, dbgleb128.py, and dbgtag.py for now.

This more closely matches how littlefs behaves, in that we read a
bounded number of bytes before leb128 decoding. This minimizes bugs
related to leb128 overflow and avoids reading inherently undecodable
data.

The previous unbounded behavior is still available with -w0.
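
For illustration, bounded decoding might look something like this Python
sketch. The signature is an assumption (the actual fromleb128 in the scripts
may differ), and this sketch simply stops at the bound rather than reporting
an error:

```python
def fromleb128(data, word_bits=32):
    """Decode one unsigned leb128, reading at most ceil(word_bits/7) bytes.

    word_bits=0 disables the bound. Returns (word, size).
    """
    limit = -(-word_bits // 7) if word_bits else len(data)
    word, d = 0, 0
    for b in data[:limit]:
        word |= (b & 0x7f) << 7*d
        d += 1
        if not b & 0x80:
            break
    return word, d

# a runaway leb128 that never terminates within 32 bits
garbage = bytes([0x80]*10 + [0x01])
bounded = fromleb128(garbage)
unbounded = fromleb128(garbage, word_bits=0)
```

With the default 32-bit bound we stop after 5 bytes, while word_bits=0 reads
all 11 bytes and returns a word far too large for 32 bits.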

Note this gives dbgle32.py much more flexibility in that it can now
decode other integer widths. Uh, ignore the name for now. At least it's
self documenting that the default is 32-bits...

---

Also fixed a bug in fromleb128 where size was reported incorrectly on
offset + truncated leb128.
2025-04-16 15:23:12 -05:00
0cea8b96fb scripts: Fixed O(n^2) slicing in Rbyd.fetch
Do you see the O(n^2) behavior in this loop?

  j = 0
  while j < len(data):
      word, d = fromleb(data[j:])
      j += d

The slice, data[j:], creates an O(n) copy every iteration of the loop.

A bit tricky. Or at least I found it tricky to notice. Maybe because
array indexing being cheap is baked into my brain...

Long story short, this repeated slicing resulted in O(n^2) behavior in
Rbyd.fetch and probably some other functions. Even though we don't care
_too_ much about performance in these scripts, having Rbyd.fetch run in
O(n^2) isn't great.

Tweaking all from* functions to take an optional index solves this, at
least on paper.
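
A sketch of what the index-taking variant looks like (the exact signatures in
the scripts may differ, and decode_all is just a hypothetical driver loop):

```python
def fromleb128(data, j=0):
    """Decode one unsigned leb128 starting at index j, without slicing.

    Returns (word, size).
    """
    word, d = 0, 0
    while j+d < len(data):
        b = data[j+d]
        word |= (b & 0x7f) << 7*d
        d += 1
        if not b & 0x80:
            break
    return word, d

def decode_all(data):
    """Decode back-to-back leb128s in O(n), no data[j:] copies."""
    words = []
    j = 0
    while j < len(data):
        word, d = fromleb128(data, j)
        words.append(word)
        j += d
    return words

words = decode_all(bytes([0x05, 0xac, 0x02, 0x00]))
```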

---

In practice I didn't actually find any measurable performance gain. I
guess array slicing in Python is optimized enough that the constant
factor takes over?

(Maybe it's being helped by us limiting Rbyd.fetch to block_size in most
scripts? I haven't tested NAND block sizes yet...)

Still, it's good to at least know this isn't a bottleneck.
2025-04-16 15:23:11 -05:00
b5c3b97ae1 scripts: Reworked dbgtag.py, added -i/--input, included hex in output
This just gives dbgtag.py a few more bells and whistles that may be
useful:

- Can now parse multiple tags from hex:

    $ ./scripts/dbgtag.py -x 71 01 01 01 12 02 02 02
    71 01 01 01    altrgt 0x101 w1 -1
    12 02 02 02    shrubdir w2 2

  Note this _does_ skip attached data, which risks some confusion but
  not skipping attached data will probably end up printing a bunch of
  garbage for most use cases:

    $ ./scripts/dbgtag.py -x 01 01 01 04 02 02 02 02 03 03 03 03
    01 01 01 04    gdelta 0x01 w1 4
    03 03 03 03    struct 0x03 w3 3

- Included hex in output. This is helpful for learning about the tag
  encoding and also helps identify tags when parsing multiple tags.

  I considered also including offsets, which might help with
  understanding attached data, but decided it would be too noisy. At
  some point you should probably jump to dbgrbyd.py anyways...

- Added -i/--input to read tags from a file. This is roughly the same as
  -x/--hex, but allows piping from other scripts:

    $ ./scripts/dbgcat.py disk -b4096 0 -n4,8 | ./scripts/dbgtag.py -i-
    80 03 00 08    magic 8

  Note this reads the entire file in before processing. We'd need to fit
  everything into RAM anyways to figure out padding.
2025-04-16 15:23:10 -05:00
a5747bb2b2 scripts: dbgmtree.py: Fixed minor mtree rendering/traversal issues
- Added TreeArt __bool__ and __len__.

  This was causing a crash in _treeartfrommtreertree when rtree was
  empty.

  The code was not updated in the set -> TreeArt class transition, and
  went unnoticed because it's unlikely to be hit unless the filesystem
  is corrupt.

  Fortunately(?) realtime rendering creates a bunch of transiently
  corrupt filesystem images.

- Tweaked lookupleaf to not include mroots in their own paths.

  This matches the behavior of leaf mdirs, and is intentionally
  different from btree's lookupleaf which needs to lookup the leaf rattr
  to terminate.

- Tweaked leaves to not remove the last path entry if it is an mdir.

  This hid the previous lookupleaf inconsistency. We only remove the
  last rbyd from the path because it is redundant, and for mdirs/mroots
  it should never be redundant.

  I ended up just replacing the corrupt check with an explicit check
  that the rbyd is redundant. This should be more precise and avoid
  issues like this in the future.

  Also adopted explicit redundant checks in Btree.leaves and
  Lfs.File.leaves.
2025-04-16 15:23:08 -05:00
57c77b1b72 scripts: Fixed most flickering issues in RingIO
Two new tricks:

1. Hide the cursor while redrawing the ring buffer.

2. Build up the entire redraw in RAM first, and render everything in a
   single write call.

These _mostly_ get rid of the cursor flickering issues in rapidly
updating scripts.
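
Both tricks fit in a few lines. This is a simplified Python sketch, not the
actual RingIO code: the function name is made up, and cursor repositioning
between redraws is left out. The escape sequences are the standard
hide/show cursor and erase-line codes:

```python
import io
import sys

def redraw(lines, out=sys.stdout):
    """Redraw lines with the cursor hidden, using a single write call."""
    buf = io.StringIO()
    buf.write('\x1b[?25l')   # hide cursor
    for line in lines:
        buf.write('\x1b[K')  # erase to end of line
        buf.write(line)
        buf.write('\n')
    buf.write('\x1b[?25h')   # show cursor
    # one write for the whole frame, so the terminal never sees a
    # half-drawn state
    out.write(buf.getvalue())
    out.flush()
```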
2025-04-16 15:23:05 -05:00
a5e59b2190 scripts: maps: Reverted all padding for status strings
After all, who doesn't love a good bit of flickering.

I think I was trying to be too clever, so reverting.

Printing these with no padding is the simplest solution, provides the
best information density, and worst case you can always add -s1 to limit
the update frequency if flickering is hurting readability.
2025-04-16 15:22:59 -05:00
27152ec597 scripts: maps: Adopted persistent padding for status strings
This automatically minimizes the status strings without flickering, all
it took was a bit of ~*global state*~.

---

If I'm remembering correctly, this was actually how tracebd.py used to
work before dbgbmap.py was added. The idea was dropped with dbgbmap.py
since dbgbmap.py relied on watch.py for real-time rendering and couldn't
persist state.

But now dbgbmap.py has its own -k/--keep-open flag, so that's not a
problem.
2025-04-16 15:22:58 -05:00
97c2287177 scripts: maps: Assume percentages never hit 100.0%
This isn't true, especially for dbgbmap.py, where 100% is very possible
in filesystems with small files. But by limiting padding to 99.9%, we
avoid the annoying wasted space caused by the rare but occasional 100.0%.
2025-04-16 15:22:57 -05:00
eb4c4c612e scripts: Dropped --padding from ascii art scripts
No one is realistically ever going to use this.

Ascii art is just too low resolution, trying to pad anything just wastes
terminal space. So we might as well not support --padding and save on
the additional corner cases.

Worst case, in the future we can always find this commit and revert
things.
2025-04-16 15:22:56 -05:00
5e817be9cc scripts: maps: Cleaned up comments and junk
This took a bit of a messy route, but these scripts should be good to go
now.
2025-04-16 15:22:54 -05:00
50f652d44f scripts: maps: Cleaned up/moved header generation before rendering
Should've probably been two commits, but:

1. Cleaned up tracebd.py's header generation to be consistent with
   dbgbmap.py and other scripts.

   Percentage fields are now consistently floats in all scripts,
   allowing user-specified precision when punescaping.

2. Moved header generation up to where we still have the disk open (in
   dbgbmap[d3].py), to avoid issues with lazy Lfs attrs trying to access
   the disk after it's been closed.

   Found while testing with --title='cksum %(cksum)08x'. Lfs tries to
   validate the gcksum last minute and things break.
2025-04-16 15:22:53 -05:00
f0b8d34230 scripts: maps: Fixed divide-by-zero when packing blocks into small maps
This can be hit when dealing with very small maps, which is common since
we're rendering to the terminal. Not crashing here at least allows the
header/usage string to be shown.
2025-04-16 15:22:52 -05:00