Commit Graph

454 Commits

Author SHA1 Message Date
52dd83096b Initial implementation of forward-looking erase-state CRCs
This change is necessary to handle out-of-order writes found by pjsg's
fuzzing work.

The problem is that it is possible for (non-NOR) block devices to write
pages in any order, or to even write random data in the case of a
power-loss. This breaks littlefs's use of the first bit in a page to
indicate the erase-state.

pjsg notes this behavior is documented in the W25Q here:
https://community.cypress.com/docs/DOC-10507

---

The basic idea here is to CRC the next page, and use this "erase-state CRC" to
check if the next page is erased and ready to accept programs.

.------------------. \   commit
|     metadata     | |
|                  | +---.
|                  | |   |
|------------------| |   |
| erase-state CRC -----. |
|------------------| | | |
|   commit CRC    ---|-|-'
|------------------| / |
|     padding      |   | padding (doesn't need CRC)
|                  |   |
|------------------| \ | next prog
|     erased?      | +-'
|        |         | |
|        v         | /
|                  |
|                  |
'------------------'

This is made a bit annoying since littlefs doesn't actually store the
page (prog_size) in the superblock, since it doesn't need to know the
size for any other operation. We can work around this by storing both
the CRC and size of the next page when necessary.

Another interesting note is that we don't need to any bit tweaking
information, since we read the next page every time we would need to
know how to clobber the erase-state CRC. And since we only read
prog_size, this works really well with our caching, since the caches
must be a multiple of prog_size.

This also brings back the internal lfs_bd_crc function, in which we can
use some optimizations added to lfs_bd_cmp.

Needs some cleanup but the idea is passing most relevant tests.
2022-12-17 12:42:05 -06:00
1278ec1d08 Adopted Brent's algorithm for cycle detection
The previous cycle detection algorithm (a naive check against the largest
possible tail list) is simple and gets the job done, but has the potential to
take a very long time on disks with many blocks. Brent's algorithm, on
the other hand, takes at most 2x the number of blocks in the tail list.

Originally naive cycle detection was chosen over Floyd's algorithm to
avoid the extra complexity of managing two desynced traversals for every
traversal of the tail list, but Brent's algorithm is very well suited for our
use case, requiring only we keep track of an additional mdir pointer on the
stack as we traverse.
2022-12-17 12:41:39 -06:00
0c781dd822 Merge remote-tracking branch 'origin/master' into test-and-bench-runners 2022-12-06 23:08:53 -06:00
0b11ce03b7 Fixed incorrect calculation of extra space needed in mdir blocks
Despite the comment being correct, the calculation is somehow off by a word,
meaning something must have been missed. Maybe the space for the move-delete
was missed since that was added later to avoid losing move-deletes during
relocations.

This was found with the new exhaustive power-loss searching added to the
test framework with -P2. The exact failure was
test_dirs_many_reentrant:2gg2cb:k4o6. This must be the first test that
ends up with all possible extra state in a single mdir block.
2022-11-28 12:51:18 -06:00
eba5553314 Fixed hidden orphans by separating deorphan search into two passes
This happens in rare situations where there is a failed mdir relocation,
interrupted by a power-loss, containing the destination of a directory
rename operation, where the directory being renamed preceded the
relocating mdir in the mdir tail-list. This requires at some point for a
previous directory rename to create a cycle.

If this happens, it's possible for the half-orphan to contain the only
reference to the renamed directory. Since half-orphans contain outdated
state when viewed through the mdir tail-list, the renamed directory
appears to be a full-orphan until we fix the relocating half-orphan.
This causes littlefs to incorrectly remove the renamed directory from
the mdir tail-list, causes catastrophic problems down the line.

The source of the problem is that the two different types of orphans
really operate on two different levels of abstraction: half-orphans fix
failed mdir commits, while full-orphans fix directory removes/renames.
Conflating the two leads to situations where we attempt to fix assumed
problems about the directory tree before we have fixed problems with the
mdir state.

The fix here is to separate out the deorphan search into two passes: one
to fix half-orphans and correct any mdir-commits, restoring the mdirs
and gstate to a known good state, then two to fix failed
removes/renames.

---

This was found with the -Plinear heuristic powerloss testing, which now
runs on more geometries. The failing case was:

  test_relocations_reentrant_renames:112gg261dk1e3f3:123456789abcdefg1h1i1j1k1
  l1m1n1o1p1q1r1s1t1u1v1g2h2i2j2k2l2m2n2o2p2q2r2s2t2

Also fixed/tweaked some parts of the test framework as a part of finding
this bug:

- Fixed off-by-one in exhaustive powerloss state encoding.

- Added --gdb-powerloss-before and --gdb-powerloss-after to help debug
  state changes through a failing powerloss, maybe this should be
  expanded to any arbitrary powerloss number in the future.

- Added lfs_emubd_crc and lfs_emubd_bdcrc to get block/bd crcs for quick
  state comparisons while debugging.

- Fixed bd read/prog/erase counts not being copied during exhaustive
  powerloss testing.

- Fixed small typo in lfs_emubd trace.
2022-11-28 12:51:18 -06:00
6a53d76e90 Merge pull request #744 from littlefs-project/fix-fetchmatch-err-path
Fix lfs_dir_fetchmatch not propogating bd errors correctly in one case
2022-11-10 10:32:30 -06:00
dfa8abdd2c Merge pull request #740 from cbiffle/fix-invalid-block-size-reporting
Fix invalid block size reporting.
2022-11-10 10:31:58 -06:00
d8c96abf92 Merge pull request #724 from littlefs-project/clang-lint
Fix self-assign warnings discovered by clang, remove some warning flags
2022-11-10 10:31:42 -06:00
d08f949afd Fixed lfs_dir_fetchmatch not propogating bd errors correctly in one case
Found by cbiffle
2022-11-04 13:45:13 -05:00
eb9f4d5d7e Fix invalid block size reporting.
This boilerplate got copied from the stanza just above and incompletely
edited.
2022-10-09 17:45:14 -07:00
47914b925f Fixed self-assign warnings discovered by clang 2022-09-07 12:46:29 -05:00
6c720dc2bb Fix unused function warning with LFS_NO_MALLOC 2022-04-25 12:12:41 +02:00
abbfe8e92e Reduced lfs_dir_traverse's explicit stack to 3 frames
This is possible thanks to invoxiaamo's optimization of compacting
renames to avoid the O(n^3) nested filters. Not only does this
significantly reduce the runtime cost of that operation, but it
reduces the maximum possible depth of recursion to 3 frames.

Deepest lfs_dir_traverse before:

traverse with commit
'-> traverse with filter
    '-> traverse with move
        '-> traverse with filter

Deepest lfs_dir_traverse after:

traverse with commit
'-> traverse with move
    '-> traverse with filter
2022-04-10 23:27:49 -05:00
c60c977c25 Merge pull request #658 from littlefs-project/no-recursion
Restructure littlefs to not use recursion, measure stack usage
2022-04-10 23:23:39 -05:00
3ce64d1ac0 Merge pull request #666 from invoxiaamo/rename-opti2
Optimization of the rename case.
2022-04-10 22:02:04 -05:00
0ced3623d4 Merge pull request #657 from littlefs-project/copyright-update
Update copyright notice
2022-04-10 21:59:27 -05:00
f28ac3ea7d Merge pull request #638 from lmapii/master
Removed invalid overwrite for return value.
2022-04-10 21:52:48 -05:00
a94fbda1cd Merge pull request #632 from robekras/patch-1
Fix lfs_file_rawseek performance issue
2022-04-10 21:52:27 -05:00
cc025653ed Merge pull request #630 from Johnxjj/dev-johnxjj
add the limit, the cursor cannot be set to a negative number
2022-04-10 14:44:47 -05:00
bfb9bd2483 Merge pull request #614 from nnayo/fix_no_malloc_2
don't use lfs_file_open() when LFS_NO_MALLOC is set
2022-04-10 14:44:33 -05:00
f40b854ab5 Merge pull request #584 from colin-foster-in-advantage/block_size_mount_fail
Fail mount when the block size changes
2022-04-10 14:44:24 -05:00
c2fa1bb7df Optimization of the rename case.
Rename can be VERY time consuming. One of the reasons is the 4 recursion
level depth of lfs_dir_traverse() seen if a compaction happened during the
rename.

lfs_dir_compact()
  size computation
    [1] lfs_dir_traverse(cb=lfs_dir_commit_size)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])
  followed by the compaction itself:
    [1] lfs_dir_traverse(cb=lfs_dir_commit_commit)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])

Yet, analyse shows that levels [3] and [4] don't perform anything
if the callback is lfs_dir_traverse_filter...

A practical example:

- format and mount a 4KB block FS
- create 100 files of 256 Bytes named "/dummy_%d"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
This triggers a compaction where lfs_dir_traverse was called 148393 times,
generating 25e6+ lfs_bd_read calls (~100 MB+ of data)

With the optimization, lfs_dir_traverse is now called 3248 times
(589e3 lfs_bds_calls (~2.3MB of data)

=> x 43 improvement...
2022-04-10 13:12:45 -05:00
3b62ec1c47 Updated error handling for NOSPC 2022-04-10 13:00:13 -05:00
b898977fd8 Set the limit, the cursor cannot be set to a negative number 2022-04-10 12:57:42 -05:00
cf274e6ec6 Squash of CR changes
- nit: Moving brace to end of if statement line for consistency
- mount: add more debug info per CR
- Fix compiler error from extra parentheses
- Fix superblock typo
2022-04-10 12:53:33 -05:00
425dc810a5 Modified robekras's optimization to avoid flush for all seeks in cache
The basic idea is simple, if we seek to a position in the currently
loaded cache, don't flush the cache. Notably this ensures that seek is
always as fast or faster than just reading the data.

This is a bit tricky since we need to check that our new block and
offset match the cache, fortunately we can skip the block check by
reevaluating the block index for both the current and new positions.

Note this only works whene reading, for writing we need to always flush
the cache, or else we will lose the pending write data.
2022-04-10 12:46:51 -05:00
a6f01b7d6e Update lfs.c
This should fix the performance issue if a new seek position belongs to currently cached data.
This avoids unnecessary rereads of file data.
2022-04-09 02:12:18 -05:00
9c7e232086 Fixed missing definition of lfs_cache_drop in readonly mode
Interestingly this was introduced by two different PRs which were not tested
together until pre-release testing:

- Fix lfs_file_seek doesn't update cache properties correctly
- Fix compiler warnings when LFS_READONLY defined
2022-03-21 20:29:04 -05:00
c676bcee4c Merge branch 'bf_lfs_file_seek_readonly' into HEAD 2022-03-20 23:16:15 -05:00
03f088b92c Tweaked lfs_file_flush to still flush caches when build under LFS_READONLY
A slight varation to the fix from ondrap
2022-03-20 23:14:34 -05:00
e955b9f65d Fix lfs_file_seek doesn't update cache properties correctly in readonly mode. Invalidate cache to fix it. 2022-03-20 23:10:11 -05:00
5801169348 Merge pull request #635 from mikee47/fix/spelling-errors
Fix spelling errors
2022-03-20 23:09:23 -05:00
2d6f4ead13 Merge pull request #620 from XinStellaris/master
fix bug:lfs_alloc will alloc one block repeatedly in multiple split
2022-03-20 23:09:04 -05:00
2db5dc80c2 Update copyright notice 2022-03-20 23:03:52 -05:00
1363c9f9d4 fix bug:lfs_alloc will alloc one block repeatedly in multiple split
BUG CASE:Assume there are 6 blocks in littlefs, block 0,1,2,3 already allocated. 0 has a tail pair of {2, 3}. Now we try to write more into 0.
When writing to block 0, we will split(FIRST SPLIT), thus allocate block 4 and 5. Up to now , everything is as expected.
Then we will try to commit in block 4, during which split(SECOND SPLIT) is triggered again(In our case, some files are large, some are small, one split may not be enough).  Still as expected now.
BUG happens when we try to alloc a new block pair for the second split:
As lookahead buffer reaches the end , a new lookahead buffer will be generated from flash content, and block 4, 5 are unused blocks in the new lookahead buffer because they are not programed yet. HOWEVER, block 4,5 should be occupied in the first split!!!!!  The result is block 4,5 are allocated again(This is where things are getting wrong).

commit ce2c01f results in this bug. In the commit, a lfs_alloc_ack is inserted in lfs_dir_split, which will cause split to reset lfs->free.ack to block count.
In summary, this problem exists after 2.1.3.

Solution: don't call lfs_alloc_ack in lfs_dir_split.
2022-03-20 20:53:48 -05:00
8109f28266 Removed recursion from lfs_dir_traverse
lfs_dir_traverse is a bit unpleasant in that it is inherently a
recursive function, but without a strict bound of 4 calls (commit -> filter ->
move -> filter), and efforts to unroll the recursion comes at a
signification code cost.

It turns out the best solution I've found so far is to simple create an
explicit stack with an explicit bound of 4 calls (or more accurately,
3 pushed frames).

---

This actually highlights one of the bigger flaws in littlefs right now,
which is that this function, lfs_dir_traverse, takes O(n^2) disk reads
to traverse.

Note that LFS_FROM_MOVE can only occur once per commit, which is why
this code is O(n^2) and not O(n^4).
2022-03-20 04:27:54 -05:00
fedf646c79 Removed recursion in file read/writes
This mostly just required separate functions for "lfs_file_rawwrite" and
"lfs_file_flushedwrite", since lfs_file_flush recursively invokes
lfs_file_rawread and lfs_file_rawwrite.

This comes at a code cost, but gives us bounded and measurable RAM usage
on this code path.
2022-03-20 04:25:24 -05:00
84da4c0b1a Removed recursion from commit/relocate code path
lfs_dir_commit originally relied heavily on tail-recursion, though at
least one path (through relocations) was not tail-recursive, and could
cause unbounded stack usage in extreme cases of bad blocks. (Keep in
mind even extreme cases of bad blocks should be in scope for littlefs).

In order to remove recursion from this code path, several changed were
raequired:

- The lfs_dir_compact logic had to be somewhat inverted. Instead of
  first compacting and then resolving issues such as relocations and
  orphans, the overarching lfs_dir_commit now contains a state-machine
  which after committing or compacting handles the extra changes to the
  filesystem in a single, non-recursive loop

- Instead of fixing all relocations recursively, >1 relocation requires
  defering to a full deorphan step. This step is unfortunately an
  additional n^2 process. It also required some changes to lfs_deorphan
  in order to ignore intentional orphans created as an intermediary in
  lfs_mkdir. Maybe in the future we should remove these.

- Tail recursion normally found in lfs_fs_deorphan had to be rewritten
  as a loop which restarts any time a new commit causes a relocation.
  This does show that the algorithm may not terminate, but only if every
  block is bad, which will eventually cause littlefs to run out of
  blocks to write to.
2022-03-20 04:24:44 -05:00
yog
e334983767 don't use lfs_file_open() when LFS_NO_MALLOC is set 2022-02-18 20:57:20 -06:00
4977fa0c0e Fix spelling errors 2022-01-29 09:52:00 +00:00
487df12dde Fail when block_size doesn't match config
With the previous commit, fail if the superblock block_size doesn't
match the config block_size.
2021-08-17 10:02:27 -07:00
3efb8e44f3 Fail mount when the block size changes
When the on-disk block size doesn't match the config block size, it is
possible to get file corruption. For instance, if the num blocks was
0x200 and we re-mount with 0x100 files could be corrupt.

If we re-mount with a larger number of blocks things should be safer,
but could be handled with a resize option or perhaps a mount flag to
ignore this parameter.
2021-07-21 08:56:21 -07:00
fb2c311bb4 Fix compiler warnings when LFS_READONLY defined 2021-06-14 12:12:38 +02:00
bca64d76cf Merge branch 'devel' into ci-revamp
Needed to bring in new "error-asserts" configuration
2021-01-18 12:23:25 -06:00
cab1d6cca6 Merge pull request #514 from mon/feature/assert_early_return
lfs_fs_preporphans: return int to alllow graceful LFS_ASSERT
2021-01-18 11:53:47 -06:00
e7e4b352bd lfs_fs_preporphans ret int for graceful LFS_ASSERT 2021-01-18 11:50:33 -06:00
9449ef4be4 Merge pull request #511 from embeddedt/fix_lseek
Skip flushing file if lfs_file_rawseek() doesn't change position
2021-01-18 11:47:56 -06:00
cfe779fc08 Merge pull request #508 from littlefs-project/fix-sanity-check
Moved sanity check in lfs_format after compaction
2021-01-18 11:47:23 -06:00
0db6466984 Merge pull request #502 from mon/feature/meta_limits
Add metadata_max config to help performance on devices with large blocks
2021-01-18 11:45:34 -06:00
10a08833c6 Moved lfs_mdir_isopen behind LFS_NO_ASSERT
lfs_mdir_isopen goes unused if asserts are disabled, and this caused an
"unused function" warning on Clang (curiously not on GCC since the
function was static inline, commonly used for header-only functions).

Also removed "inline" from the lfs_mdir_* functions as these involve
linked-list operations and really shouldn't be inlined. And since they
are static, inlining should occur automatically if there is a benefit.

Found by dpgeorge
2021-01-18 11:41:18 -06:00