When removing a file, we mark all open handles as "removed" (
pair={-1,-1}) to avoid trying to later read metadata that no longer
exists. Unfortunately, this also includes open dir handles that happen
to be pointing at the removed file, causing them to return
LFS_ERR_CORRUPT on the next read.
The good news is this is _not_ actual filesystem corruption, only a
logic error in lfs_dir_read.
We actually already have logic in place to nudge the dir to the next id,
but it was unreachable with the existing logic. I suspect this worked at
one point but was broken during a refactor due to lack of testing.
---
Fortunately, all we need to do is _not_ clobber the handle if the
internal type is a dir. Then the dir-nudging logic can correctly take
over.
I've also added test_dirs_remove_read to test this and prevent another
regression, adapted from tests provided by tpwrules that identified the
original bug.
Found by tpwrules
In v2.5, we introduced an optimization to avoid rereading data when
seeking inside the file cache. Unfortunately this used a slightly
wrong condition to check if the cache was "live", which meant seeks from
end-of-blocks could end up with invalid caches and wrong data. Not
great.
The problem is the nuance of when a file's cache is "live":
1. The file is marked as LFS_F_READING or LFS_F_WRITING.
But we can't reuse the cache when writing, so we only care about
LFS_F_READING.
2. file->off != lfs->cfg->block_size (end-of-block).
This is an optimization to avoid eagerly reading blocks we may not
actually care about.
We weren't checking for the end-of-block case, which meant if you seeked
_from_ the end of a block to a seemingly valid location in the file
cache, you could end up with an invalid cache.
Note that end-of-block may not be powers-of-two due to CTZ skip-list
pointers.
---
The fix is to check for the end-of-block case in lfs_file_seek. Note
this now matches the need-new-block logic in lfs_file_flushedread.
This logic change may also make lfs_file_seek call lfs_file_flush more
often, but only in cases where lfs_file_flush is a noop.
I've also extended the test_seek tests to cover a few more boundary-read
cases and prevent a regression in the future.
Found by wjl and lrodorigo
Looks like I missed a line during refactoring, resulted in only x86
sizes being reported in GitHub statuses.
If we wanted to limited these to one architecture, thumb would have
probably been a better pick.
- before: lfs_file_open("missing/") => LFS_ERR_ISDIR
- after: lfs_file_open("missing/") => LFS_ERR_NOTDIR
As noted by bmcdonnell-fb, returning LFS_ERR_ISDIR here was inconsistent
with the case where the file exists:
case before after
lfs_file_open("dir_a") => LFS_ERR_ISDIR LFS_ERR_ISDIR
lfs_file_open("dir_a/") => LFS_ERR_ISDIR LFS_ERR_ISDIR
lfs_file_open("reg_a/") => LFS_ERR_NOTDIR LFS_ERR_NOTDIR
lfs_file_open("missing_a/") => LFS_ERR_ISDIR LFS_ERR_NOTDIR
Note this is consistent with the behavior of lfs_stat:
lfs_file_open("reg_a/") => LFS_ERR_NOTDIR
lfs_stat("reg_a/") => LFS_ERR_NOTDIR
And the only other function that can "create" files, lfs_rename:
lfs_file_open("missing_a/") => LFS_ERR_NOTDIR
lfs_rename("reg_a", "missing_a/") => LFS_ERR_NOTDIR
There is some ongoing discussion about if these should return NOTDIR,
ISDIR, or INVAL, but this is at least an improvement over the
rename/open mismatch.
- test_paths_noent_trailing_slashes
- test_paths_noent_trailing_dots
- test_paths_noent_trailing_dotdots
These managed to slip through our path testing but should be tested, if
anything just to know exactly what errors these return.
Before this, the empty path ("") was treated as an alias for the root.
This was unintentional and just a side-effect of how the path parser
worked.
Now, the empty path should always result in LFS_ERR_INVAL:
- before: lfs_stat("") => 0
- after: lfs_stat("") => LFS_ERR_INVAL
Unlike normal files, dots (".") should not change the depth when
attempting to skip dotdot ("..") entries.
A weird nuance in the path parser, but at least it had a relatively easy
fix.
Added test_paths_dot_dotdots to prevent a regression.
This changes the behavior of paths that attempt to navigate above root
to now return LFS_ERR_INVAL:
- before: lfs_stat("/../a") => 0
- after: lfs_stat("/../a") => LFS_ERR_INVAL
This is a bit of an opinionated change while making other path
resolution tweaks.
In terms of POSIX-compatibility, it's a bit unclear exactly what dotdots
above the root should do.
POSIX notes:
> As a special case, in the root directory, dot-dot may refer to the
> root directory itself.
But the word choice of "may" implies it is up to the implementation.
I originally implement this as a root-loop simply because that is what
my Linux machine does, but I now think that's not the best option. Since
we're making other path-related tweaks, we might as well try to adopt
behavior that is, in my opinion, safer and less... weird...
This should also help make paths more consistent with future theoretical
openat-list APIs, where saturating at the current directory is sort of
the least expected behavior.
- lfs_mkdir now accepts trailing slashes:
- before: lfs_mkdir("a/") => LFS_ERR_NOENT
- after: lfs_mkdir("a/") => 0
- lfs_stat, lfs_getattr, etc, now reject trailing slashes if the file is
not a directory:
- before: lfs_stat("reg_a/") => 0
- after: lfs_stat("reg_a/") => LFS_ERR_NOTDIR
Note trailing slashes are accepted if the file is a directory:
- before: lfs_stat("dir_a/") => 0
- after: lfs_stat("dir_a/") => 0
- lfs_file_open now returns LFS_ERR_NOTDIR if the file exists but the
path contains trailing slashes:
- before: lfs_file_open("reg_a/") => LFS_ERR_NOENT
- after: lfs_file_open("reg_a/") => LFS_ERR_NOTDIR
To make these work, the internal lfs_dir_find API required some
interesting changes:
- lfs_dir_find no longer sets id=0x3ff on not finding a parent entry in
the path. Instead, lfs_path_islast can be used to determine if the
modified path references a parent entry or child entry based on the
remainder of the path string.
Note this is only necessary for functions that create new entries
(lfs_mkdir, lfs_rename, lfs_file_open).
- Trailing slashes mean we can no longer rely on the modified path being
NULL-terminated. lfs_path_namelen provides an alternative to strlen
that stops at slash or NULL.
- lfs_path_isdir also tells you if the modified path must reference a
dir (contains trailing slashes). I considered handling this entirely
in lfs_dir_find, but the behavior of entry-creating functions is too
nuanced.
At least lfs_dir_find returns LFS_ERR_NOTDIR if the file exists on
disk.
Like strlen, lfs_path_namelen/islast/isdir are all O(n) where n is the
name length. This isn't great, but if you're using filenames large
enough for this to actually matter... uh... open an issue on GitHub and
we might improve this in the future.
---
There are a couple POSIX incompatibilities that I think are not
worth fixing:
- Root modifications return EINVAL instead of EBUSY:
- littlefs: remove("/") => EINVAL
- POSIX: remove("/") => EBUSY
Reason: This would be the only use of EBUSY in the system.
- We accept modifications of directories with trailing dots:
- littlefs: remove("a/.") => 0
- POSIX: remove("a/.") => EBUSY
Reason: Not worth implementing.
- We do not check for existence of directories followed by dotdots:
- littlefs: stat("a/missing/..") => 0
- POSIX: stat("a/missing/..") => ENOENT
Reason: Difficult to implement non-recursively.
- We accept modifications of directories with trailing dotdots:
- littlefs: rename("a/b/..", "c") => 0
- POSIX: rename("a/b/..", "c") => EBUSY
Reason: Not worth implementing.
These are at least now documented in tests/test_paths.toml, which isn't
the greatest location, but it's at least something until a better
document is created.
Note that these don't really belong in SPEC.md because path parsing is
a function of the driver and has no impact on disk.
As expected these are failing and will need some work to pass.
The issue with lfs_file_open allowing trailing slashes was found by
rob-zeno, and the issue with lfs_mkdir disallowing trailing slashes was
found by XinStellaris, PoppaChubby, pavel-kirienko, inf265, Xywzel,
steverpalmer, and likely others.
This should be a superset of the previous test_paths test suite, while
covering a couple more things (more APIs, more path synonyms, utf8,
non-printable ascii, non-utf8, etc).
Not yet tested are some corner cases with known bugs, mainly around
trailing slashes.
These two small libraries provide examples of error-correction
compatible with littlefs (or any filesystem really).
It would be nice to eventually provide these as drop-in solutions, but
right now it's not really possible without breaking changes to
littlefs's block device API.
In the meantime, ramcrc32bd and ramrsbd at least provide example
implementations that can be adapted to users' own block devices.
The inconsistency here between the use of block_size vs metadata_max was
suspicious. Turns out there's a bug when metadata_max == prog_size.
We correctly use metadata_max for the block_size/2 check, but we weren't
using it for the block_size-40 check. The second check seems unnecessary
after the first, but it protects against running out of space in a
commit for commit-related metadata (checksums, tail pointers, etc) when
we can't program half-blocks.
Turns out this is also needed when limiting metadata_max to a single
prog, otherwise we risk erroring with LFS_ERR_NOSPC early.
Found by ajheck, dpkristensen, NLLK, and likely others.
- Added METADATA_MAX to test_runner.
- Added METADATA_MAX to bench_runner.
- Added a simple metadata_max test to test_superblocks, for lack of
better location.
There have been several issues floating around related to metadata_max
and LFS_ERR_NOSPC which makes me think there's a bug in our metadata_max
logic.
metadata_max was a quick patch and is relatively untested, so an
undetected bug isn't too surprising. This commit adds at least some
testing over metadata_max.
Sure enough, the new test_superblocks_metadata_max test reveals a
curious LFS_ERR_NAMETOOLONG error that shouldn't be there.
More investigation needed.
In the previous implementation of lfs_file_seek, we calculated the new
offset using signed arithmetic before checking for possible
overflow/underflow conditions. This results in undefined behavior in C.
Fortunately for us, littlefs is now limited to 31-bit file sizes for API
reasons, so we don't have to be too clever here. Doing the arithmetic
with unsigned integers and just checking if we're in a valid range
afterwards should work.
Found by m-kostrzewa and lucic71
With the existing method, (-DLFS_MALLOC=my_malloc)
users often had to use compiler options like -include, which
was not so portable.
This change introduces another way to provide partial overrides of
lfs_util.h using a user-provided header.
Looks like cross-workflow downloads has finally been added to the
standard download-artifact action, so we might as well switch to it to
reduce dependencies.
dawidd6's version was also missing the merge-multiple feature which is
necessary to work around breaking changes in download-artifact's v4
bump.
Weirdly it needs GITHUB_TOKEN for some reason? Not sure why this
couldn't be implicit.
Turns out major versions break things.
Old behavior: Artifacts with same name are merged
New behavior: Artifacts with same name error
Using a pattern and merging on download should fix this at least on the
job-side. Though I do wonder if we'll start running into artifact limit
issues with the new way artifacts are handled...
- block_cycles is signed and should use PRId32
- flags is signed (which is a bit weird) and should be cast for %x
Unfortunately exactly what PRI* expands to is dependant on both the
compiler and the underlying architecture, so I don't think it's possible
for us to catch these mistakes with CI...
Found by stefano-zanotti