Skip to content

[VFS] Optimize FTP directory existence to use CWD instead of parent LIST#759

Draft
ilang wants to merge 1 commit intoapache:masterfrom
new-proimage:VFS-cwd-optimization
Draft

[VFS] Optimize FTP directory existence to use CWD instead of parent LIST#759
ilang wants to merge 1 commit intoapache:masterfrom
new-proimage:VFS-cwd-optimization

Conversation

@ilang
Copy link
Copy Markdown
Contributor

@ilang ilang commented Mar 31, 2026

Summary

  • setFTPFile() now tries CWD before falling back to parent LIST for directory existence checks
  • CWD is a lightweight control-channel command (no data transfer) vs LIST which opens a data channel and transfers the entire directory listing
  • When the path is a directory, the expensive parent LIST is skipped entirely
  • When the path is a file or non-existent, falls back to the original parent LIST behavior
  • Timestamps on CWD-resolved directories are lazily fetched via parent LIST when first requested
  • Adds FtpClient.isDirectory() with PWD save/restore to avoid changing the working directory

Symlink note

CWD follows symlinks transparently, so a symlink-to-directory is classified as a plain directory via the CWD fast path. This is acceptable because FTP symlink behavior in VFS is provider-internal — doGetType() resolves symlinks to their target type, and isSymbolicLink() always returns false for FTP (the FTP provider does not override doIsSymbolicLink()). The embedded MINA FTP server also cannot report symlinks in LIST output (it uses the old java.io.File API which follows symlinks transparently).

Depends on

This PR depends on #757 (root exists on disconnect fix). Please merge #757 first.

Test plan

  • New FtpCwdOptimizationTest with 5 tests:
    • Nested directory detected via CWD optimization
    • File detected via LIST fallback
    • Non-existent path returns IMAGINARY
    • Working directory preserved after mixed CWD/LIST operations
    • Directory timestamp lazily fetched via parent LIST (no NPE)
  • All existing FTP tests pass

@ilang ilang marked this pull request as draft March 31, 2026 22:35
@ilang ilang force-pushed the VFS-cwd-optimization branch from 0ec1a6e to c0cec45 Compare April 1, 2026 08:26
@ilang ilang marked this pull request as ready for review April 1, 2026 09:54
Copy link
Copy Markdown
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @ilang

Please rebase on git master and review my comments.

Isn't it possible that LIST would work when CDW could fail due to access restrictions?

TY!

* @param relPath The pathname of the directory to change to.
* @return true if successfully completed, false if not.
* @throws IOException If an I/O error occurs.
* @since 2.10.1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next feature version will be 2.11.0 as you can see in the POM. How did you get to 2.10.1? Is this AI generated?

Copy link
Copy Markdown
Contributor Author

@ilang ilang Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the wrong version, I do work with AI, but I closely review it, and make sure there are tests in place. I guess I missed the version here.

* @param relPath The pathname to test.
* @return true if the path is an existing directory, false otherwise.
* @throws IOException If an I/O error occurs.
* @since 2.10.1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next feature version will be 2.11.0 as you can see in the POM. How did you get to 2.10.1? Is this AI generated?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the wrong version, I do work with AI, but I closely review it, and make sure there are tests in place. I guess I missed the version here.

manager.init();
final FileObject dir = manager.resolveFile(
FtpProviderTest.getConnectionUri() + "/cwdOptDir/nested", createOptions());
Assertions.assertTrue(dir.exists(), "Nested directory should exist");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use static imports but only for JUnit.

@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 5, 2026

Hello @ilang

Please rebase on git master and review my comments.

Isn't it possible that LIST would work when CDW could fail due to access restrictions?

TY!

Thanks, @garydgregory

Good point, had too look it up myself. and yes, CWD could fail due to access restrictions while LIST on the parent would still work.
However, this is already handled correctly in the implementation.
If CWD fails for any reason (access denied, not a directory, or non-existent), we fall back to the original parent LIST behavior:

in FTPFileObject :

private void setFTPFile(final boolean flush) throws IOException {
...
final FTPFile cwdResult = verifyDirectory();
if (cwdResult != null) {
newFileInfo = cwdResult;
} else {
newFileInfo = parent.getChildFile(UriParser.decode(getName().getBaseName()), flush);
}
...

The only time we don't fall back is when this is a root directory, however the root case is also covered: the factory already CWDs to the root directory during connection setup.
if CWD fails, the connection itself fails with FileSystemException. So if a user is connected, CWD "." on the root is guaranteed to work since they're already in that directory.

@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 5, 2026

Hi @garydgregory, I am going to rebase and commit, I also added a very small optimization, to save one cwd command when it is the "." folder.

For directory existence checks, CWD is a lightweight control-channel
command with no data transfer, while LIST opens a data channel and
transfers the entire directory contents. When CWD succeeds (path is a
directory), the expensive parent LIST is skipped entirely. When CWD
fails (path is a file, symlink, non-existent, or access-restricted),
falls back to the original parent LIST behavior.

Add FtpClient.isDirectory() with PWD save/restore in FTPClientWrapper
to avoid changing the working directory when probing non-root paths.
Skip save/restore for CWD "." (root check) since it does not change
the working directory.

Rename verifyRootDirectory() to verifyDirectory() as it now serves
both root and non-root paths.

Fix NPE in getTimestampMillis() for CWD-resolved directories: the
synthetic FTPFile has no timestamp, so lazily fetch the real FTPFile
via parent LIST when timestamp is first requested.

Review fixes:
- Fix @SInCE 2.10.1 to @SInCE 2.11.0 to match POM version.
- Use static imports for JUnit assertions.
@ilang ilang force-pushed the VFS-cwd-optimization branch from c0cec45 to 2ed521d Compare April 5, 2026 19:10
@ilang ilang marked this pull request as draft April 5, 2026 21:12
@sebbASF
Copy link
Copy Markdown
Contributor

sebbASF commented Apr 6, 2026

I may be missing something here, but why not use the FTP command MLST [1] to return just the details for a single file/directory?

[1] https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/ftp/FTPClient.html#mlistFile(java.lang.String)

@ilang
Copy link
Copy Markdown
Contributor Author

ilang commented Apr 6, 2026

I may be missing something here, but why not use the FTP command MLST [1] to return just the details for a single file/directory?

[1] https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/ftp/FTPClient.html#mlistFile(java.lang.String)

Thanks @sebbASF I will check it, I moved this pull request to be a draft, to check more stuff, and while I am dealing with my pervious PR that needs fixing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants