Add filter file support when creating Resource Groups. #18
CCPCookies wants to merge 6 commits into carbonengine:main
Changes
-------
* Filter file rules loading through legacy INI file format support added.
* Filter logic matched from resfileserver and eve-resparser.
* Documentation of filter file format added.
* Documentation covering filter logic added.
* Test coverage added for all known filter include scenarios.
* Added new global filter rule which is useful for excluding .red files.
* Filter logic for 'resfile' field not covered as it is covered by respaths logic.
* New publicly exposed library function added to create Resource Groups with filtering.
* Logic added to library function to allow skipping empty search directories.
* Logic added to library to allow ascertaining compression size from remote filesystem.
* Test coverage for library function added.
* CLI extended to add Resource Group creation with filters.
* Test coverage for CLI operation added.

Version
-------
Rather than changing 'CreateFromDirectory' in the library and 'create-group' in the CLI, a new function and operation were added to create Resource Groups using filters. This keeps the API stable, as other parties are now working with the original commands. The minor version was bumped.

Extra
-----
Filter ini file parsing logic is from PR16
|
|
||
| #include "CreateResourceGroupFromFilterCliOperation.h" | ||
|
|
||
| #include <string> |
There was a problem hiding this comment.
Minor, order of includes as per coding guidelines
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#order-of-includes
tools/include/FilterFileReader.h
Outdated
|
|
||
| struct FilterFile | ||
| { | ||
| std::unordered_map<std::string, std::shared_ptr<Prefix>> prefixes; |
There was a problem hiding this comment.
NOTE when you iterate over this std::unordered_map, you will not get the items returned in the insertion order.
I "think" you may want it to return items in insertion order as part of the ResourceFilter::SetFromFilterFileData() function, that is calling m_prefixPaths.push_back(), which is NOT guaranteed as currently implemented.
There was a problem hiding this comment.
Yes you are right, nice catch
There was a problem hiding this comment.
I will also add a test to enforce the order importance
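A minimal sketch of one way to keep insertion order while still allowing lookup by name. `OrderedPrefixes` and its members are hypothetical names for illustration and are not part of the PR; prefix values are simplified to plain strings rather than `std::shared_ptr<Prefix>`.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the prefix storage in FilterFile.
// The vector preserves insertion order; the map is only a lookup index.
class OrderedPrefixes
{
public:
    // Inserts a new prefix, or overwrites the value of an existing one
    // without changing its original position.
    void Set( const std::string& name, const std::string& path )
    {
        auto it = m_index.find( name );
        if( it != m_index.end() )
        {
            m_items[it->second].second = path;
            return;
        }
        m_index.emplace( name, m_items.size() );
        m_items.emplace_back( name, path );
    }

    // Iterating this vector yields prefixes in insertion order.
    const std::vector<std::pair<std::string, std::string>>& Items() const
    {
        return m_items;
    }

private:
    std::vector<std::pair<std::string, std::string>> m_items;
    std::unordered_map<std::string, std::size_t> m_index;
};
```

A test that enforces the ordering could insert prefixes in a non-alphabetical order and assert that `Items()` returns them in that same order.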
include/ResourceGroup.h
Outdated
| /// @note No file filtering supported | ||
| Result CreateFromDirectory( const CreateResourceGroupFromDirectoryParams& params ); | ||
|
|
||
| /// @brief Creates a ResourceGroup from a supplied filter files. |
There was a problem hiding this comment.
Typo.
... from supplied filter files. (remove the a)
or
... from a supplied filter file. (change to singular, remove s)
tools/src/FilterFileReader.cpp
Outdated
|
|
||
| ParseIncludeExcludeRules( globalFiltersStr, fileData.includeRules, fileData.excludeRules ); | ||
|
|
||
| // Get section infomration |
There was a problem hiding this comment.
Typo:
// Get section information
| ParseIncludeExcludeRules( globalFiltersStr, fileData.includeRules, fileData.excludeRules ); | ||
|
|
||
| // Get section infomration | ||
| for( const auto& sectionName : reader.Sections() ) |
There was a problem hiding this comment.
NOTE:
The INIReader will return Sections in alphabetical order, not the order they appear in the .ini file.
If you want to keep the sections in the order as defined in the file, you will have to read the file manually and find all the sections and then iterate over that list (instead of reader.Sections())
I.e. change it to do:
std::vector<std::string> sectionsInOrder = ManuallyReadIniFileSectionsInOrderExcludingDefault();
for( const auto& sectionName : sectionsInOrder)
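For illustration only, a rough sketch of what such a manual pass could look like. `ReadIniSectionsInOrder` is a hypothetical helper, not an INIReader function. It assumes section headers start in column 0 (indented lines are treated as continuation values), keeps the original casing, and does not de-duplicate repeated section names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the [section] names of an INI document in the
// order they appear in the text, skipping the [DEFAULT] section.
std::vector<std::string> ReadIniSectionsInOrder( const std::string& iniText )
{
    std::vector<std::string> sections;
    std::istringstream input( iniText );
    std::string line;
    while( std::getline( input, line ) )
    {
        // Only lines that begin with '[' in the first column are headers.
        if( line.empty() || line[0] != '[' )
        {
            continue;
        }
        const std::size_t close = line.find( ']' );
        if( close == std::string::npos )
        {
            continue;
        }
        const std::string name = line.substr( 1, close - 1 );
        if( name != "DEFAULT" )
        {
            sections.push_back( name );
        }
    }
    return sections;
}
```

The values themselves would still be read through the INI library; this pass only recovers the ordering that the library discards.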
There was a problem hiding this comment.
Yes, I remember you saying this. I don't see a scenario where order of sections matter.
I would also not want to be doing any ini parsing manually, this needs to be supplied by a library. The lib you found so far appears to be sort of ok. Another annoying thing it does is removes all the casing from the section names, not huge but I'd prefer it didn't.
There was a problem hiding this comment.
I agree. It's annoying that it lowercases everything.
It also cuts each sectionName to the first either 48 or 50 characters (can't remember which)
But this ini reader was the best fit, because it:
- had vcpkg support
- emulated the original python ini file implementation.
There was a problem hiding this comment.
As for if order of sections matter.
What about if the same respath and prefix are present in two different [namedSections], but one of them has an [someFilter] (include) where the other has the same ![someFilter] (exclude filter)?
What happens there?
There was a problem hiding this comment.
This will match as it would with previous system
|
|
||
| 1. Globally in the ``[DEFAULT]`` section using ``filter =`` . | ||
| 2. Section locally in sections using ``filter =`` . | ||
| 3. Semi locally to respath adding include/exclude rules to each path ``respaths = prefix1:/* [ include ] ![ exclude ]`` |
There was a problem hiding this comment.
This is correct,
but it renders strange when viewed in a browser (splits the line).
It would be better if the "respaths = prefix1:/* [ include ] ![ exclude ]" would be put on the line below.
|
|
||
| 2. Section local filters are combined with any filters specified in global filters. | ||
|
|
||
| 3. respaths filters combine with both global and section filters and importantly these add for all subsequent paths. This is explained more in the following examples. |
There was a problem hiding this comment.
Isn't this supposed to be:
"...and importantly these add for all subsequent paths WITHIN THE SELECTED SECTION."
I.e. filter defined in [SectionA] will not also be applied to all entries in [SectionB] onwards.
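A small sketch of the accumulation behaviour under that reading, where respath-level include rules carry forward to later respaths but only inside one section. `RespathEntry` and `EffectiveIncludes` are hypothetical names, and exclude rules are left out to keep it short.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one respaths entry and the include rules written on it.
struct RespathEntry
{
    std::string path;
    std::vector<std::string> includes;
};

// Returns, for each respath of ONE section, the include rules in effect for it:
// global rules + section rules + every respath-level rule seen so far.
// The accumulated state is local to this call, so it never leaks into another section.
std::vector<std::set<std::string>> EffectiveIncludes(
    const std::set<std::string>& globalIncludes,
    const std::set<std::string>& sectionIncludes,
    const std::vector<RespathEntry>& respaths )
{
    std::set<std::string> active = globalIncludes;
    active.insert( sectionIncludes.begin(), sectionIncludes.end() );

    std::vector<std::set<std::string>> result;
    for( const auto& entry : respaths )
    {
        active.insert( entry.includes.begin(), entry.includes.end() );
        result.push_back( active );
    }
    return result;
}
```

Calling the function once per section models the "within the selected section" wording: a rule added by a respath in one section is not visible when the next section is processed.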
| Two paths will be tested for inclusion: | ||
|
|
||
| ``#3`` will use ``respaths = prefix1:/*`` and combine global and section local patterns ``include1`` and ``include2``. This will match the following from the source files: | ||
| 1. ``include1.txt`` |
There was a problem hiding this comment.
Add an empty line before the "1. include1.txt", for it to render correctly in a browser.
| 2. ``include2.txt`` | ||
|
|
||
| ``#4`` will use ``respaths = prefix1:/* [ include3 ]`` which will extend the section local patterns to include ``include3``. This will match the following source files: | ||
| 1. ``include1.txt`` |
There was a problem hiding this comment.
Add empty line, so it renders correctly in a browser.
| 3. ``include3.txt`` | ||
|
|
||
| ``#5`` will use ``respaths = prefix2:/*`` and doesn't sepecify any include rules. It will apply the include rules that have been constructed for the section at this point ``include1``, ``include2`` and ``include3``. This may be suprising. So this will match the following source files: | ||
| 1. ``Path/include3.txt`` |
There was a problem hiding this comment.
Add an empty line so it renders correctly in a browser.
|
|
||
| [exampleSection] | ||
| filter = [ include2 ] # 2. Section local include | ||
| respaths = prefix1:/* # 3. respath1 |
There was a problem hiding this comment.
NOTE this is not how multiple (multi-line) respaths are defined in existing res.ini files.
The correct example would look like:
[resCharacterMisc]
respaths = res:/Graphics/Character/Global/...
res:/Graphics/Character/Female/Skeleton/...
res:/Graphics/Character/Female/*
res:/Graphics/Character/Male/Skeleton/...
res:/Graphics/Character/Male/*
res:/Graphics/Character/Unique/...
Where the multi-line entries are within the SAME "respaths" attribute.
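As a sketch of how a multi-line value like this could be broken into individual entries once the INI library has returned it as one string: `SplitRespaths` is a hypothetical helper, and it assumes continuation lines are joined with newline characters, which may differ between INI libraries.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a multi-line "respaths" value into one entry per
// line, trimming surrounding whitespace and dropping blank lines.
std::vector<std::string> SplitRespaths( const std::string& value )
{
    std::vector<std::string> entries;
    std::istringstream input( value );
    std::string line;
    while( std::getline( input, line ) )
    {
        const std::size_t first = line.find_first_not_of( " \t\r" );
        if( first == std::string::npos )
        {
            continue; // blank line
        }
        const std::size_t last = line.find_last_not_of( " \t\r" );
        entries.push_back( line.substr( first, last - first + 1 ) );
    }
    return entries;
}
```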
There was a problem hiding this comment.
Ah yeah, I'll change the documentation. The actual tests don't do this, this is just documentation.
| #include <unordered_set> | ||
|
|
||
| CreateResourceGroupFromFilterCliOperation::CreateResourceGroupFromFilterCliOperation() : | ||
| CliOperation( "create-group-from-filter", "Create a Resource Group from a filter files." ), |
There was a problem hiding this comment.
mixing singular and plural
| #include <Md5ChecksumStream.h> | ||
| #include <GzipCompressionStream.h> | ||
| #include <cctype> | ||
| #include "ResourceInfo/PatchResourceGroupInfo.h" |
There was a problem hiding this comment.
Look at include grouping and ordering from:
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#order-of-includes
src/ResourceGroupImpl.cpp
Outdated
| return Result{ ResultType::FAILED_TO_INITIALIZE_RESOURCE_FILTER, errorMsg }; | ||
| } | ||
|
|
||
| statusSettings.Update( StatusProgressType::PERCENTAGE, 0, 5, "Loading filter files" ); |
There was a problem hiding this comment.
This status line has 0, 5 like the line for "Create resource group from filters".
Should the numbers be updated (is this a copy-paste error)?
|
|
||
| if( inputDirectoryStatus.RequiresStatusUpdates() ) | ||
| { | ||
| float step = static_cast<float>( 100.0 / searchPaths.size() ); |
There was a problem hiding this comment.
The step variable is going to be the same in every iteration of the loop.
Can be calculated outside of the for loop.
There was a problem hiding this comment.
Nah, this way that computation is skipped if not verbose, so we don't calculate something we don't need when not caring about the output.
| } | ||
| else | ||
| { | ||
| return Result{ ResultType::INPUT_DIRECTORY_DOESNT_EXIST, inputDirectory.string() }; |
There was a problem hiding this comment.
I'm confused.
Why would you ever not want to skip non existent input directories?
There was a problem hiding this comment.
Is it only for testing/debug purposes?
There was a problem hiding this comment.
No this is due to real world usecase of reduced-resources.
It only syncs files that changed, so in theory it might not (and usually does not) sync a single file in a given search directory. If nothing is synced then there is no directory. But this is not a fail case.
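A minimal sketch of that behaviour, assuming a boolean option decides whether a missing search directory is skipped or treated as an error. `CollectSearchDirectories` and `skipMissing` are hypothetical names, and `std::nullopt` stands in for the library's `INPUT_DIRECTORY_DOESNT_EXIST` result.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>
#include <vector>

// Hypothetical helper: returns the search directories that exist on disk.
// When skipMissing is false, the first missing directory aborts the scan and
// std::nullopt is returned instead.
std::optional<std::vector<std::filesystem::path>> CollectSearchDirectories(
    const std::vector<std::filesystem::path>& candidates,
    bool skipMissing )
{
    std::vector<std::filesystem::path> existing;
    for( const auto& directory : candidates )
    {
        std::error_code error; // non-throwing overload
        if( std::filesystem::is_directory( directory, error ) )
        {
            existing.push_back( directory );
        }
        else if( !skipMissing )
        {
            return std::nullopt;
        }
    }
    return existing;
}
```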
src/ResourceGroupImpl.cpp
Outdated
|
|
||
| ss << "Processing file: " | ||
| << filePathRelativeToInputDirectory.string() | ||
| << ", Match filter: " |
There was a problem hiding this comment.
"Match filter:" vs matchSection
I guess this is supposed to be "Match Section:" or "Match Section Id:"
Unless you also return the current include/exclude filter from the CheckPath() function. That might be useful for debugging.
There was a problem hiding this comment.
I was actually thinking of returning the current line number for the path rule so that you can see exactly which rule matched. But no further information was required, so I didn't bother, which also saves some computation time.
There was a problem hiding this comment.
But the wording is still wrong though, right?
This is supposed to be "section", not "filter", right?
There was a problem hiding this comment.
Sorry, yeah, I've changed it to "matched filter section".
src/ResourceGroupImpl.cpp
Outdated
| resourceParams.binaryOperation = ResourceTools::CalculateBinaryOperation( entry.path() ); | ||
| } | ||
|
|
||
| Location l; |
There was a problem hiding this comment.
Can you change the name of the variable to be more descriptive?
|
|
||
| #include <filesystem> | ||
| #include <vector> | ||
| #include <unordered_map> |
There was a problem hiding this comment.
Imports in alphabetical order
|
|
||
| struct FilterPath | ||
| { | ||
| std::string sectionId; |
There was a problem hiding this comment.
In other structs / classes you've put an empty line between items.
Missing in this struct.
tools/src/FilterFileReader.cpp
Outdated
| void FilterFileReader::ParseIncludeExcludeRules( const std::string& rulesStr, std::set<std::string>& includeRules, std::set<std::string>& excludeRules ) | ||
| { | ||
|
|
||
| std::string s = rulesStr; |
There was a problem hiding this comment.
The "rulesStr" variable could just be passed into this function by value and then you could use it directly, instead of doing the extra:
std::string s = rulesStr;
There was a problem hiding this comment.
Honestly, I have not looked at this code too closely; it's pretty much just hooking up the code from your PR. I'll do a pass on it before I take this PR out of draft.
There was a problem hiding this comment.
Looked at this. It appears that s is never changed, so I removed the copy and used the reference.
tools/src/ResourceFilter.cpp
Outdated
|
|
||
| #include "ResourceFilter.h" | ||
|
|
||
| #include <regex> |
There was a problem hiding this comment.
Alphabetical order of includes
| namespace ResourceTools | ||
| { | ||
|
|
||
| ResourceFilter::ResourceFilter() |
There was a problem hiding this comment.
The constructor and destructor could be replaced with this definition in the header file:
ResourceFilter() = default;
~ResourceFilter() = default;
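For reference, a sketch of what the defaulted form looks like in a header. `ResourceFilterSketch` and its members are hypothetical and only loosely modelled on the class under review.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: the compiler generates the constructor and destructor,
// so no empty definitions are needed in the .cpp file.
class ResourceFilterSketch
{
public:
    ResourceFilterSketch() = default;
    ~ResourceFilterSketch() = default;

    void AddPath( std::string path )
    {
        m_paths.push_back( std::move( path ) );
    }

    std::size_t PathCount() const
    {
        return m_paths.size();
    }

private:
    std::vector<std::string> m_paths; // default-initialised to empty
};
```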
| namespace ResourceTools | ||
| { | ||
|
|
||
| FilterFileReader::FilterFileReader() |
There was a problem hiding this comment.
Can be replaced with this in the header file:
FilterFileReader() = default;
~FilterFileReader() = default;
| m_paths.clear(); | ||
|
|
||
| // Populate prefix paths | ||
| for( auto& prefix : fileData.prefixes ) |
There was a problem hiding this comment.
Already mentioned:
fileData.prefixes is not ordered in the insertion order.
| std::unique_ptr<FilterPath> filterPath = std::make_unique<FilterPath>(); | ||
|
|
||
| // Normalise path and convert to pattern | ||
| std::string prefixPathStr = prefixPath.string(); |
There was a problem hiding this comment.
Have you considered using prefixPath.lexically_normal().generic_string()?
There was a problem hiding this comment.
It "should" take care of all the . \ and / checks you're manually doing in the lines below.
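A short sketch of that suggestion. `NormalisePath` is a hypothetical wrapper name. Note that backslashes are only treated as separators on Windows, so on other platforms any `\` handling would still need to be done by hand.

```cpp
#include <filesystem>
#include <string>

// Hypothetical wrapper: lexically_normal() collapses "." and "dir/.." elements
// and repeated separators without touching the filesystem; generic_string()
// returns the result using '/' as the directory separator.
std::string NormalisePath( const std::filesystem::path& path )
{
    return path.lexically_normal().generic_string();
}
```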
tools/src/ResourceFilter.cpp
Outdated
| return true; | ||
| } | ||
|
|
||
| void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const |
There was a problem hiding this comment.
tools/src/ResourceFilter.cpp
Outdated
|
|
||
| void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const | ||
| { | ||
| std::string resPathString = resPath; |
There was a problem hiding this comment.
If resPath is passed by value, this is not needed.
tools/src/ResourceFilter.cpp
Outdated
| return true; | ||
| } | ||
|
|
||
| void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const |
There was a problem hiding this comment.
Why not just return "pattern", instead of it being a reference variable?
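A sketch of the return-by-value shape, combined with taking the input by value so no extra copy is declared inside the function. `ConvertResPathToPatternSketch` is hypothetical, and the body (lowercasing and slash conversion) is a simplified placeholder rather than the real pattern conversion.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical signature: the caller's string is copied (or moved) into
// resPath, modified in place, and returned. Returning a by-value parameter
// moves it, so there is no additional copy.
std::string ConvertResPathToPatternSketch( std::string resPath )
{
    // Placeholder conversion: unify separators and lowercase the path.
    std::replace( resPath.begin(), resPath.end(), '\\', '/' );
    std::transform( resPath.begin(), resPath.end(), resPath.begin(),
                    []( unsigned char c ) { return static_cast<char>( std::tolower( c ) ); } );
    return resPath;
}
```

The call site then reads `std::string pattern = ConvertResPathToPatternSketch( resPath );` instead of declaring the output variable first.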
| return CheckPath( path, sectionId, matchPath ); | ||
| } | ||
|
|
||
| bool ResourceFilter::CheckPath( const std::filesystem::path& path, std::string& matchSectionId, std::string& matchPath ) const |
There was a problem hiding this comment.
tools/src/ResourceFilter.cpp
Outdated
| { | ||
| for( auto& filterPath : m_paths ) | ||
| { | ||
| std::string resolvedPathStr = path.string(); |
There was a problem hiding this comment.
Possible simplification.
Have you considered using prefixPath.lexically_normal().generic_string()?
It should sort out the extra checks being done below.
There was a problem hiding this comment.
Good point but don't need the lexically_normal part, thanks
tools/src/ResourceFilter.cpp
Outdated
| } | ||
| } | ||
|
|
||
| // Excludes |
There was a problem hiding this comment.
Shouldn't exclude rules be checked before include rules?
There was a problem hiding this comment.
Did this, also profiler shows that it makes sense to do this above regex check too. Also moved things around so that after the first include exclude check that fails for a section, all subsequent section path checks are skipped as they will fail too at that point. Considerable speed boost.
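A minimal sketch of the exclude-first ordering. `PassesRules` is hypothetical, rules are matched as plain substrings rather than with the PR's pattern logic, and treating an empty include list as "include everything" is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical check: exclude rules are tested first so a rejected path
// returns early, before any include rule (or more expensive match) is evaluated.
bool PassesRules( const std::string& path,
                  const std::vector<std::string>& includeRules,
                  const std::vector<std::string>& excludeRules )
{
    for( const auto& rule : excludeRules )
    {
        if( path.find( rule ) != std::string::npos )
        {
            return false; // excluded, no need to look at includes
        }
    }
    if( includeRules.empty() )
    {
        return true; // assumption: no include rules means everything is included
    }
    for( const auto& rule : includeRules )
    {
        if( path.find( rule ) != std::string::npos )
        {
            return true;
        }
    }
    return false;
}
```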
tests/src/ResourcesCliTest.cpp
Outdated
|
|
||
| arguments.push_back( "-1" ); | ||
|
|
||
| std::filesystem::path basePath = GetTestFileFileAbsolutePath( "CreateResourceFiles/ResourceFiles" ); |
There was a problem hiding this comment.
I just noticed this now :)
But the function name is stuttering, no need to change unless you want to.
There was a problem hiding this comment.
What! how have I never seen that?! Thanks
| std::filesystem::path invalidPath = "File.type1"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); | ||
|
|
There was a problem hiding this comment.
Should you also include a check for a validPath in this test?
There was a problem hiding this comment.
OK. I'll do all of these however there are no valid paths here, but I can add another test to strengthen anyway
|
|
||
| std::filesystem::path invalidPath = "File.type1"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); |
There was a problem hiding this comment.
Should you also include a check for a validPath in this test?
|
|
||
| std::filesystem::path invalidPath = "File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); |
There was a problem hiding this comment.
Should you also include a check for a validPath in this test (e.g. for "SomethingElse")?
|
|
||
| std::filesystem::path validPath = "File"; | ||
|
|
||
| ASSERT_TRUE( resourceFilter.CheckPath( validPath ) ); |
There was a problem hiding this comment.
Should you also include a check for invalidFile in this test?
|
|
||
| std::filesystem::path invalidPath = "File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); |
There was a problem hiding this comment.
Should you also include a check for a valid file "SomeThingValid" in this test?
|
|
||
| std::filesystem::path invalidPath = "File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); |
There was a problem hiding this comment.
Also check if something is valid
|
|
||
| std::filesystem::path prefix2InvalidPath = "Path2/File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( prefix2InvalidPath ) ); |
There was a problem hiding this comment.
You might want to check for "SomethingValid" on both prefixes as well.
|
|
||
| std::filesystem::path prefix2ValidPath = "Path2/File"; | ||
|
|
||
| ASSERT_TRUE( resourceFilter.CheckPath( prefix2ValidPath ) ); |
There was a problem hiding this comment.
What about invalid checks in this test?
|
|
||
| std::filesystem::path prefix2InvalidPath = "Path2/File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( prefix2InvalidPath ) ); |
There was a problem hiding this comment.
Add checks for "SomethingValid" on both prefixes as well.
|
|
||
| std::filesystem::path invalidPath = "File"; | ||
|
|
||
| ASSERT_FALSE( resourceFilter.CheckPath( invalidPath ) ); |
There was a problem hiding this comment.
What happens if you check for "SomethingElse", will it match or not?
This change has been tested against all real branch data with no obvious differences from the previous system it is to replace.
* Get index filter mapping from yaml file passed to CLI
* Reworked filter library processing to process many groups at once
* Filter logic now forced to lowercase to match previous tool
* Addressed previous draft PR feedback