Open
Tesselator.h: Branchless min/max in Bounds via Min/Max; const ref on addBounds
Tesselator.cpp: clamp for color clamping; Min/Max initializer lists in packCompactQuad min/max finding and du/dv clamping
Chunk.cpp: bool[256] lookup table for occluder tile IDs in occlusion culling; [[unlikely]] hints on rebuild inner loop early-outs
LevelRenderer.cpp: __restrict + local caching in frustum clip(); [[likely]]/[[unlikely]] on chunk render loop continues
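As a rough illustration of the Tesselator changes listed above, here is a minimal sketch. It assumes the standard-library `std::min`, `std::max`, and `std::clamp` in place of the project's own `Min`/`Max` helpers, and the `Bounds` layout, `clampColor`, `minOf4`, `maxOf4`, and `clampDelta` are hypothetical stand-ins, not the code from this PR. Whether the result is actually branchless depends on the compiler and target.

```cpp
#include <algorithm>  // std::min, std::max, std::clamp
#include <cassert>

// Hypothetical stand-in for the Bounds struct in Tesselator.h.
struct Bounds {
    float lo[3];
    float hi[3];

    // Grow this box so it also covers `other`. Taking a const reference
    // avoids copying the struct on every call.
    void addBounds(const Bounds& other) {
        for (int i = 0; i < 3; ++i) {
            lo[i] = std::min(lo[i], other.lo[i]);
            hi[i] = std::max(hi[i], other.hi[i]);
        }
    }
};

// Clamp a colour channel into [0, 255] with a single call instead of two ifs.
inline int clampColor(int c) { return std::clamp(c, 0, 255); }

// Min and max over four quad coordinates via the initializer-list overloads.
inline float minOf4(float a, float b, float c, float d) { return std::min({a, b, c, d}); }
inline float maxOf4(float a, float b, float c, float d) { return std::max({a, b, c, d}); }

// Clamp a texture delta into [-limit, limit]. std::clamp requires lo <= hi,
// so the limit must not be negative.
inline float clampDelta(float d, float limit) {
    assert(limit >= 0.0f);
    return std::clamp(d, -limit, limit);
}
```

With optimizations enabled, compilers commonly lower `std::min` and `std::max` on floats to single min/max instructions, but that is worth confirming in the generated assembly for the target platform.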
Contributor
Amazing
Contributor
Initial spawn does seem a bit quicker. Rendering while moving is about the same, which is a shame considering your optimizations.
Chunk.cpp/h:
- thread_local occluder table via std::array<bool,256> IIFE; queries isSolidRender() for all 255 tile types instead of only stone/dirt/bedrock
- Null-check getChunkAt() early-out with correct COMPILED/NOTSKYLIT/EMPTY flags
- Defer Region + TileRenderer construction past empty-check to skip 9 chunk lookups and 128KB memset for empty chunks (common post-teleport)
- Replace Win32 TLS API (TlsAlloc/TlsSetValue/TlsGetValue) + new/delete with static thread_local array for per-thread tileIds storage
LevelRenderer.cpp:
- Rewrite updateDirtyChunks nearest-chunk search as linear ClipChunk scan; dirty-flag checked before any distance work, batch-clear empty dirty chunks upfront via isRenderChunkEmpty to shrink dirty set across frames
- const on all distance/flag locals
Region.h/cpp:
- Stack buffer flatChunks_stack[16] for common render-chunk regions (4x4), std::unique_ptr<LevelChunk*[]> heap fallback for large pathfinding regions; eliminates per-rebuild heap alloc/free on the hot path
- Deleted copy/move ops; std::fill_n for null-init; static constexpr constant
TileRenderer.h/cpp:
- static thread_local cache array replaces per-instance new/delete
- Remove dead getLightColorCount, cacheOwned, conditional delete[]
- static constexpr cache size; defaulted destructor
Level.cpp: Region constructed directly on stack (no copy)
stdafx.h: added <array>
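The "thread_local occluder table via std::array<bool,256> IIFE" item could look roughly like the sketch below. The `Tile` struct, the `tileById` registry, and the `isOccluder` wrapper are hypothetical stand-ins for the engine's real types; only the `isSolidRender()` name comes from the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the engine's tile type.
struct Tile {
    bool solid;
    bool isSolidRender() const { return solid; }
};

// Hypothetical registry: id 1 is a solid tile, id 20 is a see-through tile,
// and every other id is unregistered.
inline const Tile* tileById(std::size_t id) {
    static const Tile solidTile{true};
    static const Tile clearTile{false};
    switch (id) {
        case 1:  return &solidTile;
        case 20: return &clearTile;
        default: return nullptr;
    }
}

// Returns true if the tile id fully hides whatever is behind it.
// The table is built once per thread by an immediately invoked lambda (IIFE),
// so the hot path is a single array load with no virtual call and no
// chain of id comparisons.
inline bool isOccluder(unsigned char id) {
    static thread_local const std::array<bool, 256> table = [] {
        std::array<bool, 256> t{};  // id 0 (air) stays false
        for (std::size_t i = 1; i < t.size(); ++i) {
            const Tile* tile = tileById(i);
            t[i] = tile != nullptr && tile->isSolidRender();
        }
        return t;
    }();
    assert(id < table.size());
    return table[id];
}
```

Making the table `thread_local` gives each rebuild thread its own copy, which avoids any synchronization on first use at the cost of 256 bytes per thread.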
bump, this works.
Author
More optimisations done now, with a noticeable difference in my testing.
Description
This PR introduces a multi-tiered optimization for the chunk rebuilding process. It replaces repeated branching logic with a pre-calculated lookup table and eliminates a redundant pre-calculation loop (removing a 3D boolean array). This significantly reduces memory bandwidth and CPU overhead on chunk rebuild threads, resulting in much faster terrain generation, quicker mesh building, and smoother rendering during player movement.
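One possible shape for the single-pass approach is sketched below. It is only an illustration under stated assumptions: the chunk size `kSize`, the `ChunkTiles` layout, `tileIndex`, and `countFullyOccluded` are all hypothetical, and the real rebuild loop emits geometry rather than counting tiles. The point is that each neighbour test is a table load done inline, with no separate pass filling a 3D boolean array first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kSize = 16;  // hypothetical chunk edge length

// Flat index into a kSize^3 tile array.
inline std::size_t tileIndex(int x, int y, int z) {
    assert(x >= 0 && x < kSize && y >= 0 && y < kSize && z >= 0 && z < kSize);
    return static_cast<std::size_t>((y * kSize + z) * kSize + x);
}

// Hypothetical flat tile-id storage for one chunk; id 0 means air.
struct ChunkTiles {
    std::vector<std::uint8_t> ids = std::vector<std::uint8_t>(kSize * kSize * kSize, 0);
    std::uint8_t at(int x, int y, int z) const { return ids[tileIndex(x, y, z)]; }
};

using OccluderTable = std::array<bool, 256>;

// Counts interior tiles whose six neighbours are all occluders, in one pass.
// Occlusion is evaluated inline from the lookup table; nothing is
// precomputed into a per-tile boolean array beforehand.
inline int countFullyOccluded(const ChunkTiles& c, const OccluderTable& occ) {
    int hidden = 0;
    for (int y = 1; y < kSize - 1; ++y) {
        for (int z = 1; z < kSize - 1; ++z) {
            for (int x = 1; x < kSize - 1; ++x) {
                if (c.at(x, y, z) == 0) continue;  // air is never rendered
                const bool covered =
                    occ[c.at(x - 1, y, z)] && occ[c.at(x + 1, y, z)] &&
                    occ[c.at(x, y - 1, z)] && occ[c.at(x, y + 1, z)] &&
                    occ[c.at(x, y, z - 1)] && occ[c.at(x, y, z + 1)];
                if (covered) ++hidden;  // a real renderer would skip meshing here
            }
        }
    }
    return hidden;
}
```

In a chunk filled with a single occluding tile, every interior tile is counted; replacing one interior tile with air removes that tile and its six neighbours from the count.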
Changes
Previous Behavior
During chunk rebuilding, the engine utilized a two-pass system:
Root Cause
New Behavior
The engine now calculates occlusion purely inline using a single-pass system:
Fix Implementation
AI Use Disclosure
No AI was used
Related Issues
N/A