diff --git a/docs/lakehouse/data-cache.md b/docs/lakehouse/data-cache.md index 2599d01d95e7f..c7772bf552a85 100644 --- a/docs/lakehouse/data-cache.md +++ b/docs/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "Data Cache", "language": "en", - "description": "Data Cache accelerates subsequent queries of the same data by caching recently accessed data files from remote storage systems (HDFS or object " + "description": "Apache Doris Data Cache accelerates Lakehouse queries by caching HDFS and object storage data locally. Supports cache warmup, quota control, and admission control for Hive, Iceberg, Hudi, and Paimon tables." } --- @@ -198,6 +198,7 @@ Usage restrictions: FROM hive_db.tpch100_parquet.lineitem WHERE dt = '2025-01-01'; ``` + 3. Warm up partial columns by filter conditions ```sql @@ -224,14 +225,147 @@ The system directly returns scan and cache write statistics for each BE (Note: S Field explanations: -* ScanRows: Number of rows scanned and read. -* ScanBytes: Amount of data scanned and read. -* ScanBytesFromLocalStorage: Amount of data scanned and read from local cache. -* ScanBytesFromRemoteStorage: Amount of data scanned and read from remote storage. -* BytesWriteIntoCache: Amount of data written to Data Cache during this warmup. +* `ScanRows`: Number of rows scanned and read. +* `ScanBytes`: Amount of data scanned and read. +* `ScanBytesFromLocalStorage`: Amount of data scanned and read from local cache. +* `ScanBytesFromRemoteStorage`: Amount of data scanned and read from remote storage. +* `BytesWriteIntoCache`: Amount of data written to Data Cache during this warmup. + +## Cache Admission Control + +> This is an experimental feature and is supported since version 4.1.0. + +The cache admission control feature provides a mechanism that allows users to control whether data read by a query is allowed to enter the File Cache (Data Cache) based on dimensions such as User, Catalog, Database, and Table. 
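
For intuition, the admission decision these rules produce can be sketched in a few lines. The following Python snippet is an illustrative model only, not Doris's implementation: it assumes rules shaped like the JSON format documented later in this section, and applies the most-specific-level-first matching, blacklist-over-whitelist-at-the-same-level, and deny-by-default behavior described under "Rule Matching Principles" below.

```python
# Illustrative model of cache admission matching; not Doris's actual code.
LEVELS = [
    ("catalog_name", "database_name", "table_name"),  # Table-level rules
    ("catalog_name", "database_name"),                # Database-level rules
    ("catalog_name",),                                # Catalog-level rules
    (),                                               # Global rules
]

def admits(query, rules):
    """Return True if data read by this query may enter the cache."""
    scope_fields = ("catalog_name", "database_name", "table_name")
    for level in LEVELS:
        decisions = []
        for rule in rules:
            if not rule.get("enabled"):
                continue
            # A rule belongs to this level iff exactly these scope fields are set.
            if {f for f in scope_fields if rule.get(f)} != set(level):
                continue
            # Every field the rule specifies must match the query
            # (an empty user_identity matches all users).
            fields = ("user_identity",) + scope_fields
            if any(rule.get(f) and rule[f] != query.get(f) for f in fields):
                continue
            decisions.append(rule["rule_type"])
        if 0 in decisions:      # blacklist wins within the same level
            return False
        if 1 in decisions:      # otherwise a whitelist at this level admits
            return True
    return False                # no rule matched at any level: deny by default

rules = [
    # Deny caching for root on one large table...
    {"id": 1, "user_identity": "root@%", "catalog_name": "hive_cat",
     "database_name": "db1", "table_name": "table1",
     "rule_type": 0, "enabled": 1},
    # ...but allow the rest of the catalog for everyone.
    {"id": 2, "user_identity": "", "catalog_name": "hive_cat",
     "database_name": "", "table_name": "", "rule_type": 1, "enabled": 1},
]

blocked = {"user_identity": "root@%", "catalog_name": "hive_cat",
           "database_name": "db1", "table_name": "table1"}
allowed = {"user_identity": "root@%", "catalog_name": "hive_cat",
           "database_name": "db2", "table_name": "other"}
print(admits(blocked, rules))  # False: table-level blacklist matches first
print(admits(allowed, rules))  # True: falls through to the catalog-level whitelist
```

Note that the sketch omits the additional validation Doris applies, described below: rules must respect the Catalog → Database → Table hierarchy, and globally scoped rules require a specified user.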
+In scenarios with massive cold data reads (e.g., large-scale ETL jobs or heavy ad-hoc queries), if all read data is allowed to enter the cache, it may cause existing hot data to be frequently evicted (i.e., "cache pollution"), leading to a drop in cache hit rates and overall query performance. When enabled, data denied admission will be pulled directly from remote underlying storage (e.g., HDFS, S3), effectively protecting core hot data from being swapped out. + +The cache admission control feature is disabled by default and needs to be enabled by configuring relevant parameters in the FE. + +### FE Configuration + +You need to enable this feature and specify the rule configuration file path in `fe.conf`, then restart the FE node for it to take effect. Modifications to the rule files themselves can be loaded dynamically. + +| Parameter | Required | Description | +| ----------------------------------------------- | -------- |-------------------------------------------| +| `enable_file_cache_admission_control` | Yes | Whether to enable cache admission control. Default is `false`. | +| `file_cache_admission_control_json_dir` | Yes | The directory path for storing admission rules JSON files. All `.json` files in this directory will be automatically loaded, and any rule additions, deletions, or modifications will **take effect dynamically**. | + +### Admission Rules Configuration Format + +Rule configurations are placed in `.json` files under the `file_cache_admission_control_json_dir` directory. The file content must be in a JSON array format. + +#### Field Description + +| Field Name | Type | Description | Example | +|--------|------|-------------------------------|-----------------------| +| `id` | Long | Rule ID. | `1` | +| `user_identity` | String | User identity (format: `user@host`, e.g., `%` matches all IPs). **Leaving it empty or omitting it matches all users globally.** | `"root@%"` | +| `catalog_name` | String | Catalog name. 
**Leaving it empty or omitting it matches all catalogs.** | `"hive_cat"` | +| `database_name` | String | Database name. **Leaving it empty or omitting it matches all databases.** | `"db1"` | +| `table_name` | String | Table name. **Leaving it empty or omitting it matches all tables.** | `"tbl1"` | +| `partition_pattern` | String | (Not implemented yet) Partition regular expression. Empty means matching all partitions. | `""` | +| `rule_type` | Integer | Rule type: `0` means deny cache (blacklist); `1` means allow cache (whitelist). | `0` | +| `enabled` | Integer | Whether the current rule is enabled: `0` means disabled; `1` means enabled. | `1` | +| `created_time` | Long | Creation time (UNIX timestamp, seconds). | `1766557246` | +| `updated_time` | Long | Update time (UNIX timestamp, seconds). | `1766557246` | + +#### JSON File Example + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### Import Rules from MySQL + +For users with automated system integration needs, an auxiliary script is provided at `tools/export_mysql_rule_to_json.sh` in the Doris source code repository. This script can be used to export cache admission rules pre-stored in a MySQL database table into JSON configuration files that comply with the above format. + +### Rule Matching Principles + +#### Rule Scope Categories + +By combining different fields (`user_identity`, `catalog_name`, `database_name`, `table_name`) as either empty or specific values, the system supports 7 dimensions of valid rules. 
**Any rule configuration that does not respect this hierarchy (for example, specifying a Table while leaving Database empty) is considered invalid and is ignored.**
+
+| user_identity | catalog_name | database_name | table_name | Level and Scope |
+|---------------|--------------|---------------|------------|------------------|
+| **Specified** | **Specified** | **Specified** | **Specified** | Table-level rule for specified user |
+| Empty or Omitted | **Specified** | **Specified** | **Specified** | Table-level rule for all users |
+| **Specified** | **Specified** | **Specified** | Empty or Omitted | Database-level rule for specified user |
+| Empty or Omitted | **Specified** | **Specified** | Empty or Omitted | Database-level rule for all users |
+| **Specified** | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for specified user |
+| Empty or Omitted | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for all users |
+| **Specified** | Empty or Omitted | Empty or Omitted | Empty or Omitted | Global rule for specified user |
+
+#### Matching Priority and Order
+
+When a query accesses a table, the system evaluates all applicable rules to reach an admission decision, following these principles:
+
+1. **Exact Match First**: Rules are matched from the most specific level to the broadest (Table → Database → Catalog → Global). As soon as a rule matches at a given level (e.g., the Table level), the decision is made and evaluation stops.
+2. **Blacklist First (Security Principle)**: Within the same rule level, **deny (blacklist) rules always take precedence over allow (whitelist) rules**: if a query matches both a blacklist and a whitelist rule at the same level, caching is denied.
+
+The complete decision derivation sequence is as follows:
+
+```text
+1. 
Table-level rule matching + a) Hit Blacklist (rule_type=0) -> Deny + b) Hit Whitelist (rule_type=1) -> Allow +2. Database-level rule matching + ... +3. Catalog-level rule matching + ... +4. Global rule matching (only user_identity matched) + ... +5. Default fallback decision: If no rules at any level above match, caching is [Denied] by default (equivalent to a global blacklist). +``` + +> **Tip**: Because the system's fallback strategy is default-deny, best practice when deploying this feature is generally to establish a broad global allowance rule (e.g., a whitelist for all users, or for an important business Catalog), and then configure targeted Table-level blacklists for large tables known to undergo offline full-table scans. This achieves refined separation of cold and hot data. + +### Cache Decision Observability + +After successfully enabling and applying the configuration, users can view detailed cache admission decisions at the file data level for a single table via the `EXPLAIN` command (refer to the `file cache request` output below). + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +Key fields and decision descriptions: +- **ADMITTED** / **DENIED**: Represents whether the request is allowed (ADMITTED) or rejected (DENIED) from entering the cache. +- **user_identity**: The user identity verified during the execution of this query. +- **reason**: The specific decision reason (the matched rule) that triggered the result. Common outputs include: `user table-level whitelist rule` (Current example: Table-level whitelist for a specified user); `common table-level blacklist rule` (Table-level blacklist for all users). 
The format is generally `[Scope] [Rule Level] [Rule Type] rule`.
+- **cost**: The time taken to evaluate the admission rules for this request, in milliseconds. If this overhead becomes too large, it can be reduced by simplifying the rule set (fewer rules and fewer rule levels).

 ## Appendix

 ### Principle

-Data caching caches accessed remote data to the local BE node. The original data file is split into Blocks based on the accessed IO size, and Blocks are stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata saved in the BE node. When accessing the same remote file, doris checks whether the cache data of the file exists in the local cache and determines which data to read from the local Block and which data to pull from the remote based on the Block's offset and size, caching the newly pulled remote data. When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches the upper limit, it cleans up long-unused Blocks according to the LRU principle.
\ No newline at end of file
+Data caching stores accessed remote data on the local BE node. The original data file is split into Blocks based on the accessed IO size, and each Block is stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata kept on the BE node. When the same remote file is accessed again, Doris checks whether cached data for the file exists locally and, based on each Block's offset and size, determines which data to read from local Blocks and which to pull from remote storage, caching the newly pulled remote data. When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches its upper limit, long-unused Blocks are evicted according to the LRU principle.
\ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/data-cache.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/data-cache.md index 9582364896a94..2f47bdacb85dd 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/data-cache.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "数据缓存", "language": "zh-CN", - "description": "数据缓存(Data Cache)通过缓存最近访问的远端存储系统(HDFS 或对象存储)的数据文件到本地磁盘上,加速后续访问相同数据的查询。在频繁访问相同数据的查询场景中,Data Cache 可以避免重复的远端数据访问开销,提升热点数据的查询分析性能和稳定性。" + "description": "Apache Doris 数据缓存(Data Cache)将 HDFS 及对象存储数据缓存至本地磁盘,加速 Lakehouse 查询性能。支持缓存预热、配额限制与准入控制,适用于 Hive、Iceberg、Hudi、Paimon 表查询场景。" } --- @@ -37,7 +37,7 @@ | 参数 | 必选项 | 说明 | | ------------------- | --- | -------------------------------------- | | `enable_file_cache` | 是 | 是否启用 Data Cache,默认 false | -| `file_cache_path` | 是 | 缓存目录的相关配置,json 格式。 | +| `file_cache_path` | 是 | 缓存目录的相关配置,JSON 格式。 | | `clear_file_cache` | 否 | 默认 false。如果为 true,则当 BE 节点重启时,会清空缓存目录。 | `file_cache_path` 的配置示例: @@ -70,7 +70,7 @@ SET GLOBAL enable_file_cache = true; ### 查看缓存命中情况 -执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 web 页面的 `Queris` 标签中查看到作业的 Profile。数据缓存相关的指标如下: +执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 Web 页面的 `Queries` 标签中查看到作业的 Profile。数据缓存相关的指标如下: ```sql - FileCache: 0ns @@ -198,6 +198,7 @@ FROM FROM hive_db.tpch100_parquet.lineitem WHERE dt = '2025-01-01'; ``` + 3. 
根据过滤条件预热部分列 ```sql @@ -224,14 +225,155 @@ FROM 字段解释 -* ScanRows:扫描读取行数。 -* ScanBytes:扫描读取数据量。 -* ScanBytesFromLocalStorage:从本地缓存扫描读取的数据量。 -* ScanBytesFromRemoteStorage:从远端存储扫描读取的数据量。 -* BytesWriteIntoCache:本次预热写入 Data Cache 的数据量。 +* `ScanRows`:扫描读取行数。 +* `ScanBytes`:扫描读取数据量。 +* `ScanBytesFromLocalStorage`:从本地缓存扫描读取的数据量。 +* `ScanBytesFromRemoteStorage`:从远端存储扫描读取的数据量。 +* `BytesWriteIntoCache`:本次预热写入 Data Cache 的数据量。 + +## 缓存准入控制 + +> 该功能为实验性功能,自 4.1.0 版本起支持。 + +通过缓存准入控制功能,用户可以基于操作用户、Catalog、Database 以及 Table 等维度配置规则,精细化管理查询产生的数据是否允许写入 Data Cache。 + +在某些业务场景(如全表扫描的大规模 ETL 作业或不可预期的 Ad-hoc 分析)中,大量的冷数据读取可能会迅速填满缓存空间,导致高频访问的“热数据”被频繁置换(即缓存污染问题)。此时系统整体的缓存命中率和查询性能将出现严重下滑。通过配置缓存准入规则拒绝此类作业的数据进入缓存,可以有效保护热数据,确保整个系统的 Data Cache 命中率维持在稳定水平。 + +缓存准入控制功能默认关闭,需要在 FE 节点上配置相关参数来开启。 + +### FE 配置 + +需要在 FE 节点的配置文件 `fe.conf` 中新增如下参数以开启缓存准入控制。开启及配置目录路径需要重启 FE 节点生效,但目录下规则文件的修改支持动态加载。 + +| 参数 | 必选项 | 说明 | +| ----------------------------------------- | ------ | ------------------------------------------------------------ | +| `enable_file_cache_admission_control` | 是 | 是否启用缓存准入控制功能。默认为 `false`(不启用)。 | +| `file_cache_admission_control_json_dir` | 是 | 存放缓存准入规则 JSON 文件的目录路径。该目录下的所有 `.json` 文件均会被自动加载并实时监听修改,规则更替**动态生效**无需重启节点。 | + +### 准入控制规则配置 + +规则文件应存放于 `file_cache_admission_control_json_dir` 所配置的目录下,且后缀需为 `.json`。 + +#### 参数说明 + +规则以 JSON 数组形式提供,每个 JSON Object 代表一条规则,各字段及说明如下: + +| 字段名 | 类型 | 说明 | 示例 | +| ------------------- | -------- | ------------------------------------------------------------ | ------------ | +| `id` | Long | 规则 ID,供用户区分不同的规则。 | `1` | +| `user_identity` | String | 用户标识(格式为 `user@host`,其中 `%` 表示匹配所有 IP)。**留空 (`""`) 表示匹配全局所有用户**。 | `"root@%"` | +| `catalog_name` | String | Catalog 名称。**留空 (`""`) 表示匹配所有 Catalog**。 | `"hive_cat"` | +| `database_name` | String | Database 名称。**留空 (`""`) 表示匹配所有 Database**。 | `"db1"` | +| `table_name` | String | Table 名称。**留空 (`""`) 表示匹配所有表**。 | `"tbl1"` | +| `partition_pattern` | String | (当前版本暂未实现)分区正则表达式。留空表示匹配所有分区。 | `""` | +| `rule_type` | 
Integer | 规则类型。`0` 表示禁止缓存(黑名单);`1` 表示允许缓存(白名单)。 | `0` | +| `enabled` | Integer | 该规则是否生效启用。`0` 表示停用;`1` 表示启用。 | `1` | +| `created_time` | Long | 记录的创建时间(UNIX 时间戳,秒)。 | `1766557246` | +| `updated_time` | Long | 记录的最新更新时间(UNIX 时间戳,秒)。 | `1766557246` | + +#### JSON 文件样例 + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### 从 MySQL 导入规则 + +对于有自动化系统对接需求的用户,在 Doris 源码库 `tools/export_mysql_rule_to_json.sh` 路径下提供了辅助脚本。可以使用该脚本将预先存储在 MySQL 数据库表中的缓存准入规则导出为符合上述格式的 JSON 配置文件。 + +### 规则匹配原理 + +#### 规则作用域分类 + +根据填写的字段明确程度,规则支持以下几种层级和形式的生效范围: + +| user_identity | catalog_name | database_name | table_name | 级别与作用域 | +| ------------- | ------------ | ------------- | ---------- | ---------------------------- | +| **非空** | **非空** | **非空** | **非空** | **指定用户・Table 级规则** | +| 空缺或留空 | **非空** | **非空** | **非空** | **全体用户・Table 级规则** | +| **非空** | **非空** | **非空** | 空缺或留空 | **指定用户・Database 级规则**| +| 空缺或留空 | **非空** | **非空** | 空缺或留空 | **全体用户・Database 级规则**| +| **非空** | **非空** | 空缺或留空 | 空缺或留空 | **指定用户・Catalog 级规则** | +| 空缺或留空 | **非空** | 空缺或留空 | 空缺或留空 | **全体用户・Catalog 级规则** | +| **非空** | 空缺或留空 | 空缺或留空 | 空缺或留空 | **指定用户・全局规则** | + +> **说明:** +> - **空缺或留空**:表示该字段在 JSON 中配置为空字符串 `""` 或者是直接省略不配置(效果等同于空字符串 `""`)。 +> - **非空**:表示对应字段必须是一个明确的值(如 `"hive_cat"`)。 +> - 任何跳跃或不符合层级依赖关系的规则配置均视为**无效规则**,不会被解析生效。例如:配置了 `database_name` 但 `catalog_name` 为空。 + +#### 匹配及优先级顺序 + +判定某个查询访问目标表数据时是否写入缓存,受配置的多条规则综合影响。匹配原则如下: + +1. **精确优先原则**:按层级由明细到宽泛的顺序(Table → Database → Catalog → 全局)进行匹配,优先匹配精确度更高的规则。 +2. 
**拒绝优先(安全优先)**:在同一个规则内或同级别(如既存在指定用户的黑名单规则,又存在全局的白名单规则),**禁止缓存(黑名单)规则的优先级永远高于允许缓存(白名单)规则**。拒绝访问的决策最先被识别生效。 + +完整的决策推导链路如下: + +```text +1. Table 级规则匹配 + a) 命中黑名单 (rule_type=0) -> 拒绝 + b) 命中白名单 (rule_type=1) -> 允许 +2. Database 级规则匹配 + ... +3. Catalog 级规则匹配 + ... +4. 全局规则匹配 (仅匹配 user_identity) + ... +5. 兜底默认决策:如果所有的层级也没有匹配上,系统默认【拒绝】写入缓存操作(等同全局黑名单)。 +``` + +> **提示**:由于系统默认兜底为拒绝准入,因此在实际部署该功能时,最佳实践通常是建立一条广泛的全局允许规则(如全员白名单,或某个重要业务 Catalog 的白名单),然后再针对已知会进行全表扫描的大表配置针对性的 Table 级黑名单,由此实现精细化的冷热数据剥离。 + +### 缓存决策可观测性 + +当开启规则并在系统中生效起作用后,用户可以通过向目标表发送 `EXPLAIN` 命令,审查其 File Cache 准入控制决策情况(聚焦关注节点下方的 `file cache request` 决策输出信息)。 + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +重点字段与判定说明: +- **ADMITTED** / **DENIED**:代表请求被允许写入缓存(ADMITTED)或被直接拒绝(DENIED)。如果被拒绝,数据访问直接击穿至远端底层存储。 +- **user_identity**:执行本次查询鉴别出的用户凭证身份。 +- **reason**:命中判定策略的具体原因说明。常见的输出原因如:`user table-level whitelist rule`(当前示例:指定用户 Table 级白名单规则);`common table-level blacklist rule`(全体用户 Table 级黑名单规则)。此类原因格式一般为 `[作用范围] [规则级别] [规则类型] rule`。 +- **cost**:完成整个准入匹配计算过程的耗时开销(以毫秒 ms 为单位)。如果开销过大,可通过简化规则层级数进行调整。 ## 附录 ### 原理 -数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 读取,哪些数据从远程拉起,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 +数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,Doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 
读取,哪些数据从远程拉取,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/data-cache.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/data-cache.md index 3267a7eebf2b9..2f47bdacb85dd 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/data-cache.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "数据缓存", "language": "zh-CN", - "description": "数据缓存(Data Cache)通过缓存最近访问的远端存储系统(HDFS 或对象存储)的数据文件到本地磁盘上,加速后续访问相同数据的查询。在频繁访问相同数据的查询场景中,Data Cache 可以避免重复的远端数据访问开销,提升热点数据的查询分析性能和稳定性。" + "description": "Apache Doris 数据缓存(Data Cache)将 HDFS 及对象存储数据缓存至本地磁盘,加速 Lakehouse 查询性能。支持缓存预热、配额限制与准入控制,适用于 Hive、Iceberg、Hudi、Paimon 表查询场景。" } --- @@ -37,7 +37,7 @@ | 参数 | 必选项 | 说明 | | ------------------- | --- | -------------------------------------- | | `enable_file_cache` | 是 | 是否启用 Data Cache,默认 false | -| `file_cache_path` | 是 | 缓存目录的相关配置,json 格式。 | +| `file_cache_path` | 是 | 缓存目录的相关配置,JSON 格式。 | | `clear_file_cache` | 否 | 默认 false。如果为 true,则当 BE 节点重启时,会清空缓存目录。 | `file_cache_path` 的配置示例: @@ -70,7 +70,7 @@ SET GLOBAL enable_file_cache = true; ### 查看缓存命中情况 -执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 web 页面的 `Queris` 标签中查看到作业的 Profile。数据缓存相关的指标如下: +执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 Web 页面的 `Queries` 标签中查看到作业的 Profile。数据缓存相关的指标如下: ```sql - FileCache: 0ns @@ -107,8 +107,273 @@ SET GLOBAL enable_file_cache = true; 用户可以通过系统表 [`file_cache_statistics`](../admin-manual/system-tables/information_schema/file_cache_statistics) 查看各个 Backend 节点的缓存统计指标。 +## 缓存的配额 + +> 该功能自 4.0.3 版本支持。 + +缓存配额(Cache Query Limit)功能允许用户限制单个查询可以使用的文件缓存百分比。在多用户或复杂查询共享缓存资源的场景下,单个大查询可能会占用过多的缓存空间,导致其他查询的热点数据被淘汰。通过设置查询配额,可以保证资源的公平使用,防止缓存抖动。 + +查询占用的缓存空间指的是该查询因数据未命中而填充到缓存中的数据总大小。如果该查询填充的总大小已经达到配额限制,那么查询后续填充的数据会基于 LRU 算法替换先前填充的数据。 + +### 配置说明 + +该功能涉及 BE 和 FE 两端的配置,以及会话变量(Session Variable)的设置。 + +**1. 
BE 配置** + +- `enable_file_cache_query_limit`: + - 类型:Boolean + - 默认值:`false` + - 说明:BE 端文件缓存查询限制功能的主开关。只有开启此开关,BE 才会处理 FE 传递的查询限制参数。 + +**2. FE 配置** + +- `file_cache_query_limit_max_percent`: + - 类型:Integer + - 默认值:`100` + - 说明:查询的最大配额约束,用于校验会话变量的上限。它确保用户设置的查询限制不会超过此值。 + +**3. 会话变量 (Session Variables)** + +- `file_cache_query_limit_percent`: + - 类型:Integer (1-100) + - 说明:文件缓存查询限制百分比。设置单个查询可使用的最大缓存比例。该值上限受 `file_cache_query_limit_max_percent` 约束。建议计算后的缓存配额不低于 256MB,如果低于该值,BE 会在日志中进行告警提示。 + +**使用示例** + +```sql +-- 设置会话变量,限制单个查询最多使用 50% 的缓存 +SET file_cache_query_limit_percent = 50; + +-- 执行查询 +SELECT * FROM large_table; +``` + +**注意:** +1. 设置的值必须在 [0, `file_cache_query_limit_max_percent`] 范围内。 + +## 缓存预热 + +Data Cache 提供缓存“预热(Warmup)”功能,允许将外部数据提前加载到 BE 节点的本地缓存中,从而提升后续首次查询的命中率和查询性能。 + +> 该功能自 4.0.2 版本支持。 + +### 语法 + +```sql +WARM UP SELECT +FROM +[WHERE ] +``` + +使用限制: + +* 支持: + + * 单表查询(仅允许一个 table_reference) + * 指定列的简单 SELECT + * WHERE 过滤(支持常规谓词) + +* 不支持: + + * JOIN、UNION、子查询、CTE + * GROUP BY、HAVING、ORDER BY + * LIMIT + * INTO OUTFILE + * 多表 / 复杂查询计划 + * 其它复杂语法 + +### 示例 + +1. 预热整张表 + + ```sql + WARM UP SELECT * FROM hive_db.tpch100_parquet.lineitem; + ``` + +2. 根据分区预热部分列 + + ```sql + WARM UP SELECT l_orderkey, l_shipmode + FROM hive_db.tpch100_parquet.lineitem + WHERE dt = '2025-01-01'; + ``` + +3. 
根据过滤条件预热部分列 + + ```sql + WARM UP SELECT l_shipmode, l_linestatus + FROM hive_db.tpch100_parquet.lineitem + WHERE l_orderkey = 123456; + ``` + +### 执行返回结果 + +执行 `WARM UP SELECT` 后,FE 会下发任务至各 BE。BE 扫描远端数据并写入 Data Cache。 + +系统会直接返回各 BE 的扫描与缓存写入统计信息(注意:统计信息基本准确,但会有一定误差)。例如: + +``` ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +| BackendId | ScanRows | ScanBytes | ScanBytesFromLocalStorage | ScanBytesFromRemoteStorage | BytesWriteIntoCache | ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +| 1755134092928 | 294744184 | 11821864798 | 538154009 | 11283717130 | 11899799492 | +| 1755134092929 | 305293718 | 12244439301 | 560970435 | 11683475207 | 12332861380 | +| TOTAL | 600037902 | 24066304099 | 1099124444 | 22967192337 | 24232660872 | ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +``` + +字段解释 + +* `ScanRows`:扫描读取行数。 +* `ScanBytes`:扫描读取数据量。 +* `ScanBytesFromLocalStorage`:从本地缓存扫描读取的数据量。 +* `ScanBytesFromRemoteStorage`:从远端存储扫描读取的数据量。 +* `BytesWriteIntoCache`:本次预热写入 Data Cache 的数据量。 + +## 缓存准入控制 + +> 该功能为实验性功能,自 4.1.0 版本起支持。 + +通过缓存准入控制功能,用户可以基于操作用户、Catalog、Database 以及 Table 等维度配置规则,精细化管理查询产生的数据是否允许写入 Data Cache。 + +在某些业务场景(如全表扫描的大规模 ETL 作业或不可预期的 Ad-hoc 分析)中,大量的冷数据读取可能会迅速填满缓存空间,导致高频访问的“热数据”被频繁置换(即缓存污染问题)。此时系统整体的缓存命中率和查询性能将出现严重下滑。通过配置缓存准入规则拒绝此类作业的数据进入缓存,可以有效保护热数据,确保整个系统的 Data Cache 命中率维持在稳定水平。 + +缓存准入控制功能默认关闭,需要在 FE 节点上配置相关参数来开启。 + +### FE 配置 + +需要在 FE 节点的配置文件 `fe.conf` 中新增如下参数以开启缓存准入控制。开启及配置目录路径需要重启 FE 节点生效,但目录下规则文件的修改支持动态加载。 + +| 参数 | 必选项 | 说明 | +| ----------------------------------------- | ------ | ------------------------------------------------------------ | +| `enable_file_cache_admission_control` | 是 | 是否启用缓存准入控制功能。默认为 `false`(不启用)。 | +| `file_cache_admission_control_json_dir` | 是 | 存放缓存准入规则 JSON 文件的目录路径。该目录下的所有 `.json` 
文件均会被自动加载并实时监听修改,规则更替**动态生效**无需重启节点。 | + +### 准入控制规则配置 + +规则文件应存放于 `file_cache_admission_control_json_dir` 所配置的目录下,且后缀需为 `.json`。 + +#### 参数说明 + +规则以 JSON 数组形式提供,每个 JSON Object 代表一条规则,各字段及说明如下: + +| 字段名 | 类型 | 说明 | 示例 | +| ------------------- | -------- | ------------------------------------------------------------ | ------------ | +| `id` | Long | 规则 ID,供用户区分不同的规则。 | `1` | +| `user_identity` | String | 用户标识(格式为 `user@host`,其中 `%` 表示匹配所有 IP)。**留空 (`""`) 表示匹配全局所有用户**。 | `"root@%"` | +| `catalog_name` | String | Catalog 名称。**留空 (`""`) 表示匹配所有 Catalog**。 | `"hive_cat"` | +| `database_name` | String | Database 名称。**留空 (`""`) 表示匹配所有 Database**。 | `"db1"` | +| `table_name` | String | Table 名称。**留空 (`""`) 表示匹配所有表**。 | `"tbl1"` | +| `partition_pattern` | String | (当前版本暂未实现)分区正则表达式。留空表示匹配所有分区。 | `""` | +| `rule_type` | Integer | 规则类型。`0` 表示禁止缓存(黑名单);`1` 表示允许缓存(白名单)。 | `0` | +| `enabled` | Integer | 该规则是否生效启用。`0` 表示停用;`1` 表示启用。 | `1` | +| `created_time` | Long | 记录的创建时间(UNIX 时间戳,秒)。 | `1766557246` | +| `updated_time` | Long | 记录的最新更新时间(UNIX 时间戳,秒)。 | `1766557246` | + +#### JSON 文件样例 + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### 从 MySQL 导入规则 + +对于有自动化系统对接需求的用户,在 Doris 源码库 `tools/export_mysql_rule_to_json.sh` 路径下提供了辅助脚本。可以使用该脚本将预先存储在 MySQL 数据库表中的缓存准入规则导出为符合上述格式的 JSON 配置文件。 + +### 规则匹配原理 + +#### 规则作用域分类 + +根据填写的字段明确程度,规则支持以下几种层级和形式的生效范围: + +| user_identity | catalog_name | database_name | table_name | 级别与作用域 | +| ------------- | ------------ | ------------- | ---------- | ---------------------------- | +| **非空** | 
**非空** | **非空** | **非空** | **指定用户・Table 级规则** | +| 空缺或留空 | **非空** | **非空** | **非空** | **全体用户・Table 级规则** | +| **非空** | **非空** | **非空** | 空缺或留空 | **指定用户・Database 级规则**| +| 空缺或留空 | **非空** | **非空** | 空缺或留空 | **全体用户・Database 级规则**| +| **非空** | **非空** | 空缺或留空 | 空缺或留空 | **指定用户・Catalog 级规则** | +| 空缺或留空 | **非空** | 空缺或留空 | 空缺或留空 | **全体用户・Catalog 级规则** | +| **非空** | 空缺或留空 | 空缺或留空 | 空缺或留空 | **指定用户・全局规则** | + +> **说明:** +> - **空缺或留空**:表示该字段在 JSON 中配置为空字符串 `""` 或者是直接省略不配置(效果等同于空字符串 `""`)。 +> - **非空**:表示对应字段必须是一个明确的值(如 `"hive_cat"`)。 +> - 任何跳跃或不符合层级依赖关系的规则配置均视为**无效规则**,不会被解析生效。例如:配置了 `database_name` 但 `catalog_name` 为空。 + +#### 匹配及优先级顺序 + +判定某个查询访问目标表数据时是否写入缓存,受配置的多条规则综合影响。匹配原则如下: + +1. **精确优先原则**:按层级由明细到宽泛的顺序(Table → Database → Catalog → 全局)进行匹配,优先匹配精确度更高的规则。 +2. **拒绝优先(安全优先)**:在同一个规则内或同级别(如既存在指定用户的黑名单规则,又存在全局的白名单规则),**禁止缓存(黑名单)规则的优先级永远高于允许缓存(白名单)规则**。拒绝访问的决策最先被识别生效。 + +完整的决策推导链路如下: + +```text +1. Table 级规则匹配 + a) 命中黑名单 (rule_type=0) -> 拒绝 + b) 命中白名单 (rule_type=1) -> 允许 +2. Database 级规则匹配 + ... +3. Catalog 级规则匹配 + ... +4. 全局规则匹配 (仅匹配 user_identity) + ... +5. 
兜底默认决策:如果所有的层级也没有匹配上,系统默认【拒绝】写入缓存操作(等同全局黑名单)。 +``` + +> **提示**:由于系统默认兜底为拒绝准入,因此在实际部署该功能时,最佳实践通常是建立一条广泛的全局允许规则(如全员白名单,或某个重要业务 Catalog 的白名单),然后再针对已知会进行全表扫描的大表配置针对性的 Table 级黑名单,由此实现精细化的冷热数据剥离。 + +### 缓存决策可观测性 + +当开启规则并在系统中生效起作用后,用户可以通过向目标表发送 `EXPLAIN` 命令,审查其 File Cache 准入控制决策情况(聚焦关注节点下方的 `file cache request` 决策输出信息)。 + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +重点字段与判定说明: +- **ADMITTED** / **DENIED**:代表请求被允许写入缓存(ADMITTED)或被直接拒绝(DENIED)。如果被拒绝,数据访问直接击穿至远端底层存储。 +- **user_identity**:执行本次查询鉴别出的用户凭证身份。 +- **reason**:命中判定策略的具体原因说明。常见的输出原因如:`user table-level whitelist rule`(当前示例:指定用户 Table 级白名单规则);`common table-level blacklist rule`(全体用户 Table 级黑名单规则)。此类原因格式一般为 `[作用范围] [规则级别] [规则类型] rule`。 +- **cost**:完成整个准入匹配计算过程的耗时开销(以毫秒 ms 为单位)。如果开销过大,可通过简化规则层级数进行调整。 + ## 附录 ### 原理 -数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 读取,哪些数据从远程拉起,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 +数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,Doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 读取,哪些数据从远程拉取,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/data-cache.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/data-cache.md index 9582364896a94..2f47bdacb85dd 
100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/data-cache.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "数据缓存", "language": "zh-CN", - "description": "数据缓存(Data Cache)通过缓存最近访问的远端存储系统(HDFS 或对象存储)的数据文件到本地磁盘上,加速后续访问相同数据的查询。在频繁访问相同数据的查询场景中,Data Cache 可以避免重复的远端数据访问开销,提升热点数据的查询分析性能和稳定性。" + "description": "Apache Doris 数据缓存(Data Cache)将 HDFS 及对象存储数据缓存至本地磁盘,加速 Lakehouse 查询性能。支持缓存预热、配额限制与准入控制,适用于 Hive、Iceberg、Hudi、Paimon 表查询场景。" } --- @@ -37,7 +37,7 @@ | 参数 | 必选项 | 说明 | | ------------------- | --- | -------------------------------------- | | `enable_file_cache` | 是 | 是否启用 Data Cache,默认 false | -| `file_cache_path` | 是 | 缓存目录的相关配置,json 格式。 | +| `file_cache_path` | 是 | 缓存目录的相关配置,JSON 格式。 | | `clear_file_cache` | 否 | 默认 false。如果为 true,则当 BE 节点重启时,会清空缓存目录。 | `file_cache_path` 的配置示例: @@ -70,7 +70,7 @@ SET GLOBAL enable_file_cache = true; ### 查看缓存命中情况 -执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 web 页面的 `Queris` 标签中查看到作业的 Profile。数据缓存相关的指标如下: +执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 Web 页面的 `Queries` 标签中查看到作业的 Profile。数据缓存相关的指标如下: ```sql - FileCache: 0ns @@ -198,6 +198,7 @@ FROM FROM hive_db.tpch100_parquet.lineitem WHERE dt = '2025-01-01'; ``` + 3. 
根据过滤条件预热部分列 ```sql @@ -224,14 +225,155 @@ FROM 字段解释 -* ScanRows:扫描读取行数。 -* ScanBytes:扫描读取数据量。 -* ScanBytesFromLocalStorage:从本地缓存扫描读取的数据量。 -* ScanBytesFromRemoteStorage:从远端存储扫描读取的数据量。 -* BytesWriteIntoCache:本次预热写入 Data Cache 的数据量。 +* `ScanRows`:扫描读取行数。 +* `ScanBytes`:扫描读取数据量。 +* `ScanBytesFromLocalStorage`:从本地缓存扫描读取的数据量。 +* `ScanBytesFromRemoteStorage`:从远端存储扫描读取的数据量。 +* `BytesWriteIntoCache`:本次预热写入 Data Cache 的数据量。 + +## 缓存准入控制 + +> 该功能为实验性功能,自 4.1.0 版本起支持。 + +通过缓存准入控制功能,用户可以基于操作用户、Catalog、Database 以及 Table 等维度配置规则,精细化管理查询产生的数据是否允许写入 Data Cache。 + +在某些业务场景(如全表扫描的大规模 ETL 作业或不可预期的 Ad-hoc 分析)中,大量的冷数据读取可能会迅速填满缓存空间,导致高频访问的“热数据”被频繁置换(即缓存污染问题)。此时系统整体的缓存命中率和查询性能将出现严重下滑。通过配置缓存准入规则拒绝此类作业的数据进入缓存,可以有效保护热数据,确保整个系统的 Data Cache 命中率维持在稳定水平。 + +缓存准入控制功能默认关闭,需要在 FE 节点上配置相关参数来开启。 + +### FE 配置 + +需要在 FE 节点的配置文件 `fe.conf` 中新增如下参数以开启缓存准入控制。开启及配置目录路径需要重启 FE 节点生效,但目录下规则文件的修改支持动态加载。 + +| 参数 | 必选项 | 说明 | +| ----------------------------------------- | ------ | ------------------------------------------------------------ | +| `enable_file_cache_admission_control` | 是 | 是否启用缓存准入控制功能。默认为 `false`(不启用)。 | +| `file_cache_admission_control_json_dir` | 是 | 存放缓存准入规则 JSON 文件的目录路径。该目录下的所有 `.json` 文件均会被自动加载并实时监听修改,规则更替**动态生效**无需重启节点。 | + +### 准入控制规则配置 + +规则文件应存放于 `file_cache_admission_control_json_dir` 所配置的目录下,且后缀需为 `.json`。 + +#### 参数说明 + +规则以 JSON 数组形式提供,每个 JSON Object 代表一条规则,各字段及说明如下: + +| 字段名 | 类型 | 说明 | 示例 | +| ------------------- | -------- | ------------------------------------------------------------ | ------------ | +| `id` | Long | 规则 ID,供用户区分不同的规则。 | `1` | +| `user_identity` | String | 用户标识(格式为 `user@host`,其中 `%` 表示匹配所有 IP)。**留空 (`""`) 表示匹配全局所有用户**。 | `"root@%"` | +| `catalog_name` | String | Catalog 名称。**留空 (`""`) 表示匹配所有 Catalog**。 | `"hive_cat"` | +| `database_name` | String | Database 名称。**留空 (`""`) 表示匹配所有 Database**。 | `"db1"` | +| `table_name` | String | Table 名称。**留空 (`""`) 表示匹配所有表**。 | `"tbl1"` | +| `partition_pattern` | String | (当前版本暂未实现)分区正则表达式。留空表示匹配所有分区。 | `""` | +| `rule_type` | 
Integer | 规则类型。`0` 表示禁止缓存(黑名单);`1` 表示允许缓存(白名单)。 | `0` | +| `enabled` | Integer | 该规则是否生效启用。`0` 表示停用;`1` 表示启用。 | `1` | +| `created_time` | Long | 记录的创建时间(UNIX 时间戳,秒)。 | `1766557246` | +| `updated_time` | Long | 记录的最新更新时间(UNIX 时间戳,秒)。 | `1766557246` | + +#### JSON 文件样例 + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### 从 MySQL 导入规则 + +对于有自动化系统对接需求的用户,在 Doris 源码库 `tools/export_mysql_rule_to_json.sh` 路径下提供了辅助脚本。可以使用该脚本将预先存储在 MySQL 数据库表中的缓存准入规则导出为符合上述格式的 JSON 配置文件。 + +### 规则匹配原理 + +#### 规则作用域分类 + +根据填写的字段明确程度,规则支持以下几种层级和形式的生效范围: + +| user_identity | catalog_name | database_name | table_name | 级别与作用域 | +| ------------- | ------------ | ------------- | ---------- | ---------------------------- | +| **非空** | **非空** | **非空** | **非空** | **指定用户・Table 级规则** | +| 空缺或留空 | **非空** | **非空** | **非空** | **全体用户・Table 级规则** | +| **非空** | **非空** | **非空** | 空缺或留空 | **指定用户・Database 级规则**| +| 空缺或留空 | **非空** | **非空** | 空缺或留空 | **全体用户・Database 级规则**| +| **非空** | **非空** | 空缺或留空 | 空缺或留空 | **指定用户・Catalog 级规则** | +| 空缺或留空 | **非空** | 空缺或留空 | 空缺或留空 | **全体用户・Catalog 级规则** | +| **非空** | 空缺或留空 | 空缺或留空 | 空缺或留空 | **指定用户・全局规则** | + +> **说明:** +> - **空缺或留空**:表示该字段在 JSON 中配置为空字符串 `""` 或者是直接省略不配置(效果等同于空字符串 `""`)。 +> - **非空**:表示对应字段必须是一个明确的值(如 `"hive_cat"`)。 +> - 任何跳跃或不符合层级依赖关系的规则配置均视为**无效规则**,不会被解析生效。例如:配置了 `database_name` 但 `catalog_name` 为空。 + +#### 匹配及优先级顺序 + +判定某个查询访问目标表数据时是否写入缓存,受配置的多条规则综合影响。匹配原则如下: + +1. **精确优先原则**:按层级由明细到宽泛的顺序(Table → Database → Catalog → 全局)进行匹配,优先匹配精确度更高的规则。 +2. 
**拒绝优先(安全优先)**:在同一个规则内或同级别(如既存在指定用户的黑名单规则,又存在全局的白名单规则),**禁止缓存(黑名单)规则的优先级永远高于允许缓存(白名单)规则**。拒绝访问的决策最先被识别生效。 + +完整的决策推导链路如下: + +```text +1. Table 级规则匹配 + a) 命中黑名单 (rule_type=0) -> 拒绝 + b) 命中白名单 (rule_type=1) -> 允许 +2. Database 级规则匹配 + ... +3. Catalog 级规则匹配 + ... +4. 全局规则匹配 (仅匹配 user_identity) + ... +5. 兜底默认决策:如果所有的层级也没有匹配上,系统默认【拒绝】写入缓存操作(等同全局黑名单)。 +``` + +> **提示**:由于系统默认兜底为拒绝准入,因此在实际部署该功能时,最佳实践通常是建立一条广泛的全局允许规则(如全员白名单,或某个重要业务 Catalog 的白名单),然后再针对已知会进行全表扫描的大表配置针对性的 Table 级黑名单,由此实现精细化的冷热数据剥离。 + +### 缓存决策可观测性 + +当开启规则并在系统中生效起作用后,用户可以通过向目标表发送 `EXPLAIN` 命令,审查其 File Cache 准入控制决策情况(聚焦关注节点下方的 `file cache request` 决策输出信息)。 + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +重点字段与判定说明: +- **ADMITTED** / **DENIED**:代表请求被允许写入缓存(ADMITTED)或被直接拒绝(DENIED)。如果被拒绝,数据访问直接击穿至远端底层存储。 +- **user_identity**:执行本次查询鉴别出的用户凭证身份。 +- **reason**:命中判定策略的具体原因说明。常见的输出原因如:`user table-level whitelist rule`(当前示例:指定用户 Table 级白名单规则);`common table-level blacklist rule`(全体用户 Table 级黑名单规则)。此类原因格式一般为 `[作用范围] [规则级别] [规则类型] rule`。 +- **cost**:完成整个准入匹配计算过程的耗时开销(以毫秒 ms 为单位)。如果开销过大,可通过简化规则层级数进行调整。 ## 附录 ### 原理 -数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 读取,哪些数据从远程拉起,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 +数据缓存将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,Doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 
读取,哪些数据从远程拉取,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。 diff --git a/versioned_docs/version-3.x/lakehouse/data-cache.md b/versioned_docs/version-3.x/lakehouse/data-cache.md index 11cd686a9ab94..c7772bf552a85 100644 --- a/versioned_docs/version-3.x/lakehouse/data-cache.md +++ b/versioned_docs/version-3.x/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "Data Cache", "language": "en", - "description": "Data Cache accelerates subsequent queries of the same data by caching recently accessed data files from remote storage systems (HDFS or object " + "description": "Apache Doris Data Cache accelerates Lakehouse queries by caching HDFS and object storage data locally. Supports cache warmup, quota control, and admission control for Hive, Iceberg, Hudi, and Paimon tables." } --- @@ -107,8 +107,265 @@ If `BytesScannedFromRemote` is 0, it means the cache is fully hit. Users can view cache statistics for each Backend node through the system table [`file_cache_statistics`](../admin-manual/system-tables/information_schema/file_cache_statistics). +## Cache Query Limit + +> This feature is supported since version 4.0.3. + +The Cache Query Limit feature allows users to limit the percentage of file cache that a single query can use. In scenarios where multiple users or complex queries share cache resources, a single large query might occupy too much cache space, causing other queries' hot data to be evicted. By setting a query limit, you can ensure fair resource usage and prevent cache thrashing. + +The cache space occupied by a query refers to the total size of data populated into the cache due to cache misses. If the total size populated by the query reaches the quota limit, subsequent data populated by the query will replace the previously populated data based on the LRU algorithm. + +### Configuration + +This feature involves configuration on BE and FE, as well as session variable settings. + +**1. 
BE Configuration** + +- `enable_file_cache_query_limit`: + - Type: Boolean + - Default: `false` + - Description: The master switch for the file cache query limit feature on the BE side. Only when enabled will the BE process the query limit parameters passed from the FE. + +**2. FE Configuration** + +- `file_cache_query_limit_max_percent`: + - Type: Integer + - Default: `100` + - Description: The upper bound used to validate the session variable; the query limit set by users cannot exceed this value. + +**3. Session Variables** + +- `file_cache_query_limit_percent`: + - Type: Integer (1-100) + - Description: The file cache query limit percentage, i.e., the maximum percentage of the cache that a single query can use. This value is constrained by `file_cache_query_limit_max_percent`. It is recommended that the resulting cache quota be no less than 256 MB; if it is lower, the BE will print a warning in its log. + +**Usage Example** + +```sql +-- Set session variable to limit a query to use at most 50% of the cache +SET file_cache_query_limit_percent = 50; + +-- Execute query +SELECT * FROM large_table; +``` + +**Note:** the value must be within the range [0, `file_cache_query_limit_max_percent`]. + +## Cache Warmup + +Data Cache provides a cache "warmup" feature that allows preloading external data into the local cache of BE nodes, thereby improving cache hit rates and query performance for subsequent first-time queries. + +> This feature is supported since version 4.0.2. + +### Syntax + +```sql +WARM UP SELECT <column_list> +FROM <table_reference> +[WHERE <predicates>] +``` + +Usage restrictions: + +* Supported: + + * Single table queries (only one table_reference allowed) + * Simple SELECT for specified columns + * WHERE filtering (supports regular predicates) + +* Not supported: + + * JOIN, UNION, subqueries, CTE + * GROUP BY, HAVING, ORDER BY + * LIMIT + * INTO OUTFILE + * Multi-table / complex query plans + * Other complex syntax + +### Examples + +1.
Warm up the entire table + + ```sql + WARM UP SELECT * FROM hive_db.tpch100_parquet.lineitem; + ``` + +2. Warm up partial columns by partition + + ```sql + WARM UP SELECT l_orderkey, l_shipmode + FROM hive_db.tpch100_parquet.lineitem + WHERE dt = '2025-01-01'; + ``` + +3. Warm up partial columns by filter conditions + + ```sql + WARM UP SELECT l_shipmode, l_linestatus + FROM hive_db.tpch100_parquet.lineitem + WHERE l_orderkey = 123456; + ``` + +### Execution Results + +After executing `WARM UP SELECT`, the FE dispatches tasks to each BE. The BE scans remote data and writes it to Data Cache. + +The system directly returns scan and cache write statistics for each BE (Note: Statistics are generally accurate but may have some margin of error). For example: + +``` ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +| BackendId | ScanRows | ScanBytes | ScanBytesFromLocalStorage | ScanBytesFromRemoteStorage | BytesWriteIntoCache | ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +| 1755134092928 | 294744184 | 11821864798 | 538154009 | 11283717130 | 11899799492 | +| 1755134092929 | 305293718 | 12244439301 | 560970435 | 11683475207 | 12332861380 | +| TOTAL | 600037902 | 24066304099 | 1099124444 | 22967192337 | 24232660872 | ++---------------+-----------+-------------+---------------------------+----------------------------+---------------------+ +``` + +Field explanations: + +* `ScanRows`: Number of rows scanned and read. +* `ScanBytes`: Amount of data scanned and read. +* `ScanBytesFromLocalStorage`: Amount of data scanned and read from local cache. +* `ScanBytesFromRemoteStorage`: Amount of data scanned and read from remote storage. +* `BytesWriteIntoCache`: Amount of data written to Data Cache during this warmup. + +## Cache Admission Control + +> This is an experimental feature and is supported since version 4.1.0. 
+ +The cache admission control feature provides a mechanism that allows users to control whether data read by a query is allowed to enter the File Cache (Data Cache) based on dimensions such as User, Catalog, Database, and Table. +In scenarios with massive cold data reads (e.g., large-scale ETL jobs or heavy ad-hoc queries), if all read data is allowed to enter the cache, it may cause existing hot data to be frequently evicted (i.e., "cache pollution"), leading to a drop in cache hit rates and overall query performance. When enabled, data denied admission will be pulled directly from remote underlying storage (e.g., HDFS, S3), effectively protecting core hot data from being swapped out. + +The cache admission control feature is disabled by default and needs to be enabled by configuring relevant parameters in the FE. + +### FE Configuration + +You need to enable this feature and specify the rule configuration file path in `fe.conf`, then restart the FE node for it to take effect. Modifications to the rule files themselves can be loaded dynamically. + +| Parameter | Required | Description | +| ----------------------------------------------- | -------- |-------------------------------------------| +| `enable_file_cache_admission_control` | Yes | Whether to enable cache admission control. Default is `false`. | +| `file_cache_admission_control_json_dir` | Yes | The directory path for storing admission rules JSON files. All `.json` files in this directory will be automatically loaded, and any rule additions, deletions, or modifications will **take effect dynamically**. | + +### Admission Rules Configuration Format + +Rule configurations are placed in `.json` files under the `file_cache_admission_control_json_dir` directory. The file content must be in a JSON array format. + +#### Field Description + +| Field Name | Type | Description | Example | +|--------|------|-------------------------------|-----------------------| +| `id` | Long | Rule ID. 
| `1` | +| `user_identity` | String | User identity (format: `user@host`, e.g., `%` matches all IPs). **Leaving it empty or omitting it matches all users globally.** | `"root@%"` | +| `catalog_name` | String | Catalog name. **Leaving it empty or omitting it matches all catalogs.** | `"hive_cat"` | +| `database_name` | String | Database name. **Leaving it empty or omitting it matches all databases.** | `"db1"` | +| `table_name` | String | Table name. **Leaving it empty or omitting it matches all tables.** | `"tbl1"` | +| `partition_pattern` | String | (Not implemented yet) Partition regular expression. Empty means matching all partitions. | `""` | +| `rule_type` | Integer | Rule type: `0` means deny cache (blacklist); `1` means allow cache (whitelist). | `0` | +| `enabled` | Integer | Whether the current rule is enabled: `0` means disabled; `1` means enabled. | `1` | +| `created_time` | Long | Creation time (UNIX timestamp, seconds). | `1766557246` | +| `updated_time` | Long | Update time (UNIX timestamp, seconds). | `1766557246` | + +#### JSON File Example + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### Import Rules from MySQL + +For users with automated system integration needs, an auxiliary script is provided at `tools/export_mysql_rule_to_json.sh` in the Doris source code repository. This script can be used to export cache admission rules pre-stored in a MySQL database table into JSON configuration files that comply with the above format. 
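+Because a rule that skips a level of the scope hierarchy (for example, a `database_name` with an empty `catalog_name`) is treated as invalid and will not take effect, it can be useful to lint a rules file before placing it in the watched directory. The following is a minimal, standalone Python sketch of that check — not part of Doris, and the helper name is illustrative:

```python
import json

def invalid_rule_ids(rules):
    """Flag rules that skip a level in the Catalog -> Database -> Table
    hierarchy: table_name requires database_name, which requires catalog_name."""
    bad = []
    for r in rules:
        catalog = r.get("catalog_name", "")
        database = r.get("database_name", "")
        table = r.get("table_name", "")
        if (table and not database) or (database and not catalog):
            bad.append(r.get("id"))
    return bad

rules = json.loads("""[
  {"id": 1, "user_identity": "root@%", "catalog_name": "hive_cat",
   "database_name": "db1", "table_name": "table1", "rule_type": 0, "enabled": 1},
  {"id": 2, "user_identity": "", "catalog_name": "",
   "database_name": "db1", "table_name": "", "rule_type": 1, "enabled": 1}
]""")

print(invalid_rule_ids(rules))  # rule 2 sets database_name without a catalog_name
```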
+ +### Rule Matching Principles + +#### Rule Scope Categories + +By combining different fields (`user_identity`, `catalog_name`, `database_name`, `table_name`) as either empty or specific values, the system supports 7 dimensions of valid rules. **Any rule configuration that does not comply with hierarchical dependencies (for example, skipping Database to specify Table directly) will be considered invalid.** + +| user_identity | catalog_name | database_name | table_name | Level and Scope | +|---------------|--------------|---------------|------------|------------------| +| **Specified** | **Specified** | **Specified** | **Specified** | Table-level rule for specified user | +| Empty or Omitted | **Specified** | **Specified** | **Specified** | Table-level rule for all users | +| **Specified** | **Specified** | **Specified** | Empty or Omitted | Database-level rule for specified user | +| Empty or Omitted | **Specified** | **Specified** | Empty or Omitted | Database-level rule for all users | +| **Specified** | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for specified user | +| Empty or Omitted | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for all users | +| **Specified** | Empty or Omitted | Empty or Omitted | Empty or Omitted | Global rule for specified user | + +#### Matching Priority and Order + +When a query accesses a table's data, the system comprehensively evaluates all rules to make an admission decision. The judgment process follows these principles: + +1. **Exact Match First**: Matching is conducted in order from specific to broad hierarchy (Table → Database → Catalog → Global). Once successfully matched at the most precise level (e.g., Table level), the judgment terminates immediately. +2. **Blacklist First (Security Principle)**: Within the same rule level, **deny cache rules always take precedence over allow cache rules**. 
If both a blacklist and a whitelist are matched simultaneously, the blacklist operation executes first, ensuring that access denial decisions at the same level take effect first. + +The complete decision derivation sequence is as follows: + +```text +1. Table-level rule matching + a) Hit Blacklist (rule_type=0) -> Deny + b) Hit Whitelist (rule_type=1) -> Allow +2. Database-level rule matching + ... +3. Catalog-level rule matching + ... +4. Global rule matching (only user_identity matched) + ... +5. Default fallback decision: If no rules at any level above match, caching is [Denied] by default (equivalent to a global blacklist). +``` + +> **Tip**: Because the system's fallback strategy is default-deny, best practice when deploying this feature is generally to establish a broad global allowance rule (e.g., a whitelist for all users, or for an important business Catalog), and then configure targeted Table-level blacklists for large tables known to undergo offline full-table scans. This achieves refined separation of cold and hot data. + +### Cache Decision Observability + +After successfully enabling and applying the configuration, users can view detailed cache admission decisions at the file data level for a single table via the `EXPLAIN` command (refer to the `file cache request` output below). + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +Key fields and decision descriptions: +- **ADMITTED** / **DENIED**: Represents whether the request is allowed (ADMITTED) or rejected (DENIED) from entering the cache. +- **user_identity**: The user identity verified during the execution of this query. 
+- **reason**: The specific decision reason (the matched rule) that triggered the result. Common outputs include: `user table-level whitelist rule` (Current example: Table-level whitelist for a specified user); `common table-level blacklist rule` (Table-level blacklist for all users). The format is generally `[Scope] [Rule Level] [Rule Type] rule`. +- **cost**: The time spent on the admission matching calculation, in milliseconds. If this overhead is too high, it can be reduced by simplifying the rule set (fewer rules and fewer scope levels). + ## Appendix ### Principle -Data caching caches accessed remote data to the local BE node. The original data file is split into Blocks based on the accessed IO size, and Blocks are stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata saved in the BE node. When accessing the same remote file, doris checks whether the cache data of the file exists in the local cache and determines which data to read from the local Block and which data to pull from the remote based on the Block's offset and size, caching the newly pulled remote data. When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches the upper limit, it cleans up long-unused Blocks according to the LRU principle. \ No newline at end of file +Data caching caches accessed remote data to the local BE node. The original data file is split into Blocks based on the accessed IO size, and Blocks are stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata saved in the BE node.
When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches the upper limit, it cleans up long-unused Blocks according to the LRU principle. \ No newline at end of file diff --git a/versioned_docs/version-4.x/lakehouse/data-cache.md b/versioned_docs/version-4.x/lakehouse/data-cache.md index 2599d01d95e7f..c7772bf552a85 100644 --- a/versioned_docs/version-4.x/lakehouse/data-cache.md +++ b/versioned_docs/version-4.x/lakehouse/data-cache.md @@ -2,7 +2,7 @@ { "title": "Data Cache", "language": "en", - "description": "Data Cache accelerates subsequent queries of the same data by caching recently accessed data files from remote storage systems (HDFS or object " + "description": "Apache Doris Data Cache accelerates Lakehouse queries by caching HDFS and object storage data locally. Supports cache warmup, quota control, and admission control for Hive, Iceberg, Hudi, and Paimon tables." } --- @@ -198,6 +198,7 @@ Usage restrictions: FROM hive_db.tpch100_parquet.lineitem WHERE dt = '2025-01-01'; ``` + 3. Warm up partial columns by filter conditions ```sql @@ -224,14 +225,147 @@ The system directly returns scan and cache write statistics for each BE (Note: S Field explanations: -* ScanRows: Number of rows scanned and read. -* ScanBytes: Amount of data scanned and read. -* ScanBytesFromLocalStorage: Amount of data scanned and read from local cache. -* ScanBytesFromRemoteStorage: Amount of data scanned and read from remote storage. -* BytesWriteIntoCache: Amount of data written to Data Cache during this warmup. +* `ScanRows`: Number of rows scanned and read. +* `ScanBytes`: Amount of data scanned and read. +* `ScanBytesFromLocalStorage`: Amount of data scanned and read from local cache. +* `ScanBytesFromRemoteStorage`: Amount of data scanned and read from remote storage. +* `BytesWriteIntoCache`: Amount of data written to Data Cache during this warmup. 
+ +## Cache Admission Control + +> This is an experimental feature and is supported since version 4.1.0. + +The cache admission control feature provides a mechanism that allows users to control whether data read by a query is allowed to enter the File Cache (Data Cache) based on dimensions such as User, Catalog, Database, and Table. +In scenarios with massive cold data reads (e.g., large-scale ETL jobs or heavy ad-hoc queries), if all read data is allowed to enter the cache, it may cause existing hot data to be frequently evicted (i.e., "cache pollution"), leading to a drop in cache hit rates and overall query performance. When enabled, data denied admission will be pulled directly from remote underlying storage (e.g., HDFS, S3), effectively protecting core hot data from being swapped out. + +The cache admission control feature is disabled by default and needs to be enabled by configuring relevant parameters in the FE. + +### FE Configuration + +You need to enable this feature and specify the rule configuration file path in `fe.conf`, then restart the FE node for it to take effect. Modifications to the rule files themselves can be loaded dynamically. + +| Parameter | Required | Description | +| ----------------------------------------------- | -------- |-------------------------------------------| +| `enable_file_cache_admission_control` | Yes | Whether to enable cache admission control. Default is `false`. | +| `file_cache_admission_control_json_dir` | Yes | The directory path for storing admission rules JSON files. All `.json` files in this directory will be automatically loaded, and any rule additions, deletions, or modifications will **take effect dynamically**. | + +### Admission Rules Configuration Format + +Rule configurations are placed in `.json` files under the `file_cache_admission_control_json_dir` directory. The file content must be in a JSON array format. 
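+Because the rule files are plain JSON and changes in the directory take effect dynamically, they can be generated by scripts as well as written by hand. A minimal Python sketch (the file name, table name, and timestamps below are purely illustrative):

```python
import json

# Illustrative rule: deny caching (blacklist) for one table, for all users.
rules = [{
    "id": 1,
    "user_identity": "",             # empty matches all users
    "catalog_name": "hive_cat",
    "database_name": "db1",
    "table_name": "big_scan_tbl",    # hypothetical table name
    "partition_pattern": "",
    "rule_type": 0,                  # 0 = deny cache (blacklist)
    "enabled": 1,
    "created_time": 1766557246,
    "updated_time": 1766557246,
}]

# Write the JSON array into the directory configured as
# file_cache_admission_control_json_dir (current directory here for brevity).
with open("deny_big_scans.json", "w") as f:
    json.dump(rules, f, indent=2)
```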
+ +#### Field Description + +| Field Name | Type | Description | Example | +|--------|------|-------------------------------|-----------------------| +| `id` | Long | Rule ID. | `1` | +| `user_identity` | String | User identity (format: `user@host`, e.g., `%` matches all IPs). **Leaving it empty or omitting it matches all users globally.** | `"root@%"` | +| `catalog_name` | String | Catalog name. **Leaving it empty or omitting it matches all catalogs.** | `"hive_cat"` | +| `database_name` | String | Database name. **Leaving it empty or omitting it matches all databases.** | `"db1"` | +| `table_name` | String | Table name. **Leaving it empty or omitting it matches all tables.** | `"tbl1"` | +| `partition_pattern` | String | (Not implemented yet) Partition regular expression. Empty means matching all partitions. | `""` | +| `rule_type` | Integer | Rule type: `0` means deny cache (blacklist); `1` means allow cache (whitelist). | `0` | +| `enabled` | Integer | Whether the current rule is enabled: `0` means disabled; `1` means enabled. | `1` | +| `created_time` | Long | Creation time (UNIX timestamp, seconds). | `1766557246` | +| `updated_time` | Long | Update time (UNIX timestamp, seconds). | `1766557246` | + +#### JSON File Example + +```json +[ + { + "id": 1, + "user_identity": "root@%", + "catalog_name": "hive_cat", + "database_name": "db1", + "table_name": "table1", + "partition_pattern": "", + "rule_type": 0, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + }, + { + "id": 2, + "user_identity": "", + "catalog_name": "hive_cat", + "database_name": "", + "table_name": "", + "partition_pattern": "", + "rule_type": 1, + "enabled": 1, + "created_time": 1766557246, + "updated_time": 1766557246 + } +] +``` + +#### Import Rules from MySQL + +For users with automated system integration needs, an auxiliary script is provided at `tools/export_mysql_rule_to_json.sh` in the Doris source code repository. 
This script can be used to export cache admission rules pre-stored in a MySQL database table into JSON configuration files that comply with the above format. + +### Rule Matching Principles + +#### Rule Scope Categories + +By combining different fields (`user_identity`, `catalog_name`, `database_name`, `table_name`) as either empty or specific values, the system supports 7 dimensions of valid rules. **Any rule configuration that does not comply with hierarchical dependencies (for example, skipping Database to specify Table directly) will be considered invalid.** + +| user_identity | catalog_name | database_name | table_name | Level and Scope | +|---------------|--------------|---------------|------------|------------------| +| **Specified** | **Specified** | **Specified** | **Specified** | Table-level rule for specified user | +| Empty or Omitted | **Specified** | **Specified** | **Specified** | Table-level rule for all users | +| **Specified** | **Specified** | **Specified** | Empty or Omitted | Database-level rule for specified user | +| Empty or Omitted | **Specified** | **Specified** | Empty or Omitted | Database-level rule for all users | +| **Specified** | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for specified user | +| Empty or Omitted | **Specified** | Empty or Omitted | Empty or Omitted | Catalog-level rule for all users | +| **Specified** | Empty or Omitted | Empty or Omitted | Empty or Omitted | Global rule for specified user | + +#### Matching Priority and Order + +When a query accesses a table's data, the system comprehensively evaluates all rules to make an admission decision. The judgment process follows these principles: + +1. **Exact Match First**: Matching is conducted in order from specific to broad hierarchy (Table → Database → Catalog → Global). Once successfully matched at the most precise level (e.g., Table level), the judgment terminates immediately. +2. 
**Blacklist First (Security Principle)**: Within the same rule level, **deny cache rules always take precedence over allow cache rules**. If both a blacklist and a whitelist are matched simultaneously, the blacklist operation executes first, ensuring that access denial decisions at the same level take effect first. + +The complete decision derivation sequence is as follows: + +```text +1. Table-level rule matching + a) Hit Blacklist (rule_type=0) -> Deny + b) Hit Whitelist (rule_type=1) -> Allow +2. Database-level rule matching + ... +3. Catalog-level rule matching + ... +4. Global rule matching (only user_identity matched) + ... +5. Default fallback decision: If no rules at any level above match, caching is [Denied] by default (equivalent to a global blacklist). +``` + +> **Tip**: Because the system's fallback strategy is default-deny, best practice when deploying this feature is generally to establish a broad global allowance rule (e.g., a whitelist for all users, or for an important business Catalog), and then configure targeted Table-level blacklists for large tables known to undergo offline full-table scans. This achieves refined separation of cold and hot data. + +### Cache Decision Observability + +After successfully enabling and applying the configuration, users can view detailed cache admission decisions at the file data level for a single table via the `EXPLAIN` command (refer to the `file cache request` output below). + +```text +| 0:VHIVE_SCAN_NODE(74) | +| table: test_file_cache_features.tpch1_parquet.lineitem | +| inputSplitNum=10, totalFileSize=205792918, scanRanges=10 | +| partition=1/1 | +| cardinality=1469949, numNodes=1 | +| pushdown agg=NONE | +| file cache request ADMITTED: user_identity:root@%, reason:user table-level whitelist rule, cost:0.058 ms | +| limit: 1 | +``` + +Key fields and decision descriptions: +- **ADMITTED** / **DENIED**: Represents whether the request is allowed (ADMITTED) or rejected (DENIED) from entering the cache. 
+- **user_identity**: The user identity verified during the execution of this query. +- **reason**: The specific decision reason (the matched rule) that triggered the result. Common outputs include: `user table-level whitelist rule` (Current example: Table-level whitelist for a specified user); `common table-level blacklist rule` (Table-level blacklist for all users). The format is generally `[Scope] [Rule Level] [Rule Type] rule`. +- **cost**: The time spent on the admission matching calculation, in milliseconds. If this overhead is too high, it can be reduced by simplifying the rule set (fewer rules and fewer scope levels). ## Appendix ### Principle -Data caching caches accessed remote data to the local BE node. The original data file is split into Blocks based on the accessed IO size, and Blocks are stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata saved in the BE node. When accessing the same remote file, doris checks whether the cache data of the file exists in the local cache and determines which data to read from the local Block and which data to pull from the remote based on the Block's offset and size, caching the newly pulled remote data. When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches the upper limit, it cleans up long-unused Blocks according to the LRU principle. \ No newline at end of file +Data caching caches accessed remote data to the local BE node. The original data file is split into Blocks based on the accessed IO size, and Blocks are stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, with Block metadata saved in the BE node.
When accessing the same remote file, Doris checks whether the cache data of the file exists in the local cache and determines which data to read from the local Block and which data to pull from the remote based on the Block's offset and size, caching the newly pulled remote data. When the BE node restarts, it scans the `cache_path` directory to restore Block metadata. When the cache size reaches the upper limit, it cleans up long-unused Blocks according to the LRU principle. \ No newline at end of file
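
The block layout and eviction behavior described in the principle above can be modeled in a few lines of standalone Python. This is an illustrative sketch, not Doris source code: the concrete hash function is an assumption (SHA-1 here), and the LRU bookkeeping is reduced to an `OrderedDict`.

```python
import hashlib
from collections import OrderedDict

def block_path(cache_path: str, filepath: str, offset: int) -> str:
    """Mirror the cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset
    layout described above (the hash function chosen here is an assumption)."""
    h = hashlib.sha1(filepath.encode()).hexdigest()
    return f"{cache_path}/{h[:3]}/{h}/{offset}"

class LruBlockCache:
    """Toy LRU over cached blocks: reads refresh recency, and inserts evict
    the least recently used blocks once total size exceeds capacity."""
    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = OrderedDict()  # block path -> size in bytes

    def get(self, path):
        if path in self.blocks:
            self.blocks.move_to_end(path)  # mark as recently used
            return True
        return False

    def put(self, path, size):
        if path in self.blocks:            # re-inserting refreshes the entry
            self.used -= self.blocks.pop(path)
        self.blocks[path] = size
        self.used += size
        while self.used > self.capacity:
            _, evicted = self.blocks.popitem(last=False)  # evict oldest
            self.used -= evicted

cache = LruBlockCache(capacity_bytes=2)
p1 = block_path("/cache", "s3://bucket/file.parquet", 0)
p2 = block_path("/cache", "s3://bucket/file.parquet", 1048576)
cache.put(p1, 1)
cache.put(p2, 1)
cache.get(p1)  # touch p1 so p2 becomes the least recently used block
cache.put(block_path("/cache", "s3://bucket/other.parquet", 0), 1)
print(cache.get(p1), cache.get(p2))  # True False
```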