update apache doris result based on 4.1 RC01 and c7a.48xl, modify benchmark script #820

Open
HappenLee wants to merge 4 commits into ClickHouse:main from HappenLee:doris
Conversation

@HappenLee

Thank You for Your Contribution!

We appreciate your effort and contribution to the project. To ensure that your Pull Request (PR) adheres to our guidelines, please review the rules in our contribution guidelines:

ClickHouse/ClickBench Contribution Rules

Thank you for your attention to these details and for helping us maintain the quality and integrity of the project.

Copilot AI review requested due to automatic review settings March 18, 2026 14:09

Copilot AI left a comment

Pull request overview

Updates the Apache Doris ClickBench entry to reflect benchmarking on Doris 4.1.0 RC01 (c7a.metal-48xl), including updating the benchmark automation and publishing new measured results.

Changes:

  • Update doris/benchmark.sh to download Doris 4.1.0 RC01 and load the dataset from partitioned Parquet via local(...) TVF, then run queries inline (instead of run.sh).
  • Remove doris/run.sh (query execution loop moved into benchmark.sh).
  • Update doris/results/c7a.metal-48xl.json with new date/load time/data size and query timings.
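The inline query execution described in the list above could look roughly like the following minimal sketch. It is not the code from this PR: the function name `run_queries`, the `TRIES` variable, and the `queries.sql` file name are assumptions made for illustration, and only the `mysql` invocation in the usage comment mirrors the one visible in the diff. Timing relies on GNU `date` supporting `%N`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run every query in a file TRIES times and print one
# JSON-like array of elapsed milliseconds per query (null on failure).
TRIES=${TRIES:-3}

run_queries() {
    local query_file=$1
    shift                      # remaining arguments form the client command
    local query start end i
    while IFS= read -r query || [ -n "$query" ]; do
        [ -z "$query" ] && continue          # skip empty lines
        printf '['
        for ((i = 1; i <= TRIES; i++)); do
            start=$(date +%s%N)              # nanoseconds (GNU date)
            if "$@" "$query" >/dev/null 2>&1 </dev/null; then
                end=$(date +%s%N)
                printf '%s' "$(( (end - start) / 1000000 ))"
            else
                printf 'null'
            fi
            if (( i < TRIES )); then printf ','; fi
        done
        printf ']\n'
    done < "$query_file"
}

# Assumed usage against a local Doris frontend (not executed here):
#   run_queries queries.sql mysql -h 127.0.0.1 -P9030 -uroot hits -e
```

Passing the client command as trailing arguments keeps the loop independent of the database, so it can be exercised with a stub command before pointing it at a real server.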

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| doris/benchmark.sh | Updates install URL/version and switches dataset load + query execution flow. |
| doris/run.sh | Deleted; prior query runner logic is now embedded in benchmark.sh. |
| doris/results/c7a.metal-48xl.json | Refreshes published benchmark results for c7a.metal-48xl. |

Comment on lines +99 to +105
PARALLEL_NUM=$(($(nproc) / 4))
echo "Setting parallel_pipeline_task_num to $PARALLEL_NUM (cpu cores: $(nproc) / 4)"

echo "start loading hits.parquet using TVF, estimated to take about 3 minutes ..."
START=$(date +%s)
curl --location-trusted \
-u root: \
-T "hits.tsv" \
-H "label:hits" \
-H "columns: WatchID,JavaEnable,Title,GoodEvent,EventTime,EventDate,CounterID,ClientIP,RegionID,UserID,CounterClass,OS,UserAgent,URL,Referer,IsRefresh,RefererCategoryID,RefererRegionID,URLCategoryID,URLRegionID,ResolutionWidth,ResolutionHeight,ResolutionDepth,FlashMajor,FlashMinor,FlashMinor2,NetMajor,NetMinor,UserAgentMajor,UserAgentMinor,CookieEnable,JavascriptEnable,IsMobile,MobilePhone,MobilePhoneModel,Params,IPNetworkID,TraficSourceID,SearchEngineID,SearchPhrase,AdvEngineID,IsArtifical,WindowClientWidth,WindowClientHeight,ClientTimeZone,ClientEventTime,SilverlightVersion1,SilverlightVersion2,SilverlightVersion3,SilverlightVersion4,PageCharset,CodeVersion,IsLink,IsDownload,IsNotBounce,FUniqID,OriginalURL,HID,IsOldCounter,IsEvent,IsParameter,DontCountHits,WithHash,HitColor,LocalEventTime,Age,Sex,Income,Interests,Robotness,RemoteIP,WindowName,OpenerName,HistoryLength,BrowserLanguage,BrowserCountry,SocialNetwork,SocialAction,HTTPError,SendTiming,DNSTiming,ConnectTiming,ResponseStartTiming,ResponseEndTiming,FetchTiming,SocialSourceNetworkID,SocialSourcePage,ParamPrice,ParamOrderID,ParamCurrency,ParamCurrencyID,OpenstatServiceName,OpenstatCampaignID,OpenstatAdID,OpenstatSourceID,UTMSource,UTMMedium,UTMCampaign,UTMContent,UTMTerm,FromTag,HasGCLID,RefererHash,URLHash,CLID" \
http://localhost:8030/api/hits/hits/_stream_load
mysql -h 127.0.0.1 -P9030 -uroot hits -e "SET parallel_pipeline_task_num = $PARALLEL_NUM;\
INSERT INTO hits SELECT
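
The snippet above derives `parallel_pipeline_task_num` as a quarter of the core count using shell integer division. A minimal sketch of that arithmetic follows; the helper name `compute_parallel_num` and the clamp to a minimum of 1 are assumptions added for illustration and are not part of the diff. Without such a clamp, the division yields 0 on machines with fewer than 4 cores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: cores / divisor using integer division,
# clamped so the result is never below 1 (the clamp is an assumption).
compute_parallel_num() {
    local cores=$1
    local divisor=${2:-4}
    local n=$(( cores / divisor ))
    if (( n < 1 )); then
        n=1
    fi
    echo "$n"
}

compute_parallel_num 192   # prints 48
compute_parallel_num 2     # prints 1 (plain division would give 0)

# On the benchmark host the input would come from nproc:
PARALLEL_NUM=$(compute_parallel_num "$(nproc)")
echo "parallel_pipeline_task_num would be set to $PARALLEL_NUM"
```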