[feat](nereids) count prefix index in CBO #34970

Merged
merged 6 commits into from
May 24, 2024

englefly
Contributor

@englefly englefly commented May 16, 2024

Proposed changes

  1. If a prefix index is chosen, lower the filter cost in the CBO.
  2. For join cost: if left.rowcount == right.rowcount, choose the wider child (by stats.getWidthInJoinCluster()) as the left child, since we prefer a left-deep tree.
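
The two changes above can be illustrated with a minimal Python sketch. It is not the Doris implementation (the optimizer is written in Java): the `ChildStats` class, the `filter_cost` and `order_join_children` functions, and the discount value are all hypothetical, and `width_in_join_cluster` is treated as an opaque integer standing in for `stats.getWidthInJoinCluster()`.

```python
from dataclasses import dataclass

# Hypothetical discount factor; the PR text does not state the real value.
PREFIX_INDEX_DISCOUNT = 0.5


@dataclass
class ChildStats:
    """Toy stand-in for a join child's statistics (not a Doris class)."""
    name: str
    row_count: float
    width_in_join_cluster: int  # stands in for stats.getWidthInJoinCluster()


def filter_cost(input_rows: float, uses_prefix_index: bool) -> float:
    """Charge less for a filter when the chosen index is a prefix index."""
    cost = input_rows
    if uses_prefix_index:
        cost *= PREFIX_INDEX_DISCOUNT
    return cost


def order_join_children(a: ChildStats, b: ChildStats):
    """Return (left, right).

    Only the tie-break is modelled: when row counts are equal, the wider
    child goes on the left. Otherwise the given order is kept.
    """
    if a.row_count == b.row_count and b.width_in_join_cluster > a.width_in_join_cluster:
        return b, a
    return a, b


if __name__ == "__main__":
    single = ChildStats("t1", 100, 1)
    joined = ChildStats("t2_join_t3", 100, 2)
    left, right = order_join_children(single, joined)
    print(left.name, right.name)
    print(filter_cost(1000, True), filter_cost(1000, False))
```

In this toy model the wider child ends up on the left whenever the row counts tie, which keeps the resulting plan closer to a left-deep shape.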

Issue Number: close #xxx


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run p0

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41151 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 48ecadd5e2ef35bc1b81aa56123dadaf8637848c, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17604	4339	4239	4239
q2	2021	196	190	190
q3	10480	1252	1204	1204
q4	10203	886	819	819
q5	7454	2686	2737	2686
q6	225	134	136	134
q7	955	604	612	604
q8	9234	2095	2100	2095
q9	9230	6629	6687	6629
q10	9185	3903	3904	3903
q11	453	239	250	239
q12	431	237	235	235
q13	17259	3263	3253	3253
q14	275	219	224	219
q15	521	473	474	473
q16	512	397	396	396
q17	987	692	599	599
q18	8380	7856	7880	7856
q19	4848	1564	1540	1540
q20	648	316	325	316
q21	5076	3245	4210	3245
q22	363	277	286	277
Total cold run time: 116344 ms
Total hot run time: 41151 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4569	4405	4459	4405
q2	370	275	297	275
q3	3127	2960	2884	2884
q4	1974	1686	1706	1686
q5	5340	5371	5478	5371
q6	217	123	126	123
q7	2223	1809	1854	1809
q8	3232	3350	3412	3350
q9	8611	8590	8671	8590
q10	4093	3894	3797	3797
q11	582	482	493	482
q12	776	586	582	582
q13	16362	3175	3151	3151
q14	296	277	278	277
q15	527	479	481	479
q16	493	437	460	437
q17	1826	1528	1535	1528
q18	8128	7493	7366	7366
q19	1659	1533	1533	1533
q20	1993	1802	1748	1748
q21	9976	4769	4744	4744
q22	559	494	501	494
Total cold run time: 76933 ms
Total hot run time: 55111 ms
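
The Round 1 totals in the TPC-H report on commit 48ecadd5 can be reproduced from its rows. The sketch below assumes the columns are query, cold run, first hot run, second hot run, and best hot run, all in ms; that reading is inferred from the numbers rather than stated by the bot. The `totals` helper is illustrative only.

```python
# Round 1 rows copied from the TPC-H report above.
# Assumed column order: query, cold, hot 1, hot 2, best hot (all in ms).
ROUND1_ROWS = """\
q1 17604 4339 4239 4239
q2 2021 196 190 190
q3 10480 1252 1204 1204
q4 10203 886 819 819
q5 7454 2686 2737 2686
q6 225 134 136 134
q7 955 604 612 604
q8 9234 2095 2100 2095
q9 9230 6629 6687 6629
q10 9185 3903 3904 3903
q11 453 239 250 239
q12 431 237 235 235
q13 17259 3263 3253 3253
q14 275 219 224 219
q15 521 473 474 473
q16 512 397 396 396
q17 987 692 599 599
q18 8380 7856 7880 7856
q19 4848 1564 1540 1540
q20 648 316 325 316
q21 5076 3245 4210 3245
q22 363 277 286 277
"""


def totals(report: str):
    """Return (total cold time, total of the faster hot run per query)."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        _query, cold, hot1, hot2, _best = line.split()
        cold_total += int(cold)
        hot_total += min(int(hot1), int(hot2))
    return cold_total, hot_total


print(totals(ROUND1_ROWS))  # (116344, 41151)
```

Both sums match the "Total cold run time" and "Total hot run time" lines of that round, which suggests the hot total is the sum of each query's faster hot run.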

@doris-robot

TPC-DS: Total hot run time: 171722 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 48ecadd5e2ef35bc1b81aa56123dadaf8637848c, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	912	387	366	366
query2	6454	2429	2524	2429
query3	6637	204	201	201
query4	18860	17303	17354	17303
query5	4094	425	429	425
query6	251	154	154	154
query7	4576	298	282	282
query8	237	198	182	182
query9	8597	2395	2357	2357
query10	476	280	293	280
query11	10444	10124	10040	10040
query12	133	90	89	89
query13	1640	368	357	357
query14	9933	7509	6940	6940
query15	271	169	172	169
query16	7992	263	252	252
query17	1739	539	506	506
query18	2093	265	268	265
query19	203	153	149	149
query20	93	84	83	83
query21	198	138	131	131
query22	4149	3846	3923	3846
query23	33692	33160	32924	32924
query24	8354	2754	2859	2754
query25	561	357	380	357
query26	690	152	153	152
query27	2170	319	316	316
query28	5593	2042	2063	2042
query29	851	601	592	592
query30	223	178	178	178
query31	986	765	732	732
query32	93	51	53	51
query33	670	263	258	258
query34	863	473	467	467
query35	702	629	591	591
query36	1038	880	934	880
query37	107	68	70	68
query38	2897	2820	2776	2776
query39	828	776	794	776
query40	199	121	120	120
query41	49	43	43	43
query42	104	97	98	97
query43	573	551	542	542
query44	1049	725	741	725
query45	183	162	164	162
query46	1072	700	732	700
query47	1857	1765	1766	1765
query48	374	291	289	289
query49	866	380	392	380
query50	763	386	389	386
query51	6924	6706	6797	6706
query52	103	90	90	90
query53	349	281	298	281
query54	837	422	414	414
query55	72	71	71	71
query56	263	238	250	238
query57	1095	1052	1029	1029
query58	229	204	198	198
query59	3476	3217	3110	3110
query60	280	252	249	249
query61	111	89	87	87
query62	604	441	452	441
query63	303	274	282	274
query64	8480	2222	1730	1730
query65	3238	3105	3110	3105
query66	786	325	321	321
query67	15299	14856	15194	14856
query68	4498	529	537	529
query69	442	260	312	260
query70	1196	1151	1098	1098
query71	379	264	266	264
query72	7251	5773	5223	5223
query73	742	329	311	311
query74	6013	5694	5701	5694
query75	3274	2663	2603	2603
query76	2317	1000	1018	1000
query77	385	264	261	261
query78	10219	9893	9862	9862
query79	2344	509	506	506
query80	986	437	431	431
query81	539	250	240	240
query82	1293	96	92	92
query83	233	173	174	173
query84	242	89	87	87
query85	1439	352	254	254
query86	457	300	303	300
query87	3286	3178	3132	3132
query88	4169	2320	2325	2320
query89	496	383	371	371
query90	2059	186	183	183
query91	132	94	98	94
query92	59	48	47	47
query93	2332	506	491	491
query94	1281	186	183	183
query95	395	307	300	300
query96	597	269	259	259
query97	3177	2956	2986	2956
query98	248	226	219	219
query99	1108	859	850	850
Total cold run time: 263426 ms
Total hot run time: 171722 ms

@doris-robot

ClickBench: Total hot run time: 30.92 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 48ecadd5e2ef35bc1b81aa56123dadaf8637848c, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.09	0.04	0.04
query3	0.23	0.05	0.05
query4	1.66	0.07	0.07
query5	0.49	0.48	0.51
query6	1.12	0.72	0.73
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.54	0.49	0.48
query10	0.55	0.56	0.54
query11	0.15	0.11	0.12
query12	0.14	0.12	0.12
query13	0.59	0.59	0.59
query14	0.78	0.78	0.78
query15	0.82	0.81	0.82
query16	0.38	0.36	0.38
query17	1.03	1.01	1.01
query18	0.23	0.24	0.23
query19	1.77	1.72	1.68
query20	0.01	0.01	0.01
query21	15.73	0.66	0.64
query22	4.01	6.53	2.26
query23	18.28	1.44	1.35
query24	1.94	0.25	0.20
query25	0.15	0.09	0.09
query26	0.27	0.16	0.17
query27	0.08	0.08	0.08
query28	13.32	1.02	0.99
query29	13.24	3.37	3.27
query30	0.24	0.05	0.05
query31	2.85	0.41	0.39
query32	3.25	0.46	0.46
query33	2.87	2.89	2.93
query34	17.20	4.39	4.43
query35	4.51	4.49	4.51
query36	0.70	0.47	0.46
query37	0.19	0.15	0.14
query38	0.16	0.14	0.14
query39	0.05	0.04	0.04
query40	0.16	0.14	0.14
query41	0.09	0.05	0.05
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.08 s
Total hot run time: 30.92 s

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run performance

@doris-robot

TPC-H: Total hot run time: 41398 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 70233d22a874fdff9c14621cb0ca6d7a99a9bcf0, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17977	6861	4250	4250
q2	2033	186	190	186
q3	10607	1136	1219	1136
q4	10390	842	794	794
q5	7470	2730	2716	2716
q6	220	132	134	132
q7	965	611	605	605
q8	9212	2154	2094	2094
q9	9496	6722	6704	6704
q10	9610	3887	3906	3887
q11	480	237	240	237
q12	462	226	220	220
q13	17352	3183	3276	3183
q14	258	220	220	220
q15	509	475	475	475
q16	501	385	393	385
q17	986	653	661	653
q18	8432	7913	7917	7913
q19	4691	1568	1580	1568
q20	662	326	326	326
q21	5191	3435	3882	3435
q22	349	285	279	279
Total cold run time: 117853 ms
Total hot run time: 41398 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4582	4381	4409	4381
q2	369	270	274	270
q3	3193	2909	3002	2909
q4	1887	1589	1583	1583
q5	5438	5495	5484	5484
q6	220	122	122	122
q7	2189	1846	1828	1828
q8	3300	3420	3355	3355
q9	8717	8719	8687	8687
q10	3895	3731	3826	3731
q11	591	504	492	492
q12	805	619	617	617
q13	16292	3149	3227	3149
q14	294	272	266	266
q15	526	484	475	475
q16	494	428	420	420
q17	1758	1508	1492	1492
q18	7686	7606	7511	7511
q19	1658	1513	1585	1513
q20	2029	1805	1802	1802
q21	4883	4781	4790	4781
q22	579	485	487	485
Total cold run time: 71385 ms
Total hot run time: 55353 ms

@doris-robot

TPC-DS: Total hot run time: 172873 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 70233d22a874fdff9c14621cb0ca6d7a99a9bcf0, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	914	386	363	363
query2	6451	2428	2414	2414
query3	6651	208	210	208
query4	19410	17370	17247	17247
query5	4188	421	421	421
query6	240	156	147	147
query7	4591	298	289	289
query8	235	192	183	183
query9	8466	2371	2357	2357
query10	456	278	286	278
query11	10660	10109	10084	10084
query12	143	97	85	85
query13	1638	357	354	354
query14	10088	7556	7705	7556
query15	218	171	175	171
query16	7701	260	258	258
query17	1710	539	507	507
query18	1937	271	272	271
query19	196	151	155	151
query20	92	84	85	84
query21	194	126	126	126
query22	4261	3921	3982	3921
query23	33680	32999	33340	32999
query24	6725	2893	2797	2797
query25	459	354	382	354
query26	689	159	159	159
query27	1908	323	318	318
query28	3866	2052	2027	2027
query29	834	623	602	602
query30	241	180	173	173
query31	936	761	762	761
query32	60	51	56	51
query33	503	272	264	264
query34	843	489	489	489
query35	699	625	618	618
query36	1012	930	947	930
query37	107	73	71	71
query38	2893	2783	2790	2783
query39	837	811	796	796
query40	201	133	140	133
query41	47	48	53	48
query42	108	96	97	96
query43	603	556	550	550
query44	1067	728	745	728
query45	184	169	162	162
query46	1060	729	719	719
query47	1847	1762	1781	1762
query48	365	296	284	284
query49	761	372	387	372
query50	781	387	383	383
query51	6909	6700	6799	6700
query52	106	96	93	93
query53	350	290	292	290
query54	526	435	419	419
query55	77	74	75	74
query56	283	238	245	238
query57	1139	1042	1048	1042
query58	242	214	215	214
query59	3420	3428	3219	3219
query60	263	251	249	249
query61	90	89	88	88
query62	558	455	446	446
query63	311	289	293	289
query64	2521	1788	1722	1722
query65	3195	3106	3134	3106
query66	799	332	326	326
query67	15244	14754	14779	14754
query68	4588	533	539	533
query69	439	271	274	271
query70	1142	1116	1150	1116
query71	350	263	268	263
query72	7348	5428	5372	5372
query73	719	318	320	318
query74	6046	5653	5643	5643
query75	3249	2658	2614	2614
query76	2259	930	1064	930
query77	424	266	268	266
query78	10229	9790	9717	9717
query79	2479	514	514	514
query80	1114	441	431	431
query81	520	248	239	239
query82	792	96	93	93
query83	246	167	190	167
query84	247	85	89	85
query85	1060	282	276	276
query86	450	330	309	309
query87	3295	3108	3096	3096
query88	4134	2332	2372	2332
query89	487	397	384	384
query90	2086	189	185	185
query91	140	99	110	99
query92	58	48	59	48
query93	2035	518	495	495
query94	1187	197	195	195
query95	415	326	321	321
query96	597	280	268	268
query97	3185	3063	3063	3063
query98	248	220	222	220
query99	1178	837	850	837
Total cold run time: 252535 ms
Total hot run time: 172873 ms

@doris-robot

ClickBench: Total hot run time: 30.03 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 70233d22a874fdff9c14621cb0ca6d7a99a9bcf0, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.06
query4	1.67	0.08	0.09
query5	0.52	0.49	0.53
query6	1.12	0.73	0.72
query7	0.02	0.01	0.02
query8	0.05	0.05	0.05
query9	0.53	0.48	0.50
query10	0.54	0.55	0.54
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.61	0.60	0.61
query14	0.77	0.77	0.76
query15	0.83	0.80	0.81
query16	0.35	0.37	0.36
query17	1.02	1.01	1.01
query18	0.22	0.24	0.24
query19	1.87	1.74	1.73
query20	0.02	0.01	0.00
query21	15.43	0.72	0.69
query22	4.56	7.84	1.26
query23	18.31	1.43	1.36
query24	1.70	0.21	0.26
query25	0.15	0.08	0.08
query26	0.26	0.16	0.17
query27	0.08	0.07	0.07
query28	13.30	1.02	0.99
query29	13.30	3.27	3.27
query30	0.27	0.08	0.08
query31	2.84	0.39	0.37
query32	3.31	0.47	0.47
query33	2.87	2.86	2.90
query34	17.10	4.43	4.42
query35	4.50	4.47	4.55
query36	0.66	0.46	0.46
query37	0.17	0.16	0.15
query38	0.15	0.14	0.14
query39	0.05	0.04	0.04
query40	0.15	0.14	0.14
query41	0.09	0.04	0.04
query42	0.05	0.05	0.04
query43	0.04	0.04	0.04
Total cold run time: 110.14 s
Total hot run time: 30.03 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 24, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@englefly englefly merged commit c53d3ed into apache:master May 24, 2024
27 of 29 checks passed
@englefly englefly deleted the mv-stats branch May 24, 2024 06:45
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 24, 2024
1. If a prefix index is chosen, lower the filter cost in the CBO. An MV and an OlapTable can have different prefix indexes, so this helps us choose a plan that leverages the prefix index.

2. For join cost: if left.rowcount == right.rowcount, choose the wider child (by stats.getWidthInJoinCluster()) as the left child, since we prefer a left-deep tree.
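
As a rough illustration of why an MV and its base table can differ here, the Python sketch below counts how many leading key columns a filter constrains and picks the candidate with the longest match. The table and column names, `matched_prefix_length`, and `pick_index` are all hypothetical; the real Doris logic also depends on details this sketch ignores, such as predicate types and prefix length limits.

```python
def matched_prefix_length(key_columns, filtered_columns):
    """Count the leading key columns that the filter constrains."""
    matched = 0
    for col in key_columns:
        if col not in filtered_columns:
            break  # a prefix match stops at the first unconstrained key column
        matched += 1
    return matched


def pick_index(candidates, filtered_columns):
    """candidates maps an index name to its ordered key columns.

    Returns the name with the longest matched prefix (first one wins on ties).
    """
    return max(
        candidates,
        key=lambda name: matched_prefix_length(candidates[name], filtered_columns),
    )


if __name__ == "__main__":
    # Hypothetical layouts: the MV orders its keys differently from the base table.
    candidates = {
        "base_table": ["user_id", "event_date"],
        "mv_by_date": ["event_date", "user_id"],
    }
    print(pick_index(candidates, {"event_date"}))  # mv_by_date
```

With a filter on `event_date` only, the base table matches no leading key column while the MV matches one, so a cost model that rewards prefix index usage would favor the MV plan.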
dataroaring pushed a commit that referenced this pull request May 26, 2024
seawinde pushed a commit to seawinde/doris that referenced this pull request May 27, 2024
Labels
approved Indicates a PR has been approved by one committer. reviewed
4 participants