[optimize](desc) display the correct data type of aggStateType #34968

Merged
merged 3 commits into from
May 21, 2024

@DarvenDuan DarvenDuan commented May 16, 2024

If a table column has the AGG_STATE type, the `desc tbl` statement does not show its fully defined data type; it prints a Java object reference instead.

create table a_table(
    k1 int null,
    k2 agg_state<max_by(int not null,int)> generic,
    k3 agg_state<group_concat(string)> generic
)
aggregate key (k1)
distributed BY hash(k1) buckets 3
properties("replication_num" = "1");

Before the optimization:

mysql> desc a_table;
+-------+------------------------------------------------+------+-------+---------+---------+
| Field | Type                                           | Null | Key   | Default | Extra   |
+-------+------------------------------------------------+------+-------+---------+---------+
| k1    | INT                                            | Yes  | true  | NULL    |         |
| k2    | org.apache.doris.catalog.AggStateType@239f771c | No   | false | NULL    | GENERIC |
| k3    | org.apache.doris.catalog.AggStateType@2e535f50 | No   | false | NULL    | GENERIC |
+-------+------------------------------------------------+------+-------+---------+---------+
3 rows in set (0.00 sec)

After the optimization:

mysql> desc a_table;
+-------+------------------------------------+------+-------+---------+---------+
| Field | Type                               | Null | Key   | Default | Extra   |
+-------+------------------------------------+------+-------+---------+---------+
| k1    | INT                                | Yes  | true  | NULL    |         |
| k2    | AGG_STATE<max_by(INT, INT NULL)>   | No   | false | NULL    | GENERIC |
| k3    | AGG_STATE<group_concat(TEXT NULL)> | No   | false | NULL    | GENERIC |
+-------+------------------------------------+------+-------+---------+---------+
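
The `org.apache.doris.catalog.AggStateType@239f771c` strings in the "before" output have the shape that Java's default `Object.toString()` produces (class name, `@`, hexadecimal hash code). The sketch below illustrates that general behavior and one way a readable definition could be rendered. It is not the code changed in this PR: the classes `ArgType`, `OpaqueAggState`, and `ReadableAggState` and the `toSql()` helper are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AggStateDescSketch {

    /** Hypothetical argument type: a SQL type name plus a nullability flag. */
    static final class ArgType {
        final String name;
        final boolean nullable;

        ArgType(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }

        /** Renders e.g. "INT" or "INT NULL". */
        String toSql() {
            return nullable ? name + " NULL" : name;
        }
    }

    /** No toString() override, so Object.toString() yields "ClassName@hexHash". */
    static final class OpaqueAggState {
        final String functionName;

        OpaqueAggState(String functionName) {
            this.functionName = functionName;
        }
    }

    /** Overrides toString() so string conversion yields a readable definition. */
    static final class ReadableAggState {
        final String functionName;
        final List<ArgType> args;

        ReadableAggState(String functionName, List<ArgType> args) {
            this.functionName = functionName;
            this.args = args;
        }

        /** Builds "AGG_STATE<function(arg1, arg2, ...)>" from the nested types. */
        String toSql() {
            String renderedArgs = args.stream()
                    .map(ArgType::toSql)
                    .collect(Collectors.joining(", "));
            return "AGG_STATE<" + functionName + "(" + renderedArgs + ")>";
        }

        @Override
        public String toString() {
            return toSql();
        }
    }

    public static void main(String[] unused) {
        // Prints the class name followed by '@' and a hash that varies per run.
        System.out.println(new OpaqueAggState("max_by"));

        ReadableAggState k2 = new ReadableAggState("max_by",
                Arrays.asList(new ArgType("INT", false), new ArgType("INT", true)));
        ReadableAggState k3 = new ReadableAggState("group_concat",
                Arrays.asList(new ArgType("TEXT", true)));

        System.out.println(k2); // AGG_STATE<max_by(INT, INT NULL)>
        System.out.println(k3); // AGG_STATE<group_concat(TEXT NULL)>
    }
}
```

The readable variant assembles the string from the function name and its argument types, which matches the format of the "after" output for `k2` and `k3`.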

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@DarvenDuan
Contributor Author

run buildall

Contributor

@caiconghui caiconghui left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 21, 2024
Contributor

PR approved by anyone and no changes requested.

Contributor

@GoGoWen GoGoWen left a comment

LGTM

@doris-robot

TPC-H: Total hot run time: 41955 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cc79ae2db721643c4ac0c283cb3f54e54d09e368, data reload: false

------ Round 1 ----------------------------------
q1	17599	4487	4209	4209
q2	2023	195	202	195
q3	10431	1284	1174	1174
q4	10211	894	861	861
q5	7495	2752	2738	2738
q6	219	137	133	133
q7	1021	585	588	585
q8	9415	2174	2127	2127
q9	9236	6737	6821	6737
q10	9218	3900	3864	3864
q11	448	264	238	238
q12	404	227	242	227
q13	17477	3154	3261	3154
q14	259	216	210	210
q15	504	469	477	469
q16	529	408	404	404
q17	983	728	734	728
q18	8498	8024	7716	7716
q19	3137	1585	1499	1499
q20	669	316	318	316
q21	5233	4091	4110	4091
q22	351	298	280	280
Total cold run time: 115360 ms
Total hot run time: 41955 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4546	4387	4407	4387
q2	398	274	279	274
q3	3317	2906	2754	2754
q4	1863	1611	1593	1593
q5	5519	5479	5527	5479
q6	212	121	123	121
q7	2345	1994	1990	1990
q8	3249	3414	3380	3380
q9	8678	8737	8616	8616
q10	3899	3887	3840	3840
q11	599	511	502	502
q12	805	637	629	629
q13	17453	3155	3090	3090
q14	284	260	269	260
q15	508	466	469	466
q16	484	405	417	405
q17	1757	1486	1471	1471
q18	7716	7569	7439	7439
q19	2531	1534	1525	1525
q20	1999	1779	1788	1779
q21	5465	5033	4885	4885
q22	571	490	479	479
Total cold run time: 74198 ms
Total hot run time: 55364 ms

@doris-robot

TPC-DS: Total hot run time: 180846 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cc79ae2db721643c4ac0c283cb3f54e54d09e368, data reload: false

query1	906	381	366	366
query2	7121	2539	2296	2296
query3	6652	216	222	216
query4	23923	21247	21174	21174
query5	4157	416	419	416
query6	271	183	186	183
query7	4577	298	280	280
query8	250	194	182	182
query9	8758	2413	2380	2380
query10	443	264	249	249
query11	14990	14174	14074	14074
query12	133	97	88	88
query13	1650	383	380	380
query14	10181	8117	7661	7661
query15	229	166	172	166
query16	7910	250	248	248
query17	1756	586	547	547
query18	2043	283	264	264
query19	198	145	148	145
query20	91	85	84	84
query21	190	125	124	124
query22	5030	4835	4798	4798
query23	34008	33666	33437	33437
query24	6760	2846	2934	2846
query25	535	371	358	358
query26	690	158	152	152
query27	1994	317	327	317
query28	3741	2028	2027	2027
query29	848	615	596	596
query30	225	179	175	175
query31	956	740	742	740
query32	92	57	53	53
query33	479	246	242	242
query34	880	479	481	479
query35	758	677	668	668
query36	1068	913	874	874
query37	104	69	72	69
query38	2873	2741	2702	2702
query39	1647	1544	1564	1544
query40	193	130	119	119
query41	53	44	43	43
query42	106	96	100	96
query43	601	558	566	558
query44	1088	715	735	715
query45	277	254	256	254
query46	1062	754	728	728
query47	1950	1872	1892	1872
query48	362	290	287	287
query49	762	387	393	387
query50	779	376	384	376
query51	6885	6691	6661	6661
query52	107	88	90	88
query53	345	290	285	285
query54	518	419	424	419
query55	73	71	74	71
query56	237	216	219	216
query57	1245	1132	1157	1132
query58	226	204	196	196
query59	3402	3072	3244	3072
query60	293	241	260	241
query61	88	91	86	86
query62	590	495	479	479
query63	315	282	283	282
query64	8417	2204	1776	1776
query65	3189	3095	3109	3095
query66	791	342	339	339
query67	15576	15127	14924	14924
query68	4490	521	529	521
query69	469	297	296	296
query70	1175	1150	1100	1100
query71	383	263	264	263
query72	7085	2632	2344	2344
query73	717	310	314	310
query74	6654	6140	6266	6140
query75	3288	2647	2580	2580
query76	2126	993	957	957
query77	406	266	261	261
query78	10527	10188	10037	10037
query79	2238	517	514	514
query80	1026	451	452	451
query81	521	245	243	243
query82	960	105	97	97
query83	256	165	171	165
query84	253	90	91	90
query85	1073	321	375	321
query86	455	316	280	280
query87	3295	3139	3130	3130
query88	3351	2312	2316	2312
query89	475	372	399	372
query90	2031	184	186	184
query91	120	97	98	97
query92	61	50	48	48
query93	2021	497	481	481
query94	1158	182	182	182
query95	395	289	306	289
query96	581	269	259	259
query97	3166	2994	3011	2994
query98	241	227	230	227
query99	1208	904	910	904
Total cold run time: 270852 ms
Total hot run time: 180846 ms

@doris-robot

ClickBench: Total hot run time: 30.26 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cc79ae2db721643c4ac0c283cb3f54e54d09e368, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.24	0.05	0.04
query4	1.68	0.07	0.08
query5	0.51	0.49	0.49
query6	1.12	0.73	0.73
query7	0.02	0.02	0.01
query8	0.04	0.04	0.04
query9	0.55	0.49	0.49
query10	0.55	0.56	0.54
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.59	0.58	0.60
query14	0.76	0.76	0.77
query15	0.83	0.81	0.81
query16	0.37	0.36	0.35
query17	0.97	0.95	0.93
query18	0.22	0.23	0.24
query19	1.76	1.68	1.76
query20	0.01	0.01	0.01
query21	15.74	0.66	0.66
query22	4.21	8.00	1.70
query23	18.32	1.40	1.33
query24	1.64	0.29	0.21
query25	0.16	0.08	0.08
query26	0.27	0.18	0.17
query27	0.08	0.07	0.07
query28	13.45	1.02	1.04
query29	13.55	3.27	3.24
query30	0.24	0.06	0.06
query31	2.87	0.41	0.39
query32	3.26	0.48	0.48
query33	2.94	2.89	2.90
query34	16.94	4.39	4.42
query35	4.49	4.65	4.47
query36	0.65	0.46	0.45
query37	0.17	0.16	0.16
query38	0.15	0.15	0.15
query39	0.05	0.03	0.04
query40	0.16	0.13	0.15
query41	0.10	0.05	0.04
query42	0.05	0.05	0.05
query43	0.04	0.03	0.03
Total cold run time: 110.18 s
Total hot run time: 30.26 s

@DarvenDuan
Contributor Author

run external

@morrySnow
Contributor

Could you replace the screenshot with a code block? A screenshot is hard to put into a git commit message and is not friendly to search engines.

@DarvenDuan
Contributor Author

Could you replace the screenshot with a code block? A screenshot is hard to put into a git commit message and is not friendly to search engines.

OK, I have updated my comment.

@morrySnow morrySnow merged commit 2ec122c into apache:master May 21, 2024
28 of 31 checks passed
yiguolei pushed a commit that referenced this pull request May 22, 2024
Co-authored-by: duanxujian <duanxujian@jd.com>
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 24, 2024
…e#34968)

dataroaring pushed a commit that referenced this pull request May 26, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.x reviewed

5 participants