[Enhancement] improve cloud native pk table memory cost when handle large ingestion #45685

Merged
merged 1 commit into from
May 24, 2024

@luohaha luohaha commented May 15, 2024

Why I'm doing:

In the current implementation, if a single ingestion into a tablet produces multiple segment files, StarRocks loads the state of all segment files at once. For a large ingestion this consumes a lot of memory and can lead to OOM. We need to avoid OOM when handling large ingestions.

What I'm doing:

  1. Load segment files one by one, so that the maximum memory used by one tablet during publish version stays below 100MB, because the maximum memtable size during loading is 100MB (be.conf write_buffer_size).
  2. Also allow StarRocks to preload more segments when update memory is not limited.
  3. Refactor the code.
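
To illustrate points 1 and 2, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: `SegmentFile`, `MemoryTracker`, `peak_when_loading_all`, `peak_when_loading_incrementally`, and the `preload_budget_bytes` parameter are all hypothetical names made up for the example, and "memory" is only simulated by counting bytes. It compares the peak bytes held when every segment's state is loaded up front against loading, applying, and releasing segments one at a time, with optional preloading while a budget allows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one segment file; not a StarRocks type.
struct SegmentFile {
    std::size_t state_bytes;  // bytes needed to hold this segment's update state
};

// Hypothetical tracker that records the bytes currently held and the peak.
class MemoryTracker {
public:
    void consume(std::size_t bytes) {
        current_ += bytes;
        peak_ = std::max(peak_, current_);
    }
    void release(std::size_t bytes) {
        assert(bytes <= current_);
        current_ -= bytes;
    }
    std::size_t current() const { return current_; }
    std::size_t peak() const { return peak_; }

private:
    std::size_t current_ = 0;
    std::size_t peak_ = 0;
};

// Old behavior as described above: all segment states are resident at once.
std::size_t peak_when_loading_all(const std::vector<SegmentFile>& segments) {
    MemoryTracker tracker;
    for (const auto& seg : segments) tracker.consume(seg.state_bytes);
    // ... every segment would be applied here ...
    for (const auto& seg : segments) tracker.release(seg.state_bytes);
    return tracker.peak();
}

// New behavior, sketched: load one segment, apply it, release it.
// While the held bytes stay within preload_budget_bytes (an assumed knob),
// following segments are preloaded ahead of time.
std::size_t peak_when_loading_incrementally(const std::vector<SegmentFile>& segments,
                                            std::size_t preload_budget_bytes) {
    MemoryTracker tracker;
    std::size_t next_to_load = 0;
    for (std::size_t i = 0; i < segments.size(); ++i) {
        // Make sure the segment we are about to apply is resident.
        if (next_to_load <= i) {
            tracker.consume(segments[i].state_bytes);
            next_to_load = i + 1;
        }
        // Preload later segments only while the budget is not exceeded.
        while (next_to_load < segments.size() &&
               tracker.current() + segments[next_to_load].state_bytes <= preload_budget_bytes) {
            tracker.consume(segments[next_to_load].state_bytes);
            ++next_to_load;
        }
        // ... segment i would be applied here ...
        tracker.release(segments[i].state_bytes);
    }
    return tracker.peak();
}
```

With five segments of equal size, the first function peaks at five times the segment size, while the second with a budget of 0 peaks at a single segment's size. A larger budget trades memory for the chance to overlap loading with applying.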

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@luohaha luohaha requested review from a team as code owners May 15, 2024 12:27
@luohaha luohaha force-pushed the lake-pk-large-ingest branch 4 times, most recently from 0ee75d1 to 5128ed3 Compare May 16, 2024 13:06
@wyb wyb requested review from decster, TszKitLo40 and sevev May 17, 2024 03:20
@luohaha luohaha requested review from srlch and wyb May 21, 2024 02:17
…arge ingestion

Signed-off-by: luohaha <18810541851@163.com>

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

[BE Incremental Coverage Report]

pass : 315 / 337 (93.47%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 be/src/storage/lake/rowset_update_state.cpp | 209 | 227 | 92.07% | [84, 212, 269, 271, 292, 304, 308, 309, 310, 311, 313, 314, 347, 348, 353, 354, 701, 720] |
| 🔵 be/src/storage/lake/update_manager.cpp | 91 | 95 | 95.79% | [414, 458, 525, 532] |
| 🔵 be/src/storage/lake/rowset_update_state.h | 12 | 12 | 100.00% | [] |
| 🔵 be/src/storage/lake/update_manager.h | 2 | 2 | 100.00% | [] |
| 🔵 be/src/storage/lake/delta_writer.cpp | 1 | 1 | 100.00% | [] |
🔵 be/src/storage/lake/delta_writer.cpp 1 1 100.00% []

@wyb wyb merged commit a092277 into StarRocks:main May 24, 2024
50 checks passed

@Mergifyio backport branch-3.3

@github-actions github-actions bot removed the 3.3 label May 24, 2024

@Mergifyio backport branch-3.2

@github-actions github-actions bot removed the 3.2 label May 24, 2024

mergify bot commented May 24, 2024

backport branch-3.3

✅ Backports have been created

mergify bot commented May 24, 2024

backport branch-3.2

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 24, 2024
…arge ingestion (#45685)

Signed-off-by: luohaha <18810541851@163.com>
(cherry picked from commit a092277)
mergify bot pushed a commit that referenced this pull request May 24, 2024
…arge ingestion (#45685)

Signed-off-by: luohaha <18810541851@163.com>
(cherry picked from commit a092277)

# Conflicts:
#	be/src/storage/lake/rowset_update_state.cpp
#	be/src/storage/lake/rowset_update_state.h
#	be/src/storage/lake/update_manager.cpp
#	be/src/storage/rowset/segment_rewriter.cpp
#	be/src/storage/rowset/segment_rewriter.h
#	be/test/storage/lake/partial_update_test.cpp
wanpengfei-git pushed a commit that referenced this pull request May 24, 2024
…arge ingestion (backport #45685) (#46216)

Co-authored-by: Yixin Luo <18810541851@163.com>

luohaha commented May 28, 2024

ignore backport check: 3.2.8

@github-actions github-actions bot added the 3.3 label May 28, 2024
luohaha added a commit to luohaha/starrocks that referenced this pull request Jun 3, 2024
…arge ingestion (StarRocks#45685)

Signed-off-by: luohaha <18810541851@163.com>
wyb pushed a commit that referenced this pull request Jun 3, 2024
…arge ingestion (backport #45685) (#46548)

Signed-off-by: luohaha <18810541851@163.com>