Slack ingest does not support pagination (thus limited to 100 threads) and breaks date handling. #374

Open
gururise opened this issue Feb 22, 2024 · 1 comment

@gururise

Describe the bug
Slack ingestion is limited to 100 threads because it does not support pagination. Additionally, the start/end dates are effectively ignored for the same reason.

To Reproduce
Try to ingest one or more Slack channels with more than 100 threads, as follows:
unstructured-ingest slack --channels C040Z2873DH,C05JBERNW0Y --token <api_token> --download-dir slack-ingest-download --output-dir slack-ingest-output --start-date 2022-01-01 --end-date 2024-02-28

Or try to ingest fewer than 100 threads, but from a time period before the most recent 100 threads:
unstructured-ingest slack --channels C040Z2873DH,C05JBERNW0Y --token <api_token> --download-dir slack-ingest-download --output-dir slack-ingest-output --start-date 2022-01-01 --end-date 2022-02-01

Environment:
Local

Additional context
There are two problems associated with this one bug:

  1. Cannot ingest more than 100 threads (due to lack of pagination support).
  2. Cannot ingest any threads older than the most recent 100 threads (also due to lack of pagination support).
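For illustration, here is a minimal sketch of cursor-based pagination with date bounds. It is not the connector's actual code. It assumes the underlying call behaves like Slack's `conversations.history` method, which accepts `channel`, `limit`, `cursor`, `oldest`, and `latest`, and returns `messages` plus `response_metadata.next_cursor`. The names `iter_channel_messages` and `fetch_page` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator, Optional


def iter_channel_messages(
    fetch_page: Callable[..., dict],  # hypothetical: any callable returning one page
    channel: str,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every message in the channel, following cursors until exhausted."""
    params = {"channel": channel, "limit": page_size}
    # Slack expects Unix timestamps; pass timezone-aware datetimes,
    # because naive ones are interpreted in local time by .timestamp().
    if start is not None:
        params["oldest"] = str(start.timestamp())
    if end is not None:
        params["latest"] = str(end.timestamp())

    cursor = None
    while True:
        if cursor:
            params["cursor"] = cursor
        response = fetch_page(**params)
        yield from response.get("messages", [])
        # An empty or missing next_cursor means there are no more pages.
        cursor = (response.get("response_metadata") or {}).get("next_cursor")
        if not cursor:
            break
```

With `slack_sdk`, something like `iter_channel_messages(WebClient(token=...).conversations_history, "C040Z2873DH", start=datetime(2022, 1, 1, tzinfo=timezone.utc))` should work, assuming the response object supports dict-style `.get`. Because the date bounds are sent with every page request, an older window no longer depends on what fits in the first page.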

Ideally, you could also support a pause or delay option that inserts a small delay after every "n" API calls, to avoid hitting Slack's API rate limits.
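A rough sketch of what such a delay could look like, assuming a simple fixed pause after every n calls. The name `throttled` and its parameters are hypothetical, and a real implementation would probably also want to honor any retry hints the API returns.

```python
import time
from typing import Callable


def throttled(
    func: Callable,
    every_n_calls: int = 20,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing
) -> Callable:
    """Wrap func so that it pauses after every `every_n_calls` invocations."""
    count = 0

    def wrapper(*args, **kwargs):
        nonlocal count
        result = func(*args, **kwargs)
        count += 1
        # Pause once each time the call counter reaches a multiple of n.
        if count % every_n_calls == 0:
            sleep(delay_seconds)
        return result

    return wrapper
```

Wrapping whichever callable issues the Slack request (for example `throttled(client.conversations_history, every_n_calls=20)`) would keep the delay logic separate from the ingest code itself.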

@gururise changed the title from "Slack ingest does not support pagination (thus limited to 100 threads)" to "Slack ingest does not support pagination (thus limited to 100 threads) and breaks date handling." on Feb 22, 2024
@awalker4
Collaborator

Thanks for flagging this! For awareness, issues with the ingest connectors can be raised against the core library repo rather than the API repo here. No worries, though: I'll pass this on to the team and make sure we're tracking it.
