Refine handling of underperforming_group issue type #1099

Merged
merged 5 commits into from
May 24, 2024

Conversation

gogetron
Contributor

@gogetron gogetron commented Apr 14, 2024

Summary

Closes #1065

🎯 Purpose: Refine handling of underperforming_group issue type for classification tasks to ensure it is correctly included only when appropriate inputs are provided.

UnderperformingGroupIssueManager always requires pred_probs. However, when features are provided without pred_probs, _CLASSIFICATION_ARGS_DICT still lists underperforming_group as an available issue type. This PR adds a check that removes it from the list in that case, as is already done for label, class_imbalance, and outlier. A new assertion was added to the existing test suite. The error reproduces on master, and this PR fixes it.

The underperforming_group issue type should only be available when pred_probs and another required input (features, knn_graph, or cluster_ids) are provided. This enhancement ensures that the _CLASSIFICATION_ARGS_DICT accurately reflects this requirement. An additional check was added to exclude underperforming_group from the list of available issue types if these conditions are not met. This refinement aligns with how other issue types such as label, class_imbalance, and outlier are handled. A new assertion was added to the existing test suite to validate this behavior.
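The availability rule described above can be sketched roughly as follows. This is a minimal illustration, not cleanlab's actual implementation: the helper names `underperforming_group_is_available` and `drop_unavailable_issue_types` are hypothetical, and plain lists stand in for real arrays.

```python
from typing import Any, Dict, Optional


def underperforming_group_is_available(
    pred_probs: Optional[Any] = None,
    features: Optional[Any] = None,
    knn_graph: Optional[Any] = None,
    cluster_ids: Optional[Any] = None,
) -> bool:
    """Hypothetical check: pred_probs plus at least one grouping input."""
    grouping_inputs = (features, knn_graph, cluster_ids)
    has_grouping_input = any(value is not None for value in grouping_inputs)
    return pred_probs is not None and has_grouping_input


def drop_unavailable_issue_types(
    issue_types: Dict[str, dict], **inputs: Any
) -> Dict[str, dict]:
    """Return a copy of issue_types, omitting underperforming_group
    when the provided inputs are insufficient (hypothetical helper)."""
    available = dict(issue_types)  # leave the caller's dict untouched
    if "underperforming_group" in available and not underperforming_group_is_available(**inputs):
        available.pop("underperforming_group")
    return available


# Only features are given, so underperforming_group is filtered out.
candidates = {"outlier": {}, "underperforming_group": {}}
print(sorted(drop_unavailable_issue_types(candidates, features=[[0.1, 0.2]])))
# ['outlier']
```

Copying the dictionary before removing the key keeps the default set of issue types intact for later calls with different inputs.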

Testing

🔍 Testing Done: Included a new assertion to verify the correct behavior of underperforming_group issue type availability based on input conditions.

Links to Relevant Issues or Conversations

🔗 What Git or Slack items (Issues, threads, etc) that are specifically related to
this work? Please link them here.

Reviewer Notes

  • The new check for underperforming_group issue type and its integration with the _CLASSIFICATION_ARGS_DICT.
  • The comprehensiveness and coverage of the updated test cases.


codecov bot commented Apr 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.13%. Comparing base (0221f8b) to head (790e0c7).
Report is 21 commits behind head on master.

Current head 790e0c7 differs from pull request most recent head 0509924

Please upload reports for the commit 0509924 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1099      +/-   ##
==========================================
- Coverage   96.18%   96.13%   -0.05%     
==========================================
  Files          76       80       +4     
  Lines        5996     6113     +117     
  Branches      992     1075      +83     
==========================================
+ Hits         5767     5877     +110     
- Misses        135      140       +5     
- Partials       94       96       +2     


@jwmueller jwmueller requested a review from sanjanag May 13, 2024 06:32
@sanjanag sanjanag requested review from elisno May 24, 2024 10:07
Split and parametrize test functions for easier maintenance.

- Expand the set of scenarios for addressing issue 1065. Providing pred_probs alone is not sufficient; one of features, knn_graph, or cluster_ids must also be provided.
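As a rough illustration of the expanded scenario set, the snippet below enumerates every combination of the four inputs and marks which ones should make underperforming_group eligible. It is a sketch under stated assumptions, not the test code from this PR: `is_eligible` and `enumerate_scenarios` are hypothetical names, and only the input names come from the commit message.

```python
from itertools import product

# Input names taken from the commit message above.
GROUPING_INPUTS = ("features", "knn_graph", "cluster_ids")


def is_eligible(provided: frozenset) -> bool:
    """Hypothetical predicate: pred_probs plus at least one grouping input."""
    return "pred_probs" in provided and any(name in provided for name in GROUPING_INPUTS)


def enumerate_scenarios():
    """Yield (provided_inputs, expected_eligibility) for all 16 combinations."""
    names = ("pred_probs",) + GROUPING_INPUTS
    for flags in product([False, True], repeat=len(names)):
        provided = frozenset(name for name, on in zip(names, flags) if on)
        yield provided, is_eligible(provided)


eligible = [sorted(provided) for provided, ok in enumerate_scenarios() if ok]
print(len(eligible))  # 7
```

A table like this could be passed to `pytest.mark.parametrize` so that each combination runs as its own test case.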
@elisno elisno changed the title from "Fix: Error in underperforming_group" to "Refine handling of underperforming_group issue type" May 24, 2024
Member

@elisno elisno left a comment

LGTM! Made slight improvements to handle the omission of underperforming_group more accurately. The additional checks and updated test cases ensure that the issue type is only included when the required inputs are provided.

Thanks @gogetron!

@elisno elisno merged commit a043470 into cleanlab:master May 24, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

lab.find_issues(features=features) outputs error for underperforming issue
3 participants