Hi, is it possible to print which features contribute most to a data point being flagged as an outlier? That way we could check, for each data point, which features make it deviate most from the distribution.
Hi @lujiazho, you can actually use the shap library for this:
import numpy as np
import shap
from pyod.models.knn import KNN

# Fit PyOD model
clf = KNN()
clf.fit(data)

# Shap is slow, so perhaps only explain the highest-scoring points, e.g. the top 100
scores = clf.decision_scores_
order = np.argsort(scores)  # ascending, so the most outlying points come last

# Fit shap explainer and get values for the top 100
explainer = shap.Explainer(clf.decision_function, data)
shap_values = explainer(data[order[-100:]])

# Or for just one point of interest
idx = 55
single_shap_values = explainer(data[idx].reshape(1, -1))

# Example of some plots
shap.plots.waterfall(single_shap_values[0])  # waterfall expects a single explanation
shap.plots.beeswarm(shap_values)
shap.plots.bar(shap_values)
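If you would rather print the features than plot them, something like the sketch below might work. Note that `top_contributing_features` is a hypothetical helper name, not part of shap or PyOD. It assumes the attributions are available as a 2-D array of shape (n_samples, n_features), such as `shap_values.values`, and the numbers in the demo are made up purely for illustration.

```python
def top_contributing_features(values, feature_names, k=3):
    """For each row, return the k features with the largest absolute attribution.

    `values` is any 2-D sequence of per-feature attributions
    (for example shap_values.values); this helper is hypothetical.
    """
    ranked = []
    for row in values:
        # Sort features by the magnitude of their attribution, largest first
        pairs = sorted(zip(feature_names, row), key=lambda p: abs(p[1]), reverse=True)
        ranked.append([(name, float(value)) for name, value in pairs[:k]])
    return ranked


# Made-up attributions for two points and three features (illustration only)
demo_values = [[0.1, -2.0, 0.5], [1.5, 0.2, -0.3]]
demo_names = ["f0", "f1", "f2"]

demo_top = top_contributing_features(demo_values, demo_names, k=2)
for i, features in enumerate(demo_top):
    print(f"point {i}: {features}")
```

With the shap output, the call would be along the lines of `top_contributing_features(shap_values.values, feature_names, k=3)`, where `feature_names` is your own list of column names. A positive value pushes the outlier score up and a negative one pulls it down, so the sign is kept in the output.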
Hope this is what you were looking for and all the best!