Threshold Algorithms #5

Closed
opened 2026-03-10 14:16:28 +00:00 by leon-adsk · 2 comments
Owner

Implement a way to choose the detection threshold dynamically. Ideally this should be modular, requiring only the dataset and model.

leon-adsk added this to the Release v1.0 milestone 2026-03-10 14:16:28 +00:00
Author
Owner

Added a naive brute-force threshold selection during model creation in issue #6. Need to decide whether to:

  • Select the threshold during model creation: find a way to [add to ONNX metadata](https://github.com/pytorch/pytorch/issues/42808).
  • Select the threshold during model loading: give the proxy application access to the dataset.
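For the first option, a minimal sketch of how a threshold could travel inside the model file, assuming the exported model is post-processed with the `onnx` Python package. ONNX models carry a `metadata_props` list of string key/value entries. The key name `detection_threshold` and both helper functions are hypothetical and not part of this repository.

```python
THRESHOLD_KEY = "detection_threshold"  # hypothetical key name, not an ONNX convention


def embed_threshold(model, threshold, key=THRESHOLD_KEY):
    """Store the threshold as a string entry in model.metadata_props.

    `model` is expected to be an onnx.ModelProto (e.g. from onnx.load).
    An existing entry with the same key is overwritten, not duplicated.
    """
    value = repr(float(threshold))
    for prop in model.metadata_props:
        if prop.key == key:
            prop.value = value
            return
    entry = model.metadata_props.add()
    entry.key = key
    entry.value = value


def read_threshold(model, key=THRESHOLD_KEY):
    """Return the stored threshold as a float, or None if the key is absent."""
    for prop in model.metadata_props:
        if prop.key == key:
            return float(prop.value)
    return None
```

Usage would look roughly like `model = onnx.load(path)`, then `embed_threshold(model, 0.42)`, then `onnx.save(model, path)`. If the proxy application loads the model through ONNX Runtime's Python API, the same entries should be readable via `session.get_modelmeta().custom_metadata_map`, which would avoid shipping the dataset to the proxy.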
leon-adsk modified the milestone from Release v1.0 to MVP 2026-03-13 09:06:11 +00:00
Author
Owner

We will allow explicitly setting a threshold via flag, see issue #9. We will also implement the following methods to be selected via CLI args; these require a dataset:

  • [ ] Evaluate ROC (Youden's J): Find the threshold that maximizes (True Positive Rate - False Positive Rate).
  • [ ] Evaluate ROC (Geometric Distance): Find the threshold that minimizes the geometric distance to the ideal (0, 1) coordinate.
  • [x] Evaluate Confusion Matrix Metrics: F1, Recall, Precision, Accuracy, etc.
  • [x] FBeta Score with Beta = 2
  • [ ] Evaluate Z-Score: Calculate μ+kσ on the reconstruction loss of strictly benign traffic to set the boundary.
  • [ ] Evaluate Median Absolute Deviation (MAD): Use MAD on benign traffic to set a threshold that is robust against occasional normal outliers.
  • [ ] Evaluate Percentile Cutoff: Pin the threshold to a fixed percentile (e.g., 99th or 99.9th) of benign validation losses based on acceptable false positive rates.
  • [ ] Evaluate Alert Budgeting: Set the threshold dynamically to yield a maximum number of daily alerts matching SOC review capacity.
  • [ ] Load cached threshold (invalidate with different model)
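Several of the methods listed above reduce to a few lines each. The following is a rough sketch, not the project's implementation. It assumes per-sample reconstruction losses are available as plain floats, that a sample is flagged when its loss is at or above the threshold, and that labels use 1 for attack and 0 for benign. All function names, the default `k = 3`, the population standard deviation, the 1.4826 MAD scaling constant, and the nearest-rank percentile rule are illustrative assumptions.

```python
import math
import statistics


def sweep(scores, labels, objective):
    """Brute-force search: try every distinct score as threshold and keep
    the one with the highest objective(tp, fp, fn, tn). Ties keep the
    lowest threshold. Assumes both classes occur in `labels`."""
    best_t, best_v = None, -math.inf
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        v = objective(tp, fp, fn, tn)
        if v > best_v:
            best_t, best_v = t, v
    return best_t


def youden_j(tp, fp, fn, tn):
    """Youden's J: true positive rate minus false positive rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def neg_distance_to_ideal(tp, fp, fn, tn):
    """Negated distance from (FPR, TPR) to the ideal ROC point (0, 1),
    negated so that maximizing it minimizes the distance."""
    return -math.hypot(fp / (fp + tn), 1.0 - tp / (tp + fn))


def f_beta(beta=2.0):
    """Build an F-beta objective; beta = 2 weights recall over precision."""
    b2 = beta * beta

    def objective(tp, fp, fn, tn):
        denom = (1 + b2) * tp + b2 * fn + fp
        return (1 + b2) * tp / denom if denom else 0.0

    return objective


def z_score_threshold(benign, k=3.0):
    """mu + k * sigma over benign losses (population standard deviation)."""
    return statistics.fmean(benign) + k * statistics.pstdev(benign)


def mad_threshold(benign, k=3.0):
    """median + k * 1.4826 * MAD; the constant makes MAD comparable to
    sigma when the losses are roughly normal."""
    med = statistics.median(benign)
    mad = statistics.median([abs(x - med) for x in benign])
    return med + k * 1.4826 * mad


def percentile_threshold(benign, pct=99.0):
    """Nearest-rank percentile of benign losses."""
    ordered = sorted(benign)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

The labelled methods share one sweep so that each CLI choice only swaps the objective, e.g. `sweep(scores, labels, youden_j)` or `sweep(scores, labels, f_beta(2.0))`. The benign-only methods need no attack samples at all, which matters if the proxy only has access to normal traffic. The sweep is quadratic in the number of samples and is meant for illustration rather than large validation sets.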
leon-adsk added spent time 2026-03-14 17:09:28 +00:00
10 minutes
leon-adsk stopped working 2026-03-14 17:26:34 +00:00
17 minutes 4 seconds
leon-adsk stopped working 2026-03-15 15:33:26 +00:00
50 minutes 59 seconds
leon-adsk referenced this issue from a commit 2026-03-16 12:35:32 +00:00
leon-adsk stopped working 2026-03-16 12:35:32 +00:00
27 minutes 16 seconds
leon-adsk stopped working 2026-03-16 13:05:01 +00:00
29 minutes 18 seconds
leon-adsk added spent time 2026-03-16 13:09:01 +00:00
3 minutes
leon-adsk referenced this issue from a commit 2026-03-16 13:09:13 +00:00
leon-adsk stopped working 2026-03-17 15:52:11 +00:00
12 minutes 39 seconds
No milestone
No assignees
1 participant
Total time spent: 2 hours 30 minutes
leon-adsk
2 hours 30 minutes
Blocks
#9 Flags
leon-adsk/http-anomaly-detection
Depends on
#8 De-Singleton Model Loading
leon-adsk/http-anomaly-detection
Reference
leon-adsk/http-anomaly-detection#5