Utility vs. Fidelity
This repository contains experiments exploring the relationship between utility and fidelity in differential privacy.
Reproduction
To run this project you must have uv installed on your system. You can install it via your system's package manager or through this link.
To reproduce the results simply clone the project and run the following commands in the project directory:
uv sync
uv run marimo edit
Your default browser should now open a notebook, where you can run the experiments as you please. If you prefer to run the dp_ml experiments headlessly, you can do so via:
uv run notebooks/run_experiment.py
Once the experiments finish and a CSV file is generated, simply press the toggle in the notebook to skip running the experiments and go straight to plotting.
Keep in mind that to run the experiments you will need the corresponding datasets. There is a dropdown at the top of each notebook for choosing one. The acs_income dataset will download automatically if chosen; the rest can be found here and must be placed into the data/datasets directory:
- Adult Income Dataset: Predict whether an individual makes over $50,000 a year.
- UCI Heart Disease Dataset: This is a public healthcare classification dataset and could serve as a reproducible alternative to the closed COVID dataset.
  - Dataset: UCI Heart Disease Dataset https://archive.ics.uci.edu/dataset/45/heart%2Bdisease
- German Credit / South German Credit Dataset: This is a public credit-risk classification dataset with mixed categorical and numerical attributes.
  - Dataset: UCI South German Credit https://archive.ics.uci.edu/dataset/522/south%2Bgerman%2Bcredit
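Before launching the notebooks, it can help to confirm that the manually downloaded files actually landed in data/datasets. The following is a minimal sketch, not part of the repository: the helper name `list_dataset_files` is hypothetical, and since this README does not state which file names the notebooks expect, the script only lists whatever it finds.

```python
from pathlib import Path


def list_dataset_files(dataset_dir="data/datasets"):
    """Return the sorted names of files in dataset_dir (hypothetical helper).

    Returns an empty list if the directory does not exist yet.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        return []
    # Only regular files are reported; subdirectories are skipped.
    return sorted(p.name for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    files = list_dataset_files()
    if files:
        print("Found dataset files:", ", ".join(files))
    else:
        print("No files found in data/datasets; download the datasets first.")
```

Saved under any name in the project directory (for example a hypothetical `check_datasets.py`), it can be run with `uv run check_datasets.py`.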