Comprehensive benchmarking of privacy preservation and data fidelity across leading synthetic data generators
We define a novel real-world privacy metric that captures harms people actually care about. Rather than asking whether a particular flow or packet was included in training, our formulation asks whether an individual's traffic contributed to training a model, and whether that inclusion can reveal sensitive information such as their location, behavior, or organizational affiliation.
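The difference between the two questions can be made concrete with a small sketch. It is only an illustration of the ground-truth labels each question implies, not the TraceBleed attack itself; the `Flow` record, its fields, and the use of a source IP address as a stand-in for an individual are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    # Hypothetical flow record; field names are assumptions for illustration.
    src: str      # source identifier, standing in for an individual
    dst: str
    dport: int
    n_bytes: int


def flow_level_member(flow, training_flows):
    """Classic question: was this exact flow part of the training data?"""
    return flow in set(training_flows)


def source_level_member(src, training_flows):
    """Source-level question: did any traffic from this source contribute?"""
    return any(f.src == src for f in training_flows)


training = [
    Flow("10.0.0.1", "93.184.216.34", 443, 1500),
    Flow("10.0.0.1", "1.1.1.1", 53, 80),
]

# A flow never seen in training, but produced by a source that was.
probe = Flow("10.0.0.1", "93.184.216.34", 443, 9000)

print(flow_level_member(probe, training))        # False
print(source_level_member(probe.src, training))  # True
```

The probe flow is not a training member under the flow-level definition, yet its source is a member under the source-level one, which is the case a per-flow formulation would miss.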
TraceBleed is not a naïve membership inference attack (MIA). It is a novel, attack-grounded metric tailored to network traces that exploits behavioral fingerprints across flows rather than statistical artifacts.
We do not simply train the attacker and the generator on the same dataset. Our leaderboard contains evaluations across datasets collected from diverse vantage points, to ensure that the vulnerability we measure is true privacy leakage rather than a distribution-specific anomaly.
We close the loop with a principled, generator-agnostic mitigation, showing the leakage we measure is both real and fixable.