Joint Design of Membership-Inference Resistance Metrics and Verifiable Watermarks for Synthetic Data

Qiying Wu; Fei Ge

doi:10.54254/2755-2721/2025.28248

Applied and Computational Engineering (Open access)

Research Article

Qiying Wu ¹*, Fei Ge ²

¹ Syracuse University, New York, United States

² Swansea University, Swansea, UK

*Corresponding author: rara481846778@gmail.com

Published on 22 October 2025

ACE Vol.197

ISSN (Print): 2755-273X

ISSN (Online): 2755-2721

ISBN (Print): 978-1-80590-465-6

ISBN (Online): 978-1-80590-466-3

Abstract

Synthetic data has become a vital component of AI training and data governance: it addresses data scarcity and regulatory compliance, but it also introduces new concerns regarding security and controllability. A central tension arises between privacy protection and traceability. Membership inference attacks can exploit differences in model outputs to infer whether an individual record was present in the training data, leading to privacy leakage, while watermarking mechanisms provide traceability but often compromise task utility as embedding strength and robustness are increased. To resolve this conflict, this study proposes a joint design framework that integrates adversary-advantage–based resistance metrics with verifiable watermarking, optimized through a multi-objective paradigm to enable coordinated training of generators and watermark embedders. Experiments on the CIFAR-10, CelebA, IMDB, and UCI Adult datasets show that the framework significantly reduces membership inference risk under black-box, white-box, and post-editing attacks, with an average reduction of approximately 30% in adversary advantage, while maintaining over 95% watermark detection accuracy and less than 2% utility loss. These findings support the feasibility of a dynamic balance among privacy resistance, traceability, and data quality through joint optimization, and they establish a unified evaluation protocol and a practical governance pathway for the trustworthy deployment of synthetic data in research and industry.
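The abstract does not spell out how adversary advantage is computed. As a minimal sketch, assuming the commonly used definition of membership advantage as the attacker's true-positive rate minus false-positive rate (maximized over decision thresholds), the metric and the reported relative reduction could be evaluated as follows. The function names and the score convention (higher score means "more likely a member") are hypothetical and not taken from the paper.

```python
def membership_advantage(member_scores, nonmember_scores):
    """Best-case attacker advantage: max over thresholds of TPR - FPR.

    Assumption: the attacker predicts "member" when score >= threshold.
    Returns a value in [0, 1]; 0 means no threshold separates members
    from non-members any better than guessing.
    """
    thresholds = sorted(set(member_scores) | set(nonmember_scores))
    best = 0.0
    for t in thresholds:
        # Fraction of true members flagged as members at this threshold.
        tpr = sum(s >= t for s in member_scores) / len(member_scores)
        # Fraction of non-members wrongly flagged as members.
        fpr = sum(s >= t for s in nonmember_scores) / len(nonmember_scores)
        best = max(best, tpr - fpr)
    return best


def relative_reduction(baseline_adv, defended_adv):
    """Relative drop in advantage, e.g. 0.3 for a 30% reduction."""
    return (baseline_adv - defended_adv) / baseline_adv


if __name__ == "__main__":
    # Toy attack scores, purely illustrative (not results from the paper).
    members = [0.9, 0.8, 0.7, 0.4]
    nonmembers = [0.6, 0.5, 0.3, 0.2]
    print(membership_advantage(members, nonmembers))
```

Under this reading, an "average reduction of approximately 30% in adversary advantage" would correspond to `relative_reduction(baseline, defended)` being close to 0.3 when averaged over the attack settings.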

Keywords:

Synthetic data, Data governance, Membership inference, Verifiable watermark, Data quality
