Matryoshka: Exploiting the Over-Parametrization of Deep Learning Models for Covert Data Transmission

High-quality private machine learning (ML) data stored in local data centers becomes a key competitive factor for AI corporations. In this paper, we present a novel insider attack called Matryoshka to reveal the possibility of breaking the privacy of ML data even with no exposed interface. Our attac...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), 26 July
Main Author: Pan, Xudong (Author)
Other Authors: Zhang, Mi, Yan, Yifan, Zhang, Shengyao, Yang, Min
Format: Online Article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM37546123X
003 DE-627
005 20240727233038.0
007 cr uuu---uuuuu
008 240727s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3434417  |2 doi 
028 5 2 |a pubmed24n1483.xml 
035 |a (DE-627)NLM37546123X 
035 |a (NLM)39058616 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Pan, Xudong  |e verfasserin  |4 aut 
245 1 0 |a Matryoshka  |b Exploiting the Over-Parametrization of Deep Learning Models for Covert Data Transmission 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 26.07.2024 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a High-quality private machine learning (ML) data stored in local data centers becomes a key competitive factor for AI corporations. In this paper, we present a novel insider attack called Matryoshka to reveal the possibility of breaking the privacy of ML data even with no exposed interface. Our attack employs a scheduled-to-publish DNN model as a carrier model for covert transmission of secret models which memorize the information of private ML data that otherwise has no interface to the outsider. At the core of our attack, we present a novel parameter sharing approach which exploits the learning capacity of the carrier model for information hiding. Our approach simultaneously achieves: (i) High Capacity - With almost no utility loss of the carrier model, Matryoshka can transmit over 10,000 real-world data samples within a carrier model which has 220× less parameters than the total size of the stolen data, and simultaneously transmit multiple heterogeneous datasets or models within a single carrier model under a trivial distortion rate, neither of which can be done with existing steganography techniques; (ii) Decoding Efficiency - once downloading the published carrier model, an outside colluder can exclusively decode the hidden models from the carrier model with only several integer secrets and the knowledge of the hidden model architecture; (iii) Effectiveness - Moreover, almost all the recovered models either have similar performance as if it is trained independently on the private data, or can be further used to extract memorized raw training data with low error; (iv) Robustness - Information redundancy is naturally implemented to achieve resilience against common post-processing techniques on the carrier before its publishing; (v) Covertness - A model inspector with different levels of prior knowledge could hardly differentiate a carrier model from a normal model 
650 4 |a Journal Article 
700 1 |a Zhang, Mi  |e verfasserin  |4 aut 
700 1 |a Yan, Yifan  |e verfasserin  |4 aut 
700 1 |a Zhang, Shengyao  |e verfasserin  |4 aut 
700 1 |a Yang, Min  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g PP(2024) vom: 26. Juli  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:PP  |g year:2024  |g day:26  |g month:07 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3434417  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2024  |b 26  |c 07