A distributed multiple sample testing for massive data

© 2021 Informa UK Limited, trading as Taylor & Francis Group.

Bibliographic Details
Published in: Journal of applied statistics. - 1991. - 50(2023), no. 3 (day 24), pages 555-573
Main Author: Xiaoyue, Xie (Author)
Other Authors: Shi, Jian; Song, Kai
Format: Online Article
Language: English
Published: 2023
Access to the parent work: Journal of applied statistics
Subjects: Journal Article; Distributed scheme; classification; fraud detection; hypothesis testing
LEADER 01000caa a22002652 4500
001 NLM356176681
003 DE-627
005 20240917232447.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1080/02664763.2021.1911967  |2 doi 
028 5 2 |a pubmed24n1536.xml 
035 |a (DE-627)NLM356176681 
035 |a (NLM)37114090 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Xiaoyue, Xie  |e verfasserin  |4 aut 
245 1 2 |a A distributed multiple sample testing for massive data 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 17.09.2024 
500 |a published: Electronic-eCollection 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a © 2021 Informa UK Limited, trading as Taylor & Francis Group. 
520 |a When the data are stored in a distributed manner, direct application of traditional hypothesis testing procedures is often prohibitive due to communication costs and privacy concerns. This paper mainly develops and investigates a distributed two-node Kolmogorov-Smirnov hypothesis testing scheme, implemented by the divide-and-conquer strategy. In addition, this paper provides a distributed fraud detection method and a distribution-based classification method for multi-node machines, both based on the proposed hypothesis testing scheme. The distributed fraud detection identifies which node in a multi-node system stores fraudulent data, and the distribution-based classification determines whether the distributions across nodes differ and groups the nodes by distribution. These methods can improve the accuracy of statistical inference in a distributed storage architecture. Furthermore, this paper verifies the feasibility of the proposed methods through simulation and real-data studies. 
650 4 |a Journal Article 
650 4 |a Distributed scheme 
650 4 |a classification 
650 4 |a fraud detection 
650 4 |a hypothesis testing 
700 1 |a Shi, Jian  |e verfasserin  |4 aut 
700 1 |a Song, Kai  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Journal of applied statistics  |d 1991  |g 50(2023), 3 vom: 24., Seite 555-573  |w (DE-627)NLM098188178  |x 0266-4763  |7 nnns 
773 1 8 |g volume:50  |g year:2023  |g number:3  |g day:24  |g pages:555-573 
856 4 0 |u http://dx.doi.org/10.1080/02664763.2021.1911967  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 50  |j 2023  |e 3  |b 24  |h 555-573
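
Illustrative note on the abstract: the record does not describe the authors' algorithm, so the following is only a minimal sketch of how a two-node Kolmogorov-Smirnov comparison could be carried out without moving raw data. It assumes each node evaluates its empirical CDF on a shared grid and sends only that summary to a coordinator. The function names `ecdf_on_grid` and `ks_from_summaries`, the grid, and the cutoff used for small statistics are all hypothetical choices made for this example, not taken from the paper.

```python
import bisect
import math
import random


def ecdf_on_grid(sample, grid):
    """Run locally on one node: empirical CDF of `sample` at each grid point."""
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect.bisect_right(ordered, g) / n for g in grid]


def ks_from_summaries(cdf_a, n_a, cdf_b, n_b):
    """Run on the coordinator: approximate KS statistic and asymptotic p-value.

    Only the grid summaries and sample sizes are needed, not the raw data.
    Because the maximum is taken over grid points only, the statistic can
    never exceed the exact two-sample KS statistic.
    """
    d = max(abs(a - b) for a, b in zip(cdf_a, cdf_b))
    lam = math.sqrt(n_a * n_b / (n_a + n_b)) * d
    if lam < 0.2:
        # The alternating series below converges slowly for tiny lam,
        # where the limiting p-value is 1 to many decimal places.
        return d, 1.0
    # Kolmogorov limiting distribution: 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Hypothetical usage: two nodes holding samples from shifted normal distributions.
rng = random.Random(0)
node_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
node_b = [rng.gauss(1.0, 1.0) for _ in range(2000)]
grid = [-4.0 + 0.05 * i for i in range(181)]  # shared grid on [-4, 5]
d, p = ks_from_summaries(ecdf_on_grid(node_a, grid), len(node_a),
                         ecdf_on_grid(node_b, grid), len(node_b))
print(f"approximate D = {d:.3f}, asymptotic p-value = {p:.3g}")
```

In this sketch the communication cost per node is one number per grid point regardless of sample size, which is the kind of saving the abstract alludes to. A finer grid brings the approximation closer to the exact statistic at the price of a larger summary.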