グループテストとは - わかりやすく解説 Weblio辞書

ウィキペディア

索引トップランキングカテゴリー

グループテスト

出典: フリー百科事典『ウィキペディア（Wikipedia）』 (2020/07/12 14:40 UTC 版)

ナビゲーションに移動検索に移動

この項目「グループテスト」は途中まで翻訳されたものです。（原文：en:Group testing 10:02, 4 June 2020 時点の版）
翻訳作業に協力して下さる方を求めています。ノートページや履歴、翻訳のガイドラインも参照してください。要約欄への翻訳情報の記入をお忘れなく。（2020年7月）

この記事は別の言語からおおざっぱに翻訳されたものであり、場合によっては不慣れな翻訳者や機械翻訳によって翻訳されたものかもしれません。翻訳を改善してくださる方を募集しています。

Template:Db-a10

An illustration of the lightbulb problem, where one is searching for a broken bulb among six lightbulbs. Here, the first three are connected to a power supply, and they light up (A). This indicates that the broken bulb must be one of the last three (B). If instead the bulbs did not light up, one could be sure that the broken bulb was among the first three. Continuing this procedure can locate the broken bulb in no more than three tests, compared to a maximum of six tests if the bulbs are checked individually.

統計学や組み合わせ数学において, 何らかの検体を特定する作業を, 検体を個別に調べるのではなく, グループにまとめて調べるという作業に分割するようなやりかたをグループテストとよぶ。1943年にRobert Dorfmanによって最初に研究されて以来, グループテストは比較的新しい応用数学の分野であり、幅広い分野で実用化されており, 現在も活発に研究されている。

グループテストのよく知られている例は、直列に接続されたいくつかの電球の中で, いずれか1つだけが壊れている場合である。目的は、できる限り少ない検査（いくつかの電球を電源に接続すること）で、壊れた電球を見つけることである。単純な方法は、各電球を個別に検査することである。しかし、電球の数が多い場合は、電球をグループにまとめる方がはるかに効率的である。たとえば、電球を2つのグループに分けて, 片方のグループの電球を一度にまとめて接続することにより、壊れた電球がどちらのグループに入っているかを判断でき、たった1回の検査で半数の電球を除外できる。

グループテストを実行するためのスキームには単純なものも複雑なものもあり、ひとつのスキームの中の各段階の検査も異なる場合がある。次の段階の検査法が前の段階の検査結果によって変わるようなスキームは適応的手順(adaptive procedure)と呼ばれる一方, すべての検査法が事前に決まるように設計されたスキームは「非適応的手順」(non-adaptive procedure)と呼ばれる。非適応的手順に含まれる検査のスキームの構造は「プーリング設計」として知られている（訳注: この部分の意味が原文ではわからない）。

グループテストには、統計学、生物学、コンピュータサイエンス、医学、エンジニアリング、サイバーセキュリティなど、多くの応用がある。これらの検査スキームに対する関心は、近年もHuman Genome Projectによって再燃している ^[1]。

基本的な説明と用語

数学の多くの分野とは異なり、グループテストの起源は、^[2]Robert Dorfmanという1人の人物が書いた1つの報告書に遡ることができる。^[3]第二次世界大戦中に、アメリカ公衆衛生局とセレクティブ・サービス・システムが, 入隊者の中から梅毒の男性全員を排除する大規模プロジェクトに着手したときにその動機が生じた。梅毒について個人を検査するには、血液サンプルを採取して, それを分析する必要がある。当時、この検査には費用がかかり、すべての兵士を個別に検査することは非常に費用がかかり、非効率的だった。^[3]

Supposing there are $n$

A typical group testing setup. A non-adaptive algorithm first chooses the matrix

M

Combinatorial Orthogonal Matching Pursuit, or COMP, is a simple non-adaptive group-testing algorithm that forms the basis for the more complicated algorithms that follow in this section.

First, each entry of the testing matrix is chosen i.i.d. to be $1$

An illustration of a multiaccess channel showing a successful message and a message collision.

A multiaccess channel is a communication channel that connects many users at once. Every user can listen and transmit on the channel, but if more than one user transmits at the same time, the signals collide, and are reduced to unintelligible noise. Multiaccess channels are important for various real-world applications, notably wireless computer networks and phone networks.^[27]

A prominent problem with multiaccess channels is how to assign transmission times to the users so that their messages do not collide. A simple method is to give each user their own time slot in which to transmit, requiring $n$

Origami Assay paper template for group testing design

In a laboratory setting, one challenge of group testing is the construction of the mixtures can be time-consuming and difficult to do accurately by hand. Origami assays provided a workaround for this construction problem by providing paper templates to guide the technician on how to allocate patient samples across the test wells.^[35]

Using the largest group testing designs (XL3) it was possible to test 1120 patient samples in 94 assay wells. If the true positive rate was low enough, then no additional testing was required.

Data forensics

Data forensics is a field dedicated to finding methods for compiling digital evidence of a crime. Such crimes typically involve an adversary modifying the data, documents or databases of a victim, with examples including the altering of tax records, a virus hiding its presence, or an identity thief modifying personal data.^[26]

A common tool in data forensics is the one-way cryptographic hash. This is a function that takes the data, and through a difficult-to-reverse procedure, produces a unique number called a hash.^{[注釈 9]} Hashes, which are often much shorter than the data, allow us to check if the data has been changed without having to wastefully store complete copies of the information: the hash for the current data can be compared with a past hash to determine if any changes have occurred. An unfortunate property of this method is that, although it is easy to tell if the data has been modified, there is no way of determining how: that is, it is impossible to recover which part of the data has changed.^[26]

One way to get around this limitation is to store more hashes – now of subsets of the data structure – to narrow down where the attack has occurred. However, to find the exact location of the attack with a naive approach, a hash would need to be stored for every datum in the structure, which would defeat the point of the hashes in the first place. (One may as well store a regular copy of the data.) Group testing can be used to dramatically reduce the number of hashes that need to be stored. A test becomes a comparison between the stored and current hashes, which is positive when there is a mismatch. This indicates that at least one edited datum (which is taken as defectiveness in this model) is contained in the group that generated the current hash.^[26]

In fact, the amount of hashes needed is so low that they, along with the testing matrix they refer to, can even be stored within the organisational structure of the data itself. This means that as far as memory is concerned the test can be performed 'for free'. (This is true with the exception of a master-key/password that is used to secretly determine the hashing function.)^[26]

Notes

^ The original problem that Dorfman studied was of this nature (although he did not account for this), since in practice, only a certain number of blood sera could be pooled before the testing procedure became unreliable. This was the main reason that Dorfman’s procedure was not applied at the time.^[3]
^ However, as is often the case in mathematics, group testing has been subsequently re-invented multiple times since then, often in the context of applications. For example, Hayes independently came up with the idea to query groups of users in the context of multiaccess communication protocols in 1978.^[9]
^ This is sometimes referred to as the Hu-Hwang-Wang conjecture.
^ The number of tests, $t$ , must scale as $t=O\left(d^{2}\log _{d}n\right)$ for deterministic designs, compared to $t=O(d\log _{2}n)$ for designs that allow arbitrarily small probabilities of error (as $d\to \infty$ and $n\to \infty$ ).^[4]
^ One must be careful to distinguish between when a test reports a false result and when the group-testing procedure fails as a whole. It is both possible to make an error with no incorrect tests and to not make an error with some incorrect tests. Most modern combinatorial algorithms have some non-zero probability of error (even with no erroneous tests), since this significantly decreases the number of tests needed.
^ In fact it is possible to do much better. For example, Li's $s$ -stage algorithm gives an explicit construction were $t\leq {\frac {e}{\log _{2}e}}d\log _{2}{(n/d)}$ .
^ Alternatively $\mathbf {y}$ can be defined by the equation $\mathbf {y} :=M\mathbf {x}$ , where multiplication is logical AND ( $\wedge$ ) and addition is logical OR ( $\vee$ ). Here, $\mathbf {y}$ will have a $1$ in position $i$ if and only if $(M)_{i,j}$ and $\mathbf {x} _{j}$ are both $1$ for any $j$ . That is, if and only if at least one defective item was included in the $i{\textrm {-th}}$ test.
^ This kind of measurement comes up in many applications. For example, certain kinds of digital camera^[28] or MRI machines,^[29] where time constraints require that only a small number of measurements are taken.
^ More formally hashes have a property called collision resistance, which is that the likelihood of the same hash resulting from different inputs is very low for data of an appropriate size. In practice, the chance that two different inputs might produce the same hash is often ignored.

References

Citations

^ Colbourn, Charles J.; Dinitz, Jeffrey H. (2007), Handbook of Combinatorial Designs (2nd ed.), Boca Raton: Chapman & Hall/ CRC, p. 574, Section 46: Pooling Designs, ISBN 978-1-58488-506-1
^ ^a ^b ^c Dorfman, Robert (December 1943), “The Detection of Defective Members of Large Populations”, The Annals of Mathematical Statistics 14 (4): 436–440, doi:10.1214/aoms/1177731363, JSTOR 2235930, https://jstor.org/stable/2235930
^ ^a ^b ^c ^d ^e ^f ^g ^h ⁱ ^j ^k ^l ^m ⁿ ^o ^p ^q ^r ^s ^t ^u ^v ^w Ding-Zhu, Du; Hwang, Frank K. (1993). Combinatorial group testing and its applications. Singapore: World Scientific. ISBN 978-9810212933
^ ^a ^b ^c ^d ^e ^f ^g ^h ⁱ Atia, George Kamal; Saligrama, Venkatesh (March 2012). “Boolean compressed sensing and noisy group testing”. IEEE Transactions on Information Theory 58 (3): 1880–1901. arXiv:0907.1061. doi:10.1109/TIT.2011.2178156.
^ ^a ^b ^c ^d ^e ^f ^g ^h ⁱ ^j Chun Lam Chan; Pak Hou Che; Jaggi, Sidharth; Saligrama, Venkatesh (1 September 2011), “2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton)”, 49th Annual Allerton Conference on Communication, Control, and Computing: pp. 1832–1839, arXiv:1107.4540, doi:10.1109/Allerton.2011.6120391, ISBN 978-1-4577-1817-5
^ Hung, M.; Swallow, William H. (March 1999). “Robustness of Group Testing in the Estimation of Proportions”. Biometrics 55 (1): 231–237. doi:10.1111/j.0006-341X.1999.00231.x. PMID 11318160.
^ Chen, Hong-Bin; Fu, Hung-Lin (April 2009). “Nonadaptive algorithms for threshold group testing”. Discrete Applied Mathematics 157 (7): 1581–1585. doi:10.1016/j.dam.2008.06.003.
^ De Bonis, Annalisa (20 July 2007). “New combinatorial structures with applications to efficient group testing with inhibitors”. Journal of Combinatorial Optimization 15 (1): 77–94. doi:10.1007/s10878-007-9085-1.
^ Hayes, J. (August 1978). “An adaptive technique for local distribution”. IEEE Transactions on Communications 26 (8): 1178–1186. doi:10.1109/TCOM.1978.1094204.
^ Sterrett, Andrew (December 1957). “On the detection of defective members of large populations”. The Annals of Mathematical Statistics 28 (4): 1033–1036. doi:10.1214/aoms/1177706807.
^ Sobel, Milton; Groll, Phyllis A. (September 1959). “Group testing to eliminate efficiently all defectives in a binomial sample”. Bell System Technical Journal 38 (5): 1179–1252. doi:10.1002/j.1538-7305.1959.tb03914.x.
^ Li, Chou Hsiung (June 1962). “A sequential method for screening experimental variables”. Journal of the American Statistical Association 57 (298): 455–477. doi:10.1080/01621459.1962.10480672.
^ Katona, Gyula O.H. (1973). “A survey of combinatorial theory”. Combinatorial Search Problems (North-Holland, Amsterdam): 285–308.
^ ^a ^b ^c ^d Hwang, Frank K. (September 1972). “A method for detecting all defective members in a population by group testing”. Journal of the American Statistical Association 67 (339): 605–608. doi:10.2307/2284447. JSTOR 2284447.
^ Allemann, Andreas (2013). “An efficient algorithm for combinatorial group testing”. Information Theory, Combinatorics, and Search Theory: 569–596.
^ ^a ^b Hu, M. C.; Hwang, F. K.; Wang, Ju Kwei (June 1981). “A Boundary Problem for Group Testing”. SIAM Journal on Algebraic and Discrete Methods 2 (2): 81–87. doi:10.1137/0602011.
^ Leu, Ming-Guang (28 October 2008). “A note on the Hu–Hwang–Wang conjecture for group testing.”. The ANZIAM Journal 49 (4): 561. doi:10.1017/S1446181108000175.
^ Riccio, Laura; Colbourn, Charles J. (1 January 2000). “Sharper bounds in adaptive group testing”. Taiwanese Journal of Mathematics 4 (4): 669–673. doi:10.11650/twjm/1500407300.
^ ^a ^b ^c ^d Aldridge, Matthew; Baldassini, Leonardo; Johnson, Oliver (June 2014). “Group Testing Algorithms: Bounds and Simulations”. IEEE Transactions on Information Theory 60 (6): 3671–3687. arXiv:1306.6438. doi:10.1109/TIT.2014.2314472.
^ Baldassini, L.; Johnson, O.; Aldridge, M. (1 July 2013), “2013 IEEE International Symposium on Information Theory”, IEEE International Symposium on Information Theory: pp. 2676–2680, arXiv:1301.7023, doi:10.1109/ISIT.2013.6620712, ISBN 978-1-4799-0446-4
^ Sobel, Milton; Elashoff, R. M. (1975). “Group testing with a new goal, estimation”. Biometrika 62 (1): 181–193. doi:10.1093/biomet/62.1.181.
^ Bar-Noy, A.; Hwang, F. K.; Kessler, I.; Kutten, S. (1 May 1992). A new competitive algorithm for group testing. 2. 786–793. doi:10.1109/INFCOM.1992.263516. ISBN 978-0-7803-0602-8
^ Damaschke, Peter (2000). “Adaptive versus nonadaptive attribute-efficient learning”. Machine Learning 41 (2): 197–215. doi:10.1023/A:1007616604496.
^ Stinson, D. R.; van Trung, Tran; Wei, R (May 2000). “Secure frameproof codes, key distribution patterns, group testing algorithms and related structures”. Journal of Statistical Planning and Inference 86 (2): 595–617. doi:10.1016/S0378-3758(99)00131-7.
^ Colbourn, C. J.; Dinitz, J. H.; Stinson, D. R. (1999). “Communications, Cryptography, and Networking”. Surveys in Combinatorics 3 (267): 37–41. doi:10.1007/BF01609873.
^ ^a ^b ^c ^d ^e Goodrich, Michael T.; Atallah, Mikhail J.; Tamassia, Roberto (7 June 2005). Indexing information for data forensics. Lecture Notes in Computer Science. 3531. 206–221. doi:10.1007/11496137_15. ISBN 978-3-540-26223-7
^ Chlebus, B. S. (2001). "Randomized communication in radio networks". In: Pardalos, P. M.; Rajasekaran, S.; Reif, J.; Rolim, J. D. P. (Eds.), Handbook of Randomized Computing, Vol. I, p.401–456. Kluwer Academic Publishers, Dordrecht.
^ Takhar, D.; Laska, J. N.; Wakin, M. B.; Duarte, M. F.; Baron, D.; Sarvotham, S.; Kelly, K. F.; Baraniuk, R. G. (February 2006). “A new compressive imaging camera architecture using optical-domain compression”. Electronic Imaging. Computational Imaging IV 6065: 606509–606509–10. Bibcode: 2006SPIE.6065...43T. doi:10.1117/12.659602.
^ Candès, E. J. (2014). "Mathematics of sparsity (and a few other things)". Proceedings of the International Congress of Mathematicians. Seoul, South Korea.
^ ^a ^b Gilbert, A. C.; Iwen, M. A.; Strauss, M. J. (October 2008). "Group testing and sparse signal recovery". 42nd Asilomar Conference on Signals, Systems and Computers: 1059–1063. Institute of Electrical and Electronics Engineers.
^ Wright, S. J.; Nowak, R. D.; Figueiredo, M. A. T. (July 2009). “Sparse Reconstruction by Separable Approximation”. IEEE Transactions on Signal Processing 57 (7): 2479–2493. Bibcode: 2009ITSP...57.2479W. doi:10.1109/TSP.2009.2016892.
^ ^a ^b Berinde, R.; Gilbert, A. C.; Indyk, P.; Karloff, H.; Strauss, M. J. (September 2008). Combining geometry and combinatorics: A unified approach to sparse signal recovery. 798–805. arXiv:0804.4666. doi:10.1109/ALLERTON.2008.4797639. ISBN 978-1-4244-2925-7
^ Indyk, Piotr (1 January 2008). “Explicit Constructions for Compressed Sensing of Sparse Signals”. Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms: 30–33.
^ “Origami Assays”. Origami Assays (2020年4月2日). 2020年4月7日閲覧。
^ “Origami Assays”. Origami Assays (2020年4月2日). 2020年4月7日閲覧。

General references

Ding-Zhu, Du; Hwang, Frank K. (1993). Combinatorial group testing and its applications. Singapore: World Scientific. ISBN 978-9810212933
Atri Rudra's course on Error Correcting Codes: Combinatorics, Algorithms, and Applications (Spring 2007), Lectures 7.
Atri Rudra's course on Error Correcting Codes: Combinatorics, Algorithms, and Applications (Spring 2010), Lectures 10, 11, 28, 29
Du, D., & Hwang, F. (2006). Pooling Designs and Nonadaptive Group Testing. Boston: Twayne Publishers.
Aldridge, M., Johnson, O. and Scarlett, J. (2019), Group Testing: An Information Theory Perspective, Foundations and Trends® in Communications and Information Theory: Vol. 15: No. 3-4, pp 196-392. doi:10.1561/0100000099
Ely Porat, Amir Rothschild: Explicit Non-adaptive Combinatorial Group Testing Schemes. ICALP (1) 2008: 748–759
Kagan, Eugene; Ben-gal, Irad (2014), “A group testing algorithm with online informational learning”, IIE Transactions 46 (2): 164–184, doi:10.1080/0740817X.2013.803639, ISSN 0740-817X

グループテストとは？わかりやすく解説

グループテスト

目次