Mathematics Enables Effective Screening of Recessive Genetic Disorders

Posted August 2013

Diagram showing how recessive genetic disorders emerge

Yaniv Erlich directs a human genetics lab at the Whitehead Institute for Biomedical Research at the Massachusetts Institute of Technology (MIT). One of his research projects is to identify carriers of recessive genetic disorders that affect a large proportion of the Ashkenazi Jewish population. These genetic disorders are known to cause devastating diseases, such as Tay-Sachs, Canavan disease, familial dysautonomia, and cystic fibrosis.

While it is possible to screen every individual separately, doing so is costly and labor-intensive. For this reason, Erlich developed a pooling strategy (Erlich et al., Genome Research, 2009; Erlich et al., IEEE Trans Info Theory, 2010). Pooling goes back to the work of Dorfman in the 1940s, who developed a method that later became known as “group testing.” The method, which was used in World War II to identify syphilitic men called for service, pooled the recruits' blood samples into groups and tested each mixed sample. A negative result cleared every recruit in the group at once; only the members of a positive group had to be retested individually. Because the infection rate was so low, most groups tested negative, so the total number of tests was only a little more than the number of groups.
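A minimal Python sketch of two-stage pooled screening in the spirit of Dorfman's method may make the savings concrete: each pool is tested once, and only the members of a positive pool are retested one by one. The function name `dorfman_screen`, the pool size, and the sample data are illustrative assumptions, not details from the original work.

```python
def dorfman_screen(samples, pool_size):
    """Two-stage pooled screening (illustrative sketch).

    samples   : list of booleans, True meaning the sample is positive
    pool_size : number of samples mixed into each pool (assumed value)
    Returns (indices of positive samples, total number of tests used).
    """
    positives = []
    tests = 0
    for start in range(0, len(samples), pool_size):
        pool = range(start, min(start + pool_size, len(samples)))
        tests += 1  # stage 1: one test on the mixed pool
        if any(samples[i] for i in pool):
            # stage 2: the pool is positive, so retest each member
            for i in pool:
                tests += 1
                if samples[i]:
                    positives.append(i)
    return positives, tests


# Hypothetical data: 100 samples, two of them positive.
samples = [False] * 100
samples[3] = samples[57] = True
found, tests = dorfman_screen(samples, pool_size=10)
print(found, tests)  # [3, 57] 30
```

In this made-up example, 10 pool tests plus 20 individual retests replace 100 individual tests. The advantage shrinks as the fraction of positives grows, because more pools need a second stage.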

Erlich attended the workshop “Group Testing Designs, Algorithms, and Applications to Biology” held at the Institute for Mathematics and its Applications (IMA) in February 2012. The workshop brought together researchers in bioinformatics and computer science. Atri Rudra, a computer scientist from the University at Buffalo, gave a talk on code concatenation, a technique that Kautz and Singleton developed in their 1964 work for designing good group testing schemes. During this talk, Erlich and other biologists became aware of the strong connection between coding theory and group testing, and began to work out new testing designs based on the Reed-Solomon code.
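To illustrate the connection between coding theory and group testing, here is a hedged Python sketch of the textbook Kautz-Singleton construction: each sample is assigned a Reed-Solomon codeword (a polynomial of degree below k evaluated at every point of a prime field of size q), and each codeword symbol is mapped to one pool. This is not the Erlich lab's actual design; the function names, the restriction to prime q, and the parameters q = 7 and k = 3 are assumptions chosen for simplicity.

```python
from itertools import product


def kautz_singleton_pools(q, k):
    """Map each of q**k samples to the set of pools it joins.

    q must be prime so that arithmetic modulo q forms a field.
    A pool is labelled (x, y): a sample joins it when its
    polynomial takes the value y at the evaluation point x.
    """
    design = {}
    for sample, coeffs in enumerate(product(range(q), repeat=k)):
        pools = set()
        for x in range(q):
            y = sum(c * pow(x, j, q) for j, c in enumerate(coeffs)) % q
            pools.add((x, y))
        design[sample] = pools
    return design


def decode(design, positive_pools):
    """Declare a sample positive only if every pool it joined is positive."""
    return sorted(s for s, pools in design.items() if pools <= positive_pools)


design = kautz_singleton_pools(q=7, k=3)        # 343 samples
all_pools = set().union(*design.values())
print(len(design), len(all_pools))              # 343 49

carriers = [5, 120, 300]                        # hypothetical positives
positive_pools = set().union(*(design[c] for c in carriers))
print(decode(design, positive_pools))           # [5, 120, 300]
```

Two distinct polynomials of degree below k agree on at most k - 1 points, so any two samples share at most k - 1 pools. With q = 7 and k = 3, each sample sits in 7 pools and three carriers can cover at most 6 of another sample's pools, which is why this simple decoder recovers up to three positives exactly in a single round of 49 tests.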

Today, the seeds planted at that IMA workshop have started to bear fruit. The Erlich lab currently uses a testing design based on the Reed-Solomon code. Recently, they screened 480 samples using this technique, which reduced the cost of the project by 75%. The testing method is efficient and has desirable properties that are mathematically provable. This breakthrough has made large-scale screening for carriers of genetic disorders a practical reality.