Research Brief

Embryo Selection, Polygenic Scoring and Unrealistic Expectations

False hope for instilling disease resistance and desirable traits?

Since the 1990s, in vitro fertilization paired with genetic testing has helped biological parents identify embryos that are free of certain inherited conditions. Until recently, these screenings were limited to a handful of diseases, such as Tay-Sachs, sickle cell anemia and cystic fibrosis, for which the preimplantation diagnoses are all but certain. These conditions derive from single-gene anomalies, and lack of the anomaly means the embryo cannot grow into someone who will have the disease.

Within the past year or so, some IVF clinics have started offering a much wider variety of preimplantation screenings, with much less definitive results. Recently developed polygenic scores can assess each embryo’s statistical risk of developing, even decades later, diseases and traits that result from the combined effects of hundreds or thousands of DNA variants, as well as environmental factors.

The scores are used in screening individuals for common conditions like cardiovascular disease, diabetes, cancer, schizophrenia and Alzheimer’s. Well-established studies also find that polygenic scores can be significant predictors of a wide variety of other outcomes, from attention deficit issues and cognitive abilities to creativity and life satisfaction generally. They can also provide strong clues about the propensity to graduate college, according to a seminal 2018 study based on data from over a million individuals published in Nature Genetics.

Yet these respectable research findings may be setting up IVF clinic patients for disappointment, according to a study in the New England Journal of Medicine. For a variety of reasons, the authors write, polygenic scores used in embryo selection are not nearly as useful for achieving certain traits or disease resistance in one couple’s offspring as the genetic studies imply.

The authors suspect that the way some clinics present research statistics to IVF customers, even accurately, encourages unrealistic expectations, explains UCLA Anderson’s Daniel Benjamin, a supervisor on the study along with Steven Hyman of Harvard and Massachusetts Institute of Technology; David Laibson of Harvard; and Peter M. Visscher of University of Queensland. The scores, they say, don’t necessarily make selection among healthy embryos obvious and may even present the biological parents with hard choices they didn’t expect.

The study’s key example focuses on the limited value, and potential downsides, for a biological couple using polygenic scores to select embryos for maximum educational potential.

But the issues the team outlines extend to polygenic scoring for any of the complex conditions that embryos can be screened for in IVF clinics. These include a long list of diseases, such as diabetes, malignant melanoma and schizophrenia, as well as characteristics such as height and cognitive abilities. And they include more controversial potential uses of the scores (not currently offered) to predict offspring outcomes such as household income, happiness and skin color.

The Illusion of Easy Choices

A half-century of genetic research has identified common combinations of gene variants in people affected by many of the same complex diseases and traits. While a single variant related to, say, heart disease doesn’t statistically raise the risk of developing it, having dozens of these variants does. Genome-wide studies in the field compare DNA from thousands or millions of people who have and don’t have a particular disease or trait.

The polygenic score aggregates gene variants related to a particular disease or trait to assign a cumulative genetic propensity to either an embryo or a fully developed human.
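As a rough illustration of that aggregation, a polygenic score is commonly computed as a weighted sum of how many copies of each variant a person carries. The sketch below assumes that simple additive form. The variant names, weights and genotype are hypothetical and are not drawn from any study cited here.

```python
def polygenic_score(allele_counts, weights):
    """Sum each variant's allele count (0, 1 or 2) times its effect weight."""
    return sum(weights[v] * allele_counts.get(v, 0) for v in weights)

# Hypothetical effect weights per variant (illustrative only).
weights = {"variant_a": 0.25, "variant_b": -0.25, "variant_c": 0.125}

# Hypothetical genotype: copies of the counted allele at each variant.
embryo = {"variant_a": 2, "variant_b": 1, "variant_c": 1}

score = polygenic_score(embryo, weights)
print(score)
```

Real scores sum over hundreds or thousands of variants, but the arithmetic is the same: many small contributions added into one number.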

Recently, U.S.-based genotyping companies such as Orchid, MyOme and Genomic Prediction began marketing polygenic scores as tools for IVF clinics to expand the number of hereditary diseases they can identify in embryos. Global demand for preimplantation genetic testing is expected to reach about $1.15 billion by 2025, up from $531.7 million in 2018, according to a report by Adroit Market Research. Growth, the report says, will be driven by screenings for women waiting until later in life to give birth (which increases the risk of chromosomal disorders), as well as technological advances that allow testing for more and more diseases.

Even a fair and accurate description of the research behind polygenic scores can mislead IVF customers, the NEJM team finds. To illustrate, they point to findings in the Nature Genetics study on educational attainment. Benjamin was one of the senior authors on the team of some 70 co-authors on the paper, which has been cited widely as a model for applying polygenic scoring to behavioral or social traits. (Benjamin also is co-founder of the Social Science Genetic Association Consortium, which conducts many large-scale genetic studies on these traits.)

The study found that subjects scoring in the highest quintile of polygenic scores for educational attainment were about five times more likely to graduate from college than those in the lowest quintile.

The findings seem to imply that a couple applying polygenic scores for educational attainment to their own healthy embryos will find a clear-cut favorite. But the NEJM study gives multiple reasons that the data from genome-wide studies may not be of practical value to any one IVF couple selecting among their own embryos for the brightest scholar. Among the biggest reasons the study identifies:

  • The couple have only about a 3% chance of producing one viable embryo with a high-end score for educational attainment and another with a low-end score, according to the study. All of their healthy embryos are more likely to have scores that are much more similar.
  • The genetic variants that increase propensity for college completion on average also increase propensity for bipolar disorder, which complicates selection. Such side effects are common with any gene variant, and science has nowhere near identified all of the undesirable conditions that might come with desirable ones.
  • Whether a person actually completes college, dies from heart disease or manifests any complex condition is determined by an interaction of gene variants and environmental factors. Because today’s children cannot possibly be raised in the exact same conditions as the study subjects, their polygenic scores don’t hold the same predictive value as those in the research.
  • The research that developed polygenic scoring was conducted almost exclusively on people whose ancestors in recent generations come from Europe. When these scores are applied to parents with other ancestral backgrounds, their predictive accuracy can be substantially lower.

Is It Worth an Extra Inch of Height?

The research team wants IVF customers to understand how much (or little) the likelihood of the disease or trait changes if they use polygenic scoring to select an embryo, rather than randomly selecting one among their healthy candidates. The size of this difference is called absolute risk reduction, or expected gain.

For several reasons, absolute risk reduction achieved by applying scores can be far less impressive than the relative risk reductions that are emphasized in many studies about polygenic scores. For example, the authors of the NEJM study note that across different diseases, relative risk fell 15% to 80% for subjects of European ancestry, while the absolute risk reduction was only 0.12% to 8.5%.

This difference is particularly stark for less common diseases. For example, Benjamin and his colleagues calculate that for biological parents of European ancestry, scoring for Type 1 diabetes can provide a 35% reduction in risk relative to the U.S. population. But the average risk for Type 1 diabetes generally is only 0.34%, they note. By picking embryos with the best scores, the researchers calculate couples on average would lower the likelihood of producing offspring that develop the disease by about 0.12 percentage points.
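The arithmetic behind that gap can be sketched in a few lines. The example below assumes the absolute reduction is simply the baseline risk multiplied by the relative reduction, using the Type 1 diabetes figures quoted above. The study's own calculation may be more involved, and the function and variable names are illustrative only.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """Absolute reduction = baseline risk x relative reduction (both as fractions)."""
    return baseline_risk * relative_risk_reduction

baseline = 0.0034  # 0.34% average risk of Type 1 diabetes, as quoted in the text
relative = 0.35    # 35% relative risk reduction, as quoted in the text

arr = absolute_risk_reduction(baseline, relative)
arr_points = arr * 100  # convert a fraction into percentage points
print(f"{arr_points:.2f} percentage points")  # prints 0.12 percentage points
```

Because the baseline risk is so small, even a sizable relative reduction moves the absolute risk by only a fraction of a percentage point.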

In some cases, the study illustrates, the expected gain for any one family is so low that there is little or no practical advantage to scoring embryos before selection.

For example, based on the authors’ calculations, individuals with the highest polygenic scores for educational attainment have on average about 1.55 years more schooling than low-scoring subjects from other families. When turning the calculation to a batch of embryos produced by one couple of European ancestry, however, the expected gain from scoring drops to barely half a year of additional education. If the couple are of West African ancestry, the expected gain falls to less than three months.

Alternatively, consider a couple that hope their son will grow taller than his unusually short parents. The study finds that a couple choosing an embryo with a low polygenic score for idiopathic short stature can reduce that risk by 1.8%, which is considered significant in scientific terms. But the selection isn’t likely to produce a noticeable difference in the human being that develops, the study finds. At best, selection increases expected height by about 2.5 centimeters.

Translating the Research into Useful Information

The issues surrounding efficacy and ethics of polygenic scoring are difficult to communicate well, the researchers note, even to scientists and clinicians. They propose formal guidelines that would strictly regulate the way IVF clinics present the potential benefits of applying the scores in embryo selection.

Chief among their recommendations is a major change in presentation of research data that’s commonly used in marketing materials. Charts, tables and literature describing the predictive powers of polygenic scores are typically displayed in terms of relative risk reduction, as they are in many peer-reviewed studies, instead of the rather less impressive advantages patients are likely to achieve by applying the scores to their own offspring. The authors contend that the relative risk reduction statistics should never be presented in isolation.

Instead, they continue, the materials should focus on expected gains of using the scores versus random embryo selections, with predictions detailed by particular disease or trait and specific ancestries. They encourage caveats noting that even expected gains might not be meaningful in certain situations.

The researchers call for a nationwide discussion to clarify the ethics and misunderstandings surrounding polygenic scoring, especially in IVF settings, and to consider appropriate limitations and regulations. They find that merely promoting the research, even accurately, is not nearly enough disclosure.

About the Research

Turley, P., Meyer, M.N., Wang, N., Cesarini, D., Hammonds, E., Martin, A.R., Neale, B.M., Rehm, H.L., Wilkins-Haug, L., Benjamin, D.J., Hyman, S., Laibson, D., Visscher, P.M. Challenges with embryo selection using polygenic scores. New England Journal of Medicine 2021; 385:78-86. DOI: 10.1056/NEJMsr2105065

de Zeeuw, E.L., van Beijsterveldt, C.E., Glasner, et al. Polygenic scores associated with educational attainment in adults predict educational achievement and ADHD symptoms in children. American Journal of Medical Genetics Neuropsychiatric Genetics. 2014 Sep;165B(6):510-20.

Power, R., Steinberg, S., Bjornsdottir, G., et al. Polygenic risk scores for schizophrenia and bipolar disorder predict creativity. Nature Neuroscience 18, 953–955 (2015).

Rietveld, C.A., Cesarini, D., Benjamin, D.J., et al. (2013). Molecular genetics and subjective well-being. Proceedings of the National Academy of Sciences.

Lee, J.J., Wedow, R., Okbay, A., et al. (2018). Gene discovery and polygenic prediction from a 1.1-million-person GWAS of educational attainment. Nature Genetics.
