Stamatina Iliodromiti, Thomas W. Kelsey, Olivia Wu, Richard A. Anderson, Scott M. Nelson, The predictive accuracy of anti-Müllerian hormone for live birth after assisted conception: a systematic review and meta-analysis of the literature, Human Reproduction Update, Volume 20, Issue 4, July/August 2014, Pages 560–570, https://doi.org/10.1093/humupd/dmu003
Abstract
Anti-Müllerian hormone (AMH) is an established marker of ovarian reserve and a good predictor of poor or excessive ovarian response after controlled hyperstimulation. However, it is unclear whether it can predict the ultimate outcome of assisted conception, live birth. We undertook a systematic review and meta-analysis to examine whether AMH is a predictor of live birth in women undergoing assisted conception.
The study was conducted according to the PRISMA guidelines. PubMed, Embase, Medline, Web of Knowledge, the Cochrane trial register and unpublished literature were searched. Studies fulfilling the eligibility criteria were included in the systematic review and those with extractable data were included in the meta-analysis. Quality assessment was performed with the QUADAS 2 checklist. A summary estimate of the diagnostic odds ratio (DOR) was derived using the random effects model for binary data. A hierarchical summary receiver operating characteristic model provided pooled estimates before and after adjusting for age and AMH assay as covariates.
Out of 361 non-duplicate studies, 47 were selected; 17 met the eligibility criteria and 13 had extractable data and thus were included in the meta-analysis. Three of the 13 studies included only women with expected low ovarian reserve and were analysed separately from the remaining 10 to minimize heterogeneity. The DOR for women with unknown ovarian reserve (n = 5764 women) was 2.39 (95% confidence interval (CI): 1.85–3.08). After adjustment for age the DOR was little changed at 2.48 (95% CI: 1.81–3.22) and the DOR adjusted for AMH assay was almost identical at 2.42 (95% CI: 1.86–3.14). For women with expected low ovarian reserve (n = 542 women) the DOR was 4.63 (95% CI: 2.75–7.81).
AMH, independently of age, has some association with predicting live birth after assisted conception and may be helpful when counselling couples before undergoing fertility treatment. However, its predictive accuracy is poor.
Introduction
The age-related decline in oocyte quantity and quality (Nelson et al., 2013b) underpins the decline in success rates and prospect of live birth after assisted conception with advancing maternal age (Oudendijk et al., 2012). Age alone however is of limited accuracy in predicting live birth; thus there is a need for improved prediction for individualization of counselling. The substantial heterogeneity in the size of the ovarian reserve at any given age (Wallace and Kelsey, 2010) results in marked inter-individual variation in ovarian response despite optimal ovarian stimulation. Analysis of this heterogeneity may provide insights into understanding individual fertility and how it changes with age, and it is also a likely source of clinically useful biomarkers. A variety of ovarian reserve tests have been developed and their predictive capacity for ovarian response examined. In recent systematic reviews, individual patient data meta-analysis and international multicentre trials, anti-Müllerian hormone (AMH) has been confirmed as the current best biomarker for prediction of poor and excessive ovarian response (Broekmans et al., 2006; Broer et al., 2009, 2011, 2013; Nelson et al., 2009; Anckaert et al., 2012).
Given the strength of the relationships with oocyte yield, the association of AMH with pregnancy after assisted conception has been examined, but results were inconclusive. Some studies have concluded that AMH is not associated with pregnancy (Broekmans et al., 2006; van Rooij et al., 2006) while others have found a positive association (Nelson et al., 2007; Honnma et al., 2013). A recent individual patient data meta-analysis in 1008 patients undergoing fertility treatment demonstrated a weak association of AMH with ongoing pregnancy (Broer et al., 2013). As live birth is the ultimate outcome of assisted conception, clarification of whether AMH is predictive of this outcome is warranted.
The observational studies that have examined the association of AMH and live birth have either been small (Nelson et al., 2007; Lee et al., 2009; Majumder et al., 2010) or restricted to specific subpopulations of infertility patients (Gleicher et al., 2010; Arce et al., 2013). To provide an accurate estimate of the effect size we undertook a systematic review and meta-analysis of all eligible studies to examine whether AMH can predict live birth in women undergoing assisted conception.
Methods
This study was conducted according to the PRISMA guidelines (Liberati et al., 2009) and followed a structured protocol established among the authors prior to the start of the literature search.
Eligibility criteria
Studies were included if they met the following criteria: (i) the study population included women of reproductive age undergoing IVF or ICSI, with any stimulation protocol; (ii) serum AMH was measured in all study participants before ovarian stimulation; (iii) the clinical outcome of live birth was recorded for all participants; and (iv) the study had any design other than a case report. Thus, studies of intrauterine insemination were excluded, as were studies reporting on-going pregnancy rate but not live birth rate. Studies of follicular fluid AMH or oocyte donation programmes were also excluded. Our search included studies published up to August 2013, with no language restriction.
Search
The following electronic databases were searched: PubMed, Embase, Medline, Web of Knowledge and the Cochrane trial register. Search terms for live birth (MeSH; live birth, ongoing pregnancy, pregnancy) and key words ‘anti-müllerian hormone’, ‘müllerian-inhibiting substance’, or ‘müllerian-inhibiting factor’ were combined with a search filter for studies related to humans. The abstracts of all studies identified were screened by two researchers (S.I. and S.M.N.). Any studies including data on AMH and live birth or other assisted conception outcomes were read in full. The reference lists of the selected papers were hand-searched in order to identify potentially relevant papers. Grey literature was also searched via the opengrey website.
Study selection and data collection
Subsequently, two researchers (S.I. and S.M.N.) independently read and judged all selected articles. If a study fulfilled the eligibility criteria, it was included in the systematic review. A study was selected for the meta-analysis if it provided extractable information about an AMH cut-off associated with a higher or lower chance of live birth, or the sensitivity, specificity or area under the curve (AUC) of the receiver operating characteristic curve of AMH in predicting live birth. Any disagreement between the two researchers was resolved by discussion. If a study was selected for the systematic review but did not provide data that could be included in the meta-analysis, the authors were contacted via email. If the authors did not reply or the relevant information was not available, the study was not included in the meta-analysis. The exception was a study whose authors did not provide the requested information but whose data could be recovered from published plots with Plot digitizer, a software programme that extracts data from plots; in that case the article and the extracted data were used in the meta-analysis.
For each study the first author, year of publication, number of cycles, number of patients, stimulation protocol, mean/median age of the patients, suggested cut-off point of AMH (converted to ng/ml using the conversion 1 ng/ml = 7.14 pmol/l), AMH assay, number of live births among the patients with low or high serum AMH (below or above the cut-off point), study design and patient selection were extracted.
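The unit conversion applied to the reported cut-offs can be sketched as follows. This is a minimal illustration that reads the stated factor as 1 ng/ml = 7.14 pmol/l, which agrees with the paired thresholds in Table I (for example, 13 pmol/l and 1.82 ng/ml); the function and constant names are hypothetical.

```python
# Conversion factor quoted in the text: 1 ng/ml of AMH corresponds to 7.14 pmol/l.
PMOL_L_PER_NG_ML = 7.14


def pmol_l_to_ng_ml(value_pmol_l: float) -> float:
    """Convert an AMH concentration from pmol/l to ng/ml."""
    return value_pmol_l / PMOL_L_PER_NG_ML


def ng_ml_to_pmol_l(value_ng_ml: float) -> float:
    """Convert an AMH concentration from ng/ml to pmol/l."""
    return value_ng_ml * PMOL_L_PER_NG_ML


# Threshold of 13 pmol/l as listed for Arce et al. (2013) in Table I.
print(round(pmol_l_to_ng_ml(13.0), 2))
```

Applied to the thresholds in Table I, the same two helpers reproduce the paired values given there to the rounding used in the table.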
Quality appraisal of the selected studies
Each selected study was further assessed with the QUADAS 2 checklist, which evaluates the risk of bias and the applicability of primary diagnostic accuracy studies (Whiting et al., 2011). The checklist consists of four main domains, each rated for risk of bias (low, high or unclear): patient selection, index test, reference test and flow of each study.
We tried to minimize the risk of publication bias by a comprehensive search strategy which included unpublished (grey) literature. The risk of publication bias and potential small study effect was visually assessed by constructing a funnel plot which plots estimates of diagnostic accuracy against statistical precision (Sterne et al., 2011). In addition, we performed a linear regression of log diagnostic odds ratios on the inverse root of effective sample sizes as a test for funnel plot asymmetry, where a non-zero slope coefficient (P < 0.10) is suggestive of significant asymmetry and small study bias (Deeks et al., 2005).
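The asymmetry regression described above can be sketched in a few lines. This is an illustrative outline, not the analysis code used for the review: it assumes the effective sample size is 4·n1·n2/(n1 + n2) and that the regression is weighted by effective sample size, as we understand the approach of Deeks et al. (2005). All function names are hypothetical, and the P value for the slope (which requires a t distribution) is not computed.

```python
import math


def effective_sample_size(n_live_birth, n_no_live_birth):
    """Effective sample size (assumed form): 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_live_birth * n_no_live_birth / (n_live_birth + n_no_live_birth)


def deeks_regression(log_dors, n_live_birth, n_no_live_birth):
    """Weighted least-squares fit of ln(DOR) on 1/sqrt(ESS).

    Returns (intercept, slope). A slope far from zero suggests funnel
    plot asymmetry; judging significance needs a separate t test.
    """
    ess = [effective_sample_size(a, b)
           for a, b in zip(n_live_birth, n_no_live_birth)]
    x = [1 / math.sqrt(e) for e in ess]
    w_sum = sum(ess)
    # Weighted means of predictor and outcome (weights = ESS).
    x_bar = sum(w * xi for w, xi in zip(ess, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(ess, log_dors)) / w_sum
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(ess, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(ess, x, log_dors))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical studies (NOT data from the included papers).
example_log_dors = [0.9, 1.1, 0.7, 1.4]
example_n_live_birth = [40, 90, 250, 30]
example_n_no_live_birth = [60, 210, 400, 90]
print(deeks_regression(example_log_dors, example_n_live_birth,
                       example_n_no_live_birth))
```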
Data synthesis and analysis
The statistical analysis was undertaken using Stata/SE (version 12.1, Stata Corp, USA) and SAS/STAT software. The numbers of live births among the study participants with AMH below and above a cut-off point were pooled using the random effects model for binary data, which provided a summary estimate of the diagnostic odds ratio (DOR) and 95% confidence intervals (CI). The DOR summarizes the diagnostic accuracy of the AMH tests and can take values from 0 to infinity; a value of 1 indicates that the test does not discriminate between women with and without a live birth. It expresses the odds of detecting AMH above a cut-off point (positive test result) among women with live births relative to the odds of detecting AMH above the cut-off among women without live births (Glas et al., 2003).
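As an illustration of the quantities described above, the sketch below computes a DOR with a 95% CI from a single 2 × 2 table and pools log DORs across studies. It is not the Stata or SAS code used for the review: the DerSimonian-Laird estimator is an assumption (the text specifies only a random effects model for binary data), the CI uses the standard log (Woolf) method, and all counts and function names are hypothetical.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR and 95% CI from a 2 x 2 table.

    tp: AMH above cut-off, live birth     fp: AMH above cut-off, no live birth
    fn: AMH below cut-off, live birth     tn: AMH below cut-off, no live birth
    """
    if 0 in (tp, fp, fn, tn):
        # Continuity correction so the log odds ratio is defined.
        tp, fp, fn, tn = (v + 0.5 for v in (tp, fp, fn, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return (math.exp(log_dor),
            math.exp(log_dor - 1.96 * se),
            math.exp(log_dor + 1.96 * se))


def pool_random_effects(log_effects, variances):
    """DerSimonian-Laird pooling (assumed estimator) on the log scale.

    Returns (pooled log effect, its standard error, tau squared).
    """
    w = [1 / v for v in variances]
    w_sum = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / w_sum
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = w_sum - sum(wi ** 2 for wi in w) / w_sum
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2


# Hypothetical 2 x 2 table (NOT taken from any included study).
dor, lower, upper = diagnostic_odds_ratio(tp=80, fp=120, fn=20, tn=80)
print(f"DOR {dor:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Pooling is done on the log scale because the log DOR is approximately normally distributed; the pooled estimate and its CI limits are then exponentiated to return to the DOR scale.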
Heterogeneity resulting from true diagnostic accuracy not being identical in each study was quantified by the I-squared measure (Higgins et al., 2003). Sensitivity analysis was performed for studies with similar study populations (women with unknown ovarian reserve versus women with reduced ovarian reserve) to assess whether the DOR varies according to patients' characteristics. In addition, the summary ROC curve, sensitivity, specificity, and positive and negative likelihood ratios of AMH for predicting live birth were generated. This was conducted by fitting a two-level mixed logistic regression model, with independent binomial distributions for the true positives and true negatives restricted to the sensitivity and specificity in each study, and a bivariate normal model for the logit transforms of sensitivity and specificity between studies (Rutter and Gatsonis 2001; Reitsma et al., 2005; Harbord et al., 2007). In addition, the hierarchical model estimated the characteristics of the ROC curve and the adjusted DOR after including age and AMH assay as covariates (Takwoingi and Deeks, 2010).
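For readers unfamiliar with the I-squared measure, a minimal sketch of its calculation from Cochran's Q follows, using the usual definition I² = max(0, (Q − df)/Q) × 100%. The inputs are study-level log DORs and their variances; the function names and the example values are hypothetical and are not the estimates from the included studies.

```python
def cochran_q(log_effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))


def i_squared(log_effects, variances):
    """Percentage of total variation attributable to between-study heterogeneity."""
    q = cochran_q(log_effects, variances)
    df = len(log_effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical log DORs and variances for three studies.
print(i_squared([0.6, 1.1, 0.9], [0.05, 0.08, 0.04]))
```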
Ethics
Formal ethics approval was not required because this analysis consists of pooling of published studies.
Results
Search results
The systematic search of the biomedical databases produced 595 hits; after excluding duplicates 361 citations were identified (Fig. 1). No unpublished literature (open grey website or hand searching of references) meeting the search criteria was identified. After excluding articles based on the title or abstract, 47 articles were assessed fully for eligibility. Thirty studies were excluded with a reason recorded (Supplementary data, Table SI); thus 17 studies were selected for the systematic review (Nelson et al., 2007; Lee et al., 2009; Gleicher et al., 2010; Majumder et al., 2010; Wang et al., 2010; Friden et al., 2011; La Marca et al., 2011; Weghofer et al., 2011; Grzegorczyk-Martin et al., 2012; Arce et al., 2013; Brodin et al., 2013; Khader et al., 2013; Li et al., 2013; Lin et al., 2013; Lukaszuk et al., 2013; Merhi et al., 2013; Mutlu et al., 2013). Four of these were excluded from the meta-analysis because extraction of relevant data was not possible even after contacting the authors (Wang et al., 2010; Weghofer et al., 2011; Lin et al., 2013; Mutlu et al., 2013); a subgroup of a fifth study was excluded for the same reason (Lee et al., 2009). Most of the data from one of the excluded studies (Weghofer et al., 2011) were included in a previous study from the same research group (Gleicher et al., 2010) which contributed to the meta-analysis. One of the studies was in French (Grzegorczyk-Martin et al., 2012). The characteristics of the studies included in the meta-analysis are listed in Table I.
Study | Population | AMH test | Outcome |
---|---|---|---|
a. Characteristics of studies of women undergoing IVF/ICSI | |||
Li et al. (2013) | n = 1026 consecutive women undergoing their first cycle of IVF/ICSI from 2007 to 2009, stimulated with either the long GnRH agonist protocol or the GnRH antagonist protocol. Egg donation and preimplantation cycles were excluded. Study subjects had a median age of 35 years (IQR, 33–38) and various aetiologies of infertility (male infertility was the most common, followed by tubal, endometriosis, anovulation and unexplained). Retrospective cohort study design. | Gen II assay. No threshold was used; sensitivity and specificity were extracted from a graph (Plot digitizer software) | Live birth after the first fresh cycle; no frozen cycles were included in the analysis. n = 383 achieved a live birth. n = 643 did not have a live birth after a fresh IVF cycle |
Arce et al. (2013) | n = 749 women aged 21–34 years with either male or unexplained infertility stimulated with the GnRH antagonist protocol. Women with polycystic ovary syndrome (PCOS), endometriosis and previous poor response were excluded. Women had FSH 1–12 IU/l and antral follicle count ≥10. It was a secondary analysis of data prospectively collected in a randomized, assessor-blinded trial. | Gen II assay. Threshold ≥13 pmol/l (1.82 ng/ml) was used | n = 291 live births after the first fresh cycle |
Lukaszuk et al. (2013) | n = 619 women (of 2495) had anti-Müllerian hormone (AMH) measured and underwent their first ICSI from 2005 to 2009. Median age was 32 years (IQR, 29–34), with male infertility, anovulation, tubal factor, endometriosis and unexplained infertility. They underwent pituitary suppression with a GnRH agonist protocol. Retrospective observational study. | DSL ELISA kit. Threshold ≥1.9 ng/ml (13.6 pmol/l) was used | n = 39 women with AMH below the threshold had a live birth. n = 252 women with AMH above the threshold had a live birth |
Brodin et al. (2013) | n = 892 consecutive women with a maximum age of 42 years (median 36 years) underwent 1230 cycles of IVF/ICSI stimulated by the long GnRH agonist protocol. Aetiology of infertility included male factor, unexplained, tubal factor and endometriosis. Prospective data collection was conducted. | DSL ELISA kit. Threshold ≥0.84 ng/ml (6 pmol/l) was used | n = 33 women with AMH below the threshold had a live birth (per stimulated cycle). n = 222 women with AMH above the threshold had a live birth (per stimulated cycle) |
Khader et al. (2013) | n = 822 women aged 25–42 years who had their first IVF/ICSI cycle from 2006 to 2010. They underwent a GnRH agonist or antagonist protocol and the causes of infertility included male factor, anovulation, tubal disease, endometriosis and unexplained infertility. Data were collected prospectively in a registry and it was a retrospective cohort study. | DSL ELISA kit. Threshold ≥0.73 ng/ml (5.2 pmol/l) was used | n = 5 women with AMH below the threshold had a live birth. n = 237 women with AMH above the threshold had a live birth |
Grzegorczyk-Martin et al. (2012) | n = 704 women attending for their first IVF/ICSI cycle from 2006 to 2009 who underwent a long GnRH agonist protocol. Women with PCOS, with one ovary or older than 43 years of age were excluded. Women with FSH ≥10 IU/l and AMH >2 ng/ml were also excluded by the authors. We further excluded from the analysis women with FSH ≥10 IU/l and AMH ≤2 ng/ml (n = 54, so n = 650 included in the meta-analysis) because they were more likely to be poor responders. Retrospective cohort study. | IBC ELISA kit. Threshold ≥2 ng/ml was used | n = 19 women with AMH below the threshold had a live birth. n = 100 women with AMH above the threshold had a live birth |
La Marca et al. (2011) | n = 381 women attended for their first IVF/ICSI cycle from 2005 to 2008, aged up to 42 years (mean ± SD, 34.8 ± 4.8 years) and were stimulated by a long GnRH protocol. Couples with severe male infertility (sperm count <1 × 10^6/ml or normal forms <5%) or women with systemic diseases were excluded from the study. Data were collected prospectively in a registry and it was a retrospective observational study. | IBC ELISA kit. Threshold ≥0.4 ng/ml was used | n = 3 women with AMH below the threshold had a live birth. n = 98 women with AMH above the threshold had a live birth |
Majumder et al. (2010) | n = 162 women undergoing their first IVF/ICSI cycle from 2005 to 2006 and aged 23 to 39 years (mean ± SD, 31.8 ± 3.79). They underwent a long GnRH agonist protocol for the common causes of infertility. It was a prospective observational study. | DSL ELISA kit. Threshold ≥19.3 pmol/l (2.7 ng/ml) had 65.8% sensitivity and 54.8% specificity for predicting live birth (per embryo transfer) | n = 38 had a live birth (out of 137 women having had an embryo transfer) |
Lee et al. (2009) | n = 336 women underwent their first IVF/ICSI cycle from March 2007 to December 2007. They underwent a long GnRH agonist protocol for male factor, tubal factor, other female factor or unexplained infertility. Women with PCOS were excluded. n = 213 were under 35 years of age (mean ± SD, 30.8 ± 0.2) and n = 123 were ≥35 years of age (mean ± SD, 38.6 ± 0.2). It was a prospective cohort study. | DSL ELISA kit. Threshold ≥1.68 ng/ml (12 pmol/l) had 60% sensitivity and 66.2% specificity for predicting live birth (per embryo transfer) in women ≥35 years | n = 40 women ≥35 years had a live birth (out of 114 women having had an embryo transfer). There were no extractable data for women <35 years |
Nelson et al. (2007) | n = 340 consecutive patients undergoing their first stimulated cycle with a GnRH agonist protocol. The patients had a median age of 34 years (IQR: 31–37 years). Prospective cohort study. | DSL ELISA kit. Threshold ≥7.8 pmol/l (1.1 ng/ml) | n = 93 women achieved a live birth |
b. Characteristics of studies of women with expected low ovarian reserve (LOR) undergoing IVF/ICSI | |||
Merhi et al. (2013) | n = 120 women in a historic cohort with LOR underwent IVF/ICSI with a GnRH agonist protocol from January 2008 to June 2013. The participants were over 35 years of age and LOR was defined as FSH ≥10 IU/l. Participants were subdivided into three groups based on AMH levels, with mean ages ± SD of 41.2 ± 3.3, 39.3 ± 3 and 40.2 ± 2.8 years. | DSL ELISA kit. Threshold ≥0.8 ng/ml (5.7 pmol/l) | n = 9 women achieved a live birth |
Friden et al. (2011) | n = 127 women aged 39–46 years undergoing their first stimulated cycle from November 2006 to December 2008. They underwent standard antagonist or agonist protocols. There is no information regarding the study design. | DSL ELISA kit. Threshold ≥8.6 pmol/l (1.2 ng/ml) | n = 6 women with AMH below the threshold had a live birth. n = 8 women with AMH above the threshold had a live birth |
Gleicher et al. (2010) | n = 295 women with LOR underwent 507 cycles and were stimulated with dehydroepiandrosterone (DHEA) and microdose agonist. LOR was defined arbitrarily as FSH ≥10 IU/l and/or abnormally low age-specific AMH. n = 174 women had AMH ≤1.05 ng/ml and a mean age of 39.6 years (SD: 4.6). n = 121 women had AMH >1.05 ng/ml and a mean age of 35.2 years (SD: 5.4). | DSL ELISA kit. Threshold >1.05 ng/ml (7.5 pmol/l) had a sensitivity of 73.6% and specificity of 67.4% in predicting live birth per stimulated cycle | n = 43 women had a live birth (507 cycles) (data extracted with the Plot digitizer software) |
PCOS, polycystic ovary syndrome; DSL, Diagnostic Systems Laboratories; IBC, Immunotech-Beckman Coulter.
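Where a study in Table I reports only sensitivity and specificity at a threshold, the likelihood ratios and the DOR follow directly, since DOR = LR+ / LR−. The sketch below illustrates this with the values listed for Majumder et al. (2010) (65.8% sensitivity, 54.8% specificity); the function name is hypothetical, and the calculation is shown for illustration only, not as the extraction procedure used in the review.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair (as proportions)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Sensitivity and specificity as listed for Majumder et al. (2010) in Table I.
lr_pos, lr_neg, dor = likelihood_ratios(0.658, 0.548)
print(f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}, DOR {dor:.2f}")
```

For this threshold the resulting DOR is about 2.3.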
Study . | Population . | AMH test . | Outcome . |
---|---|---|---|
a. Characteristics of studies of women undergoing IVF/ICSI | |||
Li et al. (2013) | n = 1026 consecutive women undergoing their first cycle of IVF/ICSI from 2007 to 2009 and were stimulated with either the long GnRH agonist protocol or the GnRH antagonist protocol. Egg donation and preimplantation cycles were excluded. Study subjects had a median age of 35 years (IQ, 33–38) and had various aetiologies of infertility (male infertility was the majority, followed by tubal, endometriosis, anovulation and unexplained). Retrospective cohort study design. | Gen II Assay was used No threshold was used and sensitivity and specificity were extracted (Plot digitizer software) from graph | Live birth after the first fresh cycle, no frozen cycles were included in the analysis. n = 383 achieved a live birth. n = 643 did not have a live birth after a fresh IVF cycle |
Arce et al. (2013) | n = 749 women aged 21–34 years with either male or unexplained infertility stimulated with the GnRH antagonist protocol. Excluded women with polycystic ovary syndrome (PCOS), endometriosis and previous poor response. Women had FSH 1–12 IU/l and antral follicle count ≥10. It was a secondary analysis of data prospectively collected in a randomized assessor blinded trial. | Gen II Assay was used. Threshold ≥13 pmol/l (1.82 ng/ml) was used | n = 291 live birth after the first fresh cycle |
Lukaszuk et al. (2013) | n = 619 women (of 2495) had anti-Müllerian hormone (AMH) measured and underwent their first ICSI from 2005 to 2009. Median age was 32 (IQ, 29–34) with male infertility, anovulation, tubal factor, endometriosis and unexplained infertility. They underwent pituitary suppression with GnRH agonist protocol. Retrospective observational study. | DSL ELISA kit Threshold ≥1.9 ng/ml (13.6 pmol/l) was used | n = 39 women with AMH below the threshold had a live birth. n = 252 women with AMH above the threshold had a live birth |
Brodin et al. (2013) | n = 892 consecutive women with a maximum age of 42 years (median 36 years) underwent 1230 cycles of IVF/ICSI stimulated by the long GnRH agonist protocol. Aetiology of infertility included male factor, unexplained, tubal factor and endometriosis. Prospective data collection was conducted. | DSL ELISA kit Threshold ≥0.84 ng/ml (6 pmol/l) was used | n = 33 women with AMH below the threshold had a live birth (per stimulated cycle). n = 222 women with AMH above the threshold had a live birth (per stimulated cycle) |
Khader et al. (2013) | n = 822 women aged 25–42 years and had had their first IVF/ICSI cycle from 2006 to 2010. They underwent a GnRH agonist or antagonist protocol and the causes of infertility included male factor, anovulation, tubal disease, endometriosis and unexplained infertility. Data were collected prospectively in a registry and it was a retrospective cohort study. | DSL ELISA kit Threshold ≥0.73 ng/ml (5.2 pmol/l) was used | n = 5 women with AMH below the threshold had a live birth. n = 237 women with AMH above the threshold had a live birth |
Grzegorczyk-Martin et al. (2012) | n = 704 women attending for their first IVF/ICSI study from 2006 to 2009 and underwent a long GnRH agonist protocol. Women with PCOS, one ovary, older than 43 years of age were excluded. Also, women with FSH ≥10 IU/l and AMH >2 ng/ml were excluded from the authors. We further excluded from the analysis women with FSH ≥10 IU/l and AMH ≤2 ng/ml (n = 54, so n = 650 included in the meta-analysis) because they were more likely to be poor responders. Retrospective cohort study. | IBC ELISA kit Threshold ≥2 ng/ml was used | n = 19 women with AMH below the threshold had a live birth. n = 100 women with AMH above the threshold had a live birth |
La Marca et al. (2011) | n = 381 women attended for their first IVF/ICSI cycle from 2005 to 2008, aged up to 42 years (mean ± SD, 34.8 ± 4.8 years) and were stimulated by a long GnRH protocol. Couples with severe male infertility (sperm count <1 × 106/ml or normal forms <5%) or women with systematic diseases were excluded from the study. Data were collected prospectively in a registry and it was a retrospective observational study. | IBC ELISA kit Threshold ≥0.4 ng/ml was used | n = 3 women with AMH below the threshold had a live birth. n = 98 women with AMH above the threshold had a live birth |
Majumder et al. (2010) | n = 162 women undergoing their first IVF/ICSI cycle from 2005 to 2006 and aged 23 to 39 years (Mean ± SD, 31.8 ± 3.79). They underwent a long GnRH agonist protocol for the common causes of infertility. It was a prospective observational study. | DSL ELISA kit Threshold ≥19.3 pmol/l (2.7 ng/ml) had 65.8% sensitivity and 54.8% specificity of predicting live birth (per embryo transfer) | n = 38 had a live birth (out of 137 women having had an embryo transfer) |
Lee et al. (2009) | n = 336 women underwent their first IVF/ICSI cycle from March 2007 to December 2007. They underwent a long GnRH agonist protocol for male factor, tubal factor, other female factor or unexplained infertility. Women with PCOS were excluded. n = 213 were under 35 years of age (mean ± SD, 30.8 ± 0.2) and n = 123 were ≥35 years of age (mean ± SD, 38.6 ± 0.2). It was a prospective cohort study. | DSL ELISA kit Threshold ≥1.68 ng/ml (12 pmol/l) had 60% sensitivity and 66.2% specificity of predicting live birth (per embryo transfer) in women ≥35 years | n = 40 women ≥35 years had a live birth (out of 114 women having had an embryo transfer). There were no extractable data for women <35 years |
Nelson et al. (2007) | n = 340 consecutive patients undergoing their first stimulated cycle with a GnRH agonist protocol. The patients had a median age of 34 years (IQ: 31–37 years). Prospective cohort study. | DSL ELISA kit Threshold ≥7.8 pmol/l (1.1 ng/ml) | n = 93 women achieved a live birth |
b. Characteristics of studies of women with expected low ovarian reserve (LOR) undergoing IVF/ICSI | |||
Merhi et al. (2013) | n = 120 historic cohort with LOR underwent IVF/ICSI with a GnRH agonist protocol from January 2008 to June 2013. The participants were over 35 years of age and LOR was defined as FSH ≥10 IU/l. Participants were subdivided into three groups based on AMH levels, with mean ages ± SD of 41.2 ± 3.3, 39.3 ± 3 and 40.2 ± 2.8 years. | DSL ELISA kit Threshold ≥0.8 ng/ml (5.7 pmol/l) | n = 9 women achieved a live birth |
Friden et al. (2011) | n = 127 women aged 39–46 years undergoing their first stimulated cycle from November 2006 to December 2008. They underwent standard antagonist or agonist protocols. There is no information regarding the study design. | DSL ELISA kit Threshold ≥8.6 pmol/l (1.2 ng/ml) | n = 6 women with AMH below the threshold had a live birth. n = 8 women with AMH above the threshold had a live birth |
Gleicher et al. (2010) | n = 295 women with LOR underwent 507 cycles and were stimulated with dehydroepiandrosterone (DHEA) and microdose agonist. LOR was defined arbitrarily as FSH ≥10 IU/l and/or abnormally low age-specific AMH. n = 174 women had AMH ≤1.05 ng/ml and a mean age of 39.6 years (SD: 4.6). n = 121 women had AMH >1.05 ng/ml and a mean age of 35.2 years (SD: 5.4). | DSL ELISA kit Threshold >1.05 ng/ml (7.5 pmol/l) had a sensitivity of 73.6% and specificity of 67.4% in predicting live birth per stimulated cycle. | n = 43 women had a live birth (507 cycles) (data extracted with the Plot digitizer software) |
PCOS, polycystic ovary syndrome; DSL, Diagnostic Systems Laboratories; IBC, Immunotech-Beckman Coulter.
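The paired thresholds in the table (ng/ml alongside pmol/l) can be cross-checked with a simple unit conversion. The sketch below assumes the commonly used factor of 1 ng/ml ≈ 7.14 pmol/l for AMH. That factor is not stated in the text but reproduces the reported pairs, and the function names are illustrative only.

```python
# Assumption: 1 ng/ml of AMH corresponds to roughly 7.14 pmol/l.
# The factor is not quoted in the article; it is the value implied by
# the paired thresholds reported in the table.
PMOL_PER_NG = 7.14


def ng_ml_to_pmol_l(ng_ml: float) -> float:
    """Convert an AMH concentration from ng/ml to pmol/l."""
    return ng_ml * PMOL_PER_NG


def pmol_l_to_ng_ml(pmol_l: float) -> float:
    """Convert an AMH concentration from pmol/l to ng/ml."""
    return pmol_l / PMOL_PER_NG


# Thresholds as reported in the table: (ng/ml, pmol/l)
reported_pairs = [(0.84, 6.0), (0.73, 5.2), (1.68, 12.0), (0.8, 5.7), (1.05, 7.5)]

for ng_ml, pmol_l in reported_pairs:
    converted = ng_ml_to_pmol_l(ng_ml)
    print(f"{ng_ml} ng/ml -> {converted:.1f} pmol/l (table: {pmol_l})")
```

Rounded to one decimal place, each converted value agrees with the pmol/l figure given in the table.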
Pooled estimates
We present data of 6856 cycles (6306 women) undergoing IVF or ICSI. The studies were categorized into those including only women with expected low ovarian reserve (n = 542 women) and those with women with unknown ovarian reserve (n = 5764) to minimize the between study heterogeneity. The pooled DOR for AMH predicting a live birth among women with unknown ovarian reserve who present to a fertility clinic was 2.39 (95% CI: 1.85–3.08) (Fig. 2a). The estimated I-squared was 58.9%, suggesting moderate heterogeneity between the studies. After adjustment for age, the DOR obtained by a hierarchical logistic regression model analysis was very similar at 2.48 (95% CI: 1.81–3.22). After adjusting for the assay used to measure AMH, the DOR was almost identical at 2.42 (95% CI: 1.86–3.14). The DOR for women with expected low ovarian reserve was 4.63 (95% CI: 2.75–7.81) (I-squared 0%), with wider CIs due to the small number of pooled studies (Fig. 2b). Hierarchical logistic regression analysis was not conducted for this subgroup of studies because of the small sample size (n = 3 studies). Since the 95% CIs of the DOR of the sensitivity analysis overlapped, the pooled DOR for all studies was estimated at 2.63 (95% CI: 2.02–3.40) with estimated I-squared of 62.1%. After adjustment for age the DOR for all studies was 2.67 (95% CI: 2.06–3.48). After including AMH assay as a covariate, the adjusted DOR obtained by the hierarchical model was 2.66 (95% CI: 2.06–3.43).
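To illustrate how a pooled DOR and an I-squared value of this kind can be obtained from per-study 2 × 2 tables, the following is a minimal sketch of inverse-variance random-effects pooling on the log DOR scale. It assumes the DerSimonian-Laird estimator and a 0.5 continuity correction added to every cell, neither of which is specified in the text, and the counts are hypothetical rather than data from the included studies.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and its variance for one 2 x 2 table.

    Assumption: a 0.5 continuity correction is added to every cell.
    """
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance


def pool_dor_random_effects(tables):
    """DerSimonian-Laird random-effects pooling of log DORs (an assumed method)."""
    estimates, variances = zip(*(log_dor(*t) for t in tables))
    weights = [1 / v for v in variances]

    # Fixed-effect estimate and Cochran's Q
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(tables) - 1

    # Between-study variance (tau squared), truncated at zero
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate and 95% CI
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))

    # I-squared: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "dor": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical (TP, FP, FN, TN) counts, for illustration only
example_tables = [(80, 120, 20, 60), (150, 200, 30, 90), (40, 70, 15, 50)]
result = pool_dor_random_effects(example_tables)
print(result)
```

A DOR above 1 with a confidence interval excluding 1 indicates discrimination better than chance. It says nothing by itself about whether the test is accurate enough to be clinically useful.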
For the prediction of live birth in women with unknown ovarian reserve (n = 10 studies), the hierarchical summary receiver operating characteristic (HSROC) curve, along with study-specific estimates, is plotted in Fig. 3. The parameters of the plot did not change substantially after including age or AMH assay as covariates (data not shown). The summary ROC curve and its 95% CIs do not cross the line of no-discrimination. The AUC was 0.61 (CI 0.56–0.65). The overall summary estimates of the above 10 studies for serum AMH and live birth were a sensitivity of 83.7% (95% CI: 72.5–90.9%) and a specificity of 32.0% (95% CI: 21.6–44.6%); after adjustment for age, the sensitivity was 83.7% (95% CI: 72.5–90.1%) and the specificity 32.6% (95% CI: 21.8–45.5%). However, caution is required in interpreting the summary sensitivity and specificity, as the pooled studies have similar but not identical AMH thresholds (Table I), and summary sensitivity and specificity can vary according to the threshold used. The age-adjusted positive likelihood ratio was 1.24 (95% CI: 1.14–1.36) and the negative likelihood ratio 0.5 (95% CI: 0.39–0.65).
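The likelihood ratios follow from the summary sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity, and their ratio gives the DOR. The short sketch below applies these definitions to the age-adjusted point estimates quoted above. The function name is illustrative, and the confidence intervals are not reproduced because they depend on the hierarchical model.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    dor = lr_pos / lr_neg
    return lr_pos, lr_neg, dor


# Age-adjusted summary point estimates from the text
lr_pos, lr_neg, dor = likelihood_ratios(sensitivity=0.837, specificity=0.326)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
# prints "LR+ = 1.24, LR- = 0.50, DOR = 2.48"
```

These values agree with the reported likelihood ratios and with the age-adjusted DOR of 2.48 for women with unknown ovarian reserve.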
Study quality assessment and publication bias
The quality assessment of the selected studies is represented as the percentage of high, low or unclear bias in each domain assessed by the QUADAS 2 tool (Supplementary data, Fig. S1). Most studies reported live birth per cycle (or per patient if they included solely the first stimulated cycle; Nelson et al., 2007; Lee et al., 2009; Majumder et al., 2010; La Marca et al., 2011; Grzegorczyk-Martin et al., 2012; Khader et al., 2013), one study reported live birth per ovum retrieval (Friden et al., 2011) and two reported the cumulative live birth rate (Arce et al., 2013; Li et al., 2013). The majority of the studies measured AMH using the Diagnostic Systems Laboratories Inc. (DSL, Webster, TX, USA) assay (Nelson et al., 2007; Lee et al., 2009; Gleicher et al., 2010; Majumder et al., 2010; Friden et al., 2011; Brodin et al., 2013; Khader et al., 2013; Lukaszuk et al., 2013; Merhi et al., 2013), with the remainder using the Immunotech-Beckman Coulter (IBC, Marseille, France) assay (La Marca et al., 2011; Grzegorczyk-Martin et al., 2012) or the Beckman Coulter Generation II assay (Arce et al., 2013; Li et al., 2013). With respect to potential selection bias, two studies excluded women with polycystic ovary syndrome (La Marca et al., 2011; Grzegorczyk-Martin et al., 2012), one excluded couples with severe male factor infertility (La Marca et al., 2011) and some studies included only women with low expected ovarian reserve, defined by advanced age and/or high FSH and/or low AMH (Gleicher et al., 2010; Friden et al., 2011; Weghofer et al., 2011; Merhi et al., 2013).
The funnel plot (Supplementary data, Fig. S2) visually suggests asymmetry, raising the possibility that studies with small sample size and results lacking statistical significance may be missing; however, the statistical test for funnel plot asymmetry did not reach statistical significance (P = 0.25).
Discussion
This meta-analysis of 6306 women suggests that AMH has some association with predicting live birth in women undergoing IVF; however, its predictive accuracy is poor. The pooled DOR among 5764 women with unknown ovarian reserve was 2.48 after adjusting for age. In the remaining women with expected low ovarian reserve (n = 542), AMH had a better, albeit still small, predictive effect (DOR = 4.63); however, this needs to be substantiated in larger studies. Although the DOR was greater than unity in all the pooled studies, it was consistently low; it is known that useful tests with good predictive accuracy tend to have DOR above 20 (Fischer et al., 2003). The HSROC model and 95% CIs of the pooled data did not cross the no-discrimination line, indicating that AMH has some value in predicting live birth. In addition, the prediction 95% confidence region, which suggests the confidence region for a forecast of the true specificity and sensitivity in a future study, includes the line of no-discrimination, if only marginally; this raises the possibility of a positive predictive value of AMH being found in future studies. However, it is established that tests with likelihood ratios ranging from 0.33 to 3 rarely change clinical decisions (Jaeschke et al., 1994); therefore, the small positive (1.24) and the large negative likelihood ratio (0.5) found in our study indicate that AMH alone is unlikely to alter a clinical decision based on the chance of live birth after IVF/ICSI.
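To make concrete why likelihood ratios this close to 1 are unlikely to change a clinical decision, the sketch below converts a pre-test probability of live birth into post-test probabilities using Bayes' theorem in odds form. The 30% pre-test probability is a hypothetical value chosen purely for illustration and is not taken from the pooled studies.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Update a probability with a likelihood ratio (Bayes' theorem in odds form)."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical pre-test chance of live birth (an assumption, not study data)
pre_test = 0.30

# Likelihood ratios reported in the text
after_high_amh = post_test_probability(pre_test, 1.24)  # AMH above threshold
after_low_amh = post_test_probability(pre_test, 0.5)    # AMH below threshold

print(f"Pre-test probability:      {pre_test:.1%}")
print(f"After AMH above threshold: {after_high_amh:.1%}")
print(f"After AMH below threshold: {after_low_amh:.1%}")
```

Under this assumed starting point, an AMH result above the threshold raises the estimated chance of live birth by only a few percentage points, and a result below the threshold still leaves a substantial chance of success.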
The potential value of ovarian reserve tests, including AMH, in predicting the likelihood of pregnancy after assisted conception has been contentious. Initial meta-analysis and more recent individual patient data meta-analysis did not demonstrate an association (Broekmans et al., 2006; Broer et al., 2013). This was despite ovarian reserve tests, and in particular AMH, being strongly associated with ovarian response and oocyte yield (Broer et al., 2009, 2011, 2013), which is a known major determinant of live birth (Sunkara et al., 2011). The current meta-analysis has used a solid methodology by pooling a large number of studies with the outcome of live birth rather than pregnancy, so the conclusions should not be viewed as contradictory but supplementary.
AMH may not be associated with a positive pregnancy test or ongoing pregnancy, with the small association with live birth only becoming apparent after non-continuing pregnancies are lost. Although we did not test the value of AMH in predicting non-continuing pregnancies or ongoing pregnancies in the current review, we could consider this as a possible biological explanation. This would suggest that AMH is not only a marker of ovarian response and oocyte quantity (La Marca et al., 2010; Broer et al., 2011, 2013; Arce et al., 2013), but also that it has a (limited) association with oocyte quality. The decline in oocyte quality with increasing age is well established (Nybo Andersen et al., 2000), but the literature on ovarian reserve tests and oocyte quality is inconsistent. The positive association of AMH with increasing cumulative live birth rate is attributed to the greater availability of oocytes and embryos and not better oocyte quality by some (Arce et al., 2013) but was found to be independent of age and oocyte yield by others (Brodin et al., 2013). Several studies have not observed an association between AMH and oocyte or embryo quality (Smeenk et al., 2007; Lie Fong et al., 2008; Guerif et al., 2009; Mashiach et al., 2010; Riggs et al., 2011; Anckaert et al., 2012; Kedem-Dickman et al., 2012; Arce et al., 2013), whereas others report a positive association (Ebner et al., 2006; Majumder et al., 2010; Irez et al., 2011; Brodin et al., 2013; Lin et al., 2013). As assessment of oocyte and embryo quality has largely focused on morphology rather than objective measures of euploid status and developmental potential (Plante et al., 2010; Kline et al., 2011), the alternative outcome of a live birth has been used by some researchers to assess oocyte quality. 
Many of these studies were small (Ebner et al., 2006; Smeenk et al., 2007; Lie Fong et al., 2008; Majumder et al., 2010; Mashiach et al., 2010; Irez et al., 2011; Riggs et al., 2011; Kedem-Dickman et al., 2012; Lin et al., 2013), raising the possibility of a beta (false negative) error (Riggs et al., 2011). The present analysis indicates that AMH has some value, albeit with poor accuracy, in predicting live birth, and that this relationship is independent of age or AMH assay. While most circulating AMH derives from small antral follicles (Weenen et al., 2004; Jeppesen et al., 2013), AMH expression persists in the cumulus cells surrounding the oocyte at the time of ovulation (Salmon et al., 2004), providing a potential basis for a relationship with oocyte quality distinct from its relationship with follicle number.
The included studies in our analysis reported AMH according to the IBC, the DSL or the Beckman Coulter Generation II assay. Although the IBC and DSL assays differ both in their pairs of monoclonal antibodies and standardization so do not give comparable values, the conversion formula of the DSL assay data into IBC values of 2.02 * DSL = IBC has been used consistently for data aggregation studies (Hehenkamp et al., 2006). In contrast, the Generation II assay was recently released after harmonization of the other two assays by incorporating the antibodies used in the DSL assay but being calibrated to the Immunotech assay and thereby anticipated to give equivalent values (Nelson and La Marca 2011). However, since the commercial release of the AMH Generation II assay it has been demonstrated that there is a systematic shift in assay calibration and AMH values generated with the AMH Generation II assay were significantly lower compared with the DSL assay for women of similar age (Nelson et al., 2013a). Therefore, the different AMH assays with the current calibration concerns could be a source of significant heterogeneity in our pooled analysis; however, after adjustment for AMH assay the DOR for both women of unknown ovarian reserve and all women was only changed modestly confirming further that, irrespective of the assay, measured AMH has some value in the prediction of live birth.
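The DSL-to-IBC conversion quoted above can be written as a small helper for aggregating values across assays. This is a sketch only: the function name and assay labels are hypothetical, and declining to convert Generation II values is a design choice made here because of the calibration shift described in the text, not a rule taken from the source.

```python
# Conversion factor quoted in the text: 2.02 * DSL = IBC
DSL_TO_IBC_FACTOR = 2.02


def to_ibc_equivalent(amh_value: float, assay: str) -> float:
    """Express an AMH value on the IBC scale.

    Assumption: only "DSL" and "IBC" are handled. Generation II values are
    rejected rather than converted, since no conversion is given in the text.
    """
    if assay == "IBC":
        return amh_value
    if assay == "DSL":
        return amh_value * DSL_TO_IBC_FACTOR
    raise ValueError(f"No conversion defined for assay {assay!r}")


# Example: a DSL value of 1.0 ng/ml expressed on the IBC scale
print(to_ibc_equivalent(1.0, "DSL"))
```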
Clinical application
The immediate clinical implication of the present finding is that AMH, independently of age, provides additional information for couples considering assisted reproduction. However, its diagnostic accuracy for live birth is poor, and AMH should not be used to alter clinical decisions or to exclude couples from IVF/ICSI on the basis of a low value. In addition, these data do not justify adoption of an AMH threshold for access to such treatments, and further studies are needed to investigate whether a universal AMH threshold is possible or appropriate.
To date, a wide range of prediction models has been developed to facilitate prognostication of the likelihood of success after assisted conception, but none of these has gone through the three classical phases of model development (Nelson and Lawlor 2011; van Loendersloot et al., 2013). A recent systematic review and meta-analysis, which analysed nine predictive factors in IVF, identified that female age and baseline FSH were inversely associated with the likelihood of success after IVF (van Loendersloot et al., 2010). As AMH is more strongly associated with ovarian reserve than FSH (Broekmans et al., 2006; Hansen et al., 2011), this suggests that future prediction models should consider AMH as an alternative covariate. Assessing whether inclusion of AMH improves the prediction characteristics of existing models, compared with updating those models by adjustment or recalibration to account for local circumstances, will be an essential step in confirming its clinical utility in the prediction of live birth.
Strengths and limitations
This is the first study presenting pooled data of a large number of cycles to assess the predictive value of serum AMH in live birth after IVF/ICSI. The strengths of the review lie in the extensive search strategy, adherence to recent guidelines (Cochrane 2011), inclusion of non-English studies and robust statistical analysis in accordance with established guidance for diagnostic tests (Irwig et al., 1994; Khan et al., 2001). Although the process of systematic literature review and meta-analysis is a robust way of generating a more powerful estimate of true-effect size with less random error than individual studies, it does have limitations and the inferences assumed by the data are subject to the limitations and bias of the primary studies. Heterogeneity of the studies must be addressed as it may affect the justification for pooling the data into one analysis. In the case of the present meta-analysis, heterogeneity may have been caused by different baseline characteristics in study participants, different stimulating protocols, variation in the AMH threshold and assay across different studies and study quality characteristics. However, the statistical estimation of heterogeneity was within acceptable levels for pooling studies. In addition, one of the advantages of the HSROC analysis is that it takes into account the full range of variation in the data, differentiating within study from between study variability and systematic from random variability (Gatsonis and Paliwal 2006). Secondly, our pooled analysis did not include data from some studies (Wang et al., 2010; Lin et al., 2013; Mutlu et al., 2013) and from a subgroup from another study (Lee et al., 2009) (overall n = 1956 women) as the data to conduct a 2 × 2 table were not extractable or available even after contacting the authors, which we acknowledge may have introduced bias. 
The largest of these studies showed a positive relationship between AMH and live birth although it attenuated after age stratification (Wang et al., 2010), while the others did not show an association possibly due to small sample number (n = 83 (Lin et al., 2013); n = 213 (Lee et al., 2009); n = 192 (Mutlu et al., 2013)). Also, while the funnel plot analysis raises the possibility that small studies showing non-significant or negative association between AMH and live birth may be missing, an asymmetrical funnel plot does not prove a specific type of bias and may indicate small study effect, i.e. smaller studies have the tendency to inflate the summary effect (Sterne et al., 2011) and moreover, the apparent funnel asymmetry was not statistically significant. Our methodology tried to minimize the possibility of publication bias by implementing a robust search technique without language limitations and by contacting the authors when the relevant information was not extractable. In addition, the qualitative assessment of the included studies indicates that the majority of the studies had low risk of methodological bias (Supplementary data, Fig. S1). Our pooled analysis does not derive a summary diagnostic threshold for AMH, as this would be inappropriate due to the individual studies utilizing diverse populations with different fertility potentials and ethnic backgrounds, different AMH cut-off points and different AMH assays.
Conclusion
Based on the current evidence, we found that AMH adds some value in predicting live birth, and this is independent of age or AMH assay. Although the CIs of the DOR for AMH do not cross unity, its predictive accuracy is poor and should not be over-interpreted. These findings were consistent across all of the studies examined, but the existing evidence would be inappropriate to determine a widely applicable threshold value due to the heterogeneity between studies, and likewise to derive the relative risks of live birth across the clinically relevant range of AMH values or exclude women from IVF/ICSI. This evidence can only be obtained by prospectively designed studies of test accuracy with adequate clinical size and attention to limiting bias and appropriate outcome measures (possibly including cost effectiveness analysis) using a decision tree model. This will allow the trade-off between positive and negative benefits to be truly evaluated.
Authors' roles
S.M.N. and R.A.A. conceived the idea of the study. S.I. and S.M.N. did the systematic search, assessed the eligible studies and extracted the data. S.I. pooled the data and conducted the statistical analysis. T.W.K. contributed to the statistical analysis. O.W. reviewed the methodology. S.I., S.M.N. and R.A.A. drafted the paper. All the authors contributed to the final version of the paper.
Funding
No funding was received for this study.
Conflict of interest
S.M.N. and R.A.A. have undertaken consultancy work for Beckman Coulter. The other authors have no conflicts to declare.
References
Author notes
Joint senior authors.