Introduction

Esophageal carcinomas can be divided into two distinct histological subtypes: squamous cell carcinoma (ESC) and adenocarcinoma (EAC). In Northwestern European countries and North America, a rapid rise in the incidence of EAC has been observed1,2. Mainly because symptoms appear late, only half of the patients present with curable disease and, despite multimodality treatment, median overall survival remains merely 48.6 months in patients with operable disease3.

To increase survival, biomarkers could harbor great potential by (i) enabling better stratification of patients according to their tumor biology and (ii) directing the development of new targeted anti-cancer therapies. Prognostic biomarkers provide information on clinical cancer outcomes, such as overall survival (OS), independent of received treatment4. The Erb-b2 receptor tyrosine kinase 2 (Neu or HER2), a member of the epidermal growth factor receptor family, has previously been identified as such a prognostic biomarker in EAC, and it can be targeted by trastuzumab, a humanized anti-HER2 monoclonal antibody5. Since a significant survival benefit was shown in the phase III ToGA trial, trastuzumab in addition to standard chemotherapy has become standard of care for HER2-positive advanced-stage gastro-esophageal cancers5,6. The value of HER2-directed therapies in patients with EAC treated with curative intent is currently being investigated (NCT02120911). However, compared to other tumor types, targeted therapy development is lagging behind in EAC. Thus far, trastuzumab is the only available targeted treatment option in EAC, while survival in this disease remains dismal, underscoring the urgent need to improve therapeutic options7. Further identification of prognostic biomarkers may lead to the development of new targeted therapies, thereby improving survival.

Unfortunately, previous reviews investigating prognostic biomarkers in esophageal cancer did not distinguish EAC from ESC or focused solely on immunohistochemistry (IHC) as the method of biomarker detection8,9. However, great differences in tumor biology between EAC and ESC have been demonstrated, necessitating separate analysis2. Furthermore, since the publication of these reviews, detection techniques have developed enormously, enhancing the opportunity to identify clinically applicable prognostic biomarkers10. Lastly, the REporting recommendations for tumor MARKer prognostic studies (REMARK criteria) have become consensus guidelines for prognostic biomarker studies, intended to increase the quality of the published work and improve extrapolation of the study outcomes11. Hence, when appraising new prognostic biomarkers, these REMARK criteria should be taken into account.

This systematic review with meta-analyses provides an overview of the prognostic biomarkers in resectable EAC treated with curative intent, focusing on overall survival, to guide the development of new targeted therapies.

Results

Study characteristics

All 3,298 identified articles were screened on title and abstract (Fig. 1). After assessing 466 articles on full text, 84 articles were included12–95. Six articles were grouped in the adapted hallmark of cancer ‘multiple’, resulting in 78 articles that could be included in the meta-analysis, investigating a total population of 12,876 EAC patients. The main characteristics of the studies are shown in supplementary Table S1. A total of 82 unique biomarkers were identified. The majority of the biomarkers were detected by immunohistochemistry (IHC) or a combination of IHC and an in situ hybridization method (ISH). Less frequently applied detection methods were PCR, RNA sequencing and DNA sequencing, and one article used a combination of reverse phase protein array (RPPA) analysis, reverse transcriptase-PCR and IHC95. Most articles (N = 61) included a study population consisting of EAC only, 12 articles included a population that consisted of ≥70% adenocarcinomas, and 11 articles performed separate OS analyses on EAC and other histological subtypes. Of the assessed patients, 1,822 (14.2%) received prior chemo(radiation)therapy. The mean study sample size was 152 patients (standard deviation = 112.16) and the mean impact factor (IF) of the journals in which the articles were published was 4.54.

Figure 1

Flow-chart of included articles.

Quality assessment

Assessment of the study quality using the adapted REMARK criteria resulted in a mean quality score of 5.9 points (range 3.5–7) (Supplementary Table S2). Three studies had a low quality score and were included in the sensitivity analyses31. In general, points were lost on quality criterion C5: reporting whether patients received therapy and, if so, specifying the chemo(radio)therapy regimen. In addition, C1 (a representative cohort with clear baseline characteristics) and C2 (reasons for patient drop-out) were often not met. A positive correlation (R = 0.480) was observed between study size and the impact factor of the journal in which the study was published (p = 0.0005) (Supplementary Fig. S3). There was no correlation (R = 0.058) between the study quality assessed by the adapted REMARK criteria and impact factor (p = 0.601).

Proliferation

The majority of the biomarkers studied are involved in tumor cell proliferation, of which HER2, EGFR, cyclin D, Ki-67 and mTOR were the most frequently reported (Fig. 2). Subgroup analysis on EGFR demonstrated an association with worse OS (HR 1.43 (95% CI 1.04–1.95)). Analysis of the HER2 subgroup, however, showed no significant association with OS (HR 1.28 (95% CI 0.96–1.70)). HER2 remained not significantly associated with worse OS when the HER2 subgroup was restricted to data on HER2 expression assessed by means of the gold standard (IHC and, in case of equivocal HER2 expression (Hoffman scoring system 2+), an additional in situ hybridization method96) (HR 1.09 (95% CI 0.46–2.60)), or when data on EAC with a Barrett’s esophagus (BE) segment were replaced by data on EAC without BE (HR 1.33 (95% CI 0.78–2.28)) (Table 1). The overall pooled effect of the proliferation feature was significantly associated with worse OS (HR 1.41 (95% CI 1.22–1.63)); however, significant heterogeneity was found. IGFBP7, a member of the insulin-like growth factor binding protein family, was identified as the most promising prognostic biomarker in this hallmark of cancer feature. Funnel plot analyses showed no indication of publication bias (Supplementary Material S4).

Figure 2

Random-effects forest plot of prognostic biomarkers included in the adapted hallmark of cancer ‘proliferation’. EGFR, Cyclin D1, mTOR and HER2 were pooled as subgroups.

Table 1 Sensitivity analyses on the HER2 subgroup.

Hallmark specific markers

All identified biomarkers and hallmarks of cancer features are summarized in Fig. 3. The potential of all identified prognostic biomarkers was evaluated by grouping the biomarkers according to their main function in tumor biology in their corresponding hallmark of cancer feature. When meta-analysis was performed on all features, most were significantly associated with worse OS, except metabolism (HR 1.56 (95% CI 0.98–2.47)) and self-renewal (HR 1.08 (95% CI 0.81–1.43)). The hallmark of cancer feature ‘immune’ was most significantly associated with worse OS (HR 1.88 (95% CI 1.20–2.93)). Of the 82 unique prognostic biomarkers identified, meta-analyses showed several promising biomarkers, including COX-2, PAK-1, p14ARF, PD-L1, MET, LC3B and LGR5, each associated with a hallmark of cancer feature. After excluding articles of low study quality, there was no longer a significant association with OS in the group cell adhesion (N = 1, n = 52, SPARC and SPP1; HR 1.49 (95% CI 1.07–2.07) to HR 1.24 (95% CI 0.83–1.86), respectively) (Table 2)31,45,58. In additional sensitivity analyses on EAC treated with surgery as single treatment modality vs. EAC treated with neoadjuvant treatment and surgery, the hallmark of cancer feature ‘cell cycle’ was no longer significantly associated with OS (HR 1.43 (95% CI 1.08–1.89) to HR 1.09 (95% CI 0.75–1.57), respectively), although the same biomarkers were tested. The feature ‘metabolism’ remained not significantly associated with OS. After sensitivity analyses, the prognostic biomarkers identified as most promising remained unchanged for each hallmark of cancer feature. Funnel plot analyses showed no indication of publication bias.

Figure 3

All identified biomarkers and adapted hallmarks of cancer are summarized in the Ferris Wheel Plot. The area of each adapted hallmark of cancer represents the number of articles with data on the corresponding hallmark of cancer. The most promising prognostic biomarkers according to our meta-analysis are highlighted. In the inner circle, the hazard ratios (HR) and 95% confidence intervals (95% CI) are reported for each adapted hallmark of cancer.

Table 2 Sensitivity analysis on articles with a low quality score on the adapted REMARK criteria and those patients receiving (neo)adjuvant chemotherapy.

Discussion

This review summarizes the great diversity of prognostic biomarkers studied in EAC thus far. By grouping the biomarkers, based on their role in tumor biology, into the best-fitting hallmark of cancer feature, 82 unique biomarkers could be identified and evaluated.

Interestingly, the hallmark of cancer feature ‘immune’ presented itself as most significantly associated with worse OS, and may therefore harbor potential for the application of targeted therapies. Due to increased understanding of the tumor immune microenvironment and promising trial results, new immune-based therapies have recently emerged, such as the PD-L1/PD-1 targeting agents nivolumab and pembrolizumab97. Targeting PD-L1/PD-1, a critical immune checkpoint, releases the inhibitory effect on both the humoral and cellular immune response, activating T-cells to enhance the antitumor response. These PD-1 pathway inhibitors have previously been FDA approved in several solid tumors, including melanoma and non-small cell lung cancer. Indeed, here we identify PD-L1, a ligand of the co-inhibitory receptor PD-1, as the most promising prognostic biomarker included in this hallmark of cancer feature. However, the clinical applicability of these drugs has not yet been proven in resectable EAC, and whether PD-1 is a predictive biomarker, reflective of response to treatment, remains to be elucidated4,97.

For all other hallmarks of cancer features, promising prognostic biomarkers were identified as well, including COX-2, PAK-1, p14ARF, MET, LC3B, IGFBP7 and LGR5. For the MET, IGFBP7 and LGR5 pathways, targeted therapies have already been studied in other cancer types with varying results; however, the potential to target these biomarkers in EAC is yet to be investigated98,99. Likewise, the inhibition of CDK4/6 in p14ARF mutant patients by small molecules or pan-CDK inhibitors is being investigated as an add-on to standard chemotherapy backbones, potentially enabling blockage of the unrestricted cell division caused by p14ARF mutations100. Non-steroidal anti-inflammatory drugs (NSAIDs), which inhibit COX-2, are commonly used and safe. Hence, inhibition of COX-2, an important regulator of cell growth, differentiation and apoptosis, may be a valuable contribution to the treatment of EAC. Thus far, COX-2 has been demonstrated to be involved in the neoplastic formation of esophageal cancer101. Moreover, the use of NSAIDs is associated with a reduced risk of EAC development and has been shown to reduce cell growth in 8 esophageal cell lines. In contrast, little is yet known about the potential druggability of PAK-1 in cancer, even though its recently elucidated central role in oncogenic signaling has enhanced interest in small-molecule based PAK-1 targeting102. Similarly, the inhibition of autophagy by blocking LC3B has been explored in oncological diseases only in vitro. Therefore, the therapeutic potential of these targets remains to be clarified.

Even though promising prognostic biomarkers were identified, limitations should be recognized. Firstly, after performing sensitivity analysis on the study quality, the feature cell adhesion was no longer significantly associated with OS when excluding articles scoring low on the adapted REMARK criteria31,45,58. In addition, as it is known that studies with low quality hamper extrapolation of the data to clinical practice, it is surprising to notice that study size and impact factor were correlated, while no correlation between the study quality and the impact factor was found. Although after sensitivity analyses on articles scoring low on the adapted REMARK criteria the same promising biomarkers were still identified, the varying study quality is worrying. Frequently, articles failed to report the received therapy, and if this information was supplied, often did not specify the treatment regimen. As nowadays neoadjuvant treatment has become standard of care for operable EAC, reporting these baseline characteristics has become increasingly important.

In this meta-analysis, 1,822 (14.2%) resection specimens were evaluated on prognostic biomarker status after patients received neoadjuvant chemo(radiation)therapy. It should be noted that in specimens of good responders no, or only a few, remaining tumor cells may be found, biasing the prognostic potential of the assessed biomarker. Moreover, if post-neoadjuvant therapy samples are included in biomarker analyses, treatment regimens should be clearly described. It is known that a better response to therapy is attained with neoadjuvant chemoradiation therapy than with radiation therapy as single treatment modality. This could further bias the results found. In addition, when extrapolating these results to a predictive setting for the identification of new therapy options, these biomarkers might not have predictive potential in the neoadjuvant setting. Indeed, sensitivity analyses on articles reporting on patients who received neoadjuvant therapy demonstrated the influence of these treatment regimens on the association between biomarker status and survival. The feature ‘cell cycle’ was significantly associated with worse OS in all patients but, when the same biomarkers were tested, no longer showed this association with survival if solely neoadjuvant treated EAC was included in the analysis. Since commonly used chemotherapeutics such as carboplatin and paclitaxel influence the cell cycle, this effect was expected, highlighting the importance of reporting the received treatment regimen.

The importance of clear reporting standards for biomarker research and standardization of the detection method used is also demonstrated by subgroup analyses on HER2. In contrast to the current notion, no association with decreased survival was found when pooling the data of all articles reporting on the prognostic potential of HER2. When exclusively including data on HER2 positivity assessed by means of the gold standard (IHC and, in case of equivocal HER2 expression (Hoffman scoring system 2+), an additional in situ hybridization method), the association with worse OS remained not significant5,103. The significant heterogeneity found in the corresponding hallmark of cancer feature ‘proliferation’ could at least partly be attributed to the varying detection methods applied. As each test has its own sensitivity and specificity, outcomes can be greatly influenced by the method of biomarker assessment. The applied detection method will not only reflect underlying tumor biology, but also affect the relation of the biomarker with prognostic outcomes and targetability. For example, it has been demonstrated that solely assessing HER2 positivity by amplification of the HER2 gene with an in situ hybridization method does not correlate with efficacy of HER2-targeted therapy103. Likewise, different IHC cut-off points for biomarker positivity influence both prognostic and predictive outcomes. As has been demonstrated in this meta-analysis, even for well-known biomarkers such as HER2, which is used in clinical practice, articles use varying definitions of biomarker positivity, thereby limiting comparison of data.

Several promising biomarkers in resectable EAC have been identified; however, in order to stratify patients in accordance with their tumor biology, and to develop new targeted anti-cancer treatments, future research is needed. First, standardization of reporting on biomarker research is needed to further identify prognostic biomarkers. Subsequently, large-scale multicenter randomized-controlled trials should be conducted to validate the clinical applicability of these biomarkers and to evaluate their potential targetability.

To conclude, a wide variety of prognostic proteins and their expression have been studied in EAC treated with curative intent. Despite varying study quality of the published data, promising biomarkers could be identified, including COX-2, PAK-1, p14ARF, PD-L1, MET, LC3B, IGFBP7 and LGR5. The clinical application and targetability of these biomarkers as anti-cancer therapy in operable EAC should be addressed in future research.

Methods

Search strategy

Literature was retrieved from the Medline, Cochrane and Embase databases on the 19th of January 2017 to identify articles published in the preceding 10 years, with the publication date restricted to the period from the first of January 2007 until the first of January 2017. In addition to MeSH terms, free-text words were added to the search to include all relevant articles that might not have been assigned MeSH terms yet. The full search is available in the supplementary information S5.

Screening and selection of studies

All titles, abstracts and full-text articles were screened independently by two researchers (AC and EAE); discrepancies were resolved by discussion. Articles were selected based on the following criteria: (i) the research population included adenocarcinomas of the esophagus or the gastro-esophageal junction, defined as Siewert class I and II, that could be treated with curative intent, and (ii) the article reported biomarker-related overall survival (OS) data, described with hazard ratios (HR), 95% confidence intervals (CI), and p-values. If both EAC and ESC were studied, the research population had to include at least 70% EAC or the article had to present separate survival analyses. Reviews, case reports, (meeting) abstracts, phase I studies and articles without full text in English were excluded. When articles reported on the same biomarker(s) in an identical patient population, the publication examining the most biomarkers was included. Endnote X7 (Clarivate Analytics, Boston, USA) was used to select and screen the literature.

Data extraction and outcomes

Data extraction was done by AC and EAE following a predefined protocol and double checked until consensus was reached. The following data were extracted: first author, publication year, journal, patient population (EAC only, ≥70% EAC, or EAC and ESC with separate survival analysis), tumor material studied (blood, biopsy, resection specimen or a combination), reported tissue handling, method of biomarker detection, scoring methods and cut-off values used for biomarker positivity, received therapy (yes (including a clear description of the treatment regimen), no, or not reported (NR)), the duration of follow-up, and reported confounders in multivariate analyses. Lastly, the primary outcome of this review was extracted: overall survival data of univariate and/or multivariate analyses, presented as HR, 95% CI, and p-value. The impact factor (IF) of each journal at the time of publication of the study was extracted from bioxbio.com/if/.

Study quality assessment

To assess the quality of the included studies, the REporting recommendations for tumor MARKer prognostic studies (REMARK) criteria for biomarker studies were adapted into a scoring system (Table 3)11. The adapted scoring criteria were chosen by discussion between AC, EAE, MvO and HvL. The articles could be scored 1 point per item, with a maximum of 7 points. In case of ambiguity or incompleteness, half a point was allocated. A study was defined as low quality when ≤3.5 points were assigned. The study quality was assessed by AC and EAE; in case of disagreement, consensus was reached by discussion.
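The scoring rule described above can be illustrated with a minimal Python sketch. The item labels C1 to C7 and the rating names `met`, `partial` and `not_met` are assumptions made for illustration; the actual criteria are those listed in Table 3, and the example ratings are hypothetical rather than taken from any included study.

```python
from typing import Dict

# Assumed item labels; the real criteria are defined in Table 3.
ITEMS = ("C1", "C2", "C3", "C4", "C5", "C6", "C7")

# 1 point per item, half a point in case of ambiguity or incompleteness.
POINTS = {"met": 1.0, "partial": 0.5, "not_met": 0.0}

# A study with a total of 3.5 points or fewer is considered low quality.
LOW_QUALITY_CUTOFF = 3.5


def quality_score(ratings: Dict[str, str]) -> float:
    """Sum the points over all seven items (maximum 7)."""
    missing = [item for item in ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"unrated items: {missing}")
    return sum(POINTS[ratings[item]] for item in ITEMS)


def is_low_quality(score: float) -> bool:
    """Apply the low-quality threshold to a total score."""
    return score <= LOW_QUALITY_CUTOFF


# Hypothetical ratings for a single article.
example = {
    "C1": "partial", "C2": "not_met", "C3": "met", "C4": "met",
    "C5": "not_met", "C6": "met", "C7": "partial",
}
score = quality_score(example)
print(score, is_low_quality(score))
```

Because half points are allowed, the threshold is inclusive: an article scoring exactly 3.5 is treated as low quality.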

Table 3 The adapted version of the REporting recommendations for tumor MARKer prognostic studies (REMARK) criteria for biomarker studies11.

Statistics

The potential of all identified prognostic biomarkers was evaluated by grouping the biomarkers according to their main function in tumor biology in the corresponding hallmark of cancer104. To fit all identified biomarkers, the hallmarks of cancer were adapted, resulting in the following features: angiogenesis, cell adhesion and extra-cellular matrix remodeling, cell cycle, immune, invasion and metastasis, metabolism, proliferation, and self-renewal. Some articles showed data on a cluster of genes; these were grouped in the hallmarks of cancer feature ‘multiple’. Due to the heterogeneous scope of action of the biomarkers, we did not perform meta-analysis on papers included in the ‘multiple’ group. Pooled hazard ratios (HR) and 95% confidence intervals (CI) were derived by random effects meta-analyses performed on each hallmark of cancer feature. HR and 95% CI data of univariate and multivariate analyses were combined in the meta-analysis; data derived from multivariate analysis were used by default, and univariate values were used when multivariate data were absent. If the data related to absence rather than presence of the biomarker, the HR data were inverted. When identical biomarkers were reported in more than two studies, these biomarkers were included in subgroup analysis. In order to determine the influence of a low quality score, sensitivity analyses were performed on studies with a low study quality on the adapted REMARK criteria scale. Additional sensitivity analyses were conducted on studies showing data on both EAC treated with surgery as single treatment modality and neoadjuvant treated EAC. Finally, the most promising biomarker for each hallmark of cancer feature was selected based on the most optimal combination of a high HR and a narrow 95% CI. Consensus was reached between AC, EAE, MvO and HvL on the selected biomarkers. Publication bias was evaluated by means of a funnel plot on all hallmarks of cancer features.
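To make the pooling step concrete, the following Python sketch shows how hazard ratios reported with 95% CIs might be inverted and combined in a random effects model. It assumes the DerSimonian and Laird estimator for the between-study variance and derives each standard error from the width of the reported CI on the log scale; both are common choices for this kind of analysis, but the text does not specify them. The function names and the input values are hypothetical and do not reproduce any result of this review.

```python
import math
from typing import List, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

HazardRatio = Tuple[float, float, float]  # (HR, lower 95% CI, upper 95% CI)


def invert_hr(hr: float, lo: float, hi: float) -> HazardRatio:
    """Re-express an HR for biomarker absence as an HR for presence.

    The point estimate is inverted and the CI bounds swap places.
    """
    return 1.0 / hr, 1.0 / hi, 1.0 / lo


def pool_random_effects(studies: List[HazardRatio]) -> Tuple[float, float, float, float]:
    """Pool HRs on the log scale; returns (HR, lower, upper, tau squared)."""
    y = [math.log(hr) for hr, _, _ in studies]
    # Standard error recovered from the CI width on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z_95) for _, lo, hi in studies]

    # Fixed-effect (inverse-variance) weights, needed for Cochran's Q.
    w = [1.0 / s ** 2 for s in se]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of the between-study variance.
    df = len(studies) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    sum_w_re = sum(w_re)
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    se_re = math.sqrt(1.0 / sum_w_re)

    return (
        math.exp(y_re),
        math.exp(y_re - Z_95 * se_re),
        math.exp(y_re + Z_95 * se_re),
        tau2,
    )


# Hypothetical input: the third study reported the HR for biomarker absence.
studies = [(1.60, 1.10, 2.33), (1.20, 0.80, 1.80), invert_hr(0.70, 0.45, 1.09)]
pooled_hr, pooled_lo, pooled_hi, tau2 = pool_random_effects(studies)
print(f"HR {pooled_hr:.2f} (95% CI {pooled_lo:.2f}-{pooled_hi:.2f}), tau2 = {tau2:.3f}")
```

Working on the log scale keeps the CI symmetric around the estimate, which is why an inverted HR only requires swapping and inverting the bounds.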
Random effects meta-analyses were performed in Review Manager V5 (The Cochrane Collaboration, Copenhagen, Denmark). Pearson’s correlations with linear regression analysis between IF, adapted REMARK quality score, and patient cohort size were performed using GraphPad Prism 6 (GraphPad Software, La Jolla, CA, USA).
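As an illustration of the correlation analysis, the sketch below computes Pearson's correlation coefficient and a least-squares regression line in plain Python. It is not the GraphPad Prism procedure itself, it omits the p-value (which would require the t-distribution), and the sample sizes and impact factors used as input are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson_with_regression(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (r, slope, intercept) for paired observations x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")

    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: study sample size versus journal impact factor.
sample_sizes = [45.0, 80.0, 120.0, 152.0, 210.0, 330.0]
impact_factors = [2.1, 3.4, 2.9, 4.5, 5.2, 6.8]
r, slope, intercept = pearson_with_regression(sample_sizes, impact_factors)
print(f"R = {r:.3f}, IF = {slope:.4f} * n + {intercept:.2f}")
```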

Ethics statement

This article does not contain any studies with human or animal subjects performed by any of the authors.