Somatic Mutations in Cancer-Free Individuals: A Liquid Biopsy Connection

Somatic mutations have been perceived as the causal event in the origin of the vast majority of cancers. Advances in massively parallel, high-throughput DNA sequencing have enabled the comprehensive characterization of somatic mutations in a large number of tumor samples for precision and personalized therapy. Understanding how these observed genetic alterations give rise to specific cancer phenotypes represents an ultimate goal of cancer genomics. However, somatic mutations are also commonly found in healthy individuals, which interferes with the effectiveness of cancer diagnostics.

Somatic mutations in healthy individuals are very prevalent, with an average of around 2-6 mutations per million bases [9,10]. The baseline somatic mutation spectrum in the healthy population not only can help fill gaps in establishing early cancer diagnosis strategies, but also argues against the idea of using normal cells as a germline control to make somatic mutation calls in sequencing tests. Moreover, because the same driver mutation can exist in both tumor and normal cells, yet with distinct biological effects, we should not simply define the threshold of mutation detection by removing the background mutations found in a healthy population. Taken together, we need to incorporate and carefully calibrate the background somatic mutations in healthy individuals, because they are not all germline mutations.

Somatic Driver Mutations Found in Healthy Population by Liquid Biopsy
With the dramatically decreased cost of next-generation sequencing (NGS) in recent years, it is now practical to screen a large number of individuals at ultra-deep sequencing depths to identify cancer-related mutations. Cell-free DNA (cfDNA) in the blood circulation of cancer patients (as liquid biopsy) has emerged as a key biomarker for cancer monitoring and treatment decision-making [11]. Both academic research groups and industry players are pursuing pan-cancer screening from a simple blood draw. However, the reliable and accurate application of cfDNA detection requires a better understanding of the background somatic mutations in healthy individuals.
Journal of Tumor Medicine & Prevention

We performed ultra-deep targeted sequencing of 50 cancer-associated genes in plasma cfDNA from a cohort of 129 apparently healthy, cancer-free subjects. To increase confidence in the called mutations, for demonstration we defined a mutation as a variant with an allele frequency greater than 1% at an average depth of more than 5,000x. Our data revealed an age-independent mutation spectrum with an average of 3.12 somatic mutations per subject (Figure 1). The most frequently mutated genes were TP53 (42%), KIT (6%), KDR (5.5%), PIK3CA (5.5%), EGFR (5%) and PTEN (3.7%). These results highlight the prevalence of some cancer-associated driver mutations in healthy individuals as background mutations. We also demonstrated concordance between our results and a recent study of somatic mutations in a healthy population. The study by Xia et al. [12] examined the background somatic mutations in white blood cells and cfDNA in healthy controls, based on sequencing data from 821 non-cancer individuals, with the aim of understanding the baseline profile of somatic mutations detected in cfDNA. The data comparison is summarized in Table 1. Although there are differences in study cohort composition, sample volume, extraction methodology and analytical platform, the end results are remarkably similar, i.e., an average of 3 mutations per subject with an almost identical list of frequently mutated genes. Although varying mutation spectra in cancers have often been attributed to cancer-specific processes, our data suggest that at least a subset of these mutations actually reflects normal tissue-specific processes. This concept is consistent with the idea that a substantial fraction of the mutations found in cancers occur in normal stem cells [13,14].
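The calling rule stated above (variant allele frequency greater than 1% at a depth of more than 5,000x) can be illustrated with a minimal sketch. The `VariantCall` record, the function names and the example read counts are hypothetical and are not taken from the study's pipeline; the sketch also applies the depth cut-off per variant position, which is an assumption, since the text specifies an average depth.

```python
from dataclasses import dataclass
from typing import List

MIN_VAF = 0.01     # variant allele frequency must exceed 1% (threshold from the text)
MIN_DEPTH = 5000   # depth must exceed 5,000x (threshold from the text)


@dataclass
class VariantCall:
    """Hypothetical record for one candidate variant."""
    gene: str
    alt_reads: int  # reads supporting the variant allele
    depth: int      # total reads covering the position

    @property
    def vaf(self) -> float:
        """Variant allele frequency = supporting reads / total depth."""
        return self.alt_reads / self.depth if self.depth else 0.0


def passes_thresholds(call: VariantCall,
                      min_vaf: float = MIN_VAF,
                      min_depth: int = MIN_DEPTH) -> bool:
    """Keep a call only if both the depth and the VAF exceed their cut-offs."""
    return call.depth > min_depth and call.vaf > min_vaf


def filter_calls(calls: List[VariantCall]) -> List[VariantCall]:
    """Return the subset of calls that satisfy the calling rule."""
    return [c for c in calls if passes_thresholds(c)]


# Illustrative, made-up read counts (not data from the study).
calls = [
    VariantCall("TP53", alt_reads=120, depth=8000),  # VAF 1.5%, sufficient depth
    VariantCall("KIT", alt_reads=30, depth=6000),    # VAF 0.5%, below the VAF cut-off
    VariantCall("EGFR", alt_reads=90, depth=4000),   # depth below the cut-off
]
print([c.gene for c in filter_calls(calls)])
```

Only the first made-up call satisfies both conditions. In practice the cut-offs would be tuned to the assay's error profile rather than fixed as constants.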

Normal Tissue as a Germline Control Is Not Justified
There is evidence for the presence of tumor-derived cfDNA in early cancers [15]. However, the actual fraction of cfDNA that is shed by the tumor, as opposed to cfDNA carrying background somatic mutations, is not well characterized. For clinical application, the low level of tumor mutations as well as the heterogeneity of background mutations present in the circulation need to be clearly addressed and differentiated to achieve accuracy. Unfortunately, this goal cannot be achieved by pushing the detection limit of current advanced technology to below 0.01% mutant allele frequency (MAF). On the contrary, higher sensitivity will raise the chance of picking up background somatic mutations. Also, the clinical relevance of those low-percentage tumor mutations is still debatable in terms of treatment decisions or regimen changes.
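To put the detection limits mentioned above into numbers, the following sketch computes how many mutant reads one would expect at a given depth and MAF, using the simple relation expected reads = depth × MAF. This is an idealized calculation that ignores sequencing error, sampling noise and the limited number of genome copies in a plasma sample. The function names and the choice of five supporting reads are assumptions made for illustration only.

```python
import math
from fractions import Fraction


def expected_mutant_reads(depth: int, maf: Fraction) -> Fraction:
    """Expected number of reads carrying the mutant allele: depth x MAF."""
    return depth * maf


def min_depth_for_reads(min_reads: int, maf: Fraction) -> int:
    """Smallest depth at which at least `min_reads` mutant reads are expected."""
    return math.ceil(min_reads / maf)


DEPTH = 5000  # the 5,000x depth used elsewhere in the text

# Exact fractions avoid floating-point rounding in the arithmetic.
for label, maf in [("1%", Fraction(1, 100)), ("0.01%", Fraction(1, 10000))]:
    reads = expected_mutant_reads(DEPTH, maf)
    print(f"MAF {label}: {float(reads):g} expected mutant reads at {DEPTH}x")

# Depth needed to expect 5 supporting reads (an arbitrary illustrative value).
print(min_depth_for_reads(5, Fraction(1, 10000)))
```

Under these assumptions, a 1% MAF variant is supported by dozens of reads at 5,000x, whereas a 0.01% MAF variant is expected in fewer than one read, which is the regime where background mutations and technical noise become hard to separate from tumor signal.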
Each human individual is unique. Every cancer patient is different. No two tumors are the same, even when they reside within the same patient, so distinguishing the definitive cancer-specific mutations from background signals observable in plasma is extremely daunting. Using plasma cfDNA profiles from large numbers of healthy individuals as representative controls to evaluate specificity for the cancer population seems far-fetched and uncertain, especially when standardized protocols and optimized technology are still lacking. Unlike tissue genomic DNA, circulating cfDNA is dilute and dynamic, with a relatively short half-life, which makes single-point measurement unsuitable for clinical application.
We reason that cfDNA in circulation is under continuous selection pressure that favors highly aggressive/proliferative clones: as disease progresses, low-abundance tumor clones will either evolve and dominate or vanish through immune clearance. Longitudinal clinical follow-up should therefore be performed to identify the best time and target for precision therapy and, at the same time, to filter out contaminating background mutations. To achieve high clinical specificity, a cfDNA-based test must be capable of distinguishing between the background signals originating from non-cancer or pre-cancerous processes and the invasive malignancy of clinical interest. It is still possible that mutational signatures in cfDNA could distinguish basic biological processes from malignant and pathological processes. Here we propose a combined approach based on the tumor evolutionary principle of "survival and domination of the fittest" in circulation: perform multiple time-point monitoring, filter out potential background mutations (e.g., <1% MAF), reduce sample input volume and interrogate multiple databases. A representative mutational trending curve following this approach is shown in Figure 2.
Our findings underscore the importance of assessing the landscape of somatic mutations in the cancer-free population and the associated mutation signatures. Somatic mutations and mosaicism in healthy individuals have implications not only for early detection, diagnosis and treatment of cancer using liquid biopsy but also for emerging technologies in healthcare. We recommend caution when extending mutation conclusions to cancer patients by employing matched normal tissue as a germline control.
Increasing sample input and pushing liquid biopsy sensitivity below 1% MAF may not serve the goal of detecting low-frequency mutant alleles; it may only increase the chance of background mutation contamination.
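The multiple time-point monitoring and <1% MAF background filter proposed above could be sketched as follows. This is only one possible reading of the approach: the trend labels, the `classify_trend` function and the example MAF series are hypothetical, and the 1% cut-off is the example value given in the text rather than a validated threshold.

```python
from typing import Dict, List

BACKGROUND_MAF = 0.01  # the "<1% MAF" example cut-off from the text


def classify_trend(mafs: List[float], threshold: float = BACKGROUND_MAF) -> str:
    """Label one mutation from its MAF at successive time points (oldest first)."""
    if len(mafs) < 2:
        # A single-point measurement is not considered sufficient.
        return "insufficient data"
    if max(mafs) < threshold:
        # Never reaches the cut-off: treated as potential background.
        return "background"
    if mafs[-1] < threshold:
        # Was above the cut-off earlier but has dropped below it.
        return "vanishing"
    if mafs[-1] > mafs[0]:
        # Above the cut-off and higher than at the first time point.
        return "expanding"
    return "stable"


def classify_all(series: Dict[str, List[float]]) -> Dict[str, str]:
    """Apply the trend classification to every monitored mutation."""
    return {name: classify_trend(mafs) for name, mafs in series.items()}


# Made-up MAF series for three hypothetical variants at three time points.
series = {
    "TP53 variant": [0.004, 0.006, 0.005],
    "EGFR variant": [0.012, 0.030, 0.085],
    "KIT variant": [0.020, 0.011, 0.003],
}
for name, label in classify_all(series).items():
    print(f"{name}: {label}")
```

In this sketch, a mutation that stays below the cut-off at every time point is set aside as potential background, while one that rises above it over time is flagged as an expanding clone worth following up.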