Fitness-for-Purpose of cfDNA for Genetic Analyses

First described in 1948 [1], cfDNA can be used as a substrate for genetic analyses by next-generation sequencing (NGS), digital PCR or BEAMing digital PCR, and methylation-specific PCR. Diagnostic non-invasive prenatal testing (NIPT) routinely uses maternal and fetal cfDNA and several FDA-approved and CE-marked kits [2, 3]. Recently, sequencing of cfDNA in maternal plasma has even been suggested as a tool for paternity testing [4]. Analysis of cfDNA for prognosis of cancer recurrence or metastasis, based on monitoring an increase in cfDNA concentrations, as well as for prediction of the success of cancer therapies and adaptation of targeted therapies, based on the appearance of new variants, is likely to be implemented in clinical diagnostics [5]. One FDA-approved cfDNA-based diagnostic test is the Roche Cobas plasma EGFR mutation test V2, which guides treatment decisions in lung cancer patients. Emerging applications in oncology have been reviewed by Komatsubara and Sacher [6], and cfDNA is already being used in clinical trials and healthcare, e.g., by Guardant Health in the USA, based on enrichment by hybrid capture [7], or with Biocartis real-time PCR assays (https://www.genomeweb.com/pcr/biocartis-gets-ce-marking-two-colorectal-cancer-liquid-biopsy-tests#.WvgVaGcR1vA). cfDNA methylation is also being explored as a biomarker in colorectal cancer [8] and NIPT [9].

For NGS-targeted oncogenic mutation analyses on a panel of around 50 genes, Illumina technologies require a minimum DNA input amount of around 150 ng, which is too high for cfDNA. However, enrichment by hybrid capture (Agilent Technologies) allows library preparation and Illumina sequencing from 5 to 30 ng of extracted cfDNA [10], while alternative library preparation technologies have been developed by Life Technologies, with minimum DNA input amount of less than 10 ng [11]. The efficiency of ddPCR in terms of minimum DNA input amount/sensitivity is very high, exhibiting detection of mutation prevalence between 0.005 and 0.01%, with a sensitivity of 5 to 50 mutant copies in a background of 10,000 wild-type copies, and with an input of 10–30 ng DNA per reaction [12].

In disease areas other than cancer, such as NAFL/NASH, the use of cfDNA as a diagnostic or prognostic biomarker is at the exploratory research stage, for example in the clinical evolution of NAFL disease [13]. The measurement of cfDNA has also been proposed as a diagnostic biomarker in myocardial infarction [14], a prognostic biomarker in monitoring of transplant patients [15], and a prognostic biomarker in trauma and intensive care unit patients [16].

The mechanisms by which cfDNA is released into the circulation are not completely elucidated. While cell necrosis and apoptosis have long been considered the main mechanisms, active release of DNA molecules from living cells is also possible [17, 18]. Active and rapid release of DNA into the circulation may take place through mechanisms such as the release of exosomes [19]. Therefore, although the general consensus is that only mononucleosomal-size DNA fragments of around 180 bp are of clinical interest, this is not absolutely certain. Higher MW fragments may also contain clinically important genetic information, if they originate from the tissue of interest.

As for any other biomarkers, fitness-for-purpose of the specimens being used for cfDNA genetic analysis is necessary for the accuracy of the results [20•, 21••]. Uncontrolled and undocumented pre-analytics may introduce catastrophic bias, invalidate clinical analytical results, or lead to irreproducible research publications.

Critical In Vivo Pre-analytical Factors

Many types of stress induce an increase in the levels of cfDNA. Chronic stress and inflammation are linked to apoptosis- and necrosis-mediated long-term release of DNA, while acute stress, such as anaerobic exercise, is linked to rapid DNA release, through mechanisms that have not yet been completely elucidated. Chronic endurance training leads to constant release of DNA and persistent increase of cfDNA levels, due to both acute oxidative stress and inflammatory processes taking place during damage and repair of muscle cells [22]. Not only physical stress, but also psychosocial stress has been shown to induce an increase in cfDNA levels and an alteration of cfDNA methylation profile [23]. Acute viral infections, such as HIV, hepatitis B, or Epstein Barr Virus infection, may induce an increase in the cfDNA levels due to the presence of viral DNA [24].

In an oncological context, tumor variables, such as disease stage, tumor volume, tumor grade, type of tumor (primary or metastatic), but also treatment status, all have an influence on the amounts of cfDNA. This association is at the basis of the use of cfDNA in clinical diagnostics [5,6,7].

Circadian rhythmicity might also affect the levels of cfDNA as has been suggested in a study where most of the subjects presented higher cfDNA levels at mid-day [25]. The same circadian rhythmicity, with higher cfDNA levels at mid-day, was observed in colorectal cancer patients of stages I–III, but not stage IV [26]. Environmental exposures, such as exposure to pesticides, may influence the levels of cfDNA; these levels were found to be higher in males than in females [27]. Finally, age is a significant covariable when one studies relative cfDNA amounts at specific genomic locations, such as transcription start and termination sites, or cfDNA-associated methylation signals, corresponding to different tissue types [28] (Fig. 1).

Fig. 1 Summary of in vivo and ex vivo critical pre-analytical factors for cfDNA analyses

Fig. 2 Representative examples of cfDNA yield (ng/ml plasma), extracted by QIAamp cfDNA extraction method, from different healthy donors

Critical Ex Vivo Pre-analytical Factors

There are two main challenges when processing serum or plasma to obtain cfDNA for downstream analysis. The first is the low concentration of cfDNA, around 1000 genome equivalents per milliliter of blood [29] (a few ng per ml of plasma in healthy individuals), which calls for the highest possible yield to ensure the best sensitivity of the downstream analysis. The second is the possible “contamination” of cfDNA by larger genomic DNA fragments originating from white blood cells (WBC), which calls for WBC stabilization to ensure the best specificity of the downstream analysis.
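The relationship between genome equivalents and DNA mass quoted above can be illustrated with a short calculation. The sketch below is only an approximation: it assumes that one genome equivalent corresponds to one haploid human genome of roughly 3.3 pg, and the function names are hypothetical.

```python
# Assumption: one genome equivalent (GE) = one haploid human genome,
# weighing roughly 3.3 pg. This constant is an approximation.
PG_PER_HAPLOID_GENOME = 3.3


def genome_equivalents_to_ng(genome_equivalents: float) -> float:
    """Convert a number of haploid genome equivalents to a DNA mass in ng."""
    return genome_equivalents * PG_PER_HAPLOID_GENOME / 1000.0


def ng_to_genome_equivalents(mass_ng: float) -> float:
    """Convert a DNA mass in ng to a number of haploid genome equivalents."""
    return mass_ng * 1000.0 / PG_PER_HAPLOID_GENOME


def expected_mutant_copies(mass_ng: float, mutant_allele_fraction: float) -> float:
    """Expected number of mutant copies in a given DNA input mass."""
    return ng_to_genome_equivalents(mass_ng) * mutant_allele_fraction


if __name__ == "__main__":
    # About 1000 GE corresponds to a few ng of DNA.
    print(f"{genome_equivalents_to_ng(1000):.1f} ng")
    # Mutant copies expected in 10 ng of cfDNA at a 1% mutant allele fraction.
    print(f"{expected_mutant_copies(10, 0.01):.0f} copies")
```

Under this assumption, 1000 genome equivalents correspond to about 3.3 ng of DNA, in line with the few ng per ml reported for healthy individuals. The same conversion shows why yield matters: the number of template copies available sets an upper limit on how rare a variant can be detected.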

The blood collection tube type is the first critical pre-analytical factor.

The question of serum versus plasma has been debated. Extraction of circulating DNA consistently gives five- to eightfold higher yields in serum than in plasma [30, 31•] (Fig. 2). Moreover, Warton et al. have shown [31•] that when blood collected in a serum blood collection tube (BCT) is spiked with high MW DNA, this spiked DNA is not recovered or detected electrophoretically in the extracted serum cfDNA. In contrast, it is effectively detected in the plasma cfDNA extract. It appears that high MW DNA is trapped in the clot during the coagulation process.

Another study that evaluated serum versus plasma is that of Parpart-Li et al., who found significantly more total genome equivalents (GEs), but significantly lower mutant allele fractions, in serum than in EDTA plasma from cancer patients. The fragment sizes in the cfDNA fraction extracted from serum ranged from 150 to 2000 bp, while the samples from plasma showed the single typical peak at around 150 bp [32•]. Cellular DNA therefore seems to be released from lysing cells during the blood clotting process, which explains the higher yield of circulating DNA obtained from serum. In conclusion, serum is a more unstable and difficult-to-control fluid, due to the coagulation process, and is not recommended as starting material for commercial cfDNA analysis kits.

Therefore, anticoagulated blood is the preferred option for any cfDNA-based genetic analysis assay. EDTA has been suggested as a better anticoagulant than citrate or heparin [33], and currently available diagnostic kits require EDTA plasma as primary material.

The next question is whether the anticoagulated blood should be stabilized or not, and with what type of stabilizer. Non-stabilized blood is blood collected in an anticoagulant, such as EDTA, citrate, or heparin, while stabilized blood is collected in tubes with a chemical stabilizer, such as PAXgene, Streck, and PAXgene ccfDNA tubes. The standard PAXgene tubes that have been used for years for blood cell nucleic acid–based analyses are inadequate for cfDNA isolation because of extensive cell lysis [34]. Cell lysis also occurs in EDTA tubes upon prolonged incubation of the blood, especially at room temperature (RT). Following such WBC lysis, cellular genomic DNA fragments as well as DNases are released. These DNases may degrade the cfDNA, even though EDTA has been shown to inhibit endogenous DNases to a certain degree [35].

Blood cell lysis is avoided by stabilization solutions, used in Streck, CellSave, Roche, Norgen Biotek, or PAXgene ccfDNA tubes [36,37,38]. Yields and performance of cfDNA from Streck BCT, Roche, and PAXgene ccfDNA tubes are comparable, with reliable qPCR detection of target mutations when the spiked DNA quantity was as low as 0.5 ng, and even after 7 days of tube storage at RT [39•] (also Ammerlaan, IBBL, unpublished author experience). Recently, an abstract from Qiagen indicated that Streck tubes may contain formaldehyde, which can induce DNA deaminations and introduce bias in cfDNA methylation analyses [40]. Furthermore, PAXgene ccfDNA tubes allow unbiased quantification of methylated sequences and are therefore fit-for-purpose for downstream cfDNA methylation analyses [41]. The concern that the Streck tube preservative might introduce DNA sequence modifications was recently dismissed by Risberg et al., who showed that there is no difference in background error rate between cfDNA extracted from Streck BCT and from paired EDTA samples and analyzed by Tagged Amplicon deep sequencing (Tam-Seq) [42••].

The CellSave tubes are tubes containing a preservative, specific for the stabilization of circulating tumor cells (CTCs). These tubes were shown to be fit-for-purpose for the isolation of cfDNA in the scope of copy number variation analysis by NGS [43].

The time and temperature between blood collection and plasma isolation constitute the second most important pre-analytical factor. Concerning the maximum allowable time between blood collection and processing by centrifugation for plasma separation, all biospecimen research studies indicate that EDTA blood should be processed within 3–6 h of collection, if stored at room temperature. If EDTA blood is stored at 4 °C, the delay can safely be extended to 8 h when using the QIAamp MinElute Virus Spin Kit [31•] or to 24 h [32•, 44,45,46,47,48,49]. For blood collection tubes with stabilizers, the stability claims of the manufacturers have generally been confirmed (see above references): 7 days at RT for Streck BCT and PAXgene ccfDNA tubes. The stability of the cfDNA collection tube from Roche Diagnostics GmbH seems to be lower [50•].
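The processing windows summarized above could be encoded as a simple acceptance check at specimen reception. The following is a minimal sketch, not a validated procedure: the class, dictionary keys, and function name are hypothetical, and the limits use the upper bounds quoted in the text (a laboratory may prefer the stricter 3 h at RT or 8 h at 4 °C for EDTA blood).

```python
from dataclasses import dataclass
from typing import Optional

# Maximum delay (hours) between collection and plasma separation.
# Values are the upper bounds quoted in the text; keys are hypothetical labels.
MAX_DELAY_HOURS = {
    ("EDTA", "RT"): 6.0,                 # 3-6 h at room temperature
    ("EDTA", "4C"): 24.0,                # 8 h with one kit, up to 24 h otherwise
    ("Streck BCT", "RT"): 7 * 24.0,      # 7 days at room temperature
    ("PAXgene ccfDNA", "RT"): 7 * 24.0,  # 7 days at room temperature
}


@dataclass
class BloodSample:
    tube_type: str      # e.g. "EDTA", "Streck BCT"
    storage: str        # "RT" or "4C"
    delay_hours: float  # time from collection to centrifugation


def within_processing_window(sample: BloodSample) -> Optional[bool]:
    """Return True/False if a limit is documented, None if it is not."""
    limit = MAX_DELAY_HOURS.get((sample.tube_type, sample.storage))
    if limit is None:
        return None  # no documented limit: flag for manual review
    return sample.delay_hours <= limit


if __name__ == "__main__":
    print(within_processing_window(BloodSample("EDTA", "RT", 4)))          # True
    print(within_processing_window(BloodSample("EDTA", "RT", 12)))         # False
    print(within_processing_window(BloodSample("Streck BCT", "RT", 72)))   # True
```

Returning `None` for undocumented tube and temperature combinations, rather than guessing, keeps the check aligned with the principle that pre-analytical conditions should be documented and not assumed.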

Concerning the centrifugation conditions, the consensus is to use double-spun plasma, with a second high-speed centrifugation at 16,000 × g before cfDNA extraction.

The cfDNA extraction kit is the third most important pre-analytical factor. The most commonly used extraction kits or methods are listed in Table 1 [49, 51].

Table 1 List of most commonly used cfDNA extraction kits, the companies that produce them, and the principle of the methods used

Different kits may provide different yields of cfDNA. A comparative study between 11 different kits and methods showed highest cfDNA yields with the Norgen Kits [52].

Different kits may produce cfDNA of different size ranges, depending on the binding capacity of the beads or membranes used during the extraction process. The processing laboratory should validate the MW range of DNA molecules that are effectively extracted. As of today, it is not clear if DNA molecules of higher MW are of clinical interest or not. It cannot be excluded that tumor cells dying of necrosis release high MW DNA that may be recovered in the cfDNA eluate with some, but not with other extraction kits. A recent comparison between six different column-based or magnetic bead–based kits used DNA spikes of different sizes (50–808 bp). This study showed that some kits, such as the QIAamp Circulating Nucleic Acid Kit and the Norgen Plasma/Serum Cell-Free Circulating DNA Purification Kit, provide very good recovery of DNA fragments of all sizes, while the MagMAX Cell-Free DNA Isolation Kit does not allow recovery of the lower size (50 bp) DNA fragments, and the Bioo Next-Prep-Mag cfDNA Isolation Kit allows good recovery of DNA fragments, only in the range of 75–300 bp [53••].

For downstream cfDNA methylation analyses, fitness-for-purpose of the cfDNA has been shown and a modified MethylMiner (Invitrogen) protocol for isolation of the methylated cfDNA sequences has been described [31•]. If bisulfite conversion is used, the bisulfite conversion kit is another critical pre-analytical factor. Although there are many bisulfite conversion kits, the one that is the most fit-for-purpose for bisulfite conversion of cfDNA in plasma is the InnuCONVERT Bisulfite Body Fluids Kit (Analytik Jena AG), which works with up to 3-ml volumes of input samples [54]. If bisulfite conversion is performed on isolated cfDNA, the EpiTect (Qiagen) Kit can be used with the “Small Amounts of Fragmented DNA” protocol [54]. Finally, a single-tube extraction and processing method, named “methylation on beads,” has been described, which allows for cfDNA extraction and bisulfite conversion from up to 2 ml of plasma and ensures high recovery and analytical sensitivity of methylation analysis [55].

Once cfDNA has been extracted, its long-term stability at −80 °C may not be guaranteed, and therefore, long-term storage in liquid nitrogen (LN) would be a safer decision [56]. Protocols for whole genome amplification (WGA) from plasma are commercialized (Sigma-modified WGA2 or Sigma-modified WGA4 protocols, https://www.sigmaaldrich.com/technical-documents/protocols/biology/whole-genome-amplification-serum-or-plasma.html); however, no independent validation has been published. In the authors' experience, cfDNA fragments of around 150 bp are too short for WGA (Trouet, IBBL, unpublished author experience).

A first set of evidence-based guidelines for processing specimens for cfDNA analyses was published by El Messaoudi et al. [57]. More evidence-based procedures and recommendations are being prepared by the European consortium CANCER-ID (https://www.cancer-id.eu/), in the context of the Innovative Medicines Initiative (IMI). The European program SPIDIA (http://www.spidia.eu/) has also developed a CEN Technical Specification standard for the pre-analytical phase of ccfDNA [58].

Quality Control Materials

Few in-process quality control materials exist that can be used for the validation of cfDNA extraction methods. Horizon commercializes cfDNA Reference Standards for method validation purposes (research use only). These materials are provided as mechanically sheared, fragmented DNA (average size 160 bp) from engineered human cell lines and contain variants at allelic frequencies down to 0.1% [59] (https://www.horizondiscovery.com/media/resources/Application%20Notes/reference-standards/independent-dPCR-study-of-horizon-cfdna-reference-standards.pdf).

Devonshire et al. have developed and validated spike materials that allow users to measure cfDNA extraction efficiency, fragment size bias, and yield [60••]. These materials are based on a linearized and digested plasmid, containing the Arabidopsis thaliana alcohol dehydrogenase gene (ADH), and fragments have sizes of 189 bp and roughly 3 kb.

A homemade reference material has also been described for the analytical validation of a cfDNA NGS assay. It is based on overlapping extension PCR for site-directed mutagenesis to obtain fragments of 537 to 2030 bp, containing specific mutation sequences [4].

The validation of the cfDNA extraction method is based on measurement of (i) the concentration and yield of the cfDNA extracted from the in-process QC material, either by high sensitivity spectrofluorometry or by qPCR or dPCR and (ii) the integrity of the cfDNA extracted from the in-process QC material, either by microfluidic electrophoresis or by qPCR on amplicons of different lengths [61].
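The two validation readouts described above can be expressed as simple ratios. The sketch below is illustrative only: it assumes that copy numbers of a spiked QC material have been measured by qPCR or dPCR before and after extraction, and that integrity is summarized as the ratio of copies detected with a long amplicon to copies detected with a short amplicon. The function names and this particular integrity definition are assumptions, not a method taken from the cited references.

```python
def extraction_recovery_percent(copies_recovered: float, copies_spiked: float) -> float:
    """Percentage of spiked QC material recovered after extraction."""
    if copies_spiked <= 0:
        raise ValueError("copies_spiked must be positive")
    return 100.0 * copies_recovered / copies_spiked


def integrity_index(long_amplicon_copies: float, short_amplicon_copies: float) -> float:
    """Ratio of long-amplicon to short-amplicon copies.

    Values near 1 suggest mostly intact, longer templates; values near 0
    suggest that the extract is dominated by short fragments.
    """
    if short_amplicon_copies <= 0:
        raise ValueError("short_amplicon_copies must be positive")
    return long_amplicon_copies / short_amplicon_copies


def size_bias(recovery_short_percent: float, recovery_long_percent: float) -> float:
    """Ratio of recoveries for a short and a long spike fragment.

    A value well above or below 1 indicates that the extraction method
    favors one fragment size over the other.
    """
    if recovery_long_percent <= 0:
        raise ValueError("recovery_long_percent must be positive")
    return recovery_short_percent / recovery_long_percent


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(extraction_recovery_percent(750, 1000))  # 75.0
    print(integrity_index(200, 1000))              # 0.2
    print(size_bias(80.0, 40.0))                   # 2.0
```

A laboratory would set its own acceptance thresholds for these ratios during method validation; none are implied here.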

Conclusion

The context of use of cfDNA has extended from NIPT to oncological indications, while research and development on cfDNA is ongoing in other disease areas. Standardization of the cfDNA pre-analytical phase is key for the accuracy of genetic analyses and the reproducibility of research results. The commercialization of blood collection tubes with stabilizers that prevent blood cell lysis has increased the robustness and specificity of cfDNA testing. Different in vivo and ex vivo critical pre-analytical factors may affect the yield, the size range, and the integrity of the recovered cfDNA, leading to a decrease in sensitivity and/or specificity of cfDNA-based genetic analyses. These factors should be acknowledged, documented, and taken into account in cfDNA-based research data analyses.