At the end of October we received the transcriptome sequencing data from the patients participating in the study. From here, we plan to follow several complementary data analysis approaches in order to be as unbiased as possible. One approach is to examine the patients' RNA sequencing data and focus on genes that are differentially expressed (up- or downregulated). Once these genes are identified, we will look for genetic variants that could explain the deregulation. In addition to the directed hypothesis described in the previous report, the whole-genome germline sequencing data will also be explored in an unbiased manner, without preselecting a list of genes, prioritizing rare variants that cause a clear protein alteration (a strategy referred to as “low-hanging fruit”). This strategy focuses on the “easiest” candidate genes, namely those that are clearly altered in patients.
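As an illustration only, the low-hanging-fruit prioritization could be sketched as a simple filter over annotated germline variants. The record fields, the set of consequence labels treated as a clear protein alteration, the 0.1% allele frequency cutoff and the placeholder gene names are all assumptions made for this example; they do not describe the project's actual pipeline or thresholds.

```python
from dataclasses import dataclass

# Consequence labels treated here as a "clear protein alteration".
# This selection is an assumption for the sketch, not the study's definition.
PROTEIN_ALTERING = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
    "start_lost",
}


@dataclass
class Variant:
    gene: str             # placeholder gene label
    consequence: str      # predicted consequence of the variant
    population_af: float  # allele frequency in a reference population


def low_hanging_fruit(variants, max_af=0.001):
    """Keep rare variants with a clearly protein-altering consequence.

    max_af is a hypothetical rarity cutoff (0.1%).
    """
    return [
        v for v in variants
        if v.population_af <= max_af and v.consequence in PROTEIN_ALTERING
    ]


# Toy input with made-up values, used only to show the filter's behaviour.
example_variants = [
    Variant("GENE_A", "stop_gained", 0.0001),         # rare, protein-altering
    Variant("GENE_B", "synonymous_variant", 0.0001),  # rare, but no clear alteration
    Variant("GENE_C", "frameshift_variant", 0.05),    # protein-altering, but common
]

candidates = low_hanging_fruit(example_variants)
candidate_genes = [v.gene for v in candidates]
```

No gene list is used as input, which mirrors the unbiased design: a variant is retained only because of its rarity and predicted consequence.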
In addition to the single nucleotide variant data and the patients' transcriptomes, copy number and aberrant splicing information will also be available. Together, these data will give a deeper understanding of which germline genetic variants might be involved in hereditary predisposition to colorectal cancer.
The whole-genome sequencing data from tumor DNA and, in some patients, from precursor lesions will enable the characterization of the genetic events in these tissues. Specifically, we will determine which genes are mutated, the mutational load, and the mutational signatures present. These datasets will help to establish links between candidate hereditary predispositions and environmental events. In addition, analysis of the sequencing data to detect microbial DNA has begun, with the aim of linking its presence, in some cases, to an increased risk of colorectal cancer. Finally, metabolomic analysis of patients' serum is scheduled to begin at the end of this year on the IARC platform (https://www.iarc.who.int/).