eRapport

Building next generation predictive models of gene regulation by incorporating diverse information

Project
Project number: 2017061
Responsible person: Junbai Wang
Institution: Oslo universitetssykehus HF
Project category: Researcher grant
Health category: Blood, Cancer
Research activity: 4. Detection and Diagnosis
Reports
2021 - final report
The main goal of the project is to build novel predictive models of gene regulation by incorporating diverse information; the second aim is to apply these models in cancer research, such as lymphoma and breast cancer. Over the past five years, we have followed this plan and achieved numerous results:

- In 2017, we designed a novel computational model to investigate gene regulation by considering DNA shape and the biophysical theory of protein-DNA interaction. The new method was applied to genomic data from follicular lymphoma patients and discovered a few DNA sequence variations that potentially affect protein-DNA binding and dysregulate nearby genes in follicular lymphoma. These predictions were validated in lymphoma cell lines and published.
- In 2018, based on our published work on integrative whole-genome sequence analysis in follicular lymphoma, we developed a new user-friendly Python package (BayesPI-BAR2) for predicting functional non-coding mutations in cancer patient cohorts. The package was evaluated successfully in both follicular lymphoma and skin cancer patients, focusing on functional DNA sequence variants in gene promoter regions. This work was published later, and the software package is publicly available on GitHub.
- In 2019, a collaborative research project with Prof. Jin from the US was published in Nat Commun, in which we investigated the temporal dynamic reorganization of 3D chromatin architecture in hormone-induced breast cancer. This work helps us understand the high-order temporal chromatin reorganization underlying hormone-dependent breast cancer.
- In 2020, we developed a new Python tool for differential methylation analysis in DNA sequencing data (HMST-Seq-Analyzer). The tool was published and is publicly available on GitHub; it is suited to multiple sequencing platforms such as WGBS, RRBS, and HMST-Seq. At the same time, a new integrative genome analysis pipeline (IGAP) was developed and published, with which we demonstrated that the biophysical theory of protein-DNA interaction can be combined with epigenetic information to explain long-distance gene regulation in 3D chromosomal interactions in both normal and cancer cell lines. Additionally, we contributed to the development of a new method for modeling and analysis of Hi-C data with Prof. Jin's group in the US. For example, a new method (HiSIF) that can be used to predict promoter-distal loops was published in Genome Med 2020, and a study of dynamic changes of the 3D genomic landscape in breast cancer was published in BBA Gene Regul Mech 2020.
- Finally, in 2021, we contributed computational Hi-C analyses of three-dimensional (3D) spheroids of breast cancer cells and compared chromatin domain structure and looping genes, which helps in understanding the 3D-growth-specific chromatin architecture in tamoxifen-resistant breast cancer. The work was published in Clin Epigenetics 2021.

Overall, in the four-year funding period, we have achieved the original goals of the project proposal. We not only published four publicly available bioinformatics tools (e.g., BayesPI-BAR2, HMST-Seq-Analyzer, IGAP, and HiSIF, for studying non-coding DNA sequence variation, differential methylation, long-distance gene regulation, and 3D dynamic chromatin interaction in cancer, respectively), but also applied these computational methods to sequencing data generated from lymphoma and breast cancer patients or cell lines to understand complex gene regulation in humans (e.g., publications in Sci Rep 2017, Nat Commun 2019, Genome Med 2020, Clin Epigenetics 2021).

The knowledge and tools generated by this project will benefit the future understanding of genetic and epigenetic information in disease and cancer. For example, the identification of functional driver mutations or DNA sequence variants in disease or cancer opens the possibility of early diagnosis, so that disease can be prevented as early as possible; the prediction of differential methylation between normal and tumor samples may help develop drug targets or novel therapies, because epigenetic changes are reversible; and the investigation of 3D chromatin structure in cancer will help us better understand complex gene regulation in human disease.

The volume of experimental data generated by modern molecular biology research and medical diagnostic labs far outpaces the tools and methods available to explore and interpret it. Therefore, the development of novel computational methods and tools for biomedical research and diagnosis is essential for future medical treatment and patient care. Such tools not only substantially reduce the manpower needed for big-data analysis but also enable more precise, less error-prone interpretation of multi-omics data collected from patients or health systems.

2020
IGAP (integrative genome analysis pipeline) and HMST-Seq-Analyzer (a new Python tool for differential methylation analysis in DNA sequencing data) were developed and published in 2020. Additionally, two publications on the modeling and analysis of breast cancer Hi-C data resulted from a research collaboration with Prof. Jin from the US.

The major aim of the project is to design new computational predictive models by considering diverse information in the genome. The new methods or programs can be applied to predicting the dysregulation of genes in human cancer.

In 2020, Junbai Wang (funded by this project) supervised PhD students Amna Farooq (funded by another project) and Alireza Sahaf (supported by another source) in developing an integrative genome analysis pipeline, IGAP, for studying long-distance gene regulation in 3D genomic space. The work combines biophysical information on DNA sequence (e.g., non-specific transcription factor binding affinity, nTBA) with epigenomic information (e.g., histone modification and 3D chromosomal interactions) in the new predictive model. The pipeline and the new findings from our study (e.g., additive effects of nTBA on regulatory DNA sequences may contribute to long-distance gene regulation) were published in Comput Struct Biotechnol J.

At the same time, Junbai Wang also worked closely with Amna, a third PhD student Omer Ali (funded by another project), a master's student from the Department of Informatics at the University of Oslo, and Prof. Magnar Bjørås to develop HMST-Seq-Analyzer, a new Python package for detecting differentially methylated regions (DMRs) in various DNA methylation sequencing data. The new package can annotate predicted DMRs, give a visual overview of DNA methylation status, and perform preliminary quality checks on the data. It has been successfully applied to several published DNA methylation datasets in cancer (e.g., human hepatocellular carcinoma), where it recovered the published results. The prediction quality of HMST-Seq-Analyzer is similar to that of other published tools (e.g., BSmooth, MethylSig, and MethylKit). However, our program offers more features for genome annotation and result interpretation, and is more user-friendly. The work was published in Comput Struct Biotechnol J.

Through a collaboration with Prof. Jin from the US, Junbai Wang (funded by this project) participated in developing a new method for modeling and analysis of Hi-C data. The new method, which can be used to predict promoter-distal loops, was published in Genome Med 2020. In a second project with Prof. Jin, we published the results of a new study of dynamic changes of the 3D genomic landscape in breast cancer in Biochim Biophys Acta Gene Regul Mech 2020, where Junbai Wang is a co-corresponding author.

Since 2020, Junbai Wang (funded by this project) has been supervising PhD student Omer Ali (funded by another project) in designing a new machine learning approach for clustering transcription factor binding sites. We already have some preliminary results from this work, which is expected to be published in 2021.

Overall, we completed almost all project aims in 2020. After a couple of remaining papers are published in 2021, the project will be fully completed.

2019
Both a new Python package from our group (BayesPI-BAR2, functional mutation prediction using genome sequence) and a collaborative project with Prof. Jin (temporal dynamic reorganization of 3D chromatin architecture in breast cancer) were published in 2019. Our ongoing work focuses on the computational study of 3D chromatin structure.

In 2019, a new functional non-coding mutation prediction tool, BayesPI-BAR2, was published in Front Genet (PMID: 31001324); it uses integrated genome analysis approaches to detect putative functional mutations in whole-genome sequencing data. The tool is suited not only to analyzing genome-wide sequencing data from cancer patients but can also be applied to sequencing data from other diseases. BayesPI-BAR2 is free and publicly available (http://folk.uio.no/junbaiw/BayesPI-BAR2/) and has attracted ~160 downloads worldwide since 2019 (the top five countries are the US, Norway, India, Italy, and the UK), according to Google Analytics statistics.

Last year, results from our ongoing collaborative projects with Prof. Jin's lab were published in Nat Commun (PMID: 30944316). The two co-first authors generated massive time-series data from chromosome conformation capture experiments in breast cancer cell lines, using both Tethered Chromatin Capture (TCC) and High-throughput Detection of Chromosomal Interactions (Hi-C). As second author, I was mainly responsible for the bioinformatics data analysis and the computational interpretation of the results. Some new work related to this exciting project is still ongoing and may yield another paper during 2020.

In addition, we are actively working on DNA methylation data for a neuroepigenetics project with Prof. Bjørås. We hope a draft of the paper will be ready during this year. Through a collaboration with Prof. Bjørås and colleagues from Westlake University (Zhejiang, China), we submitted a grant application to the China Scholarship Council (CSC) last year, seeking support for research in Tumor Evolution and Genome Dynamics.

2018
We developed a new user-friendly Python package (BayesPI-BAR2) for predicting functional non-coding mutations in cancer patient cohorts, based on our proposed pipeline for integrative whole-genome sequence analysis. It was evaluated in follicular lymphoma and skin cancer patients, focusing on sequence variants in gene promoter regions.

There are three major results of our research in 2018:

1) Work from our ongoing research collaboration with Prof. Jin's lab (Three-dimensional analysis reveals altered chromatin interaction by enhancer inhibitors harbors TCF7L2-regulated cancer gene signature) was published in 2018 (PMID: 30548288). We contributed to the data analysis of two types of chromosome conformation capture experiments (Tethered Chromatin Capture, TCC, and High-throughput Detection of Chromosomal Interactions, Hi-C).
2) In 2018, we developed a new user-friendly Python package, BayesPI-BAR2, based on the proposed pipeline for integrative whole-genome sequence analysis. This may be the first prediction package that considers information from both multiple mutations and multiple patients. It was evaluated in follicular lymphoma and skin cancer patients, focusing on sequence variants in gene promoter regions. BayesPI-BAR2 can not only automatically recover known gene regulatory disturbances but also discover novel ones that can be tested in the wet lab. It is a useful tool for predicting functional non-coding mutations in whole-genome sequencing data: it allows the identification of novel TFs whose binding is altered by non-coding mutations in cancer. The BayesPI-BAR2 program can analyze multiple datasets of genome-wide mutations at once and generate concise, easily interpretable reports on potentially affected gene regulatory sites. The package will be freely available to academic users once it is published. The BayesPI-BAR2 manuscript is currently under revision.
3) Additionally, we participated in research collaborations, using our bioinformatics skills to analyze high-throughput experimental datasets such as microarray and exome sequencing data. This work resulted in two co-authored publications (PMID: 29449332 and 29265349).

I am also currently co-editing a special issue, "Integration of Multisource Heterogenous Omics Information in Cancer", for Frontiers in Genetics.

2017
We developed a better predictive model of gene regulation by incorporating diverse information (e.g., DNA shape and the biophysical theory of protein-DNA interaction). The new model was applied to cancer genomic data to study the effects of non-coding DNA sequence variation. The predictions were validated in follicular lymphoma by wet-lab experiments.

Following the original plan in our project proposal, there are three major achievements in 2017:

1) We developed a new biophysical model of protein-DNA interactions that considers DNA shape features. The model is based on the dinucleotide dependency model BayesPI2+. Using the new model, we explored the variation of DNA shape preferences of several transcription factors across various cancer cell lines and cellular conditions. The results reveal DNA shape variations at FOXA1 (Forkhead Box Protein A1) binding sites in steroid-treated MCF7 cells. The new biophysical model is useful for elucidating the finer details of transcription factor-DNA interaction, as well as for predicting the effects of cancer mutations in the future (PMID: 28927002).
2) We extended our BayesPI-BAR method to analyze non-coding mutations in multiple cancer patients by considering diverse information such as the spatial distribution of DNA sequence variants, gene expression, and the biophysical properties of protein-DNA interactions. The new BayesPI-BAR2 method was tested on whole-genome sequencing data from 14 tumor-normal paired follicular lymphoma (FL) patients from the International Cancer Genome Consortium (ICGC). We discovered three novel, highly recurrent regulatory mutation blocks near important genes implicated in FL, BCL6 and BCL2. We also found transcription factors (TFs) whose binding may be disturbed by these mutations in FL: disruption of FOX TF family binding near the BCL6 promoter may result in reduced BCL6 expression, which then increases BCL2 expression beyond that caused by BCL2 gene translocation. Knockdown experiments on two TF hits (FOXD2 and FOXD3) were performed in human B lymphocytes, verifying that they modulate BCL6/BCL2 in line with the computationally predicted effects of the SNVs on TF binding. Our proposed integrative analysis facilitates non-coding driver identification, and the new findings may enhance the understanding of FL (PMID: 28765546).
3) Additionally, we participated in several research collaborations, using our computational biology and bioinformatics skills to analyze various high-throughput experimental datasets such as microarray, RNA-seq, and exome sequencing data. This work also resulted in four co-authored publications.

In 2017, we also presented our research at several international and national conferences:

1) Both a poster and a flash talk on our BayesPI-BAR method were presented at the RegGen COSI track of ISMB/ECCB 2017 (International Society for Computational Biology), July, Prague, Czech Republic.
2) Our ongoing work with Prof. Jin's lab (Temporal dynamic reorganization of 3D chromatin architecture in hormone-induced breast cancer and endocrine resistance) was presented as a poster at RECOMB/ISCB 2017, the Conference on Regulatory & Systems Genomics, with DREAM Challenges, November, New York, USA.
3) Our recent work, Integrative whole-genome sequence analysis reveals roles of regulatory mutations in BCL6 and BCL2 in follicular lymphoma, was selected for an oral presentation at the 6th Norwegian Cancer Symposium, December, Oslo, Norway.
Scientific articles
Gatti F, Mia S, Hammarström C, Frerker N, Fosby B, Wang J, Pietka W, Sundnes O, Hol J, Kasprzycka M, Haraldsen G

Nuclear IL-33 restrains the early conversion of fibroblasts to an extracellular matrix-secreting phenotype.

Sci Rep 2021 Jan 08;11(1):108. Epub 2021 Jan 8

PMID: 33420328

Li J, Fang K, Choppavarapu L, Yang K, Yang Y, Wang J, Cao R, Jatoi I, Jin VX

Hi-C profiling of cancer spheroids identifies 3D-growth-specific chromatin interactions in breast cancer endocrine resistance.

Clin Epigenetics 2021 Sep 17;13(1):175. Epub 2021 Sep 17

PMID: 34535185

Zhou Y, Cheng X, Yang Y, Li T, Li J, Huang TH, Wang J, Lin S, Jin VX

Modeling and analysis of Hi-C data by HiSIF identifies characteristic promoter-distal loops.

Genome Med 2020 Aug 12;12(1):69. Epub 2020 Aug 12

PMID: 32787954

Yang Y, Choppavarapu L, Fang K, Naeini AS, Nosirov B, Li J, Yang K, He Z, Zhou Y, Schiff R, Li R, Hu Y, Wang J, Jin VX

The 3D genomic landscape of differential response to EGFR/HER2 inhibition in endocrine-resistant breast cancer cells.

Biochim Biophys Acta Gene Regul Mech 2020 Nov;1863(11):194631. Epub 2020 Sep 19

PMID: 32956836

Sahaf Naeini A, Farooq A, Bjørås M, Wang J

IGAP-integrative genome analysis pipeline reveals new gene regulatory model associated with nonspecific TF-DNA binding affinity.

Comput Struct Biotechnol J 2020;18:1270-1286. Epub 2020 Jun 2

PMID: 32612751

Pihlstrøm HK, Ueland T, Michelsen AE, Aukrust P, Gatti F, Hammarström C, Kasprzycka M, Wang J, Haraldsen G, Mjøen G, Dahle DO, Midtvedt K, Eide IA, Hartmann A, Holdaas H

Exploring the potential effect of paricalcitol on markers of inflammation in de novo renal transplant recipients.

PLoS One 2020;15(12):e0243759. Epub 2020 Dec 16

PMID: 33326471

Farooq A, Grønmyr S, Ali O, Rognes T, Scheffler K, Bjørås M, Wang J

HMST-Seq-Analyzer: A new python tool for differential methylation and hydroxymethylation analysis in various DNA methylation sequencing data.

Comput Struct Biotechnol J 2020;18:2877-2889. Epub 2020 Oct 10

PMID: 33163148

Batmanov K, Delabie J, Wang J

BayesPI-BAR2: A New Python Package for Predicting Functional Non-coding Mutations in Cancer Patient Cohorts.

Front Genet 2019;10:282. Epub 2019 Apr 2

PMID: 31001324

Zhou Y, Gerrard DL, Wang J, Li T, Yang Y, Fritz AJ, Rajendran M, Fu X, Schiff R, Lin S, Frietze S, Jin VX

Temporal dynamic reorganization of 3D chromatin architecture in hormone-induced breast cancer and endocrine resistance.

Nat Commun 2019 Apr 03;10(1):1522. Epub 2019 Apr 3

PMID: 30944316

Gerrard DL, Wang Y, Gaddis M, Zhou Y, Wang J, Witt H, Lin S, Farnham PJ, Jin VX, Frietze SE

Three-dimensional analysis reveals altered chromatin interaction by enhancer inhibitors harbors TCF7L2-regulated cancer gene signature.

J Cell Biochem 2019 Mar;120(3):3056-3070. Epub 2018 Dec 11

PMID: 30548288

Poulsen LC, Edelmann RJ, Krüger S, Diéguez-Hurtado R, Shah A, Stav-Noraas TE, Renzi A, Szymanska M, Wang J, Ehling M, Benedito R, Kasprzycka M, Bækkevold E, Sundnes O, Midwood KS, Scott H, Collas P, Siebel CW, Adams RH, Haraldsen G, Sundlisæter E, Hol J

Inhibition of Endothelial NOTCH1 Signaling Attenuates Inflammation by Reducing Cytokine-Mediated Histone Acetylation at Inflammatory Enhancers.

Arterioscler Thromb Vasc Biol 2018 Apr;38(4):854-869. Epub 2018 Feb 15

PMID: 29449332

Malecka A, Trøen G, Tierens A, Østlie I, Malecki J, Randen U, Wang J, Berentsen S, Tjønnfjord GE, Delabie JMA

Frequent somatic mutations of KMT2D (MLL2) and CARD11 genes in primary cold agglutinin disease.

Br J Haematol 2018 Dec;183(5):838-842. Epub 2017 Dec 19

PMID: 29265349

Batmanov K, Wang J

Predicting Variation of DNA Shape Preferences in Protein-DNA Interaction in Cancer Cells with a New Biophysical Model.

Genes (Basel) 2017 Sep 18;8(9). Epub 2017 Sep 18

PMID: 28927002

Batmanov K, Wang W, Bjørås M, Delabie J, Wang J

Integrative whole-genome sequence analysis reveals roles of regulatory mutations in BCL6 and BCL2 in follicular lymphoma.

Sci Rep 2017 Aug 01;7(1):7040. Epub 2017 Aug 1

PMID: 28765546

Li X, Lu J, Kan Q, Li X, Fan Q, Li Y, Huang R, Slipicevic A, Dong HP, Eide L, Wang J, Zhang H, Berge V, Goscinski MA, Kvalheim G, Nesland JM, Suo Z

Metabolic reprogramming is associated with flavopiridol resistance in prostate cancer DU145 cells.

Sci Rep 2017 Jul 11;7(1):5081. Epub 2017 Jul 11

PMID: 28698547

Pihlstrøm HK, Gatti F, Hammarström C, Eide IA, Kasprzycka M, Wang J, Haraldsen G, Svensson MHS, Midtvedt K, Mjøen G, Dahle DO, Hartmann A, Holdaas H

Early introduction of oral paricalcitol in renal transplant recipients. An open-label randomized study.

Transpl Int 2017 Aug;30(8):827-840. Epub 2017 Jun 2

PMID: 28436117

Olsen MB, Hildrestrand GA, Scheffler K, Vinge LE, Alfsnes K, Palibrk V, Wang J, Neurauter CG, Luna L, Johansen J, Øgaard JDS, Ohm IK, Slupphaug G, Kusnierczyk A, Fiane AE, Brorson SH, Zhang L, Gullestad L, Louch WE, Iversen PO, Østlie I, Klungland A, Christensen G, Sjaastad I, Sætrom P, Yndestad A, Aukrust P, Bjørås M, Finsen AV

NEIL3-Dependent Regulation of Cardiac Fibroblast Proliferation Prevents Myocardial Rupture.

Cell Rep 2017 Jan 03;18(1):82-92.

PMID: 28052262

Participants
  • Junbai Wang, Researcher (funded by this grant)
  • Tianhai Tian, Project participant
  • Kirill Batmanov, Postdoctoral fellow (other funding)
  • Victor Jin, Project participant
  • Magnar Bjørås, Project participant
  • Jahn M Nesland, Project participant
  • Jan Delabie, Project participant
  • Klaus Beiske, Project participant
