Integrated property-based analysis of human transcription factors
Project
- Project number
- 46056641
- Responsible person
- Sharahm Bahrami
- Institution
- NTNU, IKM
- Project category
- PhD fellowship 2012
- Health category
- Other (unknown, see guidance)
- Research activity
- 1. Underpinning
Reports
The project aims to classify human transcription factors (TFs) functionally by investigating relationships between TF properties and how the TFs are used for gene regulation. This will be done by using integrated resources on TF properties, gene regulation and TF-TF interactions to find significant correlations and overrepresented features.
Gene expression is regulated by transcription factors: essential proteins that bind to specific binding sites (motifs) in the genome and thereby regulate the expression level of specific genes. This regulation depends upon specific features of the transcription factors, including how they interact with DNA, how they interact with each other, and how they are post-translationally modified. Reliable information about key properties associated with transcription factors will therefore be useful for data analysis, in particular of data from high-throughput experiments.
We have used an existing list of 1978 human proteins described as transcription factors to make a well-annotated data set, which includes information on Pfam domains, DNA-binding domains, post-translational modifications and protein-protein interactions. We have then used this data set for enrichment analysis. We have investigated correlations within this set of features, and between the features and more general protein properties. We have also used the data set to analyze previously published gene lists associated with cell differentiation, cancer, and tissue distribution.
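The enrichment analysis described above can be illustrated with a one-sided hypergeometric test, a common choice for overrepresentation analysis. The report does not say which test was actually used, so the following is only a minimal sketch under that assumption. The function name and the counts in the usage line are hypothetical; only the background size of 1978 comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(background: int, with_feature: int,
                           sample: int, sample_with_feature: int) -> float:
    """Probability of seeing at least `sample_with_feature` annotated TFs
    when `sample` TFs are drawn without replacement from `background` TFs,
    of which `with_feature` carry the feature."""
    if not (0 <= with_feature <= background and 0 <= sample <= background):
        raise ValueError("counts must fit inside the background")
    if not 0 <= sample_with_feature <= sample:
        raise ValueError("sample_with_feature must lie between 0 and sample")
    upper = min(with_feature, sample)
    total = comb(background, sample)
    # Sum the upper tail of the hypergeometric distribution with exact integers.
    tail = sum(
        comb(with_feature, k) * comb(background - with_feature, sample - k)
        for k in range(sample_with_feature, upper + 1)
    )
    return tail / total


# Hypothetical example: 200 of the 1978 TFs carry a given domain,
# and 12 of them appear in a gene list of 50 TFs.
p_value = hypergeom_enrichment_p(1978, 200, 50, 12)
```

When many features are tested against the same gene list, the resulting p-values would normally also need a multiple-testing correction.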
The study shows that a well-annotated feature list for transcription factors is a useful resource for extensive data analysis, both of transcription factor properties in general and of properties associated with specific processes. However, the study also shows that such analyses are easily biased by incomplete coverage in experimental data, and by how gene sets are defined.
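One way incomplete coverage could bias such an analysis is through the choice of background set. The sketch below is an illustration with invented counts, not a result from the study: it assumes a gene list of 40 well-studied TFs that all have interaction data, and compares the fold enrichment of a feature against all 1978 TFs with the fold enrichment against only the (hypothetical) 900 TFs that have any interaction data.

```python
def fold_enrichment(hits_in_list: int, list_size: int,
                    hits_in_background: int, background_size: int) -> float:
    """Fraction of the list carrying the feature, divided by the
    fraction of the background carrying the feature."""
    if list_size <= 0 or background_size <= 0 or hits_in_background <= 0:
        raise ValueError("sizes and background hits must be positive")
    return (hits_in_list / list_size) / (hits_in_background / background_size)


# Hypothetical counts: 450 TFs have the feature, and it can only be
# observed for the 900 TFs with interaction data.
against_all_tfs = fold_enrichment(20, 40, 450, 1978)
against_covered_tfs = fold_enrichment(20, 40, 450, 900)
```

With these invented numbers the apparent enrichment of roughly twofold against all TFs disappears once the background is restricted to TFs for which the feature could have been observed at all.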
The goal of the project is to classify human transcription factors (TFs) functionally by investigating relationships between TF properties and how the TFs are used for gene regulation. This will be done by using integrated resources on transcription factor properties, gene regulation and TF-TF interactions.
Regulation of gene expression by transcription factors is an essential part of the general regulatory system of any cell. More precise knowledge about this process will facilitate our understanding of normal cell regulation and how it may be affected by internal or external perturbations. This project will contribute to more precise knowledge by using bioinformatic and statistical methods to study correlations between structural features of transcription factors and their functional roles in gene regulation. This will improve our understanding of the complementary regulatory roles exercised by different classes of transcription factors, and how this may be used to analyze, for example, expression data related to normal and disease-associated regulatory networks. The project will build an integrated resource from heterogeneous experimental data on transcription factor properties from different sources, complemented with predicted data when necessary. This resource will be used for statistical analysis addressing key research questions on transcription factor activity, in particular with respect to interactions in regulatory modules. This will show how the interplay between transcription factor properties facilitates gene regulation, which may be used to improve regulatory network models. This is important basic research for future biomedical projects where bioinformatic methods are used to analyze high-throughput data generated from, for example, biopsies or biobank samples.
The project involves making a comprehensive overview of Pfam domains, DNA-binding domains (DBDs), and post-translational modifications (PTMs) of transcription factors, and relating this to protein-protein interaction (PPI) data and functional differences for human TFs, using statistical overrepresentation analysis.
Regulation of gene expression by transcription factors is an essential part of the general regulatory system of any cell. Gene expression and its regulation involve the binding of many regulatory transcription factors (TFs) to specific DNA elements called transcription factor binding sites (TFBSs). Transcription factors are proteins containing a sequence-specific DNA-binding domain (DBD). The interaction between a TF and its TFBS defines the specificity of the TF; it is mediated by non-covalent interactions between structural motifs of the TF and the surfaces of the DNA bases and backbone.
Transcription factors are proteins that regulate genes in the genome. This project classifies all known human transcription factors based on important properties. This classification is then used in bioinformatic analysis of gene regulation, to identify the properties that are important for regulatory function.
The expression of genes is controlled by transcription factors that bind to regulatory regions in the genome. These factors are proteins that recognize specific motifs in the genome and bind to them. This contributes to increasing or reducing the expression of the gene in question. Transcription factors differ in many ways, for example in how they bind to DNA, which other domains they have in addition to the DNA-binding domain, how they are modified, and which other proteins they are in physical contact with. Through a better understanding of these properties we can also gain a better understanding of how the interplay between different transcription factors leads to a specific regulatory process.
In its start-up phase the project has focused on building a good and comprehensive database of transcription factors and their properties. This includes identification and annotation of relevant domains in the individual factors. This is done on the basis of the Pfam database, which describes a large number of known domains in various proteins. Particular emphasis has been placed on identifying DNA-binding domains. These are only partly described in Pfam, and prediction methods for DNA binding have therefore been used in addition. Information is now also being added on protein-protein interactions and on the various modifications made to the protein to activate or deactivate it in the cell.
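The kind of integrated record described above can be sketched as a small data structure that merges per-source annotations onto a fixed list of transcription factors. This is a hypothetical illustration, not the project's actual database schema: the class name, field names, function name and all identifiers in the usage example are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class TFRecord:
    """All annotations collected for one transcription factor."""
    tf_id: str
    pfam_domains: Set[str] = field(default_factory=set)
    dna_binding_domains: Set[str] = field(default_factory=set)
    ptms: Set[str] = field(default_factory=set)
    interactors: Set[str] = field(default_factory=set)


def integrate(tf_ids: Iterable[str],
              pfam_domains: Optional[Dict[str, Set[str]]] = None,
              dna_binding_domains: Optional[Dict[str, Set[str]]] = None,
              ptms: Optional[Dict[str, Set[str]]] = None,
              interactors: Optional[Dict[str, Set[str]]] = None) -> Dict[str, TFRecord]:
    """Merge per-source annotation mappings into one record per TF.
    Annotations for proteins outside the TF list are ignored."""
    records = {tf_id: TFRecord(tf_id) for tf_id in tf_ids}
    sources = {
        "pfam_domains": pfam_domains,
        "dna_binding_domains": dna_binding_domains,
        "ptms": ptms,
        "interactors": interactors,
    }
    for attribute, mapping in sources.items():
        for tf_id, values in (mapping or {}).items():
            if tf_id in records:
                getattr(records[tf_id], attribute).update(values)
    return records


# Hypothetical identifiers, for illustration only.
records = integrate(
    ["TF_A", "TF_B"],
    pfam_domains={"TF_A": {"domain_x"}, "TF_C": {"domain_y"}},
    interactors={"TF_A": {"TF_B"}},
)
```

Keeping each feature type as a set per TF makes it straightforward to count how many TFs in a gene list carry a given feature, which is the input an overrepresentation test needs.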
The next step in the project will be to use this database as the basis for a bioinformatic analysis of transcription factors in different regulatory regions, with a view to classifying the factors into functional classes based on the interplay between properties (from the database) and function (observed from genome-scale data from, for example, the international ENCODE project). This may contribute to a better understanding of diseases related to changes in gene regulation.
Scientific articles
Bahrami S, Ehsani R, Drabløs F
A property-based analysis of human transcription factors.
BMC Res Notes 2015;8:82. Epub 2015 Mar 14
PMID: 25890365 - Included in the doctoral thesis
Participants
- Rezvan Ehsani, PhD fellow
- Pål Sætrom, co-supervisor
- Finn Drabløs, main supervisor