Disentangling transcription factor binding site complexity

Permalink

http://hdl.handle.net/10138/298979

Citation

Eggeling, R 2018, 'Disentangling transcription factor binding site complexity', Nucleic Acids Research, vol. 46, no. 20, e121. https://doi.org/10.1093/nar/gky683

Title: Disentangling transcription factor binding site complexity
Author: Eggeling, Ralf
Contributor: University of Helsinki, Department of Computer Science
Date: 2018-11-16
Language: eng
Number of pages: 12
Belongs to series: Nucleic Acids Research
ISSN: 0305-1048
URI: http://hdl.handle.net/10138/298979
Abstract: The binding motifs of many transcription factors (TFs) comprise a higher degree of complexity than a single position weight matrix model permits. Additional complexity is typically taken into account either as intra-motif dependencies via more sophisticated probabilistic models or as heterogeneities via multiple weight matrices. However, both orthogonal approaches have limitations when learning from in vivo data where binding sites of other factors in close proximity can interfere with motif discovery for the protein of interest. In this work, we demonstrate how intra-motif complexity can, purely by analyzing the statistical properties of a given set of TF-binding sites, be distinguished from complexity arising from an intermix with motifs of co-binding TFs or other artifacts. In addition, we study the related question whether intra-motif complexity is represented more effectively by dependencies, heterogeneities or variants in between. Benchmarks demonstrate the effectiveness of both methods for their respective tasks and applications on motif discovery output from recent tools detect and correct many undesirable artifacts. These results further suggest that the prevalence of intra-motif dependencies may have been overestimated in previous studies on in vivo data and should thus be reassessed.
Subject: 113 Computer and information sciences
112 Statistics and probability
1182 Biochemistry, cell and molecular biology
Transcription factors
Sequence motifs
PROTEIN-DNA INTERACTIONS
ChIP-Seq

Files in this item

File: gky683.pdf
Size: 2.373Mb
Format: PDF
