Browsing by Subject "ACCURATE"

Now showing items 1-11 of 11
  • Lehtola, Susi; Blockhuys, Frank; Van Alsenoy, Christian (2020)
    A uniform derivation of the self-consistent field equations in a finite basis set is presented. Both restricted and unrestricted Hartree-Fock (HF) theory as well as various density functional approximations are considered. The unitary invariance of the HF and density functional models is discussed, paving the way for the use of localized molecular orbitals. The self-consistent field equations are derived in a non-orthogonal basis set, and their solution is discussed also in the presence of linear dependencies in the basis. It is argued why iterative diagonalization of the Kohn-Sham-Fock matrix leads to the minimization of the total energy. Alternative methods for the solution of the self-consistent field equations via direct minimization as well as stability analysis are briefly discussed. Explicit expressions are given for the contributions to the Kohn-Sham-Fock matrix up to meta-GGA functionals. Range-separated hybrids and non-local correlation functionals are summarily reviewed.
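As a rough illustration of solving the self-consistent field equations in a non-orthogonal and possibly linearly dependent basis, the sketch below solves the generalized eigenproblem FC = SCE for one fixed Fock matrix using canonical orthogonalization. This is a common textbook technique and is not necessarily the procedure of the paper; the function names and the `threshold` value are assumptions chosen for the example.

```python
import numpy as np


def canonical_orthogonalizer(S, threshold=1e-6):
    """Build X with X.T @ S @ X = I, dropping near-null directions of S.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    evals, evecs = np.linalg.eigh(S)
    keep = evals > threshold  # discard linearly dependent combinations
    return evecs[:, keep] / np.sqrt(evals[keep])


def solve_roothaan(F, S, threshold=1e-6):
    """Solve F C = S C E for one fixed Fock matrix F and overlap matrix S."""
    X = canonical_orthogonalizer(S, threshold)
    F_orth = X.T @ F @ X  # Fock matrix in the orthonormal basis
    orbital_energies, C_orth = np.linalg.eigh(F_orth)
    C = X @ C_orth  # back-transform to the original basis
    return orbital_energies, C


if __name__ == "__main__":
    # Hypothetical 2x2 matrices, used only to exercise the functions.
    S = np.array([[1.0, 0.5], [0.5, 1.0]])
    F = np.array([[-1.0, -0.5], [-0.5, -0.5]])
    energies, C = solve_roothaan(F, S)
    print(energies)
    print(C.T @ S @ C)  # should be close to the identity matrix
```

In a full calculation this step would be repeated, rebuilding F from the occupied orbitals in C until the density stops changing.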
  • Hanif, Tanzeela; Dhaygude, Kishor; Kankainen, Matti; Renkonen, Jutta; Mattila, Pirkko; Ojala, Teija; Joenvaara, Sakari; Mäkelä, Mika; Pelkonen, Anna; Kauppi, Paula; Haahtela, Tari; Renkonen, Risto; Toppila-Salmi, Sanna (2019)
  • Rautiainen, Mikko; Mäkinen, Veli; Marschall, Tobias (2019)
    Motivation: Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph. Results: We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers' bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with $|V|$ nodes and $|E|$ edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of $O(|V| + \lceil m/w \rceil |E| \log w)$ for acyclic graphs and $O(|V| + m|E| \log w)$ for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm.
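For readers unfamiliar with the Shift-And algorithm named in the abstract above, here is a minimal sketch of the classical linear (sequence-to-sequence) version, not the graph generalization from the paper. The function name `shift_and_search` is hypothetical. Python integers are unbounded, so the word size w plays no role here; in a language with fixed-width words, a pattern of at most w characters would fit in one machine word.

```python
def shift_and_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every exact occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0              # bit i set: pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


if __name__ == "__main__":
    print(shift_and_search("GATTACATTA", "ATTA"))  # [1, 6]
```

Each text character costs one shift, one OR and one AND regardless of the pattern length, which is the source of the up-to-w speedup mentioned in the abstract.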
  • Jurkute, Neringa; Majander, Anna; Bowman, Richard; Votruba, Marcela; Abbs, Stephen; Acheson, James; Lenaers, Guy; Amati-Bonneau, Patrizia; Moosajee, Mariya; Arno, Gavin; Yu-Wai-Man, Patrick (2019)
  • Mukherjee, Kingshuk; Rossi, Massimiliano; Salmela, Leena; Boucher, Christina (2021)
    Genome wide optical maps are high resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single molecule optical maps, which are called Rmaps. Unfortunately, there are very few choices for assembling Rmap data. There exists only one publicly-available non-proprietary method for assembly and one proprietary software that is available via an executable. Furthermore, the publicly-available method, by Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006), follows the overlap-layout-consensus (OLC) paradigm, and therefore is unable to scale for relatively large genomes. The algorithm behind the proprietary method, Bionano Genomics' Solve, is largely unknown. In this paper, we extend the definition of bi-labels in the paired de Bruijn graph to the context of optical mapping data, and present the first de Bruijn graph based method for Rmap assembly. We implement our approach, which we refer to as rmapper, and compare its performance against the assembler of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) and Solve by Bionano Genomics on data from three genomes: E. coli, human, and climbing perch fish (Anabas testudineus). Our method was able to successfully run on all three genomes. The method of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) only successfully ran on E. coli. Moreover, on the human genome rmapper was at least 130 times faster than Bionano Solve, used five times less memory and produced the highest genome fraction with zero mis-assemblies. Our software, rmapper, is written in C++ and is publicly available under GNU General Public License at .
  • Lehtola, Susi (2019)
    We present the implementation of a variational finite element solver in the HelFEM program for benchmark calculations on diatomic systems. A basis set of the form $\chi_{nlm}(\mu, \nu, \phi) = B_n(\mu) Y_l^m(\nu, \phi)$ is used, where $(\mu, \nu, \phi)$ are transformed prolate spheroidal coordinates, $B_n(\mu)$ are finite element shape functions, and $Y_l^m$ are spherical harmonics. The basis set allows for an arbitrary level of accuracy in calculations on diatomic molecules, which can be performed at present with either nonrelativistic Hartree-Fock (HF) or density functional (DF) theory. Hundreds of DFs at the local spin density approximation (LDA), generalized gradient approximation (GGA), and the meta-GGA level can be used through an interface with the Libxc library; meta-GGA and hybrid DFs are not available in other fully numerical diatomic program packages. Finite electric fields are also supported in HelFEM, enabling access to electric properties. We introduce a powerful tool for adaptively choosing the basis set by using the core Hamiltonian as a proxy for its completeness. HelFEM and the novel basis set procedure are demonstrated by reproducing the restricted open-shell HF limit energies of 68 diatomic molecules from the first to the fourth period with excellent agreement with literature values, despite requiring orders of magnitude fewer parameters for the wave function. Then, the electric properties of the BH and N$_2$ molecules under finite field are studied, again yielding excellent agreement with previous HF limit values for energies, dipole moments, and dipole polarizabilities, again with much more compact wave functions than what were needed for the literature references. Finally, HF, LDA, GGA, and meta-GGA calculations of the atomization energy of N$_2$ are performed, demonstrating the superb accuracy of the present approach.
  • Vatanen, Tommi; Plichta, Damian R.; Somani, Juhi; Muench, Philipp C.; Arthur, Timothy D.; Hall, Andrew Brantley; Rudolf, Sabine; Oakeley, Edward J.; Ke, Xiaobo; Young, Rachel A.; Haiser, Henry J.; Kolde, Raivo; Yassour, Moran; Luopajärvi, Kristiina; Siljander, Heli; Virtanen, Suvi M.; Ilonen, Jorma; Uibo, Raivo; Tillmann, Vallo; Mokurov, Sergei; Dorshakova, Natalya; Porter, Jeffrey A.; McHardy, Alice C.; Lahdesmaki, Harri; Vlamakis, Hera; Huttenhower, Curtis; Knip, Mikael; Xavier, Ramnik J. (2019)
    The human gut microbiome matures towards the adult composition during the first years of life and is implicated in early immune development. Here, we investigate the effects of microbial genomic diversity on gut microbiome development using integrated early childhood data sets collected in the DIABIMMUNE study in Finland, Estonia and Russian Karelia. We show that gut microbial diversity is associated with household location and linear growth of children. Single nucleotide polymorphism- and metagenomic assembly-based strain tracking revealed large and highly dynamic microbial pangenomes, especially in the genus Bacteroides, in which we identified evidence of variability deriving from Bacteroides-targeting bacteriophages. Our analyses revealed functional consequences of strain diversity; only 10% of Finnish infants harboured Bifidobacterium longum subsp. infantis, a subspecies specialized in human milk metabolism, whereas Russian infants commonly maintained a probiotic Bifidobacterium bifidum strain in infancy. Groups of bacteria contributing to diverse, characterized metabolic pathways converged to highly subject-specific configurations over the first two years of life. This longitudinal study extends the current view of early gut microbial community assembly based on strain-level genomic variation.
  • Ritari, Jarmo; Salojärvi, Jarkko; Lahti, Leo; de Vos, Willem M. (2015)
    Background: Current sequencing technology enables taxonomic profiling of microbial ecosystems at high resolution and depth by using the 16S rRNA gene as a phylogenetic marker. Taxonomic assignation of newly acquired data is based on sequence comparisons with comprehensive reference databases to find consensus taxonomy for representative sequences. Nevertheless, even with well-characterised ecosystems like the human intestinal microbiota it is challenging to assign genus and species level taxonomy to 16S rRNA amplicon reads. A part of the explanation may lie in the sheer size of the search space where competition from a multitude of highly similar sequences may not allow reliable assignation at low taxonomic levels. However, when studying a particular environment such as the human intestine, it can be argued that a reference database comprising only sequences that are native to the environment would be sufficient, effectively reducing the search space. Results: We constructed a 16S rRNA gene database based on high-quality sequences specific for human intestinal microbiota, resulting in a curated data set consisting of 2473 unique prokaryotic species-like groups and their taxonomic lineages, and compared its performance against the Greengenes and Silva databases. The results showed that, regardless of the assignment algorithm used, our database improved taxonomic assignation of 16S rRNA sequencing data by enabling a significantly higher species and genus level assignation rate while preserving taxonomic diversity and demanding fewer computational resources. Conclusion: The curated human intestinal 16S rRNA gene taxonomic database of about 2500 species-like groups described here provides a practical solution for significantly improved taxonomic assignment for phylogenetic studies of the human intestinal microbiota.
  • van der Kolk, Birgitta W.; Saari, Sina; Lovric, Alen; Arif, Muhammad; Alvarez, Marcus; Ko, Arthur; Miao, Zong; Sahebekhtiari, Navid; Muniandy, Maheswary; Heinonen, Sini; Oghabian, Ali; Jokinen, Riikka; Jukarainen, Sakari; Hakkarainen, Antti; Lundbom, Jesper; Kuula, Juho; Groop, Per-Henrik; Tukiainen, Taru; Lundbom, Nina; Rissanen, Aila; Kaprio, Jaakko; Williams, Evan G.; Zamboni, Nicola; Mardinoglu, Adil; Pajukanta, Paivi; Pietiläinen, Kirsi H. (2021)
    Tissue-specific mechanisms prompting obesity-related development complications in humans remain unclear. We apply multiomics analyses of subcutaneous adipose tissue and skeletal muscle to examine the effects of acquired obesity among 49 BMI-discordant monozygotic twin pairs. Overall, adipose tissue appears to be more affected by excess body weight than skeletal muscle. In heavier co-twins, we observe a transcriptional pattern of downregulated mitochondrial pathways in both tissues and upregulated inflammatory pathways in adipose tissue. In adipose tissue, heavier co-twins exhibit lower creatine levels; in skeletal muscle, glycolysis- and redox stress-related protein and metabolite levels remain higher. Furthermore, metabolomics analyses in both tissues reveal that several proinflammatory lipids are higher and six of the same lipid derivatives are lower in acquired obesity. Finally, in adipose tissue, but not in skeletal muscle, mitochondrial downregulation and upregulated inflammation are associated with a fatty liver, insulin resistance, and dyslipidemia, suggesting that adipose tissue dominates in acquired obesity.
  • Shahi, Chandra; Bhattarai, Puskar; Wagle, Kamal; Santra, Biswajit; Schwalbe, Sebastian; Hahn, Torsten; Kortus, Jens; Jackson, Koblar A.; Peralta, Juan E.; Trepte, Kai; Lehtola, Susi; Nepal, Niraj K.; Myneni, Hemanadhan; Neupane, Bimal; Adhikari, Santosh; Ruzsinszky, Adrienn; Yamamoto, Yoh; Baruah, Tunna; Zope, Rajendra R.; Perdew, John P. (2019)
    Semilocal approximations to the density functional for the exchange-correlation energy of a many-electron system necessarily fail for lobed one-electron densities, including not only the familiar stretched densities but also the less familiar but closely related noded ones. The Perdew-Zunger (PZ) self-interaction correction (SIC) to a semilocal approximation makes that approximation exact for all one-electron ground- or excited-state densities and accurate for stretched bonds. When the minimization of the PZ total energy is made over real localized orbitals, the orbital densities can be noded, leading to energy errors in many-electron systems. Minimization over complex localized orbitals yields nodeless orbital densities, which reduce but typically do not eliminate the SIC errors of atomization energies. Other errors of PZ SIC remain, attributable to the loss of the exact constraints and appropriate norms that the semilocal approximations satisfy, suggesting the need for a generalized SIC. These conclusions are supported by calculations for one-electron densities and for many-electron molecules. While PZ SIC raises and improves the energy barriers of standard generalized gradient approximations (GGAs) and meta-GGAs, it reduces and often worsens the atomization energies of molecules. Thus, PZ SIC raises the energy more as the nodality of the valence localized orbitals increases from atoms to molecules to transition states. PZ SIC is applied here, in particular, to the strongly constrained and appropriately normed (SCAN) meta-GGA, for which the correlation part is already self-interaction-free. This property makes SCAN a natural first candidate for a generalized SIC. Published under license by AIP Publishing.
  • Sundholm, Dage; Pyykkö, Pekka (2018)
    New standard values -116(2) mb and 76(3) mb are suggested for the nuclear quadrupole moments (Q) of the $^{39}$Ar and $^{37}$Ar nuclei, respectively. The Q values were obtained by combining optical measurements of the quadrupole coupling constant (B or eqQ/h) of the $3s^2 3p^5 4s\,[3/2]_2$ ($^3P^o$) and $3s^2 3p^5 4p\,[5/2]_3$ ($^3D^e$) states of argon with large scale numerical complete active space self-consistent field and restricted active space self-consistent field calculations of the electric field gradient at the nucleus (q) using the LUCAS code, which is a finite-element based multiconfiguration Hartree-Fock program for atomic structure calculations. $Q(^{39}_{18}\mathrm{Ar}) = -116(2)$ mb