Faculty of Science


Recent Submissions

  • Maarala, Ilari (Helsingin yliopisto, 2021)
    High-throughput sequencing (HTS) technologies have enabled rapid DNA sequencing of whole genomes collected from various organisms and environments, including human tissues, plants, soil, water, and air. As a result, sequencing data volumes have grown by several orders of magnitude, and the number of assembled whole genomes is increasing rapidly as well. This whole-genome sequencing (WGS) data has revealed the genetic variation in humans and other species, and advanced various fields from human and microbial genomics to drug design and personalized medicine. The amount of sequencing data has almost doubled every six months, creating new possibilities but also big data challenges in genomics. Diverse methods used in modern computational biology require a vast amount of computational power, and advances in HTS technology are further widening the gap between the analysis input data and the analysis outcome. Currently, many existing genomic analysis tools, algorithms, and pipelines do not fully exploit the power of distributed and high-performance computing, which in turn limits analysis throughput and restrains the deployment of the applications to clinical practice in the long run. Thus, the relevance of harnessing distributed and cloud computing in bioinformatics is more significant than ever before. In addition, efficient data compression and storage methods for genomic data processing and retrieval, integrated with conventional bioinformatics tools, are essential. These vast datasets have to be stored and structured in formats that can be managed, processed, searched, and analyzed efficiently in distributed systems. Genomic data contain repetitive sequences, a key property exploited in developing efficient compression algorithms to alleviate the data storage burden. Moreover, indexing compressed sequences appropriately for bioinformatics tools, such as read aligners, offers direct sequence search and alignment capabilities with compressed indexes.
Relative Lempel-Ziv (RLZ) has been found to be an efficient compression method for repetitive genomes that complies with the data-parallel computing approach. RLZ has recently been used to build hybrid indexes compatible with read aligners, and we focus on extending it with distributed computing. Data structures found in genomic data formats have properties suitable for parallelizing routine bioinformatics methods, e.g., sequence matching, read alignment, genome assembly, genotype imputation, and variant calling. Compressed indexing fused with routine bioinformatics methods and data-parallel computing seems a promising approach to building population-scale genome analysis pipelines. Various data decomposition and transformation strategies are studied for optimizing data-parallel computing performance when such routine bioinformatics methods are executed in a complex pipeline. These novel distributed methods are studied in this dissertation and demonstrated in a generalized scalable bioinformatics analysis pipeline design. The dissertation starts from the main concepts of genomics and DNA sequencing technologies and builds routine bioinformatics methods on the principles of distributed and parallel computing. It then advances towards designing fully distributed and scalable bioinformatics pipelines, focusing on population genomic problems where the input data sets are vast and the analysis results are hard to achieve with conventional computing. Finally, the methods studied are applied in scalable population genomics applications using real WGS data and evaluated on a high-performance computing cluster. The experiments include mining virus sequences from human metagenomes, imputing genotypes from large-scale human populations, sequence alignment with compressed pan-genomic indexes, and assembling reference genomes for pan-genomic variant calling.
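The RLZ scheme described above can be illustrated with a minimal sketch: the target genome is encoded as a sequence of (position, length, mismatch) phrases copied from a reference. This is an illustrative greedy implementation with naive matching, not the dissertation's distributed version; production parsers use a suffix array or FM-index over the reference.

```python
def rlz_parse(reference: str, target: str):
    """Greedy Relative Lempel-Ziv parse: encode `target` as a list of
    (ref_pos, length, mismatch_char) phrases relative to `reference`."""
    phrases = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # naive longest-match search; real implementations use a suffix
        # array or FM-index over the reference for this step
        for j in range(len(reference)):
            l = 0
            while (i + l < len(target) and j + l < len(reference)
                   and reference[j + l] == target[i + l]):
                l += 1
            if l > best_len:
                best_pos, best_len = j, l
        mismatch = target[i + best_len] if i + best_len < len(target) else ''
        phrases.append((best_pos, best_len, mismatch))
        i += best_len + 1
    return phrases

def rlz_decode(reference: str, phrases):
    """Reconstruct the target by copying phrases back out of the reference."""
    out = []
    for pos, length, mismatch in phrases:
        out.append(reference[pos:pos + length])
        out.append(mismatch)
    return ''.join(out)
```

Because repetitive genomes share long substrings with the reference, the phrase list is typically far shorter than the target itself, which is what makes RLZ effective for population-scale collections.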
  • Rantanen, Kari (Helsingin yliopisto, 2021)
    Graphical models are a versatile machine learning framework enabling efficient and intuitive representations of probability distributions. They can be used for performing various data analysis tasks that would not be feasible otherwise. This is made possible by constructing a graph structure which encodes the underlying dependence structure of the probability distribution. To that end, the field of structure learning develops specialized algorithms which can learn a structure that describes given data well. This thesis presents advances in structure learning for three different graphical model classes: chordal Markov networks, maximal ancestral graphs, and causal graphs with cycles and latent confounders. Learning structures for these model classes has turned out to be a very challenging task, with the few existing exact methods scaling to far fewer random variables than the more extensively developed methods for Bayesian networks. Chordal Markov networks are a central class of undirected graphical models. Being equivalent to so-called decomposable models, they are essentially a special case of Bayesian networks. This thesis presents an exact branch-and-bound algorithm and an inexact stochastic local search for learning chordal Markov network structures. Empirically we show that the branch and bound is at times able to learn provably optimal structures with a higher number of variables than competing methods, whereas the local search scales considerably further. Maximal ancestral graphs are a generalization of Bayesian networks which allow for representing the influence of unobserved variables. This thesis introduces the first exact search algorithm for learning maximal ancestral graphs and a framework for generating and pruning local scores for the search. Empirically we show that the proposed exact search is able to learn higher-quality structures than existing inexact methods.
Finally, we introduce an exact branch-and-bound algorithm for learning causal graphs in the presence of cycles and latent confounders. Our empirical results show that the presented algorithm is able to learn optimal structures considerably faster than a recent exact method for the problem. We also extend our branch-and-bound algorithm to support interventional data and σ-separation, and show empirically that the algorithm can handle a larger number of experimental datasets than the only competing method supporting σ-separation.
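The exact searches above rely on the classical branch-and-bound principle: prune any branch whose optimistic bound cannot beat the best complete solution found so far. A minimal sketch of that principle, applied to a toy subset-selection (knapsack) objective rather than the thesis's structure-learning score:

```python
def branch_and_bound_knapsack(values, weights, capacity):
    """Illustrative branch-and-bound: explore take/skip decisions and
    prune any branch whose optimistic bound cannot beat the incumbent."""
    n = len(values)
    # consider items in decreasing value density (tightens the bound)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    best = 0

    def bound(k, cap):
        # optimistic upper bound: fill remaining capacity fractionally
        total = 0.0
        for i in order[k:]:
            if weights[i] <= cap:
                cap -= weights[i]
                total += values[i]
            else:
                total += values[i] * cap / weights[i]
                break
        return total

    def search(k, cap, value):
        nonlocal best
        if value > best:
            best = value          # new incumbent solution
        if k == len(order) or value + bound(k, cap) <= best:
            return                # prune: bound cannot improve the incumbent
        i = order[k]
        if weights[i] <= cap:     # branch: take item i
            search(k + 1, cap - weights[i], value + values[i])
        search(k + 1, cap, value)  # branch: skip item i

    search(0, capacity, 0)
    return best
```

In structure learning, the "items" are choices such as parent sets or edges, the score replaces the knapsack value, and the quality of the bound determines how far the exact search scales.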
  • Saressalo, Anton (Helsingin yliopisto, 2021)
    Electric discharge is present in various aspects of our everyday lives. Internal combustion engines rely on spark plugs for the running of the motor, fluorescent lighting functions by gas discharge, and a lightning bolt strikes somewhere on Earth every second. An electrical breakdown is an event where a voltage across two conductive electrodes, separated by an electrically insulating medium, becomes high enough for the insulating properties of the medium to be weakened, subsequently allowing an electric current to pass through the medium. A special type of such an event is a vacuum arc breakdown, where the electrodes are separated by an evacuated gap, which acts as a good insulator but will still break down under a sufficiently high voltage. When controlled, electric arcing can be used as a powerful tool to focus energy on a specific location. However, several applications are also hindered by the occurrence of breakdowns, including particle accelerators, vacuum interrupters, and solar panels. A common factor in these applications is the aim to maximize the electric field strength to optimize the operational efficiency and ecological footprint of the device. The breakdown phenomenon lies at the crossroads of many fields of science, including plasma, materials, and surface physics. Efforts to explain the origin of breakdowns have been ongoing for more than a hundred years, and, despite constant progress, there are only hypotheses on the exact nature of the process. This work presents an experimental approach for studying the breakdown phenomenon between Cu electrodes separated by a vacuum gap. The breakdowns are generated by repeatedly applying high-voltage pulses across the gap. Statistics of the events, such as the breakdown frequency, are investigated, and any effects on the surface are analyzed.
It was shown that cleaning the electrode surface, either by electric pulsing or by plasma treatment, improves the breakdown resistance of the system, whereas any idle time between the high-voltage pulses increases the breakdown probability. Furthermore, it was found that the breakdown events can be attributed to distinct classes, suggesting separate processes responsible for the breakdown generation. One set of processes was labeled extrinsic, as they are driven by external factors responsible for the contamination of the electrode surface. The other processes were characterized as intrinsic, as they were defined by inherent material properties and continued affecting the breakdown frequency even when the effect of extrinsic processes was minimized by plasma cleaning of the surface. Understanding the formation mechanisms of a vacuum arc breakdown allows designing applications that can sustain higher electric fields without breakdown events. The results of this work provide insight into how improving the surface state of an electrode can increase its breakdown resistance. Additionally, an algorithm is presented for optimally recovering the pulsing voltage to a high level after a previous breakdown, with a minimal probability of follow-up breakdowns.
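The abstract does not spell out the recovery algorithm; as a hypothetical sketch of the general idea, one common scheme backs the voltage off multiplicatively after each breakdown and ramps it back toward the target. The parameters (`drop`, `ramp`) and the toy breakdown-probability model below are illustrative assumptions, not values from the thesis.

```python
import random

def condition_voltage(n_pulses, v_target, bd_prob, seed=0,
                      drop=0.8, ramp=1.02):
    """Hypothetical voltage-recovery schedule (not the thesis algorithm):
    after each simulated breakdown, reduce the pulse voltage by a factor
    `drop`, then ramp it back up multiplicatively toward `v_target`."""
    rng = random.Random(seed)
    v = v_target
    history = []
    for _ in range(n_pulses):
        # toy model: breakdown probability grows with applied voltage
        if rng.random() < bd_prob * (v / v_target):
            v *= drop                       # back off after a breakdown
        else:
            v = min(v * ramp, v_target)     # recover toward the target
        history.append(v)
    return history
```

The trade-off such a schedule manages is exactly the one the abstract describes: ramping up too quickly risks follow-up breakdowns, while ramping up too slowly wastes conditioning time at low field strength.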
  • Joshi, Satya Prakash (Helsingin yliopisto, 2021)
    Combustion of practical fuels proceeds via an extremely large number of elementary reactions, which makes it difficult to model their combustion chemistry. To resolve this problem, the combustion chemistry of a practical fuel can be emulated with a small set of surrogate fuels that involves a much more limited number of elementary reactions. In this work, the kinetics of various unsaturated radical reactions have been studied, which are central to the combustion of important surrogate fuels such as propene, 2-methyl-2-butene (2M2B) and methyl-crotonate (MC). Apart from investigating the reaction kinetics on a fundamental level, the practical aim of this work is to provide rate-coefficient data over a wide range of conditions, which is expected to significantly improve the accuracy of current combustion models, as the reactions studied in this work are an integral part of the detailed reaction schemes used for modeling the combustion of practical fuels. The kinetic experiments presented in this work were conducted using a laser photolysis–photoionization mass spectrometry (LP–PIMS) apparatus. Whenever required, the results of the LP–PIMS experiments were further supported and extrapolated to combustion conditions using quantum-chemistry calculations and master equation (ME) modeling. The kinetics of the CH3CCH2 + O2, cis/trans-CH3CHCH + O2, (CH3)2CCH + O2 and (CH3)2CCCH3 + O2 reactions were measured over a wide temperature range (220–660 K) and at low pressures (0.3–2 Torr). These vinyl-type radicals are derivatives of methyl-group substitution at the α- and/or β-hydrogens of the vinyl radical (C2H3). The main goal of this study was to quantify the effects of the CH3-group substitutions on the kinetics and reactivity of vinyl-type radicals towards O2.
Comparing the measured bimolecular rate coefficients for the aforementioned vinyl-type radical + O2 reactions reveals that CH3-group substitution at the α- and β-positions of the C2H3 radical has an increasing (~50%) and a decreasing (~30%) effect on its reactivity towards O2, respectively. Because the CH2CHCHC(O)OCH3 radical has a highly resonance-stabilized structure, the CH2CHCHC(O)OCH3 + O2 reaction is expected to be slow. A very low upper limit (k ≤ 7.5 × 10^-17 cm^3 molecule^-1 s^-1) for the bimolecular rate coefficient of the CH2CHCHC(O)OCH3 + O2 reaction was measured at 600 K. Following this, the thermal unimolecular decomposition kinetics of the CH2CHCHC(O)OCH3 radical were studied over the temperature range 750–869 K and at low pressures (< 9 Torr). Subsequently, the measured thermal unimolecular decomposition rate coefficients were modeled using ME. The kinetics of the reaction between the resonance-stabilized (CH3)2CCHCH2 radical and O2 was studied over the temperature range 238–660 K and at low pressures (0.2–5.7 Torr). The most important observation of this study was the opening of high-temperature reaction channels at temperatures above 500 K. A thorough study combining single- and multi-reference quantum-chemistry calculations and ME modeling was performed to corroborate the experimental findings. Importantly, the observed high-temperature (T > 500 K) kinetics of the (CH3)2CCHCH2 + O2 reaction is significantly faster than currently incorporated in combustion models.
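Rate coefficients measured over wide temperature ranges, like those above, are conventionally reported in the modified Arrhenius form k(T) = A·T^n·exp(-Ea/(R·T)) so that combustion models can evaluate them at arbitrary temperatures. A minimal evaluator follows; the parameter values in the example are illustrative placeholders, not fitted values from this work.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def modified_arrhenius(T, A, n, Ea):
    """Modified Arrhenius form k(T) = A * T**n * exp(-Ea / (R*T)),
    the standard parameterization for rate coefficients over wide
    temperature ranges (A, n, Ea here are illustrative placeholders)."""
    return A * T**n * math.exp(-Ea / (R * T))

# e.g. tabulate an illustrative k(T) across the measured range 220-660 K
ks = [modified_arrhenius(T, A=1.0e-12, n=0.5, Ea=5.0e3)
      for T in (220.0, 440.0, 660.0)]
```

For a positive activation energy Ea, k(T) increases with temperature, which is one reason low-pressure laboratory data must be extrapolated to combustion conditions with master equation modeling rather than used directly.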
  • Liangsupree, Thanaporn (Helsingin yliopisto, 2021)
    This doctoral dissertation focuses on the elucidation of the biochemical and chemical compositions of clinically relevant human plasma-derived nanosized particles, namely lipoproteins and extracellular vesicle (EV) subpopulations isolated by on-line coupled immunoaffinity – asymmetric flow field-flow fractionation. Raman spectroscopy and chromatographic techniques, along with statistical and computational models for complex data analysis, were employed for compositional studies. Continuous-flow quartz crystal microbalance (QCM) combined with an advanced numerical tool, the Adaptive Interaction Distribution Algorithm (AIDA), gave valuable information on their binding kinetics and interactions, also helpful for the development of the isolation methods. The first step was to develop a fast and reliable platform for the isolation and fractionation of low-density lipoprotein (LDL) particles and EV subpopulations from human plasma. The isolation was based on highly specific and selective affinity chromatography with monolithic disk columns, enabling convective mass transport and high permeability. The LDL isolation system utilized two monolithic disk columns, one immobilized with chondroitin-6-sulfate (C6S) and another with a monoclonal anti-apolipoprotein B-100 (anti-apoB-100) antibody. The first disk removed very-low-density and intermediate-density lipoproteins from human plasma, while the second isolated LDLs from the flow-through plasma. The EV isolation methods included four immunoaffinity ligands: monoclonal anti-CD9, anti-CD63, anti-CD81, and anti-CD61 antibodies. The isolates were further fractionated on-line by asymmetric flow field-flow fractionation, resulting in EV subpopulations with size ranges of <50 nm (exomeres) and 50–120 nm (exosomes). The developed systems allowed automated, quick, highly reliable, and successful isolation and fractionation of both lipoproteins and EV subpopulations with minimal losses and contamination.
Raman spectroscopy combined with statistical models was successfully used to prove the hypothesis that plasma-derived EVs of different sizes and origins have different biochemical compositions. In addition, EVs were clearly distinguished from non-EV components, such as apoB-100-containing lipoproteins and human plasma. Plasma-derived EV subpopulations, including CD9+, CD63+, and CD81+ EVs, gave distinct spectral compositions compared to platelet-derived CD61+ EV subpopulations, and diversity was found even within the exomere and exosome size ranges. In parallel, the fatty acid composition of lipoproteins and EV subpopulations was analyzed by comprehensive two-dimensional gas chromatography – time-of-flight mass spectrometry, and the amino acid and glucose content by hydrophilic interaction liquid chromatography – tandem mass spectrometry. EV subpopulations were free from detectable apoB-100-containing lipoproteins and differed in amino acid and fatty acid compositions. Detailed binding kinetics and interaction studies carried out with continuous-flow QCM and the data analysis tool AIDA provided knowledge of biological system heterogeneity and binding kinetics parameters, useful for the development of affinity chromatographic methods and for the determination of molecular properties of both lipoproteins and EV subpopulations.
  • Chen, Yuxing (Helsingin yliopisto, 2021)
    The substantial increase in real-life applications nowadays creates large-scale, ever-increasing raw heterogeneous data, corresponding to the four Vs of big data: volume, variety, velocity and veracity. We discuss the volume and variety challenges in this thesis. For volume, efficiently extracting valuable information and making predictions from these large-scale data are of interest to various quarters, from academic researchers and industrial data scientists to customers and shareholders. For variety, much research addresses the challenges of effectively storing, collecting, processing, querying, and analyzing heterogeneous data. This thesis develops approaches to optimize performance under the volume and variety challenges. For the volume challenge, we aim at performance tuning for big data systems. In this part, to tackle cold-start situations with no statistics for models, we leverage cost models and triangulation to model the performance, thus leading to cost-effective prediction. For the variety challenge, we aim at optimizing join queries. In this part, to fill the gap left by the scarcity of research on join queries over heterogeneous data (i.e., relational and tree data), we study the size bound and the worst-case optimal join algorithm for relational and tree data, in contrast with relations only. For parameter tuning, this thesis first proposes a cost model for Spark workloads, which leverages Monte Carlo simulation to achieve cost-effective training. Specifically, we utilize a small fraction of the resources and data to predict dependable performance for larger clusters and datasets, even with data skewness and runtime deviations. In particular, this work considers network and disk bounds, so that it performs better with I/O-bounded workloads. Next, the thesis proposes $d$-simplexed, which models Spark workloads by leveraging Delaunay triangulation.
Unlike other black-box ML methods, $d$-simplexed utilizes piece-wise linear regression models, which can be built faster and yield better predictions. Also, $d$-simplexed is built with an adaptive sampling technique which collects only a few training points yet achieves accurate prediction. For join queries, this thesis studies the worst-case optimal join with relational and tree data. To this end, we first embark on the study of the cross-model conjunctive query (CMCQ) with relational and tree data, and formally define the problem of CMCQ processing. We reveal that the computation of the worst-case size bound of a CMCQ is NP-hard w.r.t. query expression complexity. We then develop a worst-case optimal join algorithm called CMJoin to match the size bound of a CMCQ under some circumstances.
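CMJoin itself handles mixed relational and tree data; as a purely relational illustration of the attribute-at-a-time principle behind worst-case optimal joins, here is a generic join for the triangle query Q(a,b,c) = R(a,b) ⋈ S(b,c) ⋈ T(a,c). Rather than joining two relations at a time, it binds one attribute at a time, intersecting the candidate sets contributed by each relation.

```python
from collections import defaultdict

def triangle_join(R, S, T):
    """Generic worst-case optimal join for the triangle query
    Q(a, b, c) = R(a, b) & S(b, c) & T(a, c): bind one attribute at a
    time, intersecting the candidate sets from each relation."""
    R_by_a = defaultdict(set)
    for a, b in R:
        R_by_a[a].add(b)
    S_by_b = defaultdict(set)
    for b, c in S:
        S_by_b[b].add(c)
    T_by_a = defaultdict(set)
    for a, c in T:
        T_by_a[a].add(c)

    out = []
    for a in set(R_by_a) & set(T_by_a):        # bind a
        for b in R_by_a[a] & set(S_by_b):      # bind b given a
            for c in S_by_b[b] & T_by_a[a]:    # bind c given a and b
                out.append((a, b, c))
    return out
```

Binding attributes rather than joining pairs of relations is what lets such algorithms match the worst-case output size bound (for the triangle query, O(N^{3/2}) on relations of size N), whereas any pairwise join plan can produce a quadratic intermediate result.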
  • Oikarinen, Joona (Helsingin yliopisto, 2021)
    This thesis concerns constructive Liouville Conformal Field Theory (LCFT), which is a certain two-dimensional quantum field theory with conformal symmetry. The focus is on the properties of the correlation functions of the primary fields. The correlation functions are defined by an explicit path integral construction, discovered by David, Kupiainen, Rhodes and Vargas. We consider the regularity of the correlation functions and their dependence on the background metric of the theory. In the physics literature, LCFT originated in the 1980s in Polyakov's study of the path integral quantization of non-critical bosonic string theory in the conformal gauge. Shortly after this, Liouville theory appeared in two-dimensional quantum gravity, and later Liouville theory was found to have a deep connection to four-dimensional Yang–Mills theories via the AGT conjecture. Liouville theory was also one of the motivations for the development of the conformal bootstrap approach to two-dimensional Conformal Field Theory in the 1980s by Belavin, Polyakov and Zamolodchikov. The mathematically rigorous study of Liouville theory started in the 2010s. The path integral of the theory was successfully constructed using Kahane's theory of Gaussian Multiplicative Chaos. Soon after the construction of the path integral, many of the physicists' predictions were proven rigorously. The first article of the dissertation considers the regularity of the LCFT correlation functions. The correlation functions are shown to be smooth when the insertion points are distinct. The method is based on identities obtained from Gaussian integration by parts, and a fusion estimate for the correlation functions. The fusion estimate controls the singularities of the correlation functions when insertions approach each other pairwise. The second article concerns the stress-energy tensor of LCFT, which describes the response of the theory to variations of the background metric.
Conformal Ward identities are derived for the correlation functions of the stress-energy tensor for LCFT on the Riemann sphere. The identities are consequences of conformal symmetry, and on the sphere this is an especially strong constraint because the sphere admits only one conformal structure. The correlation functions are shown to be smooth with respect to the background metric, and then their functional derivatives with respect to the background metric are computed. In the computation the Beltrami equation plays a crucial role. The third article combines the methods used in both of the previous articles. The Conformal Ward identities are derived on hyperbolic compact Riemann surfaces. Now the space of conformal structures is non-trivial. This means that the conformal symmetry does not fully fix the dependence of the correlation functions on the metric, and thus a separate study is required for this new degree of freedom. It turns out that the method developed in the first article also suffices to show that the variation of the correlation functions with respect to the conformal structure defines a smooth function. After this, the derivation of the Conformal Ward identities is similar to the computation done in the second article.
  • Virman, Meri (Helsingin yliopisto, 2021)
    Almost all of the precipitation in the tropics is caused by or linked with atmospheric deep convection, which also contributes significantly to summertime precipitation in the midlatitudes. Deep convection is often associated with hazardous weather, including thunderstorms, tornadoes and hurricanes. On a global scale, deep convection produces most of the precipitation on Earth, affects large-scale weather and climate patterns and transfers water vapor and heat to the upper troposphere. The physical mechanisms in deep convection, especially those responsible for its sensitivity to low-to-midtropospheric moisture, need to be understood in order to produce realistic weather and climate forecasts. In this thesis, a novel mechanism potentially controlling deep convection and contributing to its moisture sensitivity has been studied. Vertical temperature structures associated with 1) precipitation over tropical oceans and 2) evaporation of stratiform precipitation were investigated using radiosonde and satellite observations at tropical sounding stations and idealized model simulations, respectively. After precipitation over tropical oceans, warm layers below cold layers were observed in the lower troposphere, but only over dry regions. The idealized simulations showed that evaporation of stratiform precipitation results in qualitatively similar temperature structures. It was concluded that, depending on the amount of low-to-midtropospheric moisture, evaporation of stratiform precipitation and the resulting adiabatic subsidence warming could cause lower-tropospheric warm anomalies that may control the formation of subsequent deep convection and thereby explain why deep convection depends on the amount of low-to-midtropospheric moisture. The ERA5 and ERA-Interim reanalyses were also compared with observations at the tropical sounding stations.
The comparison showed that the newer ERA5 deviated more from the observations in the low-to-midtroposphere than the older ERA-Interim. Based on comparisons to previous studies, it was concluded that the underlying model in ERA5 may not represent the moisture sensitivity of convection entirely correctly. This thesis suggests that tropical stratiform precipitation contributes to the moisture sensitivity of deep convection. Accurate representations of tropical stratiform precipitation and its evaporation may be key to a more realistic representation and understanding of convective phenomena in numerical weather prediction and climate models.
  • Ryöti, Miliza (Helsingin yliopisto, 2021)
    This study explores the inherently political nature of visualizing spatial information. Theoretically, I draw from scholars of geography’s visual culture, including a body of literature on critical cartography and the performativity of cartographic practices. Combining the concept of performativity with notions of bounded rationality and critical approaches to evidence-based planning and policy-making, I propose a framework of performativity defined by aspects of potential, process and outcome. Using this framework, I argue that the politics of visual practices in planning take both explicit and implicit forms. Methodologically, this is an ethnographic study of strategic spatial planning practices aimed at preventing residential segregation, on both city-regional and municipal levels. My ethnographic fieldwork consisted of over a decade of professional involvement, which made me an insider researcher and my work as much autoethnography as institutional ethnography. In addition to my extensive fieldwork, I generated research material in visual elicitation interviews with planning professionals and locally elected decision-makers taking part in the planning processes. The visualizations’ roles varied from technical tools for exploring the statistical data and assessing its quality to serving as boundary objects in framing the problem and negotiating the policy measures. The communicative role of the visualizations was purposefully ‘muted,’ thus limiting their performative potential to professional use only. This decision was rationalized as a socially and ethically responsible way of avoiding the stigmatization of residential areas, thus serving the public interest and promoting future development. In my analysis, however, I identified additional motivations behind these depoliticized visual practices.
Although in evidence-based planning expert knowledge is largely considered objective and neutral, it is also the product of a planning rationality bounded by institutional conventions, organizational structures, limited resources and managerialist tendencies. The evidence thus becomes inherently political in the sense that it reflects the technocratic approach of getting things done, leaving unexplored anything not directly concerning the task at hand. In my discussion and conclusions, I advocate for a more pluralistic approach to visualizations that draws together different aspects of physical, functional and social space and opens them to interpretation and co-authorship beyond the realm of planners. Challenging naturalized spatial imaginaries of segregation and stigma may result in new performative outcomes, such as cross-disciplinary policy measures.
  • Nordling, Kalle (Helsingin yliopisto, 2021)
    Anthropogenic aerosols alter the climate by scattering and absorbing incoming solar radiation and by modifying clouds’ optical properties, causing a global cooling or warming effect. Anthropogenic aerosols are partly co-emitted with anthropogenic greenhouse gases, and future climate mitigation actions will lead to the decline of the anthropogenic aerosols’ cooling effect. However, the exact cooling effect is still uncertain. Part of this uncertainty is related to the structural differences of current climate models. This work evaluates the present-day anthropogenic aerosol temperature and precipitation effects and the factors behind the model differences. The key questions of this thesis were: 1) What are the climate effects of present-day anthropogenic aerosols? 2) What mechanisms drive the model-to-model differences? 3) How do future reductions affect local and global climates? The global models ECHAM6 and NorESM1 were used to evaluate the present-day climate effects with the identical anthropogenic aerosol scheme MACv2-SP. The results reveal that an identical anthropogenic aerosol description does not reduce the uncertainty related to anthropogenic aerosol climate effects, and the remaining difference in the estimates is due to model dynamics and the oceans. The key mechanisms driving the differences between the models were evaluated using data from the Precipitation Driver Response Model Intercomparison Project (PDRMIP). Similar mechanisms drive the model-to-model differences for greenhouse gases and aerosols, where the key drivers are the differences in water vapor, the vertical temperature structure of the atmosphere, and sea ice and snow cover changes. However, on a regional scale, the key drivers differ. Future anthropogenic aerosol effects were evaluated using new CMIP6 data. This work shows the importance of anthropogenic aerosols for current and future climate change.
For a more accurate assessment of the climate impacts of anthropogenic aerosols, one also needs to consider the remote effects of local aerosols. The Arctic regions are particularly sensitive to midlatitude aerosols, such as Asian aerosols, which are expected to decline in the coming decades. To gain a more accurate estimation of anthropogenic aerosol effects, it is not sufficient to focus only on the composition and geographical distribution of aerosols, as the dynamic response of the climate is also important. The global temperature results did not indicate a clear aerosol signal; however, future temperature development over the Asian regions is modulated by future Asian aerosol emissions.
  • Vestenius, Mika (Helsingin yliopisto, 2021)
    Air pollution is an important environmental risk to human health and ecosystems around the world. Particulate matter (PM), especially fine particulate matter, is an important part of this air pollution problem. Particle composition varies greatly and depends on the emission source. In addition to inorganic components, the organic particulate fraction can contain several hundred organic compounds from anthropogenic and natural sources. The health risk of particulates is related to the particle size and the compounds inside or on the surface of the aerosol particles. The overall aim of this thesis was to study selected chemical substances of atmospheric aerosol from both anthropogenic and natural sources. Concentrations of polycyclic aromatic hydrocarbons (PAHs) and biogenic organic acids in aerosol were measured, and their effect on local air quality was estimated. The sources of PAHs, trace elements, biogenic volatile organic compounds (BVOCs), and persistent organic pollutants (POPs) in air were studied using positive matrix factorization (PMF), which was used as the main source apportionment tool in three of the five papers and for the unpublished data in this thesis. Particles from burning emissions, e.g., diesel particles and particles from biomass burning, are the most toxic in our daily environment. Because of the intensive use of wood for heating and in sauna stoves, residential biomass burning is the major PAH air pollution source in Finland. The main source of PAHs at Virolahti was found to be a combustion- and traffic-related source from the direction of St. Petersburg. In contrast, local traffic appeared to have a very small influence on PAH levels in the Helsinki Metropolitan Area (HMA), as local residential wood burning was found to be the main benzo(a)pyrene source there. Biogenic VOCs like monoterpenes and sesquiterpenes are highly reactive and oxidize rapidly in the atmosphere, producing secondary organic aerosol (SOA).
We showed that positive matrix factorization (PMF) is a useful tool for estimating separate sources in a quasistationary dynamic system such as ambient VOC concentrations in the boreal forest. Selected biogenic organic acids were measured from fine particles in the boreal forest in order to estimate their influence on aerosol production. The results indicated that sesquiterpene emissions from the boreal forest are probably underestimated and that their oxidation products probably play a more important role in SOA production than previously estimated. The Kola Peninsula area was found to be the major source of heavy metal pollution at Pallas. However, as Norilsk Nickel has now partly shut down its metallurgical operations, the trace element and SO2 emissions from the Kola Peninsula should decline in the future. Ambient concentrations of POP compounds are declining globally, but in the Arctic this is not the case for some compounds. In the source apportionment study of the 1996–2018 POP data from Pallas, a relatively large portion of the measured POPs arrived with the marine source from clean areas to the north. These long-lived compounds, which have migrated into the Arctic from southern areas along air and sea currents for many decades, are now released back into the atmosphere from the melting Arctic ice cover due to global warming. For these compounds, the Arctic has turned from a sink into a source.
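The receptor-model idea behind PMF can be illustrated with a minimal non-negative matrix factorization on synthetic data. This sketch uses plain, unweighted multiplicative updates rather than the uncertainty-weighted least squares of the actual PMF tool, and every matrix below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 100 samples, 6 measured species, 2 hidden "sources"
G_true = rng.uniform(0, 1, (100, 2))   # source contributions per sample
F_true = rng.uniform(0, 1, (2, 6))     # chemical profile of each source
X = G_true @ F_true                    # observed concentration matrix

# Non-negative factorization X ~ G F via multiplicative updates
# (real PMF additionally weights residuals by measurement uncertainty)
k = 2
G = rng.uniform(0.1, 1, (100, k))
F = rng.uniform(0.1, 1, (k, 6))
eps = 1e-12
for _ in range(500):
    F *= (G.T @ X) / (G.T @ G @ F + eps)
    G *= (X @ F.T) / (G @ F @ F.T + eps)

residual = np.linalg.norm(X - G @ F) / np.linalg.norm(X)
print(f"relative residual: {residual:.2e}")
```

The non-negativity constraint is what makes the recovered factors interpretable as physical source profiles and contributions.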
  • Korpela, Jussi (Helsingin yliopisto, 2021)
    This thesis is about inverse problems. Inverse problems have a rich mathematical theory that employs modern methods in partial differential equations, numerical analysis, probability theory, harmonic analysis, and differential geometry. Inverse problems research lies at the intersection of pure and applied mathematics. Traditionally, inverse problems are application oriented, although there are also purely mathematical problems that are considered inverse problems. In this study, the wave equation is the physical model for the analysis. The wave equation tells us how disturbances travel through a medium, transporting energy from one location to another without transporting matter. It is a mathematical model that describes many physical phenomena in a reasonably accurate manner. In many situations, the initial and boundary value problem for the wave equation is a convenient structure for solving inverse problems. Thus it is the central framework for the analysis here, augmented with the concept of measurements on the boundary, the so-called Neumann-to-Dirichlet map. This dissertation consists of three publications. In Publication I, an inverse boundary value problem for a 1+1-dimensional wave equation with wave speed $c(x)$ is considered. We give a regularization strategy for inverting the map $\mathcal{A}: c \mapsto \Lambda$, where $\Lambda$ is the hyperbolic Neumann-to-Dirichlet map corresponding to the wave speed $c$. In Publication II, an inverse boundary value problem for the 1+1-dimensional wave equation $(\partial_t^2 - c(x)^2 \partial_x^2)u(x,t)=0$, $x \in \mathbb{R}_+$, is considered. We give a discrete regularization strategy for recovering the wave speed $c(x)$ when we are given the boundary value of the wave, $u(0,t)$, produced by a single pulse-like source. The regularization strategy gives an approximate wave speed $\widetilde{c}$ satisfying a H\"older-type estimate $\|\widetilde{c}-c\| \leq C \epsilon^{\gamma}$, where $\epsilon$ is the noise level.
In Publication III, we study the wave equation on a bounded domain of $\mathbb{R}^m$ and on a compact Riemannian manifold $M$ with boundary. We assume that the coefficients of the wave equation are unknown, but that we are given the hyperbolic Neumann-to-Dirichlet map $\Lambda$ that corresponds to physical measurements on the boundary. Using the knowledge of $\Lambda$, we construct a sequence of Neumann boundary values such that, at a time $T$, the corresponding waves converge to zero while the time derivatives of the waves converge to a delta distribution. Such waves are called \textit{artificial point sources}. The convergence of the waves takes place in the function spaces naturally related to the energy of the wave. We apply the results to inverse problems and demonstrate the focusing of the waves numerically in the one-dimensional case.
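The boundary measurement setting can be illustrated numerically: a finite-difference sketch of the 1+1-dimensional wave equation that drives the left boundary with a Neumann pulse and records the resulting Dirichlet trace $u(0,t)$, i.e. one sample of the Neumann-to-Dirichlet map. The wave-speed profile and discretization below are illustrative choices, not taken from the publications:

```python
import numpy as np

# u_tt = c(x)^2 u_xx on [0, 1]: drive the left boundary with a Neumann
# pulse and record the Dirichlet trace u(0, t) -- one column of the
# Neumann-to-Dirichlet map. c(x) is an arbitrary smooth bump.
nx, nt = 200, 800
dx = 1.0 / (nx - 1)
x = np.linspace(0.0, 1.0, nx)
c = 1.0 + 0.3 * np.exp(-((x - 0.5) / 0.1) ** 2)   # unknown wave speed
dt = 0.5 * dx / c.max()                            # CFL-stable time step

u_prev = np.zeros(nx)
u = np.zeros(nx)
trace = []                                          # Dirichlet data u(0, t)
for n in range(nt):
    t = n * dt
    lap = np.zeros(nx)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    u_next = 2 * u - u_prev + dt**2 * c**2 * lap
    # Neumann source at x=0: u_x(0,t) = -pulse(t), via a ghost point
    pulse = np.exp(-((t - 0.1) / 0.02) ** 2)
    u_next[0] = u_next[1] + dx * pulse
    u_next[-1] = 0.0                                # fixed wall at x=1
    trace.append(u_next[0])
    u_prev, u = u, u_next

print(f"peak boundary response: {max(trace):.4f}")
```

The inverse problem is then to recover the hidden profile `c(x)` from traces like this one, which is what the regularization strategies of Publications I and II address.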
  • Hirviniemi, Olli (Helsingin yliopisto, 2021)
    This dissertation contains three articles on regularity properties of quasiconformal mappings and mappings of finite distortion. Quasiconformal mappings have bounded distortion, and we examine several ways to generalize their known regularity results to the case where the distortion is no longer uniformly bounded. In the first article, we study the analytic regularity of mappings with exponentially integrable distortion. In particular, our main theorem is that in the borderline case, where the derivative is not square-integrable, we can find a weight function, logarithmic in the distortion function, such that the derivative is square-integrable in the weighted space. We also provide a better weight for radial functions and obtain regularity results for the case where the distortion function has better-than-exponential integrability. In the second article, we consider the geometric regularity when a line is mapped quasiconformally. Local stretching is understood by investigating the modulus of continuity of both the mapping and its inverse, while local rotation is measured by comparing the image of a segment to logarithmic spirals. These two properties can be studied jointly as complex powers. We prove a bound, valid almost everywhere on the line, for the possible complex exponents that improves the earlier, more general bound for 1-dimensional subsets of the plane. In the third article, our focus is on mappings of finite distortion that are quasiconformal inside a disk. We give an example of a planar domain that cannot be the image of the disk under such a mapping, with a boundary satisfying a three-point condition with a tighter control function than earlier examples. We also prove a modulus-of-continuity estimate for the inverse mapping and establish optimal integrability for the derivative under an additional assumption on the singularities.
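For context, the distortion inequality underlying these notions can be stated as follows (a standard textbook definition, not quoted from the articles themselves):

```latex
% A Sobolev mapping f has finite distortion if there is a measurable
% function K(z) >= 1 such that
\[
  |Df(z)|^2 \le K(z)\, J_f(z) \qquad \text{for a.e. } z,
\]
% where J_f is the Jacobian determinant. If K(z) \le K uniformly,
% f is K-quasiconformal; "exponentially integrable distortion"
% relaxes the uniform bound to
\[
  \exp\bigl(\lambda K(\cdot)\bigr) \in L^1_{\mathrm{loc}}
  \qquad \text{for some } \lambda > 0.
\]
```

The first article's regularity questions live exactly in this relaxed regime, where the derivative may just fail to be square-integrable.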
  • Åberg, Susanne (Helsingin yliopisto, 2021)
    Will be sent separately.
  • Hämäläinen, Karoliina (Helsingin yliopisto, 2021)
    Renewable energy sources play a major role in mitigating the effects of power production on climate change. However, many renewable energy sources are weather dependent, and accurate weather forecasts are needed to support energy production estimates. This dissertation aims to develop meteorological solutions to support wind energy production and to answer the following questions: How accurate are wind forecasts at wind turbine hub height? What is the annual distribution of the wind speed? How much energy can be harvested from the wind? How does atmospheric icing affect wind energy production, and how do we forecast these events? The first part of this dissertation concentrates on resource mapping. Wind and icing atlases provide valuable information for planning wind parks and locating new ones. The atlases provide climatological information on the mean wind speed, the potential to generate wind power, and atmospheric icing conditions in Finland. Based on mean wind speed and direction, altogether 72 representative months were simulated to represent the wind climatology of the past 30 years. A similarly detailed selection could not be made with respect to the icing process due to the lack of icing observations. However, sensitivity tests were performed with respect to temperature and relative humidity, which influence icing formation. According to these sensitivity tests, the selected period was found to represent the icing climatology as well. The results are presented in gridded form with a 2.5 km horizontal resolution for heights of 50 m, 100 m and 200 m above the ground, representing typical hub heights of a wind turbine. Daily probabilistic wind forecasts can bring additional value to decision making in support of wind energy production. Probabilistic weather forecasts not only provide wind forecasts but also give estimates of the forecast uncertainty. However, probabilistic wind forecasts are often underdispersive.
In this thesis, statistical calibration methods combined with a new type of wind observation were utilized. The aim was to study whether lidar and radar wind observations at 100 m height can be used for ensemble calibration. The results strongly indicate that the calibration enhances the forecast skill by enlarging the ensemble spread and by decreasing the RMSE. The most significant improvements are identified at shorter lead times and for weak or moderate wind speeds. For the strongest winds no improvements are seen, owing to the small number of strong-wind cases during the calibration training period. In addition to wind speed, wind power generation at northern latitudes is mostly affected by atmospheric icing. However, measuring icing is difficult for many reasons, and few observations are available. Therefore, the suitability of a new type of ceilometer-based icing profile for atmospheric icing model validation was tested in this thesis. The results support the use of these ceilometer icing profiles for model verification. Furthermore, this new extensive observation network provides opportunities for deeper investigation of the properties and structure of icing clouds.
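The effect of calibration on an underdispersive ensemble can be sketched as follows. This toy example debiases a synthetic wind ensemble and inflates its spread toward the ensemble-mean error variance; the thesis calibrates against real lidar and radar observations with proper statistical methods, so all data and the calibration rule below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic underdispersive wind ensemble: biased mean, members too close
n_cases, n_members = 1000, 10
truth = rng.gamma(shape=2.0, scale=4.0, size=n_cases)    # "observed" wind
bias = 0.8
ens_mean = truth + bias + rng.normal(0, 1.5, n_cases)    # forecast error
members = ens_mean[:, None] + rng.normal(0, 0.4, (n_cases, n_members))

def spread(m):
    return np.sqrt(m.var(axis=1, ddof=1).mean())

def rmse(m, obs):
    return np.sqrt(((m.mean(axis=1) - obs) ** 2).mean())

# Minimal calibration: remove the mean bias, then inflate member spread
# toward the standard deviation of the ensemble-mean error
debiased = members - (members.mean() - truth.mean())
err_sd = (debiased.mean(axis=1) - truth).std()
factor = err_sd / spread(debiased)
mean_col = debiased.mean(axis=1, keepdims=True)
calibrated = mean_col + factor * (debiased - mean_col)

print(f"raw:        spread={spread(members):.2f}  RMSE={rmse(members, truth):.2f}")
print(f"calibrated: spread={spread(calibrated):.2f}  RMSE={rmse(calibrated, truth):.2f}")
```

After calibration the spread grows toward the actual error magnitude while the RMSE drops, which is the qualitative behaviour the thesis reports.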
  • Yang, Dong (Helsingin yliopisto, 2021)
    Poly(N-acryloyl glycinamide) (PNAGA) is a non-ionic polymer possessing an upper critical solution temperature (UCST) in water and saline solutions. This thesis explores the synthesis of stimuli-responsive PNAGA microgels and the immobilization of catalytically active species inside them. The synthesis of the PNAGA microgels was conducted in water by free-radical precipitation polymerization below the phase transition temperature of PNAGA, in the presence of an N,N’-methylenebisacrylamide crosslinker. These water-dispersed PNAGA microgels show reversible size changes, swelling upon heating and shrinking upon cooling. The PNAGA microgels were used as hosts for nanocatalysts by loading them with silver nanoparticles (AgNPs) via reduction of AgNO3. The thermosensitive behavior of the PNAGA microgels was retained after loading the AgNPs, and the catalytic activity of the metal particles in 4-nitrophenol reduction was tested under different conditions. Furthermore, it was shown that the catalytic activity of the AgNP–PNAGA microgels could be switched on and off by changing the temperature, utilizing the thermosensitivity. To realize biocatalytic microgels, the enzyme β-D-glucosidase was immobilized by encapsulation during the NAGA precipitation polymerization. The properties of these hybrid microgels were studied by varying the enzyme-to-monomer ratio and the degree of crosslinking. The microgel-encapsulated enzymes showed enhanced activity at high pH compared to the native enzymes. Tandem catalysts were then produced by further encapsulation of AgNPs. These were used in cascade reactions involving first enzymatic catalysis followed by AgNP-induced reduction. The catalyst loading efficiency and the thermoresponsive properties were tuned by copolymerizing methacrylic acid (MAA) with NAGA.
The volume phase transition behavior of, and the interactions between, NAGA and MAA in the poly(N-acryloyl glycinamide-co-methacrylic acid) [P(NAGA−MAA)] copolymer microgels were studied. AgNPs were immobilized inside the P(NAGA−MAA) microgels using both UV light and chemical reduction. The photoreduction resulted in smaller AgNPs, and the amount and size of the AgNPs were observed to depend on the MAA content. The UV-reduced AgNPs showed significantly higher catalytic activity than the chemically reduced AgNPs in the P(NAGA−MAA) microgels.
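Catalytic tests like the 4-nitrophenol reduction mentioned above are conventionally followed by UV-Vis absorbance and analysed as pseudo-first-order kinetics, ln(A_t/A_0) = -k_app·t. A sketch of that analysis on synthetic absorbance data (the rate constant and noise level are invented for illustration, not taken from the thesis):

```python
import numpy as np

# Synthetic absorbance decay of 4-nitrophenolate during catalytic
# reduction, sampled every 30 s with 1 % multiplicative noise
t = np.arange(0.0, 600.0, 30.0)              # seconds
k_true = 4.0e-3                               # 1/s, illustrative value
rng = np.random.default_rng(2)
A = 1.2 * np.exp(-k_true * t) * (1 + rng.normal(0, 0.01, t.size))

# Linear fit of ln(A_t / A_0) vs t gives the apparent rate constant
slope, intercept = np.polyfit(t, np.log(A / A[0]), 1)
k_app = -slope
print(f"apparent rate constant: {k_app:.2e} 1/s")
```

Comparing k_app measured at different temperatures is how the on/off thermal switching of such catalysts is typically quantified.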
  • Havukainen, Joona (Helsingin yliopisto, 2021)
    The LHC particle accelerator at CERN is probing the elementary building blocks of matter at energies never before seen in laboratory conditions. In the process of providing new insights into the Standard Model, which describes the current understanding of the physics governing the behaviour of particles, the accelerator is challenging the algorithms and techniques used to store the collected data, to rebuild the collision events from the detector signal, and to analyse the data. To this end, many state-of-the-art methods are being developed by the scientists working on the LHC experiments in order to gain as much knowledge as possible from the unique data collected from these particle collisions. The decade starting from 2010 can in many respects be considered the deep learning revolution, in which a family of machine learning algorithms collectively called deep neural networks had significant breakthroughs driven by advances in the hardware used to train them. During this period many achievements previously only seen in science fiction became reality, as deep neural networks began driving cars, images and videos could be enhanced with super-resolution in real time, and improvements in automated translation tools lowered the barriers to communication between people. These results have given the field of deep learning significant momentum and led to the methods spreading across academic disciplines as well as different industries. In this thesis the recent advances of deep learning are applied in the realm of particle physics using the data collected by the CMS experiment at the LHC at CERN. The first topic presented considers the task of rebuilding the flight paths of charged particles, called tracks, inside the detector using the measurements made by the Tracker sub-detector at the heart of CMS. The conditions inside the detector during particle collisions demand advanced algorithms that are both fast and precise.
The project in this thesis looks at estimating the quality of the reconstructed tracks and rejecting tracks that appear to result from mistakes made by the reconstruction algorithms, purifying the reconstructed dataset of false signals. Previously the task was performed first with cut-based selections determined by physicists and later with another machine learning algorithm, the boosted decision tree. Here the first application of deep neural networks to the task is presented, with the goal of both simplifying the upkeep of the classifier and improving its performance. The second topic presents the application of deep neural network classifiers in the context of a search for a new particle, the charged Higgs boson. Here the main focus is on producing a classifier that has been decorrelated from a variable of interest that will be used in making the final discovery or exclusion of the hypothetical particle. The classifier can then be used like any other selection step in the analysis, aiming to separate known Standard Model background events from the expected signal without distorting the distribution of the variable of interest. Both research topics present first-time use cases at CMS for deep neural networks in their respective contexts, and the work includes the full stack of solving a machine learning problem: from the data collection strategy, to cleaning the data and working out the meaningful input variables, all the way to training, optimizing and deploying the model to obtain the final performance results.
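The track-classification approach can be sketched with a minimal neural network on synthetic "track quality" features. The real CMS classifier is a deeper network trained on actual track variables; the tiny numpy network and fabricated features below only illustrate the idea of replacing cut-based selections with a learned discriminant:

```python
import numpy as np

rng = np.random.default_rng(3)

# Fabricated stand-in data: genuine tracks (y=1) vs fakes (y=0),
# separated in four "quality" features (chi2-like, hit count, ...)
n = 2000
y = rng.integers(0, 2, n).astype(float)
X = rng.normal(0, 1, (n, 4)) + y[:, None] * np.array([1.5, -1.0, 0.8, 0.5])

# One-hidden-layer network, full-batch gradient descent on cross-entropy
W1 = rng.normal(0, 0.5, (4, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.5
for _ in range(600):
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))       # sigmoid track-quality score
    g = (p - y[:, None]) / n                    # dLoss/dlogits
    gW2 = h.T @ g; gb2 = g.sum(0)
    gh = g @ W2.T * (1 - h ** 2)                # backprop through tanh
    gW1 = X.T @ gh; gb1 = gh.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

acc = ((p[:, 0] > 0.5) == (y > 0.5)).mean()
print(f"training accuracy: {acc:.3f}")
```

Thresholding the score `p` plays the role of the old cut-based selection, but the boundary is learned jointly over all features.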
  • Li, Xiaodong (Helsingin yliopisto, 2021)
    Sorption on the pore surfaces of the bedrock and diffusion into the low-porosity rock matrix are the two most significant processes that retard radionuclide migration through the water-conducting fractures of crystalline rock. Se-79 is considered one of the main contributors to the dose to humans, and it has a high impact on the cumulative radioactive dose in a spent nuclear fuel repository. Due to its long half-life and high mobility, sound scientific and technical knowledge is needed to better understand the processes and related mechanisms that determine the transport behaviour of Se in bedrock. Therefore, the focus of this dissertation is on the sorption and diffusion properties of Se(IV) species in bedrock, using both experimental and modelling approaches. The work can be divided into two parts, according to the different research objectives. In the first part, the sorption behaviour of Se(IV) species on Grimsel granodiorite and its main minerals (plagioclase, K-feldspar, quartz and biotite) was investigated in a Grimsel groundwater simulant of low ionic strength. The results show that biotite has a much larger specific surface area (SSA) than the other main minerals of the granodiorite and that Se sorption on biotite can represent the sorption on the whole bedrock. Thus, a multi-site surface complexation model for Se(IV) sorption onto biotite was developed based on experimental data from titration and sorption experiments. Molecular modelling was used to deduce some basic modelling parameters, such as site densities and site types. PHREEQC coupled with Python was used to calculate and optimize the fitting processes. In the second part of this work, the diffusion properties of Se(IV) species in an intact rock core were investigated with an updated electromigration device and modelling analysis.
The traditional electromigration device was updated by introducing a potentiostat to impose a constant voltage difference over the rock sample and by stabilizing the pH of the background electrolytes. To interpret the experimental results with more confidence, an advection-dispersion model was developed that accounts for the most important mechanisms governing the movement of the tracer ions, i.e., electromigration, electroosmosis and dispersion.
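An explicit finite-difference sketch of such an advection-dispersion model, with a single effective velocity lumping electromigration and electroosmosis together; all parameter values are illustrative rather than fitted to the experiments:

```python
import numpy as np

# dC/dt = D d2C/dx2 - v dC/dx across a 1 cm rock sample, where v lumps
# the electromigration and electroosmotic velocities and D the dispersion
nx, L = 200, 0.01                 # grid points, sample length [m]
dx = L / (nx - 1)
D = 1.0e-9                        # m^2/s, apparent dispersion coefficient
v = 2.0e-6                        # m/s, effective ion velocity
dt = 0.4 * min(dx**2 / (2 * D), dx / v)   # explicit stability limit

C = np.zeros(nx)
C[0] = 1.0                        # constant-concentration inlet
for _ in range(5000):
    Cn = C.copy()
    # central difference for dispersion, upwind for advection
    Cn[1:-1] = (C[1:-1]
                + dt * D * (C[2:] - 2 * C[1:-1] + C[:-2]) / dx**2
                - dt * v * (C[1:-1] - C[:-2]) / dx)
    Cn[0] = 1.0
    Cn[-1] = Cn[-2]               # open outlet
    C = Cn

print(f"concentration at mid-sample: C = {C[nx // 2]:.3f}")
```

Fitting the simulated breakthrough profile to the measured one is what allows the effective transport parameters to be extracted from an electromigration experiment.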
  • Siltala, Lauri (Helsingin yliopisto, 2021)
    The mass of an asteroid is considered one of its fundamental properties. Knowledge of an asteroid's mass is, by itself, useful for spacecraft navigation, particularly for space missions planned to the asteroid in question. The gravity of massive asteroids causes small yet measurable perturbations in the orbits of the Solar System's planets, such as Earth and Mars, and thus knowledge of asteroid masses also contributes to the development of accurate planetary ephemerides. However, the mass of an asteroid by itself gives us little scientifically interesting information about the asteroid. Instead, the main scientific motivation for asteroid mass estimation is that the mass, alongside the volume, is one of the two critical parameters required to compute the asteroid's bulk density: when both parameters are known, the bulk density can be trivially computed with a simple division. The bulk density, in turn, may be combined with other compositional information, obtained mainly through spectroscopic observations, and compared to the spectra, compositions, and densities of similar meteorites found on Earth. Such studies allow for constraining the bulk composition and macroporosity and, by extension, the structure of the asteroid. Thus, knowledge of an asteroid's mass is critical for all detailed studies of the characteristics of the asteroid's interior. Besides scientific interest, such studies may also have future practical applications in characterizing potential targets for asteroid mining and in planning asteroid deflection in the event of an impact threat. Asteroid mass estimation is traditionally performed by analyzing an asteroid's gravitational interaction with another object, such as a spacecraft, Mars and/or Earth, or a separate asteroid during an asteroid-asteroid close encounter, or, in the case of binary asteroids, the orbits of the component asteroids.
Recently, an alternative approach of direct density estimation through the detection and modeling of radiative forces, particularly the Yarkovsky effect, has also begun to see use. This thesis deals with asteroid mass estimation based on asteroid-asteroid close encounters. It begins with a general overview of asteroids, followed by a more detailed discussion of the different approaches to estimating asteroid masses and densities. Next, I describe our novel application of Markov-chain Monte Carlo (MCMC) techniques to the mass estimation problem in greater detail. To demonstrate the progress achieved with each consecutive paper, I highlight mass estimates for the asteroid (52) Europa, beginning with results from my Master's thesis obtained with the initial version of the MCMC algorithm, followed by updated results from the first, third, and finally the fifth and final paper included in this thesis. Clear improvements are seen throughout; in particular, the use of astrometry from the second Gaia data release in Paper V leads to a significant, order-of-magnitude reduction in the uncertainty of the mass. Finally, I briefly discuss future prospects, particularly with regard to the forthcoming third Gaia data release.
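The MCMC idea can be sketched with a toy Metropolis sampler in which a "mass" parameter linearly scales a perturbation signal observed with noise. The real analysis integrates full orbital dynamics against astrometry; the model, numbers and noise level below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy forward model: the perturber's mass scales the deflection signal
m_true = 2.4                                  # arbitrary units
t = np.linspace(0, 1, 50)
def signal(m):
    return m * t**2                           # stand-in for a perturbed orbit
obs = signal(m_true) + rng.normal(0, 0.05, t.size)

def log_post(m):
    if m <= 0:
        return -np.inf                        # masses must be positive
    r = obs - signal(m)
    return -0.5 * np.sum((r / 0.05) ** 2)     # Gaussian likelihood, flat prior

# Random-walk Metropolis sampling of the mass posterior
chain, m = [], 1.0
lp = log_post(m)
for _ in range(20000):
    m_new = m + rng.normal(0, 0.05)
    lp_new = log_post(m_new)
    if np.log(rng.uniform()) < lp_new - lp:   # Metropolis acceptance rule
        m, lp = m_new, lp_new
    chain.append(m)

burned = np.array(chain[5000:])               # discard burn-in
print(f"mass estimate: {burned.mean():.2f} +/- {burned.std():.2f}")
```

The payoff of the MCMC formulation is the full posterior: the spread of the chain directly quantifies the mass uncertainty instead of a single best-fit value.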
  • Kubečka, Jakub (Helsingin yliopisto, 2021)
    A suspension of fine solid particles and liquid droplets in the air is called an aerosol. Atmospheric aerosols play an important role in climate and also affect human health. Some of these aerosols are formed in the atmosphere by collisions of gas molecules with favorable interactions. The agglomerations of molecules formed in this process are referred to as molecular clusters. Unstable molecular clusters usually break apart quickly. In contrast, stable molecular clusters may become the nuclei of subsequent growth by condensation of other vapor molecules and eventually form new atmospheric fine particles (this process is called new particle formation, NPF). NPF is typically accompanied by a nucleation barrier, which has to be surmounted to form the new particle. It is essential to understand and accurately describe the molecular mechanism behind this process, as our current understanding of NPF is incomplete, leading to significant uncertainties in forecasting NPF-related phenomena (e.g., mists, clouds). I utilize computational quantum chemistry (QC) to evaluate the stability of molecular clusters, which determines their decomposition rates. Surmounting the (free) energy nucleation barrier is a probabilistic competition between cluster evaporation and cluster growth due to collisions with other condensable molecules in the air. The collision rate can be calculated approximately from kinetic gas theory. The evaporation rate can then be calculated using the detailed balance equation, which, however, requires thermodynamic calculations using computationally demanding QC methods. Moreover, to calculate the thermodynamic properties of a molecular cluster, the cluster structure has to be known beforehand. The main focus of this thesis is studying molecular cluster structures/configurations and searching for those configurations that are most probably found in atmospheric air.
The process of searching for various configurations is known as configurational sampling. I discuss methods of configurational sampling and suggest an approach for the configurational sampling of atmospherically relevant molecular clusters. The core of the research results in this work consists of applications of the configurational sampling protocol and of the Jammy Key for Configurational Sampling (JKCS) program, which was developed over the course of my Ph.D. studies.
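The kinetic-gas-theory collision rate mentioned above can be sketched with the standard hard-sphere rate coefficient; the molecular masses and radii below are rough, illustrative values rather than numbers from the thesis:

```python
import numpy as np

# Hard-sphere collision rate coefficient from kinetic gas theory:
#   beta = sqrt(8 * pi * kB * T / mu) * (r1 + r2)^2,
# with mu the reduced mass of the colliding pair
kB = 1.380649e-23                    # Boltzmann constant, J/K
amu = 1.66053907e-27                 # atomic mass unit, kg

def collision_rate(m1_amu, m2_amu, r1, r2, T=298.15):
    """Hard-sphere collision rate coefficient in m^3/s."""
    mu = (m1_amu * m2_amu) / (m1_amu + m2_amu) * amu   # reduced mass, kg
    r12 = r1 + r2                                       # collision distance, m
    return np.sqrt(8 * np.pi * kB * T / mu) * r12**2

# A ~98 amu monomer (~0.3 nm radius) colliding with a small dimer-like
# cluster (~196 amu, ~0.4 nm); values are rough guesses for illustration
beta = collision_rate(98, 196, 0.3e-9, 0.4e-9)
print(f"collision rate coefficient: {beta:.2e} m^3/s")
```

Together with the QC-derived evaporation rate from detailed balance, coefficients like this feed the growth-versus-decay competition that decides whether a cluster crosses the nucleation barrier.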
