Browsing by Title

Now showing items 98-117 of 859
  • Kaasalainen, Sanna (Helsingin yliopisto, 2002)
  • Mutshinda Mwanza, Crispin (Helsingin yliopisto, 2010)
    Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by interplays between inter-/intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models and suitable inferential and computational tools for teasing out these different factors are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies, the efficacy of which rests on a realistic evaluation of the underlying mechanisms. This is even more so in the prevailing context of concerns over the progress of climate change and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework, in combination with the computer-intensive methods known as Markov chain Monte Carlo, to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density-dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities, by showing directly that environmental fluctuations rather than inter-specific competition dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment and to acknowledge species differences in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible for the relative importance of each category of factors to be evaluated in light of field data.
  • Mäntyniemi, Samu (Helsingin yliopisto, 2006)
    In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration with the Finnish Game and Fisheries Research Institute, using the problem of monitoring and prediction of the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.
  • Pirinen, Matti (Helsingin yliopisto, 2009)
    Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is undergoing rapid development because of recent advances in the technologies by which molecular data can be obtained from living organisms. To extract the most information from such data, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo (MCMC) algorithm. The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes and at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where a quantitative phenotype has also been measured from the contemporary individuals. 
In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.
  • Cheng, Lu (Helsingin yliopisto, 2013)
    Vast amounts of molecular data are being generated every day. However, how to properly harness the data often remains a challenge for many biologists. Firstly, due to the typically large dimension of the molecular data, analyses can either require exhaustive amounts of computer memory or be very time-consuming, or both. Secondly, biological problems often have their own special features, which call for specially designed software to obtain meaningful results from statistical analyses without imposing too many requirements on the available computing resources. Finally, the general complexity of many biological research questions necessitates joint use of many different methods, which requires considerable expertise in properly understanding the possibilities and limitations of the analysis tools. In the first part of this thesis, we discuss three general Bayesian classification/clustering frameworks, which in the considered applications are targeted towards clustering of DNA sequence data, in particular in the context of bacterial population genomics and evolutionary epidemiology. Based on more generic Bayesian concepts, we have developed several statistical tools for analyzing DNA sequence data in bacterial metagenomics and population genomics. In the second part of this thesis, we focus on discussing how to reconstruct bacterial evolutionary history from a combination of whole genome sequences and a number of core genes for which a large set of samples is available. A major problem is that for many bacterial species horizontal gene transfer of DNA, which is often termed recombination, is relatively frequent and the recombined fragments within genome sequences have a tendency to severely distort the phylogenetic inferences. 
To obtain computationally viable solutions in practice for a majority of currently emerging genome data sets, it is necessary to divide the problem into parts and use different approaches in combination to perform the whole analysis. We demonstrate this strategy by application to two challenging data sets in the context of evolutionary epidemiology and show that biologically significant conclusions can be drawn by shedding light on the complex patterns of relatedness among strains of bacteria. Both studied organisms (Escherichia coli and Campylobacter jejuni) are major pathogens of humans and understanding the mechanisms behind the evolution of their populations is of vital importance for human health.
  • Sillanpää, Mikko (Helsingin yliopisto, 2000)
  • Li, Zitong (Helsingin yliopisto, 2014)
    Quantitative trait loci (QTL)/association mapping aims to identify the genomic loci associated with complex traits. From a statistical perspective, multiple linear regression is often used to model, estimate and test the effects of molecular markers on a trait. With genotype data derived from contemporary genomics techniques, however, the number of markers typically exceeds the number of individuals, and it is therefore necessary to perform some sort of variable selection or parameter regularization to provide reliable estimates of model parameters. In addition, many quantitative traits change over the course of an organism's development. Accordingly, a longitudinal study that jointly maps the repeated measurements of the phenotype over time may increase the statistical power to identify QTLs, compared with single trait analysis. In this thesis, a series of Bayesian variable selection/regularization linear methods were developed and applied for analyzing quantitative traits measured at either single or multiple time points. The first work provided an overview of the principal frequentist regularization methods for analyzing single traits. The second work also focused on single trait analysis, where a variational Bayesian (VB) algorithm was derived for estimating parameters in several Bayesian regularization methods. The VB methods can be quickly implemented on large data sets in contrast to the classical Markov chain Monte Carlo methods. In the third work, the Bayesian regularization method was extended to a non-parametric varying coefficient model to analyze longitudinal traits. In particular, an efficient VB stepwise algorithm was used for variable selection, so that the method can be quickly implemented even on data sets with a large number of time points and/or a large number of markers. The fourth work is an application of variable selection methods to forest genetics data collected from Northern Sweden. 
From several conifer wood property traits measured at multiple time points, four QTLs located at genes were identified, which are promising targets for future research in wood molecular biology and breeding.
  • Tang, Jing (Helsingin yliopisto, 2009)
    Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. 
A common feature of T-RFLP and FAME data is zero-inflation, which means that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated by both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large scale data were adopted to meet the increasing computational need. Our method for detecting homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME data, provides an initial effort towards discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
  • Jääskinen, Väinö (Helsingin yliopisto, 2015)
    In various fields of knowledge we can observe that the availability of potentially useful data is increasing fast. A prime example is DNA sequence data. This increase is both an opportunity and a challenge, as new methods are needed to benefit from the big data sets. This has sparked a fruitful line of research in statistics and computer science that can be called machine learning. In this thesis, we develop machine learning methods based on the Bayesian approach to statistics. We address a fairly general problem called clustering, i.e. dividing a set of objects into non-overlapping groups based on their similarity, and apply it to models with Markovian dependence structures. We consider sequence data in a finite alphabet and present a model class called the Sparse Markov chain (SMC). It is a special case of a Markov chain (MC) model and offers a parsimonious description of the data generating mechanism. A Variable length Markov chain (VLMC) is a popular sparse model presented earlier in the literature and it has a representation as an SMC model. We develop Bayesian clustering methodology for learning the SMC and other Markovian models. Another problem that we study in this thesis is causal inference. We present a model and an algorithm for learning causal mechanisms from data. The model can be considered as a stochastic extension of the sufficient-component cause model that is popular in epidemiology. In our model there are several causal mechanisms, each with its own parameters. A mixture distribution gives a probability that an outcome variable is associated with a mechanism. Applications that are considered in this thesis come mainly from computational biology. We cluster states of Markovian models estimated from DNA sequences. This gives an efficient description of the sequence data compared with methods reported in the literature. 
We also cluster DNA sequences with Markov chains, which results in a method that can be used for example in the estimation of bacterial community composition in a sample from which DNA is extracted. The causal model and the related learning algorithm are able to estimate mechanisms from fairly challenging data. We have developed the learning algorithms with big data sets in mind. Still, there is a need to develop them further to handle ever larger data sets.
  • Johansson, Tino (Helsingin yliopisto, 2008)
    Human-wildlife conflicts are today an integral part of the rural development discourse. In this research, the main focus is on the spatial explanation, which is not a very common approach in the reviewed literature. My research hypothesis is based on the assumption that human-wildlife conflicts occur when a wild animal crosses a perceived borderline between nature and culture and enters into the realms of the other. The borderline between nature and culture marks a perceived division of spatial content in our senses of place. The animal subject that crosses this border becomes a subject out of place, meaning that the animal is then spatially located in a space where it should not be or where it does not belong according to tradition, custom, rules, law, public opinion, prevailing discourse or some other criteria set by human beings. An appearance of a wild animal in a domesticated space brings an uncontrolled subject into that space where humans have previously commanded total control of all other natural elements. A wild animal out of place may also threaten the biosecurity of the place in question. I carried out a case study in the Liwale district in south-eastern Tanzania to test my hypothesis during June and July 2002. I also collected documents and carried out interviews in Dar es Salaam in 2003. I studied the human-wildlife conflicts in six rural villages, where a total of 183 persons participated in the village meetings. My research methods included semi-structured interviews, participatory mapping, a questionnaire survey and Q-methodology. The rural communities in the Liwale district have a long history of co-existing with wildlife and they still have traditional knowledge of wildlife management and hunting. Wildlife conservation through the establishment of game reserves during the colonial era has escalated human-wildlife conflicts in the Liwale district. 
This study shows that the villagers perceive some wild animals differently in their images of the African countryside than the district and regional level civil servants do. From the small-scale subsistence farmers' point of view, wild animals continue to challenge the separation of the wild (the forests) and the domestic spaces (the cultivated fields) by moving across the perceived borders in search of food and shelter. As a result, the farmers may lose their crops, livestock or even their own lives in confrontations with wild animals. Human-wildlife conflicts in the Liwale district are manifold and cannot be explained simply on the basis of attitudes or perceived images of landscapes. However, the spatial explanation of these conflicts gives us a better understanding of why human-wildlife conflicts are so widely found across the world.
  • Faraco, Daniel (Helsingin yliopisto, 2002)
  • Söderman, Tarja (Helsingin yliopisto, 2012)
    Ecological impact assessment focuses both on spatially bound biophysical environment and biodiversity as composition, structure, and key processes and on benefits of biodiversity gained through ecosystem services. It deals with allocation of space in complex situations characterised by uncertainty and conflicting values of actors. In the process of ecological impact assessment that forms part of environmental impact assessment (EIA) and strategic environmental assessment (SEA), the whole proposal of a project, plan, or programme; its targets; alternative options and their acceptability from a biodiversity standpoint; and knowledge of the biodiversity and ecosystem services it provides are shaped. The analyses in this thesis examine the current practices of Finnish ecological impact assessment with respect to its substantive and procedural features and the roles of actors. The analyses utilise qualitative and semi-quantitative data from EIA and Natura 2000 appropriate assessment reports, statements of environmental authorities, other data produced via assessment processes, and actors' views related to ecological impact assessment. After analysis of the present shortcomings, constraints, and development needs, a tool that takes full account of current understanding and ecosystem services is developed to improve prevailing impact assessment practices. The results of the analyses demonstrate that the knowledge base for the comprehensive ecological impact assessment in EIA, Natura 2000 appropriate assessment, and municipal land-use planning SEA is far from adequate. Impact assessments fail to identify the biodiversity at stake, what is affected, and how, and, as a consequence, the selection of biodiversity elements for assessment is unsystematic, superficial, or focused on the most obvious strictly protected species. 
The connection between baseline studies and impact prediction is loose; consequently, the predictive value of baseline studies is low, preventing effective mitigation and monitoring. There is also a tendency toward unnecessary detail at the expense of a broader treatment of biodiversity that would address ecosystem processes, interactions, and trends. Substantive treatment of biodiversity is often restricted to compositional diversity and at the species and habitat type level. Finnish ecological impact assessment does not take into account the value-laden nature of impact assessment. It is baseline-oriented and often seen as external and parallel to the actual planning and decision-making. Scoping practices reflect this separateness by outsourcing important value-bound significance determinations to individual ecology consultants instead of considering them an integral part of the planning process. Cumulative effects are hardly ever considered in Finnish ecological impact assessment practices. The use of more sophisticated methods and tools than expert judgements and matrices is almost non-existent in Finnish ecological impact assessment practices, because of the planning environment lacking the time, resources, and skills for it. In addition, often a highly detailed treatment of biodiversity elements with complex tools is not necessary for achieving a holistic picture of the targets and impacts of an initiative. Therefore, an objective set for improvement in the knowledge grounding of ecological impact assessment has been the development of a relatively simple tool utilising already available data. Ecosystem services criteria and indicators were developed for target-setting, impact prediction, and monitoring, and these were tested in three processes of local master planning and regional planning. 
Timing constraints of data delivery; obstacles in data availability, quality, and consistency; and relative closeness of planning processes hampered the use of indicators, but the tool nonetheless was experienced as beneficial by the testing teams overall. The future challenges facing use of the tool involve its independent utilisation by planners without support from researchers on different planning scales, collaboration and commitment of actors in setting targets for ecosystem services, and versatile use of data. The other challenges in improvement of today's ecological impact assessment practices comprise finding a balance between broad-brush and detailed information individually for each planning situation; utilising, sharing, and mediating both knowledge within ecosystem-service-generating units and users' and beneficiaries' views of valued and prioritised ecosystem services; shifting from parallel linkage of impact assessment and planning towards planning- and decision-making-centred environmental assessment; supplying the necessary substantive and procedural requirements for ecological impact assessment in the EIA, nature conservation, and land-use and building legislation; placing stronger emphasis on scoping by strengthening the guiding role of authorities and reserving more time and resources for scoping by proponents and planners; generating specific cumulative impact assessment in EIA and SEA and improving that employed in Natura 2000 appropriate assessment by creating an iterative link; and fostering work-sharing between project- and plan/programme-level actors in identification of cumulative impacts.
  • Chousionis, Vasileios (Helsingin yliopisto, 2008)
    The topic of this dissertation lies in the intersection of harmonic analysis and fractal geometry. We particularly consider singular integrals in Euclidean spaces with respect to general measures, and we study how the geometric structure of the measures affects certain analytic properties of the operators. The thesis consists of three research articles and an overview. In the first article we construct singular integral operators on lower dimensional Sierpinski gaskets associated with homogeneous Calderón-Zygmund kernels. While these operators are bounded, their principal values fail to exist almost everywhere. Conformal iterated function systems generate a broad range of fractal sets. In the second article we prove that many of these limit sets are porous in a very strong sense, by showing that they contain holes spread in every direction. In the following we connect these results with singular integrals. We exploit the fractal structure of these limit sets in order to establish that singular integrals associated with very general kernels converge weakly. Boundedness questions constitute a central topic of investigation in the theory of singular integrals. In the third article we study singular integrals of different measures. We prove a very general boundedness result in the case where the two underlying measures are separated by a Lipschitz graph. As a consequence we show that a certain weak convergence holds for a large class of singular integrals.
  • Vähäkangas, Antti (Helsingin yliopisto, 2009)
    The monograph dissertation deals with kernel integral operators and their mapping properties on Euclidean domains. The associated kernels are weakly singular and examples of such are given by Green functions of certain elliptic partial differential equations. It is well known that mapping properties of the corresponding Green operators can be used to deduce a priori estimates for the solutions of these equations. In the dissertation, natural size and cancellation conditions are quantified for kernels defined in domains. These kernels induce integral operators which are then composed with any partial differential operator of prescribed order, depending on the size of the kernel. The main object of study in this dissertation is the boundedness properties of such compositions, and the main result is the characterization of their Lp-boundedness on suitably regular domains. In case the aforementioned kernels are defined in the whole Euclidean space, their partial derivatives of prescribed order turn out to be so-called standard kernels that arise in connection with singular integral operators. The Lp-boundedness of singular integrals is characterized by the T1 theorem, which is originally due to David and Journé and was published in 1984 (Ann. of Math. 120). The main result in the dissertation can be interpreted as a T1 theorem for weakly singular integral operators. The dissertation also deals with special convolution type weakly singular integral operators that are defined on Euclidean spaces.
  • Puolamäki, Kai (Helsingin yliopisto, 2001)
  • Launiainen, Samuli (Helsingin yliopisto, 2011)
    Interaction between forests and the atmosphere occurs by radiative and turbulent transport. The fluxes of energy and mass between the surface and the atmosphere directly influence the properties of the lower atmosphere and, on longer time scales, the global climate. Boreal forest ecosystems are central to the global climate system and its responses to human activities, because they are significant sources and sinks of greenhouse gases and of aerosol particles. The aim of the present work was to improve our understanding of the interplay between the biologically active canopy, the microenvironment and turbulent flow. Specifically, the aim was to quantify the contribution of different canopy layers to whole forest fluxes. For this purpose, long-term micrometeorological and ecological measurements made in a Scots pine (Pinus sylvestris) forest at the SMEAR II research station in Southern Finland were used. The properties of turbulent flow are strongly modified by the interaction with the canopy elements: momentum is efficiently absorbed in the upper layers of the canopy, mean wind speed and turbulence intensities decrease rapidly towards the forest floor, and power spectra are modulated by the "spectral short-cut". In the relatively open forest, diabatic stability above the canopy explained much of the changes in velocity statistics within the canopy except in strongly stable stratification. Large eddies, ranging from tens to hundreds of meters in size, were responsible for the major fraction of turbulent transport between the forest and the atmosphere. Because of this, the eddy-covariance (EC) method proved to be successful for measuring energy and mass exchange inside a forest canopy, with the exception of strongly stable conditions. Vertical variations of within-canopy microclimate, light attenuation in particular, strongly affect the assimilation and transpiration rates. 
According to model simulations, assimilation rate decreases with height more rapidly than stomatal conductance (gs) and transpiration and, consequently, the vertical source-sink distributions for carbon dioxide (CO2) and water vapor (H2O) diverge. Upscaling from shoot scale to canopy scale was found to be sensitive to the chosen stomatal control description. The upscaled canopy level CO2 fluxes can vary by as much as 15 % and H2O fluxes by 30 % even if the gs models are calibrated against the same leaf-level dataset. A pine forest has distinct overstory and understory layers, which both contribute significantly to canopy scale fluxes. The forest floor vegetation and soil accounted for between 18 and 25 % of evapotranspiration and between 10 and 20 % of sensible heat exchange. The forest floor was also an important deposition surface for aerosol particles; between 10 and 35 % of dry deposition of particles within the size range 10-30 nm occurred there. Because of the northern latitudes, the seasonal cycle of climatic factors strongly influences the surface fluxes. Besides the seasonal constraints, partitioning of available energy to sensible and latent heat depends, through stomatal control, on the physiological state of the vegetation. In spring, available energy is consumed mainly as sensible heat, and latent heat flux peaks about two months later, in July-August. On the other hand, annual evapotranspiration remains rather stable over a range of environmental conditions and thus any increase of accumulated radiation affects primarily the sensible heat exchange. Finally, autumn temperature had a strong effect on ecosystem respiration but its influence on photosynthetic CO2 uptake was restricted by low radiation levels. Therefore, the projected autumn warming in the coming decades will presumably reduce the positive effects of earlier spring recovery in terms of carbon uptake potential of boreal forests.