Browsing by Subject "scalability"

Now showing items 1-3 of 3
  • Lacagnina, Carlo; Doblas-Reyes, Francisco; Larnicol, Gilles; Buontempo, Carlo; Obregón, André; Costa-Surós, Montserrat; San-Martín, Daniel; Bretonnière, Pierre-Antoine; Polade, Suraj D.; Romanova, Vanya; Putero, Davide; Serva, Federico; Llabrés-Brustenga, Alba; Pérez, Antonio; Cavaliere, Davide; Membrive, Olivier; Steger, Christian; Pérez-Zanón, Núria; Cristofanelli, Paolo; Madonna, Fabio; Rosoldi, Marco; Riihelä, Aku; Díez, Markel García (Ubiquity Press, Ltd., 2022)
    Data Science Journal
    Data from a variety of research programmes are increasingly used by policy makers, researchers, and the private sector to make data-driven decisions related to climate change and variability. Climate services are emerging as the link that narrows the gap between climate science and downstream users. The Global Framework for Climate Services (GFCS) of the World Meteorological Organization (WMO) offers an umbrella for the development of climate services and has identified quality assessment, along with its use in user guidance, as a key aspect of service provision. This offers an extra stimulus for discussing what type of quality information to focus on and how to present it to downstream users. Quality has become an important keyword for those working on data in both the private and public sectors, and significant resources are now devoted to quality management of processes and products. Quality management guarantees the reliability and usability of the product served; it is a key element in building trust between consumers and suppliers. Untrustworthy data could lead to a negative economic impact at best and a safety hazard at worst. In a progressive commitment to establish this relation of trust, as well as to provide sufficient guidance for users, the Copernicus Climate Change Service (C3S) has made significant investments in the development of an Evaluation and Quality Control (EQC) function. This function offers a homogeneous, user-driven service for the quality of the C3S Climate Data Store (CDS). Here we focus on the EQC component targeting the assessment of the CDS datasets, which include satellite and in-situ observations, reanalyses, climate projections, and seasonal forecasts. The EQC function is characterised by a two-tier review system designed to guarantee the quality of the dataset information. While the need to assess the quality of climate data is well recognised, the methodologies, the metrics, the evaluation framework, and how to present all this information to users had never before been developed in an operational service encompassing all the main climate dataset categories. Building the underlying technical solutions poses unprecedented challenges and makes the C3S EQC approach unique. This paper describes the development and implementation of the operational EQC function, providing an overarching quality management service for the whole of the CDS data. (A minimal CDS data-retrieval sketch appears after this listing.)
  • Lee, Hyeongju (Helsingin yliopisto, 2021)
    The number of IoT and sensor devices is expected to reach 25 billion by 2030. Many IoT applications that require high availability, scalability, low latency, and security, such as connected vehicles and smart factories, have appeared. There have been many attempts to use cloud computing for IoT applications, but these requirements cannot be ensured in cloud environments. Edge computing has emerged to address this problem, and in edge environments containerization technology is useful for deploying applications with limited resources. In this thesis, two types of highly available Kubernetes architecture (two nodes with an external DB, and three nodes with an embedded DB) were surveyed and implemented using the K3s distribution, which is well suited to the edge. Through experiments with the implemented K3s clusters, this thesis shows that they can provide high availability and scalability. We discuss the limitations of the implementations and suggest possible solutions. In addition, we report the resource usage of each cluster in terms of CPU, RAM, and disk: both clusters need less than 10% CPU and about 500 MB of RAM on average. However, the three-node cluster with the embedded DB uses more resources than the two-node cluster with the external DB when the cluster state changes. Finally, we show that the implemented K3s clusters are suitable for many IoT applications such as connected vehicles and smart factories. If an application that needs high availability and scalability has to be deployed in an edge environment, the K3s clusters provide good solutions for achieving its goals. The two-node cluster with an external DB suits applications where the amount of data fluctuates often, or where there is a stable connection to the external DB; the three-node cluster suits applications that need high database availability even under poor internet connectivity. (A minimal node-health probe for such a cluster is sketched after this listing.)
    ACM Computing Classification System (CCS): Computer systems organization → Embedded and cyber-physical systems; Human-centered computing → Ubiquitous and mobile computing
  • Šimko, Tibor; Heinrich, Lukas Alexander; Lange, Clemens; Lintuluoto, Adelina Eleonora; MacDonell, Danika Marina; Mečionis, Audrius; Rodriguez, Diego Rodriguez; Shandilya, Parth (2021)
    We describe a novel approach for experimental High-Energy Physics (HEP) data analyses that is centred on the declarative rather than the imperative paradigm for describing analysis computational tasks. The analysis process can be structured in the form of a Directed Acyclic Graph (DAG), where each graph vertex represents a unit of computation with its inputs and outputs, and the graph edges describe the interconnection of the various computational steps. We have developed REANA, a platform for reproducible data analyses, which supports several such DAG workflow specifications. The REANA platform parses the analysis workflow and dispatches its computational steps to various supported computing backends (Kubernetes, HTCondor, Slurm). The focus on declarative rather than imperative programming enables researchers to concentrate on the problem domain at hand without having to think about implementation details such as scalable job orchestration. The declarative programming approach is further exemplified by a multi-level job cascading paradigm implemented in the Yadage workflow specification language. We present two recent LHC particle physics analyses, ATLAS searches for dark matter and CMS jet energy correction pipelines, where the declarative approach was successfully applied. We argue that the declarative approach to data analyses, combined with recent advancements in container technology, facilitates the portability of computational data analyses to various compute backends, enhancing the reproducibility and the knowledge preservation behind particle physics data analyses. (A generic DAG-scheduling sketch illustrating the declarative idea follows this listing.)
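
The C3S Climate Data Store (CDS) described in the first item is accessed programmatically through the cdsapi Python client. Below is a minimal retrieval sketch, assuming cdsapi is installed and a valid ~/.cdsapirc holds CDS credentials; the dataset name and request values (ERA5 2 m temperature for a single time step) are illustrative placeholder choices, not taken from the paper.

```python
# Minimal CDS retrieval sketch (assumes the cdsapi package is installed
# and ~/.cdsapirc contains valid Copernicus CDS credentials).
import cdsapi

client = cdsapi.Client()

# Illustrative request: one field of ERA5 2 m temperature.
# The dataset name and request keys follow CDS catalogue conventions;
# the specific values are placeholders chosen for this sketch.
client.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": "2m_temperature",
        "year": "2020",
        "month": "01",
        "day": "01",
        "time": "12:00",
        "format": "netcdf",
    },
    "era5_t2m_sample.nc",  # local output file
)
```

Quality information produced by the EQC function described above is attached to catalogue entries for datasets retrieved this way.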
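Whichever of the two K3s topologies from the thesis is deployed, the cluster exposes the standard Kubernetes API, so a simple availability check is to poll node readiness. Here is a minimal sketch, assuming the official kubernetes Python client and a kubeconfig exported from the cluster (e.g. /etc/rancher/k3s/k3s.yaml on a K3s server node); the thesis itself does not prescribe this tooling.

```python
# Node-readiness probe for a K3s cluster: a sketch assuming the
# `kubernetes` Python client and a kubeconfig copied from the cluster.
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config by default
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    # A node is healthy when its "Ready" condition reports status "True".
    ready = next(
        (c.status for c in node.status.conditions if c.type == "Ready"),
        "Unknown",
    )
    print(f"{node.metadata.name}: Ready={ready}")
```

In the two-node external-DB topology, one server should keep reporting Ready after the other fails; repeated probes of this kind are one way availability experiments like the thesis's could be observed from outside the cluster.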
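The REANA item describes analyses as DAGs of computational steps. The platform's actual specification languages (e.g. Yadage) are not reproduced here; instead, the following self-contained sketch illustrates the underlying declarative idea: each step declares only its data dependencies, and a scheduler derives a valid execution order by topological sorting rather than the analyst encoding the order imperatively. The step names are hypothetical.

```python
# Generic DAG-scheduling sketch of the declarative idea behind
# REANA-style workflows; this is NOT REANA's actual specification format.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step declares only its inputs (the DAG edges), never "when" to run.
workflow = {
    "skim":   [],                 # select events from raw data
    "histos": ["skim"],           # fill histograms from the skimmed data
    "fit":    ["histos"],         # run the statistical fit
    "plots":  ["histos", "fit"],  # produce the final figures
}

# The scheduler, not the analyst, derives a valid execution order.
for step in TopologicalSorter(workflow).static_order():
    # A real engine would dispatch each step to a compute backend
    # (Kubernetes, HTCondor, Slurm); here we only print the order.
    print(f"dispatching step: {step}")
```

Running this prints skim, histos, fit, plots: the order follows from the declared dependencies alone, which is the property that lets a platform like REANA retarget the same workflow to different compute backends.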