From Classical DW to Cloud Data Warehouse

dc.contributor Helsingin yliopisto, Matemaattis-luonnontieteellinen tiedekunta fi
dc.contributor University of Helsinki, Faculty of Science en
dc.contributor Helsingfors universitet, Matematisk-naturvetenskapliga fakulteten sv
dc.contributor.author Heinonen, Jyrki
dc.date.issued 2020
dc.identifier.uri URN:NBN:fi:hulib-202012084720
dc.description.abstract The main theme of the conventional data warehouse is a ’single version of truth’, modeled either dimensionally or in normalized 3NF. Both techniques have a drawback: on the way to the data warehouse the data is cleansed and transformed, so it ends up changed and information is lost. Data Vault modeling, a response to these issues, is detail-oriented and tracks history, keeping the audit trail intact. This means we have a ’single version of facts’, or ’all the data, all of the time’. The Data Vault methodology and architecture can handle Big Data and NoSQL, which are also covered in the Data Lake section of this work. Data Lake tools have evolved strongly during the last decade and respond to ever-expanding data volumes with distributed computing. A Data Lake can also ingest structured, semi-structured and unstructured data. Data warehouse (and Data Lake) processing is moving from on-premises server rooms to cloud data centers. Apache and Google in particular have developed and inspired many new tools that can process data warehouse data at petabyte scale. The challenge now is that not only do operational systems feed data into the data warehouse, but huge amounts of machine-generated data also have to be processed and analyzed on these practically infinitely scalable platforms. A data warehouse solution also has to cover machine-learning requirements. So the modernization of the data warehouse is not over, yet all of these methodologies, architectures and tools remain in use. The trick is to choose the right tool for the right job. en
dc.language.iso eng
dc.publisher Helsingin yliopisto fi
dc.publisher University of Helsinki en
dc.publisher Helsingfors universitet sv
dc.subject Data Warehouse
dc.subject Data Vault
dc.subject Data Lake
dc.subject Cloud Data Warehouse
dc.subject NoSQL
dc.subject Serverless architecture
dc.title From Classical DW to Cloud Data Warehouse en
dc.type.ontasot pro gradu -tutkielmat fi
dc.type.ontasot master's thesis en
dc.type.ontasot pro gradu-avhandlingar sv
dc.subject.discipline Tietojenkäsittelytiede (Computer Science) und
dct.identifier.urn URN:NBN:fi:hulib-202012084720

Files in this item

Files Size Format View
JyrkiHeinonen_Masters_Thesis_V1.0.pdf 1.998Mb PDF View/Open
