benchmarking - TPC-DS benchmark on Hadoop - Why use a star schema?
I am trying to run the TPC-DS benchmark on Spark SQL.
The documentation talks about having a star schema and a number of tables.
From my understanding of Hadoop, it is better to have denormalized data, which can be stored in Parquet format with compression (and partitioned for parallelism).
I found a document from SAS -> https://support.sas.com/resources/papers/data-modeling-hadoop.pdf
which talks in the same terms. I am no data warehouse expert, so I request that you help me understand how to model data for a data warehouse in Hadoop.
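To make the comparison concrete, here is a minimal sketch of the two layouts I am asking about. It uses Python's built-in sqlite3 rather than Spark SQL so it runs anywhere, and the table and column names (item_dim, sales_fact, sales_flat) are made up for illustration; they are not the actual TPC-DS tables. It only shows that the same question can be answered from either layout, not which one performs better on Hadoop.

```python
import sqlite3

# In-memory database; stands in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: one fact table that references a small dimension table.
cur.execute("CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, category TEXT)")
cur.execute("CREATE TABLE sales_fact (item_id INTEGER, quantity INTEGER)")
cur.executemany("INSERT INTO item_dim VALUES (?, ?)", [(1, "Books"), (2, "Music")])
cur.executemany("INSERT INTO sales_fact VALUES (?, ?)", [(1, 3), (2, 5), (1, 4)])

# Denormalized layout: the dimension attribute is copied into every fact row.
cur.execute("""
    CREATE TABLE sales_flat AS
    SELECT f.item_id, d.category, f.quantity
    FROM sales_fact f JOIN item_dim d ON f.item_id = d.item_id
""")

# Same report, two ways: with a join (star) and without one (denormalized).
star_query = """
    SELECT d.category, SUM(f.quantity)
    FROM sales_fact f JOIN item_dim d ON f.item_id = d.item_id
    GROUP BY d.category ORDER BY d.category
"""
flat_query = """
    SELECT category, SUM(quantity)
    FROM sales_flat
    GROUP BY category ORDER BY category
"""
star_result = cur.execute(star_query).fetchall()
flat_result = cur.execute(flat_query).fetchall()
print(star_result)
print(flat_result)
```

Both queries return the same totals per category. The star schema needs a join at query time, while the flat table avoids the join at the cost of repeating the category text in every row.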