benchmarking - TPC-DS benchmark on Hadoop - Why use a star schema?


I am trying to run the TPC-DS benchmark on Spark SQL.

The benchmark documentation talks about having a star schema with a number of tables.

From my understanding of Hadoop, it is better to have denormalized data, which can be stored in Parquet format with compression (using partitions for parallelism).

I found a document from SAS -> https://support.sas.com/resources/papers/data-modeling-hadoop.pdf

which talks in the same terms. I am no data warehouse expert, so I would appreciate help understanding how to model data for a data warehouse in Hadoop.
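
To make the two layouts in the question concrete, here is a minimal sketch of a star schema (a fact table joined to a dimension table) next to a denormalized table answering the same query. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL, and the table names `sales_fact`, `item_dim`, and `sales_flat` are hypothetical, not taken from the TPC-DS schema.

```python
import sqlite3

# In-memory database used only as a stand-in for a SQL engine such as Spark SQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: a narrow fact table that references a dimension table by key.
cur.execute("CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, category TEXT)")
cur.execute("CREATE TABLE sales_fact (item_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO item_dim VALUES (?, ?)", [(1, "books"), (2, "music")])
cur.executemany(
    "INSERT INTO sales_fact VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)]
)

# Denormalized layout: the dimension attribute is copied onto every fact row.
cur.execute(
    """
    CREATE TABLE sales_flat AS
    SELECT f.item_id, d.category, f.amount
    FROM sales_fact f JOIN item_dim d ON f.item_id = d.item_id
    """
)

# The star-schema query needs a join at read time.
star_query = """
    SELECT d.category, SUM(f.amount)
    FROM sales_fact f JOIN item_dim d ON f.item_id = d.item_id
    GROUP BY d.category ORDER BY d.category
"""

# The denormalized query reads a single table with no join.
flat_query = """
    SELECT category, SUM(amount)
    FROM sales_flat
    GROUP BY category ORDER BY category
"""

star_result = cur.execute(star_query).fetchall()
flat_result = cur.execute(flat_query).fetchall()
print(star_result)
print(flat_result)
```

Both queries return the same totals per category. The difference is where the cost falls: the star schema pays for a join on every read, while the denormalized table repeats the dimension values on every row and pays in storage and update effort.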

