GK Question

technology hard mcq

Which data format is most efficient for columnar storage and analytics workloads?

  1. CSV
  2. JSON
  3. Parquet
  4. XML

Answer: Parquet

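Parquet is a columnar storage format optimized for analytics. It offers efficient compression, predicate pushdown, and support for schema evolution, and it reduces I/O because a query reads only the columns it needs. It is widely used with Spark, Hive, and Presto. CSV and JSON are row-oriented text formats, and XML is verbose, so all three are poorly suited to analytical scans. This is an important point for big data engineering questions.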
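The I/O saving from reading only the required columns can be illustrated with a small toy model. This is a minimal sketch in plain Python, not Parquet's actual on-disk format: the sample records, the `row_store` and `column_store` names, and the "values read" counter are all assumptions made for illustration. Real Parquet adds row groups, encodings, compression, and metadata on top of this idea.

```python
# Toy model only: it imitates the layout idea, not the real Parquet file format.
rows = [
    {"id": 1, "city": "Pune", "amount": 120.0},
    {"id": 2, "city": "Delhi", "amount": 75.5},
    {"id": 3, "city": "Pune", "amount": 30.0},
]

# Row-oriented layout (like CSV or JSON lines): each record is stored together.
row_store = [list(r.values()) for r in rows]

# Column-oriented layout (the idea behind Parquet): each column is stored together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def sum_from_row_store(store, col_index):
    """Sum one field; every value of every record is scanned to reach it."""
    values_read = 0
    total = 0.0
    for record in store:
        values_read += len(record)  # the whole record is read to get one field
        total += record[col_index]
    return total, values_read


def sum_from_column_store(store, name):
    """Sum one field; only that column's values are touched."""
    column = store[name]
    return sum(column), len(column)


row_total, row_reads = sum_from_row_store(row_store, 2)
col_total, col_reads = sum_from_column_store(column_store, "amount")

print("row store:   ", row_total, "values read:", row_reads)
print("column store:", col_total, "values read:", col_reads)
```

Both layouts give the same total, but the row layout touches every field of every record while the column layout touches only the `amount` values. On a wide table with many columns, that gap is what makes columnar formats attractive for analytics.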

Topic: Data Engineering
Exam Relevance: Banking, SSC JE, UPSC