Unlocking the Power of Data with Snowflake Schema Detection and Schema Evolution

Soumak Das
4 min readSep 18, 2023

At Snowflake, we understand the importance of simplifying the process of working with data. One of Snowflake’s highly compelling features is its native support for semi-structured data, offering flexibility and efficiency to data professionals. Whether you prefer schema-on-write or schema-on-read, Snowflake gives you the freedom to choose the best approach for your data needs.

In this blog post, we’ll dive into how Snowflake has enhanced its schema-on-write capabilities to make loading files into structured data tables a breeze. We introduced these features during our journey from Summit 2021 to Summit 2023, with the goal of simplifying schematization while maintaining the high performance Snowflake is known for.

Schema Detection: A Glimpse into Your Data

Data files come in various shapes and sizes, generated by different teams. Data professionals often find themselves manually inspecting files to understand their structure, a process that can be time-consuming and error-prone.

Enter Snowflake’s Schema Detection, the first step in understanding your data and building data pipelines. Initially available for binary file formats like Avro, ORC, and Parquet, Schema Detection has now expanded its support to CSV and JSON in public preview.

With Schema Detection, Snowflake automates the process of determining and returning the schema of your staged…

--

--

Soumak Das

Sr. Data Engineer @EY & Snowflake/Airflow/Databricks/AWS writer