The Accidental Medallion Architecture

our first (or maybe it was second…) client way back in 2011 wanted to create a customer 360 (total view of customer) dataset by combining many disparate datasets. the datasets had been acquired from different 3rd party vendors. they were very large datasets (think usa...

Class Methods on Null in Scala

i recently was debugging a simple refactoring of some scala code that led to a surprise NullPointerException. we generally avoid using null and prefer to use Option[T] where None replaces (null: T), and we assume in our code base that no null will be passed in. in...

Developments for Delta Lake

at tresata we have been using and supporting the delta open source format since 2019 (the year it was open sourced). for us it has been more or less parquet+, i.e. parquet format with some added benefits. the main benefit to us is better/safer support for concurrent...

Why We Are Excited About Spark 3.2

i wanted to use this post to summarize what is exciting about the spark 3.2.x release from the perspective of tresata. background: tresata uses spark for analytics and machine learning on large amounts of data. we exclusively use the scala api (mostly dataframes but...