Twenty Years with SQL: The Language That Never Goes Out of Style

Twenty years ago, I had zero programming experience. Today, I'm a data architect who has watched countless technologies rise and fall while SQL remained unshakeable. From converting COBOL programs to solving critical performance crises, SQL became my career passport through developer, analyst, engineer, and architect roles.

Optimize Delta Lake Storage with VACUUM Command

As a data engineer managing batch file processing with Databricks, I recently encountered a storage issue that many teams face: rapidly increasing storage volume. In this blog, I'll share the challenge I faced with my Delta Lake storage, how I resolved it, and the benefits I gained by implementing Databricks' VACUUM command to manage storage...
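
For readers who have not used VACUUM before, here is a minimal sketch of how the statement might be assembled in Python. The helper name `build_vacuum_sql` and the table name `sales.events` are hypothetical and not taken from the post; only the `VACUUM ... RETAIN n HOURS DRY RUN` statement shape comes from Delta Lake itself. The helper only builds a string, so it runs without a cluster.

```python
def build_vacuum_sql(table_name, retain_hours=None, dry_run=False):
    """Build a Delta Lake VACUUM statement as a string.

    Hypothetical helper for illustration; it does not run anything.
    """
    parts = [f"VACUUM {table_name}"]
    if retain_hours is not None:
        if retain_hours < 0:
            raise ValueError("retain_hours must be non-negative")
        # Files older than this threshold become eligible for deletion.
        parts.append(f"RETAIN {retain_hours} HOURS")
    if dry_run:
        # DRY RUN lists the files that would be deleted without removing them.
        parts.append("DRY RUN")
    return " ".join(parts)


# Assumed table name; 168 hours matches Delta Lake's default 7-day retention.
statement = build_vacuum_sql("sales.events", retain_hours=168, dry_run=True)
print(statement)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(statement)`. Starting with a dry run is a cautious way to see what would be removed before deleting anything.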

Optimizing Parallel Data Loads to Delta Lake: A Concurrency Issue Solution

The data lake architecture utilizes SFTP for data uploads from multiple customers, requiring parallel file loading into Delta Lake. Concurrency issues arose during merging operations, primarily due to simultaneous updates. The team implemented table partitioning by Customer ID and added retry logic to mitigate conflicts, planning a future upgrade to Databricks Runtime 15.4.
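
The retry logic mentioned above could look roughly like the following sketch. It is not the team's actual implementation: `MergeConflictError` is a hypothetical stand-in for whatever conflict exception the merge raises, and `merge_with_retry`, the attempt count, and the delay values are assumptions chosen for illustration.

```python
import random
import time


class MergeConflictError(Exception):
    """Hypothetical stand-in for a concurrent-write conflict during a merge."""


def merge_with_retry(merge_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call merge_fn, retrying with exponential backoff on conflicts.

    merge_fn is any zero-argument callable that performs one merge attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return merge_fn()
        except MergeConflictError:
            if attempt == max_attempts:
                # Out of attempts: surface the conflict to the caller.
                raise
            # Double the wait on each attempt and add jitter so parallel
            # loaders do not all retry at the same moment.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

Each customer's load job would wrap its merge call in `merge_with_retry`. Partitioning by Customer ID should make conflicts rarer, and the retry covers the cases that remain.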
