Azure Synapse Analytics Deep Dive: Serverless SQL

Azure Synapse Analytics’ serverless SQL pool lets you query data lake files with T-SQL. No infrastructure to manage, pay only for queries. Here’s how it works.

Data analytics dashboard
Photo by Luke Chesser on Unsplash

Query Data Lake with SQL

-- Query Parquet files directly
SELECT 
    ProductCategory,
    SUM(SalesAmount) as TotalSales
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/data/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS sales
GROUP BY ProductCategory
ORDER BY TotalSales DESC;

Create External Tables

CREATE EXTERNAL TABLE Sales (
    ProductId INT,
    SalesAmount DECIMAL(10,2),
    SalesDate DATE
)
WITH (
    LOCATION = 'sales/*.parquet',
    DATA_SOURCE = MyDataLake,
    FILE_FORMAT = ParquetFormat
);

Use Cases

  • Ad-hoc exploration: Query without loading
  • Data transformation: CETAS for ETL
  • Logical data warehouse: Views over lake files

Pricing Model

Pay per TB scanned. Use partitioning and column pruning to minimize costs. Parquet format is highly recommended for cost efficiency.

References


Discover more from C4: Container, Code, Cloud & Context

Subscribe to get the latest posts sent to your email.

Leave a comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.