Searching in

Enter search term to find items

to navigate, to select, and to close

Azure Synapse Analytics Deep Dive: Serverless SQL

Posted on January 27, 2020 by Nithin Mohan TK 1 min read

Azure Synapse Analytics’ serverless SQL pool lets you query data lake files with T-SQL. No infrastructure to manage, pay only for queries. Here’s how it works.

Data analytics dashboard — Photo by Luke Chesser on Unsplash

Query Data Lake with SQL

-- Query Parquet files directly
SELECT 
    ProductCategory,
    SUM(SalesAmount) as TotalSales
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/data/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS sales
GROUP BY ProductCategory
ORDER BY TotalSales DESC;

Create External Tables

CREATE EXTERNAL TABLE Sales (
    ProductId INT,
    SalesAmount DECIMAL(10,2),
    SalesDate DATE
)
WITH (
    LOCATION = 'sales/*.parquet',
    DATA_SOURCE = MyDataLake,
    FILE_FORMAT = ParquetFormat
);

Use Cases

Ad-hoc exploration: Query without loading
Data transformation: CETAS for ETL
Logical data warehouse: Views over lake files

Pricing Model

Pay per TB scanned. Use partitioning and column pruning to minimize costs. Parquet format is highly recommended for cost efficiency.

References

Serverless SQL Pool

Related

Discover more from C4: Container, Code, Cloud & Context

Subscribe to get the latest posts sent to your email.

Posted in Uncategorized

Tagged Analytics, Azure

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.