While Cosmos DB is best known for its document (SQL) API, its Graph API, built on Apache TinkerPop and the Gremlin query language, is well suited to highly connected data. If your data is mostly about relationships (social networks, recommendations, knowledge graphs), the Graph API may be the right choice.
When to Use Graph
Graph databases excel when:
- Relationships are as important as the data itself
- You need to traverse connections of arbitrary depth
- Queries like “friends of friends who like X” are common
- Your domain is naturally a network: social graphs, fraud detection, knowledge graphs
They’re not ideal for simple CRUD operations or when you always query by a single key.
Graph Concepts
- Vertices (Nodes): Entities like Person, Product, Location
- Edges (Relationships): Connections like “knows”, “purchased”, “located_in”
- Properties: Key-value attributes on vertices and edges
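To make these three terms concrete, here is a minimal in-memory sketch in plain Python. It only illustrates the property-graph model and says nothing about how Cosmos DB stores data; the `Vertex`, `Edge`, and `Graph` classes are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: str
    label: str                      # e.g. "person", "product"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str                      # e.g. "knows", "purchased"
    source: str                     # id of the vertex the edge leaves
    target: str                     # id of the vertex the edge enters
    properties: dict = field(default_factory=dict)


@dataclass
class Graph:
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, vertex_id, label, **props):
        self.vertices[vertex_id] = Vertex(vertex_id, label, props)

    def add_edge(self, label, source, target, **props):
        self.edges.append(Edge(label, source, target, props))

    def out(self, vertex_id, label):
        """Follow outgoing edges with the given label, similar to Gremlin's out()."""
        return [self.vertices[e.target] for e in self.edges
                if e.source == vertex_id and e.label == label]


graph = Graph()
graph.add_vertex("john", "person", name="John", age=35)
graph.add_vertex("jane", "person", name="Jane")
graph.add_edge("knows", "john", "jane", since=2020)

friend_names = [v.properties["name"] for v in graph.out("john", "knows")]
print(friend_names)
```

Note that properties live on both vertices (`name`, `age`) and edges (`since`), which is what distinguishes a property graph from a plain adjacency list.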
Gremlin Query Examples
// Add a vertex (person)
// Note: on a partitioned graph, each addV must also set the partition key property
g.addV('person')
  .property('id', 'john')
  .property('name', 'John')
  .property('age', 35)
// Add another vertex
g.addV('person')
  .property('id', 'jane')
  .property('name', 'Jane')
// Create relationship
g.V('john').addE('knows').to(g.V('jane'))
  .property('since', 2020)
// Find John's friends
g.V('john').out('knows').values('name')
// Result: ["Jane"]
// Find friends of friends
g.V('john').out('knows').out('knows').values('name')
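// Recommendation: products bought by people who bought the same products as John
// as('self') labels John so where(neq('self')) can exclude him from the co-buyers;
// neq('john') would refer to a step label named 'john', not to the vertex id
g.V('john').as('self')
  .out('purchased')
  .in('purchased').where(neq('self'))
  .out('purchased')
  // Drop products John already bought (hasId compares ids; is('john') never matches a vertex)
  .not(__.in('purchased').hasId('john'))
  .dedup()
  .limit(5)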
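The recommendation traversal can be easier to reason about as set operations. The following plain-Python sketch mirrors its intent (products bought by people who share a purchase with John, minus what John already owns) on a small, made-up purchase table. The `purchases` data and the `recommend` function are hypothetical and exist only for illustration.

```python
# Hypothetical purchase data: person id -> set of product ids
purchases = {
    "john": {"laptop", "mouse"},
    "jane": {"laptop", "monitor"},
    "sam": {"mouse", "keyboard", "laptop"},
    "ann": {"desk"},
}


def recommend(person, purchases, limit=5):
    owned = purchases[person]
    # People other than `person` who bought at least one of the same products
    peers = [p for p, items in purchases.items()
             if p != person and items & owned]
    # Everything those peers bought that `person` does not already own
    candidates = set()
    for peer in peers:
        candidates |= purchases[peer] - owned
    # Sorting makes the result deterministic before applying the limit
    return sorted(candidates)[:limit]


recommendations = recommend("john", purchases)
print(recommendations)
```

Ann shares no purchase with John, so her desk never becomes a candidate. That corresponds to the `in('purchased')` hop in the traversal, which only reaches buyers of John's own products.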
.NET Client
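using Gremlin.Net.Driver;
using Gremlin.Net.Structure.IO.GraphSON;
// primaryKey is the account key, loaded from configuration elsewhere
var gremlinServer = new GremlinServer(
    hostname: "myaccount.gremlin.cosmos.azure.com",
    port: 443,
    enableSsl: true, // Cosmos DB only accepts TLS connections
    username: "/dbs/graphdb/colls/social",
    password: primaryKey);
// Cosmos DB expects GraphSON 2; this overload assumes Gremlin.Net 3.5 or later
using var client = new GremlinClient(gremlinServer, new GraphSON2MessageSerializer());
// Execute query
var results = await client.SubmitAsync<dynamic>(
    "g.V('john').out('knows').values('name')");
foreach (var result in results)
{
    Console.WriteLine(result);
}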
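The hostname and username in the snippet above follow fixed patterns, which a small helper can make explicit. This Python sketch only builds the strings and does not open a connection; `cosmos_gremlin_settings` is a hypothetical helper, and the account, database, and graph names are the placeholder values from the example.

```python
def cosmos_gremlin_settings(account, database, graph):
    """Build the endpoint and username a Gremlin client needs for Cosmos DB."""
    return {
        # Cosmos DB exposes Gremlin over secure WebSockets on port 443
        "url": f"wss://{account}.gremlin.cosmos.azure.com:443/",
        # The username identifies the database and the graph (container)
        "username": f"/dbs/{database}/colls/{graph}",
    }


settings = cosmos_gremlin_settings("myaccount", "graphdb", "social")
print(settings["url"])
print(settings["username"])
```

A driver would take these two values together with the account key as the password, as the .NET client above does.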
Partition Strategy
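Partitioning is crucial in Cosmos DB. Each edge is stored in the same partition as its source vertex, so following outgoing edges stays local, while hops to vertices in other partitions fan out and cost more. For graphs, choose a partition key that keeps frequently traversed vertices together when possible. Common strategies: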
- Partition by tenant/organization in multi-tenant systems
- Partition by locale for location-based graphs
- Use a synthetic partition key combining entity type and region
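The synthetic key in the last bullet can be sketched as a simple string concatenation computed before a vertex is written. This is an illustration under stated assumptions: the `partition_key` function, the `type-region` format, and the sample vertices are all hypothetical.

```python
from collections import defaultdict


def partition_key(entity_type, region):
    """Combine entity type and region into one synthetic partition key value."""
    return f"{entity_type}-{region}".lower()


# Hypothetical vertices: (id, entity type, region)
vertices = [
    ("john", "Person", "EU"),
    ("jane", "Person", "EU"),
    ("bob", "Person", "US"),
    ("store-1", "Location", "EU"),
]

# Group vertex ids by key to see which vertices would share a logical partition
partitions = defaultdict(list)
for vertex_id, entity_type, region in vertices:
    partitions[partition_key(entity_type, region)].append(vertex_id)

for key, ids in sorted(partitions.items()):
    print(key, ids)
```

In this sketch John and Jane land in the same logical partition, so a `knows` traversal between them would stay local, while a hop from John to Bob would cross partitions. Whether that trade-off is acceptable depends on which traversals dominate your workload.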
Key Takeaways
- Use Graph API when relationships are the core of your domain
- Gremlin provides powerful traversal queries for connected data
- Plan partitioning to keep related vertices together
- Great for social networks, recommendations, and knowledge graphs