Commit ad6f680

add cursor rules (#76)
1 parent f8463d5 commit ad6f680

7 files changed, +846 -0 lines changed

Lines changed: 363 additions & 0 deletions
@@ -0,0 +1,363 @@
---
description:
globs:
alwaysApply: false
---
# Rule Name: dql-aggregation
# Description: Generate DQL queries for aggregation operations including counting, summing, averaging, and other mathematical operations on graph data.

# DQL Aggregation Query Patterns

DQL aggregation queries compute values over graph data: counts of nodes and relationships, and statistics such as sums, averages, minimums, and maximums.

Follow [dql-language.mdc](mdc:.cursor/rules/dql-language.mdc) to generate valid DQL.

Respond with the parameterized query, using meaningful parameters and identifier values. Include comments in the query to explain the aggregation steps.

Don't prompt the user for anything else. Just produce the query.

---

## 1. Basic Counting Operations

**Pattern**: Count entities, relationships, or specific attribute values.

**Instructions**:
- Use `count(uid)` to count the nodes matched by a block
- Use `count(predicate)` to count the edges of that predicate on each node
- Combine with filters to count subsets
- Use `has()` to ensure entities have the predicate before counting

**Generic Template**:
```dql
query countEntities($filterValue: string = "FILTER_VALUE") {
  # Count total entities
  totalCount(func: has(ENTITY.IDENTIFIER)) {
    totalEntities: count(uid)
  }

  # Count entities with specific criteria
  filteredCount(func: has(ENTITY.ATTRIBUTE))
    @filter(eq(ENTITY.FILTER_ATTRIBUTE, $filterValue)) {
    filteredEntities: count(uid)
  }

  # Count relationships per entity
  entityRelationCounts(func: has(ENTITY.RELATION)) {
    ENTITY.IDENTIFIER
    relationCount: count(ENTITY.RELATION)
  }
}
```

**Replace**:
- `ENTITY` with the entity type
- `IDENTIFIER`, `ATTRIBUTE`, `FILTER_ATTRIBUTE` with actual attribute names
- `RELATION` with relationship predicate
- `FILTER_VALUE` with the filter criteria
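
As an illustration, the counting template could be filled in for a hypothetical `Product` type. The predicate names (`Product.name`, `Product.category`, `Product.reviews`) are assumptions made for this sketch, and the `eq` filter assumes `Product.category` is indexed.

```dql
query countProducts($category: string = "books") {
  # Count all products
  totalCount(func: has(Product.name)) {
    totalProducts: count(uid)
  }

  # Count products in one category
  categoryCount(func: has(Product.name))
    @filter(eq(Product.category, $category)) {
    productsInCategory: count(uid)
  }

  # Count reviews per product
  reviewCounts(func: has(Product.reviews)) {
    Product.name
    reviewCount: count(Product.reviews)
  }
}
```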

---

## 2. Statistical Aggregations (Sum, Average, Min, Max)

**Pattern**: Perform mathematical operations on numeric attributes.

**Instructions**:
- Use `sum()`, `avg()`, `min()`, `max()` functions
- Apply to numeric predicates only
- Combine with grouping for category-wise statistics
- Bind the predicate to a value variable (`v as predicate`) and aggregate with `val(v)`; `val()` takes a variable, not a predicate name

**Generic Template**:
```dql
query getStatistics($groupBy: string = "GROUP_VALUE") {
  # Bind the numeric predicate to a value variable; val() takes variables, not predicates
  var(func: has(ENTITY.NUMERIC_ATTRIBUTE)) {
    allValues as ENTITY.NUMERIC_ATTRIBUTE
  }

  # Overall statistics, aggregated in a block with no root function
  overallStats() {
    totalSum: sum(val(allValues))
    average: avg(val(allValues))
    minimum: min(val(allValues))
    maximum: max(val(allValues))
  }

  overallCount(func: uid(allValues)) {
    count: count(uid)
  }

  # Bind the values of entities in the requested group
  var(func: eq(ENTITY.GROUP_ATTRIBUTE, $groupBy)) {
    groupValues as ENTITY.NUMERIC_ATTRIBUTE
  }

  # Grouped statistics
  groupedStats() {
    groupSum: sum(val(groupValues))
    groupAvg: avg(val(groupValues))
  }

  groupedCount(func: eq(ENTITY.GROUP_ATTRIBUTE, $groupBy)) {
    groupCount: count(uid)
  }

  # Include entities in this group
  groupMembers(func: eq(ENTITY.GROUP_ATTRIBUTE, $groupBy)) {
    ENTITY.GROUP_ATTRIBUTE
    ENTITY.RELATION {
      RELATED_ENTITY.IDENTIFIER
      RELATED_ENTITY.NUMERIC_ATTRIBUTE
    }
  }
}
```

**Replace**:
- `NUMERIC_ATTRIBUTE` with numeric predicate (e.g., `price`, `age`, `score`)
- `GROUP_ATTRIBUTE` with grouping predicate (e.g., `category`, `status`)
- `GROUP_VALUE` with the specific group to analyze
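
A minimal sketch of the value-variable form of these statistics, assuming a hypothetical `Product` type with a numeric `Product.price` and an indexed `Product.category` (both names are illustrative, not from a real schema). The predicate is first bound to a variable, and the aggregates are then computed in a block with no root function.

```dql
query getPriceStatistics($category: string = "books") {
  # Bind each matching product's price to a value variable
  var(func: eq(Product.category, $category)) {
    price as Product.price
  }

  # Aggregate the bound values
  priceStats() {
    totalPrice: sum(val(price))
    averagePrice: avg(val(price))
    minPrice: min(val(price))
    maxPrice: max(val(price))
  }
}
```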

---

## 3. Hierarchical Aggregations

**Pattern**: Aggregate data across hierarchical relationships (parent-child, category-subcategory).

**Instructions**:
- Use nested aggregations to roll up values
- Combine parent and child counts/sums
- Use variables to propagate values up the hierarchy
- Include both individual and cumulative totals

**Generic Template**:
```dql
query getHierarchicalAggregation($parentId: string = "PARENT_VALUE") {
  # Roll values up the hierarchy with value variables
  var(func: eq(PARENT_ENTITY.IDENTIFIER, $parentId)) {
    PARENT_ENTITY.CHILD_RELATION {
      childValue as CHILD_ENTITY.NUMERIC_ATTRIBUTE
      CHILD_ENTITY.GRANDCHILD_RELATION {
        grandchildValue as GRANDCHILD_ENTITY.NUMERIC_ATTRIBUTE
      }
      # Per-child sum over its grandchildren
      grandchildrenTotal as sum(val(grandchildValue))
    }
    # Per-parent sum over its direct children
    childrenTotal as sum(val(childValue))
  }

  # Parent level aggregation
  parentAggregation(func: eq(PARENT_ENTITY.IDENTIFIER, $parentId)) {
    PARENT_ENTITY.IDENTIFIER
    PARENT_ENTITY.ATTRIBUTE

    # Direct children aggregation
    directChildrenCount: count(PARENT_ENTITY.CHILD_RELATION)
    directChildrenSum: val(childrenTotal)

    # Detailed children with their own aggregations
    PARENT_ENTITY.CHILD_RELATION {
      CHILD_ENTITY.IDENTIFIER
      CHILD_ENTITY.NUMERIC_ATTRIBUTE

      # Grandchildren aggregation
      grandchildrenCount: count(CHILD_ENTITY.GRANDCHILD_RELATION)
      grandchildrenSum: val(grandchildrenTotal)

      # Include grandchildren details
      CHILD_ENTITY.GRANDCHILD_RELATION {
        GRANDCHILD_ENTITY.IDENTIFIER
        GRANDCHILD_ENTITY.NUMERIC_ATTRIBUTE
      }
    }
  }
}
```

**Replace**:
- `PARENT_ENTITY`, `CHILD_ENTITY`, `GRANDCHILD_ENTITY` with entity types
- `CHILD_RELATION`, `GRANDCHILD_RELATION` with relationship predicates
- `NUMERIC_ATTRIBUTE` with the attribute to aggregate
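
A two-level roll-up might look like the sketch below. It assumes hypothetical `Category` and `Product` types linked by a `Category.products` edge, with an indexed `Category.name`; none of these names come from a real schema. The sum is declared at the parent level, so each category gets a total over its own products.

```dql
query getCategoryRollup($categoryName: string = "books") {
  var(func: eq(Category.name, $categoryName)) {
    Category.products {
      price as Product.price
    }
    # One sum per category, over that category's products
    categoryTotal as sum(val(price))
  }

  categoryRollup(func: eq(Category.name, $categoryName)) {
    Category.name
    productCount: count(Category.products)
    totalPrice: val(categoryTotal)

    Category.products {
      Product.name
      Product.price
    }
  }
}
```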

---

## 4. Time-based Aggregations

**Pattern**: Aggregate data by time periods (daily, monthly, yearly).

**Instructions**:
- Filter on a date/datetime predicate, then aggregate the matching nodes
- To bucket by time period, use one date-range filter per period; this template has no date-part extraction step
- Use `ge()`, `le()` for date range filtering
- Combine with other filters for specific time-based analysis

**Generic Template**:
```dql
query getTimeBasedAggregation($startDate: string = "2024-01-01", $endDate: string = "2024-12-31") {
  # Collect entities in the date range and bind their numeric values
  var(func: has(ENTITY.DATE_ATTRIBUTE))
    @filter(ge(ENTITY.DATE_ATTRIBUTE, $startDate) AND le(ENTITY.DATE_ATTRIBUTE, $endDate)) {
    inRange as uid
    periodValue as ENTITY.NUMERIC_ATTRIBUTE
    ENTITY.GROUP_RELATION {
      groups as uid
    }
  }

  # Overall stats for the period
  timeRangeCount(func: uid(inRange)) {
    totalCount: count(uid)
  }
  timeRangeStats() {
    totalSum: sum(val(periodValue))
    avgValue: avg(val(periodValue))
  }

  # Per-group sums; the reverse edge requires @reverse on ENTITY.GROUP_RELATION
  var(func: uid(groups)) {
    ~ENTITY.GROUP_RELATION @filter(uid(inRange)) {
      groupValue as ENTITY.NUMERIC_ATTRIBUTE
    }
    groupSum as sum(val(groupValue))
  }

  # Group by related entity (e.g., by category, user, etc.)
  groupedStats(func: uid(groups)) {
    GROUP_ENTITY.IDENTIFIER
    periodCount: count(~ENTITY.GROUP_RELATION @filter(uid(inRange)))
    periodSum: val(groupSum)
  }
}
```

**Replace**:
- `DATE_ATTRIBUTE` with date/datetime predicate
- `GROUP_RELATION` with the relationship to group by
- `GROUP_ENTITY` with the entity type to group by
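
For a single period, the date-range pattern could be sketched as follows. The `Order` type and its `Order.placedAt` and `Order.total` predicates are hypothetical, and using `ge` as the root function assumes `Order.placedAt` carries a datetime index.

```dql
query getOrderTotals($startDate: string = "2024-01-01", $endDate: string = "2024-12-31") {
  # Select orders placed in the range and bind their totals
  var(func: ge(Order.placedAt, $startDate)) @filter(le(Order.placedAt, $endDate)) {
    ordersInRange as uid
    orderTotal as Order.total
  }

  orderCount(func: uid(ordersInRange)) {
    totalOrders: count(uid)
  }

  orderStats() {
    revenue: sum(val(orderTotal))
    averageOrder: avg(val(orderTotal))
  }
}
```

Running the same query with different `$startDate` and `$endDate` values gives one result per period.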

---

## 5. Complex Multi-level Aggregations

**Pattern**: Perform aggregations across multiple relationship levels with complex conditions.

**Instructions**:
- Use variables to collect UIDs at different levels
- Apply multiple aggregation functions
- Use `val()` for computed value references
- Combine filtering with aggregation

**Generic Template**:
```dql
query getComplexAggregation($param1: string = "VALUE1", $threshold: int = 100) {
  # Step 1: Identify entities meeting criteria
  var(func: eq(ENTITY1.ATTRIBUTE, $param1)) {
    entity1_set as uid
  }

  # Step 2: Traverse related entities and bind intermediate values
  var(func: uid(entity1_set)) {
    ENTITY1.RELATION1 {
      intermediate_value as ENTITY2.NUMERIC_ATTRIBUTE
      # ENTITY3 stands for the entity type reached through RELATION2
      ENTITY2.RELATION2 {
        nested_value as ENTITY3.NUMERIC_ATTRIBUTE
      }
      # Per-ENTITY2 sum over its RELATION2 targets
      nested_total as sum(val(nested_value))
    }
    # Step 3: Aggregate intermediate values per ENTITY1 node
    total_intermediate as sum(val(intermediate_value))
  }

  # Step 4: Final aggregation with conditions
  complexAggregation(func: uid(entity1_set)) {
    ENTITY1.IDENTIFIER
    ENTITY1.ATTRIBUTE

    # Direct aggregations
    directCount: count(ENTITY1.RELATION1)
    directSum: val(total_intermediate)

    # Conditional aggregations
    highValueCount: count(ENTITY1.RELATION1 @filter(gt(ENTITY2.NUMERIC_ATTRIBUTE, $threshold)))

    # Multi-level aggregations
    ENTITY1.RELATION1 {
      ENTITY2.IDENTIFIER
      ENTITY2.NUMERIC_ATTRIBUTE
      nestedCount: count(ENTITY2.RELATION2)
      nestedSum: val(nested_total)
    }
  }

  # Reference computed totals across all matched entities
  overallTotals() {
    totalIntermediate: sum(val(total_intermediate))
  }
}
```

---

## 6. Ranking and Top-K Aggregations

**Pattern**: Find top/bottom entities based on aggregated values.

**Instructions**:
- Use `orderdesc` or `orderasc` for sorting
- Use `first` parameter to limit results
- Combine aggregation with ranking
- Use `val()` to sort by computed values

**Generic Template**:
```dql
query getTopEntities($topK: int = 10, $minThreshold: int = 0) {
  # Compute aggregated values per entity
  var(func: has(ENTITY.RELATION)) {
    ENTITY.RELATION {
      related_value as RELATED_ENTITY.NUMERIC_ATTRIBUTE
    }
    aggregated_value as sum(val(related_value))
    average_value as avg(val(related_value))
  }

  # Get top K entities by aggregated value
  topEntities(func: uid(aggregated_value), orderdesc: val(aggregated_value), first: $topK)
    @filter(gt(val(aggregated_value), $minThreshold)) {

    ENTITY.IDENTIFIER
    ENTITY.ATTRIBUTE
    totalValue: val(aggregated_value)

    # Show breakdown of the aggregated value
    relationCount: count(ENTITY.RELATION)
    avgValue: val(average_value)

    # Include top contributing relationships
    ENTITY.RELATION (orderdesc: RELATED_ENTITY.NUMERIC_ATTRIBUTE, first: 5) {
      RELATED_ENTITY.IDENTIFIER
      RELATED_ENTITY.NUMERIC_ATTRIBUTE
    }
  }
}
```
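
A compact instance of the ranking pattern, assuming a hypothetical `Product` type with a `Product.reviews` edge (illustrative names only): the count is bound to a value variable, which then serves as both the root set and the sort key.

```dql
query getMostReviewedProducts($topK: int = 10) {
  # Bind the per-product review count to a value variable
  var(func: has(Product.reviews)) {
    reviewCount as count(Product.reviews)
  }

  # Rank products by the computed count
  topProducts(func: uid(reviewCount), orderdesc: val(reviewCount), first: $topK) {
    Product.name
    reviews: val(reviewCount)
  }
}
```

Swapping `orderdesc` for `orderasc` returns the least-reviewed products instead.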

---

## 7. Conditional Aggregations

**Pattern**: Perform aggregations with complex conditional logic.

**Instructions**:
- Apply `@filter` to the edge being counted or to the edge that binds a value variable
- Combine multiple conditions with `AND`, `OR`, `NOT`
- Use `uid_in()` for relationship-based conditions
- Apply different aggregations based on conditions

**Generic Template**:
```dql
query getConditionalAggregation($condition1: string = "VALUE1", $condition2: int = 50) {
  # Bind values of relations matching condition 1
  var(func: has(ENTITY.IDENTIFIER)) {
    ENTITY.RELATION @filter(eq(RELATED_ENTITY.ATTRIBUTE1, $condition1)) {
      value1 as RELATED_ENTITY.NUMERIC_ATTRIBUTE
    }
    condition1_sum as sum(val(value1))
    condition1_avg as avg(val(value1))
  }

  # Bind values of relations matching condition 2
  var(func: has(ENTITY.IDENTIFIER)) {
    ENTITY.RELATION @filter(gt(RELATED_ENTITY.NUMERIC_ATTRIBUTE, $condition2)) {
      value2 as RELATED_ENTITY.NUMERIC_ATTRIBUTE
    }
    condition2_sum as sum(val(value2))
    condition2_avg as avg(val(value2))
  }

  conditionalAggregation(func: has(ENTITY.IDENTIFIER)) {
    ENTITY.IDENTIFIER

    # Total counts
    totalRelations: count(ENTITY.RELATION)

    # Conditional counts
    condition1Count: count(ENTITY.RELATION @filter(eq(RELATED_ENTITY.ATTRIBUTE1, $condition1)))
    condition2Count: count(ENTITY.RELATION @filter(gt(RELATED_ENTITY.NUMERIC_ATTRIBUTE, $condition2)))
    bothConditionsCount: count(ENTITY.RELATION @filter(eq(RELATED_ENTITY.ATTRIBUTE1, $condition1) AND gt(RELATED_ENTITY.NUMERIC_ATTRIBUTE, $condition2)))

    # Conditional sums
    condition1Sum: val(condition1_sum)
    condition2Sum: val(condition2_sum)

    # Conditional averages
    condition1Avg: val(condition1_avg)
    condition2Avg: val(condition2_avg)
  }
}
```
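
One way to sketch a conditional sum and average is to filter the edge that binds the value variable, so only matching children contribute. The example assumes hypothetical `Product.reviews` and `Review.rating` predicates, with an int index on `Review.rating` for the `ge` filter.

```dql
query getHighRatingStats($minRating: int = 4) {
  var(func: has(Product.reviews)) {
    # Only reviews that pass the filter are bound to the variable
    Product.reviews @filter(ge(Review.rating, $minRating)) {
      rating as Review.rating
    }
    highRatingSum as sum(val(rating))
    highRatingAvg as avg(val(rating))
  }

  products(func: uid(highRatingAvg)) {
    Product.name
    totalHighRating: val(highRatingSum)
    averageHighRating: val(highRatingAvg)
  }
}
```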

---

## Validation Guidelines

- Always validate that aggregation functions are applied to appropriate data types
- Use meaningful parameter names and default values for thresholds and limits
- Include comments explaining the aggregation logic
- Consider performance implications of complex aggregations
- Use variables efficiently to avoid redundant calculations
- Test aggregation results with known data sets
- Handle edge cases (empty results, null values)
- Use appropriate root functions to minimize the initial result set

---

## Common Aggregation Functions

- `count(predicate)` - Count the edges of a predicate on each node
- `count(uid)` - Count nodes in current context
- `sum(val(variable))` - Sum the numeric values bound to a value variable
- `avg(val(variable))` - Average of the values bound to a value variable
- `min(val(variable))` - Minimum of the values bound to a value variable
- `max(val(variable))` - Maximum of the values bound to a value variable
- `val(variable)` - Reference computed values
- `math(expression)` - Custom mathematical expressions
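
To show how `math()` combines value variables, the sketch below adds two per-node counts into a single score and ranks by it. The `Product.name`, `Product.reviews`, and `Product.orders` predicates are hypothetical names chosen for illustration.

```dql
{
  var(func: has(Product.name)) {
    reviews as count(Product.reviews)
    orders as count(Product.orders)
    # Combine the two counts into one per-product score
    activity as math(reviews + orders)
  }

  mostActive(func: uid(activity), orderdesc: val(activity), first: 5) {
    Product.name
    activityScore: val(activity)
  }
}
```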

---

## Performance Considerations

- Use `has(predicate)` to filter entities before aggregation
- Apply filters early to reduce the dataset size
- Use variables to avoid recomputing the same aggregations
- Consider using `first` parameter to limit large result sets
- Be cautious with deep nesting in aggregation queries
- Use appropriate indexes on predicates used in aggregations