Writing Custom ClickHouse Queries for Application Security Analytics
A developer's tutorial on writing ClickHouse SQL queries for security analytics — find suspicious IPs, detect error patterns, and analyze application traffic using trace data.
Your application generates thousands of traces per hour. Each trace contains spans with HTTP methods, URLs, status codes, client IPs, database queries, response times, and service-to-service call chains. That data sits in ClickHouse — a columnar database engine built for exactly the kind of analytical queries that security investigations demand. The question is whether you know how to ask it the right questions.
Most developers are comfortable with PostgreSQL or MySQL. ClickHouse SQL looks similar on the surface, but its columnar architecture and specialized functions unlock analytical patterns that traditional databases struggle with. Scanning billions of spans to find the ten IPs that exhibited suspicious behavior in the last hour? ClickHouse does that in under a second.
This tutorial teaches you the ClickHouse query patterns that matter for application security: the schema you're querying, the functions that make security analytics practical, and a library of ready-to-use queries you can run in SecureNow's forensics console or convert into alert rules.
Understanding the Trace Schema
SecureNow stores OpenTelemetry trace data in the SigNoz format. The primary table you'll query is signoz_traces.distributed_signoz_index_v2. Understanding its columns is essential for writing effective queries.
Key Columns for Security Analytics
| Column | Type | Description |
|---|---|---|
| timestamp | DateTime64 | When the span was created |
| traceID | String | Unique trace identifier |
| spanID | String | Unique span identifier |
| parentSpanID | String | Parent span (empty for root spans) |
| serviceName | LowCardinality(String) | Application/service name |
| name | LowCardinality(String) | Span name (usually the route or operation) |
| kind | Int8 | Span kind (1=internal, 2=server, 3=client) |
| duration | UInt64 | Span duration in nanoseconds |
| response_status_code | LowCardinality(String) | HTTP status code |
| attribute_string_http_method | LowCardinality(String) | HTTP method (GET, POST, etc.) |
| attribute_string_http_url | String | Full request URL |
| attribute_string_http_host | String | HTTP Host header |
| attribute_string_peer_ip | String | Client IP address |
| attribute_string_db_system | LowCardinality(String) | Database system (postgres, mysql, etc.) |
| attribute_string_db_statement | String | Database query statement |
Additional columns exist for custom attributes you've added to your spans. Use the schema introspection feature in SecureNow's forensics console to explore the full column list for your specific data.
The __USER_APP_KEYS__ Placeholder
When writing queries for alert rules, use __USER_APP_KEYS__ instead of hardcoding service names. SecureNow replaces this placeholder with your registered application keys at execution time:
WHERE serviceName IN (__USER_APP_KEYS__)
This ensures rules automatically apply to all your registered services and adapt as you add new applications.
For ad-hoc forensic queries, you can either filter by specific service names or select an application context in the forensics UI, which injects the appropriate filter automatically.
ClickHouse Functions Every Security Analyst Should Know
ClickHouse includes a rich library of functions that make security analytics practical. These are the ones you'll reach for most often.
Time Bucketing with toStartOfInterval
Group events into time windows for trend analysis. This is essential for detecting spikes and anomalies:
SELECT
toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS time_bucket,
count(*) AS request_count,
countIf(response_status_code >= 400) AS error_count
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
AND serviceName IN (__USER_APP_KEYS__)
GROUP BY time_bucket
ORDER BY time_bucket
This gives you a 5-minute-resolution time series of total requests vs. errors — the basic shape of your traffic that makes spikes visually obvious.
Conditional Counting with countIf and sumIf
Count or sum only rows matching a condition, without needing subqueries:
countIf(response_status_code = 401) AS unauthorized_count,
countIf(response_status_code = 403) AS forbidden_count,
countIf(response_status_code >= 500) AS server_errors
This is dramatically more efficient than running separate filtered queries and joining the results.
Unique Value Collection with groupArray and groupUniqArray
Collect values into an array for each group — groupArray keeps every value (including duplicates), while groupUniqArray keeps only distinct values. The latter is perfect for seeing which endpoints an IP targeted or which methods were used:
groupUniqArray(name) AS targeted_endpoints,
groupUniqArray(attribute_string_http_method) AS methods_used
Exact and Approximate Distinct Counts
uniqExact() gives precise distinct counts. uniq() gives approximate counts that are faster on large datasets:
uniqExact(traceID) AS unique_traces,
uniq(attribute_string_peer_ip) AS approx_unique_ips
For security analytics, use uniqExact when precision matters (e.g., counting affected users) and uniq when you just need order-of-magnitude estimates.
String Matching with like, match, and extract
ClickHouse supports both SQL LIKE patterns and full regular expressions via match():
WHERE name LIKE '%/api/auth%'
WHERE match(attribute_string_http_url, '(?i)(union|select|insert|drop)')
WHERE extract(attribute_string_http_url, '(\d+\.\d+\.\d+\.\d+)') != ''
The match() function uses RE2 syntax and is significantly more powerful than LIKE for pattern detection.
Array Operations with arrayJoin
OpenTelemetry attributes sometimes contain array values. arrayJoin expands arrays into individual rows, letting you analyze nested data:
SELECT
arrayJoin(attribute_string_tags) AS tag,
count(*) AS occurrences
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY tag
ORDER BY occurrences DESC
Security Query Library
Here are eight ready-to-use queries covering the most common security analytics scenarios. Run these directly in SecureNow's forensics console, save them to your query library, or adapt them into alert rules.
Query 1: Top IPs by Error Rate
Identify IPs that generate a disproportionate number of errors — the primary indicator of scanning and brute-force activity:
SELECT
attribute_string_peer_ip AS ip,
count(*) AS total_requests,
countIf(response_status_code >= 400 AND response_status_code < 500) AS client_errors,
countIf(response_status_code >= 500) AS server_errors,
round(client_errors / total_requests * 100, 2) AS client_error_rate,
round(server_errors / total_requests * 100, 2) AS server_error_rate,
min(timestamp) AS first_seen,
max(timestamp) AS last_seen
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND attribute_string_peer_ip != ''
GROUP BY ip
HAVING total_requests > 10
ORDER BY client_errors DESC
LIMIT 25
The HAVING total_requests > 10 clause filters out IPs with too few requests to be statistically meaningful. Adjust this threshold based on your traffic volume.
Query 2: Suspicious User Agent Detection
Automated tools often use distinctive or missing User-Agent strings. This query surfaces unusual agents hitting your application:
SELECT
attribute_string_http_user_agent AS user_agent,
count(*) AS request_count,
uniqExact(attribute_string_peer_ip) AS unique_ips,
countIf(response_status_code >= 400) AS error_count,
round(error_count / request_count * 100, 2) AS error_rate,
    groupUniqArray(10)(name) AS sample_endpoints
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 6 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND attribute_string_http_user_agent != ''
GROUP BY user_agent
HAVING request_count > 20
ORDER BY error_rate DESC
LIMIT 20
Look for user agents like python-requests, curl, Go-http-client, or empty strings with high error rates — these are common scanner signatures. The sample_endpoints column shows what they're targeting.
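To turn that manual review into a filter, you can match the common automation signatures directly with ClickHouse's multiSearchAny over the lowercased agent string. A sketch — the signature list here is an illustrative assumption; adapt it to what you actually see in your traffic:

```sql
SELECT
    attribute_string_http_user_agent AS user_agent,
    count(*) AS request_count,
    uniqExact(attribute_string_peer_ip) AS unique_ips
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 6 HOUR
  AND serviceName IN (__USER_APP_KEYS__)
  -- flag empty agents or ones containing common automation signatures
  AND (attribute_string_http_user_agent = ''
       OR multiSearchAny(lower(attribute_string_http_user_agent),
           ['python-requests', 'curl', 'go-http-client', 'sqlmap', 'nikto']))
GROUP BY user_agent
ORDER BY request_count DESC
LIMIT 20
```

Legitimate internal tooling also uses these clients, so treat matches as leads to investigate, not verdicts.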
Query 3: Endpoint Abuse Detection
Find endpoints receiving abnormally high traffic. This catches API abuse, enumeration attacks, and denial-of-service attempts:
SELECT
name AS endpoint,
attribute_string_http_method AS method,
count(*) AS request_count,
uniqExact(attribute_string_peer_ip) AS unique_ips,
countIf(response_status_code >= 400) AS error_count,
avg(duration) / 1000000 AS avg_duration_ms,
max(duration) / 1000000 AS max_duration_ms
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND kind = 2
GROUP BY endpoint, method
ORDER BY request_count DESC
LIMIT 20
The kind = 2 filter selects only server spans (incoming requests), excluding internal and client spans. This gives you the true request count per endpoint.
Query 4: Slow Query Detection (Potential DoS or Injection)
Unusually slow database operations can indicate SQL injection probes, resource exhaustion attacks, or queries designed to cause denial-of-service:
SELECT
attribute_string_db_system AS db_system,
attribute_string_db_statement AS query,
duration / 1000000 AS duration_ms,
serviceName AS service,
attribute_string_peer_ip AS source_ip,
timestamp
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND attribute_string_db_system != ''
AND duration > 5000000000
ORDER BY duration DESC
LIMIT 20
The duration > 5000000000 filter catches queries taking more than 5 seconds (duration is stored in nanoseconds). Examine the db_statement column for injection patterns — UNION SELECT, SLEEP(), BENCHMARK(), or pg_sleep() are strong indicators.
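You can also flag those injection fingerprints directly rather than eyeballing the db_statement column. ClickHouse's multiSearchAnyCaseInsensitive scans for a list of needles in a single pass; this sketch uses an illustrative, non-exhaustive needle list that will produce false positives on some legitimate queries:

```sql
SELECT
    attribute_string_db_statement AS query,
    serviceName AS service,
    attribute_string_peer_ip AS source_ip,
    timestamp
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 HOUR
  AND serviceName IN (__USER_APP_KEYS__)
  AND attribute_string_db_system != ''
  -- flag statements containing common injection fingerprints (illustrative list)
  AND multiSearchAnyCaseInsensitive(attribute_string_db_statement,
      ['union select', 'sleep(', 'benchmark(', 'pg_sleep(', 'or 1=1'])
ORDER BY timestamp DESC
LIMIT 50
```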
Query 5: Traffic Distribution Over Time
Visualize traffic patterns to spot anomalies — sudden spikes, unusual off-hours activity, or traffic drops that might indicate a DDoS:
SELECT
toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS time_bucket,
count(*) AS total_requests,
uniqExact(attribute_string_peer_ip) AS unique_ips,
countIf(response_status_code >= 200 AND response_status_code < 300) AS success,
countIf(response_status_code >= 400 AND response_status_code < 500) AS client_error,
countIf(response_status_code >= 500) AS server_error
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 24 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND kind = 2
GROUP BY time_bucket
ORDER BY time_bucket
Plot the results and look for patterns: does traffic follow a normal diurnal curve? Are there sudden bursts of client errors? Does the unique IP count spike at unusual times?
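If you'd rather have spikes flagged for you than spot them on a chart, a window function can compare each bucket against the overall average. A sketch, assuming your ClickHouse version supports window functions; the 3x multiplier is an arbitrary starting threshold, not a recommendation:

```sql
SELECT
    time_bucket,
    total_requests,
    round(avg(total_requests) OVER (), 0) AS avg_requests,
    -- flag buckets running at more than 3x the 24-hour average
    total_requests > 3 * avg(total_requests) OVER () AS is_spike
FROM (
    SELECT
        toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS time_bucket,
        count(*) AS total_requests
    FROM signoz_traces.distributed_signoz_index_v2
    WHERE timestamp >= now() - INTERVAL 24 HOUR
      AND serviceName IN (__USER_APP_KEYS__)
      AND kind = 2
    GROUP BY time_bucket
)
ORDER BY time_bucket
```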
Query 6: Cross-Service Request Tracing
Track how a suspicious request propagated through your microservices. This is essential for understanding the blast radius of an attack:
SELECT
serviceName AS service,
name AS operation,
spanID,
parentSpanID,
duration / 1000000 AS duration_ms,
response_status_code AS status,
attribute_string_http_method AS method,
attribute_string_http_url AS url,
attribute_string_db_statement AS db_query
FROM signoz_traces.distributed_signoz_index_v2
WHERE traceID = 'your-trace-id-here'
ORDER BY timestamp
Replace 'your-trace-id-here' with the trace ID from an alert or the trace explorer. This query reconstructs the full request path across services, showing every database query, outbound call, and status code in execution order.
Query 7: Authentication Endpoint Analysis
Deep analysis of authentication patterns — the most critical attack surface for most applications:
SELECT
attribute_string_peer_ip AS ip,
countIf(response_status_code = 200) AS successful_logins,
countIf(response_status_code = 401) AS failed_logins,
countIf(response_status_code = 429) AS rate_limited,
count(*) AS total_attempts,
round(failed_logins / total_attempts * 100, 2) AS failure_rate,
min(timestamp) AS first_attempt,
max(timestamp) AS last_attempt,
dateDiff('second', min(timestamp), max(timestamp)) AS duration_seconds
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 6 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND (name LIKE '%/auth%' OR name LIKE '%/login%' OR name LIKE '%/token%')
AND attribute_string_peer_ip != ''
GROUP BY ip
HAVING total_attempts > 5
ORDER BY failed_logins DESC
LIMIT 25
The duration_seconds column reveals attack velocity — an IP generating 100 failed logins in 30 seconds is running an automated tool, while 100 failures spread over 6 hours might be a shared NAT.
Query 8: Outbound Connection Monitoring
Detect your application making unexpected outbound HTTP calls — a key indicator of SSRF, data exfiltration, or compromised dependencies:
SELECT
attribute_string_http_host AS destination_host,
count(*) AS call_count,
    groupUniqArray(serviceName) AS source_services,
    groupUniqArray(attribute_string_http_method) AS methods,
countIf(response_status_code >= 400) AS error_count,
min(timestamp) AS first_seen,
max(timestamp) AS last_seen
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 24 HOUR
AND serviceName IN (__USER_APP_KEYS__)
AND kind = 3
AND attribute_string_http_host != ''
GROUP BY destination_host
ORDER BY call_count DESC
LIMIT 30
The kind = 3 filter selects client spans — outbound calls your application makes. Review the list of destination hosts: do you recognize all of them? Unknown domains, IP addresses instead of hostnames, or cloud metadata endpoints (169.254.169.254) are red flags. For a real-world example of catching a supply chain attack through outbound call monitoring, see How a Developer Caught a Supply Chain Attack in Trace Data.
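A useful variation is a "new destination" check: hosts your services called in the last 24 hours that never appeared in the preceding week. A sketch using a NOT IN subquery as the baseline; adjust the windows to your retention period:

```sql
SELECT
    attribute_string_http_host AS destination_host,
    count(*) AS call_count,
    min(timestamp) AS first_seen
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 24 HOUR
  AND serviceName IN (__USER_APP_KEYS__)
  AND kind = 3
  AND attribute_string_http_host != ''
  -- exclude hosts already seen during the prior 7-day baseline window
  AND attribute_string_http_host NOT IN (
      SELECT DISTINCT attribute_string_http_host
      FROM signoz_traces.distributed_signoz_index_v2
      WHERE timestamp >= now() - INTERVAL 8 DAY
        AND timestamp < now() - INTERVAL 1 DAY
        AND serviceName IN (__USER_APP_KEYS__)
        AND kind = 3
  )
GROUP BY destination_host
ORDER BY call_count DESC
```

A brand-new outbound destination is not proof of compromise — it may be a newly deployed integration — but it is exactly the kind of change worth a minute of review.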
Using the Query Library
Writing queries from scratch every time is inefficient. SecureNow's Query Library lets you save, organize, and reuse queries.
Saving Queries
After running a query in the forensics console, click Save to Library. Add:
- Name — descriptive, like "Top error-generating IPs (1 hour)"
- Description — what the query detects and when to use it
- Category — group related queries (e.g., "Authentication", "IP Analysis", "Traffic Patterns")
- Tags — flexible labels for cross-category search (e.g., "brute-force", "scanning", "baseline")
Executing Saved Queries
Open the Query Library, find your saved query, and click Execute. The query runs immediately against your current application context. No copy-pasting, no remembering syntax.
Sharing with Your Team
Saved queries are visible to your team members. Build a shared library of security queries tailored to your applications — new team members can start investigating immediately without learning the schema from scratch.
For more on the forensics system including natural language queries, see Security Forensics with Natural Language.
Schema Introspection
Not sure which columns are available in your data? SecureNow's forensics console includes a Schema Introspection feature that lets you explore:
- All databases and tables available in your ClickHouse instance
- Column names, data types, and cardinality for each table
- Sample values from key columns
This is especially useful when your OpenTelemetry instrumentation includes custom span attributes. If you've added attributes like auth.user_id or billing.plan_type to your spans, the schema introspection shows them as queryable columns, and you can incorporate them into your security queries.
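The same information is also reachable from plain SQL: ClickHouse exposes column metadata through the system.columns table (assuming your forensic role is allowed to read it). For example, to list the attribute columns of the traces table:

```sql
SELECT name, type
FROM system.columns
WHERE database = 'signoz_traces'
  AND table = 'distributed_signoz_index_v2'
  -- narrow to span attribute columns
  AND name LIKE 'attribute_%'
ORDER BY name
```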
Multi-Instance Considerations
If you're running dedicated ClickHouse instances for different applications (rather than the shared default instance), keep these points in mind:
- Forensic queries execute against the instance bound to the selected application
- Alert rules with `__USER_APP_KEYS__` automatically route to the correct instance
- Cross-instance queries are not supported — if you need to correlate data across applications on different instances, run separate queries and correlate the results manually
- Each instance has its own schema, which may differ if applications use different OpenTelemetry instrumentation configurations
Converting Queries to Alert Rules
The bridge between forensics and detection is straightforward. Any query that identifies a threat pattern can become an alert rule. When adapting a forensic query for alerting:
- Replace absolute time filters with relative intervals (`now() - INTERVAL 15 MINUTE`)
- Add the `__USER_APP_KEYS__` filter if the query should scope to your applications
- Ensure the query returns rows only when the threat condition is met (use `HAVING` for thresholds)
- Add a `LIMIT` clause to cap result size
- Remove any columns that aren't useful in the notification context
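As a concrete sketch, here is Query 7 trimmed into alert-rule form. It returns rows only when an IP crosses 20 failed logins in the last 15 minutes — both thresholds are starting points to tune, not recommendations:

```sql
SELECT
    attribute_string_peer_ip AS ip,
    countIf(response_status_code = 401) AS failed_logins
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 15 MINUTE
  AND serviceName IN (__USER_APP_KEYS__)
  AND (name LIKE '%/auth%' OR name LIKE '%/login%' OR name LIKE '%/token%')
  AND attribute_string_peer_ip != ''
GROUP BY ip
-- fire only when the threshold is crossed; no rows means no alert
HAVING failed_logins >= 20
ORDER BY failed_logins DESC
LIMIT 25
```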
For a complete guide on building alert rules, see From Traces to Security Alerts: A Developer's Guide.
Writing Queries That Perform
ClickHouse is fast, but poorly structured queries can still take longer than necessary on large datasets. Follow these guidelines:
- Always filter by timestamp first. ClickHouse partitions data by time. A time filter lets the engine skip entire partitions.
- Use `LowCardinality` columns in WHERE clauses. Columns like `serviceName`, `name`, and `attribute_string_http_method` are `LowCardinality`, meaning ClickHouse can filter them extremely efficiently.
- Prefer `countIf` over subqueries. Multiple `countIf` expressions in a single query are orders of magnitude faster than separate queries with different WHERE clauses.
- Use `LIMIT` generously. You rarely need more than 25 results for investigation. Limiting output reduces memory usage and network transfer.
- Avoid `SELECT *`. Columnar databases perform best when you select only the columns you need. Each additional column is an additional data stream to read.
Your trace data is a security goldmine. The queries in this guide give you the tools to extract threat intelligence from it — whether through ad-hoc investigation, saved library queries, or automated alert rules running around the clock.
Frequently Asked Questions
What ClickHouse tables does SecureNow use?
SecureNow queries signoz_traces.distributed_signoz_index_v2 for trace data, signoz_logs for log data, and signoz_metrics for metrics. The traces table contains span data with HTTP methods, URLs, status codes, IP addresses, and more.
Can I run arbitrary SQL queries?
SecureNow restricts forensic queries to SELECT statements for safety. You can query any column in the trace tables, use aggregations, window functions, and ClickHouse-specific functions.
How do I scope queries to my applications?
Use the serviceName column to filter by application, or let SecureNow automatically inject your application keys using the __USER_APP_KEYS__ placeholder in alert rules.
Can I save and reuse queries?
Yes, the query library lets you save queries with names, descriptions, categories, and tags. You can execute saved queries with one click and share them with your team.