Security Forensics with Natural Language: Query Your Traces Like You Think

You are investigating an incident. A suspicious pattern appeared in your alerts at 3:47 AM. You need to know: which IPs made requests to your authentication endpoints in the last six hours, how many failed attempts came from each source, and whether any of those IPs also accessed your admin panel. The data exists in your ClickHouse database, spread across millions of trace records.

The traditional approach requires you to know ClickHouse SQL syntax, understand the OpenTelemetry schema, remember which columns hold IP addresses versus HTTP paths, figure out the correct timestamp format, and construct a multi-join query with proper aggregation. Even experienced analysts spend 10–15 minutes writing and debugging these queries. During an active incident, that delay is unacceptable.

What if you could simply ask: "Show me all IPs that hit /api/auth with more than 50 failed requests in the last 6 hours, and check if any of them also accessed /admin endpoints"?

That is exactly what SecureNow's natural language forensics system does. You type a question in plain English. The AI converts it to optimized ClickHouse SQL. The query executes against your live trace data. Results appear in seconds.

The Gap Between Security Questions and Database Queries

Security investigators think in questions. Databases think in SQL. The translation between these two is where investigations stall.

The NIST Computer Security Incident Handling Guide (SP 800-61) outlines a structured incident response process where the analysis phase requires rapid correlation of evidence across multiple data sources. In practice, most teams hit a bottleneck here—not because the data is missing, but because extracting it requires specialized query skills that not every security professional possesses.

ClickHouse is extraordinarily powerful for security analytics. Its columnar architecture can scan billions of spans in seconds. But its SQL dialect has idiosyncrasies (DateTime64 handling, LowCardinality types, materialized column references, array functions for nested attributes) that make it unforgiving for anyone who is not a database specialist.

This creates a two-tier system in most security teams. A handful of senior engineers who know the schema can answer forensic questions rapidly. Everyone else files a ticket and waits. During an incident, that bottleneck can mean the difference between containment and breach.

How SecureNow's NL-to-SQL Forensics Works

SecureNow bridges this gap with an AI-powered forensics system that converts natural language questions into accurate, optimized ClickHouse queries. The system is not a simple text-to-SQL translator—it understands your specific data environment and generates queries that are both correct and performant.

Schema Introspection: Understanding Your Data

Before generating any SQL, the AI needs to understand what data is available. SecureNow's forensics system performs deep schema introspection, gathering:

Databases and tables — the full catalog of available data stores, including signoz_traces, signoz_logs, and signoz_metrics
Column definitions — every column name, data type, and whether it is a LowCardinality, Nullable, or materialized column
Index information — which columns are indexed and how, enabling the AI to generate queries that hit indexes rather than triggering full table scans
Table relationships — how trace, log, and metric tables relate through trace IDs, span IDs, and timestamps
Sample values — representative data from key columns, so the AI understands what actual service names, HTTP methods, and status codes look like in your environment

This context is assembled and provided to the AI alongside your natural language question, enabling it to generate queries that are syntactically correct for ClickHouse, use the right table and column references, leverage indexes for performance, and return the specific data you asked for.
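To make the introspection step concrete, here is a minimal sketch of how schema context could be gathered and flattened into text for the AI. It is an illustration, not SecureNow's implementation: system.columns and the selected fields are real ClickHouse metadata, but the Column class, the describe and build_schema_context helpers, and the spans table used in the usage note are hypothetical names.

```python
from dataclasses import dataclass

# system.columns is a real ClickHouse system table; the query text is a sketch.
SCHEMA_QUERY = """
SELECT database, table, name, type, default_kind, is_in_sorting_key
FROM system.columns
WHERE database IN ('signoz_traces', 'signoz_logs', 'signoz_metrics')
ORDER BY database, table, position
"""


@dataclass
class Column:
    """One row of schema metadata (hypothetical container)."""
    database: str
    table: str
    name: str
    type: str
    default_kind: str = ""        # 'MATERIALIZED' marks a materialized column
    in_sorting_key: bool = False


def describe(col: Column) -> str:
    """Render one column with the traits a query generator cares about."""
    notes = []
    if col.type.startswith("LowCardinality"):
        notes.append("low-cardinality")
    if "Nullable" in col.type:
        notes.append("nullable")
    if col.default_kind == "MATERIALIZED":
        notes.append("materialized")
    if col.in_sorting_key:
        notes.append("in sorting key")
    suffix = f" [{', '.join(notes)}]" if notes else ""
    return f"  {col.name} {col.type}{suffix}"


def build_schema_context(columns: list[Column]) -> str:
    """Group columns under their table and return prompt-ready text."""
    lines = []
    current = None
    for col in columns:
        key = f"{col.database}.{col.table}"
        if key != current:
            lines.append(f"TABLE {key}")
            current = key
        lines.append(describe(col))
    return "\n".join(lines)
```

Feeding it two columns, for example Column("signoz_traces", "spans", "serviceName", "LowCardinality(String)", in_sorting_key=True) and Column("signoz_traces", "spans", "statusCode", "Nullable(Int64)"), yields a compact table description that can sit next to the analyst's question in the prompt.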

Application and Instance Scoping

Security investigations are rarely global. You are usually investigating a specific application, service, or deployment. SecureNow's forensics system supports scoping queries to specific applications and instances, automatically filtering results to the relevant subset of your trace data.

When you select an application context before asking your question, the AI incorporates appropriate WHERE clauses to limit results to that service—eliminating noise from unrelated applications and dramatically improving query performance on large datasets.
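A scoping filter of this kind might look like the sketch below. The column names serviceName and instanceId and the scope_filter helper are assumptions for illustration; the {name:Type} placeholders are ClickHouse's query parameter syntax, which keeps the selected application name out of the SQL text itself.

```python
from typing import Optional


def scope_filter(service: str, instance: Optional[str] = None):
    """Return a WHERE fragment plus the parameters to bind to it.

    Column names are hypothetical; adapt them to your trace schema.
    """
    conditions = ["serviceName = {service:String}"]
    params = {"service": service}
    if instance is not None:
        conditions.append("instanceId = {instance:String}")
        params["instance"] = instance
    return " AND ".join(conditions), params
```

The returned fragment would be joined with AND onto whatever time and status filters the generated query already carries, and the parameters passed to the database client separately.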

The Query Generation Pipeline

When you submit a natural language question, the system follows a structured pipeline:

Parse intent — identify what the user is looking for (counts, specific records, aggregations, time-series data)
Map entities — connect natural language terms to schema elements ("failed requests" → http.status_code >= 400, "Russian IPs" → GeoIP resolution)
Construct query — generate syntactically valid ClickHouse SQL with proper joins, aggregations, and time filters
Validate safety — verify the query is a SELECT statement with no data modification capabilities
Execute — run the query against ClickHouse in a read-only context
Present results — display the data in a formatted table with the generated SQL visible for review
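The safety validation step is the easiest one to picture in code. The sketch below is one possible pre-execution check, not SecureNow's actual validator: it strips comments, string literals, and quoted identifiers in a single pass, then accepts only a single statement that starts with SELECT or WITH and contains none of a deny-listed set of keywords. The function name and the keyword list are assumptions, and the check deliberately errs toward rejecting anything it cannot classify. It is a complement to read-only execution, not a replacement for it.

```python
import re

# One left-to-right pass, so whichever construct starts first wins.
_SKIP = re.compile(
    r"/\*.*?\*/"               # block comment
    r"|--[^\n]*"               # line comment
    r"|'(?:[^'\\]|\\.)*'"      # string literal
    r'|"(?:[^"\\]|\\.)*"'      # double-quoted identifier
    r"|`[^`]*`",               # backtick identifier
    re.DOTALL,
)

# Illustrative deny list; extend it to match your own policy.
_FORBIDDEN = re.compile(
    r"\b(INSERT|ALTER|DROP|TRUNCATE|DELETE|UPDATE|CREATE|RENAME|"
    r"ATTACH|DETACH|GRANT|REVOKE|OUTFILE)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Conservative check: True only for a single SELECT/WITH statement."""
    bare = _SKIP.sub(" ", sql).strip()
    if bare.endswith(";"):                 # allow one trailing semicolon
        bare = bare[:-1].rstrip()
    if not bare or ";" in bare:            # empty, or more than one statement
        return False
    if any(ch in bare for ch in "'\"`#"):  # unbalanced quote or '#' comment
        return False
    if not re.match(r"(SELECT|WITH)\b", bare, re.IGNORECASE):
        return False
    return _FORBIDDEN.search(bare) is None
```

A side effect of the conservative design is the occasional false rejection, for example a column literally named update. For a forensics tool that trade-off seems acceptable, because the analyst can always rephrase, while a false acceptance cannot be undone.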

Natural Language Queries in Practice

The power of NL-to-SQL forensics becomes clear through real investigative scenarios. Here are examples of natural language questions and the kinds of queries they produce.

Threat Hunting

Question: "Show me all 500 errors from Russian IPs in the last 24 hours"

The AI generates a query that joins trace data with GeoIP resolution, filters for server errors (status code 500), constrains the time window to 24 hours, and groups results by source IP with count and timeline distribution.

Question: "Find all requests where the URL contains UNION SELECT or OR 1=1"

This produces a query scanning HTTP span attributes for common SQL injection patterns, returning the full request context including source IP, target endpoint, timestamp, and response code—exactly the evidence you need to confirm an injection campaign.
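A generated query for this question could resemble the sketch below. The spans table and its columns (timestamp, client_ip, http_url, status_code) are hypothetical stand-ins for your trace schema; multiSearchAnyCaseInsensitive and decodeURLComponent are ClickHouse functions. The LIMIT is an added assumption to keep an unbounded scan from flooding the result table.

```python
def sqli_pattern_sql(needles: list[str]) -> str:
    """Build a ClickHouse query that looks for injection markers in URLs."""
    # Escape backslashes first, then single quotes, before quoting each needle.
    quoted = ", ".join(
        "'" + n.replace("\\", "\\\\").replace("'", "\\'") + "'" for n in needles
    )
    return f"""
SELECT timestamp, client_ip, http_url, status_code
FROM spans
WHERE multiSearchAnyCaseInsensitive(decodeURLComponent(http_url), [{quoted}])
ORDER BY timestamp DESC
LIMIT 500
""".strip()


sql = sqli_pattern_sql(["union select", "or 1=1"])
```

Decoding the URL first means percent-encoded payloads such as UNION%20SELECT still match. Payloads that encode spaces as + or use comment tricks between keywords would need extra patterns.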

Question: "Which IPs made more than 100 requests per minute to any endpoint in the last 4 hours?"

The AI constructs an aggregation query with per-minute bucketing, HAVING clause for the threshold, and ordering by request rate. This identifies brute-force and credential-stuffing sources without requiring you to know ClickHouse's toStartOfMinute() function syntax.
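For the request-rate question, the shape of the generated SQL might be as follows. This is a sketch under assumed names (a spans table with timestamp and client_ip columns), shown as a small Python builder so the threshold and window stay adjustable.

```python
def requests_per_minute_sql(threshold: int, hours: int) -> str:
    """Per-minute request counts per IP, keeping only buckets above threshold."""
    # int() coercion keeps arbitrary text out of the generated SQL.
    return f"""
SELECT
    client_ip,
    toStartOfMinute(timestamp) AS minute,
    count() AS requests
FROM spans
WHERE timestamp >= now() - INTERVAL {int(hours)} HOUR
GROUP BY client_ip, minute
HAVING requests > {int(threshold)}
ORDER BY requests DESC
""".strip()


sql = requests_per_minute_sql(threshold=100, hours=4)
```

Each returned row is one IP in one minute, so a sustained credential-stuffing source shows up as many consecutive rows rather than a single total.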

Incident Investigation

Question: "Show me the full trace for request ID abc-123-def"

Simple but critical during incident response. The AI generates a query that retrieves all spans for the specified trace, ordered by start time, with full attribute details—giving you the complete execution path of a suspicious request.

Question: "What endpoints did IP 203.0.113.47 access in the last 7 days, grouped by day?"

This produces a time-series aggregation showing the attacker's reconnaissance pattern over time, revealing which endpoints were probed and when the access pattern changed. That is essential for reconstructing the attack timeline and for mapping the activity to MITRE ATT&CK's Reconnaissance tactic (TA0043).

Compliance and Auditing

Question: "List all admin API calls that were not preceded by an authentication span"

The AI generates a query that identifies administrative endpoint access without corresponding authentication spans in the same trace—a direct test for the broken access control vulnerabilities described in the OWASP Top 10 (A01:2021).
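One way to express that check is a per-trace aggregation with conditional counts, sketched below. The table, the column names, the admin path prefixes, and the 'authenticate' span name are all assumptions; countIf is a real ClickHouse aggregate combinator. Note that this simplified form tests for the absence of an authentication span anywhere in the trace and does not verify ordering within it.

```python
# Hypothetical schema: spans(trace_id, timestamp, http_route, span_name).
ADMIN_WITHOUT_AUTH_SQL = """
SELECT
    trace_id,
    min(timestamp) AS first_seen,
    countIf(http_route LIKE '/admin%' OR http_route LIKE '/api/admin%') AS admin_spans,
    countIf(span_name = 'authenticate') AS auth_spans
FROM spans
WHERE timestamp >= now() - INTERVAL 7 DAY
GROUP BY trace_id
HAVING admin_spans > 0 AND auth_spans = 0
ORDER BY first_seen DESC
""".strip()
```

Any trace this returns touched an admin route without recording an authentication step, which makes it a candidate for manual review rather than proof of a bypass, since authentication may also happen in an uninstrumented gateway.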

Question: "Show me all database queries that took longer than 5 seconds in production this week"

Long-running queries can indicate both performance issues and SQL injection (particularly time-based blind injection). This query surfaces candidates for security review with full context.

The Query Library: Building Institutional Knowledge

Individual queries solve immediate problems. The query library turns those solutions into persistent, shared institutional knowledge.

Save and Categorize

Every query you generate—whether from natural language or written directly in SQL—can be saved to the query library with:

Name — a descriptive title ("Credential Stuffing Detection - By Source IP")
Description — context about what the query detects and when to use it
Category — organizational grouping (Threat Hunting, Incident Response, Compliance, Performance)
Tags — flexible labels for cross-cutting concerns (brute-force, authentication, data-exfiltration)
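As a rough mental model, a library entry is just the SQL plus these four pieces of metadata. The sketch below uses hypothetical SavedQuery and QueryLibrary classes to show how tags enable lookup across categories. It does not reflect SecureNow's storage format.

```python
from dataclasses import dataclass, field


@dataclass
class SavedQuery:
    name: str
    description: str
    category: str                       # e.g. "Threat Hunting"
    sql: str
    tags: list[str] = field(default_factory=list)


class QueryLibrary:
    """In-memory stand-in for a shared query store."""

    def __init__(self) -> None:
        self._queries: dict[str, SavedQuery] = {}

    def save(self, query: SavedQuery) -> None:
        self._queries[query.name] = query

    def find_by_tag(self, tag: str) -> list[SavedQuery]:
        return [q for q in self._queries.values() if tag in q.tags]


library = QueryLibrary()
library.save(SavedQuery(
    name="Credential Stuffing Detection - By Source IP",
    description="Failed logins per source IP over a recent window.",
    category="Threat Hunting",
    sql="SELECT 1",                     # placeholder SQL for the example
    tags=["brute-force", "authentication"],
))
matches = library.find_by_tag("brute-force")
```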

Share and Re-Execute

Saved queries are available to your entire team. When a junior analyst encounters a situation that a senior investigator has already written a query for, they can find it in the library, understand its purpose from the description, and execute it with a single click. This dramatically reduces the skill barrier for effective forensic investigation.

Convert to Alert Rules

This is where forensics becomes proactive defense. When you develop a query that reliably detects a specific attack pattern, you can convert it directly into a SecureNow alert rule. The query becomes a scheduled monitor running against your ClickHouse data on a cron schedule, with configurable throttling and notification channels (Email, Slack, In-app).

The forensic query "show me all IPs with more than 500 failed login attempts in the last hour" becomes a persistent alert that fires automatically when the threshold is breached. Your investigation effort compounds into continuous monitoring.
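Conceptually, an alert rule wraps the saved SQL with a schedule, a firing condition, and a throttle. The sketch below illustrates that idea with a hypothetical AlertRule class and should_notify helper. The field names, defaults, and the spans schema in the SQL are assumptions rather than SecureNow's configuration format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AlertRule:
    name: str
    sql: str
    cron: str                            # five-field cron expression
    threshold_rows: int = 1              # fire when the query returns this many rows
    throttle_minutes: int = 30           # minimum gap between notifications
    channels: list[str] = field(default_factory=lambda: ["in-app"])


def should_notify(rule: AlertRule, row_count: int,
                  minutes_since_last_alert: Optional[float]) -> bool:
    """Fire only when the threshold is met and the throttle has elapsed."""
    if row_count < rule.threshold_rows:
        return False
    if (minutes_since_last_alert is not None
            and minutes_since_last_alert < rule.throttle_minutes):
        return False
    return True


rule = AlertRule(
    name="Failed logins over 500 per hour",
    sql=(
        "SELECT client_ip, count() AS failed FROM spans "
        "WHERE http_route = '/api/auth/login' AND status_code = 401 "
        "AND timestamp >= now() - INTERVAL 1 HOUR "
        "GROUP BY client_ip HAVING failed > 500"
    ),
    cron="*/5 * * * *",
    channels=["slack", "email"],
)
```

Because the threshold lives in the HAVING clause, any returned row is already a breach of it, which is why the rule fires on a row count of one or more.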

Direct SQL Mode for Power Users

Natural language is powerful for rapid investigation, but experienced analysts sometimes need precise control over query construction. SecureNow provides a direct SQL editor alongside the natural language interface.

The SQL editor includes:

Schema browser — navigate databases, tables, and columns without leaving the query interface
Syntax highlighting — ClickHouse SQL syntax with keyword, function, and column recognition
Query validation — real-time safety checking that prevents non-SELECT operations
Execution controls — run, cancel, and time-limit options for managing expensive queries
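Time limits of this kind typically map onto per-query ClickHouse settings. The sketch below shows one plausible set: max_execution_time, max_result_rows, and result_overflow_mode are real ClickHouse settings, while the execution_limits helper and its default values are assumptions chosen for illustration.

```python
def execution_limits(time_limit_seconds: int = 30, max_rows: int = 10_000) -> dict:
    """Per-query ClickHouse settings that bound an expensive query."""
    return {
        "max_execution_time": time_limit_seconds,  # seconds before the query is stopped
        "max_result_rows": max_rows,               # cap on rows returned
        "result_overflow_mode": "break",           # truncate instead of raising an error
    }


limits = execution_limits(time_limit_seconds=15)
```

Most ClickHouse clients accept a settings mapping like this alongside the query text, so the same limits can apply to both generated and hand-written SQL.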

Both modes share the same query library. A query generated from natural language can be refined in the SQL editor, and a hand-written SQL query can be saved and described in natural language for team sharing.

A Real Forensic Investigation Walkthrough

Let us trace through a complete investigation to illustrate how NL-to-SQL forensics works in practice.

The Alert

Your alert system fires: "More than 200 failed authentication attempts from a single IP in 15 minutes." The notification identifies IP 198.51.100.23 targeting your user-service application.

Step 1: Scope the Attack

You open SecureNow's forensics page, select the user-service application, and type: "Show me all requests from 198.51.100.23 in the last 24 hours, grouped by endpoint and status code."

The results reveal 3,247 requests to /api/auth/login (98% returning 401), 12 requests to /api/auth/reset-password, and 3 requests to /api/users/me returning 200—meaning the attacker eventually succeeded.

Step 2: Identify the Breach Window

"When did 198.51.100.23 first get a 200 response from any endpoint?"

The query returns a timestamp: 04:23:17 UTC. The brute force started at 03:47:00 and succeeded 36 minutes later. You now have your breach window.
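The window arithmetic is worth checking explicitly, since it ends up in the incident report. A quick calculation with the two timestamps from this walkthrough:

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S"

attack_start = datetime.strptime("03:47:00", TIME_FORMAT)
first_success = datetime.strptime("04:23:17", TIME_FORMAT)

# Time between the first brute-force request and the first 200 response.
breach_window = first_success - attack_start

minutes, seconds = divmod(int(breach_window.total_seconds()), 60)
summary = f"{minutes} min {seconds} s"
```

The result is 36 minutes and 17 seconds, which is where the 36-minute figure comes from.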

Step 3: Determine What Was Accessed

"Show me all traces from 198.51.100.23 after 04:23:00 with full span details."

The results show the attacker accessed /api/users/me (profile data), /api/users/export (data export), and /api/admin/users (attempted privilege escalation, returned 403). You now know the blast radius.

Step 4: Check for Lateral Movement

"Are there any other IPs that accessed the same user account after 04:23:00?"

The query correlates by user ID extracted from authentication spans and reveals two additional IPs from different geographies accessing the compromised account—indicating credential sharing or proxy rotation.
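A correlation like this usually takes the form of a subquery that resolves the affected user IDs, followed by a search for other source IPs using them. The sketch below assumes a spans table with user_id, client_ip, and timestamp columns. The {name:Type} placeholders are ClickHouse query parameters to be bound by the client.

```python
# Hypothetical schema: spans(user_id, client_ip, timestamp).
OTHER_IPS_FOR_USER_SQL = """
SELECT
    client_ip,
    count() AS requests,
    min(timestamp) AS first_seen
FROM spans
WHERE user_id IN (
        SELECT DISTINCT user_id
        FROM spans
        WHERE client_ip = {attacker_ip:String}
          AND timestamp >= {since:DateTime}
          AND user_id != ''
    )
  AND client_ip != {attacker_ip:String}
  AND timestamp >= {since:DateTime}
GROUP BY client_ip
ORDER BY first_seen
""".strip()

# Values from this walkthrough; the date part of 'since' depends on the incident.
params = {"attacker_ip": "198.51.100.23"}
```

Excluding empty user IDs in the subquery matters: unauthenticated spans would otherwise link the attacker to every anonymous visitor.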

Step 5: Save and Automate

You save the credential-stuffing detection query to the library, convert it to an alert rule with a lower threshold (50 failed attempts in 10 minutes), and configure Slack notifications to your incident response channel.

The entire investigation took eight minutes. Without natural language forensics, writing and debugging a single one of these queries by hand could have consumed that time, given the 10–15 minutes a typical forensic query takes.

For deeper analysis of the specific attack patterns found in those traces, you can pivot directly into AI-powered trace analysis to examine the span-level details of each suspicious request.

From Reactive Investigation to Proactive Hunting

The traditional security forensics model is reactive: something bad happens, and you investigate after the fact. NL-to-SQL forensics enables a fundamentally different approach—proactive threat hunting.

With the barrier to querying eliminated, security teams can ask exploratory questions routinely:

"Which endpoints received the most requests from IPs not seen before this week?"
"Show me all database queries containing subqueries from external-facing services"
"Are there any services making outbound HTTP calls to IP addresses instead of hostnames?"

These are not queries you would write during an incident. They are the kind of open-ended exploration that catches threats before they become incidents—the proactive hunting that the MITRE ATT&CK framework and NIST Cybersecurity Framework both recommend but that most teams cannot sustain because the query cost is too high.

When combined with ClickHouse-optimized security analytics, natural language forensics transforms your trace data from a passive archive into an active defense layer. The data was always there. Now you can actually use it—at the speed of thought, not the speed of SQL.

The Compounding Value of Accessible Forensics

Every investigation that results in a saved query makes the next investigation faster. Every saved query that becomes an alert rule makes the next attack detectable automatically. Every analyst who can now run forensic queries without SQL expertise multiplies your team's investigative capacity.

This is the compounding effect that changes security operations from a staffing problem into a knowledge management advantage. The forensic capability of your team is no longer bottlenecked by the one person who knows the ClickHouse schema—it scales with every question asked and every query saved.

The traces are in your database. The questions are in your head. The only thing that was missing was the bridge between them.

Frequently Asked Questions

How does natural language to SQL conversion work?

SecureNow uses AI to convert plain English questions into optimized ClickHouse SQL queries. The system understands your schema context (tables, columns, indexes) to generate accurate, safe queries.

Is there a risk of SQL injection through natural language queries?

The risk is tightly constrained. SecureNow restricts all forensic queries to SELECT statements, validates the generated SQL before execution, and runs it in a read-only context, so a malicious or malformed question cannot modify or delete data.

Can I save and share forensic queries?

Yes, the query library lets you save queries with names, descriptions, categories, and tags. Saved queries can be re-executed, shared across your team, and even converted to alert rules.

What data can I query with forensics?

SecureNow forensics queries run against the ClickHouse databases that hold OpenTelemetry trace data (signoz_traces), logs (signoz_logs), and metrics (signoz_metrics) from your instrumented applications.