63 changes: 63 additions & 0 deletions detections/endpoint/llm_model_file_creation.yml
@@ -0,0 +1,63 @@
name: LLM Model File Creation
id: 23e5b797-378d-45d6-ab3e-d034ca12a99b
version: 1
date: '2025-11-12'
author: Rod Soto
status: production
type: Hunting
description: |
Detects the creation of Large Language Model (LLM) model files on Windows endpoints by monitoring file creation events for formats and extensions commonly used by local AI frameworks.
This detection surfaces potential shadow AI deployments, unauthorized model downloads, and rogue LLM infrastructure through file creation patterns associated with quantized model formats (.gguf, .ggml), the safetensors format, and Ollama Modelfiles.
These file types are characteristic of local inference frameworks such as Ollama, llama.cpp, GPT4All, LM Studio, and similar tools that enable running LLMs locally without cloud dependencies.
Organizations can use this detection to identify potential data exfiltration risks, policy violations related to unapproved AI usage, and security blind spots created by decentralized AI deployments that bypass enterprise governance and monitoring.
data_source:
- Sysmon EventID 11
search: |
| tstats `security_content_summariesonly` count
min(_time) as firstTime
max(_time) as lastTime
from datamodel=Endpoint.Filesystem
where Filesystem.file_name IN (
"*.gguf*",
"*ggml*",
"*Modelfile*",
"*safetensors*"
)
by Filesystem.action Filesystem.dest Filesystem.file_access_time Filesystem.file_create_time
Filesystem.file_hash Filesystem.file_modify_time Filesystem.file_name Filesystem.file_path
Filesystem.file_acl Filesystem.file_size Filesystem.process_guid Filesystem.process_id
Filesystem.user Filesystem.vendor_product
| `drop_dm_object_name(Filesystem)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `llm_model_file_creation_filter`
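# The trailing filter macro is where environment-specific exclusions belong. A
# minimal macros.conf sketch, assuming a hypothetical lookup approved_ml_hosts
# that lists sanctioned ML workstations by dest:
#
#   [llm_model_file_creation_filter]
#   definition = search NOT [| inputlookup approved_ml_hosts | fields dest]
#   iseval = 0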
how_to_implement: |
To successfully implement this search, you need to be ingesting logs with file creation events from your endpoints.
Ensure that the Endpoint data model is properly populated with filesystem events from EDR agents or Sysmon Event ID 11.
The logs must be processed using the appropriate Splunk Technology Add-ons that are specific to the EDR product.
The logs must also be mapped to the `Filesystem` node of the `Endpoint` data model.
Use the Splunk Common Information Model (CIM) to normalize the field names and speed up the data modeling process.
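# A quick population check before enabling the search (a sketch): confirm that
# filesystem events are reaching the data model and see which products feed it:
#
#   | tstats count from datamodel=Endpoint.Filesystem by Filesystem.vendor_product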
known_false_positives: |
Legitimate creation of LLM model files by authorized developers, ML engineers, and researchers during model training, fine-tuning, or experimentation. Approved AI/ML sandboxes and lab environments where model file creation is expected. Automated ML pipelines and workflows that generate or update model files as part of their normal operation. Third-party applications and services that manage or cache LLM model files for legitimate purposes.
references:
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
- https://www.ibm.com/think/topics/shadow-ai
- https://www.splunk.com/en_us/blog/artificial-intelligence/splunk-technology-add-on-for-ollama.html
- https://blogs.cisco.com/security/detecting-exposed-llm-servers-shodan-case-study-on-ollama
tags:
analytic_story:
- Suspicious Local LLM Frameworks
asset_type: Endpoint
mitre_attack_id:
- T1543
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
security_domain: endpoint
tests:
- name: True Positive Test
attack_data:
- data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/suspicious_behaviour/local_llms/sysmon_local_llms.log
source: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
sourcetype: XmlWinEventLog
74 changes: 74 additions & 0 deletions detections/endpoint/local_llm_framework_dns_query.yml
@@ -0,0 +1,74 @@
name: Local LLM Framework DNS Query
id: d7ceffc5-a45e-412b-b9fa-2ba27c284503
version: 1
date: '2025-11-12'
author: Rod Soto
status: production
type: Hunting
description: |
Detects DNS queries from Windows endpoints to domains associated with LLM model repositories, local AI frameworks, and hosted AI services by monitoring Sysmon DNS query events (Event ID 22).
Local LLM frameworks like Ollama, LM Studio, and GPT4All make DNS calls to repositories such as huggingface.co and ollama.ai for model downloads, updates, and telemetry, and related tooling may also reach hosted providers such as OpenAI, Anthropic, and OpenRouter.
These queries can reveal unauthorized AI tool usage or data exfiltration risks on corporate networks.
data_source:
- Sysmon EventID 22
search: |
`sysmon`
EventCode=22
QueryName IN (
"*huggingface*",
"*ollama*",
"*jan.ai*",
"*gpt4all*",
"*nomic*",
"*koboldai*",
"*lmstudio*",
"*modelscope*",
"*civitai*",
"*oobabooga*",
"*replicate*",
"*anthropic*",
"*openai*",
"*openrouter*",
"*api.openrouter*",
"*aliyun*",
"*alibabacloud*",
"*dashscope.aliyuncs*"
)
NOT Image IN (
"*\\MsMpEng.exe",
"C:\\ProgramData\\*",
"C:\\Windows\\System32\\*",
"C:\\Windows\\SysWOW64\\*"
)
| stats count
min(_time) as firstTime
max(_time) as lastTime
by src Image process_name QueryName query_count answer answer_count reply_code_id vendor_product
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `local_llm_framework_dns_query_filter`
how_to_implement: |
Ensure Sysmon is deployed across Windows endpoints and configured to capture DNS query events (Event ID 22), which record the querying process image path alongside the queried domain and its answers.
Ingest Sysmon event logs into Splunk via the Splunk Universal Forwarder or the Windows Event Log input, ensuring they carry `source=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational` with the `XmlWinEventLog` sourcetype.
Verify the `sysmon` macro exists in your Splunk environment and correctly references the Sysmon event logs.
Create or update the `local_llm_framework_dns_query_filter` macro to exclude approved systems, authorized developers, sanctioned ML/AI workstations, or known development/lab environments as needed.
Deploy this hunting search to your Splunk Enterprise Security or Splunk Enterprise instance and schedule it on a regular cadence to surface unauthorized LLM model DNS queries and shadow AI activity.
Correlate findings with endpoint asset inventory and user identity data to prioritize investigation.
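# A macros.conf sketch for the companion filter macro (the lookup name
# approved_ai_endpoints is hypothetical and stands in for your allowlist of
# sanctioned ML/AI hosts, keyed by src):
#
#   [local_llm_framework_dns_query_filter]
#   definition = search NOT [| inputlookup approved_ai_endpoints | fields src]
#   iseval = 0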
known_false_positives: |
Legitimate DNS queries to LLM model hosting platforms by authorized developers, ML engineers, and researchers during model training, fine-tuning, or experimentation. Approved AI/ML sandboxes and lab environments where LLM model downloads are expected. Automated ML pipelines and workflows that interact with LLM model hosting services as part of their normal operation. Third-party applications and services that access LLM model platforms for legitimate purposes.
references:
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
- https://www.splunk.com/en_us/blog/artificial-intelligence/splunk-technology-add-on-for-ollama.html
- https://blogs.cisco.com/security/detecting-exposed-llm-servers-shodan-case-study-on-ollama
tags:
analytic_story:
- Suspicious Local LLM Frameworks
asset_type: Endpoint
mitre_attack_id:
- T1590
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
security_domain: endpoint
tests:
- name: True Positive Test
attack_data:
- data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/suspicious_behaviour/local_llms/sysmon_dns.log
source: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
sourcetype: XmlWinEventLog
150 changes: 150 additions & 0 deletions detections/endpoint/windows_local_llm_framework_execution.yml
@@ -0,0 +1,150 @@
name: Windows Local LLM Framework Execution
id: a3f8e2c9-7d4b-4e1f-9c6a-2b5d8f3e1a7c
version: 1
date: '2025-11-20'
author: Rod Soto, Splunk
status: production
type: Hunting
description: |
The following analytic detects execution of unauthorized local LLM frameworks (Ollama, LM Studio, GPT4All, Jan, llama.cpp, KoboldCPP, Oobabooga, NutStudio) and Python-based AI/ML libraries (HuggingFace Transformers, LangChain) on Windows endpoints by leveraging process creation events.
It identifies cases where known LLM framework executables are launched or command-line arguments reference AI/ML libraries.
This activity is significant as it may indicate shadow AI deployments, unauthorized model inference operations, or potential data exfiltration through local AI systems.
If confirmed malicious, this could lead to unauthorized access to sensitive data, intellectual property theft, or circumvention of organizational AI governance policies.
data_source:
- Sysmon EventID 1
- Windows Event Log Security 4688
- CrowdStrike ProcessRollup2
search: |
| tstats `security_content_summariesonly` count
min(_time) as firstTime
max(_time) as lastTime
from datamodel=Endpoint.Processes
where
(
Processes.process_name IN (
"gpt4all.exe",
"jan.exe",
"kobold.exe",
"koboldcpp.exe",
"llama-run.exe",
"llama.cpp.exe",
"lmstudio.exe",
"nutstudio.exe",
"ollama.exe",
"oobabooga.exe",
"text-generation-webui.exe"
)
OR
Processes.original_file_name IN (
"ollama.exe",
"lmstudio.exe",
"gpt4all.exe",
"jan.exe",
"llama-run.exe",
"koboldcpp.exe",
"nutstudio.exe"
)
OR
Processes.process IN (
"*\\gpt4all\\*",
"*\\jan\\*",
"*\\koboldcpp\\*",
"*\\llama.cpp\\*",
"*\\lmstudio\\*",
"*\\nutstudio\\*",
"*\\ollama\\*",
"*\\oobabooga\\*",
"*huggingface*",
"*langchain*",
"*llama-run*",
"*transformers*"
)
OR
Processes.parent_process_name IN (
"gpt4all.exe",
"jan.exe",
"kobold.exe",
"koboldcpp.exe",
"llama-run.exe",
"llama.cpp.exe",
"lmstudio.exe",
"nutstudio.exe",
"ollama.exe",
"oobabooga.exe",
"text-generation-webui.exe"
)
)
by Processes.action Processes.dest Processes.original_file_name Processes.parent_process
Processes.parent_process_exec Processes.parent_process_guid Processes.parent_process_id
Processes.parent_process_name Processes.parent_process_path Processes.process
Processes.process_exec Processes.process_guid Processes.process_hash Processes.process_id
Processes.process_integrity_level Processes.process_name Processes.process_path Processes.user
Processes.user_id Processes.vendor_product
| `drop_dm_object_name(Processes)`
| eval Framework=case(
match(process_name, "(?i)ollama") OR match(process, "(?i)ollama"), "Ollama",
match(process_name, "(?i)lmstudio") OR match(process, "(?i)lmstudio") OR match(process, "(?i)lm-studio"), "LM Studio",
match(process_name, "(?i)gpt4all") OR match(process, "(?i)gpt4all"), "GPT4All",
match(process_name, "(?i)kobold") OR match(process, "(?i)kobold"), "KoboldCPP",
match(process_name, "(?i)jan") OR match(process, "(?i)jan"), "Jan AI",
match(process_name, "(?i)nutstudio") OR match(process, "(?i)nutstudio"), "NutStudio",
match(process_name, "(?i)llama") OR match(process, "(?i)llama"), "llama.cpp",
match(process_name, "(?i)oobabooga") OR match(process, "(?i)oobabooga") OR match(process, "(?i)text-generation-webui"), "Oobabooga",
match(process, "(?i)transformers") OR match(process, "(?i)huggingface"), "HuggingFace/Transformers",
match(process, "(?i)langchain"), "LangChain",
1=1, "Other"
)
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| table action dest Framework original_file_name parent_process parent_process_exec
parent_process_guid parent_process_id parent_process_name parent_process_path
process process_exec process_guid process_hash process_id process_integrity_level
process_name process_path user user_id vendor_product
| `windows_local_llm_framework_execution_filter`
how_to_implement: |
The detection is based on data that originates from Endpoint Detection
and Response (EDR) agents. These agents are designed to provide security-related
telemetry from the endpoints where the agent is installed. To implement this search,
you must ingest logs that contain the process GUID, process name, and parent process.
Additionally, you must ingest complete command-line executions. These logs must
be processed using the appropriate Splunk Technology Add-ons that are specific to
the EDR product. The logs must also be mapped to the `Processes` node of the `Endpoint`
data model. Use the Splunk Common Information Model (CIM) to normalize the field
names and speed up the data modeling process.
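# A sketch to confirm EDR telemetry is reaching the Processes node with the
# fields this search keys on (process_name per vendor product):
#
#   | tstats count from datamodel=Endpoint.Processes
#       by Processes.vendor_product Processes.process_name
#   | sort - count | head 20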
known_false_positives: Legitimate development, data science, and AI/ML workflows where
authorized developers, researchers, or engineers intentionally execute local LLM
frameworks (Ollama, LM Studio, GPT4All, Jan, NutStudio) for model experimentation,
fine-tuning, or prototyping. Python developers using HuggingFace Transformers or
LangChain for legitimate AI/ML projects. Approved sandbox and lab environments where
framework testing is authorized. Open-source contributors and hobbyists running
frameworks for educational purposes. Third-party applications that bundle or invoke
LLM frameworks as dependencies (e.g., IDE plugins, analytics tools, chatbot integrations).
System administrators deploying frameworks as part of containerized services or
orchestrated ML workloads. Process name keyword overlap with unrelated utilities
(e.g., "llama-backup", "janimation"). Recommended tuning — baseline approved frameworks
and users by role/department, exclude sanctioned dev/lab systems via the filter
macro, correlate with user identity and peer group anomalies before escalating to
incident response.
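# One way to implement the tuning guidance above (a sketch; the lookup
# approved_llm_users, carrying user and approved fields, is hypothetical):
#
#   [windows_local_llm_framework_execution_filter]
#   definition = lookup approved_llm_users user OUTPUT approved | where isnull(approved)
#   iseval = 0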
references:
- https://splunkbase.splunk.com/app/8024
- https://www.ibm.com/think/topics/shadow-ai
- https://www.splunk.com/en_us/blog/artificial-intelligence/splunk-technology-add-on-for-ollama.html
- https://blogs.cisco.com/security/detecting-exposed-llm-servers-shodan-case-study-on-ollama
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
tags:
analytic_story:
- Suspicious Local LLM Frameworks
asset_type: Endpoint
mitre_attack_id:
- T1543
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
security_domain: endpoint
tests:
- name: True Positive Test - Sysmon
attack_data:
- data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/suspicious_behaviour/local_llms/sysmon_local_llms.log
source: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
sourcetype: XmlWinEventLog
29 changes: 29 additions & 0 deletions stories/suspicious_local_llm_frameworks.yml
@@ -0,0 +1,29 @@
name: Suspicious Local LLM Frameworks
id: 0b4396a1-aeff-412e-b39e-4e26457c780d
version: 1
date: '2025-11-12'
author: Rod Soto, Splunk
status: production
description: |
Leverage Splunk searches to detect and investigate potentially unauthorized local LLM frameworks and related shadow AI artifacts deployed within enterprise environments.
narrative: |
This analytic story addresses the growing security challenge of Shadow AI: the deployment and use of unauthorized Large Language Model (LLM) frameworks and AI tools within enterprise environments without proper governance, oversight, or security controls.

Shadow AI deployments pose significant risks including data exfiltration through local model inference (where sensitive corporate data is processed by unmonitored AI systems), intellectual property leakage, policy violations, and creation of security blind spots that bypass enterprise data loss prevention and monitoring solutions.

Local LLM frameworks such as Ollama, LM Studio, GPT4All, Jan, llama.cpp, and KoboldCPP enable users to download and run powerful language models entirely on their endpoints, processing sensitive information without cloud-based safeguards or enterprise visibility. These detections monitor process execution patterns, file creation activities (model files with .gguf, .ggml, safetensors extensions), DNS queries to model repositories, and network connections to identify unauthorized AI infrastructure.

By correlating Windows Security Event Logs (Event ID 4688), Sysmon telemetry (Events 1, 11, 22), and behavioral indicators, security teams can detect shadow AI deployments early, investigate the scope of unauthorized model usage, assess data exposure risks, and enforce AI governance policies to prevent covert model manipulation, persistent endpoint compromise, and uncontrolled AI experimentation that bypasses established security frameworks.
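# A correlation sketch in the spirit of the narrative above: hosts where a known
# framework executed and a model file was also created. A starting point only,
# not shipped content; field and macro names follow the detections in this story.
#
#   | tstats count as process_events from datamodel=Endpoint.Processes
#       where Processes.process_name IN ("ollama.exe","lmstudio.exe","gpt4all.exe")
#       by Processes.dest
#   | `drop_dm_object_name(Processes)`
#   | join type=inner dest
#       [| tstats count as file_events from datamodel=Endpoint.Filesystem
#           where Filesystem.file_name IN ("*.gguf*","*ggml*","*safetensors*")
#           by Filesystem.dest
#        | `drop_dm_object_name(Filesystem)`]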
references:
- https://splunkbase.splunk.com/app/8024
- https://www.ibm.com/think/topics/shadow-ai
- https://www.splunk.com/en_us/blog/artificial-intelligence/splunk-technology-add-on-for-ollama.html
- https://blogs.cisco.com/security/detecting-exposed-llm-servers-shodan-case-study-on-ollama
tags:
category:
- Adversary Tactics
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
usecase: Advanced Threat Detection