Adds `optional-skills/security/web-pentest/`, an authorized web app penetration testing skill adapted from Shannon's methodology (concepts only; AGPL-clean fresh implementation). Phased: recon (read-only) → vuln analysis (`delegate_task` per OWASP class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (`templates/authorization.md`)
- Scope allowlist (`scope.txt`) consulted by `recon-scan.sh` and documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history; payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is cheaper and matches the existing `optional-skills/security/` pattern).
Vulnerability Taxonomy
Two classification systems are used during analysis. Both come from Shannon (concepts only; rewritten here), and both exist to make the question "is this exploitable?" mechanical instead of vibes-based.
Injection: Slot Types
Every injection sink has a slot type — the lexical position the attacker payload lands in. Each slot type has a small set of required defenses. A mismatch is a vulnerability. The same defense applied to the wrong slot is also a vulnerability.
| Slot | Example | Required defense |
|---|---|---|
| SQL-val | `SELECT * FROM u WHERE id = :v` | Parameterized binding |
| SQL-ident | `SELECT * FROM ${table}` | Allowlist on identifier values |
| SQL-keyword | `ORDER BY ${col} ${dir}` | Allowlist on column AND direction |
| CMD-argument | `subprocess.run(["ls", v])` | argv list (never `shell=True`) |
| CMD-shell | `os.system("ls " + v)` | DON'T: refactor to argv list |
| PATH-segment | `open("/data/" + v)` | Normalize + allowlist + base-relative check |
| URL-host | redirect to `https://${v}/x` | Allowlist of acceptable hosts |
| URL-fetch | `requests.get(v)` | Allowlist + block private/metadata IPs (SSRF) |
| TEMPLATE-string | `Template("Hello {{ v }}")` | Autoescape ON, no user-controlled template syntax |
| DESERIALIZE-pickle | `pickle.loads(v)` | DON'T: use JSON / msgpack |
| DESERIALIZE-yaml | `yaml.load(v)` | `yaml.safe_load`, never `yaml.load` |
| XPATH-expr | `tree.xpath("//u[@id='" + v + "']")` | Parameterized XPath or escape |
| LDAP-filter | `(uid=${v})` | LDAP filter escaping |
| REGEX-pattern | `re.search(v, text)` | Don't take pattern from user (ReDoS too) |
| LOG-record | `log.info("got " + v)` | Encode CR/LF/control chars before logging |
| EMAIL-header | `Subject: ${v}` | Reject CR/LF |
| HTTP-header | `Set-Cookie: ${v}` | Reject CR/LF (response splitting) |
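
As a rough illustration of three rows from the slot table (SQL-keyword, PATH-segment, LOG-record), the sketch below shows what a matching defense could look like in Python. The function names and the allowlist contents are hypothetical and chosen only for this example; a real application would define its own.

```python
import os

# Hypothetical allowlists for illustration only.
ALLOWED_SORT_COLUMNS = {"id", "name", "created_at"}
ALLOWED_SORT_DIRECTIONS = {"ASC", "DESC"}


def build_order_by(col: str, direction: str) -> str:
    """SQL-keyword slot: allowlist the column AND the direction."""
    direction = direction.upper()
    if col not in ALLOWED_SORT_COLUMNS or direction not in ALLOWED_SORT_DIRECTIONS:
        raise ValueError("sort column or direction not in allowlist")
    return f"ORDER BY {col} {direction}"


def resolve_under_base(base_dir: str, user_segment: str) -> str:
    """PATH-segment slot: normalize, then verify the result stays under base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_segment))
    # commonpath equals base only when candidate is base itself or below it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory")
    return candidate


def sanitize_log_value(value: str) -> str:
    """LOG-record slot: encode CR/LF and other non-printable characters."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in value
    )
```

Each helper rejects or encodes rather than trying to "clean" the input, which keeps the check easy to compare against the required-defense column.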
When you classify a finding:
- Identify the slot type
- Identify the actual defense in the code (if you have source)
- If the defense doesn't match the required-defense set: vulnerable
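
The classification steps above can be sketched as a lookup plus a set comparison. This is a minimal sketch, not the skill's implementation: the `Sink` type, the `classify` function, and the short defense labels are hypothetical names invented for the example, and only a few slots from the table are included.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

# Subset of the slot table. Defense labels are hypothetical short names.
# An empty set marks a "DON'T" slot: no defense is considered acceptable.
REQUIRED_DEFENSES: Dict[str, FrozenSet[str]] = {
    "SQL-val": frozenset({"parameterized_binding"}),
    "SQL-ident": frozenset({"identifier_allowlist"}),
    "SQL-keyword": frozenset({"column_allowlist", "direction_allowlist"}),
    "CMD-argument": frozenset({"argv_list"}),
    "CMD-shell": frozenset(),
    "DESERIALIZE-pickle": frozenset(),
}


@dataclass(frozen=True)
class Sink:
    slot: str                           # step 1: the slot type
    observed_defenses: FrozenSet[str]   # step 2: defenses found in the code


def classify(sink: Sink) -> str:
    """Step 3: compare observed defenses with the required set for the slot."""
    if sink.slot not in REQUIRED_DEFENSES:
        return "unknown-slot"
    required = REQUIRED_DEFENSES[sink.slot]
    if not required:
        # "DON'T" slots have no safe defense; the code needs refactoring.
        return "vulnerable"
    return "defended" if required <= sink.observed_defenses else "vulnerable"
```

For example, `classify(Sink("SQL-ident", frozenset({"parameterized_binding"})))` returns `"vulnerable"`: the defense is a good one, but it belongs to a different slot.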
XSS: Render Contexts
XSS exploitability depends on where in the HTML/JS the value lands. Encoding for one context doesn't protect another.
| Context | Example | Required encoding |
|---|---|---|
| HTML_BODY | `<div>{{ v }}</div>` | HTML entity encode `<>&"'` |
| HTML_ATTR_QUOTED | `<a href="{{ v }}">` | HTML attr encode |
| HTML_ATTR_UNQUOTED | `<a href={{ v }}>` | Almost impossible to safely encode; quote the attr |
| URL_ATTR (href/src) | `<a href="{{ v }}">` | Validate scheme allowlist + attr encode |
| JAVASCRIPT_STRING | `<script>var x = "{{ v }}";</script>` | JS string escape + ensure quote consistency |
| JAVASCRIPT_BLOCK | `<script>{{ v }}</script>` | DON'T: refactor; no safe encoding |
| CSS_VALUE | `<style>color: {{ v }};</style>` | CSS encode + allowlist scheme/format |
| CSS_BLOCK | `<style>{{ v }}</style>` | DON'T: refactor |
| JSON_RESPONSE (consumed by JS) | `JSON.parse(response)` | JSON encode + correct content-type header |
| EVENT_HANDLER | `<div onclick="{{ v }}">` | JS string escape inside HTML attr encode |
| URL_PATH (router-driven) | route param echoed unencoded | URL-encode + HTML-encode |
| DOM_INNERHTML | `el.innerHTML = v` (DOM XSS) | Use `textContent` instead, or DOMPurify |
| DOM_DOC_WRITE | `document.write(v)` | DON'T: refactor |
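
For the URL_ATTR row, a scheme allowlist followed by attribute encoding might look like the sketch below. The function name `safe_href` and the allowlist contents are assumptions made for the example. Relative URLs are rejected here purely to keep the sketch small, and control characters are rejected up front because browsers strip tabs and newlines from URLs, which would otherwise let `java<TAB>script:` slip past a scheme check.

```python
import html
from urllib.parse import urlsplit

# Hypothetical allowlist; adjust to what the application actually links to.
ALLOWED_SCHEMES = {"http", "https"}


def safe_href(value: str) -> str:
    """Return value encoded for a quoted href attribute, or raise ValueError."""
    value = value.strip()
    # Reject ASCII control characters anywhere in the URL.
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value):
        raise ValueError("control character in URL")
    scheme = urlsplit(value).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("URL scheme not in allowlist")
    # Attribute-encode only after the scheme check has passed.
    return html.escape(value, quote=True)
```

Both steps are needed: the scheme check stops `javascript:` URLs, which attribute encoding alone would pass through unchanged, and the encoding stops a quote in the URL from closing the attribute.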
When you classify:
- Identify the render context where user input lands
- Identify the encoding applied
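- Mismatch = vulnerable. Even "HTML encoded" output in `JAVASCRIPT_STRING` can be exploitable: entity encoding leaves backslashes untouched, and JS-escaped output that leaves `<` unescaped allows a `</script><script>` breakout.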
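
To make the context mismatch concrete, the sketch below renders the same JAVASCRIPT_STRING sink three ways using only the Python standard library. The function names are hypothetical. The third variant is one possible approach under the assumption that the value is emitted inside an inline `<script>` element; it is not presented as a complete XSS defense.

```python
import html
import json


def render_html_encoded(value: str) -> str:
    """Mismatch: HTML entity encoding applied inside a JS string literal."""
    return f'<script>var x = "{html.escape(value)}";</script>'


def render_json_only(value: str) -> str:
    """Mismatch: JS string escaping that ignores the surrounding HTML parser."""
    return f"<script>var x = {json.dumps(value)};</script>"


def render_json_html_aware(value: str) -> str:
    """JS string escaping plus escaping of characters the HTML parser reacts to."""
    encoded = (
        json.dumps(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )
    return f"<script>var x = {encoded};</script>"
```

With a single backslash as input, `render_html_encoded` emits `var x = "\";`, so the backslash escapes the closing quote. With `</script><script>alert(1)</script>` as input, `render_json_only` emits a literal `</script>` that ends the script element early. `render_json_html_aware` leaves no `<` in the emitted string for either input.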
OWASP Top 10 (2021) Mapping
For reporting:
| OWASP | Slot/context covered |
|---|---|
| A01 Broken Access Control | authz class (IDOR, vertical/horizontal) |
| A02 Cryptographic Failures | infra class (weak TLS, plaintext storage) |
| A03 Injection | injection class (all slot types except deserialize) |
| A04 Insecure Design | reported in findings narrative |
| A05 Security Misconfiguration | infra class |
| A06 Vulnerable Components | infra class (whatweb output) |
| A07 Auth Failures | auth class |
| A08 Software/Data Integrity | DESERIALIZE-* slots, also supply chain |
| A09 Logging/Monitoring | infra class (out of scope for active testing) |
| A10 SSRF | ssrf class |
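
For report generation, the mapping above could be expressed as a small lookup. This is a sketch only: the function name `owasp_candidates` is hypothetical, the class names are taken from the table, and because the `infra` class spans several categories the function returns candidates for the reporter to narrow down rather than a single answer.

```python
from typing import Dict, List, Optional

# Finding class -> candidate OWASP Top 10 (2021) categories, per the table.
CLASS_TO_OWASP: Dict[str, List[str]] = {
    "authz": ["A01"],
    "infra": ["A02", "A05", "A06", "A09"],
    "injection": ["A03"],
    "auth": ["A07"],
    "ssrf": ["A10"],
}


def owasp_candidates(finding_class: str, slot: Optional[str] = None) -> List[str]:
    """Return candidate OWASP categories for a finding."""
    # DESERIALIZE-* slots are reported under A08 instead of A03.
    if slot is not None and slot.startswith("DESERIALIZE-"):
        return ["A08"]
    return list(CLASS_TO_OWASP.get(finding_class, []))
```

A04 has no entry because the table assigns it to the findings narrative instead of a finding class.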