Security#
Ovellum is a build-time tool that turns Markdown into HTML. This page documents what we defend against, how, and where we explicitly don't.
Threat model#
Three parties interact with an Ovellum-built site:
| Party | Trust | Notes |
| ---------------------- | ---------------------------------- | -------------------------------------------------------------------------------------------------- |
| Author | trusted | Writes ovellum.config.* and .md content. We trust their source files but defend against typos. |
| Build environment | trusted | ovellum build runs locally or in CI with the author's permissions. |
| Reader of the site | untrusted (from the site's PoV) | Visits the rendered HTML in a browser. We must not let attacker-controlled content reach them. |
The interesting attacker is the third-party contributor: someone
who can submit a PR that adds Markdown to a repo that builds with
Ovellum. A malicious <script> tag, an onclick handler, or a
javascript: href in such a PR must not survive into the rendered HTML
that visitors load after the PR merges.
We do not defend against authors who can already run arbitrary code
on the build host. If you control ovellum.config.ts, you control the
build — the entire config-load path is power equivalent to running the
file as a Node script.
HTML sanitization#
Every Markdown source runs through rehype-sanitize before any output is produced. The pipeline:
remark-parse → remark-rehype (allowDangerousHtml)
→ rehype-raw ← parses literal HTML strings into HAST
→ rehype-sanitize ← strips dangerous elements/attrs
→ rehype-slug
→ rehype-autolink-headings
→ custom (collect headings + shiki highlight)
→ rehype-stringify
What's stripped#
<script>,<iframe>,<object>,<embed>on*event-handler attributes (onclick,onload, etc.)- URL schemes other than the allowlist on
href,src,cite,longDesc. Today the allowlist ishttp,https,irc,ircs,mailto,xmppforhref;http,httpsforsrcandcite.
data: URLs are not allowed anywhere, including <img src> —
because data:image/svg+xml can carry an <svg onload="…"> payload
that browsers execute.
What's kept#
The default schema permits the HTML elements authors actually use:
<details>, <summary>, <kbd>, <sup>, <sub>, <mark>, <abbr>,
tables, etc.
Order matters#
Sanitization runs before shiki highlighting. Shiki emits inline
style="…" attributes that the sanitizer would otherwise strip — by
running shiki second, its generated output stays intact while
author-authored HTML is locked down.
Shell-injection resistance#
Whenever Ovellum invokes an external binary (today: git for the
last-modified timestamp on each page), arguments are passed via
execFile as an array. We never build a command string and hand it to
a shell.
The practical consequence: a path containing $(...), backticks,
;cmd, or any other shell metacharacters is treated as a literal
filename argument to git, never interpreted by sh/zsh. Tests in
@ovellum/site pin this behaviour by writing fixture files named
inject;touch PWNED;x.md and $(touch PWNED).md and asserting the
canary never appears.
URL scheme allowlist in ovellum check#
Beyond render-time stripping, ovellum check flags links whose scheme
is on a deny list (javascript:, vbscript:, data:, file:) as
[SECURITY] issues so the author notices them in source and removes
them. This is defense in depth: the sanitizer keeps the rendered HTML
safe regardless, but flagging them at lint time means they don't ship
silently stripped.
Detection normalises HTML-numeric entities (	 → tab) and
zero-width / BiDi / whitespace characters before matching, so attempts
like javas	cript: or \tjavascript: are caught.
See CLI → ovellum check for the
output format and exit codes.
What's out of scope today#
- Supply-chain pinning.
package.jsonuses caret ranges (the npm default); we don't pin to exact versions. Acceptable for a dev tool but worth revisiting once Ovellum is widely installed. - Symlink traversal in
input/. If an author's content directory contains a symlink to/etc/passwd, Ovellum will read and (depending on the file extension) potentially render its contents. The mitigation is to trust the content directory; we don't try to detect this today. - Prototype pollution from config files. A malicious config with
"__proto__": {…}could in theory pollute prototypes, but theObject.keys-based merge in strict mode doesn't write toObject.prototype. We rely on this behaviour rather than explicitly filtering keys. - Resource exhaustion from giant Markdown files. No size limits on
.mdinputs. A multi-GB Markdown file would slow the build but not compromise the output.
Reporting#
Found a security issue? For anything that could be used to execute code on a reader's browser, email the maintainer directly (contact in the project README). For non-exploitable concerns, a public GitHub issue is fine.