Skip to content

RFC: Deepen URL decision pipeline in @docs.plus/extension-hyperlink #146

@HMarzban

Description

@HMarzban

RFC produced by the improve-codebase-architecture skill. Targets packages/extension-hyperlink/. Part 1 of 4 — see also: UI Controller, Link Interaction Plugin, Command Façade.

Problem

URL handling inside @docs.plus/extension-hyperlink is a deep classification + normalization + validation + gating problem fragmented across five caller paths, each combining the same primitives slightly differently. The result is documented bugs sitting in seams.

Path URL detect Normalize validate arg Policy gate
paste-over-selection (plugins/pasteHandler.ts) linkify find normalizeLinkifyHref validate(link.href) inside setHyperlink
paste rule (hyperlink.ts:addPasteRules) linkify find normalizeLinkifyHref validate(link.value) buildHrefGate
autolink (plugins/autolink.ts) findLinks (linkify + regex + phone) normalizeLinkifyHref validate(link.value) buildHrefGate
input rule []() (hyperlink.ts:addInputRules) n/a normalizeHref none buildHrefGate
click / preview (plugins/clickHandler.ts, popovers/previewHyperlinkPopover.ts) n/a none none isSafeHref + isAllowedUri (skips validateURL)
create / edit forms implicit inside commands validateURL inside commands

Concrete friction:

  1. validate(link.href) vs validate(link.value) — same callback, different inputs across paste paths. The user's validate hook sees inconsistent arguments depending on which surface triggered the write.
  2. Three parallel special-scheme detection mechanismsgetSpecialUrlInfo in utils/specialUrls.ts, the regex in utils/findLinks.ts, and SPECIAL_SCHEME_REGEX in plugins/autolink.ts. Adding a scheme requires editing three places.
  3. Click / preview path skips validateURL — stored hrefs can navigate even when shape rules would reject them. A href written before tightening a validate rule remains clickable forever.
  4. buildHrefGate(this.options) is reconstructed at five call sites in hyperlink.ts per dispatch, plus inside applySetHyperlink, applyUnsetHyperlink, and the input/paste rule bodies.

The utils/__tests__/* directory has 138 unit tests against the leaf helpers (validateURL, normalizeHref, findLinks, phone, specialUrls). They're shallow-module tests — passing them does not prove the five-path pipeline produces consistent behavior.

Proposed Interface

A single URLDecisions factory bound at extension onCreate, exposing two methods that fold every caller's question into one of two boundaries:

export type WriteInput =
  | { kind: 'text'; text: string }
  | { kind: 'href'; href: string }
  | { kind: 'match'; match: LinkifyMatchLike }

export type WriteOptions = {
  /** True for automatic writes (paste/autolink); false for explicit user intent. */
  autolink?: boolean
}

export type WriteResult = {
  href: string                       // canonical, normalized
  value: string                      // surface text
  start: number; end: number         // span in source string
  type: 'url' | 'email' | 'phone' | 'special'
  special: SpecialUrlInfo | null
}

export type ReadDecision = {
  safe: boolean                      // isSafeHref floor
  navigable: boolean                 // safe + isAllowedUri
  href: string
  special: SpecialUrlInfo | null
}

export interface URLDecisions {
  forWrite(input: WriteInput, opts?: WriteOptions): WriteResult[]
  forRead(href: string | null | undefined): ReadDecision
}

export function createURLDecisions(options?: URLDecisionsOptions): URLDecisions
export function isSafeHref(href: string | null | undefined): href is string

Usage — every existing path collapses to one call:

// onCreate
this.storage.urls = createURLDecisions(this.options)

// autolink, paste rule
this.storage.urls.forWrite({ kind: 'text', text }, { autolink: true })

// input rule, create/edit form
this.storage.urls.forWrite({ kind: 'href', href: url })

// click, preview
const { navigable, special } = this.storage.urls.forRead(href)

What it hides internally:

  • findLinks, phone.isBarePhone, validateURL, getURLScheme, hasPlausibleHost become module-internal — they remain exported as compatibility shims for packages/webapp consumers.
  • Special-scheme detection consolidates into one detector keyed off the specialUrls catalog. The two duplicate regexes (autolink.ts SPECIAL_SCHEME_REGEX and findLinks.ts SPECIAL_SCHEME_REGEX_GLOBAL) are deleted.
  • buildHrefGate is composed once at construction; callers ask forWrite / forRead instead of re-composing.
  • The validate(href) vs validate(value) ambiguity becomes inexpressible — forWrite always invokes validate(WriteResult.value) at the same step, regardless of input shape.

Dependency Strategy

Category 1 — In-process. Every dependency is in-memory:

  • linkifyjs (with its global protocol registry — encapsulated inside URLDecisions; the onCreate registerCustomProtocol block moves out of hyperlink.ts).
  • The user's option hooks (validate, isAllowedUri, shouldAutoLink, protocols, defaultProtocol) — captured by reference at construction.
  • A logger.

No I/O, no external services, no test substitutes required.

Testing Strategy

New boundary tests to write (against the URLDecisions interface, not its internals):

  • Every WriteInput shape produces consistent results for the same underlying URL — forWrite({kind:'text', text:'https://x.test'}) and forWrite({kind:'href', href:'https://x.test'}) agree on href, value, and which gates ran.
  • Write-time vs read-time gating differs predictably: a href that fails the validate shape rule is rejected by forWrite but forRead reports navigable: false.
  • validate is invoked exactly once per candidate per forWrite call, against WriteResult.value.
  • shouldAutoLink is invoked only when WriteOptions.autolink === true; explicit-intent callers (input rule, create form) cannot accidentally honor an autolink veto.
  • Special-scheme classification is consistent across text, href, and match inputs — the same magnet-style URL is classified identically regardless of how it entered the pipeline.
  • The XSS floor (DANGEROUS_SCHEME_RE) cannot be bypassed by any combination of WriteInput.kind + options.
  • forRead of a stored javascript: href returns navigable: false even when the href predates a tightened option set.

Old tests to delete once boundary tests pass:

  • utils/__tests__/validateURL.test.ts — replaced by the forWrite / forRead boundary suite.
  • utils/__tests__/findLinks.test.ts — replaced (detection is internal).
  • utils/__tests__/phone.test.ts — replaced.
  • utils/__tests__/normalizeHref.test.ts — keep only the contract tests for the public re-exported shim.
  • utils/__tests__/specialUrls.test.ts — keep only the catalog-shape tests; behavioral tests move to the boundary.

Test environment needs: none beyond the existing Bun test setup. No DOM required for the boundary tests — they exercise pure functions.

Implementation Recommendations

Module owns:

  • The full URL decision pipeline: detect → normalize → shape-validate → user validate → safety floor → user isAllowedUri → optional shouldAutoLink.
  • Encapsulation of the linkifyjs global protocol registry.
  • The single source of truth for special-scheme classification.

Module hides:

  • Whether linkify, regex, or catalog-based detection is used internally.
  • Construction order of the gate composition.
  • Per-input-shape branching.

Module exposes:

  • createURLDecisions(options) → URLDecisions factory.
  • forWrite(input, opts) and forRead(href) methods on the result.
  • isSafeHref(href) as an option-free standalone (the safety floor).

Caller migration:

  • The five caller paths each drop their imports of findLinks, validateURL, normalizeLinkifyHref, buildHrefGate, isNavigable and replace with one forWrite or forRead call.
  • Published utils/index.ts exports (validateURL, isSafeHref, normalizeHref, normalizeLinkifyHref, getSpecialUrlInfo) remain as compatibility shims for packages/webapp — bodies become 1-3 lines that delegate to the factory.

Sequencing note: This RFC is the foundation for three follow-ups (UI Controller, Link Interaction Plugin, Command Façade). Land this first; the others depend on urls.isAllowed / urls.normalize / urls.forWrite being available as composed values rather than raw option reads.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions