Fix connection pool exhaustion from proxy requests (#292)
* Fix connection pool exhaustion from ?uri= proxy requests (#287)
- Add 30 s connectionRequestTimeout to both HTTP client builders in
Application so pool exhaustion fails fast instead of blocking forever
- Replace allMatch(HTMLMediaTypePredicate) with Request.selectVariant()
in ProxyRequestFilter so real browser Accept headers (text/html,
application/xml;q=0.9, */*;q=0.8) correctly trigger the early return,
leaving (X)HTML responses to the downstream handler and Varnish cache
- In client.xsl ldh:rdf-document-response, detect external ?uri= URIs
and replace-content on #content-body with bs2:Row rendering of the
fetched RDF instead of iterating stale home-page blocks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
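The fail-fast behaviour described in the first bullet can be illustrated without the HTTP client library. The sketch below is only a stand-in: `BoundedPool`, `lease()` and `release()` are hypothetical names, and a `Semaphore` plays the role of the connection pool. It shows the general idea of waiting a bounded time for a pooled connection, not the actual code in `Application`.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for an HTTP connection pool; not the real client API. */
class BoundedPool {
    private final Semaphore permits;
    private final long requestTimeoutMillis;

    BoundedPool(int maxConnections, long requestTimeoutMillis) {
        this.permits = new Semaphore(maxConnections);
        this.requestTimeoutMillis = requestTimeoutMillis;
    }

    /** Waits at most requestTimeoutMillis for a free slot, then fails instead of blocking forever. */
    void lease() throws InterruptedException, TimeoutException {
        if (!permits.tryAcquire(requestTimeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("No pooled connection available within "
                    + requestTimeoutMillis + " ms");
        }
    }

    /** Returns a slot to the pool. */
    void release() {
        permits.release();
    }

    public static void main(String[] args) throws Exception {
        BoundedPool pool = new BoundedPool(1, 100); // one connection, 100 ms wait
        pool.lease();                               // takes the only slot
        try {
            pool.lease();                           // pool is exhausted: times out
        } catch (TimeoutException e) {
            System.out.println("Failed fast: " + e.getMessage());
        }
    }
}
```

Without a bound on the wait, the second `lease()` would block until some other caller released a slot, which is the "blocking forever" symptom the commit addresses.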
* Server-side condition
* Server-side progress bar
* Fix external URI proxy bypass and client-side rendering
- ProxyRequestFilter: use Core-only MediaTypes (no HTML) with combined
Model+ResultSet writable variant list; selectVariant==null is the sole
bypass signal so Accept:*/* correctly reaches the proxy instead of
falling through to the HTML handler
- Thread pre-computed Variant through all getResponse() overloads to
avoid a second selectVariant call inside Core's Response constructor
- client.xsl onsubmit: skip the XHTML round-trip for external URIs and
call PushState + RDFDocumentLoad directly, advancing the progress bar
to 66% between the two steps; fixes the double-click issue
- client.xsl ldh:rdf-document-response: respect the #layout-modes mode
selector for client-side rendered external resources; refactor the
duplicate id('content-body') lookup out of both xsl:choose branches
- ProxyRequestFilterTest: stub Request.selectVariant() to return a
non-null Variant so both tests reach the logic they exercise
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ProxyRequestFilter: document HTML bypass rationale; cache MediaTypes instance
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ProxyRequestFilter: clarify HTML bypass as resource exhaustion defence
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Make HTTP client connectionRequestTimeout configurable
Defaults to 30000 ms (via Dockerfile ENV). Passed through the
CATALINA_OPTS path (same as allowInternalUrls) to avoid exceeding
the ~30-param libxslt limit already reached by context.xsl.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
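A minimal sketch of a configurable timeout with a 30000 ms default, assuming the value arrives as a JVM system property (the usual way `CATALINA_OPTS` carries `-D` flags). The property name `example.connectionRequestTimeout` and the class `TimeoutConfig` are hypothetical; the commit message does not state the real name.

```java
/** Hypothetical configuration helper; the property name is an assumption. */
class TimeoutConfig {
    static final String PROPERTY = "example.connectionRequestTimeout"; // assumed name
    static final int DEFAULT_MILLIS = 30_000;

    /** Reads the timeout from a system property, falling back to the default. */
    static int connectionRequestTimeoutMillis() {
        // Integer.getInteger returns the default when the property is unset or unparseable
        Integer value = Integer.getInteger(PROPERTY, DEFAULT_MILLIS);
        return value > 0 ? value : DEFAULT_MILLIS;
    }

    public static void main(String[] args) {
        // Run with -Dexample.connectionRequestTimeout=5000 to override the default
        System.out.println(connectionRequestTimeoutMillis());
    }
}
```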
* Fix ProxyRequestFilter HTML bypass: check Accept header explicitly
Replace the selectVariant==null bypass with an explicit check for
non-wildcard text/html or application/xhtml+xml in the Accept header.
Browsers list these types explicitly (q=1.0) and get bypassed to the
app shell; API clients that send only */* reach the proxy.
The old approach (Core MediaTypes, selectVariant==null) failed for
browsers because their */*;q=0.8 wildcard matched RDF variants,
causing the proxy to return RDF instead of the (X)HTML app shell.
Add testHtmlAcceptBypassesProxy to cover the bypass path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
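The explicit Accept check described above might look roughly like the following. This is a sketch, not the code in `ProxyRequestFilter`: the class and method names are hypothetical, and q-values are ignored for brevity (a strict implementation would treat `q=0` as "not acceptable").

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; the actual ProxyRequestFilter logic may differ. */
class HtmlAcceptCheck {
    private static final Set<String> HTML_TYPES =
            Set.of("text/html", "application/xhtml+xml");

    /** True if the Accept header names an (X)HTML type explicitly; wildcards never count. */
    static boolean acceptsHtmlExplicitly(String acceptHeader) {
        if (acceptHeader == null) return false;
        for (String range : acceptHeader.split(",")) {
            // Drop parameters such as ";q=0.9" and normalise case and whitespace
            String type = range.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
            if (HTML_TYPES.contains(type)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Typical browser header: bypasses the proxy and gets the app shell
        System.out.println(acceptsHtmlExplicitly("text/html,application/xml;q=0.9,*/*;q=0.8"));
        // API client sending only a wildcard: reaches the proxy
        System.out.println(acceptsHtmlExplicitly("*/*"));
    }
}
```

Because only exact type names are matched, the `*/*;q=0.8` wildcard that browsers append no longer influences the decision, which is the failure mode of the earlier `selectVariant==null` approach.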
* Fix ContentMode block rendering for proxied external resources
Proxied resources' ContentMode blocks (charts, maps) were querying the
local SPARQL endpoint instead of the remote one because ProxyRequestFilter
discarded all external response headers and ResponseHeadersFilter then
injected the local sd:endpoint Link.
- ApplicationFilter: register external ?uri= target in request context
(AC.uri property) as authoritative proxy marker
- ProxyRequestFilter: forward all Link headers from external response
- ResponseHeadersFilter: skip local sd:endpoint/ldt:ontology/ac:stylesheet
for proxy requests; removes now-unused parseLinkHeaderValues/getLinksByRel
- client.xsl (ldh:rdf-document-response): extract sd:endpoint from Link
header and store in LinkedDataHub.endpoint, mirroring acl:mode pattern
- functions.xsl (sd:endpoint()): return LinkedDataHub.endpoint when set,
fall back to local /sparql — no changes needed in view.xsl or chart.xsl
- CLAUDE.md: document the proxy/client-side rendering architecture
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
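The endpoint selection described in this commit is implemented in XSLT (`client.xsl`, `functions.xsl`); the Java sketch below only illustrates the same logic: find the `sd:endpoint` link in a forwarded `Link` header and fall back to the local endpoint when it is absent. The class and method names are hypothetical, and the parser is deliberately simplified (it assumes `rel` is the first link parameter and holds a single value).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified Link header lookup; not a full RFC 8288 parser. */
class LinkHeaderEndpoint {
    static final String SD_ENDPOINT =
            "http://www.w3.org/ns/sparql-service-description#endpoint";

    // Matches <target>; rel=value or <target>; rel="value"
    private static final Pattern LINK =
            Pattern.compile("<([^>]*)>\\s*;\\s*rel\\s*=\\s*\"?([^\";,\\s]+)\"?");

    /** Returns the target of the first link whose rel equals the given relation. */
    static Optional<String> findByRel(String linkHeader, String rel) {
        if (linkHeader == null) return Optional.empty();
        Matcher m = LINK.matcher(linkHeader);
        while (m.find()) {
            if (m.group(2).equals(rel)) return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    /** Remote endpoint when the response advertises one, otherwise the local endpoint. */
    static String endpoint(String linkHeader, String localEndpoint) {
        return findByRel(linkHeader, SD_ENDPOINT).orElse(localEndpoint);
    }

    public static void main(String[] args) {
        String header = "<https://remote.example/sparql>; rel=" + SD_ENDPOINT;
        System.out.println(endpoint(header, "/sparql")); // remote endpoint wins
        System.out.println(endpoint(null, "/sparql"));   // falls back to local
    }
}
```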
* Replace proxy-detection heuristics with ac:uri() / $ac:uri throughout
- ApplicationFilter: store external URI as AC.uri context property; strip ?uri= from UriInfo
- ProxyRequestFilter: read proxy target from AC.uri context property; bypass HTML requests
- XsltExecutableFilter: remove SYSTEM_ID_PROPERTY; XSLTWriterBase reads AC.uri directly
- XSLTWriterBase: pass $ac:uri to server-side XSLT when proxying
- layout.xsl: declare $ac:uri param; use it for export links and search input pre-fill
- document.xsl: remove proxy spinner branch from bs2:ContentBody
- client/functions.xsl: add ac:uri() function (dynamic read of ixsl:query-params()?uri);
ldh:base-uri() now calls ac:uri() instead of stale global $ac:uri
- client.xsl: drop global $ac:uri param; ldh:HTMLDocumentLoaded passes ldh:base-uri(.)
to ldh:RDFDocumentLoad after pushState so URL is already updated
- ProxyRequestFilterTest: update mocks to use AC.uri context property
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add server-side ac:uri() function; refactor ActionBar templates into document.xsl
- Add ac:uri() server-side function to imports/default.xsl (mirrors acl:mode() pattern)
- Move ActionBarLeft/ActionBarMain/ActionBarRight/BreadCrumbBar/ModeList/MediaTypeList templates from layout.xsl to document.xsl
- Fix $effective-mode type error (xs:string → xs:anyURI) and simplify with [1] idiom
- Use ac:uri() instead of $ac:uri in MediaTypeList hrefs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
CLAUDE.md: 22 additions & 2 deletions
@@ -91,8 +91,28 @@ The application runs as a multi-container setup:
 1. Requests come through nginx proxy
 2. Varnish provides caching layer
 3. LinkedDataHub application handles business logic
-4. Data persisted to appropriate Fuseki triplestore
-5. XSLT transforms data for client presentation
+4. RDF data is read/written via the **Graph Store Protocol** — each document in the hierarchy corresponds to a named graph in the triplestore; the document URI is the graph name
+5. Data persisted to appropriate Fuseki triplestore
+6. XSLT transforms data for client presentation
+
+### Linked Data Proxy and Client-Side Rendering
+
+LDH includes a Linked Data proxy that dereferences external URIs on behalf of the browser. The original design rendered proxied resources identically to local ones — server-side RDF fetch + XSLT. This created a DDoS/resource-exhaustion vector: scraper bots routing arbitrary external URIs through the proxy would trigger a full server-side pipeline (HTTP fetch → XSLT rendering) per request, exhausting HTTP connection pools and CPU.
+
+The current design splits rendering by request origin:
+
+- **Browser requests** (`Accept: text/html`): `ProxyRequestFilter` bypasses the proxy entirely. The server returns the local application shell. Saxon-JS then issues a second, RDF-typed request (`Accept: application/rdf+xml`) from the browser.
+- **RDF requests** (API clients, Saxon-JS second pass): `ProxyRequestFilter` fetches the external RDF, parses it, and returns it to the caller. No XSLT happens server-side.
+- **Client-side rendering**: Saxon-JS receives the raw RDF and applies the same XSLT 3 templates used server-side (shared stylesheet), so proxied resources look almost identical to local ones.
+
+Key implementation files:
+- `ProxyRequestFilter.java` — intercepts `?uri=` and `lapp:Dataset` proxy requests; HTML bypass; forwards external `Link` headers
+- `ApplicationFilter.java` — registers external proxy target URI in request context (`AC.uri` property) as authoritative proxy marker
+- `ResponseHeadersFilter.java` — skips local-only hypermedia links (`sd:endpoint`, `ldt:ontology`, `ac:stylesheet`) for proxy requests; external ones are forwarded by `ProxyRequestFilter`
+- `client.xsl` (`ldh:rdf-document-response`) — receives the RDF proxy response client-side; extracts `sd:endpoint` from `Link` header; stores it in `LinkedDataHub.endpoint`
+- `functions.xsl` (`sd:endpoint()`) — returns `LinkedDataHub.endpoint` when set (external proxy), otherwise falls back to the local SPARQL endpoint
+
+The SPARQL endpoint forwarding chain ensures ContentMode blocks (charts, maps) query the **remote** app's SPARQL endpoint, not the local one. `LinkedDataHub.endpoint` is reset to the local endpoint by `ldh:HTMLDocumentLoaded` on every HTML page navigation, so there is no stale state when navigating back to local documents.
 
 ### Key Extension Points
 - **Vocabulary definitions** in `com.atomgraph.linkeddatahub.vocabulary`
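The CLAUDE.md change notes that each document corresponds to a named graph whose name is the document URI. As a rough illustration, the sketch below builds a Graph Store Protocol request URL using indirect graph identification (the `?graph=` parameter defined by the SPARQL 1.1 Graph Store HTTP Protocol). Whether LDH addresses graphs this way is an assumption; the class name, endpoint URL and document URI are all hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing indirect graph identification; not LDH code. */
class GraphStoreUrl {
    /** Builds endpoint?graph=<percent-encoded document URI>. */
    static URI graphUrl(String graphStoreEndpoint, String documentUri) {
        // The document URI doubles as the graph name, so it must be encoded as a query value
        String encoded = URLEncoder.encode(documentUri, StandardCharsets.UTF_8);
        return URI.create(graphStoreEndpoint + "?graph=" + encoded);
    }

    public static void main(String[] args) {
        // Both URLs are placeholders for illustration only
        System.out.println(graphUrl("http://localhost:3030/ds/data",
                "https://example.org/docs/a/"));
    }
}
```

A GET on the resulting URL would read the document's graph, while PUT or POST would replace or extend it.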