Proposal: two filter hooks to enable companion plugins (caching + vector DB)

Paul ARGOUD May 23, 2026 at 8:55 pm

Hi MxChat team,

First, a huge thank you for MxChat — it’s a genuinely great plugin, both comprehensive and well-built. It’s saved me a lot of time across several projects, and it’s precisely because I rely on it heavily that I’m writing this post.

Context

I’m currently building two small companion plugins, designed to extend MxChat without modifying any of its files:

mxchat-promptcache: automatically enables Anthropic prompt caching (the system block plus conversation history, with a 1-hour TTL via the extended-cache-ttl-2025-04-11 beta header) on Claude API calls, by hooking into WordPress’s http_request_args filter. Expected gains: roughly 90% lower cost on input tokens served from cache, and significantly lower first-token latency, with zero admin configuration.

mxchat-duckdb: a local vector backend (DuckDB or MotherDuck) as an alternative to MySQL storage (which deserializes every embedding in PHP on each query and collapses past a few thousand vectors) and to Pinecone (proprietary, paid, third-party US dependency).

Both plugins partly work as-is, but to wrap them up cleanly without patching MxChat Basic, I’d need two small apply_filters() hooks on the MxChat side. Both follow the same mxchat_* filter convention you already use throughout the plugin (e.g. mxchat_get_bot_pinecone_config, mxchat_pre_process_message, mxchat_get_bot_options), so this would slot naturally into your existing extensibility model.

Proposal #1 — Hook before the streaming cURL call to Anthropic

File: includes/class-mxchat-integrator.php, around line 8116 (right before the curl_setopt() call that sets CURLOPT_POSTFIELDS to $body).

Problem: the main chat flow uses curl_init() directly to stream the Anthropic response, which bypasses wp_remote_post() and therefore the WordPress http_request_args filter. As a result, mxchat-promptcache can’t inject cache_control blocks on that code path — even though it intercepts every other call site without trouble (interpret_query_with_claude, the content generator, non-streamed fallbacks).
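For context, the interception that already works on the wp_remote_post() paths can be sketched roughly as follows. This is a minimal illustration and not the actual mxchat-promptcache source: the function name, the exact host check, and the assumption that the request body is a JSON string with a top-level system field are all mine.

```php
<?php
// Sketch only: function name and request shape are assumptions.

/**
 * Marks the system prompt of Anthropic-bound requests as cacheable.
 *
 * @param array  $args Request arguments as passed to http_request_args.
 * @param string $url  Request URL.
 * @return array Possibly modified arguments.
 */
function promptcache_sketch_filter_http_args(array $args, string $url): array
{
    // Leave every request that is not going to the Anthropic API untouched.
    if (parse_url($url, PHP_URL_HOST) !== 'api.anthropic.com') {
        return $args;
    }

    $payload = json_decode(is_string($args['body'] ?? null) ? $args['body'] : '', true);
    if (!is_array($payload) || empty($payload['system'])) {
        return $args; // No system prompt to cache: pass through.
    }

    // A plain-string system prompt has to become a list of text blocks,
    // because cache_control is attached to a block.
    if (is_string($payload['system'])) {
        $payload['system'] = [['type' => 'text', 'text' => $payload['system']]];
    }
    if (!is_array($payload['system'])) {
        return $args;
    }

    // Mark the last system block; everything up to it becomes the cached prefix.
    $last = count($payload['system']) - 1;
    $payload['system'][$last]['cache_control'] = ['type' => 'ephemeral', 'ttl' => '1h'];

    // Caveat: decoding to associative arrays turns empty JSON objects into
    // empty arrays on re-encode; a production version would need to handle that.
    $args['body'] = json_encode($payload);

    // The 1-hour TTL needs the beta header; append rather than overwrite.
    $beta    = 'extended-cache-ttl-2025-04-11';
    $headers = is_array($args['headers'] ?? null) ? $args['headers'] : [];
    $current = (string) ($headers['anthropic-beta'] ?? '');
    if ($current === '') {
        $headers['anthropic-beta'] = $beta;
    } elseif (!in_array($beta, array_map('trim', explode(',', $current)), true)) {
        $headers['anthropic-beta'] = $current . ',' . $beta;
    }
    $args['headers'] = $headers;

    return $args;
}

// Register only when WordPress is loaded, so the file can also run standalone.
if (function_exists('add_filter')) {
    add_filter('http_request_args', 'promptcache_sketch_filter_http_args', 10, 2);
}
```

None of this runs on the streaming path, since http_request_args only fires for requests made through the WordPress HTTP API.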

Proposed change: a passthrough filter allowing the JSON $body and $headers to be rewritten before the cURL call:

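// Let companion plugins rewrite the JSON body and headers before the cURL call.
$filtered = apply_filters('mxchat_pre_claude_stream_request', [
    'body'    => $body,
    'headers' => $headers,
], $selected_model);
// Fall back to the originals if a callback returns something malformed.
$body    = $filtered['body'] ?? $body;
$headers = $filtered['headers'] ?? $headers;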

Zero impact on default behavior, and an immediate opening for caching, observability, or other payload transformations.
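On the consumer side, a companion plugin could use the proposed filter along these lines. This is a hedged sketch, not the final plugin code: it assumes 'body' is a JSON string with a messages list, that 'headers' is a list of "Name: value" strings as used with CURLOPT_HTTPHEADER, and the function name is hypothetical.

```php
<?php
// Sketch only: the payload and header shapes are assumptions about MxChat.

/**
 * Marks the end of the conversation history as a cache breakpoint.
 *
 * @param array $request        ['body' => JSON string, 'headers' => string[]].
 * @param mixed $selected_model Model identifier passed by the filter (unused here).
 * @return array Possibly modified request.
 */
function promptcache_sketch_filter_stream(array $request, $selected_model = null): array
{
    $payload = json_decode(is_string($request['body'] ?? null) ? $request['body'] : '', true);
    if (!is_array($payload) || empty($payload['messages']) || !is_array($payload['messages'])) {
        return $request; // Nothing recognizable to cache: pass through.
    }

    // cache_control lives on a content block, so a plain-string message
    // is first converted into a single text block.
    $i       = count($payload['messages']) - 1;
    $content = $payload['messages'][$i]['content'] ?? null;
    if (is_string($content) && $content !== '') {
        $content = [['type' => 'text', 'text' => $content]];
    }
    if (!is_array($content) || $content === []) {
        return $request;
    }

    // Mark the last block of the last message with a 1-hour ephemeral cache.
    $j = count($content) - 1;
    $content[$j]['cache_control'] = ['type' => 'ephemeral', 'ttl' => '1h'];
    $payload['messages'][$i]['content'] = $content;
    $request['body'] = json_encode($payload);

    // Add the beta header unless some anthropic-beta line is already present
    // (merging with an existing value is left out of this sketch).
    $headers = is_array($request['headers'] ?? null) ? $request['headers'] : [];
    $hasBeta = false;
    foreach ($headers as $line) {
        if (is_string($line) && stripos($line, 'anthropic-beta:') === 0) {
            $hasBeta = true;
            break;
        }
    }
    if (!$hasBeta) {
        $headers[] = 'anthropic-beta: extended-cache-ttl-2025-04-11';
    }
    $request['headers'] = $headers;

    return $request;
}

// Register only when WordPress is loaded, so the file can also run standalone.
if (function_exists('add_filter')) {
    add_filter('mxchat_pre_claude_stream_request', 'promptcache_sketch_filter_stream', 10, 2);
}
```

Because the callback returns its input unchanged whenever it does not recognize the payload, a site without a matching request shape keeps today’s behavior.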

Proposal #2 — Hook before the Pinecone vector query

Files: includes/class-mxchat-integrator.php, inside find_relevant_content_pinecone() around line 5287 (before the wp_remote_post() to Pinecone), plus the four admin call sites in admin/class-pinecone-manager.php.

Proposed change: an mxchat_pre_vector_query filter following the WordPress pre_* convention — returning null preserves the current behavior, returning a Pinecone-compatible array short-circuits the network call:

$override = apply_filters('mxchat_pre_vector_query', null, $embedding, $top_k, $context);
if ($override !== null) {
    return $override; // bypass Pinecone
}

This would let mxchat-duckdb intercept queries and translate them into local DuckDB SQL, without touching MxChat’s source.
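To make the short-circuit concrete, here is a minimal sketch of a callback for the proposed filter. It is illustrative only: an in-memory array with brute-force cosine similarity stands in for the DuckDB query, the ['matches' => ...] return shape is my assumption about what MxChat expects from Pinecone, and all function names are hypothetical.

```php
<?php
// Sketch only: in-memory store instead of DuckDB, assumed return shape.

/** Cosine similarity between two numeric vectors (0.0 if either is all zeros). */
function vector_sketch_cosine(array $a, array $b): float
{
    $dot = 0.0;
    $na  = 0.0;
    $nb  = 0.0;
    $n   = min(count($a), count($b));
    for ($i = 0; $i < $n; $i++) {
        $dot += $a[$i] * $b[$i];
        $na  += $a[$i] * $a[$i];
        $nb  += $b[$i] * $b[$i];
    }
    return ($na > 0.0 && $nb > 0.0) ? $dot / sqrt($na * $nb) : 0.0;
}

/** Hypothetical stand-in for the local vector table. */
function vector_sketch_store(): array
{
    return [
        'doc-a' => ['values' => [1.0, 0.0, 0.0], 'metadata' => ['text' => 'Alpha']],
        'doc-b' => ['values' => [0.0, 1.0, 0.0], 'metadata' => ['text' => 'Beta']],
        'doc-c' => ['values' => [0.9, 0.1, 0.0], 'metadata' => ['text' => 'Gamma']],
    ];
}

/**
 * Answers the vector query locally instead of calling Pinecone.
 * Returning null (or an earlier callback's result) keeps the default flow.
 */
function vector_sketch_pre_query($override, $embedding, $top_k = 5, $context = null)
{
    if ($override !== null || !is_array($embedding)) {
        return $override;
    }

    $matches = [];
    foreach (vector_sketch_store() as $id => $row) {
        $matches[] = [
            'id'       => $id,
            'score'    => vector_sketch_cosine($embedding, $row['values']),
            'metadata' => $row['metadata'],
        ];
    }

    // Highest similarity first, then keep the top_k best matches.
    usort($matches, fn($x, $y) => $y['score'] <=> $x['score']);

    return ['matches' => array_slice($matches, 0, max(1, (int) $top_k))];
}

// Register only when WordPress is loaded, so the file can also run standalone.
if (function_exists('add_filter')) {
    add_filter('mxchat_pre_vector_query', 'vector_sketch_pre_query', 10, 4);
}
```

In the real plugin the loop would be replaced by a SQL query against DuckDB; the filter contract (null in, null or a result array out) stays the same.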

Intentions

To be completely clear: I have no competitive or commercial intent here. These two plugins exist purely because I needed the functionality on my own deployments, and I’d rather contribute back to the MxChat ecosystem than fork or monetize anything.

In fact, I’d be delighted if my work were integrated into the original plugin free of charge, whether one of the plugins or both. I’m happy to share the source and the readme.txt files, and to open formal PRs if the project is hosted on an accessible repo.

Thanks again for everything you’ve built with MxChat. Looking forward to hearing your thoughts.

Paul

Follow-up with empirical data on Proposal #1 (the streaming hook)

Quick update: I tested the manual workaround (disabling streaming forces a code path that goes through the WordPress HTTP API instead of the direct cURL one). A companion plugin injecting Anthropic prompt caching markers worked cleanly on the first try.

After 14 requests across Haiku 4.5 and Sonnet 4.6 in real use, the cumulative cache hit rate sits at 65 percent (Haiku alone: roughly 60 percent, Sonnet: roughly 68 percent). Effective input token cost dropped to about 35 to 40 percent of baseline. I project a steady state of 70 to 85 percent input cost reduction on typical multi-turn chats, with the biggest impact on Sonnet and Opus, where input pricing is several times Haiku’s.
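For anyone who wants to redo this kind of arithmetic on their own traffic, the calculation can be sketched as below. The multipliers are assumptions based on Anthropic’s published prompt caching pricing as I understand it (cache reads billed at 0.1x the base input price, 1-hour cache writes at 2x), the function name is hypothetical, and the token counts in the example are made up, not my measured data. The three inputs would correspond to the cache_read_input_tokens, cache_creation_input_tokens, and input_tokens usage fields in the API response.

```php
<?php
// Sketch only: multipliers are assumptions, check current pricing.

/**
 * Effective input cost as a fraction of the uncached baseline.
 *
 * @param int   $readTokens      Input tokens served from cache.
 * @param int   $writeTokens     Input tokens written to the cache.
 * @param int   $uncachedTokens  Input tokens billed at the normal rate.
 * @param float $readMultiplier  Price of a cache read relative to base input.
 * @param float $writeMultiplier Price of a cache write relative to base input.
 */
function cache_sketch_cost_ratio(
    int $readTokens,
    int $writeTokens,
    int $uncachedTokens,
    float $readMultiplier = 0.1,
    float $writeMultiplier = 2.0
): float {
    $total = $readTokens + $writeTokens + $uncachedTokens;
    if ($total === 0) {
        return 1.0; // No input at all: nothing saved, nothing lost.
    }

    // Weight each token class by its price multiplier, then compare
    // against the same tokens billed at the base rate.
    $billed = $readTokens * $readMultiplier
            + $writeTokens * $writeMultiplier
            + $uncachedTokens;

    return $billed / $total;
}

// Made-up example: 8,000 tokens read from cache, 500 written, 1,500 uncached.
printf("%.2f\n", cache_sketch_cost_ratio(8000, 500, 1500));
```

Cache writes cost more than uncached input, so short conversations with few follow-up turns can end up above baseline; the savings come from repeated reads of the same prefix.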

The trade-off today is binary: streaming and caching cannot currently coexist. The proposed filter resolves that: companion plugins could modify the request payload before dispatch while keeping streaming on. Pass-through by default, zero behavior change for users without a companion plugin.

Full technical proposal (unified diff against the integrator class, working consumer example, full numbers) lives in this Gist:

https://gist.github.com/PaulArgoud/0f8cc1b455e27a679cc2b84445e8dc87

Happy to send a pull request against main with the patch and a small test — just point me at the branch you prefer.

PS: The MxChat chatbot is prompting visitors for a satisfaction rating even though the feature is disabled in the admin settings.