MxChat Image Analysis

Let visitors upload images straight into the chat so the AI can read, describe, or extract information from them. Useful for support flows where users need to share a screenshot, document, receipt, chart, or photo to get a real answer.

Overview

MxChat Image Analysis adds a camera icon to the chat toolbar so visitors can attach images and ask the AI to look at them. The AI can extract text via OCR, recognize objects, describe scenes, parse charts, and pull data from documents, then keep talking about what it saw in the rest of the conversation. It supports three vision models from two providers: OpenAI GPT-4V, OpenAI GPT-4o (multimodal), and xAI Grok Vision. You pick the model that fits the use case and the API key you already have. The add-on requires the base MxChat plugin plus a MxChat Pro license, and reuses whichever OpenAI or xAI key you already configured in MxChat core.

Setup

Install MxChat Image Analysis by uploading the ZIP through Plugins → Add New → Upload Plugin, or drop the mxchat-vision folder into /wp-content/plugins/.
Activate the plugin under Plugins → Installed Plugins.
Make sure the base MxChat plugin is active and a MxChat Pro license is registered (the add-on warns you in admin if either is missing).
Confirm an OpenAI key and/or xAI key is saved in MxChat core. Easiest path: MxChat → Settings, switch the chat model to GPT-4V (or Grok), paste the key, save, then switch your chat model back to whatever you normally use.
Open MxChat → Image Analysis, toggle Enable Vision Analysis on, pick your model, and save.

Settings

Enable Vision Analysis

Master switch for the feature. When on, the camera/upload button shows up in the chat toolbar for visitors. Off by default.

Vision Mode

Choose Image Analysis (AI returns text about the image) or Image Editing (AI returns a modified image). The rest of the settings page changes based on this choice. Default: Analysis.

Analysis Model

Which vision model handles uploads. Options: OpenAI Vision (GPT-4V / GPT-4o) and Grok Vision. Models are grayed out if their API key is not configured in MxChat core. Default: OpenAI.

Analysis Prompt

The system instruction sent to the AI alongside the image. Default prompt asks for a detailed description covering text, objects, people, colors, and context. Built-in example templates cover Professional Analysis, Simple & Friendly, and OCR Focus.
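
To make the relationship between the Analysis Prompt and the uploaded image concrete, here is a minimal sketch of a request body in the OpenAI Chat Completions multimodal format. The function name buildVisionRequest, its parameters, and the placeholder image data are hypothetical and only illustrate the shape; the add-on's real request code may differ.

```javascript
// Sketch: pair a system prompt with one or more images in a Chat Completions body.
// `buildVisionRequest` is a hypothetical helper, not an MxChat function.
function buildVisionRequest(model, analysisPrompt, userText, imageDataUrls) {
  // Each image becomes an `image_url` content part next to the visitor's text.
  const imageParts = imageDataUrls.map((url) => ({
    type: "image_url",
    image_url: { url },
  }));

  return {
    model,
    messages: [
      // The Analysis Prompt travels as the system message.
      { role: "system", content: analysisPrompt },
      // The visitor's question and images travel together as one user message.
      { role: "user", content: [{ type: "text", text: userText }, ...imageParts] },
    ],
  };
}

const body = buildVisionRequest(
  "gpt-4o",
  "Focus only on the text in the receipt and list every line item with its price.",
  "What is the total on this receipt?",
  ["data:image/png;base64,iVBORw0KGgo="] // placeholder data URL, not a real image
);
console.log(JSON.stringify(body, null, 2));
```

Keeping the prompt in the system message means you can change the instruction in settings without touching what the visitor typed.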

Max Images Per Message

How many images a visitor can attach to a single message. Range 1–10, default 5. Higher numbers raise processing time and API cost.

Max File Size (MB)

Per-image upload limit. Range 1–20 MB, default 10 MB. Oversized images are auto-resized client-side to fit.
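
As a rough illustration of how a client-side resize could pick target dimensions, the sketch below scales an image so that its re-encoded size is likely to fit the limit. It assumes encoded size grows roughly in proportion to pixel count, which is only an approximation, and fitDimensions is a hypothetical helper rather than the add-on's actual resize logic.

```javascript
// Sketch: choose dimensions so a re-encoded image is likely to fit a byte budget.
// Assumption: encoded size scales roughly with pixel count (not exact for any codec).
function fitDimensions(width, height, fileBytes, maxBytes) {
  if (fileBytes <= maxBytes) {
    return { width, height, scale: 1 }; // Already within the limit.
  }
  // Pixel count scales with the square of the linear scale factor.
  const scale = Math.sqrt(maxBytes / fileBytes);
  return {
    width: Math.max(1, Math.floor(width * scale)),
    height: Math.max(1, Math.floor(height * scale)),
    scale,
  };
}

const MB = 1024 * 1024;
// A 4000x3000 photo of 40 MB against the default 10 MB limit.
const target = fitDimensions(4000, 3000, 40 * MB, 10 * MB);
console.log(target); // { width: 2000, height: 1500, scale: 0.5 }
```

A real implementation would re-encode at the new size and check the resulting byte count, since compression ratios vary by image content.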

Allowed Image Types

Fixed list: JPEG, PNG, GIF, WebP. Anything else is rejected at the upload step.
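
The three upload limits above (Max Images Per Message, Max File Size, Allowed Image Types) can be pictured as one pre-upload check. The following is a sketch under stated assumptions: validateUploads and the limits object are hypothetical names, the defaults mirror the values documented above, and the MIME strings are the standard ones for the four listed formats.

```javascript
// Sketch: check a batch of files against the documented upload limits.
// `validateUploads` and the `limits` shape are hypothetical, not part of MxChat.
const DEFAULT_LIMITS = {
  maxImages: 5, // Max Images Per Message default
  maxFileSizeMb: 10, // Max File Size (MB) default
  allowedTypes: ["image/jpeg", "image/png", "image/gif", "image/webp"],
};

function validateUploads(files, limits = DEFAULT_LIMITS) {
  const errors = [];
  if (files.length > limits.maxImages) {
    errors.push(`Too many images: ${files.length} > ${limits.maxImages}`);
  }
  const maxBytes = limits.maxFileSizeMb * 1024 * 1024;
  for (const file of files) {
    if (!limits.allowedTypes.includes(file.type)) {
      errors.push(`${file.name}: type ${file.type || "unknown"} is not allowed`);
    } else if (file.size > maxBytes) {
      errors.push(`${file.name}: larger than ${limits.maxFileSizeMb} MB`);
    }
  }
  return errors;
}

// Plain objects stand in for browser File objects (same name/type/size fields).
const problems = validateUploads([
  { name: "receipt.png", type: "image/png", size: 2 * 1024 * 1024 },
  { name: "scan.tiff", type: "image/tiff", size: 1 * 1024 * 1024 },
  { name: "photo.jpg", type: "image/jpeg", size: 12 * 1024 * 1024 },
]);
console.log(problems);
```

Returning a list of messages instead of a single boolean lets a custom UI tell the visitor exactly which file was the problem.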

Editing Mode Settings

Visible only when Vision Mode is set to Editing. These control the optional Gemini-powered image editing path (covered in detail in the dedicated MxChat Image Editing documentation).

Features

Simple in-chat upload: A camera icon appears in the chat toolbar. Visitors can pick a file, paste from the clipboard, or drag and drop — no separate workflow, no leaving the conversation.
OpenAI + xAI vision models: Choose GPT-4V, GPT-4o, or Grok Vision per install. Switch when one provider drops prices or ships a better model.
OCR and text extraction: Pulls text out of screenshots, scanned documents, receipts, business cards, and even handwriting, then feeds it into the conversation so the AI can answer questions about what it read.
Object recognition and scene description: The AI identifies what is in the image and describes it in detail, which gives the rest of the chat the context it needs to answer follow-ups.
Chart and graph analysis: Visitors upload a screenshot of a chart and ask things like "what is the peak value" or "what does this trend mean" — the vision model reads axes, labels, and bars.
Document and form extraction: Pulls structured data out of invoices, forms, ID cards, and similar documents so a support agent or downstream automation can use it.
Custom analysis prompts: Pre-load templates (Document Analysis, OCR Focus, Professional Analysis) or write your own to point the AI at exactly what you care about per use case.
Multi-format support: JPEG, PNG, GIF, and WebP, including pasted clipboard images and dragged screenshots.
No extra setup: Reuses your existing MxChat OpenAI or xAI keys. Toggle vision per bot from MxChat settings, no second account or billing relationship needed.

API & Integration

OpenAI Vision API: Calls the OpenAI Chat Completions endpoint with multimodal content (GPT-4V or GPT-4o). Uses the api_key value from MxChat core's mxchat_options.
xAI Grok Vision API: Calls the xAI Grok Vision endpoint. Uses the xai_api_key value from MxChat core's mxchat_options.
WordPress options: All settings are stored as standard WordPress options with the mxchat_vision_ prefix (mxchat_vision_enabled, mxchat_vision_model, mxchat_vision_mode, mxchat_vision_max_images, mxchat_vision_max_file_size, mxchat_vision_custom_prompt), so they can be read with get_option() or set with update_option().
JavaScript object: The frontend script exposes window.mxchatVision with ajaxUrl, nonce, maxFileSize, allowedTypes, and mode — handy if you need to extend the upload UI in a custom theme.
Admin AJAX: Settings save via the mxchat_vision_save_settings admin-ajax action, gated by the mxchat_vision_nonce.
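
If you extend the upload UI from a custom theme, it helps to read window.mxchatVision defensively, since the object is only present when the add-on's script has loaded. The sketch below uses the field names listed above; the field types, the fallback values, and the helper name readVisionConfig are assumptions made for illustration.

```javascript
// Sketch: normalize window.mxchatVision before using it in custom code.
// Field names come from the list above; types and fallbacks are assumptions.
function readVisionConfig(globalObject) {
  const raw = (globalObject && globalObject.mxchatVision) || null;
  if (!raw) {
    return null; // Script not loaded on this page.
  }
  return {
    ajaxUrl: String(raw.ajaxUrl || ""),
    nonce: String(raw.nonce || ""),
    maxFileSize: Number(raw.maxFileSize) || 0,
    allowedTypes: Array.isArray(raw.allowedTypes) ? raw.allowedTypes : [],
    mode: raw.mode || "analysis",
  };
}

// In a browser, pass `window`. A stub object is used here so the sketch runs anywhere.
const config = readVisionConfig({
  mxchatVision: {
    ajaxUrl: "/wp-admin/admin-ajax.php",
    nonce: "abc123",
    maxFileSize: "10",
    allowedTypes: ["image/png"],
    mode: "analysis",
  },
});
console.log(config.maxFileSize, config.mode); // 10 analysis
```

Check the real object in your browser console before relying on any field, in particular whether maxFileSize is expressed in megabytes or bytes.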

Requirements

MxChat (mxchat-basic): Required and active.
MxChat Pro License: Required — the add-on shows a warning and disables features without it. See pricing.
WordPress: 5.8 or newer.
PHP: 7.4 or newer.
Third-party API key: An OpenAI API key (for GPT-4V / GPT-4o) and/or an xAI API key (for Grok Vision). Configured inside MxChat core, not inside this add-on.

Troubleshooting

Camera icon does not appear in the chat toolbar

Confirm that Enable Vision Analysis is on in MxChat → Image Analysis and that MxChat Pro is active, then have the visitor hard-refresh the page (the toolbar JS may be cached). If you run multiple bots, double-check that the bot displayed on the page actually has vision enabled.

"API Key Required" appears next to a model

The selected provider's key is not saved in MxChat core. Go to MxChat → Settings, switch the chat model temporarily to GPT-4V or Grok, paste the key, save, then switch the chat model back. Reload the Image Analysis settings page to confirm the badge flips to Active.

Uploads fail or get rejected

Check the file type (JPEG, PNG, GIF, WebP only) and the file size against Max File Size in settings. Files above the limit (10 MB by default) are rejected. Large screenshots taken on high-DPI phones often exceed this limit; raise it to 15–20 MB or downscale the image before uploading.

AI response is empty or "I cannot see an image"

This almost always means the request to the provider failed silently. Check your WordPress error log and your API account's usage dashboard for rate-limit or quota errors. xAI and OpenAI both return HTTP 429 responses when you hit a rate limit or run out of credits.
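
When reading logs for a failed provider call, a small lookup like the one below can turn an HTTP status into a likely cause. This is a sketch based on common HTTP conventions; exact codes and meanings vary by provider, and describeProviderStatus is a hypothetical helper that is not part of MxChat.

```javascript
// Sketch: map a provider HTTP status to a likely cause for troubleshooting.
// Meanings follow common HTTP conventions; check the provider's docs for specifics.
function describeProviderStatus(status) {
  if (status >= 200 && status < 300) return "ok";
  if (status === 401) return "invalid or missing API key";
  if (status === 429) return "rate limit hit or credits exhausted";
  if (status >= 500) return "provider-side error, retry later";
  return "request rejected, check the payload";
}

for (const status of [200, 401, 429, 503]) {
  console.log(status, describeProviderStatus(status));
}
```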

Image content is described inaccurately

Vision models are more likely to hallucinate details when the image is low-resolution or heavily compressed. Try a larger original, or rewrite the Analysis Prompt to be more specific about what you want the AI to look for (for example, "Focus only on the text in the receipt and list every line item with its price").

Last reviewed by Sage on 2026-05-21. Spotted an issue? Open a ticket and we'll patch the doc.