Mythos

MiniMax CLI (mmx) is an open-source Command Line Interface (CLI) for the MiniMax AI Platform, built in TypeScript and distributed via npm. It exposes MiniMax's full multimodal API (text, image, video, speech, music, vision, and web search) through terminal commands. The CLI supports both the global endpoint (api.minimax.io) and the China region endpoint (api.minimaxi.com), with automatic region detection.

Capabilities

  • Text: multi-turn chat with streaming and JSON output
  • Image: text-to-image generation with aspect ratio controls
  • Video: asynchronous generation with progress tracking and download
  • Speech: text-to-speech with 30+ voices and speed adjustment
  • Music: lyrics-based generation, instrumental mode, and cover creation
  • Vision: image analysis and description
  • Search: web search via the MiniMax API

Agent Tool Integration

The CLI exports tool schemas via mmx config export-schema, making it usable as an AI agent tool through Model Context Protocol (MCP) or similar orchestration layers. This is the feature that elevates it beyond a standard CLI: it turns MiniMax's entire multimodal stack into callable tools for agentic workflows. In a Claude Code or BrianBot context, an agent could generate images, produce speech, or run video generation as part of a larger pipeline.

Architecture & Dependencies

Minimal footprint: one runtime dependency (@clack/prompts for interactive UI), Node 18+, ES modules. Auth supports both an API key (stored in ~/.mmx/config.json with 0o600 permissions) and OAuth with PKCE. The self-update mechanism downloads binaries from GitHub releases and verifies them against SHA256 checksums.
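
As an illustration of the checksum step, here is a minimal sketch of SHA256 verification using Node's built-in crypto module. The helper name verifySha256 is hypothetical and this is not taken from the mmx source; how the CLI actually obtains the expected digest is not covered by this note.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not from the mmx codebase): returns true when the
// SHA-256 digest of `data` equals the expected hex string.
function verifySha256(data: Uint8Array, expectedHex: string): boolean {
  const actual = createHash("sha256").update(data).digest();
  const expected = Buffer.from(expectedHex.trim().toLowerCase(), "hex");
  // Buffer.from(..., "hex") stops at the first invalid character instead of
  // throwing, so compare lengths before the constant-time comparison
  // (timingSafeEqual throws on unequal lengths).
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(actual, expected);
}
```

A caller would refuse to install the downloaded binary whenever this returns false.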

Security Profile (Scanned April 2026)

No critical vulnerabilities. The codebase demonstrates solid security fundamentals: no command injection vectors, no unsafe deserialization, proper file permissions, PKCE for OAuth, and a clean supply chain with no postinstall hooks. There are four medium-severity findings, all of which matter most when the CLI is invoked by an untrusted agent rather than a human operator:

  • Path traversal: the --out and --out-dir flags pass user-controlled paths directly to createWriteStream without sanitization, allowing writes to arbitrary filesystem locations
  • Verbose mode credential leakage: --verbose logs full HTTP headers to stderr, including the Bearer token
  • HTTP scheme acceptance: base_url config validation accepts http://, allowing unencrypted credential transmission
  • Indirect SSRF: user-supplied image/audio URLs are forwarded to the MiniMax API without scheme or IP validation
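
The two configuration-side findings (verbose header logging and http:// acceptance) could plausibly be addressed with small guards like the ones below. This is a sketch under assumptions, not the project's fix: the function names and the list of sensitive header names are hypothetical, and the real logging and config code may be structured differently.

```typescript
// Assumed set of credential-bearing header names, compared case-insensitively.
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "proxy-authorization",
  "cookie",
  "x-api-key",
]);

// Hypothetical helper: returns a copy of the headers that is safe to log,
// with credential values replaced by a placeholder.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    safe[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return safe;
}

// Hypothetical helper: parses a base_url value and rejects anything
// that is not https, so the Bearer token never travels unencrypted.
function assertHttpsBaseUrl(raw: string): URL {
  const url = new URL(raw); // throws TypeError on malformed input
  if (url.protocol !== "https:") {
    throw new Error(`base_url must use https, got ${url.protocol}`);
  }
  return url;
}
```

Verbose output would then print redactHeaders(headers) rather than the raw header object, and config loading would call assertHttpsBaseUrl before storing base_url.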

The path traversal and SSRF findings are the ones to watch in an agentic context. If an untrusted LLM controls the CLI's inputs via the exported tool schema, it could write files to unintended locations or probe internal network endpoints through the API.
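
For the SSRF side, a pre-flight check on user-supplied media URLs might look like the following sketch. The names isAllowedMediaUrl and isPrivateIPv4 are hypothetical, and the policy (https only, no loopback, private, or link-local IPv4 literals, no IPv6 literals) is an assumption chosen for illustration. It only inspects the literal URL; a hostname that resolves to an internal address would still pass, so this reduces the risk rather than eliminating it.

```typescript
import { isIP } from "node:net";

// Hypothetical helper: true for loopback, private, link-local, and
// "this network" IPv4 ranges.
function isPrivateIPv4(host: string): boolean {
  const [a, b] = host.split(".").map(Number);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical policy check for image/audio URLs before they are
// forwarded to the API.
function isAllowedMediaUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  const kind = isIP(host);
  if (kind === 4) return !isPrivateIPv4(host);
  if (kind === 6) return false; // conservative: reject all IPv6 literals
  return true;
}
```

Because the URL parser normalizes alternative IPv4 spellings (hex or single-integer forms) into dotted decimal, the range check sees the canonical address.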

The agent tool schema export is the interesting play here. Most AI platform CLIs are human-facing wrappers around REST APIs; this one explicitly positions itself as agent-callable infrastructure. That's the right instinct, but the input validation hasn't caught up with the trust model shift. When a human types --out ./video.mp4, path traversal is a non-issue. When an untrusted LLM generates that flag value, it becomes a real attack surface.
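
One way an agent harness could close that gap is to confine any --out value to a designated output directory before it reaches the CLI. The sketch below uses a hypothetical resolveInside helper built on Node's path module; it is an assumption about a reasonable mitigation, not how mmx handles paths today, and it does not follow symlinks (a stricter version would also compare fs.realpath results).

```typescript
import * as path from "node:path";

// Hypothetical helper: resolves `userPath` against `baseDir` and throws
// if the result would land outside `baseDir`.
function resolveInside(baseDir: string, userPath: string): string {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  const rel = path.relative(base, target);
  const escapes =
    rel === ".." ||
    rel.startsWith(".." + path.sep) ||
    path.isAbsolute(rel); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`output path escapes ${base}: ${userPath}`);
  }
  return target;
}
```

With this in place, a model-generated value such as ../../etc/cron.d/job is rejected, while video.mp4 or clips/intro.mp4 resolve to paths under the chosen base directory.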

Worth evaluating as a tool in the BrianBot ecosystem if MiniMax's multimodal capabilities prove useful for content pipelines, but the security hardening for agent-mode use should land first.


Created with 💜 by One Inc | Copyright 2026