diff --git a/README.md b/README.md
index 92ccce1d..fb4f1657 100644
--- a/README.md
+++ b/README.md
@@ -18,12 +18,13 @@
 ## Features
 
 - **Text** — Multi-turn chat, streaming, system prompts, JSON output
-- **Image** — Text-to-image with aspect ratio and batch controls
+- **Image** — Text-to-image with aspect ratio, batch controls, and fine-grained sizing
 - **Video** — Async video generation with progress tracking
-- **Speech** — TTS with 30+ voices, speed control, streaming playback
+- **Speech** — TTS with 30+ voices, speed control, streaming playback, SRT subtitles
 - **Music** — Text-to-music with lyrics, instrumental mode, auto lyrics, and cover generation from reference audio
 - **Vision** — Image understanding and description
 - **Search** — Web search powered by MiniMax
+- **File** — Upload, list, and delete files in MiniMax storage
 - **Dual Region** — Seamless Global (`api.minimax.io`) and CN (`api.minimaxi.com`) support
 
 <img src="https://file.cdn.minimax.io/public/MMX-CLI.png" alt="MiniMax" width="100%" />
@@ -60,12 +61,16 @@ mmx music generate --prompt "Upbeat pop" --lyrics "[verse] La da dee, sunny day"
 mmx search "MiniMax AI latest news"
 mmx vision photo.jpg
 mmx quota
+
+# File management
+mmx file upload --file doc.pdf
+mmx file list
+mmx file delete --file-id 123456789
 ```
 
 ## Commands
 
 ### `mmx text`
-
 ```bash
 mmx text chat --message "Write a poem"
 mmx text chat --model MiniMax-M2.7-highspeed --message "Hello" --stream
@@ -80,10 +85,13 @@ cat messages.json | mmx text chat --messages-file - --output json
 mmx image "A cat in a spacesuit"
 mmx image generate --prompt "A cat" --n 3 --aspect-ratio 16:9
 mmx image generate --prompt "Logo" --out-dir ./out/
+# Fine-grained sizing
+mmx image generate --prompt "A photo" --width 1024 --height 768
+# With prompt optimizer
+mmx image generate --prompt "cat spacesuit" --prompt-optimizer
 ```
 
 ### `mmx video`
-
 ```bash
 mmx video generate --prompt "Ocean waves at sunset" --download sunset.mp4
 mmx video generate --prompt "A robot painting" --async
@@ -98,6 +106,9 @@ mmx speech synthesize --text "Hello!" --out hello.mp3
 mmx speech synthesize --text "Stream me" --stream | mpv -
 mmx speech synthesize --text "Hi" --voice English_magnetic_voiced_man --speed 1.2
 echo "Breaking news" | mmx speech synthesize --text-file - --out news.mp3
+# With SRT subtitles (if voice model supports it)
+mmx speech synthesize --text "Hello world" --subtitles --out hello.mp3
+# saves hello.mp3 + hello.srt
 mmx speech voices
 ```
 
@@ -116,7 +127,6 @@ mmx music cover --prompt "Indie folk" --audio https://example.com/song.mp3 --out
 ```
 
 ### `mmx vision`
-
 ```bash
 mmx vision photo.jpg
 mmx vision describe --image https://example.com/img.jpg --prompt "What breed?"
@@ -130,6 +140,21 @@ mmx search "MiniMax AI"
 mmx search query --q "latest news" --output json
 ```
 
+### `mmx file`
+
+```bash
+# Upload a file (returns file ID for use with vision, speech, etc.)
+mmx file upload --file doc.pdf
+mmx file upload --file image.png --purpose vision
+
+# List all uploaded files
+mmx file list
+mmx file list --output json
+
+# Delete a file by ID
+mmx file delete --file-id 123456789
+```
+
 ### `mmx auth`
 
 ```bash
@@ -153,6 +178,7 @@ is auto-detected by probing both Global and CN.
 
 ```bash
 mmx quota
+mmx quotas                                  # show detailed quota breakdown per modality
 mmx config show
 mmx config set --key region --value cn
 mmx config set --key default-text-model --value MiniMax-M2.7-highspeed
@@ -162,8 +188,8 @@ mmx config export-schema | jq .
 ### `mmx update`
 
 ```bash
-mmx update
-mmx update latest
+mmx update                                 # update to latest stable
+mmx update latest                          # update to latest pre-release
 ```
 
 ## Thanks to
diff --git a/skill/SKILL.md b/skill/SKILL.md
index 79b7a60b..c6abe417 100644
--- a/skill/SKILL.md
+++ b/skill/SKILL.md
@@ -5,28 +5,19 @@ description: Use mmx to generate text, images, video, speech, and music via the
 
 # MiniMax CLI — Agent Skill Guide
 
-Use `mmx` to generate text, images, video, speech, music, and perform web search via the MiniMax AI platform.
+- **Repo:** https://github.com/MiniMax-AI/cli
+- **NPM:** https://www.npmjs.com/package/mmx-cli
+- **Requires:** Node.js 18+, MiniMax Token Plan (Global or CN)
 
-## Prerequisites
+Use `mmx` to generate text, images, video, speech, and music, and to run web searches and manage file storage via the MiniMax AI platform.
+
+## Quick Install
 
 ```bash
-# Install
 npm install -g mmx-cli
-
-# Auth (OAuth persists to ~/.mmx/credentials.json, API key persists to ~/.mmx/config.json)
-mmx auth login --api-key sk-xxxxx
-
-# Verify active auth source
-mmx auth status
-
-# Or pass per-call
-mmx text chat --api-key sk-xxxxx --message "Hello"
+npx skills add MiniMax-AI/cli -y -g   # add as OpenClaw skill
 ```
 
-Region is auto-detected. Override with `--region global` or `--region cn`.
-
----
-
 ## Agent Flags
 
 Always use these flags in non-interactive (agent/CI) contexts:
@@ -42,6 +33,34 @@ Always use these flags in non-interactive (agent/CI) contexts:
 
 ---
 
+## Authentication
+
+```bash
+# Interactive — choose OAuth or paste an API key
+mmx auth login
+
+# Non-interactive: pass the API key directly
+mmx auth login --api-key sk-xxxxx
+
+# Skip the menu — auto-select OAuth for the given region
+mmx auth login --recommend --region=global   # → api.minimax.io
+mmx auth login --recommend --region=cn       # → api.minimaxi.com
+
+# Check auth status, refresh, or log out
+mmx auth status
+mmx auth refresh
+mmx auth logout
+```
+
+**Auth notes:**
+- API keys are stored in `~/.mmx/config.json` and OAuth credentials in `~/.mmx/credentials.json`, both separate from OpenClaw's own MiniMax OAuth config
+- OAuth and API-key are mutually exclusive; logging in with one clears the other
+- `--api-key` flag can be passed per-command to override stored auth
+- With an API key, region is auto-detected by probing both Global and CN endpoints
+- `mmx auth status` is the canonical way to verify active authentication
+
+---
+
 ## Commands
 
 ### text chat
@@ -54,12 +73,12 @@ mmx text chat --message <text> [flags]
 
 | Flag | Type | Description |
 |---|---|---|
-| `--message <text>` | string, **required**, repeatable | Message text. Prefix with `role:` to set role (e.g. `"system:You are helpful"`, `"user:Hello"`) |
+| `--message <text>` | string, **required**, repeatable | Message text. Prefix with `role:` to set role (`"system:"`, `"user:"`, `"assistant:"`) |
 | `--messages-file <path>` | string | JSON file with messages array. Use `-` for stdin |
 | `--system <text>` | string | System prompt |
 | `--model <model>` | string | Model ID (default: `MiniMax-M2.7`) |
 | `--max-tokens <n>` | number | Max tokens (default: 4096) |
-| `--temperature <n>` | number | Sampling temperature (0.0, 1.0] |
+| `--temperature <n>` | number | Sampling temperature in the range (0.0, 1.0] |
 | `--top-p <n>` | number | Nucleus sampling threshold |
 | `--stream` | boolean | Stream tokens (default: on in TTY) |
 | `--tool <json-or-path>` | string, repeatable | Tool definition JSON or file path |
@@ -78,7 +97,7 @@ mmx text chat \
 cat conversation.json | mmx text chat --messages-file - --output json
 ```
 
-**stdout**: response text (text mode) or full response object (json mode).
+**stdout:** response text (text mode) or full response object (json mode).
 
 ---
 
@@ -90,6 +109,8 @@ Generate images. Model: `image-01`.
 mmx image generate --prompt <text> [flags]
 ```
 
+Single-token shorthand: `mmx image "prompt here"`
+
 | Flag | Type | Description |
 |---|---|---|
 | `--prompt <text>` | string, **required** | Image description |
@@ -111,13 +132,19 @@ mmx image generate --prompt "A cat in a spacesuit" --output json --quiet
 
 mmx image generate --prompt "Logo" --n 3 --out-dir ./gen/ --quiet
 # stdout: saved file paths (one per line)
+
+# Fine-grained sizing
+mmx image generate --prompt "A photo" --width 1024 --height 768
+
+# With prompt optimizer
+mmx image generate --prompt "cat spacesuit" --prompt-optimizer
 ```
 
 ---
 
 ### video generate
 
-Generate video. Default model: `MiniMax-Hailuo-2.3`. This is an async task — by default it polls until completion.
+Generate video. Default model: `MiniMax-Hailuo-2.3`. Async task — polls until completion by default.
 
 ```bash
 mmx video generate --prompt <text> [flags]
@@ -135,13 +162,12 @@ mmx video generate --prompt <text> [flags]
 | `--poll-interval <seconds>` | number | Polling interval (default: 5) |
 
 ```bash
-# Non-blocking: get task ID
+# Non-blocking: get task ID immediately
 mmx video generate --prompt "A robot." --async --quiet
 # stdout: {"taskId":"..."}
 
-# Blocking: wait and get file path
+# Blocking: wait for completion, save to file
 mmx video generate --prompt "Ocean waves." --download ocean.mp4 --quiet
-# stdout: ocean.mp4
 ```
 
 ### video task get
@@ -164,12 +190,14 @@ mmx video download --file-id <id> [--out <path>]
 
 ### speech synthesize
 
-Text-to-speech. Default model: `speech-2.8-hd`. Max 10k chars.
+Text-to-speech. Default model: `speech-2.8-hd`. Max 10k chars. 30+ voices available.
 
 ```bash
 mmx speech synthesize --text <text> [flags]
 ```
 
+Single-token shorthand: `mmx speech "Hello!"`
+
 | Flag | Type | Description |
 |---|---|---|
 | `--text <text>` | string | Text to synthesize |
@@ -184,19 +212,27 @@ mmx speech synthesize --text <text> [flags]
 | `--bitrate <bps>` | number | Bitrate (default: 128000) |
 | `--channels <n>` | number | Audio channels (default: 1) |
 | `--language <code>` | string | Language boost |
-| `--subtitles` | boolean | Download and save subtitles as `.srt` file (alongside `--out` audio file). API must support subtitles for the selected model.
+| `--subtitles` | boolean | Save subtitles as an `.srt` file alongside the `--out` audio file. The selected model must support subtitles |
 | `--pronunciation <from/to>` | string, repeatable | Custom pronunciation |
 | `--sound-effect <effect>` | string | Add sound effect |
 | `--out <path>` | string | Save audio to file |
 | `--stream` | boolean | Stream raw audio to stdout |
 
 ```bash
+# Save to file
 mmx speech synthesize --text "Hello world" --out hello.mp3 --quiet
-# stdout: hello.mp3
 
-mmx speech synthesize --text "Hello" --subtitles --out hello.mp3
-# saves hello.mp3 + hello.srt (SRT subtitle file)
+# With SRT subtitles (if voice model supports it)
+mmx speech synthesize --text "Hello world" --subtitles --out hello.mp3
+# saves hello.mp3 + hello.srt
+
+# List all available voices
+mmx speech voices
 
+# Stream to audio player (pipe raw audio to mpv)
+mmx speech synthesize --text "Stream me" --stream | mpv -
+
+# From stdin
 echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3
 ```
 
@@ -204,30 +240,30 @@ echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3
 
 ### music generate
 
-Generate music. Responds well to rich, structured descriptions.
-
-**Model:** `music-2.6-free` — unlimited for API key users, RPM = 3.
+Generate music. Model: `music-2.6-free` — unlimited for API key users, RPM = 3. Responds well to rich, structured descriptions.
 
 ```bash
 mmx music generate --prompt <text> [--lyrics <text>] [flags]
 ```
 
+Single-token shorthand: `mmx music "Upbeat pop"`
+
 | Flag | Type | Description |
 |---|---|---|
 | `--prompt <text>` | string | Music style description (can be detailed) |
-| `--lyrics <text>` | string | Song lyrics with structure tags. Required unless `--instrumental` or `--lyrics-optimizer` is used. |
+| `--lyrics <text>` | string | Song lyrics with structure tags. Required unless `--instrumental` or `--lyrics-optimizer` is used |
 | `--lyrics-file <path>` | string | Read lyrics from file. Use `-` for stdin |
-| `--lyrics-optimizer` | boolean | Auto-generate lyrics from prompt. Cannot be used with `--lyrics` or `--instrumental`. |
-| `--instrumental` | boolean | Generate instrumental music (no vocals). Cannot be used with `--lyrics`. |
+| `--lyrics-optimizer` | boolean | Auto-generate lyrics from prompt. Cannot be used with `--lyrics` or `--instrumental` |
+| `--instrumental` | boolean | Generate instrumental music (no vocals) |
 | `--vocals <text>` | string | Vocal style, e.g. `"warm male baritone"`, `"bright female soprano"`, `"duet with harmonies"` |
 | `--genre <text>` | string | Music genre, e.g. folk, pop, jazz |
 | `--mood <text>` | string | Mood or emotion, e.g. warm, melancholic, uplifting |
 | `--instruments <text>` | string | Instruments to feature, e.g. `"acoustic guitar, piano"` |
 | `--tempo <text>` | string | Tempo description, e.g. fast, slow, moderate |
 | `--bpm <number>` | number | Exact tempo in beats per minute |
-| `--key <text>` | string | Musical key, e.g. C major, A minor, G sharp |
+| `--key <text>` | string | Musical key, e.g. C major, A minor |
 | `--avoid <text>` | string | Elements to avoid in the generated music |
-| `--use-case <text>` | string | Use case context, e.g. `"background music for video"`, `"theme song"` |
+| `--use-case <text>` | string | Use case context, e.g. `"background music for video"` |
 | `--structure <text>` | string | Song structure, e.g. `"verse-chorus-verse-bridge-chorus"` |
 | `--references <text>` | string | Reference tracks or artists, e.g. `"similar to Ed Sheeran"` |
 | `--extra <text>` | string | Additional fine-grained requirements |
@@ -238,10 +274,8 @@ mmx music generate --prompt <text> [--lyrics <text>] [flags]
 | `--out <path>` | string | Save audio to file |
 | `--stream` | boolean | Stream raw audio to stdout |
 
-At least one of `--prompt` or `--lyrics` is required.
-
 ```bash
-# With lyrics
+# With explicit lyrics
 mmx music generate --prompt "Upbeat pop" --lyrics "La la la..." --out song.mp3 --quiet
 
 # Auto-generate lyrics from prompt
@@ -250,7 +284,7 @@ mmx music generate --prompt "Upbeat pop about summer" --lyrics-optimizer --out s
 # Instrumental
 mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3 --quiet
 
-# Detailed prompt with vocal characteristics
+# Detailed prompt with vocals and instruments
 mmx music generate --prompt "Warm morning folk" \
   --vocals "male and female duet, harmonies in chorus" \
   --instruments "acoustic guitar, piano" \
@@ -263,9 +297,7 @@ mmx music generate --prompt "Warm morning folk" \
 
 ### music cover
 
-Generate a cover version of a song based on reference audio.
-
-**Model:** `music-cover-free` — unlimited for API key users, RPM = 3.
+Generate a cover version of a song based on reference audio. Model: `music-cover-free` — unlimited for API key users, RPM = 3.
 
 ```bash
 mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [flags]
@@ -274,10 +306,10 @@ mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [flags]
 | Flag | Type | Description |
 |---|---|---|
 | `--prompt <text>` | string, **required** | Target cover style, e.g. `"Indie folk, acoustic guitar, warm male vocal"` |
-| `--audio <url>` | string | URL of reference audio (mp3, wav, flac, etc. — 6s to 6min, max 50MB) |
+| `--audio <url>` | string | URL of reference audio (mp3, wav, flac — 6s to 6min, max 50MB) |
 | `--audio-file <path>` | string | Local reference audio file (auto base64-encoded) |
-| `--lyrics <text>` | string | Cover lyrics. If omitted, extracted from reference audio via ASR. |
-| `--lyrics-file <path>` | string | Read lyrics from file. Use `-` for stdin |
+| `--lyrics <text>` | string | Cover lyrics. If omitted, extracted from reference audio via ASR |
+| `--lyrics-file <path>` | string | Read lyrics from file |
 | `--seed <number>` | number | Random seed 0–1000000 for reproducible results |
 | `--format <fmt>` | string | Audio format: `mp3`, `wav`, `pcm` (default: `mp3`) |
 | `--sample-rate <hz>` | number | Sample rate (default: 44100) |
@@ -289,14 +321,16 @@ mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [flags]
 ```bash
 # Cover from URL
 mmx music cover --prompt "Indie folk, acoustic guitar, warm male vocal" \
-  --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 --out cover.mp3 --quiet
+  --audio https://filecdn.minimax.chat/public/example.mp3 --out cover.mp3 --quiet
 
 # Cover from local file with custom lyrics
 mmx music cover --prompt "Jazz, piano, slow" \
   --audio-file original.mp3 --lyrics-file lyrics.txt --out jazz_cover.mp3 --quiet
 
 # Reproducible result with seed
-mmx music cover --prompt "Pop, upbeat" --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 --seed 42 --out cover.mp3
+mmx music cover --prompt "Pop, upbeat" \
+  --audio https://filecdn.minimax.chat/public/example.mp3 \
+  --seed 42 --out cover.mp3
 ```
 
 ---
@@ -309,6 +343,8 @@ Image understanding via VLM. Provide either `--image` or `--file-id`, not both.
 mmx vision describe (--image <path-or-url> | --file-id <id>) [flags]
 ```
 
+Single-token shorthand: `mmx vision photo.jpg`
+
 | Flag | Type | Description |
 |---|---|---|
 | `--image <path-or-url>` | string | Local path or URL (auto base64-encoded) |
@@ -319,8 +355,6 @@ mmx vision describe (--image <path-or-url> | --file-id <id>) [flags]
 mmx vision describe --image photo.jpg --prompt "What breed?" --output json
 ```
 
-**stdout**: description text (text mode) or full response (json mode).
-
 ---
 
 ### search query
@@ -331,6 +365,8 @@ Web search via MiniMax.
 mmx search query --q <query>
 ```
 
+Single-token shorthand: `mmx search "query here"`
+
 | Flag | Type | Description |
 |---|---|---|
 | `--q <query>` | string, **required** | Search query |
@@ -341,6 +377,58 @@ mmx search query --q "MiniMax AI" --output json --quiet
 
 ---
 
+### file upload
+
+Upload a file to MiniMax storage. Use the returned `file_id` with vision, speech, or other file-dependent commands.
+
+```bash
+mmx file upload --file <path> [--purpose <purpose>] [flags]
+```
+
+| Flag | Type | Description |
+|---|---|---|
+| `--file <path>` | string, **required** | Local path to the file |
+| `--purpose <purpose>` | string | File purpose: `retrieval` (default) or `vision` |
+
+```bash
+mmx file upload --file doc.pdf
+mmx file upload --file image.png --purpose vision
+# stdout: { file_id, filename, purpose, bytes, created_at }
+```
+
+### file list
+
+List all uploaded files in MiniMax storage.
+
+```bash
+mmx file list [--output json]
+```
+
+```bash
+# Formatted table (default)
+mmx file list
+# ID         FILENAME  PURPOSE     SIZE_KB  CREATED
+# 123456789  doc.pdf   retrieval   2048.0   2026-05-27 10:30
+
+# JSON output
+mmx file list --output json
+```
+
+### file delete
+
+Delete an uploaded file by ID.
+
+```bash
+mmx file delete --file-id <id>
+```
+
+```bash
+mmx file delete --file-id 123456789
+# stdout: deleted
+```
+
+---
+
 ### quota show
 
 Display Token Plan usage and remaining quotas.
@@ -351,6 +439,20 @@ mmx quota show [--output json]
 
 ---
 
+## Single-Token Shorthand Commands
+
+MiniMax CLI supports quick single-token shortcuts for common operations:
+
+```bash
+mmx image "A cat in a spacesuit"           # image generate
+mmx speech "Hello!" --out hello.mp3        # speech synthesize
+mmx music "Upbeat pop" --out song.mp3      # music generate
+mmx vision photo.jpg                       # vision describe
+mmx search "MiniMax AI latest news"        # search query
+```
+
+---
+
 ## Tool Schema Export
 
 Export all commands as Anthropic/OpenAI-compatible JSON tool schemas:
@@ -363,8 +465,6 @@ mmx config export-schema
 mmx config export-schema --command "video generate"
 ```
 
-Use this to dynamically register mmx commands as tools in your agent framework.
-
 ---
 
 ## Exit Codes
@@ -381,47 +481,23 @@ Use this to dynamically register mmx commands as tools in your agent framework.
 
 ---
 
-## Piping Patterns
+## Configuration
 
 ```bash
-# stdout is always clean data — safe to pipe
-mmx text chat --message "Hi" --output json | jq '.content'
-
-# stderr has progress/spinners — discard if needed
-mmx video generate --prompt "Waves" 2>/dev/null
-
-# Chain: generate image → describe it
-URL=$(mmx image generate --prompt "A sunset" --quiet)
-mmx vision describe --image "$URL" --quiet
-
-# Async video workflow
-TASK=$(mmx video generate --prompt "A robot" --async --quiet | jq -r '.taskId')
-mmx video task get --task-id "$TASK" --output json
-mmx video download --task-id "$TASK" --out robot.mp4
+mmx config show                                                         # show current config
+mmx config set --key region --value cn                                  # set platform region
+mmx config set --key default-text-model --value MiniMax-M2.7-highspeed  # set default model
+mmx update                                                              # update CLI to latest
+mmx quota                                                               # show token plan usage
 ```
 
----
-
-## Configuration Precedence
-
-CLI flags → environment variables → `~/.mmx/config.json` → defaults.
-
-```bash
-# Persistent config
-mmx config set --key region --value cn
-mmx config show
-
-# Environment
-export MINIMAX_API_KEY=sk-xxxxx
-export MINIMAX_REGION=cn
-```
+**Config precedence:** CLI flags → environment variables → `~/.mmx/config.json` → defaults.
 
 ### Default Model Configuration
 
 Set per-modality defaults so you don't need `--model` every time:
 
 ```bash
-# Set defaults
 mmx config set --key default-text-model --value MiniMax-M2.7-highspeed
 mmx config set --key default-speech-model --value speech-2.8-hd
 mmx config set --key default-video-model --value MiniMax-Hailuo-2.3
@@ -437,4 +513,25 @@ mmx music generate --prompt "Upbeat pop" --instrumental
 mmx text chat --model MiniMax-M2.7 --message "Hello"
 ```
 
-**Resolution priority**: `--model` flag > config default > hardcoded fallback.
+**Resolution priority:** `--model` flag > config default > hardcoded fallback.
+
+---
+
+## Piping Patterns
+
+```bash
+# stdout is always clean data — safe to pipe
+mmx text chat --message "Hi" --output json | jq '.content'
+
+# stderr has progress/spinners — discard if needed
+mmx video generate --prompt "Waves" 2>/dev/null
+
+# Chain: generate image → describe it
+URL=$(mmx image generate --prompt "A sunset" --quiet)
+mmx vision describe --image "$URL" --quiet
+
+# Async video workflow
+TASK=$(mmx video generate --prompt "A robot" --async --quiet | jq -r '.taskId')
+mmx video task get --task-id "$TASK" --output json
+mmx video download --task-id "$TASK" --out robot.mp4
+```