feat: sync full workspace including web modules, docs, and configurations to Gitea

Optimized the root .gitignore to exclude virtual environments, node modules,
and temp folders to ensure clean and lightweight version tracking.

Co-authored-by: Cursor <cursoragent@cursor.com>
王冕
2026-06-09 18:12:25 +08:00
parent 351688006e
commit a27e3b8e43
1510 changed files with 162044 additions and 1517 deletions

@@ -0,0 +1,204 @@
---
name: baoyu-image-gen
description: AI image generation with OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images, aspect ratios. Sequential by default; parallel generation available on request. Use when user asks to generate, create, or draw images.
---
# Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI, Google, DashScope (Alibaba Cloud Tongyi Wanxiang) and Replicate providers.
## Script Directory
**Agent Execution**:
1. `SKILL_DIR` = this SKILL.md file's directory
2. Script path = `${SKILL_DIR}/scripts/main.ts`
## Step 0: Load Preferences ⛔ BLOCKING
**CRITICAL**: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
```bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
```
| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |
**CRITICAL**: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
| Path | Location |
|------|----------|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Project directory |
| `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | User home |
**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: `references/config/preferences-schema.md`
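The project-then-user lookup order can also be sketched in TypeScript. This is a minimal illustration, not the skill's own loader: `findExtendPath` and its injected `exists` predicate are hypothetical names, introduced so the priority logic can be exercised without touching the file system.

```typescript
import { join } from "node:path";

type ExtendLocation = { scope: "project" | "user"; file: string };

// Hypothetical helper: returns the first EXTEND.md that exists, project before user.
// A null result corresponds to the "Not found" row, i.e. first-time setup is needed.
function findExtendPath(
  cwd: string,
  home: string,
  exists: (file: string) => boolean,
): ExtendLocation | null {
  const candidates: ExtendLocation[] = [
    { scope: "project", file: join(cwd, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md") },
    { scope: "user", file: join(home, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md") },
  ];
  return candidates.find((c) => exists(c.file)) ?? null;
}
```

Passing `existsSync` from `node:fs` as the predicate would give the same answer as the two `test -f` checks above.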
## Usage
```bash
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (Alibaba Cloud Tongyi Wanxiang)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```
## Options
| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider google\|openai\|dashscope\|replicate` | Force provider (default: auto-detect, see Provider Selection) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; OpenAI: `gpt-image-1.5`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal (`gemini-3-pro-image-preview`, `gemini-3-flash-preview`, `gemini-3.1-flash-image-preview`), OpenAI edits (GPT Image models), and Replicate. If provider omitted: Google first, then OpenAI, then Replicate |
| `--n <count>` | Number of images |
| `--json` | JSON output |
## Environment Variables
| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `GOOGLE_API_KEY` | Google API key |
| `GEMINI_API_KEY` | Gemini API key (alias for `GOOGLE_API_KEY`) |
| `DASHSCOPE_API_KEY` | DashScope API key (Alibaba Cloud) |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: z-image-turbo) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: google/nano-banana-pro) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
**Load Priority**: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
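The environment part of this priority chain behaves like a first-wins merge: a value from a higher-priority source is never replaced by a lower one. The sketch below illustrates that idea only; `mergeEnvLayers` is a hypothetical helper, and the three sample layers are made-up values.

```typescript
type EnvLayer = Record<string, string | undefined>;

// Hypothetical helper: layers are passed from highest to lowest priority.
// A key set (non-empty) by an earlier layer is never overwritten by a later one.
function mergeEnvLayers(...layers: EnvLayer[]): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined && value !== "" && !(key in merged)) merged[key] = value;
    }
  }
  return merged;
}

// Illustrative values only.
const shellEnv = { GOOGLE_API_KEY: "from-shell" };
const projectEnvFile = { GOOGLE_API_KEY: "from-project", OPENAI_API_KEY: "from-project" };
const homeEnvFile = { OPENAI_API_KEY: "from-home", REPLICATE_API_TOKEN: "from-home" };

const effectiveEnv = mergeEnvLayers(shellEnv, projectEnvFile, homeEnvFile);
console.log(effectiveEnv);
```

Here the shell value wins for `GOOGLE_API_KEY`, the project file wins for `OPENAI_API_KEY`, and the home file only contributes `REPLICATE_API_TOKEN`.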
## Model Resolution
Model priority (highest → lowest), applies to all providers:
1. CLI flag: `--model <id>`
2. EXTEND.md: `default_model.[provider]`
3. Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
4. Built-in default
**EXTEND.md overrides env vars**. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.
**Agent MUST display model info** before each generation:
- Show: `Using [provider] / [model]`
- Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
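As a rough sketch of the four-level model priority and the required banner line, assuming the caller has already collected each source: `resolveModel`, `modelBanner`, and the `ModelSources` shape are hypothetical names used only for this illustration.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

type ModelSources = {
  cliModel: string | null; // value of --model, if given
  extendDefault: string | null; // EXTEND.md default_model.[provider]
  env: Record<string, string | undefined>; // e.g. process.env
  builtInDefault: string; // provider's built-in default
};

// Hypothetical helper: walks the priority list from highest to lowest.
function resolveModel(provider: Provider, sources: ModelSources): string {
  const envModel = sources.env[`${provider.toUpperCase()}_IMAGE_MODEL`];
  return sources.cliModel || sources.extendDefault || envModel || sources.builtInDefault;
}

// Hypothetical helper: the line the agent shows before each generation.
function modelBanner(provider: Provider, model: string): string {
  return `Using ${provider} / ${model}`;
}

const chosen = resolveModel("google", {
  cliModel: null,
  extendDefault: "gemini-3-pro-image-preview",
  env: { GOOGLE_IMAGE_MODEL: "gemini-3.1-flash-image-preview" },
  builtInDefault: "gemini-3-pro-image-preview",
});
console.log(modelBanner("google", chosen));
```

This mirrors the EXTEND.md-over-env example above: the EXTEND.md value is chosen even though `GOOGLE_IMAGE_MODEL` is set.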
### Replicate Models
Supported model formats:
- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`
Examples:
```bash
# Use Replicate default model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```
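The two model formats can be told apart with a small parser. This is a sketch under the assumption that neither part contains `/`, `:`, or whitespace; `parseReplicateModel` is a hypothetical helper and is not taken from the Replicate provider module.

```typescript
type ReplicateModelRef = { owner: string; name: string; version: string | null };

// Hypothetical helper: accepts `owner/name` or `owner/name:version`.
function parseReplicateModel(id: string): ReplicateModelRef {
  const match = /^([^\/:\s]+)\/([^\/:\s]+)(?::([^\/:\s]+))?$/.exec(id.trim());
  if (!match) throw new Error(`Invalid Replicate model: ${id}`);
  return { owner: match[1]!, name: match[2]!, version: match[3] ?? null };
}

console.log(parseReplicateModel("google/nano-banana-pro"));
```

A `null` version corresponds to the `owner/name` form; a non-null version corresponds to the pinned `owner/name:version` form.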
## Provider Selection
1. `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Replicate
2. `--provider` specified → use it (if `--ref`, must be `google`, `openai`, or `replicate`)
3. Only one API key available → use that provider
4. Multiple available → use the first available in the order Google, OpenAI, DashScope, Replicate
## Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|--------|------------------|-------------|----------|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |
**Google imageSize**: Can be overridden with `--imageSize 1K|2K|4K`
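The relationship between the quality preset and the Google `imageSize` can be written as a lookup with an explicit override. This is an illustrative sketch; `resolveGoogleImageSize` is a hypothetical helper, not the provider's actual code.

```typescript
type Quality = "normal" | "2k";
type ImageSize = "1K" | "2K" | "4K";

// Mapping taken from the Quality Presets table.
const GOOGLE_SIZE_BY_QUALITY: Record<Quality, ImageSize> = { normal: "1K", "2k": "2K" };

// Hypothetical helper: an explicit --imageSize wins; otherwise derive from quality (default 2k).
function resolveGoogleImageSize(quality: Quality | null, imageSize: ImageSize | null): ImageSize {
  return imageSize ?? GOOGLE_SIZE_BY_QUALITY[quality ?? "2k"];
}

console.log(resolveGoogleImageSize("normal", null));
```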
## Aspect Ratios
Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
- Google multimodal: uses `imageConfig.aspectRatio`
- Google Imagen: uses `aspectRatio` parameter
- OpenAI: maps to closest supported size
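One way to map a ratio to the closest supported size is to compare ratios on a logarithmic scale. The sketch below is an assumption about how such a mapping could work, not the OpenAI provider's code; `closestSize` is hypothetical and the candidate size list is an assumed example.

```typescript
// Assumed candidate sizes for illustration; the real provider may use a different list.
const CANDIDATE_SIZES: readonly string[] = ["1024x1024", "1536x1024", "1024x1536"];

// Parses "16:9" or "2.35:1" into a width/height ratio; returns null if malformed.
function parseRatio(ar: string): number | null {
  const m = /^(\d+(?:\.\d+)?):(\d+(?:\.\d+)?)$/.exec(ar.trim());
  if (!m) return null;
  const w = Number(m[1]);
  const h = Number(m[2]);
  return w > 0 && h > 0 ? w / h : null;
}

// Hypothetical helper: picks the size whose ratio is nearest on a log scale,
// so that 2:1 and 1:2 are treated as equally far from square.
function closestSize(ar: string, sizes: readonly string[] = CANDIDATE_SIZES): string | null {
  const target = parseRatio(ar);
  if (target === null) return null;
  let best: string | null = null;
  let bestDist = Infinity;
  for (const size of sizes) {
    const [w, h] = size.split("x").map(Number);
    if (!w || !h) continue;
    const dist = Math.abs(Math.log(w / h) - Math.log(target));
    if (dist < bestDist) {
      bestDist = dist;
      best = size;
    }
  }
  return best;
}

console.log(closestSize("16:9"));
```

A `null` result signals an unparseable ratio, which the caller could treat as the "warning, proceed with default" case.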
## Generation Mode
**Default**: Sequential generation (one image at a time). This ensures stable output and easier debugging.
**Parallel Generation**: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
**Parallel Settings** (when requested):
| Setting | Value |
|---------|-------|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
**Agent Implementation** (parallel mode only):
```
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
```
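The Task tool itself is agent-side and is not shown here. As a generic illustration of the bounded concurrency described above (for example, at most 4 generations in flight), a small worker pool might look like the following; `runWithLimit` is a hypothetical helper and does not describe how subagents are actually launched.

```typescript
// Hypothetical helper: runs `worker` over all items with at most `limit` in flight,
// returning results in the same order as the input.
async function runWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  const runner = async (): Promise<void> => {
    // Each runner claims the next unprocessed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]!, index);
    }
  };
  const poolSize = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => runner()));
  return results;
}
```

With this shape, the recommended setting would be `limit = 4` and the maximum `limit = 8`; a rejected worker promise rejects the whole batch, so per-item error handling would belong inside `worker`.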
## Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry once (skipped for errors a retry cannot fix, such as a missing API key or unsupported reference images)
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint (switch to Google multimodal: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; or OpenAI GPT Image edits)
## Extension Support
Custom configurations via EXTEND.md. See **Step 0: Load Preferences** for paths and supported options.

@@ -0,0 +1,197 @@
---
name: first-time-setup
description: First-time setup and default model selection flow for baoyu-image-gen
---
# First-Time Setup
## Overview
Triggered when:
1. No EXTEND.md found → full setup (provider + model + preferences)
2. EXTEND.md found but `default_model.[provider]` is null → model selection only
## Setup Flow
```
  No EXTEND.md found         EXTEND.md found, model null
           │                            │
           ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ AskUserQuestion     │    │ AskUserQuestion      │
│ (full setup)        │    │ (model only)         │
└─────────────────────┘    └──────────────────────┘
           │                            │
           ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ Create EXTEND.md    │    │ Update EXTEND.md     │
└─────────────────────┘    └──────────────────────┘
           │                            │
           ▼                            ▼
        Continue                     Continue
```
## Flow 1: No EXTEND.md (Full Setup)
**Language**: Use user's input language or saved language preference.
Use AskUserQuestion with ALL questions in ONE call:
### Question 1: Default Provider
```yaml
header: "Provider"
question: "Default image generation provider?"
options:
  - label: "Google (Recommended)"
    description: "Gemini multimodal - high quality, reference images, flexible sizes"
  - label: "OpenAI"
    description: "GPT Image - consistent quality, reliable output"
  - label: "DashScope"
    description: "Alibaba Cloud - z-image-turbo, good for Chinese content"
  - label: "Replicate"
    description: "Community models - nano-banana-pro, flexible model selection"
```
### Question 2: Default Google Model
Only show if user selected Google or auto-detect (no explicit provider).
```yaml
header: "Google Model"
question: "Default Google image generation model?"
options:
  - label: "gemini-3-pro-image-preview (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image-preview"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```
### Question 3: Default Quality
```yaml
header: "Quality"
question: "Default image quality?"
options:
  - label: "2k (Recommended)"
    description: "2048px - covers, illustrations, infographics"
  - label: "normal"
    description: "1024px - quick previews, drafts"
```
### Question 4: Save Location
```yaml
header: "Save"
question: "Where to save preferences?"
options:
  - label: "Project (Recommended)"
    description: ".baoyu-skills/ (this project only)"
  - label: "User"
    description: "~/.baoyu-skills/ (all projects)"
```
### Save Locations
| Choice | Path | Scope |
|--------|------|-------|
| Project | `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Current project |
| User | `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | All projects |
### EXTEND.md Template
```yaml
---
version: 1
default_provider: [selected provider or null]
default_quality: [selected quality]
default_aspect_ratio: null
default_image_size: null
default_model:
  google: [selected google model or null]
  openai: null
  dashscope: null
  replicate: null
---
```
## Flow 2: EXTEND.md Exists, Model Null
When EXTEND.md exists but `default_model.[current_provider]` is null, ask ONLY the model question for the current provider.
### Google Model Selection
```yaml
header: "Google Model"
question: "Choose a default Google image generation model?"
options:
  - label: "gemini-3-pro-image-preview (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image-preview"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```
### OpenAI Model Selection
```yaml
header: "OpenAI Model"
question: "Choose a default OpenAI image generation model?"
options:
  - label: "gpt-image-1.5 (Recommended)"
    description: "Latest GPT Image model, high quality"
  - label: "gpt-image-1"
    description: "Previous generation GPT Image model"
```
### DashScope Model Selection
```yaml
header: "DashScope Model"
question: "Choose a default DashScope image generation model?"
options:
  - label: "z-image-turbo (Recommended)"
    description: "Fast generation, good quality"
  - label: "z-image-ultra"
    description: "Higher quality, slower generation"
```
### Replicate Model Selection
```yaml
header: "Replicate Model"
question: "Choose a default Replicate image generation model?"
options:
  - label: "google/nano-banana-pro (Recommended)"
    description: "Google's fast image model on Replicate"
  - label: "google/nano-banana"
    description: "Google's base image model on Replicate"
```
### Update EXTEND.md
After user selects a model:
1. Read existing EXTEND.md
2. If `default_model:` section exists → update the provider-specific key
3. If `default_model:` section missing → add the full section:
```yaml
default_model:
  google: [value or null]
  openai: [value or null]
  dashscope: [value or null]
  replicate: [value or null]
```
Only set the selected provider's model; leave others as their current value or null.
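The update steps can be sketched as a pure string transformation on the YAML between the `---` markers. This is a minimal illustration under stated assumptions: nested keys use two-space indentation, and `setDefaultModel` is a hypothetical helper rather than part of the skill's scripts.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";
const PROVIDERS: Provider[] = ["google", "openai", "dashscope", "replicate"];

// Hypothetical helper: returns the YAML with default_model.[provider] set to `model`.
function setDefaultModel(yaml: string, provider: Provider, model: string): string {
  const lines = yaml.replace(/\n+$/, "").split("\n");
  const quoted = JSON.stringify(model);
  const start = lines.findIndex((line) => /^default_model:\s*$/.test(line));
  if (start === -1) {
    // Section missing: append it in full, leaving the other providers null.
    const section = PROVIDERS.map((p) => `  ${p}: ${p === provider ? quoted : "null"}`);
    return [...lines, "default_model:", ...section].join("\n");
  }
  // Section present: find where its indented block ends.
  let end = start + 1;
  while (end < lines.length && /^\s+\S/.test(lines[end]!)) end++;
  const keyLine = `  ${provider}: ${quoted}`;
  const idx = lines.findIndex(
    (line, i) => i > start && i < end && line.trim().startsWith(`${provider}:`),
  );
  // Rewrite only the selected provider's key; other providers keep their current value.
  if (idx === -1) lines.splice(end, 0, keyLine);
  else lines[idx] = keyLine;
  return lines.join("\n");
}
```

Keeping the function free of file access means the read and write steps stay with the caller, which also decides whether the project or user copy of EXTEND.md is being updated.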
## After Setup
1. Create directory if needed
2. Write/update EXTEND.md with frontmatter
3. Confirm: "Preferences saved to [path]"
4. Continue with image generation

@@ -0,0 +1,69 @@
---
name: preferences-schema
description: EXTEND.md YAML schema for baoyu-image-gen user preferences
---
# Preferences Schema
## Full Schema
```yaml
---
version: 1
default_provider: null      # google|openai|dashscope|replicate|null (null = auto-detect)
default_quality: null       # normal|2k|null (null = use default: 2k)
default_aspect_ratio: null  # "16:9"|"9:16"|"1:1"|"4:3"|"3:4"|"2.35:1"|null
default_image_size: null    # 1K|2K|4K|null (Google only, overrides quality)
default_model:
  google: null      # e.g., "gemini-3-pro-image-preview", "gemini-3.1-flash-image-preview"
  openai: null      # e.g., "gpt-image-1.5"
  dashscope: null   # e.g., "z-image-turbo"
  replicate: null   # e.g., "google/nano-banana-pro"
---
```
## Field Reference
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `version` | int | 1 | Schema version |
| `default_provider` | string\|null | null | Default provider (null = auto-detect) |
| `default_quality` | string\|null | null | Default quality (null = 2k) |
| `default_aspect_ratio` | string\|null | null | Default aspect ratio |
| `default_image_size` | string\|null | null | Google image size (overrides quality) |
| `default_model.google` | string\|null | null | Google default model |
| `default_model.openai` | string\|null | null | OpenAI default model |
| `default_model.dashscope` | string\|null | null | DashScope default model |
| `default_model.replicate` | string\|null | null | Replicate default model |
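The enumerated fields in this table lend themselves to a small validation pass after parsing. The following is a sketch only; `validatePreferences` is a hypothetical helper, and the allowed values are copied from the schema comments above.

```typescript
type RawConfig = Record<string, unknown>;

const ALLOWED_PROVIDERS = ["google", "openai", "dashscope", "replicate"];
const ALLOWED_QUALITIES = ["normal", "2k"];
const ALLOWED_IMAGE_SIZES = ["1K", "2K", "4K"];

// Hypothetical helper: returns a list of problems; an empty list means the config looks valid.
function validatePreferences(cfg: RawConfig): string[] {
  const errors: string[] = [];
  const oneOf = (field: string, allowed: string[]): void => {
    const value = cfg[field];
    if (value === null || value === undefined) return; // null means "use the default"
    if (typeof value !== "string" || !allowed.includes(value)) {
      errors.push(`${field}: expected one of ${allowed.join("|")} or null`);
    }
  };
  oneOf("default_provider", ALLOWED_PROVIDERS);
  oneOf("default_quality", ALLOWED_QUALITIES);
  oneOf("default_image_size", ALLOWED_IMAGE_SIZES);
  const version = cfg["version"];
  if (version !== undefined && !Number.isInteger(version)) {
    errors.push("version: expected an integer");
  }
  return errors;
}

console.log(validatePreferences({ version: 1, default_provider: "google", default_quality: "2k" }));
```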
## Examples
**Minimal**:
```yaml
---
version: 1
default_provider: google
default_quality: 2k
---
```
**Full**:
```yaml
---
version: 1
default_provider: google
default_quality: 2k
default_aspect_ratio: "16:9"
default_image_size: 2K
default_model:
  google: "gemini-3-pro-image-preview"
  openai: "gpt-image-1.5"
  dashscope: "z-image-turbo"
  replicate: "google/nano-banana-pro"
---
```

@@ -0,0 +1,497 @@
import path from "node:path";
import process from "node:process";
import { homedir } from "node:os";
import { access, mkdir, readFile, writeFile } from "node:fs/promises";
import type { CliArgs, Provider, ExtendConfig } from "./types";
// Prints CLI help text.
function printUsage(): void {
  console.log(`Usage:
  npx -y bun scripts/main.ts --prompt "A cat" --image cat.png
  npx -y bun scripts/main.ts --prompt "A landscape" --image landscape.png --ar 16:9
  npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png
Options:
  -p, --prompt <text>       Prompt text
  --promptfiles <files...>  Read prompt from files (concatenated)
  --image <path>            Output image path (required)
  --provider google|openai|dashscope|replicate  Force provider (auto-detect by default)
  -m, --model <id>          Model ID
  --ar <ratio>              Aspect ratio (e.g., 16:9, 1:1, 4:3)
  --size <WxH>              Size (e.g., 1024x1024)
  --quality normal|2k       Quality preset (default: 2k)
  --imageSize 1K|2K|4K      Image size for Google (default: from quality)
  --ref <files...>          Reference images (Google multimodal or OpenAI edits)
  --n <count>               Number of images (default: 1)
  --json                    JSON output
  -h, --help                Show help
Environment variables:
  OPENAI_API_KEY            OpenAI API key
  GOOGLE_API_KEY            Google API key
  GEMINI_API_KEY            Gemini API key (alias for GOOGLE_API_KEY)
  DASHSCOPE_API_KEY         DashScope API key (Alibaba Cloud Tongyi Wanxiang)
  REPLICATE_API_TOKEN       Replicate API token
  OPENAI_IMAGE_MODEL        Default OpenAI model (gpt-image-1.5)
  GOOGLE_IMAGE_MODEL        Default Google model (gemini-3-pro-image-preview)
  DASHSCOPE_IMAGE_MODEL     Default DashScope model (z-image-turbo)
  REPLICATE_IMAGE_MODEL     Default Replicate model (google/nano-banana-pro)
  OPENAI_BASE_URL           Custom OpenAI endpoint
  OPENAI_IMAGE_USE_CHAT     Use /chat/completions instead of /images/generations (true|false)
  GOOGLE_BASE_URL           Custom Google endpoint
  DASHSCOPE_BASE_URL        Custom DashScope endpoint
  REPLICATE_BASE_URL        Custom Replicate endpoint
Load priority: CLI args > EXTEND.md > process.env > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env`);
}
// Parses CLI arguments into a CliArgs object; throws on unknown or malformed options.
function parseArgs(argv: string[]): CliArgs {
  const out: CliArgs = {
    prompt: null,
    promptFiles: [],
    imagePath: null,
    provider: null,
    model: null,
    aspectRatio: null,
    size: null,
    quality: null,
    imageSize: null,
    referenceImages: [],
    n: 1,
    json: false,
    help: false,
  };
  const positional: string[] = [];
  // Collects every value after index i up to the next option (anything starting with "-").
  const takeMany = (i: number): { items: string[]; next: number } => {
    const items: string[] = [];
    let j = i + 1;
    while (j < argv.length) {
      const v = argv[j]!;
      if (v.startsWith("-")) break;
      items.push(v);
      j++;
    }
    return { items, next: j - 1 };
  };
  for (let i = 0; i < argv.length; i++) {
    const a = argv[i]!;
    if (a === "--help" || a === "-h") {
      out.help = true;
      continue;
    }
    if (a === "--json") {
      out.json = true;
      continue;
    }
    if (a === "--prompt" || a === "-p") {
      const v = argv[++i];
      if (!v) throw new Error(`Missing value for ${a}`);
      out.prompt = v;
      continue;
    }
    if (a === "--promptfiles") {
      const { items, next } = takeMany(i);
      if (items.length === 0) throw new Error("Missing files for --promptfiles");
      out.promptFiles.push(...items);
      i = next;
      continue;
    }
    if (a === "--image") {
      const v = argv[++i];
      if (!v) throw new Error("Missing value for --image");
      out.imagePath = v;
      continue;
    }
    if (a === "--provider") {
      const v = argv[++i];
      if (v !== "google" && v !== "openai" && v !== "dashscope" && v !== "replicate") throw new Error(`Invalid provider: ${v}`);
      out.provider = v;
      continue;
    }
    if (a === "--model" || a === "-m") {
      const v = argv[++i];
      if (!v) throw new Error(`Missing value for ${a}`);
      out.model = v;
      continue;
    }
    if (a === "--ar") {
      const v = argv[++i];
      if (!v) throw new Error("Missing value for --ar");
      out.aspectRatio = v;
      continue;
    }
    if (a === "--size") {
      const v = argv[++i];
      if (!v) throw new Error("Missing value for --size");
      out.size = v;
      continue;
    }
    if (a === "--quality") {
      const v = argv[++i];
      if (v !== "normal" && v !== "2k") throw new Error(`Invalid quality: ${v}`);
      out.quality = v;
      continue;
    }
    if (a === "--imageSize") {
      const v = argv[++i]?.toUpperCase();
      if (v !== "1K" && v !== "2K" && v !== "4K") throw new Error(`Invalid imageSize: ${v}`);
      out.imageSize = v;
      continue;
    }
    if (a === "--ref" || a === "--reference") {
      const { items, next } = takeMany(i);
      if (items.length === 0) throw new Error(`Missing files for ${a}`);
      out.referenceImages.push(...items);
      i = next;
      continue;
    }
    if (a === "--n") {
      const v = argv[++i];
      if (!v) throw new Error("Missing value for --n");
      out.n = parseInt(v, 10);
      if (isNaN(out.n) || out.n < 1) throw new Error(`Invalid count: ${v}`);
      continue;
    }
    if (a.startsWith("-")) {
      throw new Error(`Unknown option: ${a}`);
    }
    positional.push(a);
  }
  // Bare positional words act as the prompt when no prompt option was given.
  if (!out.prompt && out.promptFiles.length === 0 && positional.length > 0) {
    out.prompt = positional.join(" ");
  }
  return out;
}
// Reads a KEY=VALUE .env file; a missing or unreadable file yields an empty object.
async function loadEnvFile(p: string): Promise<Record<string, string>> {
  try {
    const content = await readFile(p, "utf8");
    const env: Record<string, string> = {};
    for (const line of content.split("\n")) {
      const trimmed = line.trim();
      if (!trimmed || trimmed.startsWith("#")) continue;
      const idx = trimmed.indexOf("=");
      if (idx === -1) continue;
      const key = trimmed.slice(0, idx).trim();
      let val = trimmed.slice(idx + 1).trim();
      // Strip one pair of matching surrounding quotes.
      if ((val.startsWith('"') && val.endsWith('"')) || (val.startsWith("'") && val.endsWith("'"))) {
        val = val.slice(1, -1);
      }
      env[key] = val;
    }
    return env;
  } catch {
    return {};
  }
}
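// Fills process.env from the .env files without overriding variables that are already set.
async function loadEnv(): Promise<void> {
  const home = homedir();
  const cwd = process.cwd();
  const homeEnv = await loadEnvFile(path.join(home, ".baoyu-skills", ".env"));
  const cwdEnv = await loadEnvFile(path.join(cwd, ".baoyu-skills", ".env"));
  // Apply the project file first so that it takes priority over the home file,
  // matching the documented order: process.env > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env.
  for (const [k, v] of Object.entries(cwdEnv)) {
    if (!process.env[k]) process.env[k] = v;
  }
  for (const [k, v] of Object.entries(homeEnv)) {
    if (!process.env[k]) process.env[k] = v;
  }
}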
// Returns the text between the first pair of `---` lines, or null if there is none.
function extractYamlFrontMatter(content: string): string | null {
  const match = content.match(/^---\s*\n([\s\S]*?)\n---\s*$/m);
  return match?.[1] ?? null;
}
// Minimal line-based parser for the EXTEND.md keys; not a general YAML parser.
function parseSimpleYaml(yaml: string): Partial<ExtendConfig> {
  const config: Partial<ExtendConfig> = {};
  const lines = yaml.split("\n");
  let currentKey: string | null = null;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    if (trimmed.includes(":") && !trimmed.startsWith("-")) {
      const colonIdx = trimmed.indexOf(":");
      const key = trimmed.slice(0, colonIdx).trim();
      // Drop trailing inline comments (e.g. `null # google|openai`) so that
      // commented schema lines still parse as plain values.
      let value = trimmed.slice(colonIdx + 1).replace(/(^|\s)#.*$/, "").trim();
      if (value === "null" || value === "") {
        value = "null";
      }
      if (key === "version") {
        config.version = value === "null" ? 1 : parseInt(value, 10);
      } else if (key === "default_provider") {
        config.default_provider = value === "null" ? null : (value as Provider);
      } else if (key === "default_quality") {
        config.default_quality = value === "null" ? null : (value as "normal" | "2k");
      } else if (key === "default_aspect_ratio") {
        const cleaned = value.replace(/['"]/g, "");
        config.default_aspect_ratio = cleaned === "null" ? null : cleaned;
      } else if (key === "default_image_size") {
        config.default_image_size = value === "null" ? null : (value as "1K" | "2K" | "4K");
      } else if (key === "default_model") {
        config.default_model = { google: null, openai: null, dashscope: null, replicate: null };
        currentKey = "default_model";
      } else if (currentKey === "default_model" && (key === "google" || key === "openai" || key === "dashscope" || key === "replicate")) {
        const cleaned = value.replace(/['"]/g, "");
        config.default_model![key] = cleaned === "null" ? null : cleaned;
      }
    }
  }
  return config;
}
// Loads the first EXTEND.md with front matter, checking the project before the user home.
async function loadExtendConfig(): Promise<Partial<ExtendConfig>> {
  const home = homedir();
  const cwd = process.cwd();
  const paths = [
    path.join(cwd, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md"),
    path.join(home, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md"),
  ];
  for (const p of paths) {
    try {
      const content = await readFile(p, "utf8");
      const yaml = extractYamlFrontMatter(content);
      if (!yaml) continue;
      return parseSimpleYaml(yaml);
    } catch {
      continue;
    }
  }
  return {};
}
// CLI arguments take priority; EXTEND.md only fills in values that were not given.
function mergeConfig(args: CliArgs, extend: Partial<ExtendConfig>): CliArgs {
  return {
    ...args,
    provider: args.provider ?? extend.default_provider ?? null,
    quality: args.quality ?? extend.default_quality ?? null,
    aspectRatio: args.aspectRatio ?? extend.default_aspect_ratio ?? null,
    imageSize: args.imageSize ?? extend.default_image_size ?? null,
  };
}
// Concatenates prompt files, separated by a blank line.
async function readPromptFromFiles(files: string[]): Promise<string> {
  const parts: string[] = [];
  for (const f of files) {
    parts.push(await readFile(f, "utf8"));
  }
  return parts.join("\n\n");
}
// Reads a piped prompt from stdin; returns null for an interactive terminal or empty input.
async function readPromptFromStdin(): Promise<string | null> {
  if (process.stdin.isTTY) return null;
  try {
    const t = await Bun.stdin.text();
    const v = t.trim();
    return v.length > 0 ? v : null;
  } catch {
    return null;
  }
}
// Resolves the output path and appends ".png" when no extension is given.
function normalizeOutputImagePath(p: string): string {
  const full = path.resolve(p);
  const ext = path.extname(full);
  if (ext) return full;
  return `${full}.png`;
}
// Chooses the provider: explicit setting first, then ref-capable providers, then available API keys.
function detectProvider(args: CliArgs): Provider {
  if (args.referenceImages.length > 0 && args.provider && args.provider !== "google" && args.provider !== "openai" && args.provider !== "replicate") {
    throw new Error(
      "Reference images require a ref-capable provider. Use --provider google (Gemini multimodal), --provider openai (GPT Image edits), or --provider replicate."
    );
  }
  if (args.provider) return args.provider;
  const hasGoogle = !!(process.env.GOOGLE_API_KEY || process.env.GEMINI_API_KEY);
  const hasOpenai = !!process.env.OPENAI_API_KEY;
  const hasDashscope = !!process.env.DASHSCOPE_API_KEY;
  const hasReplicate = !!process.env.REPLICATE_API_TOKEN;
  if (args.referenceImages.length > 0) {
    if (hasGoogle) return "google";
    if (hasOpenai) return "openai";
    if (hasReplicate) return "replicate";
    throw new Error(
      "Reference images require Google, OpenAI or Replicate. Set GOOGLE_API_KEY/GEMINI_API_KEY, OPENAI_API_KEY, or REPLICATE_API_TOKEN, or remove --ref."
    );
  }
  const available = [hasGoogle && "google", hasOpenai && "openai", hasDashscope && "dashscope", hasReplicate && "replicate"].filter(Boolean) as Provider[];
  // With one or more keys set, the first entry in the order above wins (Google first).
  if (available.length > 0) return available[0]!;
  throw new Error(
    "No API key found. Set GOOGLE_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY, DASHSCOPE_API_KEY, or REPLICATE_API_TOKEN.\n" +
      "Create ~/.baoyu-skills/.env or <cwd>/.baoyu-skills/.env with your keys."
  );
}
// Fails fast when a reference image path does not exist.
async function validateReferenceImages(referenceImages: string[]): Promise<void> {
  for (const refPath of referenceImages) {
    const fullPath = path.resolve(refPath);
    try {
      await access(fullPath);
    } catch {
      throw new Error(`Reference image not found: ${fullPath}`);
    }
  }
}
type ProviderModule = {
  getDefaultModel: () => string;
  generateImage: (prompt: string, model: string, args: CliArgs) => Promise<Uint8Array>;
};
// Errors whose message contains one of these markers are configuration problems; retrying cannot help.
function isRetryableGenerationError(error: unknown): boolean {
  const msg = error instanceof Error ? error.message : String(error);
  const nonRetryableMarkers = [
    "Reference image",
    "not supported",
    "only supported",
    "No API key found",
    "is required",
  ];
  return !nonRetryableMarkers.some((marker) => msg.includes(marker));
}
// Lazily imports the module for the selected provider; OpenAI is the fallback.
async function loadProviderModule(provider: Provider): Promise<ProviderModule> {
  if (provider === "google") {
    return (await import("./providers/google")) as ProviderModule;
  }
  if (provider === "dashscope") {
    return (await import("./providers/dashscope")) as ProviderModule;
  }
  if (provider === "replicate") {
    return (await import("./providers/replicate")) as ProviderModule;
  }
  return (await import("./providers/openai")) as ProviderModule;
}
async function main(): Promise<void> {
  const args = parseArgs(process.argv.slice(2));
  if (args.help) {
    printUsage();
    return;
  }
  await loadEnv();
  const extendConfig = await loadExtendConfig();
  const mergedArgs = mergeConfig(args, extendConfig);
  if (!mergedArgs.quality) mergedArgs.quality = "2k";
  // Prompt sources, in order: --prompt (or positional words), --promptfiles, stdin.
  let prompt: string | null = mergedArgs.prompt;
  if (!prompt && mergedArgs.promptFiles.length > 0) prompt = await readPromptFromFiles(mergedArgs.promptFiles);
  if (!prompt) prompt = await readPromptFromStdin();
  if (!prompt) {
    console.error("Error: Prompt is required");
    printUsage();
    process.exitCode = 1;
    return;
  }
  if (!mergedArgs.imagePath) {
    console.error("Error: --image is required");
    printUsage();
    process.exitCode = 1;
    return;
  }
  if (mergedArgs.referenceImages.length > 0) {
    await validateReferenceImages(mergedArgs.referenceImages);
  }
  const provider = detectProvider(mergedArgs);
  const providerModule = await loadProviderModule(provider);
  // Model priority: --model, then EXTEND.md default_model, then the provider default.
  let model = mergedArgs.model;
  if (!model && extendConfig.default_model) {
    if (provider === "google") model = extendConfig.default_model.google ?? null;
    if (provider === "openai") model = extendConfig.default_model.openai ?? null;
    if (provider === "dashscope") model = extendConfig.default_model.dashscope ?? null;
    if (provider === "replicate") model = extendConfig.default_model.replicate ?? null;
  }
  model = model || providerModule.getDefaultModel();
  const outputPath = normalizeOutputImagePath(mergedArgs.imagePath);
  let imageData: Uint8Array;
  let retried = false;
  // Retry at most once, and only for errors that are considered retryable.
  while (true) {
    try {
      imageData = await providerModule.generateImage(prompt, model, mergedArgs);
      break;
    } catch (e) {
      if (!retried && isRetryableGenerationError(e)) {
        retried = true;
        console.error("Generation failed, retrying...");
        continue;
      }
      throw e;
    }
  }
  const dir = path.dirname(outputPath);
  await mkdir(dir, { recursive: true });
  await writeFile(outputPath, imageData);
  if (mergedArgs.json) {
    console.log(
      JSON.stringify(
        {
          savedImage: outputPath,
          provider,
          model,
          prompt: prompt.slice(0, 200),
        },
        null,
        2
      )
    );
  } else {
    console.log(outputPath);
  }
}
main().catch((e) => {
  const msg = e instanceof Error ? e.message : String(e);
  console.error(msg);
  process.exit(1);
});

@@ -0,0 +1,32 @@
export type Provider = "google" | "openai" | "dashscope" | "replicate";
export type Quality = "normal" | "2k";
export type CliArgs = {
  prompt: string | null;
  promptFiles: string[];
  imagePath: string | null;
  provider: Provider | null;
  model: string | null;
  aspectRatio: string | null;
  size: string | null;
  quality: Quality | null;
  imageSize: string | null;
  referenceImages: string[];
  n: number;
  json: boolean;
  help: boolean;
};
export type ExtendConfig = {
  version: number;
  default_provider: Provider | null;
  default_quality: Quality | null;
  default_aspect_ratio: string | null;
  default_image_size: "1K" | "2K" | "4K" | null;
  default_model: {
    google: string | null;
    openai: string | null;
    dashscope: string | null;
    replicate: string | null;
  };
};