feat: sync full workspace including web modules, docs, and configurations to Gitea
Optimized the root .gitignore to exclude virtual environments, node modules, and temp folders to ensure clean and lightweight version tracking.

Co-authored-by: Cursor <cursoragent@cursor.com>
204
axhub-make/skills/third-party/baoyu-image-gen/SKILL.md
vendored
Normal file
@@ -0,0 +1,204 @@
|
||||
---
|
||||
name: baoyu-image-gen
|
||||
description: AI image generation with OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images, aspect ratios. Sequential by default; parallel generation available on request. Use when user asks to generate, create, or draw images.
|
||||
---
|
||||
|
||||
# Image Generation (AI SDK)
|
||||
|
||||
Official API-based image generation. Supports OpenAI, Google, DashScope (阿里通义万象) and Replicate providers.
|
||||
|
||||
## Script Directory
|
||||
|
||||
**Agent Execution**:
|
||||
1. `SKILL_DIR` = this SKILL.md file's directory
|
||||
2. Script path = `${SKILL_DIR}/scripts/main.ts`
|
||||
|
||||
## Step 0: Load Preferences ⛔ BLOCKING
|
||||
|
||||
**CRITICAL**: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
|
||||
|
||||
Check EXTEND.md existence (priority: project → user):
|
||||
|
||||
```bash
|
||||
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
|
||||
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
|
||||
```
|
||||
|
||||
| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |
|
||||
|
||||
**CRITICAL**: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
|
||||
|
||||
| Path | Location |
|------|----------|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Project directory |
| `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | User home |
|
||||
|
||||
**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio | Default image size | Default models
|
||||
|
||||
Schema: `references/config/preferences-schema.md`
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
# Basic
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
|
||||
|
||||
# With aspect ratio
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
|
||||
|
||||
# High quality
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
|
||||
|
||||
# From prompt files
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
|
||||
|
||||
# With reference images (Google multimodal or OpenAI edits)
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
|
||||
|
||||
# With reference images (explicit provider/model)
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
|
||||
|
||||
# Specific provider
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
|
||||
|
||||
# DashScope (阿里通义万象)
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
|
||||
|
||||
# Replicate (google/nano-banana-pro)
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
|
||||
|
||||
# Replicate with specific model
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
|
||||
```
|
||||
|
||||
## Options
|
||||
|
||||
| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider google\|openai\|dashscope\|replicate` | Force provider (default: auto-detect; see Provider Selection) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; OpenAI: `gpt-image-1.5`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal (`gemini-3-pro-image-preview`, `gemini-3-flash-preview`, `gemini-3.1-flash-image-preview`) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI, then Replicate |
| `--n <count>` | Number of images (default: 1) |
| `--json` | JSON output |
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `GOOGLE_API_KEY` | Google API key (`GEMINI_API_KEY` is accepted as an alias) |
| `DASHSCOPE_API_KEY` | DashScope API key (Alibaba Cloud) |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: z-image-turbo) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: google/nano-banana-pro) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `OPENAI_IMAGE_USE_CHAT` | Use `/chat/completions` instead of `/images/generations` (`true\|false`) |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
|
||||
|
||||
**Load Priority**: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
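
The environment part of this priority chain (process environment, then the project `.env`, then the user `.env`) can be pictured as a "first defined value wins" merge. The sketch below is illustrative only: `layerEnv` and the sample values are hypothetical names, not part of the skill's scripts, and CLI args and EXTEND.md are assumed to be resolved separately, above this layer.

```typescript
// Hypothetical helper: merge env sources so that earlier sources win.
type EnvSource = Record<string, string | undefined>;

function layerEnv(...sources: EnvSource[]): Map<string, string> {
  const merged = new Map<string, string>();
  for (const source of sources) {
    for (const [key, value] of Object.entries(source)) {
      // Skip unset or empty values, and keys a higher-priority source already set.
      if (!value || merged.has(key)) continue;
      merged.set(key, value);
    }
  }
  return merged;
}

// Sources are passed from highest to lowest priority.
const effectiveEnv = layerEnv(
  { GOOGLE_IMAGE_MODEL: "from-shell" }, // process environment
  { GOOGLE_IMAGE_MODEL: "from-cwd", OPENAI_API_KEY: "cwd-key" }, // <cwd>/.baoyu-skills/.env
  { OPENAI_API_KEY: "home-key", REPLICATE_API_TOKEN: "home-token" }, // ~/.baoyu-skills/.env
);

console.log(effectiveEnv.get("GOOGLE_IMAGE_MODEL"), effectiveEnv.get("OPENAI_API_KEY"));
```

A lower-priority file only fills in keys that nothing above it has defined, so a user-level `.env` acts as a fallback rather than an override.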
|
||||
|
||||
## Model Resolution
|
||||
|
||||
Model priority (highest → lowest), applies to all providers:
|
||||
|
||||
1. CLI flag: `--model <id>`
|
||||
2. EXTEND.md: `default_model.[provider]`
|
||||
3. Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
|
||||
4. Built-in default
|
||||
|
||||
**EXTEND.md overrides env vars**. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.
|
||||
|
||||
**Agent MUST display model info** before each generation:
|
||||
- Show: `Using [provider] / [model]`
|
||||
- Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
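
The resolution order and the two display lines above can be sketched as follows. This is a minimal illustration, not the script's actual code: `resolveModel`, `describeModel`, and the `ModelSources` shape are hypothetical names introduced here.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Hypothetical container for the four places a model ID can come from.
interface ModelSources {
  cliModel: string | null; // --model <id>
  extendModel: string | null; // EXTEND.md default_model.[provider]
  envModel: string | undefined; // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string; // provider's built-in default
}

// Highest priority first: CLI flag, EXTEND.md, env var, built-in default.
function resolveModel(s: ModelSources): string {
  return s.cliModel || s.extendModel || s.envModel || s.builtinDefault;
}

// Build the two lines the agent is asked to show before each generation.
function describeModel(provider: Provider, model: string): string[] {
  return [
    `Using ${provider} / ${model}`,
    `Switch model: --model <id> | EXTEND.md default_model.${provider} | env ${provider.toUpperCase()}_IMAGE_MODEL`,
  ];
}

const resolvedModel = resolveModel({
  cliModel: null,
  extendModel: "gemini-3-pro-image-preview",
  envModel: "gemini-3.1-flash-image-preview",
  builtinDefault: "gemini-3-pro-image-preview",
});

console.log(describeModel("google", resolvedModel).join("\n"));
```

With no CLI flag, the EXTEND.md value is chosen even though the environment variable names a different model, which matches the rule that EXTEND.md overrides env vars.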
|
||||
|
||||
### Replicate Models
|
||||
|
||||
Supported model formats:
|
||||
|
||||
- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
|
||||
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`
|
||||
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
# Use Replicate default model
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
|
||||
|
||||
# Override model explicitly
|
||||
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
|
||||
```
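
For readers who want to validate a model ID before calling the script, the two formats listed above (`owner/name` and `owner/name:version`) could be checked with a small parser. This is a hedged sketch: `parseReplicateModel` and `ReplicateModelRef` are hypothetical names, and the Replicate provider module may parse IDs differently.

```typescript
// Hypothetical parsed form of a Replicate model ID.
interface ReplicateModelRef {
  owner: string;
  name: string;
  version: string | null; // null when the ID has no ":version" suffix
}

function parseReplicateModel(id: string): ReplicateModelRef {
  // owner/name, optionally followed by :version; no extra slashes, colons, or spaces.
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id.trim());
  if (!match) {
    throw new Error(`Invalid Replicate model id: ${id} (expected owner/name or owner/name:version)`);
  }
  const [, owner, name, version] = match;
  return { owner: owner!, name: name!, version: version ?? null };
}

console.log(parseReplicateModel("google/nano-banana-pro"));
```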
|
||||
|
||||
## Provider Selection
|
||||
|
||||
1. `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Replicate
|
||||
2. `--provider` specified → use it (if `--ref`, must be `google`, `openai`, or `replicate`)
|
||||
3. Only one API key available → use that provider
|
||||
4. Multiple available → default to Google
|
||||
|
||||
## Quality Presets
|
||||
|
||||
| Preset | Google imageSize | OpenAI Size | Use Case |
|--------|------------------|-------------|----------|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |
|
||||
|
||||
**Google imageSize**: Can be overridden with `--imageSize 1K|2K|4K`
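
The preset table and the `--imageSize` override can be read as a simple lookup with one escape hatch. The sketch below assumes that reading; `QUALITY_PRESETS` and `resolveGoogleImageSize` are hypothetical names, not identifiers from the script.

```typescript
type Quality = "normal" | "2k";
type GoogleImageSize = "1K" | "2K" | "4K";

// Values copied from the preset table; the object shape is an assumption.
const QUALITY_PRESETS: Record<Quality, { googleImageSize: GoogleImageSize; openaiPx: number }> = {
  normal: { googleImageSize: "1K", openaiPx: 1024 },
  "2k": { googleImageSize: "2K", openaiPx: 2048 },
};

// An explicit --imageSize wins; otherwise the size follows the quality preset (default 2k).
function resolveGoogleImageSize(quality: Quality | null, imageSize: GoogleImageSize | null): GoogleImageSize {
  if (imageSize) return imageSize;
  return QUALITY_PRESETS[quality ?? "2k"].googleImageSize;
}

console.log(resolveGoogleImageSize("normal", null), resolveGoogleImageSize("normal", "4K"));
```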
|
||||
|
||||
## Aspect Ratios
|
||||
|
||||
Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
|
||||
|
||||
- Google multimodal: uses `imageConfig.aspectRatio`
|
||||
- Google Imagen: uses `aspectRatio` parameter
|
||||
- OpenAI: maps to closest supported size
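
One way to picture "maps to closest supported size" is to compare the requested ratio against each candidate size on a log scale and pick the nearest. The sketch below is an assumption about the approach, not the OpenAI provider's actual logic; the size list `ASSUMED_OPENAI_SIZES` and the function names are hypothetical.

```typescript
// Assumed candidate sizes for illustration; the real provider may use a different list.
const ASSUMED_OPENAI_SIZES = ["1024x1024", "1536x1024", "1024x1536"] as const;
type AssumedOpenAISize = (typeof ASSUMED_OPENAI_SIZES)[number];

// Accepts "W:H" (e.g. "16:9", "2.35:1") or "WxH" (e.g. "1536x1024").
function parseRatio(text: string): number {
  const parts = text.split(/[:x]/).map(Number);
  const w = parts[0];
  const h = parts[1];
  if (parts.length !== 2 || w === undefined || h === undefined || !(w > 0) || !(h > 0)) {
    throw new Error(`Invalid aspect ratio: ${text}`);
  }
  return w / h;
}

function closestOpenAISize(aspectRatio: string): AssumedOpenAISize {
  // Compare in log space so that 2:1 and 1:2 are equally far from 1:1.
  const target = Math.log(parseRatio(aspectRatio));
  let best: AssumedOpenAISize = ASSUMED_OPENAI_SIZES[0];
  let bestDistance = Infinity;
  for (const size of ASSUMED_OPENAI_SIZES) {
    const distance = Math.abs(Math.log(parseRatio(size)) - target);
    if (distance < bestDistance) {
      best = size;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(closestOpenAISize("16:9"), closestOpenAISize("9:16"));
```

Under these assumed sizes, every landscape ratio wider than about 1.22:1 lands on the same landscape size, so ratios such as `2.35:1` would only be approximated.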
|
||||
|
||||
## Generation Mode
|
||||
|
||||
**Default**: Sequential generation (one image at a time). This ensures stable output and easier debugging.
|
||||
|
||||
**Parallel Generation**: Only use when user explicitly requests parallel/concurrent generation.
|
||||
|
||||
| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
|
||||
|
||||
**Parallel Settings** (when requested):
|
||||
|
||||
| Setting | Value |
|---------|-------|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
|
||||
|
||||
**Agent Implementation** (parallel mode only):
|
||||
```
|
||||
# Launch multiple generations in parallel using Task tool
|
||||
# Each Task runs as background subagent with run_in_background=true
|
||||
# Collect results via TaskOutput when all complete
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
- Missing API key → error with setup instructions
|
||||
- Generation failure → auto-retry once
|
||||
- Invalid aspect ratio → warning, proceed with default
|
||||
- Reference images with unsupported provider/model → error with fix hint (switch to Google multimodal: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; or OpenAI GPT Image edits)
|
||||
|
||||
## Extension Support
|
||||
|
||||
Custom configurations via EXTEND.md. See **Step 0: Load Preferences** for paths and supported options.
|
||||
197
axhub-make/skills/third-party/baoyu-image-gen/references/config/first-time-setup.md
vendored
Normal file
@@ -0,0 +1,197 @@
|
||||
---
|
||||
name: first-time-setup
|
||||
description: First-time setup and default model selection flow for baoyu-image-gen
|
||||
---
|
||||
|
||||
# First-Time Setup
|
||||
|
||||
## Overview
|
||||
|
||||
Triggered when:
|
||||
1. No EXTEND.md found → full setup (provider + model + preferences)
|
||||
2. EXTEND.md found but `default_model.[provider]` is null → model selection only
|
||||
|
||||
## Setup Flow
|
||||
|
||||
```
|
||||
No EXTEND.md found EXTEND.md found, model null
|
||||
│ │
|
||||
▼ ▼
|
||||
┌─────────────────────┐ ┌──────────────────────┐
|
||||
│ AskUserQuestion │ │ AskUserQuestion │
|
||||
│ (full setup) │ │ (model only) │
|
||||
└─────────────────────┘ └──────────────────────┘
|
||||
│ │
|
||||
▼ ▼
|
||||
┌─────────────────────┐ ┌──────────────────────┐
|
||||
│ Create EXTEND.md │ │ Update EXTEND.md │
|
||||
└─────────────────────┘ └──────────────────────┘
|
||||
│ │
|
||||
▼ ▼
|
||||
Continue Continue
|
||||
```
|
||||
|
||||
## Flow 1: No EXTEND.md (Full Setup)
|
||||
|
||||
**Language**: Use user's input language or saved language preference.
|
||||
|
||||
Use AskUserQuestion with ALL questions in ONE call:
|
||||
|
||||
### Question 1: Default Provider
|
||||
|
||||
```yaml
|
||||
header: "Provider"
|
||||
question: "Default image generation provider?"
|
||||
options:
|
||||
- label: "Google (Recommended)"
|
||||
description: "Gemini multimodal - high quality, reference images, flexible sizes"
|
||||
- label: "OpenAI"
|
||||
description: "GPT Image - consistent quality, reliable output"
|
||||
- label: "DashScope"
|
||||
description: "Alibaba Cloud - z-image-turbo, good for Chinese content"
|
||||
- label: "Replicate"
|
||||
description: "Community models - nano-banana-pro, flexible model selection"
|
||||
```
|
||||
|
||||
### Question 2: Default Google Model
|
||||
|
||||
Only show if user selected Google or auto-detect (no explicit provider).
|
||||
|
||||
```yaml
|
||||
header: "Google Model"
|
||||
question: "Default Google image generation model?"
|
||||
options:
|
||||
- label: "gemini-3-pro-image-preview (Recommended)"
|
||||
description: "Highest quality, best for production use"
|
||||
- label: "gemini-3.1-flash-image-preview"
|
||||
description: "Fast generation, good quality, lower cost"
|
||||
- label: "gemini-3-flash-preview"
|
||||
description: "Fast generation, balanced quality and speed"
|
||||
```
|
||||
|
||||
### Question 3: Default Quality
|
||||
|
||||
```yaml
|
||||
header: "Quality"
|
||||
question: "Default image quality?"
|
||||
options:
|
||||
- label: "2k (Recommended)"
|
||||
description: "2048px - covers, illustrations, infographics"
|
||||
- label: "normal"
|
||||
description: "1024px - quick previews, drafts"
|
||||
```
|
||||
|
||||
### Question 4: Save Location
|
||||
|
||||
```yaml
|
||||
header: "Save"
|
||||
question: "Where to save preferences?"
|
||||
options:
|
||||
- label: "Project (Recommended)"
|
||||
description: ".baoyu-skills/ (this project only)"
|
||||
- label: "User"
|
||||
description: "~/.baoyu-skills/ (all projects)"
|
||||
```
|
||||
|
||||
### Save Locations
|
||||
|
||||
| Choice | Path | Scope |
|--------|------|-------|
| Project | `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Current project |
| User | `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | All projects |
|
||||
|
||||
### EXTEND.md Template
|
||||
|
||||
```yaml
|
||||
---
|
||||
version: 1
|
||||
default_provider: [selected provider or null]
|
||||
default_quality: [selected quality]
|
||||
default_aspect_ratio: null
|
||||
default_image_size: null
|
||||
default_model:
|
||||
google: [selected google model or null]
|
||||
openai: null
|
||||
dashscope: null
|
||||
replicate: null
|
||||
---
|
||||
```
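
Filling in the template from the setup answers could look like the sketch below. It is only an illustration of the template above: `SetupAnswers` and `renderExtendMd` are hypothetical names, and the agent may well write the file by hand instead.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Hypothetical shape for the answers collected in Flow 1.
interface SetupAnswers {
  provider: Provider | null; // null = auto-detect
  quality: "normal" | "2k";
  googleModel: string | null; // only asked when Google or auto-detect was chosen
}

function renderExtendMd(answers: SetupAnswers): string {
  // Quote model IDs; write a bare null when nothing was selected.
  const quote = (value: string | null): string => (value === null ? "null" : JSON.stringify(value));
  return [
    "---",
    "version: 1",
    `default_provider: ${answers.provider ?? "null"}`,
    `default_quality: ${answers.quality}`,
    "default_aspect_ratio: null",
    "default_image_size: null",
    "default_model:",
    `  google: ${quote(answers.googleModel)}`,
    "  openai: null",
    "  dashscope: null",
    "  replicate: null",
    "---",
    "",
  ].join("\n");
}

const exampleExtendMd = renderExtendMd({
  provider: "google",
  quality: "2k",
  googleModel: "gemini-3-pro-image-preview",
});

console.log(exampleExtendMd);
```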
|
||||
|
||||
## Flow 2: EXTEND.md Exists, Model Null
|
||||
|
||||
When EXTEND.md exists but `default_model.[current_provider]` is null, ask ONLY the model question for the current provider.
|
||||
|
||||
### Google Model Selection
|
||||
|
||||
```yaml
|
||||
header: "Google Model"
|
||||
question: "Choose a default Google image generation model?"
|
||||
options:
|
||||
- label: "gemini-3-pro-image-preview (Recommended)"
|
||||
description: "Highest quality, best for production use"
|
||||
- label: "gemini-3.1-flash-image-preview"
|
||||
description: "Fast generation, good quality, lower cost"
|
||||
- label: "gemini-3-flash-preview"
|
||||
description: "Fast generation, balanced quality and speed"
|
||||
```
|
||||
|
||||
### OpenAI Model Selection
|
||||
|
||||
```yaml
|
||||
header: "OpenAI Model"
|
||||
question: "Choose a default OpenAI image generation model?"
|
||||
options:
|
||||
- label: "gpt-image-1.5 (Recommended)"
|
||||
description: "Latest GPT Image model, high quality"
|
||||
- label: "gpt-image-1"
|
||||
description: "Previous generation GPT Image model"
|
||||
```
|
||||
|
||||
### DashScope Model Selection
|
||||
|
||||
```yaml
|
||||
header: "DashScope Model"
|
||||
question: "Choose a default DashScope image generation model?"
|
||||
options:
|
||||
- label: "z-image-turbo (Recommended)"
|
||||
description: "Fast generation, good quality"
|
||||
- label: "z-image-ultra"
|
||||
description: "Higher quality, slower generation"
|
||||
```
|
||||
|
||||
### Replicate Model Selection
|
||||
|
||||
```yaml
|
||||
header: "Replicate Model"
|
||||
question: "Choose a default Replicate image generation model?"
|
||||
options:
|
||||
- label: "google/nano-banana-pro (Recommended)"
|
||||
description: "Google's fast image model on Replicate"
|
||||
- label: "google/nano-banana"
|
||||
description: "Google's base image model on Replicate"
|
||||
```
|
||||
|
||||
### Update EXTEND.md
|
||||
|
||||
After user selects a model:
|
||||
|
||||
1. Read existing EXTEND.md
|
||||
2. If `default_model:` section exists → update the provider-specific key
|
||||
3. If `default_model:` section missing → add the full section:
|
||||
|
||||
```yaml
|
||||
default_model:
|
||||
google: [value or null]
|
||||
openai: [value or null]
|
||||
dashscope: [value or null]
|
||||
replicate: [value or null]
|
||||
```
|
||||
|
||||
Only set the selected provider's model; leave others as their current value or null.
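
The update rule above (set one provider's key, keep the others, add the section when it is missing) can be sketched on an in-memory object. `ExtendPrefs` and `setDefaultModel` are hypothetical names used for illustration; reading and writing the file itself is left out.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";
type DefaultModels = Record<Provider, string | null>;

// Hypothetical in-memory view of the EXTEND.md front matter.
interface ExtendPrefs {
  version: number;
  default_model?: DefaultModels;
}

function setDefaultModel(prefs: ExtendPrefs, provider: Provider, model: string): ExtendPrefs {
  // If the section is missing, start from the full section with every provider set to null.
  const current: DefaultModels = prefs.default_model ?? {
    google: null,
    openai: null,
    dashscope: null,
    replicate: null,
  };
  // Only the selected provider changes; the input object is left untouched.
  return { ...prefs, default_model: { ...current, [provider]: model } };
}

const updatedPrefs = setDefaultModel({ version: 1 }, "openai", "gpt-image-1.5");
console.log(updatedPrefs.default_model);
```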
|
||||
|
||||
## After Setup
|
||||
|
||||
1. Create directory if needed
|
||||
2. Write/update EXTEND.md with frontmatter
|
||||
3. Confirm: "Preferences saved to [path]"
|
||||
4. Continue with image generation
|
||||
69
axhub-make/skills/third-party/baoyu-image-gen/references/config/preferences-schema.md
vendored
Normal file
@@ -0,0 +1,69 @@
|
||||
---
|
||||
name: preferences-schema
|
||||
description: EXTEND.md YAML schema for baoyu-image-gen user preferences
|
||||
---
|
||||
|
||||
# Preferences Schema
|
||||
|
||||
## Full Schema
|
||||
|
||||
```yaml
|
||||
---
|
||||
version: 1
|
||||
|
||||
default_provider: null # google|openai|dashscope|replicate|null (null = auto-detect)
|
||||
|
||||
default_quality: null # normal|2k|null (null = use default: 2k)
|
||||
|
||||
default_aspect_ratio: null # "16:9"|"9:16"|"1:1"|"4:3"|"3:4"|"2.35:1"|null
|
||||
|
||||
default_image_size: null # 1K|2K|4K|null (Google only, overrides quality)
|
||||
|
||||
default_model:
|
||||
google: null # e.g., "gemini-3-pro-image-preview", "gemini-3.1-flash-image-preview"
|
||||
openai: null # e.g., "gpt-image-1.5"
|
||||
dashscope: null # e.g., "z-image-turbo"
|
||||
replicate: null # e.g., "google/nano-banana-pro"
|
||||
---
|
||||
```
|
||||
|
||||
## Field Reference
|
||||
|
||||
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `version` | int | 1 | Schema version |
| `default_provider` | string\|null | null | Default provider (null = auto-detect) |
| `default_quality` | string\|null | null | Default quality (null = 2k) |
| `default_aspect_ratio` | string\|null | null | Default aspect ratio |
| `default_image_size` | string\|null | null | Google image size (overrides quality) |
| `default_model.google` | string\|null | null | Google default model |
| `default_model.openai` | string\|null | null | OpenAI default model |
| `default_model.dashscope` | string\|null | null | DashScope default model |
| `default_model.replicate` | string\|null | null | Replicate default model |
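
As a type-level reading of the field table, the schema might be modeled as below. This is a sketch under assumptions: the real `ExtendConfig` lives in `scripts/types` and is not shown here, so the interface and the helper `withDefaults` are reconstructions from the table, not the actual definitions.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Reconstructed from the field reference; the actual type may differ.
interface ExtendConfigSketch {
  version: number;
  default_provider: Provider | null;
  default_quality: "normal" | "2k" | null;
  default_aspect_ratio: string | null;
  default_image_size: "1K" | "2K" | "4K" | null;
  default_model: Record<Provider, string | null>;
}

// Fill every field that is absent from a parsed EXTEND.md with the documented default.
function withDefaults(partial: Partial<ExtendConfigSketch>): ExtendConfigSketch {
  return {
    version: partial.version ?? 1,
    default_provider: partial.default_provider ?? null,
    default_quality: partial.default_quality ?? null,
    default_aspect_ratio: partial.default_aspect_ratio ?? null,
    default_image_size: partial.default_image_size ?? null,
    default_model: { google: null, openai: null, dashscope: null, replicate: null, ...partial.default_model },
  };
}

const minimalConfig = withDefaults({ default_provider: "google", default_quality: "2k" });
console.log(minimalConfig);
```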
|
||||
|
||||
## Examples
|
||||
|
||||
**Minimal**:
|
||||
```yaml
|
||||
---
|
||||
version: 1
|
||||
default_provider: google
|
||||
default_quality: 2k
|
||||
---
|
||||
```
|
||||
|
||||
**Full**:
|
||||
```yaml
|
||||
---
|
||||
version: 1
|
||||
default_provider: google
|
||||
default_quality: 2k
|
||||
default_aspect_ratio: "16:9"
|
||||
default_image_size: 2K
|
||||
default_model:
|
||||
google: "gemini-3-pro-image-preview"
|
||||
openai: "gpt-image-1.5"
|
||||
dashscope: "z-image-turbo"
|
||||
replicate: "google/nano-banana-pro"
|
||||
---
|
||||
```
|
||||
497
axhub-make/skills/third-party/baoyu-image-gen/scripts/main.ts
vendored
Normal file
@@ -0,0 +1,497 @@
|
||||
import path from "node:path";
|
||||
import process from "node:process";
|
||||
import { homedir } from "node:os";
|
||||
import { access, mkdir, readFile, writeFile } from "node:fs/promises";
|
||||
import type { CliArgs, Provider, ExtendConfig } from "./types";
|
||||
|
||||
function printUsage(): void {
|
||||
console.log(`Usage:
|
||||
npx -y bun scripts/main.ts --prompt "A cat" --image cat.png
|
||||
npx -y bun scripts/main.ts --prompt "A landscape" --image landscape.png --ar 16:9
|
||||
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png
|
||||
|
||||
Options:
|
||||
-p, --prompt <text> Prompt text
|
||||
--promptfiles <files...> Read prompt from files (concatenated)
|
||||
--image <path> Output image path (required)
|
||||
--provider google|openai|dashscope|replicate Force provider (auto-detect by default)
|
||||
-m, --model <id> Model ID
|
||||
--ar <ratio> Aspect ratio (e.g., 16:9, 1:1, 4:3)
|
||||
--size <WxH> Size (e.g., 1024x1024)
|
||||
--quality normal|2k Quality preset (default: 2k)
|
||||
--imageSize 1K|2K|4K Image size for Google (default: from quality)
|
||||
--ref <files...> Reference images (Google multimodal or OpenAI edits)
|
||||
--n <count> Number of images (default: 1)
|
||||
--json JSON output
|
||||
-h, --help Show help
|
||||
|
||||
Environment variables:
|
||||
OPENAI_API_KEY OpenAI API key
|
||||
GOOGLE_API_KEY Google API key
|
||||
GEMINI_API_KEY Gemini API key (alias for GOOGLE_API_KEY)
|
||||
DASHSCOPE_API_KEY DashScope API key (阿里云通义万象)
|
||||
REPLICATE_API_TOKEN Replicate API token
|
||||
OPENAI_IMAGE_MODEL Default OpenAI model (gpt-image-1.5)
|
||||
GOOGLE_IMAGE_MODEL Default Google model (gemini-3-pro-image-preview)
|
||||
DASHSCOPE_IMAGE_MODEL Default DashScope model (z-image-turbo)
|
||||
REPLICATE_IMAGE_MODEL Default Replicate model (google/nano-banana-pro)
|
||||
OPENAI_BASE_URL Custom OpenAI endpoint
|
||||
OPENAI_IMAGE_USE_CHAT Use /chat/completions instead of /images/generations (true|false)
|
||||
GOOGLE_BASE_URL Custom Google endpoint
|
||||
DASHSCOPE_BASE_URL Custom DashScope endpoint
|
||||
REPLICATE_BASE_URL Custom Replicate endpoint
|
||||
|
||||
Env file load order: CLI args > EXTEND.md > process.env > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env`);
|
||||
}
|
||||
|
||||
function parseArgs(argv: string[]): CliArgs {
|
||||
const out: CliArgs = {
|
||||
prompt: null,
|
||||
promptFiles: [],
|
||||
imagePath: null,
|
||||
provider: null,
|
||||
model: null,
|
||||
aspectRatio: null,
|
||||
size: null,
|
||||
quality: null,
|
||||
imageSize: null,
|
||||
referenceImages: [],
|
||||
n: 1,
|
||||
json: false,
|
||||
help: false,
|
||||
};
|
||||
|
||||
const positional: string[] = [];
|
||||
|
||||
const takeMany = (i: number): { items: string[]; next: number } => {
|
||||
const items: string[] = [];
|
||||
let j = i + 1;
|
||||
while (j < argv.length) {
|
||||
const v = argv[j]!;
|
||||
if (v.startsWith("-")) break;
|
||||
items.push(v);
|
||||
j++;
|
||||
}
|
||||
return { items, next: j - 1 };
|
||||
};
|
||||
|
||||
for (let i = 0; i < argv.length; i++) {
|
||||
const a = argv[i]!;
|
||||
|
||||
if (a === "--help" || a === "-h") {
|
||||
out.help = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--json") {
|
||||
out.json = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--prompt" || a === "-p") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error(`Missing value for ${a}`);
|
||||
out.prompt = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--promptfiles") {
|
||||
const { items, next } = takeMany(i);
|
||||
if (items.length === 0) throw new Error("Missing files for --promptfiles");
|
||||
out.promptFiles.push(...items);
|
||||
i = next;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--image") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error("Missing value for --image");
|
||||
out.imagePath = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--provider") {
|
||||
const v = argv[++i];
|
||||
if (v !== "google" && v !== "openai" && v !== "dashscope" && v !== "replicate") throw new Error(`Invalid provider: ${v}`);
|
||||
out.provider = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--model" || a === "-m") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error(`Missing value for ${a}`);
|
||||
out.model = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--ar") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error("Missing value for --ar");
|
||||
out.aspectRatio = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--size") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error("Missing value for --size");
|
||||
out.size = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--quality") {
|
||||
const v = argv[++i];
|
||||
if (v !== "normal" && v !== "2k") throw new Error(`Invalid quality: ${v}`);
|
||||
out.quality = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--imageSize") {
|
||||
const v = argv[++i]?.toUpperCase();
|
||||
if (v !== "1K" && v !== "2K" && v !== "4K") throw new Error(`Invalid imageSize: ${v}`);
|
||||
out.imageSize = v;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--ref" || a === "--reference") {
|
||||
const { items, next } = takeMany(i);
|
||||
if (items.length === 0) throw new Error(`Missing files for ${a}`);
|
||||
out.referenceImages.push(...items);
|
||||
i = next;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a === "--n") {
|
||||
const v = argv[++i];
|
||||
if (!v) throw new Error("Missing value for --n");
|
||||
out.n = parseInt(v, 10);
|
||||
if (isNaN(out.n) || out.n < 1) throw new Error(`Invalid count: ${v}`);
|
||||
continue;
|
||||
}
|
||||
|
||||
if (a.startsWith("-")) {
|
||||
throw new Error(`Unknown option: ${a}`);
|
||||
}
|
||||
|
||||
positional.push(a);
|
||||
}
|
||||
|
||||
if (!out.prompt && out.promptFiles.length === 0 && positional.length > 0) {
|
||||
out.prompt = positional.join(" ");
|
||||
}
|
||||
|
||||
return out;
|
||||
}
|
||||
|
||||
async function loadEnvFile(p: string): Promise<Record<string, string>> {
|
||||
try {
|
||||
const content = await readFile(p, "utf8");
|
||||
const env: Record<string, string> = {};
|
||||
for (const line of content.split("\n")) {
|
||||
const trimmed = line.trim();
|
||||
if (!trimmed || trimmed.startsWith("#")) continue;
|
||||
const idx = trimmed.indexOf("=");
|
||||
if (idx === -1) continue;
|
||||
const key = trimmed.slice(0, idx).trim();
|
||||
let val = trimmed.slice(idx + 1).trim();
|
||||
if ((val.startsWith('"') && val.endsWith('"')) || (val.startsWith("'") && val.endsWith("'"))) {
|
||||
val = val.slice(1, -1);
|
||||
}
|
||||
env[key] = val;
|
||||
}
|
||||
return env;
|
||||
} catch {
|
||||
return {};
|
||||
}
|
||||
}
|
||||
|
||||
async function loadEnv(): Promise<void> {
|
||||
const home = homedir();
|
||||
const cwd = process.cwd();
|
||||
|
||||
const homeEnv = await loadEnvFile(path.join(home, ".baoyu-skills", ".env"));
|
||||
const cwdEnv = await loadEnvFile(path.join(cwd, ".baoyu-skills", ".env"));
|
||||
|
||||
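  // Apply the project-level file first so it takes precedence over the
  // user-level file, matching the documented order
  // (<cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env).
  // Values already present in process.env are never overwritten.
  for (const [k, v] of Object.entries(cwdEnv)) {
    if (!process.env[k]) process.env[k] = v;
  }
  for (const [k, v] of Object.entries(homeEnv)) {
    if (!process.env[k]) process.env[k] = v;
  }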
|
||||
}
|
||||
|
||||
function extractYamlFrontMatter(content: string): string | null {
|
||||
const match = content.match(/^---\s*\n([\s\S]*?)\n---\s*$/m);
|
||||
return match ? match[1] : null;
|
||||
}
|
||||
|
||||
function parseSimpleYaml(yaml: string): Partial<ExtendConfig> {
|
||||
const config: Partial<ExtendConfig> = {};
|
||||
const lines = yaml.split("\n");
|
||||
let currentKey: string | null = null;
|
||||
|
||||
for (const line of lines) {
|
||||
const trimmed = line.trim();
|
||||
if (!trimmed || trimmed.startsWith("#")) continue;
|
||||
|
||||
if (trimmed.includes(":") && !trimmed.startsWith("-")) {
|
||||
const colonIdx = trimmed.indexOf(":");
|
||||
const key = trimmed.slice(0, colonIdx).trim();
|
||||
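      // Drop trailing "# ..." comments so schema-style lines such as `default_quality: null # ...` parse as null.
      let value = trimmed.slice(colonIdx + 1).replace(/\s+#.*$/, "").trim();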
|
||||
|
||||
if (value === "null" || value === "") {
|
||||
value = "null";
|
||||
}
|
||||
|
||||
if (key === "version") {
|
||||
config.version = value === "null" ? 1 : parseInt(value, 10);
|
||||
} else if (key === "default_provider") {
|
||||
config.default_provider = value === "null" ? null : (value as Provider);
|
||||
} else if (key === "default_quality") {
|
||||
config.default_quality = value === "null" ? null : (value as "normal" | "2k");
|
||||
} else if (key === "default_aspect_ratio") {
|
||||
const cleaned = value.replace(/['"]/g, "");
|
||||
config.default_aspect_ratio = cleaned === "null" ? null : cleaned;
|
||||
} else if (key === "default_image_size") {
|
||||
config.default_image_size = value === "null" ? null : (value as "1K" | "2K" | "4K");
|
||||
} else if (key === "default_model") {
|
||||
config.default_model = { google: null, openai: null, dashscope: null, replicate: null };
|
||||
currentKey = "default_model";
|
||||
} else if (currentKey === "default_model" && (key === "google" || key === "openai" || key === "dashscope" || key === "replicate")) {
|
||||
const cleaned = value.replace(/['"]/g, "");
|
||||
config.default_model![key] = cleaned === "null" ? null : cleaned;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return config;
|
||||
}
|
||||
|
||||
async function loadExtendConfig(): Promise<Partial<ExtendConfig>> {
|
||||
const home = homedir();
|
||||
const cwd = process.cwd();
|
||||
|
||||
const paths = [
|
||||
path.join(cwd, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md"),
|
||||
path.join(home, ".baoyu-skills", "baoyu-image-gen", "EXTEND.md"),
|
||||
];
|
||||
|
||||
for (const p of paths) {
|
||||
try {
|
||||
const content = await readFile(p, "utf8");
|
||||
const yaml = extractYamlFrontMatter(content);
|
||||
if (!yaml) continue;
|
||||
|
||||
return parseSimpleYaml(yaml);
|
||||
} catch {
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
return {};
|
||||
}
|
||||
|
||||
function mergeConfig(args: CliArgs, extend: Partial<ExtendConfig>): CliArgs {
|
||||
return {
|
||||
...args,
|
||||
provider: args.provider ?? extend.default_provider ?? null,
|
||||
quality: args.quality ?? extend.default_quality ?? null,
|
||||
aspectRatio: args.aspectRatio ?? extend.default_aspect_ratio ?? null,
|
||||
imageSize: args.imageSize ?? extend.default_image_size ?? null,
|
||||
};
|
||||
}
|
||||
|
||||
async function readPromptFromFiles(files: string[]): Promise<string> {
|
||||
const parts: string[] = [];
|
||||
for (const f of files) {
|
||||
parts.push(await readFile(f, "utf8"));
|
||||
}
|
||||
return parts.join("\n\n");
|
||||
}
|
||||
|
||||
async function readPromptFromStdin(): Promise<string | null> {
|
||||
if (process.stdin.isTTY) return null;
|
||||
try {
|
||||
const t = await Bun.stdin.text();
|
||||
const v = t.trim();
|
||||
return v.length > 0 ? v : null;
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
function normalizeOutputImagePath(p: string): string {
|
||||
const full = path.resolve(p);
|
||||
const ext = path.extname(full);
|
||||
if (ext) return full;
|
||||
return `${full}.png`;
|
||||
}
|
||||
|
||||
function detectProvider(args: CliArgs): Provider {
|
||||
if (args.referenceImages.length > 0 && args.provider && args.provider !== "google" && args.provider !== "openai" && args.provider !== "replicate") {
|
||||
throw new Error(
|
||||
"Reference images require a ref-capable provider. Use --provider google (Gemini multimodal), --provider openai (GPT Image edits), or --provider replicate."
|
||||
);
|
||||
}
|
||||
|
||||
if (args.provider) return args.provider;
|
||||
|
||||
const hasGoogle = !!(process.env.GOOGLE_API_KEY || process.env.GEMINI_API_KEY);
|
||||
const hasOpenai = !!process.env.OPENAI_API_KEY;
|
||||
const hasDashscope = !!process.env.DASHSCOPE_API_KEY;
|
||||
const hasReplicate = !!process.env.REPLICATE_API_TOKEN;
|
||||
|
||||
if (args.referenceImages.length > 0) {
|
||||
if (hasGoogle) return "google";
|
||||
if (hasOpenai) return "openai";
|
||||
if (hasReplicate) return "replicate";
|
||||
throw new Error(
|
||||
"Reference images require Google, OpenAI or Replicate. Set GOOGLE_API_KEY/GEMINI_API_KEY, OPENAI_API_KEY, or REPLICATE_API_TOKEN, or remove --ref."
|
||||
);
|
||||
}
|
||||
|
||||
const available = [hasGoogle && "google", hasOpenai && "openai", hasDashscope && "dashscope", hasReplicate && "replicate"].filter(Boolean) as Provider[];
|
||||
|
||||
  // With one key configured, use that provider; with several, the first entry
  // in the list above wins (Google, then OpenAI, DashScope, Replicate).
  if (available.length > 0) return available[0]!;
|
||||
|
||||
throw new Error(
|
||||
"No API key found. Set GOOGLE_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY, DASHSCOPE_API_KEY, or REPLICATE_API_TOKEN.\n" +
|
||||
"Create ~/.baoyu-skills/.env or <cwd>/.baoyu-skills/.env with your keys."
|
||||
);
|
||||
}
|
||||
async function validateReferenceImages(referenceImages: string[]): Promise<void> {
  for (const refPath of referenceImages) {
    const fullPath = path.resolve(refPath);
    try {
      await access(fullPath);
    } catch {
      throw new Error(`Reference image not found: ${fullPath}`);
    }
  }
}

type ProviderModule = {
  getDefaultModel: () => string;
  generateImage: (prompt: string, model: string, args: CliArgs) => Promise<Uint8Array>;
};

// Treats an error as retryable unless its message contains a marker that points
// to a configuration or usage problem, which a retry cannot fix.
function isRetryableGenerationError(error: unknown): boolean {
  const msg = error instanceof Error ? error.message : String(error);
  const nonRetryableMarkers = [
    "Reference image",
    "not supported",
    "only supported",
    "No API key found",
    "is required",
  ];
  return !nonRetryableMarkers.some((marker) => msg.includes(marker));
}

// Loads the provider implementation lazily; any provider other than google,
// dashscope, or replicate falls through to the openai module.
async function loadProviderModule(provider: Provider): Promise<ProviderModule> {
  if (provider === "google") {
    return (await import("./providers/google")) as ProviderModule;
  }
  if (provider === "dashscope") {
    return (await import("./providers/dashscope")) as ProviderModule;
  }
  if (provider === "replicate") {
    return (await import("./providers/replicate")) as ProviderModule;
  }
  return (await import("./providers/openai")) as ProviderModule;
}

async function main(): Promise<void> {
  const args = parseArgs(process.argv.slice(2));

  if (args.help) {
    printUsage();
    return;
  }

  await loadEnv();
  const extendConfig = await loadExtendConfig();
  const mergedArgs = mergeConfig(args, extendConfig);

  if (!mergedArgs.quality) mergedArgs.quality = "2k";

  // Prompt sources, in order: inline prompt, prompt files, piped stdin.
  let prompt: string | null = mergedArgs.prompt;
  if (!prompt && mergedArgs.promptFiles.length > 0) prompt = await readPromptFromFiles(mergedArgs.promptFiles);
  if (!prompt) prompt = await readPromptFromStdin();

  if (!prompt) {
    console.error("Error: Prompt is required");
    printUsage();
    process.exitCode = 1;
    return;
  }

  if (!mergedArgs.imagePath) {
    console.error("Error: --image is required");
    printUsage();
    process.exitCode = 1;
    return;
  }

  if (mergedArgs.referenceImages.length > 0) {
    await validateReferenceImages(mergedArgs.referenceImages);
  }

  const provider = detectProvider(mergedArgs);
  const providerModule = await loadProviderModule(provider);

  // Model sources, in order: merged args, EXTEND.md default for the provider, provider default.
  let model = mergedArgs.model;
  if (!model && extendConfig.default_model) {
    if (provider === "google") model = extendConfig.default_model.google ?? null;
    if (provider === "openai") model = extendConfig.default_model.openai ?? null;
    if (provider === "dashscope") model = extendConfig.default_model.dashscope ?? null;
    if (provider === "replicate") model = extendConfig.default_model.replicate ?? null;
  }
  model = model || providerModule.getDefaultModel();

  const outputPath = normalizeOutputImagePath(mergedArgs.imagePath);

  let imageData: Uint8Array;
  let retried = false;

  // Retry at most once, and only for errors that look transient.
  while (true) {
    try {
      imageData = await providerModule.generateImage(prompt, model, mergedArgs);
      break;
    } catch (e) {
      if (!retried && isRetryableGenerationError(e)) {
        retried = true;
        console.error("Generation failed, retrying...");
        continue;
      }
      throw e;
    }
  }

  const dir = path.dirname(outputPath);
  await mkdir(dir, { recursive: true });
  await writeFile(outputPath, imageData);

  if (mergedArgs.json) {
    console.log(
      JSON.stringify(
        {
          savedImage: outputPath,
          provider,
          model,
          prompt: prompt.slice(0, 200),
        },
        null,
        2
      )
    );
  } else {
    console.log(outputPath);
  }
}

main().catch((e) => {
  const msg = e instanceof Error ? e.message : String(e);
  console.error(msg);
  process.exit(1);
});
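
The retry loop in `main` amounts to "try once, then retry a single time unless the error message marks a configuration or usage problem". Below is a minimal sketch of that pattern as a standalone helper. The names `withSingleRetry` and `isRetryable` are hypothetical and do not exist in this file; the marker list is copied from `isRetryableGenerationError`.

```typescript
// Markers copied from isRetryableGenerationError: errors containing any of
// these are treated as configuration or usage problems and are not retried.
const NON_RETRYABLE_MARKERS = [
  "Reference image",
  "not supported",
  "only supported",
  "No API key found",
  "is required",
];

// Hypothetical helper: true when the error message contains none of the markers.
function isRetryable(error: unknown): boolean {
  const msg = error instanceof Error ? error.message : String(error);
  return !NON_RETRYABLE_MARKERS.some((marker) => msg.includes(marker));
}

// Hypothetical helper: runs `op`, and on a retryable failure runs it exactly
// one more time. A non-retryable error, or a second failure, is rethrown.
async function withSingleRetry<T>(
  op: () => Promise<T>,
  onRetry?: (error: unknown) => void
): Promise<T> {
  try {
    return await op();
  } catch (first) {
    if (!isRetryable(first)) throw first;
    onRetry?.(first);
    return await op();
  }
}
```

Under these assumptions, the loop in `main` could be written as `withSingleRetry(() => providerModule.generateImage(prompt, model, mergedArgs), () => console.error("Generation failed, retrying..."))`. Matching on message substrings is simple but fragile, since any change to an error message silently changes retry behavior.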
32
axhub-make/skills/third-party/baoyu-image-gen/scripts/types.ts
vendored
Normal file
@@ -0,0 +1,32 @@
export type Provider = "google" | "openai" | "dashscope" | "replicate";
export type Quality = "normal" | "2k";

export type CliArgs = {
  prompt: string | null;
  promptFiles: string[];
  imagePath: string | null;
  provider: Provider | null;
  model: string | null;
  aspectRatio: string | null;
  size: string | null;
  quality: Quality | null;
  imageSize: string | null;
  referenceImages: string[];
  n: number;
  json: boolean;
  help: boolean;
};

export type ExtendConfig = {
  version: number;
  default_provider: Provider | null;
  default_quality: Quality | null;
  default_aspect_ratio: string | null;
  default_image_size: "1K" | "2K" | "4K" | null;
  default_model: {
    google: string | null;
    openai: string | null;
    dashscope: string | null;
    replicate: string | null;
  };
};
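
Because `default_model` has one key per `Provider`, the per-provider lookup that `main.ts` does with four `if` statements can also be written as a single indexed access. The sketch below illustrates that idea. `resolveModel` and `DefaultModels` are hypothetical names, and the model strings are placeholders rather than real model identifiers.

```typescript
// Self-contained copy of the Provider union from types.ts.
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Per-provider default models, shaped like ExtendConfig.default_model.
type DefaultModels = Record<Provider, string | null>;

// Hypothetical helper: an explicit model wins, then the configured default
// for the chosen provider, then the provider's built-in fallback.
function resolveModel(
  explicitModel: string | null,
  provider: Provider,
  defaults: DefaultModels | undefined,
  providerFallback: string
): string {
  return explicitModel || defaults?.[provider] || providerFallback;
}

// Placeholder model names, for illustration only.
const defaults: DefaultModels = {
  google: "example-google-model",
  openai: null,
  dashscope: null,
  replicate: null,
};

console.log(resolveModel(null, "google", defaults, "fallback-model")); // example-google-model
console.log(resolveModel(null, "openai", defaults, "fallback-model")); // fallback-model
console.log(resolveModel("my-model", "google", defaults, "fallback-model")); // my-model
```

As in `main.ts`, an empty string counts as "not set" here because the chain uses `||` rather than `??`.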