Studio tooling reference

A working catalogue of the ij8 · studio.

A single-page technical reference for prospective artists: what is wired, which models are available, where the gallery and minting surfaces are live, and which studio internals still require authenticated capture.

01 Static, self-contained reference for tooling.ij8.ai.

02 Dense feature index, not a marketing page.

03 Auth-gated studio views are marked as capture pending.

Premise

ij8 is a generative creative studio for serious artists. Image, video, 3D, code, sound, classroom work, and gallery distribution converge in a single chat-driven canvas. The studio is in active development; the gallery and minting infrastructure are live. Its stance toward AI is a middle way — a collaborator the artist directs and authors with, not a system that replaces the work, and not a pace the artist must match. The human brings the vision, the taste, and the questions; the AI brings the speed and the hands. The same position carries into how it teaches.

Generative Structures II gallery artwork — Public gallery artwork capture: Generative Structures II.

Animi gallery artwork — Public gallery artwork capture: Animi.

Oracles3D gallery artwork — Public gallery artwork capture: Oracles3D.

Image generation

The image surface is multi-backend by design: Gemini first, OpenAI fallback, and a ComfyUI workflow library for specific model families and edit modes.

Gemini

gemini-3-pro-image-preview is the primary image model. Multi-turn editing chats are cached in memory by ${userId}:${sessionId}:${imageId} and cleaned every five minutes. The model sees prompt, reference image, and mask.
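
The cache keying and periodic cleanup described above can be sketched as follows. This is a minimal illustration, not the studio's implementation: ChatCache, chatKey, and TTL_MS are hypothetical names, and treating the five-minute cleanup as an idle-time expiry is an assumption.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the studio code.
interface CachedChat<T> {
  chat: T;          // the multi-turn editing chat handle
  lastUsed: number; // epoch milliseconds of the last access
}

// Assumed idle lifetime; the text only states that cleanup runs every five minutes.
const TTL_MS = 5 * 60 * 1000;

// Builds the composite key userId:sessionId:imageId.
function chatKey(userId: string, sessionId: string, imageId: string): string {
  return `${userId}:${sessionId}:${imageId}`;
}

class ChatCache<T> {
  private entries = new Map<string, CachedChat<T>>();

  get(key: string, now: number): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.lastUsed = now; // touching an entry keeps it alive
    return entry.chat;
  }

  set(key: string, chat: T, now: number): void {
    this.entries.set(key, { chat, lastUsed: now });
  }

  // Drops entries idle for longer than TTL_MS and returns how many were removed.
  sweep(now: number): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (now - entry.lastUsed > TTL_MS) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }

  count(): number {
    return this.entries.size;
  }
}
```

In a long-running server, sweep(Date.now()) would typically be called from a timer; passing the clock in as an argument keeps the sketch easy to test.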

OpenAI image

gpt-image-1.5 is the fallback path when Gemini returns 503. The fallback is explicit rather than hidden behind a generic “AI image” label.

ComfyUI library

comfyui.ts is a ~3.8k-line workflow library spanning Flux 1, Flux 2, Flux 2 Klein, SD 3.5, Qwen-Image 2512, Z-Image Turbo (incl. GGUF-quantized), CyberRealistic, Playground 2.5, Kontext (image-edit adapter), and SDXL style-transfer, plus inpaint, outpaint, upscale, and two-pass variants. Per-checkpoint detection (isFlux2, isZImageGGUF, isQwenImage, isFlux2Klein) routes each builder to the right text-encoder / VAE pair.

Model pool

An image-model pool can be configured. Per turn, one model is selected from the pool, either deterministically or randomly, and sent as imageConfig.imageModel. The single-model dropdown writes through to both the pool and the legacy field.
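
A minimal sketch of per-turn selection from a pool. The function names are hypothetical, and reading "deterministically" as round-robin by turn index is an assumption; the source does not say which deterministic rule is used.

```typescript
type PoolMode = "deterministic" | "random";

// Picks one model id from the pool for a given turn.
// Assumption: "deterministic" means round-robin by (non-negative) turn index.
function pickImageModel(
  pool: readonly string[],
  turnIndex: number,
  mode: PoolMode,
  rand: () => number = Math.random,
): string {
  if (pool.length === 0) throw new Error("image-model pool is empty");
  const index =
    mode === "deterministic"
      ? turnIndex % pool.length
      : Math.floor(rand() * pool.length);
  return pool[index];
}

// The selected id travels with the request as imageConfig.imageModel.
function buildImageConfig(
  pool: readonly string[],
  turnIndex: number,
  mode: PoolMode,
): { imageModel: string } {
  return { imageModel: pickImageModel(pool, turnIndex, mode) };
}
```

Injecting rand makes the random branch reproducible in tests without changing the default behavior.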

Prompt enhancement

/api/enhance-prompt performs a pre-pass that rewrites loose prompts toward the selected backend’s strengths.

Conversational editing

Direct generation and agentic-loop tools share the same image conversation context, allowing multi-turn refinement rather than isolated prompt submissions.

Sketch to image

A pencil button in image mode opens a freehand drawing pad — pen, brush, eraser, color swatches, undo — sized to the session's aspect ratio. The drawing is sent as a composition reference (Gemini or gpt-image, multimodal conditioning, no ControlNet), so the finished image preserves the drawn layout, shapes, and spatial relationships. /api/sketch-to-image.

Provenance

Every generated image records its generationMeta — resolved model id, AI backend, infrastructure target (local vs remote), and seed — shown in the canvas metadata panel. A render is traceable to the exact model and seed that produced it.

Authenticated studio capture pending: chat-driven image generation with model pool selector visible.

Authenticated studio capture pending: prompt enhancement before/after view.

Authenticated studio capture pending: multi-turn image conversation with reference image.

Image editing

The editing layer is region-aware. Masks, bounding boxes, popovers, inpaint variants, background removal, crop, outpaint, upscale, and code-to-raster conversion are treated as normal studio operations.

Mask painter
MaskEditor, MaskToolbar, and BoundingBoxOverlay support precise selection, freehand masks, region selection, and bounding-box overlay.
Inpaint
Three inpaint paths are wired: Gemini multi-turn, Kontext Flux through ComfyUI, and a ComfyUI inpainting workflow. The inpaint variation picker keeps its state across turns in sessionStorage. Within the ComfyUI path, inpaint prefers Flux Fill over the selected checkpoint whenever its components are available.
Canvas edits
Outpaint, crop, zoom-enhance, upscale, remove-region, and region popovers are available operations. Outpaint always runs Flux Fill — a true inpainting model that conditions on the original pixels — regardless of which 2D model is selected: the selected model is the generation engine, not the outpaint engine. Extension proceeds one edge per pass with computed feathering and a smoothstep soft-stitch that composites the true original back over the render, so the preserved region is untouched and the seam is structural, not cosmetic. The same workflow runs identically on the local GPU and on HPC, with a preflight that verifies all four Flux Fill components before committing. Outpaint runs as an async job so long passes outlive Cloudflare's ~100s ceiling.
Background removal
rembg runs as a Python subprocess with four variants: general, anime, precise, and birefnet.
Region popovers
A selected region opens a popover for a localized prompt, prompt enhancement, and an applied inpaint edit — refining one area without reprompting the whole image.
Code raster
convert-to-2d captures code-rendered output as a raster image so sketches can re-enter the visual pipeline.
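
The outpaint soft-stitch described under Canvas edits blends the original back over the render with a smoothstep ramp. The sketch below shows that idea for a single channel value. The feather width, the distance convention, and the function names are assumptions for illustration, not the studio's actual compositing code.

```typescript
// Standard smoothstep: 0 at or below edge0, 1 at or above edge1, smooth cubic between.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Weight of the original pixel at `distance` pixels inside the preserved region,
// measured from the seam. `feather` (the band width) is a hypothetical parameter.
function originalWeight(distance: number, feather: number): number {
  return smoothstep(0, feather, distance);
}

// Composites one channel value: the render wins at the seam, the original wins
// once the feather band is crossed, and the two blend smoothly in between.
function softStitch(
  original: number,
  render: number,
  distance: number,
  feather: number,
): number {
  const w = originalWeight(distance, feather);
  return original * w + render * (1 - w);
}
```

Past the feather band the weight is exactly 1, so the original value is returned unchanged, which matches the claim that the preserved region is untouched.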

Authenticated studio capture pending: mask painter with toolbar, bounding-box overlay, and selected region.

Authenticated studio capture pending: inpaint variation picker and cross-turn session state.

Authenticated studio capture pending: region popover with a localized inpaint prompt.

Video

Video generation spans local ComfyUI workflows, Minimax API models, masked video inpaint, animation pipelines, and frame-level timeline inspection.

Wan

Wan 2.2 14B runs through ComfyUI as the local high-end video target, paired with VACE for masked inpaint.

Hailuo

Hailuo 2.3 and Hailuo 2.3 Fast are accessed through the Minimax HTTP API.

VACE

Animate region: on any still image, paint a mask with the full inpaint toolset, type a motion prompt, and generate a short looping video in which only the painted region moves — every unmasked pixel is anchored to the source photograph per frame, by construction (the source and mask are repeated across all frames). Wan 2.2 14B + VACE through ComfyUI; the endpoint is SSE-streamed to survive Cloudflare's 100s upstream timeout.

Cel-style

Cel-style video is available as a ComfyUI pipeline.

Animation

gemini-animation.ts and hy-motion.ts are part of the animation path.

Timeline

A frame timeline scrubber supports per-frame inspection. The share-video path uses ffmpeg with exact-duration alignment for X.com loop compatibility: video is pinned with -t, -r, -vsync cfr, and -fflags +genpts; audio is wrapped with apad,atrim=0:videoDurationSec,asetpts=PTS-STARTPTS so that MediaRecorder undershoot or overshoot does not break loop replay. AAC is the audio codec and H.264 Main is the video codec (level 4.0, because level 3.1's macroblock ceiling rejected square 1080² captures); -movflags +faststart enables progressive download.
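
The level choice can be checked with a little arithmetic. The sketch below counts 16x16 macroblocks per frame and compares the result against the frame-size ceilings (MaxFS) from the H.264 level table: 3600 for level 3.1 and 8192 for level 4.0. It also builds the audio filter string quoted above. The function names are hypothetical, and levels constrain macroblock rate as well, which this sketch does not check.

```typescript
// Frame-size ceilings in macroblocks (MaxFS) for the two levels discussed.
const MAX_FRAME_MACROBLOCKS: Record<string, number> = {
  "3.1": 3600,
  "4.0": 8192,
};

// A macroblock is 16x16 pixels; partial blocks at the edges count as whole blocks.
function frameMacroblocks(width: number, height: number): number {
  return Math.ceil(width / 16) * Math.ceil(height / 16);
}

// True when the frame size alone fits under the level's ceiling.
function fitsLevel(width: number, height: number, level: string): boolean {
  return frameMacroblocks(width, height) <= MAX_FRAME_MACROBLOCKS[level];
}

// The audio filter chain from the text, with the video duration filled in.
function audioFilter(videoDurationSec: number): string {
  return `apad,atrim=0:${videoDurationSec},asetpts=PTS-STARTPTS`;
}
```

A 1080 by 1080 frame needs 68 x 68 = 4624 macroblocks, which exceeds 3600 but fits under 8192.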

Authenticated studio capture pending: frame timeline scrubber during a generated video session.

Authenticated studio capture pending: masked VACE video inpaint setup.

Authenticated studio capture pending: share-video export controls and ffmpeg duration metadata.

3D

The current 3D target is HunYuan3D 2.1. The pipeline separates shape and texture generation, prepares 2D sources for silhouette extraction, and presents meshes in a browser viewer.

HunYuan3D 2.1
Current target. Dual-pass shape generation followed by texture generation. ComfyUI restarts between the shape and texture passes to clear VRAM — the two stages do not fit simultaneously on the single 24GB GPU.
HunYuan3D 2.0
Alternate target retained in the 3D path.
Prep prompt
build3DPrepPrompt re-renders the 2D source as a sculpture on a light-gray background, because black voids were producing shredded silhouettes. The Convert-to-3D path sends the tPose and prepFor3D flags so that this re-render happens before HunYuan3D conversion.
Job queue
The 3D queue is durable, records startedAt for stale-job recovery, and scopes fulfill jobs per user.
Multi-view
Single-image 3D must hallucinate what it cannot see — the flat-back problem. An MV picker locks the current image as the front view and fills back/left/right either from hand-picked session images or from a single "Generate views" click that synthesizes turntable rotations. With two or more views, the shape pass runs the HunYuan3D multi-view checkpoint, conditioning the mesh on every labeled view.
Mesh viewer
model-viewer supports turntable yaw and APNG rotation export.
Fulfill worker
The async fulfill worker sits behind an internal-secret-gated route. Jobs are cancelable mid-pipeline: cancel is ownership-checked and idempotent, interrupts the in-flight ComfyUI workflow, and the fulfill worker observes cancellation between every stage — prep, T-pose, shape, texture — with per-stage timeouts.
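
The between-stage cancellation check described under Fulfill worker can be sketched as a loop that consults a cancel flag before each stage. This is a simplified, synchronous illustration: runStage and isCancelled are hypothetical hooks, and the real worker is asynchronous and also applies per-stage timeouts, which are omitted here.

```typescript
type Stage = "prep" | "t-pose" | "shape" | "texture";
const STAGES: readonly Stage[] = ["prep", "t-pose", "shape", "texture"];

interface JobResult {
  status: "done" | "cancelled";
  completed: Stage[]; // stages that finished before the job ended
}

// Runs the stages in order and checks for cancellation before each one.
function runPipeline(
  runStage: (stage: Stage) => void,
  isCancelled: () => boolean,
): JobResult {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    if (isCancelled()) return { status: "cancelled", completed };
    runStage(stage);
    completed.push(stage);
  }
  return { status: "done", completed };
}
```

Because the flag is only read, a second cancel request has no additional effect, which mirrors the idempotent cancel described above.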

Bio-mechanical Artifact gallery artwork — Public gallery artwork suitable as a 2D-to-3D source reference.

Generative Structures II public 3D collection page — Public gallery capture: Generative Structures II collection page with model-viewer meshes.

Animi public 3D collection page — Public gallery capture: a 3D collection page rendering a real mesh in the gallery's model-viewer turntable.

Animation + rigging

Rigging combines conventional retargeting, UniRig auto-rigging, a geometric morphology classifier, free-form RigSpec authoring, and Blender-driven bakes streamed over SSE for long jobs.

Retargeting
retarget-mixamo.ts runs Mixamo FBX retargeting through a Blender subprocess.
UniRig
unirig.ts provides auto-rigging.
Morphology
scripts/classify_morphology.py and classify-morphology.ts use a Fibonacci-sphere ray probe to identify radial, stalk, body-tail, and pulse-mass archetypes.
RigSpec
lib/rig/spec.ts and lib/rig/topology.ts define a free-form AI rig schema and topology archetypes. scripts/bake_rig.py and bake-rig.ts bake through Blender.
Behaviors
Continuous behaviors include sway, pulse, softbody, and rotate.
Long bakes
/api/rig/auto-smoke is SSE-streamed to survive Cloudflare’s 100s upstream timeout for dense radial bakes with 100+ bones across 240 frames and multiple chains.
Weighting
In manual_distance_weights, the hub bone gets an influenceRadius via a chain, which prevents the hub from dominating weights globally. Orphan vertices fall back to the nearest bone, which prevents static appendage islands.
Debug surface
/rig-test is the debug surface for rigging work.
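
The Fibonacci-sphere ray probe mentioned under Morphology relies on a set of roughly evenly spaced directions. The sketch below generates such directions with the golden-angle spiral. It illustrates the sampling technique only; how the classifier casts rays and maps hits to archetypes is not shown, and the function name is hypothetical.

```typescript
type Vec3 = [number, number, number];

// Returns `count` roughly evenly spaced unit vectors on a sphere.
function fibonacciSphere(count: number): Vec3[] {
  const goldenAngle = Math.PI * (3 - Math.sqrt(5));
  const directions: Vec3[] = [];
  for (let i = 0; i < count; i++) {
    const y = 1 - (2 * (i + 0.5)) / count; // evenly spaced heights in (-1, 1)
    const radius = Math.sqrt(1 - y * y);   // radius of the horizontal circle at that height
    const theta = goldenAngle * i;         // rotate by the golden angle each step
    directions.push([Math.cos(theta) * radius, y, Math.sin(theta) * radius]);
  }
  return directions;
}
```

Each direction would serve as a ray cast outward from the mesh centroid or inward from a bounding sphere; which of the two the classifier uses is not stated in this reference.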

Authenticated capture pending: /rig-test debug surface with morphology classification output.

Authenticated capture pending: RigSpec authoring view with topology archetype and behavior controls.

Authenticated capture pending: SSE-streamed bake progress for dense radial rig.

Code generation

The code surface supports creative coding, shader sketches, validation in Chromium, forked sketches, and conversion of code output into images.

Pipeline

code-pipeline.ts is the ~1.5k-line creative-coding pipeline.

GLSL

glsl-wrapper.ts wraps shader sketches.

Frameworks

Four creative-coding frameworks are wired: p5.js, three.js, GLSL, and Tone.js. An "Audio + visual" pair toggle injects Tone.js into a visual sketch.

Pairing addendum

When "Audio + visual" is enabled, the prompt injects a pairing addendum into the framework instructions. For Tone.js sketches it requires a mandatory p5.js spectrum visualizer in the same HTML. For p5.js / three.js / GLSL sketches it requires a Tone.js audio graph routed through new Tone.Limiter(-1).toDestination(), started by a centered Play overlay that transitions to a corner widget after first interaction.

Safety rules

The prompt enforces frequency guards (≥200 Hz on laptop speakers), forbids Tone.Destination.connect() (always .toDestination()), forbids event-driven triggerAttackRelease(freq, dur, Tone.now()) (omit the time arg to avoid same-frame start-time errors), and forbids p5.js variable names that shadow globals (scale, rotate, color, image, map, random, ...).

Editor

CodeViewer supplies code editor and run-and-capture integration.

Raster export

convert-to-2d converts sketch output to a raster image.

Fork

/api/sessions/fork-sketch creates sketch forks.

Explain

The Learn panel calls /api/images/[id]/explain to break a generated sketch down into its key algorithms from first principles.

Models

gemini-3.5-flash is primary with GEMINI_CODE_MODEL override. gpt-5.4 is fallback with OPENAI_CODE_MODEL override.

Draw to code

A pencil button in code mode opens the same freehand pad — but here the drawing is sent to the code model as vision input, with instructions to reproduce the drawn form as code: vertices, beziers, transforms, joints. The sketch is design guidance, not an asset — it never ships in the artifact and is explicitly excluded from runtime reference injection. Sketch governs shape and structure; the text prompt governs behavior, color, and motion. One drawing surface, two destinations: pixels in image mode, geometry-written-as-code in code mode.

Explanation layers

Every sketch's viewer carries four tabs — Preview, Gist, How it Works, Code. Gist distills the core algorithm for a beginner: real lines copied from the sketch (never invented pseudo-code) with short explanations, plus a plain-steps recipe view. How it Works is the deep layer: sections tagged math, computation, and aesthetics — aesthetic principles treated as a first-class explanatory dimension. Both generate once and cache per sketch; Refresh regenerates.

Stable seeds

The render seed is derived from the sketch id, so a generative sketch reproduces identically on every view — across reloads, sessions, and devices. Sketches are stable artifacts, not slot machines. A New Seed button re-rolls for the session only.
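
One way to derive a stable seed from a sketch id is to hash the id and feed the result to a seeded generator. The sketch below uses the FNV-1a 32-bit hash and the mulberry32 generator. Both are swapped-in, well-known techniques chosen for illustration; the source does not say which hash or generator the studio uses, and the function names are hypothetical.

```typescript
// FNV-1a 32-bit hash: maps a sketch id to a stable unsigned 32-bit seed.
function seedFromSketchId(sketchId: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < sketchId.length; i++) {
    hash ^= sketchId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0;
}

// mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
```

With this arrangement, a session-only re-roll amounts to constructing the generator from a fresh seed while leaving the id-derived seed untouched.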

Validation

Browser-side validation runs through Playwright Chromium via CODE_VALIDATION_BROWSER.

Authenticated capture pending: code editor and live sketch viewer integrated with the chat surface.

Authenticated capture pending: run-and-capture result converted to 2D image asset.

Authenticated capture pending: sketch fork flow and validation output.

Sound

Sound is a shipped medium, not a planned one. Music, sound effects, and video-to-audio foley generate through ComfyUI; Tone.js is wired as a fourth creative-coding framework so audio can be authored alongside a visual sketch.

Music

ACE-Step v1.5 turbo generates music through ComfyUI — 8 steps, cfg 1, ModelSamplingAuraFlow shift 3. Reached through /api/generate-audio with kind: 'music'.

SFX / ambient

Stable Audio Open 1.0 generates sound effects and ambient beds. The checkpoint ships without T5, so the workflow loads it via a separate CLIPLoader(type='stable_audio').

Foley

HunyuanVideo-Foley XXL generates audio from video. kind: 'foley' routes through comfyui-foley.ts, including a WebP-to-MP4 transcode pipeline.

Endpoint

/api/generate-audio is a single endpoint covering all three modalities, routing to workflows in comfyui-audio.ts and comfyui-foley.ts. Audio defaults to the local backend — an HPC round-trip exceeds generation time. Backend routing for all other modalities is covered in section 12.
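
A single endpoint that covers three modalities usually dispatches on a discriminant field. The sketch below shows that shape. 'music' and 'foley' are the kind values named in this section; 'sfx' is an assumed name for the sound-effects and ambient kind, and placing music and SFX together in comfyui-audio.ts is an inference from the text rather than a confirmed detail.

```typescript
// 'sfx' is an assumed kind name; 'music' and 'foley' come from the text.
type AudioKind = "music" | "sfx" | "foley";

interface AudioRoute {
  workflowModule: "comfyui-audio.ts" | "comfyui-foley.ts";
  model: string;
  backend: "local"; // audio defaults to the local backend
}

// Maps a request kind to the workflow module and model that serve it.
function routeAudio(kind: AudioKind): AudioRoute {
  switch (kind) {
    case "music":
      return { workflowModule: "comfyui-audio.ts", model: "ACE-Step v1.5 turbo", backend: "local" };
    case "sfx":
      return { workflowModule: "comfyui-audio.ts", model: "Stable Audio Open 1.0", backend: "local" };
    case "foley":
      return { workflowModule: "comfyui-foley.ts", model: "HunyuanVideo-Foley XXL", backend: "local" };
  }
}
```

Typing kind as a closed union means an unsupported value is rejected at compile time instead of falling through at runtime.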

Tone.js

Tone.js is one of four creative-coding frameworks. An "Audio + visual" pair toggle injects Tone.js into a visual sketch so sound and image are authored in the same canvas.

Canvas

AudioPlayer presents generated audio in the canvas. The MediaType model includes music, sfx, and speech alongside image, video, 3D, animation, code, app, and text.

Storyboard soundtracks

Storyboards can ship with a generated ambient music bed that loops in Present mode. Volume slider and mute toggle live next to the panel controls. Generation reuses kind: 'music' with the storyboard title and outline as prompt context.

Authenticated studio capture pending: chat-driven music generation with the AudioPlayer in canvas.

Authenticated studio capture pending: video-to-audio foley result alongside its source clip.

Authenticated studio capture pending: a paired Tone.js + visual sketch with the audio-visual toggle.

Storyboards

A storyboard is a sequence of panels — image, video, 3D, code, or text — assembled inside the same chat-driven canvas, with a scaffolded outline, real per-panel playback, an optional ambient soundtrack, and a Present mode for review and sharing.

Sidebar tab
Storyboards live in their own sidebar tab (alongside Folders and Collections). The dropdown switcher above the tab list is the entry point on narrow widths.
Brainstorm
A brainstorm scratchpad with Refine and Send to storyboard actions seeds an outline; the AI fills the outline progressively via the update_outline tool and always offers a "your own direction" option.
Outline → panels
The scaffolder writes one placeholder panel per outline beat and auto-navigates to the new storyboard. Generate-this on a placeholder replaces it with the real artifact for its declared media type.
Panels in canvas
3D panels render textured and animated like the canvas itself (not as static frames). Video and code panels run their real previews. Any result can be sent to a storyboard from the viewer toolbar's collection picker.
Captions
Panel captions are auto-written from the outline plus image descriptions via Gemini; saved with the panel.
Soundtrack
An ambient music bed is generated inline (kind: 'music') from the storyboard title + outline. The track loops in Present mode and has a volume slider and a mute toggle.
Present mode
Narrate toggle, auto-advance, and loop. A render-loop cancellation fix from the May rollout keeps narration + auto-advance from dying between panels.
Scenes
Panels group into named scenes — add, rename, reorder, and per-panel scene assignment are all in the storyboard UI. Storyboard is also a collection mode (see Distribution).

Authenticated studio capture pending: brainstorm scratchpad and progressive outline fill.

Authenticated studio capture pending: storyboard with mixed image / video / 3D / code panels and ambient soundtrack controls.

Authenticated studio capture pending: Present mode with Narrate + auto-advance running across panels.

Apps + classroom

The studio itself is the app surface: chat, canvas, collections, and classroom. The classroom layer adds authoring, cohorts, lesson sessions, AI grading, and exportable reports. The middle way carries into teaching: ij8 teaches AI literacy, computational thinking, design, and creative innovation by having students make — self-paced, with full instructor control and visibility. The full teaching reference is at classroom.ij8.ai; institutional pilots and partnerships are at pilots.ij8.ai.

Scaffolding
course-scaffold.ts creates AI course scaffolds. /api/courses/syllabus/parse parses syllabi.
Authoring
CourseAuthorWorkspace and the course wizard modal support course authors.
Lessons
/api/lessons/sessions/[id]/next advances lesson sessions step by step. Lessons are self-paced: the mastery score is live feedback toward a visible goal, never a lock on the next lesson.
Grading
lesson-scoring.ts uses gemini-3.1-flash-lite as the judge. The v2 algorithm splits weight by code-eligibility: code lessons earn engagement points across turns and apply / edit channels; non-code lessons (text, image, audio, 3D) reallocate apply-channel weight back to turns so a writing or sound lesson can reach the full 100. The judge sees the actual rendered artifact alongside the student's prompts, code, and explanations.
Roles
Roles are admin, dev, teacher, student, and user. Cohorts and cohort memberships are part of the data model.
Commons
The Commons course pool is shared across teachers. Course reports and CSV export are available.
Shell
AppShell, Sidebar, LearnPanel, and LessonScoreHud integrate into the chat-canvas surface.
App Lab
lib/ai/app-lab/ is the guided-prototype framework: context-pack.ts bundles the refined prompt, requirements, constraints, variables, and an output sample; prototype-generator.ts generates a single-file React-CDN HTML prototype (React 19 UMD + ReactDOM + Tailwind + Babel Standalone, hash-routing, localStorage, mock FakeAuthProvider); inspect.ts validates structure and Babel syntax; virtual-files.ts isolates context; detect-underspecification.ts flags incomplete briefs; up to two repair attempts on validation failure; soft cap 750 KB, hard cap 1.5 MB. Prototypes are teaching-grade — readable and annotated — not production.
Lesson media types
Lessons declare which media types are allowed: image, video, 3d, code, app, text, and audio (which subsumes music, SFX, and foley). Text-mode chat is inlined; artifacts save explicitly. When both text and other modes are allowed, the lesson defaults to non-text so the canvas isn't crowded out.
Teacher controls
Teachers can override lesson sessions with Mark complete and Reopen. Restarting a lesson snapshots the prior outline as a tile in the new session so the student keeps a thread to what came before.
Tutorials
Alongside lessons: 50 official AI-led tutorials, each teaching one tightly scoped concept on an Explain → Show → Play → Make arc, with the canvas as the blackboard — every idea illustrated by a live, runnable sketch that was vetted before any student sees it. Ungraded by design, open to every signed-in account, searchable by tag and difficulty. Teachers author their own tutorials with AI co-authoring and accept-to-freeze illustration review, and courses interleave lessons and tutorials in one ordered curriculum. The full teaching treatment is at classroom.ij8.ai.
Student showcase
Students publish curated work to a public, teacher-curated class gallery (apps/showcase, served at showcase.ij8.ai) — cohort-scoped and wallet-free, distinct from the NFT gallery. Submissions move student → teacher curation (approve, cohort↔public, scene) → public reader; approved code sketches render live via /api/sketches/[id]/wrapped.
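
The App Lab entry above describes a bounded repair loop and two size caps. A minimal sketch of both follows. The generate, validate, and repair hooks are hypothetical stand-ins for prototype-generator.ts and inspect.ts, the loop is synchronous for brevity, and interpreting KB and MB as binary units is an assumption.

```typescript
const SOFT_CAP_BYTES = 750 * 1024;        // assumes KB means 1024 bytes
const HARD_CAP_BYTES = 1.5 * 1024 * 1024; // assumes MB means 1024 * 1024 bytes
const MAX_REPAIR_ATTEMPTS = 2;

type SizeVerdict = "ok" | "over-soft-cap" | "over-hard-cap";

// Classifies a prototype's size against the two caps.
function checkSize(bytes: number): SizeVerdict {
  if (bytes > HARD_CAP_BYTES) return "over-hard-cap";
  if (bytes > SOFT_CAP_BYTES) return "over-soft-cap";
  return "ok";
}

// Generates a prototype, then repairs it while validation keeps failing,
// stopping after MAX_REPAIR_ATTEMPTS repairs.
function generateWithRepair(
  generate: () => string,
  validate: (html: string) => string[], // list of problems; empty means valid
  repair: (html: string, problems: string[]) => string,
): { html: string; valid: boolean; repairs: number } {
  let html = generate();
  let problems = validate(html);
  let repairs = 0;
  while (problems.length > 0 && repairs < MAX_REPAIR_ATTEMPTS) {
    html = repair(html, problems);
    problems = validate(html);
    repairs++;
  }
  return { html, valid: problems.length === 0, repairs };
}
```

Returning the repair count alongside the validity flag lets a caller distinguish a clean first pass from a prototype that needed fixing.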

ij8 Studio App Lab context pack in AppShell with chat and canvas — Studio capture: AppShell + chat + canvas running an App Lab prototype — context pack on the left (refined prompt, entities, variables), live mobile prototype rendered in the canvas.

Authenticated capture pending: student lesson session with stepwise progression and LessonScoreHud.

Authenticated capture pending: admin/teacher report table with CSV export.

Hybrid approaches

The platform is strongest when a work moves between modes. The following cross-pipeline flows are routine or directly implied by wired surfaces.

Image to 3D mesh
Generate or import a 2D image, run the 3D prep prompt, run the shape and texture passes, and inspect the mesh in the model viewer.
Mesh to rigged animation
Generate a mesh, classify morphology, author or infer a RigSpec, then bake continuous behavior in Blender.
Sketch to image
Generate p5.js or GLSL, run it, capture the canvas frame, and store the result as an image.
Image to mask to video
Select a region, create a mask, and send the image/mask pair into VACE or another video pipeline.
Mask to inpaint variation
Paint a mask, generate multiple inpaint variants, and retain picker state across turns.
Image to background removal to 3D
Batch remove background with rembg variants, then feed the cleaner silhouette to HunYuan3D.
Animation frame to image edit to re-animation
Scrub to a frame, edit it as an image, then use the altered frame as a new animation source.
Code to APNG
Run a sketch or shader, capture frames, and export a lightweight loop.
Sketch to sound
Pair Tone.js into a visual sketch with the audio-visual toggle so sound is authored alongside the canvas rather than added afterward.
Video to foley
Generate or scrub a video, then run HunyuanVideo-Foley to synthesize a matching audio track for it.
Course session to minted collection
Use classroom-guided production to generate work, curate outputs, then publish through collections.
Drawing to artifact
One freehand pad, two destinations: render the drawing as a finished image, or hand it to the code model as vision guidance and get the form back as live geometry in code.
Still to regional motion
Mask a region of a finished image and animate only that region into a looping video; the rest stays the literal photograph.
Views to solid mesh
Hand-pick or synthesize back/left/right views of a subject, then condition the 3D shape pass on all of them to kill the flat-back problem.
Generative reserve to fulfill
An on-chain reservation triggers background image generation, IPFS pinning, and on-chain fulfillment.
Image edit to gallery drop
Use conversational editing and background removal to refine a work, then place it inside a scheduled gallery release.
Codegen collection
Fork a sketch, validate it in Chromium, then mint or surface it as a codegen lane collection.

Isometric studio artwork from public gallery — Public gallery capture: isometric studio work as a reference for hybrid spatial/code/image practice.

Gli Studi degli Artisti artwork — Public gallery capture: environment-like generative image, suitable for downstream 3D or animation treatment.

Authenticated capture pending: a complete hybrid chain from chat prompt to canvas artifact to collection.

Distribution: gallery, collections, NFT

The distribution layer is live: artist pages, collection pages, scheduled drops, IPFS pinning, Ethereum contracts, royalties, blind reveals, regenerative fulfill, and gallery surfacing.

Contracts
IJGCollection.sol (V1), IJGCollectionV2.sol (generative reservation), IJGCollectionV3.sol (regeneration, refund, cloneable), and IJGCollectionFactory.sol (EIP-1167 minimal-proxy clones).
Chain + tests
Solidity 0.8.28, Cancun EVM, OpenZeppelin v5.4, 66 contract test cases across 3 suites (V1, V2, V3), Hardhat tooling. Sepolia V1 deploy: 0x92B10e1f7107D7A2436445B4842a8077d041151e, verified.
Collection modes
open, closed, edition, generative, blind, codegen, storyboard, and showcase. Creative lanes: ai, code, hybrid. Storyboard and showcase are deliberately non-NFT modes: storyboard is a project workspace that hides all minting and contract UI; showcase carries per-item curation state (pending/approved, cohort/public) for teacher-curated class galleries.
IPFS + fulfill
The Pinata REST API handles IPFS pinning. Generative drops move through four steps: on-chain reserve, non-blocking image-generation fetch, IPFS pin, and on-chain fulfill. Recovery and regen-status routes handle orphans.
Blind drops
Blind drops reuse the Closed contract mode and add an API/UI redaction layer: it reads tokenSold() on-chain and nulls filePath, title, and prompt for unsold items, with a five-second cache.
Regeneration (V3)
V3 collections expose regenerate(tokenId) bounded by a maxRegenerations counter. The full state machine is reserved → generating → pinning → fulfilling → fulfilled, with refundGenerative() closing the loop when a generation fails. Recovery and regen-status routes pick up orphaned reservations.
Storyboard collections
The storyboard collection mode is a non-NFT project surface: panels render live in the studio canvas and in the class-showcase reader — 3D panels textured and animated, video and code panels running their real previews, an optional ambient soundtrack looping in Present mode. It deliberately hides minting and contract UI.
Royalty + fee
ERC-2981 royalty is 7.5% on secondary sales. Platform fee is 10% on primary sales.
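
The V3 lifecycle described under Regeneration (V3) can be modeled as a small transition table. This is an off-chain illustration only, not the contract logic: "refunded" is an assumed name for the terminal state reached through refundGenerative(), and allowing a refund from any non-terminal state is also an assumption.

```typescript
type GenState =
  | "reserved"
  | "generating"
  | "pinning"
  | "fulfilling"
  | "fulfilled"
  | "refunded"; // assumed name for the state after refundGenerative()

// Forward path: reserved -> generating -> pinning -> fulfilling -> fulfilled.
const NEXT: Record<GenState, GenState | null> = {
  reserved: "generating",
  generating: "pinning",
  pinning: "fulfilling",
  fulfilling: "fulfilled",
  fulfilled: null,
  refunded: null,
};

// Moves one step along the forward path; terminal states cannot advance.
function advance(state: GenState): GenState {
  const next = NEXT[state];
  if (next === null) throw new Error(`${state} is terminal`);
  return next;
}

// A failure in a non-terminal state closes the loop with a refund.
function fail(state: GenState): GenState {
  if (state === "fulfilled" || state === "refunded") {
    throw new Error(`${state} is terminal`);
  }
  return "refunded";
}

// Regeneration is bounded by the maxRegenerations counter.
function canRegenerate(regenerationsUsed: number, maxRegenerations: number): boolean {
  return regenerationsUsed < maxRegenerations;
}
```

Keeping the table explicit makes an orphaned reservation easy to spot: any record that sits in a non-terminal state for too long is a candidate for the recovery route.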

ij8 Gallery homepage capture — Public capture: gallery home with featured collections.

ij8 Gallery explore capture — Public capture: explore page with many on-chain collection cards.

ij8 public landing capture — Public capture: unauthenticated ij8 landing surface.

Backends + routing

Generation is routed per request across four ComfyUI backends. For most users the choice is silent — they see generation time, not infrastructure. Users granted remote access get an explicit Compute backend toggle. Per-user and per-cohort opt-in gates HPC access; audio always pins to local.

Local 4090
Studio host, no tunnel. Default for every user. Audio (music, SFX, foley) pins here unconditionally — remote round-trip exceeds generation time.
SMU SuperPOD A100
Reached via an SSH -L tunnel from the studio host. Env: COMFYUI_URL_HPC. Opportunistic partition scheduling prefers the short partition (4-hour cap) over contended batch and auto-retargets pending jobs as queues shift. Cannot serve commercial / paid work — gated by cohort + user flag.
RunPod
Secondary remote target for commercial / paid work and for capacity overflow. Env: COMFYUI_URL_RUNPOD or COMFYUI_RUNPOD_URLS.
RunComfy API
Managed-template backend with bearer-token auth. Suited to workflows already curated by RunComfy (e.g. HunYuan3D 2.1). Cold-start latency on min_instances=0 templates is ≈ 8.5 minutes — pre-warm before class. RunComfy also serves 2D generation now: Flux 2 Dev, Qwen-Image 2512, Flux 2 Klein, and Z-Image Turbo are mapped to serverless model endpoints.
Routing
comfyui-backend.ts uses AsyncLocalStorage — a comfyBackendStore typed as ComfyBackendContext: the per-request preference is resolved at the route layer and inherited by every downstream getComfyUrl() call. No global state, no parameter threading through 20 callees.
Liveness
Two-stage probe per backend: a 700ms TCP connect (detects whether an SSH tunnel is up at all), then a 3s HTTP GET /queue (confirms ComfyUI is responding, with headroom for heavy generations). Results are cached for 15 seconds. Timeouts on the HTTP stage are treated as alive-but-slow so polling does not falsely demote a backend mid-generation.
Access
The per-user users.hpcAccess and per-cohort cohorts.hpcAccess flags opt users in to HPC routing. The preference field is settings.preferredComfyBackend = 'local' | 'hpc' | 'runpod' | 'api' | 'auto'; 'auto' normalizes to local, and 'api' selects the RunComfy managed-template path.
Health
/api/health reports each backend independently: DB (SELECT 1), ComfyUI /queue probe, HunYuan 3D model presence, and API-key presence for Gemini / OpenAI. Latency captured per backend. Status: ok / degraded / down.
Provider settings
/api/admin/providers exposes platform-level provider toggles and per-platform key overrides. The platform_provider_settings and user_provider_settings tables back these settings and the per-user keys. Both platform and per-user settings have shipped UI, and users can supply their own provider keys in the settings panel.
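
The Routing entry above describes request-scoped context carried by AsyncLocalStorage. A minimal sketch of that pattern follows, using Node's node:async_hooks module. The names comfyBackendStore, ComfyBackendContext, and getComfyUrl come from the text; withComfyBackend, the URL table, and the local fallback outside a request are assumptions for illustration.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type ComfyBackend = "local" | "hpc" | "runpod" | "api";

interface ComfyBackendContext {
  backend: ComfyBackend;
}

const comfyBackendStore = new AsyncLocalStorage<ComfyBackendContext>();

// Hypothetical URL table; real values would come from env vars such as COMFYUI_URL_HPC.
const BACKEND_URLS: Record<ComfyBackend, string> = {
  local: "http://127.0.0.1:8188",
  hpc: "http://127.0.0.1:8189",
  runpod: "https://runpod.example.invalid",
  api: "https://runcomfy.example.invalid",
};

// Any callee can read the current request's backend without receiving it as a parameter.
function getComfyUrl(): string {
  const ctx = comfyBackendStore.getStore();
  return BACKEND_URLS[ctx ? ctx.backend : "local"]; // assumed fallback outside a request
}

// Called once at the route layer; everything invoked inside `handler` inherits the choice.
function withComfyBackend<T>(backend: ComfyBackend, handler: () => T): T {
  return comfyBackendStore.run({ backend }, handler);
}
```

The store also propagates across awaited calls made inside handler, which is what lets deeply nested workflow builders resolve the right URL without parameter threading.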

Routing is intentionally policy-driven rather than availability-driven: paid / commercial requests must never be routed to SMU, while free / educational requests can fall back to local if HPC is unreachable.
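
The policy can be expressed as a small pure function in which the policy checks run before any availability check. This is a sketch under stated assumptions: RouteRequest and its field names are hypothetical, and redirecting commercial work from HPC to RunPod is inferred from RunPod's described role, not confirmed by the source.

```typescript
type Backend = "local" | "hpc" | "runpod" | "api";
type Preference = Backend | "auto";

interface RouteRequest {
  preference: Preference;
  commercial: boolean;   // paid / commercial work
  hpcAccess: boolean;    // user flag or cohort flag grants access
  hpcReachable: boolean; // outcome of the liveness probe
}

function resolveBackend(req: RouteRequest): Backend {
  // 'auto' normalizes to local.
  const wanted: Backend = req.preference === "auto" ? "local" : req.preference;
  if (wanted !== "hpc") return wanted;

  // Policy first: commercial work never reaches SMU, reachable or not.
  // Redirecting to RunPod is an assumption based on its stated role.
  if (req.commercial) return "runpod";

  // Without the opt-in flag, HPC is not available to this request.
  if (!req.hpcAccess) return "local";

  // Free / educational work may fall back to local when HPC is unreachable.
  return req.hpcReachable ? "hpc" : "local";
}
```

Ordering the checks this way means an HPC outage can never cause a commercial request to be reconsidered for SMU once it recovers mid-request.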

Roadmap

Roadmap items are marked by current state. Sound has shipped and has its own section; word remains a planned medium, not yet a shipped feature.

State	Item	Notes
Shipping	Gallery and minting infrastructure	Artist pages, collection pages, scheduled drops, Ethereum L1 + Sepolia contract paths, ERC-2981 royalties, Pinata IPFS pinning.
Shipping / active	Image generation and editing	Gemini, OpenAI fallback, ComfyUI workflows, masking, inpaint, outpaint, background removal, crop, upscale, and region-level interactions.
Shipping / active	Sound	Music (ACE-Step v1.5 turbo), SFX and ambient (Stable Audio Open 1.0), video-to-audio foley (HunyuanVideo-Foley), and Tone.js as a creative-coding framework. See section 07.
Shipping	Learning surfaces	50 official tutorials (Explain → Show → Play → Make), instructor-authored tutorials, Gist and How-it-Works explanation layers. See classroom.ij8.ai.
Shipped	3D job cancellation	Ownership-checked cancel that interrupts the in-flight ComfyUI workflow; the fulfill worker checks between every stage; per-stage timeouts.
In progress	Cancellation everywhere	Video, animation, and generative-fulfill jobs still need the same treatment.
Next	Hot-wallet hygiene	Move toward KMS or signing-sidecar abstraction.
Next	Observability	Sentry rollout, structured logging, and real per-backend health checks.
Started	Test scaffolding	Vitest has landed in the studio (11 unit-test suites, including outpaint, gist, history-trim, render-policy, app-lab, and access). CI gates remain lint + build; a typecheck gate and rate limiting are still ahead.
Planned	Word	Algorithmic poetry, generative text and word-art forms, computational poetics, and lettrist / concrete-poetry tooling, treated as a first-class medium alongside image and sound rather than as a chat afterthought.
Exploratory	Speech / TTS	The `AudioMediaType` enum already includes `'speech'`; no implementation. Pending a deliberate choice of voice model and license terms.
Exploratory	Multi-chain support	Support beyond Ethereum.
Exploratory	Beyond-NFT distribution	Digital-painting fabrication, 3D printing, and game-asset pipelines.
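
The 3D job cancellation row describes three mechanisms: an ownership check, a cancel check between stages, and per-stage timeouts. A minimal sketch of how they could fit together follows. The `Job` and `Stage` shapes, the in-memory `controllers` map, and the function names are assumptions; the actual fulfill worker is not shown in this reference.

```typescript
interface Job { id: string; ownerId: string; }

interface Stage {
  name: string;
  timeoutMs: number;
  // A real stage would forward the abort signal to the in-flight ComfyUI workflow.
  run: (signal: AbortSignal) => Promise<void>;
}

type Outcome = "done" | "cancelled" | "timeout";

// Hypothetical in-memory registry of running jobs.
const controllers = new Map<string, AbortController>();

// Ownership-checked cancel: only the job's owner may abort a running job.
function cancelJob(job: Job, requesterId: string): boolean {
  const controller = controllers.get(job.id);
  if (!controller || job.ownerId !== requesterId) return false;
  controller.abort();
  return true;
}

// Race one stage against its own timeout; the timer is always cleared on settle.
function runWithTimeout(stage: Stage, signal: AbortSignal): Promise<"ok" | "timeout"> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("timeout"), stage.timeoutMs);
    stage.run(signal).then(
      () => { clearTimeout(timer); resolve("ok"); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function runStages(job: Job, stages: Stage[]): Promise<Outcome> {
  const controller = new AbortController();
  controllers.set(job.id, controller);
  try {
    for (const stage of stages) {
      // Check for cancellation between every stage.
      if (controller.signal.aborted) return "cancelled";
      const result = await runWithTimeout(stage, controller.signal);
      if (result === "timeout") {
        controller.abort(); // also stop whatever the timed-out stage left running
        return "timeout";
      }
    }
    return controller.signal.aborted ? "cancelled" : "done";
  } catch (err) {
    // A stage that rejects because it was aborted counts as a cancellation.
    if (controller.signal.aborted) return "cancelled";
    throw err;
  } finally {
    controllers.delete(job.id);
  }
}
```

The same pattern would apply to the video, animation, and generative-fulfill jobs listed as in progress, provided each can be expressed as a list of stages.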

Open questions / what artists shape

Founding-cohort artists are invited to influence which roadmap items get prioritized and what bespoke workflows are developed around their practice. The relationship is closer to founding members than customers: the tools should be tested against real methods, not generic usage funnels.

Open questions include which hybrid flows matter most, where bespoke artist tools should become shared primitives, which distribution formats should sit beyond NFT, and how word should enter the same canvas, and sound deepen within it, without either becoming an ornamental add-on. Drawing as a first-class input (one pad feeding both image and code) and storyboard authoring are the surfaces most recently under active redesign; the question there is how much narrative scaffolding to bake in before the studio stops feeling like an open canvas.

Index / colophon

Complete enumeration of named models, backends, contracts, and infrastructure in this reference.

Layer	Technology / model / path	Status / use
Image	`gemini-3-pro-image-preview`	Primary image generation and multi-turn editing.
Image	`gpt-image-1.5`	OpenAI fallback for Gemini 503.
Chat orchestration	`gemini-3.5-flash` (`GEMINI_CHAT_MODEL` override); `qwen3:14b` (Ollama); `minimax-m27`	Primary chat / agentic loop; local non-streaming alternate inside the agentic loop; Minimax HTTP alternate.
ComfyUI image	Flux 1, Flux 2, SD 3.5, Qwen-Image 2512, Flux 2 Klein, Z-Image Turbo (incl. GGUF-quantized), CyberRealistic, Playground 2.5, Kontext, SDXL style-transfer	Workflow library in `comfyui.ts`.
Edit	`MaskEditor`, `MaskToolbar`, `BoundingBoxOverlay`, rembg general / anime / precise / birefnet, `convert-to-2d`, sketch-to-image, draw-to-code (role-separated sketch refs)	Masking, inpaint, outpaint, crop, zoom-enhance, upscale, background removal, code raster.
Video	Wan 2.2 14B, Hailuo 2.3, Hailuo 2.3 Fast, VACE, cel-style video, `gemini-animation.ts`, `hy-motion.ts`, ffmpeg 6.1.1, VACE animate-region (SSE)	ComfyUI, Minimax API, masked video inpaint, animation, share-video export.
3D	HunYuan3D 2.1, HunYuan3D 2.0, HunYuan3D-2mv multi-view, `/api/3d-views/generate`, 3D job cancel, `build3DPrepPrompt`, `model-viewer`	Dual-pass shape to texture, multi-view conditioning, prep-prompt, viewer, APNG export.
Rigging	Mixamo FBX, Blender 4.4.3, UniRig, `scripts/classify_morphology.py`, `classify-morphology.ts`, `lib/rig/spec.ts`, `lib/rig/topology.ts`, `scripts/bake_rig.py`, `bake-rig.ts`, `/rig-test`, `/api/rig/auto-smoke`	Retargeting, auto-rigging, morphology classification (radial, stalk, body-tail, pulse-mass), RigSpec authoring, Blender bake, SSE streamed long bakes.
Code	`code-pipeline.ts`, `glsl-wrapper.ts`, p5.js, three.js, GLSL, Tone.js, app framework, `CodeViewer`, GistPane, HowItWorksPane, deterministic per-id seed, DrawToCodePanel, `/api/sessions/fork-sketch`, `/api/images/[id]/explain`, `gemini-3.5-flash`, `gpt-5.4`, `CODE_VALIDATION_BROWSER`	Creative coding across four frameworks, run-and-capture, fork, sketch explanation, model fallback, browser validation.
Storyboards	scenes schema, `collection_items.narrative`, present mode, narrate toggle, auto-advance, ambient soundtrack	Multi-panel scene authoring; mixed-media panels; present-mode playback.
Brainstorm	brainstorm scratchpad + Refine + Send-to-storyboard, `update_outline` tool, progressive outline fill	Pre-scaffold ideation loop feeding storyboards.
App Lab	`lib/ai/app-lab/context-pack.ts`, `prototype-generator.ts`, `inspect.ts`, `virtual-files.ts`, `detect-underspecification.ts`, `map.ts`, `shape.ts`, `stage-selection.ts`	React-CDN guided prototype generation with validation + repair.
Sound	ACE-Step v1.5 turbo, Stable Audio Open 1.0, HunyuanVideo-Foley XXL, Tone.js, `/api/generate-audio`, `comfyui-audio.ts`, `comfyui-foley.ts`, `AudioPlayer`, `SoundtrackBar`	Music, SFX and ambient, video-to-audio foley, Tone.js creative-coding audio, storyboard soundtrack playback.
Classroom	`course-scaffold.ts`, `/api/courses/syllabus/parse`, `CourseAuthorWorkspace`, course wizard modal, `/api/lessons/sessions/[id]/next`, `lesson-scoring.ts`, `gemini-3.1-flash-lite`, AppShell, Sidebar, LearnPanel, LessonScoreHud	Course authoring, lesson progression, AI grading, roles, reports, Commons course pool.
Learning	50 seeded tutorials (`seed-tutorials.ts`), illustration verify-refine agent + heartbeat curator, archetype widgets (mapping-diagram, step-builder, param-explorer, live-animated), `tutorials` / `tutorial_illustrations` / `tutorial_sessions` / `course_tutorials` tables	Explain→Show→Play→Make tutorial runtime + instructor authoring.
Backends	`comfyui-backend.ts`, `comfyui-backend-config.ts`, AsyncLocalStorage(`comfyBackendContext`), TCP+HTTP liveness, 15s cache, `users.hpcAccess`, `cohorts.hpcAccess`	Per-request routing across local 4090, SMU SuperPOD A100, RunPod, RunComfy.
Operations	`/api/health` (DB · ComfyUI · HunYuan3D · key-presence per backend), `/api/admin/providers`, `platform_provider_settings`, `user_provider_settings`, pino, Sentry stubs	Health + provider-key admin + observability scaffolding.
Contracts	`IJGCollection.sol`, `IJGCollectionV2.sol`, `IJGCollectionV3.sol`, `IJGCollectionFactory.sol`, Sepolia `0x92B10e1f7107D7A2436445B4842a8077d041151e`	V1, generative reservation, regeneration + refund + cloneable, EIP-1167 factory.
Stack	Next.js 16.1.6, React 19.2.3, TypeScript 5 strict, Tailwind 4, Zustand 5, Drizzle ORM + better-sqlite3 SQLite (36 tables), Auth.js, Google OAuth, email magic link, DB whitelist, wagmi ^3, viem ^2, custom WalletButton, ComfyUI, Blender 4.4.3, ffmpeg 6.1.1, rembg, UniRig, ACE-Step v1.5 turbo, Stable Audio Open 1.0, HunyuanVideo-Foley XXL, Tone.js, storyboards (scenes schema, present mode, ambient soundtrack), App Lab framework (React 19 UMD, Tailwind, Babel Standalone), Gemini 3.5 Flash, Qwen3:14b (Ollama), Minimax M27, npm workspaces, Turborepo, Cortex-studio, PM2, Cloudflare tunnel, Pinata, Ethereum mainnet + Sepolia	Application and infrastructure stack.
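
The Backends row mentions liveness checks with a 15s cache. A minimal sketch of such a cache is below. The `LivenessCache` class, its injectable clock, and the choice to treat a failing probe as not alive are assumptions for illustration; the probe itself (TCP connect plus HTTP request) is left to the caller.

```typescript
interface LivenessEntry {
  alive: boolean;
  checkedAt: number;
}

class LivenessCache {
  private readonly entries = new Map<string, LivenessEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so the expiry logic can be exercised without waiting.
  constructor(ttlMs: number = 15_000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached answer while it is fresh; otherwise re-run the probe.
  async isAlive(backend: string, probe: () => Promise<boolean>): Promise<boolean> {
    const cached = this.entries.get(backend);
    if (cached && this.now() - cached.checkedAt < this.ttlMs) return cached.alive;

    let alive = false;
    try {
      alive = await probe();
    } catch {
      alive = false; // assumed: a probe that throws means the backend is not alive
    }
    this.entries.set(backend, { alive, checkedAt: this.now() });
    return alive;
  }
}
```

Caching negative results for the same window keeps an unreachable backend from being re-probed on every request, at the cost of up to 15 seconds of delay before a recovered backend is noticed.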

Self-hosted Inter. No tracking. No external scripts. Single-page static output for tooling.ij8.ai. Last rendered June 12, 2026.