Add capability system and collectStream utility for AI tasks #479

Open
sroussey wants to merge 16 commits into main from claude/multi-task-model-registration-cwfD2

@sroussey
Collaborator

Summary

This PR introduces a new capability-based system for AI tasks and adds a collectStream utility for consuming streaming events. It replaces the legacy task-based model metadata with a closed vocabulary of capability identifiers, enabling stricter type safety and better capability matching at compile time.

Key Changes

New Capability System

  • Capabilities.ts: Defines a closed vocabulary of 44 AI capability identifiers (e.g., "text.generation", "text.embedding", "image.segmentation", "tool-use", "json-mode") with descriptions. Uses dot-notation and hyphen-notation instead of legacy PascalCase task names.
  • StreamEvents.ts: Re-exports canonical stream event types from @workglow/task-graph for capability-aware consumers.
  • capability/index.ts: Public barrel export for the capability module.
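
A minimal sketch of how a closed `as const` vocabulary with a derived `Capability` type could look. The four identifiers are taken from the summary above; the descriptions, the `isCapability` helper, and the `exampleModelCapabilities` constant are illustrative assumptions, not the contents of the real `Capabilities.ts`.

```typescript
// Hypothetical subset of the vocabulary. Identifiers come from the PR summary;
// the description strings are invented for illustration.
const CAPABILITIES = {
  "text.generation": "Generate text from a prompt",
  "text.embedding": "Embed text into a vector",
  "tool-use": "Call tools or functions",
  "json-mode": "Emit structured JSON output",
} as const satisfies Record<string, string>;

// Derived union type, so no enum is needed.
type Capability = keyof typeof CAPABILITIES;

// Hypothetical runtime guard for values that arrive as plain strings.
function isCapability(value: string): value is Capability {
  return Object.prototype.hasOwnProperty.call(CAPABILITIES, value);
}

// Typed arrays reject anything outside the vocabulary at compile time.
const exampleModelCapabilities: readonly Capability[] = ["text.generation", "tool-use"];
// const bad: Capability = "TextGenerationTask"; // would not compile

console.log(exampleModelCapabilities.every(isCapability));
```

Deriving the type from the object keys keeps the vocabulary and the type in one place, so adding an identifier is a one-line change.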

Stream Collection Utility

  • collectStream.ts: New async function that consumes an AsyncIterable<StreamEvent<T>> and returns the fully accumulated output T. Supports:
    • Delta accumulation: Concatenates text-delta events per-port; handles object-delta with replace semantics for objects and upsert-by-id for arrays
    • One-shot mode: Returns finish.data directly when no deltas arrive
    • Snapshot mode: Last snapshot wins, with finish.data merged on top
    • Error handling: Throws on StreamError events or missing finish event
    • Mixed-mode guard: Rejects streams mixing text-delta and object-delta events
    • First finish wins: Breaks immediately on first finish event to prevent corruption from duplicates
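
The accumulation rules above can be sketched roughly as follows. This is a simplified illustration, not the real `collectStream`: the event shapes (`port`, `textDelta`, `data`, `error`) are assumptions, and `object-delta` handling and the mixed-mode guard are left out.

```typescript
// Assumed, simplified event shapes; the real types live in @workglow/task-graph.
type StreamEventSketch<T> =
  | { type: "text-delta"; port: string; textDelta: string }
  | { type: "snapshot"; data: Partial<T> }
  | { type: "finish"; data: Partial<T> }
  | { type: "error"; error: Error };

async function collectStreamSketch<T extends Record<string, unknown>>(
  stream: AsyncIterable<StreamEventSketch<T>>
): Promise<T> {
  const text: Record<string, string> = {}; // per-port text accumulation
  let snapshot: Partial<T> | undefined; // last snapshot wins
  let finish: Partial<T> | undefined;

  for await (const event of stream) {
    if (event.type === "error") {
      throw event.error;
    } else if (event.type === "text-delta") {
      text[event.port] = (text[event.port] ?? "") + event.textDelta;
    } else if (event.type === "snapshot") {
      snapshot = event.data;
    } else if (event.type === "finish") {
      finish = event.data;
      break; // first finish wins; later events are never read
    }
  }

  if (finish === undefined) {
    throw new Error("stream ended without a finish event");
  }

  // Accumulated text first, then the last snapshot, then finish.data on top.
  const result: Record<string, unknown> = { ...text };
  Object.assign(result, snapshot, finish);
  return result as T;
}
```

In one-shot mode no deltas or snapshots arrive, so the result is just a copy of `finish.data`.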

Task Base Class Updates

  • AiTask.ts: Adds static requires property (empty array by default) to declare capabilities a task requires from the model. Subclasses override with relevant Capability values. Includes legacy task name detection guard (isLegacyTaskClassName) for backward compatibility during migration.
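
A rough sketch of the static `requires` pattern, assuming a three-entry stand-in for `Capability`. The class names ending in `Sketch` and the `requiresOf` helper are hypothetical; only the `static readonly requires` declaration and the `instance.constructor` access pattern come from this PR.

```typescript
// Stand-in for the real Capability union (assumption: three entries only).
type Capability = "text.generation" | "text.embedding" | "tool-use";

class AiTaskSketch {
  // Empty default: a task that declares nothing requires nothing from the model.
  public static readonly requires: readonly Capability[] = [];
}

class TextGenerationTaskSketch extends AiTaskSketch {
  public static override readonly requires: readonly Capability[] = ["text.generation"];
}

// Inherits the empty default through the static prototype chain.
class PureComputeTaskSketch extends AiTaskSketch {}

// Hypothetical helper showing how a dispatcher could read the static field
// from an instance.
function requiresOf(instance: AiTaskSketch): readonly Capability[] {
  return (instance.constructor as typeof AiTaskSketch).requires;
}

console.log(requiresOf(new TextGenerationTaskSketch())); // ["text.generation"]
console.log(requiresOf(new PureComputeTaskSketch())); // []
```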

Test Coverage

  • collectStream.test.ts: 18 comprehensive tests covering delta accumulation, one-shot results, error handling, multi-port streams, snapshot mode, mixed-mode rejection, and type safety.
  • AiTask.requires.test.ts: Tests for requires property on AiTask, StreamingAiTask, AiVisionTask, and AiImageOutputTask base classes and subclasses.

Model Metadata Migration

Updated model registrations across all providers and test fixtures to use new capability strings instead of legacy task class names:

  • "TextGenerationTask" → "text.generation"
  • "TextEmbeddingTask" → "text.embedding"
  • "ImageGenerateTask" → "image.generation"
  • "StructuredGenerationTask" → "json-mode"
  • "ToolCallingTask" → "tool-use"
  • And 30+ other mappings across HuggingFace, Google Gemini, OpenAI, Anthropic, Ollama, MediaPipe, and local ONNX models.
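
As an illustration of the migration, a fixture's legacy `tasks` array could be rewritten with a lookup like the one below. Only the five mappings listed above are included; `LEGACY_TO_CAPABILITY` and `migrateTasksToCapabilities` are hypothetical names, and the PR does not say the migration was done with such a helper.

```typescript
// The five mappings listed in this PR description; the remaining ones are omitted.
const LEGACY_TO_CAPABILITY: Readonly<Record<string, string>> = {
  TextGenerationTask: "text.generation",
  TextEmbeddingTask: "text.embedding",
  ImageGenerateTask: "image.generation",
  StructuredGenerationTask: "json-mode",
  ToolCallingTask: "tool-use",
};

// Hypothetical helper: map legacy names, pass unknown values through unchanged,
// and drop duplicates while keeping first-seen order.
function migrateTasksToCapabilities(tasks: readonly string[]): string[] {
  const mapped = tasks.map((name) => LEGACY_TO_CAPABILITY[name] ?? name);
  return Array.from(new Set(mapped));
}

console.log(migrateTasksToCapabilities(["TextGenerationTask", "ToolCallingTask"]));
```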

Schema and Export Updates

  • ModelSchema.ts: Renamed tasks field to capabilities in model configuration schema.
  • ModelRepository.ts: Updated to filter models by capabilities instead of tasks.
  • common.ts: Added capability module to public exports.

Implementation Details

  • The collectStream function exactly mirrors StreamProcessor's accumulation logic for consistency
  • Capability strings use a closed vocabulary enforced at compile time via TypeScript's satisfies operator
  • Legacy task class name detection preserves backward compatibility during the migration phase
  • All 44 concrete AI tasks will be populated with their required capabilities in Phase 4

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6

claude added 7 commits May 9, 2026 22:20
Adds a new `capability/` module to `@workglow/ai` with:
- `Capabilities.ts`: closed `as const` vocabulary of 29 AI capability
  identifiers with a derived `Capability` type (no enum)
- `StreamEvents.ts`: re-exports `StreamEvent<T>` and related types from
  `@workglow/task-graph` for capability-aware consumers
- `collectStream.ts`: `async function collectStream<T>()` that handles
  both delta-accumulation (text-delta/object-delta) and one-shot (finish-
  only) stream variants, with full error propagation
- `index.ts`: barrel re-exporting all three modules
- `collectStream.test.ts`: 7 vitest tests covering all accumulation paths,
  error cases, and a compile-time Capability type check

Re-exports the capability barrel from `@workglow/ai`'s `common.ts` so
consumers can import via `@workglow/ai`.

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…tics

Per code review on Phase 0 commit 1b22e11. Replace shallow-merge with
replace semantics for object-delta to match StreamProcessor; track
text deltas per-port to preserve multi-port output type; add isNonEmptyObject
guard; add missing tests for snapshot, post-delta error, and merge paths.

Rename ModelConfigSchema.tasks -> capabilities and propagate across libs:
- packages/ai/src/model/ModelSchema.ts: field + required-list rename
- packages/ai/src/model/ModelRepository.ts: read-site updates
- packages/ai/src/task/base/AiTask.ts: model-compat check uses
  capabilities; transitional usesTaskClassNames guard skips the legacy
  check when values look like new dot-notation strings (Phase 4 will
  formalize the task-type -> capability mapping)
- packages/ai/src/provider-utils/HfModelSearch.ts: read-site
- providers/*/ai/common/*_ModelSearch.ts: 7 vendor model-search helpers
- packages/test/src/samples/{MediaPipe,ONNX}ModelSamples.ts: fixtures
  migrated, values mapped to capability strings (TextEmbeddingTask ->
  text.embedding, etc.)
- packages/test/src/test/**/*.test.ts: integration test fixtures updated

Schema kept as plain string array (fallback form) per the spec - the
preferred enum-derived form interferes with the as const satisfies
DataPortSchemaObject literal tracking.

Tests verified: 85/85 across capability, ai-provider, and ai-model areas.
Build green: build:packages 58/58, build:types 60/60.
…ures

Per code review on Phase 1 commit 79abbb4:
- usesTaskClassNames now matches PascalCase ...Task pattern, not just
  no-dot strings; tool-use/json-mode/vision-input no longer trigger the
  legacy path; isLegacyTaskClassName extracted to module scope so both
  call sites share the predicate (single Phase 4 deletion target)
- Dedupe StructuredGenerationTask migration overlap with TextGenerationTask
  in Anthropic/Gemini/OpenAI/HFT generic test fixtures (duplicate
  "text.generation" entries removed)
- StreamingAiTaskPhases test fixtures now pass "text.summary" capability
  string instead of "TextSummaryTask" class name
- AgentTask handling: AgentTask does not exist anywhere in the codebase;
  the pre-Phase-1 fixtures referenced it but the class was never defined
  in packages/ or providers/. No mapping applied; fixtures left as-is.
- I3 (DownloadModelTask in LlamaCpp fixtures): DownloadModelTask was never
  in the capabilities array; it is a task used to download models. LlamaCpp
  fixtures at HEAD already use new-style capability strings. Non-issue.
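
The predicate described in the first bullet might look roughly like this. The regular expression is an assumption about what "PascalCase ...Task pattern" means; the actual pattern in `AiTask.ts` is not shown here.

```typescript
// Assumed pattern: starts with an uppercase letter, contains only letters and
// digits, and ends in "Task".
const LEGACY_TASK_CLASS_NAME = /^[A-Z][A-Za-z0-9]*Task$/;

function isLegacyTaskClassName(value: string): boolean {
  return LEGACY_TASK_CLASS_NAME.test(value);
}

// A model still uses the legacy form if any entry looks like a task class name.
function usesTaskClassNames(values: readonly string[]): boolean {
  return values.some(isLegacyTaskClassName);
}

console.log(usesTaskClassNames(["TextGenerationTask"])); // true
console.log(usesTaskClassNames(["tool-use", "json-mode", "vision-input"])); // false
```

A plain "contains no dot" test would misclassify `tool-use`, `json-mode`, and `vision-input` as legacy names, which is the bug the bullet describes.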

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…e comments

Per code review on commit 6c119e5: enumerate json-mode and vision-input
alongside tool-use in the JSDoc and inline comments so the Phase 4
removal scope is unambiguous.
…Phase 2)

Each of the four AiTask base classes (AiTask, StreamingAiTask, AiVisionTask,
AiImageOutputTask) now declares `static readonly requires: readonly Capability[] = []`.
This field lets concrete task subclasses (Phase 4) declare which model capabilities
they need; the Phase 3 dispatcher will read it via `(instance.constructor as typeof
AiTask).requires`. Defaults to `[]` so all 44 existing concrete tasks continue to
compile without modification. Adds a focused vitest suite covering base defaults,
subclass override, and the instance-constructor access pattern.

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…ask bases

Per code review on commit dd2e691:
- Add public modifier to AiTask.requires for visibility consistency with
  other public static fields on the class
- Expand JSDoc on AiTask.requires to describe dispatch semantics (gating
  rule, empty-array vacuous pass, Phase 4 obligation, link to CAPABILITIES)
- Remove redundant `static override readonly requires = []` declarations
  on StreamingAiTask, AiVisionTask, AiImageOutputTask — they inherit the
  empty default from AiTask via the prototype chain. Concrete subclasses
  in Phase 4 will override directly on top of the AiTask declaration.
- Drop the now-unused Capability imports from StreamingAiTask,
  AiVisionTask, and AiImageOutputTask.

Issue 1 (silent inheritance for tasks that omit requires) is intentional
per plan: empty default preserves Phase-2/3 build greenness for the 44
concrete tasks; the Phase 4 audit test (already in plan acceptance
criteria) verifies coverage.

Tests: 16/16 passing. Build: types 60/60, packages 58/58.
@pkg-pr-new

pkg-pr-new Bot commented May 10, 2026

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@479

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@479

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@479

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@479

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@479

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@479

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@479

@workglow/util

npm i https://pkg.pr.new/@workglow/util@479

workglow

npm i https://pkg.pr.new/workglow@479

commit: a111591

claude added 9 commits May 10, 2026 03:25
…(Phase 3)

Replaces the 3-Map per-task-type provider registry with a single capability-set
registration list per provider. Dispatch uses strict gating
(`model.capabilities ⊇ task.requires`) and most-specific-superset selection
(smallest matching `serves` wins; ties broken by registration order).
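
A sketch of the two rules in this paragraph, under the assumption that a registration is a `serves` list plus a function and that `serves` contains no duplicates. `RegistrationSketch`, `missingCapabilities`, and `selectRunFn` are hypothetical names, not the registry's API.

```typescript
type Capability = string; // stand-in for the closed union

interface RegistrationSketch<F> {
  readonly serves: readonly Capability[];
  readonly fn: F;
}

// Strict gating: list what the task requires but the model does not declare.
// An empty result means model.capabilities is a superset of task.requires.
function missingCapabilities(
  modelCapabilities: readonly Capability[],
  requires: readonly Capability[]
): Capability[] {
  const have = new Set(modelCapabilities);
  return requires.filter((c) => !have.has(c));
}

// Most-specific-superset selection: among registrations whose `serves` covers
// `requires`, pick the smallest; strict `<` keeps the earliest on ties.
function selectRunFn<F>(
  registrations: readonly RegistrationSketch<F>[],
  requires: readonly Capability[]
): RegistrationSketch<F> | undefined {
  let best: RegistrationSketch<F> | undefined;
  for (const reg of registrations) {
    if (missingCapabilities(reg.serves, requires).length > 0) continue;
    if (best === undefined || reg.serves.length < best.serves.length) best = reg;
  }
  return best;
}
```

With registrations serving `["text.generation", "tool-use"]` and `["text.generation"]`, a task requiring only `text.generation` would resolve to the second, narrower one.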

Breaking changes for downstream packages:

- `AiProvider` constructor now takes `(runFns?, previewTasks?)` where
  `runFns` is `readonly AiProviderRunFnRegistration[]` instead of three
  per-task `Record<string, fn>` maps.
- `AiProviderRunFn` (Promise-returning) is removed; the canonical authoring
  surface is the streaming `AiProviderStreamFn`. Non-streaming consumers use
  `collectStream(...)`.
- `AiProviderRegistry` removes `registerRunFn(provider, taskType, fn)`,
  `registerStreamFn`, `getDirectRunFn`, `getStreamFn`,
  `registerAsWorkerStreamFn`, and `getProviderIdsForTask`. New surface:
  `registerRunFn(providerName, registration)`,
  `registerAsWorkerRunFn(providerName, serves)`,
  `getRunFnFor(providerName, requires)`,
  `getProviderIdsForCapabilities(requires)`.
- `AiProvider.taskTypes` and the `tasks` / `streamTasks` instance fields are
  gone; `inferCapabilities(model)` is added (default returns
  `model.capabilities ?? []`).
- `AiJobInput` adds `requires: readonly Capability[]` alongside `taskType`
  (taskType is retained as observability/queue-key metadata only).
- `AiTask.execute` strictly gates on `requires` before dispatch and uses
  the new `model.unload` capability for the resource-scope unload hook.
- `model.unload` capability added to `CAPABILITIES`.

Worker-side serialisation: registrations are exposed under a deterministic
`workerKeyForServes(serves)` (sorted, comma-joined) so the main-thread proxy
and `registerOnWorkerServer` resolve to the same generator.
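
One way a "sorted, comma-joined" key could be computed is sketched below. The function name matches the one mentioned above, but the body is an assumption; the real implementation may differ, for example in deduplication or sort order.

```typescript
// Assumed implementation: copy, sort with the default string ordering, join.
function workerKeyForServes(serves: readonly string[]): string {
  return serves.slice().sort().join(",");
}

// The same set in a different order yields the same key on both sides.
console.log(workerKeyForServes(["tool-use", "text.generation"]));
console.log(workerKeyForServes(["text.generation", "tool-use"]));
```

Sorting a copy keeps the caller's `serves` array untouched, which matters if the same array is also used for registration order.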

Phase 4 will populate concrete `requires` per task; Phase 5 will migrate the
provider implementations under `providers/*` and `packages/test`. Expected
red downstream packages from this commit (Phase 5 starting list):
@workglow/anthropic, @workglow/chrome-ai, @workglow/google-gemini,
@workglow/huggingface-inference, @workglow/huggingface-transformers,
@workglow/node-llama-cpp, @workglow/ollama, @workglow/openai,
@workglow/tf-mediapipe, @workglow/test (contract assertions).

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…JobInput

Per code review on Phase 3 commit 5629518:
- Critical 1: extract gateOrThrow helper from AiTask.execute and call it
  from StreamingAiTask.executeStream so streaming-task dispatch is gated
  the same as non-streaming. AiChatTask.executeStream also calls gateOrThrow
  since it overrides executeStream without calling super.
- Critical 2: AiChatTask.getJobInput now calls super.getJobInput so
  timeoutMs, outputSchema, and any future base fields stay populated;
  session caching layered via the (input as any).sessionId convention
  AiTask.getJobInput already honors.
- Important 4: document the "model.unload" registration contract that
  Phase 5 providers must satisfy for the unload lifecycle hook to fire.
- Important 5: tighten gating test to assert the missing-cap name appears
  in the error message (/missing capabilities[^:]*: text\.generation/).
- Important 6: isolate the CollectingStrategy test from the global registry
  using per-test beforeEach/afterEach with setAiProviderRegistry so
  subsequent tests in the same worker aren't polluted.
- Important 7: AiProvider.register() now throws when worker-mode is taken
  and workerRunFnSpecs() returns []; previously silent no-op registration.
- Important 8: AiProviderRegistry.previewRunFnRegistry now private.

Tests: 42 passed (3 files). Build (@workglow/ai): types green, packages green.
Wider monorepo still RED for vendor packages (Phase 5 territory).

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…te task classes

Adds `public static override readonly requires: readonly Capability[]` (or
`public static readonly requires` for plain-Task subclasses) to every concrete
task class registered in `registerAiTasks()`. Pure-compute tasks (storage-backed,
chunking, vector math) declare `[]`; AI-dispatch tasks declare their provider
capability strings per the Phase 4 mapping table.

Also adds `import type { Capability }` to every file that needed it, and creates
`packages/ai/src/task/index.test.ts` — an audit test that verifies every registered
task has a valid `requires` array and that key provider-facing capabilities appear
on at least one task. All 44 tests in packages/ai/ pass; build:types and
build:packages are green.

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…se 4 JSDoc

Per code review on Phase 4 commit d20d7e9:
- Extract registerAiTasks to packages/ai/src/task/registerAiTasks.ts so
  the audit test no longer imports from a banned barrel ("./index"). The
  index module re-exports the function so the public surface is unchanged.
- Clarify the JSDoc on plain-Task subclasses (RerankerTask,
  QueryExpanderTask, ModelSearchTask) so future readers know `requires`
  on these classes is informational only — they implement their own
  execute() and bypass AiTask.gateOrThrow. The audit test still validates
  the values are known capabilities.

Tests: 44/44 across @workglow/ai. Build green for @workglow/ai (wider
build remains red for chrome-ai + 8 other vendor packages — Phase 5).

Convert every OpenAI run-fn to an `async function*` yielding StreamEvents
and build a single `OPENAI_RUN_FNS: AiProviderRunFnRegistration[]` keyed
by the closed `serves` capability set, replacing the per-task-type
`OPENAI_TASKS` / `OPENAI_STREAM_TASKS` records.

The plain-prompt and chat-history paths are folded into one
`["text.generation"]` registration so both `TextGenerationTask` and
`AiChatTask` (which share the same `requires` array) dispatch correctly;
`OpenAI_Chat.ts` is removed.

Both provider shells (`OpenAiProvider`, `OpenAiQueuedProvider`) now
override `inferCapabilities` and `workerRunFnSpecs` from a shared
`OpenAI_Capabilities.ts` helper so worker-mode registration declares the
same capability sets the worker-side runFns serve.

Also: re-export the `Capability` vocabulary from `@workglow/ai/worker`
so provider subclasses living behind the worker barrel can name it.

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…emplate

Per code quality review on Phase 5a commit 45c31f2:
- Issue B: extract OPENAI_CAPABILITY_SETS as the single source of truth in
  OpenAI_CapabilitySets.ts. OPENAI_RUN_FN_SPECS and the serves field of
  every OPENAI_RUN_FNS entry now derive from it. Adds a parity test that
  fails fast if the lists drift.
- Issue D: extend vision-input inference to o-series models (o1, o3, o4)
  alongside gpt-family vision models. Broaden o-series detection from
  /^o[134]/i to /^o\d/i so future o2/o5 are recognised.
- Issue E: convert the gpt-4o-mini and text-embedding-3-small tests to
  exact-set assertions so regressions in inferOpenAiCapabilities can't
  silently add or drop capabilities.
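
The effect of the Issue D regex change can be checked in isolation. The two patterns are the ones quoted above; the wrapper function names are hypothetical.

```typescript
const O_SERIES_OLD = /^o[134]/i; // only o1, o3, o4
const O_SERIES_NEW = /^o\d/i; // any "o" followed by a digit

function isOSeriesOld(modelId: string): boolean {
  return O_SERIES_OLD.test(modelId);
}

function isOSeriesNew(modelId: string): boolean {
  return O_SERIES_NEW.test(modelId);
}

for (const id of ["o1", "o3-mini", "o5-mini", "gpt-4o-mini"]) {
  console.log(id, isOSeriesOld(id), isOSeriesNew(id));
}
```

Both patterns are anchored at the start of the id, so `gpt-4o-mini` is not treated as an o-series model by either.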

Documentation locked in for the Phase 5b-5i template:
- libs/.claude/CLAUDE.md: structured-generation finish-payload exception
  and the capability-collision pattern (chat vs. prompt discrimination).

Tests: 13 passed (providers/openai), 44 passed (packages/ai). @workglow/openai build green.

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6
…ctor (Phase 5b)

Migrates the @workglow/anthropic provider to the new capability-set dispatch
model introduced in Phase 5a, following the @workglow/openai template exactly.

- Add Anthropic_CapabilitySets.ts (single source of truth — was already stubbed,
  now committed) and Anthropic_Capabilities.ts (ANTHROPIC_RUN_FN_SPECS +
  inferAnthropicCapabilities heuristic covering Claude 3/3.5/4-series families)
- Rewrite Anthropic_JobRunFns.ts: replaces the old ANTHROPIC_TASKS /
  ANTHROPIC_STREAM_TASKS Records with a single ANTHROPIC_RUN_FNS
  AiProviderRunFnRegistration[] list keyed by capability sets
- Unify Anthropic_Chat.ts + Anthropic_TextGeneration.ts into a single
  Anthropic_TextGeneration_Stream run-fn that discriminates on
  Array.isArray(input.messages) per the capability-collision convention
- Convert every run-fn to async function* AiProviderStreamFn; remove all
  AiProviderRunFn (Promise-returning) variants and update_progress calls;
  wrap logger.time/timeEnd in try/finally
- Structured generation: finish.data.object populated per streaming-convention
  exception (CLAUDE.md lines 201-205)
- Rewrite AnthropicProvider.ts and AnthropicQueuedProvider.ts shell classes to
  override inferCapabilities and workerRunFnSpecs; drop old taskTypes constructor
- Update registerAnthropicInline.ts and registerAnthropicWorker.ts to pass
  (ANTHROPIC_RUN_FNS, ANTHROPIC_PREVIEW_TASKS) to the constructor
- Add AnthropicProvider.test.ts: 15 tests covering 5+ model families, 2 exact-set
  assertions, and capability-set parity check

https://claude.ai/code/session_01KGifQAjvG8qHkCKzRuarr6

Per code review on Phase 5b commit 9cb5533: the regex
`^claude-3[.-][57]-sonnet` only matched -sonnet variants and silently
missed claude-3-5-haiku-20241022 (which is in Anthropic_ModelSearch's
fallback list). Widen to `^claude-3[.-][57]-` so the entire 3.5/3.7
family routes to the full vision/tools/json-mode capability set.

Adds a regression test for claude-3-5-haiku-20241022.
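
The difference between the two patterns can be reproduced with a few model ids. The patterns are quoted from the commit message; the constant names and the sample id list are only for illustration.

```typescript
const SONNET_ONLY = /^claude-3[.-][57]-sonnet/; // pattern before the fix
const FAMILY_WIDE = /^claude-3[.-][57]-/; // widened pattern

const sampleIds = [
  "claude-3-5-sonnet-20241022",
  "claude-3-5-haiku-20241022",
  "claude-3-opus-20240229",
];

for (const id of sampleIds) {
  console.log(id, SONNET_ONLY.test(id), FAMILY_WIDE.test(id));
}
```

The haiku id matches only the widened pattern, and the Claude 3 opus id matches neither because the character after `claude-3-` is not `5` or `7`.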
…ase 5c)

Migrates @workglow/google-gemini to the new capability-set dispatch model,
following the OpenAI (5a) / Anthropic (5b) template.

Structural changes:
- Add Gemini_CapabilitySets.ts as the single source of truth (named exports
  + GEMINI_CAPABILITY_SETS aggregate; SDK-free for main-thread import)
- Add Gemini_Capabilities.ts deriving GEMINI_RUN_FN_SPECS from the source of
  truth and exporting inferGeminiCapabilities() heuristic
- Convert all run-fns to async generators yielding StreamEvent (drop
  update_progress); StructuredGeneration populates finish.data.object per
  the json-mode exception
- Unify Gemini_TextGeneration with the deleted Gemini_Chat into a single
  ["text.generation"] runFn, discriminating on
  Array.isArray(input.messages) && length > 0
- Both shells (GoogleGeminiProvider, GoogleGeminiQueuedProvider) override
  inferCapabilities() and workerRunFnSpecs()
- registerGeminiInline / registerGeminiWorker pass
  (GEMINI_RUN_FNS, GEMINI_PREVIEW_TASKS) to the constructor
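
The chat-versus-prompt discrimination used by the unified `["text.generation"]` run-fn can be sketched as below. The input and message shapes and the helper names are assumptions; only the `Array.isArray(input.messages) && length > 0` test comes from the commit message.

```typescript
// Assumed shapes for illustration only.
interface ChatMessageSketch {
  role: "system" | "user" | "assistant";
  content: string;
}

interface TextGenerationInputSketch {
  prompt?: string;
  messages?: ChatMessageSketch[];
}

// Chat path only when a non-empty messages array is present.
function isChatInput(
  input: TextGenerationInputSketch
): input is TextGenerationInputSketch & { messages: ChatMessageSketch[] } {
  return Array.isArray(input.messages) && input.messages.length > 0;
}

// Normalise both paths to one message list so a single run-fn can serve both.
function toMessages(input: TextGenerationInputSketch): ChatMessageSketch[] {
  if (isChatInput(input)) return input.messages.slice();
  return [{ role: "user", content: input.prompt ?? "" }];
}

console.log(toMessages({ prompt: "Summarise this." }));
```

Requiring a non-empty array means an input with `messages: []` still falls back to the plain-prompt path.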

Heuristic coverage (Phase 5b lesson): every model in
Gemini_ModelSearch.GEMINI_FALLBACK_MODELS is verified to receive
non-baseline capabilities via a parameterised test suite that iterates the
fallback list. Covers gemini-3.x/2.x/1.5 (full set + vision),
gemini-pro-vision (legacy), gemini-1.0-pro / gemini-pro (no vision),
text-embedding-* / gemini-embedding-* (text.embedding), imagen-*
(image.generation), gemini-*-image-* (image.generation + image.editing).

Tests: 29/29 pass (model-id coverage + 3 exact-set assertions + parity test
+ fallback-list coverage + run-fn shape).

Also removes two dead unused imports in Gemini_ToolCalling that were
blocking the build.