Tokenizer.ts
Tokenizer.ts overview
Section titled “Tokenizer.ts overview”Service for model-specific token counting and prompt truncation. Tokenization depends on the target provider, model, and encoding rules, so this module leaves the actual tokenization function to the service implementation.
The Tokenizer service can count tokens for raw prompt input and shorten a
prompt to a token limit by keeping the newest messages that fit. This module
defines the service tag, the service interface, and a make constructor that
builds a full tokenizer service from a token-counting function.
Since v4.0.0
Exports Grouped by Category
Section titled “Exports Grouped by Category”constructors
Section titled “constructors”Creates a Tokenizer service implementation from tokenization options.
Details
This function constructs a complete Tokenizer service by providing a tokenization function. The service handles both tokenization and truncation operations using the provided tokenizer.
Example (Creating a word tokenizer)
import { Effect } from "effect"import { Tokenizer } from "effect/unstable/ai"
// Simple word-based tokenizerconst wordTokenizer = Tokenizer.make({ tokenize: (prompt) => Effect.succeed( prompt.content .flatMap((msg) => typeof msg.content === "string" ? msg.content.split(" ") : msg.content.flatMap((part) => (part.type === "text" ? part.text.split(" ") : [])) ) .map((_, index) => index) )})Signature
declare const make: (options: { readonly tokenize: (content: Prompt.Prompt) => Effect.Effect<Array<number>, AiError.AiError>}) => ServiceSince v4.0.0
models
Section titled “models”Service (interface)
Section titled “Service (interface)”Tokenizer service interface providing text tokenization and truncation operations.
Details
This interface defines the core operations for converting text to tokens and managing content length within token limits for AI model compatibility.
Example (Implementing a custom tokenizer)
import { Effect } from "effect"import { Prompt } from "effect/unstable/ai"import type { Tokenizer } from "effect/unstable/ai"
const customTokenizer: Tokenizer.Service = { tokenize: (input) => Effect.succeed( input .toString() .split(" ") .map((_, i) => i) ), truncate: (input, maxTokens) => Effect.succeed(Prompt.make(input.toString().slice(0, maxTokens * 5)))}Signature
export interface Service { /** * Converts text input into an array of token numbers. */ readonly tokenize: ( /** * The text input to tokenize. */ input: Prompt.RawInput ) => Effect.Effect<Array<number>, AiError.AiError> /** * Truncates text input to fit within the specified token limit. */ readonly truncate: ( /** * The text input to truncate. */ input: Prompt.RawInput, /** * Maximum number of tokens to retain. */ tokens: number ) => Effect.Effect<Prompt.Prompt, AiError.AiError>}Since v4.0.0
services
Section titled “services”Tokenizer (class)
Section titled “Tokenizer (class)”Service tag for model tokenization services.
When to use
Use to access or provide model-specific token counting and prompt truncation operations.
Details
This tag provides access to tokenization functionality throughout your application, enabling token counting and prompt truncation capabilities.
Example (Accessing the Tokenizer service)
import { Effect } from "effect"import { Tokenizer } from "effect/unstable/ai"
const useTokenizer = Effect.gen(function* () { const tokenizer = yield* Tokenizer.Tokenizer const tokens = yield* tokenizer.tokenize("Hello, world!") return tokens.length})Signature
declare class TokenizerSince v4.0.0