TokenKit – A professional .NET 8.0 library and CLI for tokenization, validation, cost estimation, and model registry management across multiple LLM providers (OpenAI, Anthropic, Gemini, etc.).
| Category | Description |
|---|---|
| 🔢 Tokenization | Analyze text or files and count tokens using multiple encoder engines (simple, SharpToken, ML.Tokenizers) |
| 💰 Cost Estimation | Automatically calculate estimated API cost based on model metadata |
| ✅ Prompt Validation | Validate prompt length against model context limits |
| 🧩 Model Registry | Manage model metadata (maxTokens, pricing, encodings, providers) via a JSON registry |
| ⚙️ CLI & SDK | Use TokenKit as a .NET library or a global CLI tool |
| 🧮 Multi-Encoder Support | Dynamically select tokenization engines via the --engine flag |
| 📦 Self-contained Data | Local registry stored in Registry/models.data.json, auto-updatable |
| 🌐 Live Model Scraper | Optional OpenAI API key support to fetch real-time model data |
| 📝 Structured Logging | All CLI commands logged to tokenkit.log with rotation (1 MB max) |
| 🤫 Quiet & JSON Modes | Machine-readable (--json) and silent (--quiet) output modes for automation |
| 🎨 CLI Polish | Colorized output, ASCII banner, and improved user experience |
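Per-1K pricing implies a simple cost calculation. The sketch below assumes estimated cost = token count / 1000 × the model's input price per 1K tokens, which is consistent with the 4-token gpt-4o example in this README (4 / 1000 × $0.005 = $0.00002); the exact formula is internal to TokenKit, so treat this as an illustration, not its implementation.

```csharp
using System;

class CostSketch
{
    // Assumption: estimated cost = tokens / 1000 * price-per-1K-input-tokens.
    static double Estimate(int tokenCount, double inputPricePer1K) =>
        tokenCount / 1000.0 * inputPricePer1K;

    static void Main()
    {
        // gpt-4o input price from the registry entry shown later: $0.005 per 1K tokens.
        double cost = Estimate(4, 0.005);
        Console.WriteLine($"{cost:0.######}"); // 0.00002
    }
}
```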
Install the NuGet package, or install the global CLI tool:

```bash
dotnet add package TokenKit
dotnet tool install -g TokenKit
```

Analyze a string, a file, or piped input:

```bash
tokenkit analyze "Hello from TokenKit!" --model gpt-4o
tokenkit analyze prompt.txt --model gpt-4o
echo "This is piped text input" | tokenkit analyze --model gpt-4o
```

Example Output:

```json
{
  "Model": "gpt-4o",
  "Provider": "OpenAI",
  "TokenCount": 4,
  "EstimatedCost": 0.00002,
  "Valid": true
}
```

Validate a prompt against a model's context limit:

```bash
tokenkit validate "A very long prompt to validate" --model gpt-4o
```

```json
{
  "IsValid": true,
  "Message": "OK"
}
```

List registered models and update the registry:

```bash
tokenkit models list
tokenkit models list --provider openai
tokenkit models list --json
tokenkit update-models
tokenkit update-models --openai-key sk-xxxx
cat newmodels.json | tokenkit update-models
```

Example Input:
```json
[
  {
    "Id": "gpt-4o-mini",
    "Provider": "OpenAI",
    "MaxTokens": 64000,
    "InputPricePer1K": 0.002,
    "OutputPricePer1K": 0.01,
    "Encoding": "cl100k_base"
  }
]
```

Scrape live model data (if no key is provided, TokenKit uses the local offline model registry):

```bash
tokenkit scrape-models --openai-key sk-xxxx
```
Example Output:

```
🌐 Fetching latest OpenAI model data...
✅ Retrieved 3 models:
- OpenAI: gpt-4o (128000 tokens)
- OpenAI: gpt-4o-mini (64000 tokens)
- OpenAI: gpt-3.5-turbo (4096 tokens)
```
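A scraper like this presumably queries OpenAI's public `GET /v1/models` endpoint and matches the returned model ids against the local registry. The sketch below is not TokenKit's implementation; it only parses a canned string in the documented shape of that endpoint's response (an object with a `data` array of `{ "id": ... }` objects).

```csharp
using System;
using System.Text.Json;

class ScrapeSketch
{
    static void Main()
    {
        // Canned response in the shape of OpenAI's GET /v1/models endpoint.
        const string body = """{ "data": [ { "id": "gpt-4o" }, { "id": "gpt-4o-mini" } ] }""";

        using JsonDocument doc = JsonDocument.Parse(body);
        foreach (JsonElement model in doc.RootElement.GetProperty("data").EnumerateArray())
        {
            // Each id would then be looked up in the registry for pricing and limits.
            Console.WriteLine(model.GetProperty("id").GetString());
        }
    }
}
```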
Run `analyze` with `--json` to output pure JSON:

```bash
tokenkit analyze "Hello" --model gpt-4o --json
```
```json
{
  "Model": "gpt-4o",
  "Provider": "OpenAI",
  "TokenCount": 7,
  "EstimatedCost": 0.000105,
  "Engine": "simple",
  "Valid": true
}
```

Run with `--quiet` for silent operation:

```bash
tokenkit analyze "Silent test" --model gpt-4o --quiet
```

No console output; the log entry is still saved to tokenkit.log.
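In automation, the `--json` output can be consumed directly with `System.Text.Json`. A minimal sketch that reads the fields of the output shape shown above (the JSON string here is pasted from the example, not produced by invoking the CLI):

```csharp
using System;
using System.Text.Json;

class JsonModeSketch
{
    static void Main()
    {
        // Output of `tokenkit analyze "Hello" --model gpt-4o --json`, pasted from the example.
        const string output = """
        { "Model": "gpt-4o", "Provider": "OpenAI", "TokenCount": 7,
          "EstimatedCost": 0.000105, "Engine": "simple", "Valid": true }
        """;

        using JsonDocument doc = JsonDocument.Parse(output);
        JsonElement root = doc.RootElement;
        int tokens = root.GetProperty("TokenCount").GetInt32();
        bool valid = root.GetProperty("Valid").GetBoolean();
        Console.WriteLine($"{tokens} tokens, valid={valid}"); // 7 tokens, valid=True
    }
}
```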
Use TokenKit as a library from .NET code:

```csharp
using TokenKit.Registry;
using TokenKit.Services;

var model = ModelRegistry.Get("gpt-4o");
var tokenizer = new TokenizerService();
var result = tokenizer.Analyze("Hello from TokenKit!", model!.Id);
var cost = CostEstimator.Estimate(model, result.TokenCount);

Console.WriteLine($"Tokens: {result.TokenCount}, Cost: ${cost}");
```

TokenKit stores all model metadata in:
Registry/models.data.json
Each entry includes:
```json
{
  "Id": "gpt-4o",
  "Provider": "OpenAI",
  "MaxTokens": 128000,
  "InputPricePer1K": 0.005,
  "OutputPricePer1K": 0.015,
  "Encoding": "cl100k_base"
}
```

TokenKit maintains 100% test coverage using xUnit and Codecov.
Run tests locally:
```bash
dotnet test --collect:"XPlat Code Coverage"
```

| Feature | Description |
|---|---|
| 🌍 Extended Provider Support | Add Gemini, Claude, and Mistral integrations |
| 💾 Persistent Config Profiles | Store model defaults and pricing overrides per project |
| 🧮 Batch Analysis | Analyze multiple files or prompts in a single command |
| 📊 Report Generation | Export CSV/JSON summaries of token usage and estimated cost |
| 🧠 LLM-Aware Cost Planner | Simulate conversation cost across multi-turn dialogues |
| 🧩 IDE Integrations | VS Code and JetBrains plugins for inline token analysis |
| ⚙️ Custom Encoders | Support community-built encoders and language models |
Licensed under the MIT License.
© 2025 Andrew Clements – Flow Labs / TokenKit