# 🧠 TokenKit

TokenKit is a professional .NET 8.0 library and CLI for tokenization, validation, cost estimation, and model registry management across multiple LLM providers (OpenAI, Anthropic, Gemini, etc.).


## ✨ Features

| Category | Description |
| --- | --- |
| 🔢 Tokenization | Analyze text or files and count tokens using multiple encoder engines (simple, SharpToken, ML.Tokenizers) |
| 💰 Cost Estimation | Automatically calculate the estimated API cost based on model metadata |
| ✅ Prompt Validation | Validate prompt length against model context limits |
| 🧩 Model Registry | Manage model metadata (maxTokens, pricing, encodings, providers) via a JSON registry |
| ⚙️ CLI & SDK | Use TokenKit as a .NET library or a global CLI tool |
| 🧮 Multi-Encoder Support | Dynamically select tokenization engines via the `--engine` flag |
| 📦 Self-Contained Data | Local registry stored in `Registry/models.data.json`, auto-updatable |
| 🔍 Live Model Scraper | Optional OpenAI API key support to fetch real-time model data |
| 📊 Structured Logging | All CLI commands logged to `tokenkit.log` with rotation (1 MB max) |
| 🤫 Quiet & JSON Modes | Machine-readable (`--json`) and silent (`--quiet`) output modes for automation |
| 🎨 CLI Polish | Colorized output, ASCII banner, and improved user experience |

โš™๏ธ Installation

๐Ÿ“ฆ As a Library (NuGet)

dotnet add package TokenKit

๐Ÿ’ป As a Global CLI Tool

dotnet tool install -g TokenKit

## 🚀 Usage (All-in-One Guide)

### 🔹 Analyze Inline Text

```bash
tokenkit analyze "Hello from TokenKit!" --model gpt-4o
```

### 🔹 Analyze File Input

```bash
tokenkit analyze prompt.txt --model gpt-4o
```

### 🔹 Pipe Input (stdin)

```bash
echo "This is piped text input" | tokenkit analyze --model gpt-4o
```

Example output:

```json
{
  "Model": "gpt-4o",
  "Provider": "OpenAI",
  "TokenCount": 4,
  "EstimatedCost": 0.00002,
  "Valid": true
}
```
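The `EstimatedCost` above follows from the registry pricing: 4 tokens of gpt-4o input at $0.005 per 1K tokens. A minimal sketch of that arithmetic (the `EstimateInputCost` helper is hypothetical, not part of the TokenKit API):

```csharp
// Hypothetical helper illustrating the per-1K pricing arithmetic:
// cost = tokenCount / 1000 * price-per-1K-tokens.
static decimal EstimateInputCost(int tokenCount, decimal inputPricePer1K)
    => tokenCount / 1000m * inputPricePer1K;

// The registry prices gpt-4o input at $0.005 per 1K tokens,
// so the 4-token example above comes to $0.00002.
Console.WriteLine(EstimateInputCost(4, 0.005m));
```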

### 🔹 Validate Prompt Length

```bash
tokenkit validate "A very long prompt to validate" --model gpt-4o
```

```json
{
  "IsValid": true,
  "Message": "OK"
}
```

### 🔹 List Registered Models

```bash
tokenkit models list
```

Filter by provider:

```bash
tokenkit models list --provider openai
```

JSON output:

```bash
tokenkit models list --json
```

### 🔹 Update Model Data

Default update (offline fallback):

```bash
tokenkit update-models
```

Using an OpenAI API key:

```bash
tokenkit update-models --openai-key sk-xxxx
```

From JSON (stdin):

```bash
cat newmodels.json | tokenkit update-models
```

Example input:

```json
[
  {
    "Id": "gpt-4o-mini",
    "Provider": "OpenAI",
    "MaxTokens": 64000,
    "InputPricePer1K": 0.002,
    "OutputPricePer1K": 0.01,
    "Encoding": "cl100k_base"
  }
]
```

### 🔹 Scrape Latest Model Data (Preview)

```bash
tokenkit scrape-models --openai-key sk-xxxx
```

If no key is provided, TokenKit falls back to the local offline model registry.

Example output:

```text
🔍 Fetching latest OpenAI model data...
✅ Retrieved 3 models:
  - OpenAI: gpt-4o (128000 tokens)
  - OpenAI: gpt-4o-mini (64000 tokens)
  - OpenAI: gpt-3.5-turbo (4096 tokens)
```

### 🔹 CLI Output Modes

JSON mode:

```bash
tokenkit analyze "Hello" --model gpt-4o --json
```

Outputs pure JSON:

```json
{
  "Model": "gpt-4o",
  "Provider": "OpenAI",
  "TokenCount": 7,
  "EstimatedCost": 0.000105,
  "Engine": "simple",
  "Valid": true
}
```

Quiet mode:

```bash
tokenkit analyze "Silent test" --model gpt-4o --quiet
```

No console output; a log entry is saved to `tokenkit.log`.


## 🧩 Programmatic SDK Example

```csharp
using TokenKit.Registry;
using TokenKit.Services;

var model = ModelRegistry.Get("gpt-4o");
var tokenizer = new TokenizerService();

var result = tokenizer.Analyze("Hello from TokenKit!", model!.Id);
var cost = CostEstimator.Estimate(model, result.TokenCount);

Console.WriteLine($"Tokens: {result.TokenCount}, Cost: ${cost}");
```

## 📦 Model Registry

TokenKit stores all model metadata in `Registry/models.data.json`. Each entry includes:

```json
{
  "Id": "gpt-4o",
  "Provider": "OpenAI",
  "MaxTokens": 128000,
  "InputPricePer1K": 0.005,
  "OutputPricePer1K": 0.015,
  "Encoding": "cl100k_base"
}
```
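Because the registry is plain JSON, its entries can also be inspected with standard `System.Text.Json` calls. A minimal, standalone sketch (this is not TokenKit's own loading code, just the entry shape shown above):

```csharp
using System.Text.Json;

// One entry copied from the models.data.json shape shown above.
var json = """
{
  "Id": "gpt-4o",
  "Provider": "OpenAI",
  "MaxTokens": 128000,
  "InputPricePer1K": 0.005,
  "OutputPricePer1K": 0.015,
  "Encoding": "cl100k_base"
}
""";

// Parse the entry without committing to TokenKit's internal types.
using var doc = JsonDocument.Parse(json);
var root = doc.RootElement;
var id = root.GetProperty("Id").GetString();
var maxTokens = root.GetProperty("MaxTokens").GetInt32();
Console.WriteLine($"{id}: {maxTokens} tokens");
```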

## 🧪 Testing & Quality Assurance

TokenKit maintains 100% test coverage, with tests written in xUnit and coverage tracked via Codecov.

Run the tests locally:

```bash
dotnet test --collect:"XPlat Code Coverage"
```

## 🧭 Future Enhancements

| Feature | Description |
| --- | --- |
| 🌐 Extended Provider Support | Add Gemini, Claude, and Mistral integrations |
| 💾 Persistent Config Profiles | Store model defaults and pricing overrides per project |
| 🧮 Batch Analysis | Analyze multiple files or prompts in a single command |
| 📊 Report Generation | Export CSV/JSON summaries of token usage and estimated cost |
| 🧠 LLM-Aware Cost Planner | Simulate conversation cost across multi-turn dialogues |
| 🧩 IDE Integrations | VS Code and JetBrains plugins for inline token analysis |
| ⚙️ Custom Encoders | Support community-built encoders and language models |

## 💡 License

Licensed under the MIT License.
© 2025 Andrew Clements, Flow Labs / TokenKit
