Ollama · FinOps Profile
Ollama Finops
FOCUS-aligned FinOps for Ollama: local inference is free and self-hosted; Ollama Cloud is a tiered subscription priced on GPU utilization rather than tokens, with concurrency caps per tier.
Ollama Finops is the FinOps profile for Ollama on the APIs.io network, aligned with the FinOps Foundation Framework.
It defines 4 billable meters, bills in USD on a monthly or annual cycle, and uses the pricing category Subscription (Tiered) + Self-Hosted Free.
The profile maps 9 FOCUS columns for cost-allocation reporting.
Tagged areas include Artificial Intelligence, Large Language Models, Models, FinOps, and Cost Management.
Category: AI Infrastructure
Pricing: Subscription (Tiered) + Self-Hosted Free
Billing: Monthly or Annual
FOCUS v1.3
Artificial Intelligence, Large Language Models, Models, FinOps, Cost Management, FOCUS
Framework Alignment
Charge Categories
Purchase, Usage, Tax, Credit
FOCUS Columns
BillingCurrency: USD
ChargeCategory: Purchase
InvoiceIssuerName: Ollama Inc.
PricingCategory: Subscription
PricingUnit: month
ProviderName: Ollama
PublisherName: Ollama Inc.
ServiceCategory: AI Infrastructure
ServiceName: Ollama
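As a rough illustration of how the column mapping above could feed a cost-allocation report, the sketch below stamps the nine mapped values onto a subscription charge and serializes it as CSV. The profile does not define BilledCost, BillingPeriodStart, or BillingPeriodEnd; they are FOCUS columns added here by assumption. The helper names `subscription_row` and `to_csv` are hypothetical, and any amount passed in is a placeholder, not an Ollama price.

```python
import csv
import io

# Static values copied from the profile's FOCUS column mapping.
OLLAMA_FOCUS_DEFAULTS = {
    "BillingCurrency": "USD",
    "ChargeCategory": "Purchase",
    "InvoiceIssuerName": "Ollama Inc.",
    "PricingCategory": "Subscription",
    "PricingUnit": "month",
    "ProviderName": "Ollama",
    "PublisherName": "Ollama Inc.",
    "ServiceCategory": "AI Infrastructure",
    "ServiceName": "Ollama",
}


def subscription_row(billed_cost, period_start, period_end):
    """Build one FOCUS-style row for a subscription charge.

    Assumption: BilledCost and the billing period columns are not part of
    the profile mapping; they are added here so the row is usable in a report.
    """
    row = dict(OLLAMA_FOCUS_DEFAULTS)
    row["BilledCost"] = f"{billed_cost:.2f}"  # placeholder amount supplied by the caller
    row["BillingPeriodStart"] = period_start
    row["BillingPeriodEnd"] = period_end
    return row


def to_csv(rows):
    """Serialize rows to CSV text, using the first row's keys as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `to_csv([subscription_row(amount, start, end)])` with an invoice amount and ISO 8601 period bounds yields a one-row file whose first nine columns match the mapping above.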
Meters
cloud_subscription: Per-account Ollama Cloud subscription (Free, Pro, Max)
cloud_gpu_time: GPU time consumed on Ollama Cloud (the actual consumption meter)
cloud_concurrent_models: Peak concurrent cloud model count (used to size tier)
local_inference_requests: Local Ollama requests (no charge from Ollama; cost is hardware/power)
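Because the local meter carries no charge from Ollama, its cost has to be estimated from the host itself. The sketch below is one possible way to do that, assuming a simple model of electricity plus straight-line hardware amortization. The `LocalHost` fields, the function names, and every input value are hypothetical and not part of the profile.

```python
from dataclasses import dataclass


@dataclass
class LocalHost:
    """Hypothetical description of a machine running local Ollama inference."""
    avg_power_watts: float          # average draw while serving requests
    electricity_usd_per_kwh: float  # local electricity rate
    hardware_cost_usd: float        # purchase price of the machine
    amortization_months: int        # months over which the hardware is written off


def monthly_local_cost(host, hours_active):
    """Estimate one month of local cost: energy used plus amortized hardware."""
    energy_kwh = host.avg_power_watts / 1000.0 * hours_active
    power_cost = energy_kwh * host.electricity_usd_per_kwh
    hardware_cost = host.hardware_cost_usd / host.amortization_months
    return power_cost + hardware_cost


def cost_per_request(host, hours_active, local_inference_requests):
    """Spread the monthly estimate over the local_inference_requests count."""
    if local_inference_requests <= 0:
        raise ValueError("local_inference_requests must be positive")
    return monthly_local_cost(host, hours_active) / local_inference_requests
```

The resulting per-request figure is an internal allocation number, not something Ollama bills, so it belongs alongside the cloud meters in a report rather than on an invoice.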