Fireworks AI uses pay-as-you-go pricing for non-enterprise usage and gives new users free credits. Charges depend on the service: serverless inference is billed per token, on-demand deployments by GPU usage time, and fine-tuning per token of training data. As one current official example, GPT OSS 120B is listed at about $0.15 input, $0.07 cached input, and $0.60 output per 1M tokens.
- The official billing docs say Fireworks AI operates on a pay-as-you-go model for all non-enterprise usage and that new users receive free credits.
- The same docs explain that serverless inference is priced per token, on-demand deployments are priced by GPU usage time, and fine-tuning is priced per token of training data.
- Official model pages show current model-specific rates, such as GPT OSS 120B at about $0.15 input, $0.07 cached input, and $0.60 output per 1M tokens.
- Annual versus monthly billing can change effective rates; confirm at checkout or with sales.
Pricing may change; verify on the official site.
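To make the per-token arithmetic concrete, here is a minimal sketch that estimates a serverless bill from token counts. The rates are the approximate GPT OSS 120B figures quoted above and may be out of date. The helper name `estimate_cost` is hypothetical. The sketch also assumes cached input tokens are counted separately from uncached input tokens, which is an assumption to confirm against the official billing docs.

```python
from decimal import Decimal

# Approximate USD rates per 1M tokens, copied from the GPT OSS 120B example
# above. Treat them as illustrative; check the official model page for
# current values.
RATES_PER_MILLION = {
    "input": Decimal("0.15"),
    "cached_input": Decimal("0.07"),
    "output": Decimal("0.60"),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, cached_input_tokens, output_tokens,
                  rates=RATES_PER_MILLION):
    """Estimate the USD cost of a token mix (hypothetical helper).

    Assumes input_tokens counts only uncached input, so cached and
    uncached tokens are never billed twice.
    """
    total = (
        Decimal(input_tokens) * rates["input"]
        + Decimal(cached_input_tokens) * rates["cached_input"]
        + Decimal(output_tokens) * rates["output"]
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 2M uncached input, 0.5M cached input, 0.3M output tokens
    print(estimate_cost(2_000_000, 500_000, 300_000))
```

With these assumed rates, 2M uncached input tokens, 0.5M cached input tokens, and 0.3M output tokens work out to $0.30 + $0.035 + $0.18 = $0.515. `Decimal` is used instead of floats so small per-token amounts add up without rounding drift.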