FAR Labs opens access to cheaper AI inference platform

Fri, 26th Jun 2026 (Today)

FAR Labs has opened registration for builders to access its FAR AI inference platform, following its disclosure of lower listed prices for selected model deployments.

The Abu Dhabi-based AI infrastructure business, part of Dizzaract, is targeting developers looking to reduce the cost of running AI applications as usage grows across software products and automated tools.

At the centre of the offering is a distributed inference network that matches builder demand with available computing supply. Users can access the system through a single OpenAI-compatible API, choose from multiple models and onboard quickly, while workloads are routed across GPU resources through FAR Orchestrator.

FAR Labs has positioned cost as its main point of difference. It shared benchmark comparisons for several models against listed prices from other providers, arguing that its network structure allows it to offer materially lower rates on some deployments.

For Qwen3-30B-A3B, FAR AI listed pricing at USD $0.03 per 1 million tokens, compared with USD $0.35 for NextBit and USD $0.27 for DeepInfra. FAR Labs said this was up to 91 per cent lower.

For Qwen2.5-72B-Instruct, the company listed FP8 pricing at USD $0.17 per 1 million tokens, compared with USD $0.39 for NovitaAI BF16 and USD $0.38 for DeepInfra FP8. It said this put its pricing 55 to 56 per cent below those listed alternatives.

For Qwen3.5-122B-A10B, FAR AI listed FP8 pricing at USD $0.51 per 1 million tokens. Comparisons with providers including AtlasCloud FP8 and SiliconFlow FP8 showed output token costs were up to 79 per cent lower, according to the company.

Cost pressure

The argument comes as AI companies face a different economic challenge from the one that dominated earlier years. While the unit price of tokens has fallen sharply, total spending on inference has continued to rise as businesses push more AI-generated requests through customer support tools, agents, assistants, games and internal workflows.

That shift is particularly significant for developers that rely on proprietary application programming interfaces from companies such as OpenAI and Anthropic. As products scale, recurring inference charges can weigh on margins and limit room for testing and expansion.

FAR Labs said its pricing edge comes from drawing on underused computing resources rather than relying solely on large dedicated data centre fleets. The platform uses available GPU capacity from consumer devices and small and medium-sized enterprise data centres, then allocates work through what it describes as a performance-focused orchestration layer.

The company is also trying to address concerns beyond price. Builders running production workloads need systems that are private, reliably routed, low-latency and ready for real-world use rather than experiments, it said.

Its orchestration layer includes trusted execution environment-based secure inference, reliability scoring, support for open and proprietary models, and Semantic Vector Streaming. The routing system is designed around uptime, workload continuity and latency-sensitive performance, according to FAR Labs.

Market response

Interest in this type of infrastructure emerged repeatedly in conversations with developers, GPU suppliers, model teams, investors and enterprise executives during meetings around SuperAI Singapore, the business said. Those discussions pointed to demand for inference infrastructure that is faster, more reliable, lower cost and ready for production use.

FAR AI is currently in a testing stage with closed partners. FAR Labs is offering early access registration to builders, along with 1 million free tokens for those joining the programme.

In a statement, Ilman Shazhaev, Founder and Chief Executive Officer of Dizzaract, set out the company's case for the market. "The price of AI keeps falling. Cost per token is down about 99% since 2021. And yet your AI bill keeps rising, because usage explodes faster than prices drop. Inference is becoming the single largest cost in AI," he said.

He said the company believes its lower pricing stems from how the network is built rather than temporary subsidies. "Our cost advantage is not a discount we are burning cash on. It is structural."

ChatGPT

Key takeaways Explain why it matters Create action plan Future watch

Claude

Key takeaways Explain why it matters Create action plan Future watch

Perplexity

Key takeaways Explain why it matters Create action plan Future watch

Grok

Key takeaways Explain why it matters Create action plan Future watch

Share Share

Add us as a preferred source on Google

Image: Ilman Shazhaev