Zoho’s Homegrown Server Cuts AI Inference Costs by 20-30%

Zoho is taking aim at the rising cost of running AI models. The enterprise SaaS company unveiled Nathu La, a server platform designed from the motherboard up by its own R&D team and built with Indian manufacturing partners. Zoho says the hardware cuts total cost of ownership by 20-30% and reduces power consumption by up to 18%.

The server runs on Intel Xeon 6 processors. Zoho engineered the motherboards, networking cards and DC‑SCM security modules in‑house. Five patent filings cover the thermal management and modular architecture that drive the efficiency gains. The company has roughly 1,000 Nathu La servers in production and pre‑production, and plans to have 2,000 by year end.

The timing targets a specific pain point. AI inference – the stage where a trained model processes new data – has become a fast‑growing operational expense for companies running AI at scale. Renting GPU clusters or paying per‑call for cloud inference APIs adds a recurring cost that compounds with usage. Zoho’s approach replaces that rental layer with owned hardware for its own workloads.

CEO Shailesh Davey framed the move as stacking advantages. “With Zoho’s strategy of using contextual, right‑sized models, running on our own platform, now on our own servers, accelerated by our own GPU database, we are compounding the benefits accrued from owning and operating our entire technology stack,” he said.

The math works partly because Zoho controls the full software stack above the processor. That means it can tune models for the hardware it owns, extract more work per watt and avoid licensing fees that third‑party server designs carry. The 20-30% TCO reduction reflects those savings.
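To make the 20-30% range concrete, here is a minimal arithmetic sketch in Python. The baseline figure is purely hypothetical and chosen for round numbers; Zoho has not disclosed its actual server costs, and the function name is illustrative only.

```python
def tco_after_reduction(baseline_tco: float, reduction_pct: float) -> float:
    """Return the total cost of ownership left after a percentage reduction."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    return baseline_tco * (100 - reduction_pct) / 100


# Hypothetical baseline: lifetime cost of a comparable third-party deployment.
# This number is an assumption for illustration, not a figure from Zoho.
baseline = 1_000_000

# Zoho's stated range is a 20-30% TCO reduction.
best_case = tco_after_reduction(baseline, 30)   # largest claimed saving
worst_case = tco_after_reduction(baseline, 20)  # smallest claimed saving

print(f"TCO after reduction: {best_case:,.0f} to {worst_case:,.0f}")
print(f"Saving: {baseline - worst_case:,.0f} to {baseline - best_case:,.0f}")
```

Under that assumed baseline, the claimed range works out to a saving of between one fifth and just under one third of the original cost. The absolute amount scales linearly with whatever the real baseline is.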

Independence has limits. Nathu La still depends on Intel for its chips. Zoho is not building its own silicon. The processor bill remains exposed to Intel’s pricing and supply chain. The real leverage comes from everything above the chip – the board design, the cooling system, the software optimisation. Zoho owns that layer fully.

The wider context is a macro concern raised repeatedly by Sridhar Vembu, Zoho’s cofounder. He likens India’s growing spending on foreign AI compute to an “oil import bill” – a recurring outflow that strains the current account deficit. Zoho itself spends a few million dollars a year on AI model subscriptions, Vembu said. A broader market analysis of India’s tech import data shows the pressure. Nathu La is one company’s effort to cap that outflow for its own operations.

As more Indian enterprises deploy models for customer service, code generation and data processing, the inference cost line item grows. Zoho’s approach – building hardware domestically, controlling the stack – shows a path that other firms with enough scale could follow. The 2,000‑server target by year end provides a concrete benchmark for the push.
