As artificial intelligence continues to redefine the enterprise landscape, the dialogue around AI adoption is decisively shifting from mere curiosity—“what if”—to practical implementation—“how.” At the recent Red Hat Summit, Red Hat unveiled a portfolio of advancements aimed at transforming AI initiatives from exploratory experiments into dependable business assets.

A central feature of this evolution is the concept of Models-as-a-Service (MaaS). MaaS is designed to make AI more accessible and efficient by allowing a small group of AI specialists to deploy models as consumable API endpoints. This approach enables broader organisational access to AI capabilities without requiring deep technical expertise from every user. By centralising AI resources, MaaS significantly reduces the complexity and costs associated with deploying multiple independent models. It also accelerates innovation cycles and enhances data privacy and security by empowering organisations to maintain control over their AI environments. Such an approach is crucial in enterprise settings where cost efficiency, speed, and privacy weigh heavily on deployment decisions.
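
To make this concrete, a team consumes a centrally hosted model over a plain HTTP API rather than operating it themselves. Below is a minimal sketch, assuming the platform exposes an OpenAI-compatible endpoint (common for vLLM-based serving); the gateway URL, token, and model name are hypothetical:

```python
# Minimal sketch: consuming a centrally hosted model as an API endpoint.
# Assumes an OpenAI-compatible API; the URL, token, and model name are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="https://maas.internal.example.com/v1",  # hypothetical internal MaaS gateway
    api_key="team-scoped-token",                      # hypothetical credential
)

response = client.chat.completions.create(
    model="granite-3-8b-instruct",  # hypothetical centrally managed model
    messages=[{"role": "user", "content": "Summarise this quarter's incident reports."}],
)
print(response.choices[0].message.content)
```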

Beyond accessibility, MaaS helps enterprises clear significant adoption hurdles, notably managing complex AI infrastructure and addressing data security concerns. Centralisation reduces redundancy across AI deployments while preserving operational control and the flexibility to choose or customise models to suit specific needs.

Complementing MaaS, Red Hat has deepened its AI ecosystem through a notable collaboration with EnterpriseDB (EDB). EDB Postgres AI, built on the widely used PostgreSQL database, integrates features such as pgvector for storing and querying vector embeddings, the numerical representations of data that let applications match on meaning rather than keywords. Combined with Red Hat OpenShift AI, which provides a unified platform for managing the lifecycle of generative and predictive AI across hybrid cloud environments, the integration paves the way for accelerated, secure, and scalable AI development. It notably supports retrieval-augmented generation (RAG), a method that improves the relevance and accuracy of large language model (LLM) responses by grounding them in domain-specific internal data.
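
To make the retrieval step of RAG concrete, here is a minimal sketch of a similarity search against a pgvector-enabled PostgreSQL database; the connection string, table, and embed() helper are hypothetical stand-ins:

```python
# Minimal sketch of the RAG retrieval step against a pgvector-enabled PostgreSQL
# database. The connection string, table, and column names are hypothetical, and
# embed() is a dummy stand-in for a real embedding model.
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

def embed(text: str) -> np.ndarray:
    # Stand-in: replace with a real embedding model (e.g. one served via MaaS).
    # The dimension must match the table's vector column.
    return np.zeros(768)

with psycopg.connect("dbname=appdb") as conn:
    register_vector(conn)  # registers the pgvector type with psycopg
    rows = conn.execute(
        # <=> is pgvector's cosine-distance operator; smaller means more similar
        "SELECT content FROM documents ORDER BY embedding <=> %s LIMIT 5",
        (embed("How do I rotate service account keys?"),),
    ).fetchall()

# The retrieved passages become grounding context in the LLM prompt.
context = "\n\n".join(content for (content,) in rows)
```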

The Red Hat Summit also introduced technical innovations aimed at improving AI model accuracy and efficiency at runtime. One such advancement is inference-time scaling (ITS), which improves response precision without retraining the model. ITS uses verifier-guided parallel search: the model generates multiple candidate outputs and a verifier selects the most accurate one, yielding notable accuracy gains on demanding tasks such as financial queries, with performance rivalling top-tier models such as GPT-4o. For enterprises, this means higher AI quality can be achieved more cost-effectively by optimising an existing model's runtime behaviour rather than simply increasing model size.
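
The announcement does not spell out the algorithm, but verifier-guided parallel search is commonly realised as best-of-N sampling: draw several candidate answers in parallel and let a scoring model pick the winner. A minimal sketch, with generate() and verify() as hypothetical stand-ins for the base model and the verifier:

```python
# Minimal sketch of verifier-guided parallel search (best-of-N sampling).
# generate() and verify() are hypothetical stand-ins for the base LLM
# and the verifier model.
from concurrent.futures import ThreadPoolExecutor

def generate(prompt: str) -> str:
    """Stand-in: one sampled completion from the base model."""
    raise NotImplementedError

def verify(prompt: str, candidate: str) -> float:
    """Stand-in: verifier's score for how likely the candidate is correct."""
    raise NotImplementedError

def best_of_n(prompt: str, n: int = 8) -> str:
    # Sample n candidates in parallel: more compute at inference time,
    # no retraining of the underlying model.
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda _: generate(prompt), range(n)))
    # The verifier selects the answer it scores highest.
    return max(candidates, key=lambda c: verify(prompt, c))
```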

Addressing a key barrier to AI deployment—data management—Red Hat has integrated Feast, an open source feature store that centralises how data features are stored, managed, and served for AI models. Feast overcomes challenges like training-serving skew and redundant feature computations, ensuring that AI applications receive consistent, high-quality data in real time. This integration within OpenShift AI standardises deployment processes and has already proven effective in industries requiring real-time personalisation, fraud detection, and RAG applications.
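
As a sketch of what this looks like at inference time, the snippet below fetches features from a Feast online store; the repository path, feature names, and entity key are illustrative assumptions:

```python
# Minimal sketch: fetching online features from a Feast feature store at
# inference time. The repo path, feature names, and entity key are hypothetical.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # directory containing feature_store.yaml

features = store.get_online_features(
    features=[
        "transaction_stats:txn_count_7d",
        "transaction_stats:avg_amount_30d",
    ],
    entity_rows=[{"user_id": 1001}],
).to_dict()

# The same feature definitions feed both training and serving,
# which is what prevents training-serving skew.
print(features)
```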

Scalability and performance in deploying large language models are addressed by the new Red Hat AI Inference Server. Built on the high-performance vLLM engine and equipped with techniques such as PagedAttention, multiple parallelism strategies, and model optimisation tools like LLM Compressor, the server ships as a containerised solution for uniform deployment across hybrid environments, including OpenShift and RHEL. It supports a range of hardware accelerators (NVIDIA, AMD, Google TPUs) and provides access to curated, optimised model repositories, delivering state-of-the-art AI performance with enterprise-grade flexibility.
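
The server itself is consumed as a container image, but the vLLM engine underneath can be sketched with vLLM's standard offline Python API; the model identifier below is an illustrative assumption:

```python
# Minimal sketch of the vLLM engine that underpins the server, using vLLM's
# standard offline inference API. The model identifier is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-3.0-8b-instruct")  # any HF-style model id
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(
    ["Explain PagedAttention in two sentences."],
    params,
)
print(outputs[0].outputs[0].text)
```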

Looking forward to the next AI frontier, Red Hat is focusing on agentic AI systems—autonomous agents capable of executing complex, goal-driven tasks rather than responding to isolated queries. To streamline the complexity involved in building such agents, Red Hat is integrating Llama Stack and Model Context Protocol (MCP) into its AI portfolio. OpenShift AI serves as the robust foundational platform, offering an AI API server that functions analogously to a Kubernetes control plane for AI. This setup abstracts complexity and provides a consistent, extensible environment for developing or integrating agent frameworks, alongside enterprise-grade security and hybrid cloud capabilities.
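
Llama Stack and MCP standardise how agents reach models and tools; the agent itself reduces to a loop of planning, acting, and observing. A minimal, framework-agnostic sketch in which every name is hypothetical:

```python
# Minimal, framework-agnostic sketch of an agentic loop: the model plans,
# calls tools, observes results, and repeats until it can answer.
# All names are hypothetical; Llama Stack and MCP standardise the real plumbing.

def llm_step(history: list[dict]) -> dict:
    """Stand-in: ask the model for the next action or a final answer."""
    raise NotImplementedError

TOOLS = {
    "search_tickets": lambda q: f"results for {q!r}",  # stand-in tool
}

def run_agent(goal: str, max_turns: int = 8) -> str:
    history = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        action = llm_step(history)
        if action["type"] == "final":     # the model decided it is done
            return action["content"]
        tool = TOOLS[action["tool"]]      # the model chose a tool to call
        observation = tool(action["input"])
        history.append({"role": "tool", "content": observation})
    return "stopped after max_turns without a final answer"
```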

Crucially, Red Hat is addressing the need for “validated” AI models beyond the hype, promoting a framework for scalable, reproducible performance assessments under real-world conditions. Fragmented validation methods previously exposed enterprises to risks of unreliable deployments and high costs. Red Hat’s approach includes rigorous testing and transparency around model capabilities, accuracy, and cost, thereby fostering confidence in AI adoption.
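
In spirit, a reproducible assessment pins an evaluation set and scores every candidate model against it in exactly the same way. A minimal sketch, with ask_model() as a hypothetical stand-in for whichever endpoint is under test:

```python
# Minimal sketch of a reproducible model validation run: a pinned evaluation
# set and one comparable accuracy number per model. ask_model() is a
# hypothetical stand-in for a call to the candidate model's endpoint.

EVAL_SET = [  # pinned, versioned test cases (illustrative)
    {"prompt": "What is the settlement date for a T+2 trade executed on Monday?",
     "expected": "Wednesday"},
]

def ask_model(model: str, prompt: str) -> str:
    """Stand-in: query the candidate model's endpoint."""
    raise NotImplementedError

def evaluate(model: str) -> float:
    hits = sum(
        case["expected"].lower() in ask_model(model, case["prompt"]).lower()
        for case in EVAL_SET
    )
    return hits / len(EVAL_SET)  # same set, same check: comparable across models
```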

In summary, Red Hat is positioning itself at the forefront of enterprise AI innovation by combining open source technologies, hybrid cloud orchestration, and advanced AI optimisation techniques. Its comprehensive approach—from MaaS and feature stores to inference optimisation, agentic AI platforms, and validated models—aims to equip organisations with practical, reliable tools to harness AI’s full potential sustainably and securely. This strategy reflects a mature phase in enterprise AI, where effective deployment, data management, operational control, and performance validation become as critical as the underlying algorithms themselves.

Source: Noah Wire Services
