Inference Time Compute

OpenAI Presents Research on Inference-Time Compute to Better AI Security

A monthly overview of things you need to know as an architect or aspiring architect. Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with ...

NextBigFuture

OpenAI Strawberry LLM Reasoning Needs More Compute and Energy for Inference

Jim Fan is one of Nvidia’s senior AI researchers. The shift could be about many orders of magnitude more compute and energy needed for inference that can handle the improved reasoning in the OpenAI ...

EDN

Analog in-memory compute tackles the AI inference conundrum

An analog in-memory compute chip claims to solve the power/performance conundrum facing artificial intelligence (AI) inference applications by facilitating energy efficiency and cost reductions ...

VentureBeat

Train-to-Test scaling explained: How to optimize your end-to-end AI compute budget for inference

The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use ...

The Next Platform

The Battle Begins For AI Inference Compute In The Datacenter

The major cloud builders and their hyperscaler brethren – in many cases, one company acts like both a cloud and a hyperscaler – have made their technology choices when it comes to deploying AI ...

EDN

Purpose-built AI inference architecture: Reengineering compute design

Over the past several years, the lion’s share of artificial intelligence (AI) investment has poured into training infrastructure—massive clusters designed to crunch through oceans of data, where speed ...

VentureBeat

Hugging Face shows how test-time scaling helps small language models punch above their weight

In a new case study, Hugging Face researchers have demonstrated how small language models (SLMs) can be configured to outperform much larger models. Their findings show that a Llama 3 model with 3B ...

Tech Wire Asia

AMD says Malaysia has a role in Southeast Asia’s yotta-scale AI infrastructure push

AMD sees Malaysia as part of its SEA AI infrastructure focus, supported by data centres, cloud adoption, talent, and ...

InfoQ

Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design

Some results have been hidden because they may be inaccessible to you

Show inaccessible results