
Out of the clouds: the downstream focus for AI in 2025

Rapid shifts in AI innovation are rewriting the rules of competition

Artificial intelligence (AI) is transforming industries at breakneck speed, reshaping markets, business models, and consumer experiences alike. The recent shockwaves caused by DeepSeek’s emergence underscore just how volatile the AI landscape has become.

As AI capabilities accelerate, so too do competitive tensions, regulatory scrutiny, and shifting market dynamics. While competition regulators have feared that AI’s reliance on cloud computing could entrench the dominance of a few major players, there are clear signs that a new shift may now be underway. In 2025, the focus is no longer just on who builds the most powerful AI models, but on where and how those models are deployed – and that shift is rewriting the rules of competition.

So far, access to immense computing power has played a critical role in the development and deployment of AI models. Accordingly, the regulatory spotlight has focused on the upstream competition issues in the AI market, including cloud computing services, chip manufacturing, and the dynamic partnerships between AI development labs and digital platforms, as seen with OpenAI and Microsoft.

However, as we look to 2025, regulatory and business focus is moving towards the evolving competition dynamics in the downstream deployment of AI. Competition authorities are increasingly tasked with striking a delicate balance: facilitating the unlocking of AI’s economic benefits while preserving competition in a rapidly evolving market.

As AI moves from being almost entirely cloud-based to more decentralised development and deployment, a range of new challenges and opportunities are emerging. One such issue is the rise of smaller, efficient models that facilitate edge inference, where AI models run locally on devices rather than relying on cloud infrastructure.

Increasingly efficient AI models like OpenAI's GPT-4o mini are accelerating the move to on-device processing


The Rise of Local Inference: AI Moves to the Edge

Generative AI model development unfolds in three primary phases: training, fine-tuning and inference. Training is the process where models “learn” to identify patterns across huge swathes of data, relying on massive computational resources typically provided by the cloud.

Fine-tuning is the process of teaching the trained model to undertake a given task. In the case of general-purpose AI models, like OpenAI’s GPT-4o and Google’s Gemini, this has to date relied on large, labelled datasets and powerful cloud compute to accommodate the huge variety of tasks that users might set.
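
In code, fine-tuning is little more than resuming training on a smaller, labelled dataset. The loop below is a deliberately minimal sketch in generic PyTorch, assuming a Hugging Face-style model that returns a loss when given labelled batches; every name here is illustrative rather than any lab's actual pipeline.

```python
import torch
from torch.utils.data import DataLoader

def fine_tune(model, labelled_dataset, epochs: int = 3, lr: float = 1e-5):
    """Nudge a pre-trained model's weights using labelled examples.

    Assumes each batch is a dict of tensors (input_ids, attention_mask,
    labels) and that model(**batch) returns an object with a .loss, as
    Hugging Face causal-LM models do.
    """
    optimiser = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in DataLoader(labelled_dataset, batch_size=8, shuffle=True):
            loss = model(**batch).loss   # supervised signal from the labels
            loss.backward()              # backpropagate through all weights
            optimiser.step()
            optimiser.zero_grad()
    return model
```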

In contrast, inference occurs when the trained model applies its knowledge to new data, such as when a user interacts with ChatGPT or Microsoft Copilot. Initially, inference was almost entirely cloud-based, but advances in model fine-tuning and efficient inference have since made on-device processing increasingly practical.
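
To make the cloud/edge distinction concrete, the sketch below sends the same prompt down both paths: once to a hosted model via an API (here OpenAI's Python client) and once to a small open model running entirely on the local machine via Hugging Face's transformers. The model identifiers are illustrative choices, not a statement about any particular product pairing.

```python
# Cloud inference: the prompt leaves the device and is processed on a
# provider's servers. Assumes the `openai` package and an API key in the
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise this note in one line."}],
)
print(response.choices[0].message.content)

# Edge inference: a small open model runs entirely on the local machine,
# so no data is transmitted. Assumes `transformers` and `torch` installed;
# the model identifier is illustrative.
from transformers import pipeline

local = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
print(local("Summarise this note in one line.", max_new_tokens=50)[0]["generated_text"])
```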

Throughout 2024, we saw the release of smaller, more efficient AI models that offer strong performance while requiring a fraction of the compute power for inference, such as OpenAI’s market-leading GPT-4o mini. That performance is now being matched by distilled versions of DeepSeek’s much-hyped R1 model, built by transferring R1’s capabilities into open-source base models such as Meta’s Llama and Alibaba’s Qwen.
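
As a hedged illustration of how accessible these distilled models have become, the snippet below loads one with 4-bit quantisation so that it fits in consumer-grade memory. The identifier follows the naming of DeepSeek's published distills, but treat the exact name and settings as assumptions rather than a recommendation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed identifier
quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant,   # 4-bit weights: roughly 1 GB vs ~3 GB in fp16
    device_map="auto",           # place layers on GPU/CPU as memory allows
)

prompt = "Why might on-device inference reduce latency?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```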

These innovations are accelerating the move towards decentralised AI deployment, as on-device inference becomes increasingly viable. A prime example is the integration of ChatGPT into Apple’s iOS, iPadOS, and macOS, which enhances Siri’s capabilities and offers improved writing tools, with much of the processing handled locally on the device.

The shift towards more efficient, on-device inference has significant implications for the AI market’s structure and could ultimately promote a more diverse and competitive AI ecosystem.
David Dorrell
Head of Data Science

Local Inference and the Changing Market Structure

The shift towards more efficient, on-device inference has significant implications for the AI market’s structure and could ultimately promote a more diverse and competitive AI ecosystem, as companies innovate to leverage these new capabilities. This, in turn, may address some of the regulatory concerns over potential cloud-based AI monopolies.

Major tech firms are increasingly adopting local inference to strengthen their hardware offerings. For example, Microsoft’s launch of laptops with integrated neural engines and AI models allows developers to build apps that run directly on the device, reducing reliance on cloud services. Similarly, Google is embedding advanced AI capabilities directly into its smartphones. This transition challenges traditional cloud service models, as on-device inference reduces the dependency on large-scale compute power.

Cloud providers may need to reassess their strategies. While on-device inference could dent the predicted revenue growth from cloud-based AI services, it could also drive demand for more advanced hardware and increase the importance of operating systems in the AI ecosystem – areas where companies like Microsoft and Google already have strong market positions.

By reducing reliance on cloud infrastructure, on-device computing could lower developers’ operating costs, potentially leading to reduced prices for consumers and attracting more developers to the AI space. It will be fascinating to see how major cloud providers that have made significant AI-related investments – such as Amazon and Microsoft – navigate these developments.

One potential outcome is that the battle for control over key inputs in the generative AI value chain intensifies but shifts focus towards other critical factors, such as the data used to train models and access to specialised expertise (a topic we explored in a previous article, Bright Stars, Dark Clouds). Alternatively, the primary battleground could move downstream, with less emphasis on securing foundational resources and more on how those resources are strategically deployed within digital ecosystems.

On a larger scale, the decentralisation of AI could align with global policy shifts favouring technology independence and regional innovation. If the AI market pivots away from US-based cloud services, policymakers may encourage local alternatives that offer greater control over data and infrastructure. This shift could impact trade flows, regional competition, and the global AI landscape.



Benefits for Consumers: Faster, More Secure, and Cost-Effective AI

On-device inference offers numerous advantages to consumers, including cost efficiency, enhanced user experience, improved privacy, and better performance. Unlike cloud-based AI, which incurs recurring costs for data transfer and server infrastructure, on-device inference runs on hardware and power the user already owns, eliminating many of these costs. The result is more affordable AI and a smoother, more reliable experience, with reduced latency and offline capability making the system faster and more responsive.

Privacy and security are also key benefits. With no data being transmitted to remote servers, on-device inference ensures that personal information remains private, reducing the risks associated with data breaches and fostering greater trust among users. This increased trust may encourage users to share more data, such as GPS coordinates, as they feel more secure. Although challenges such as the need for additional device memory and potential impacts on battery life remain, on-device inference is set to offer a more secure, personalised, and efficient AI experience for consumers.

The Road Ahead for AI

The future of AI remains highly uncertain, but 2025 has started with a bang. The recent shock emergence of the R1 model from DeepSeek, a China-based AI lab, shows how quickly innovations are coming in the model development and fine-tuning stages. By employing a “mixture of experts” approach, which activates only the components of the model required for a given task, along with reinforcement learning, DeepSeek appears to have dramatically cut the compute required to develop and fine-tune its R1 model, and with it the associated costs.
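
To see why a mixture-of-experts design saves compute, consider the toy routing layer below: a gating network scores several expert sub-networks for each token, and only the top-scoring few are ever run, so most of the model's parameters sit idle on any given input. This is a generic textbook sketch in PyTorch, not DeepSeek's actual R1 architecture.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer: a gate picks the top-k experts
    per token, and only those experts are evaluated."""
    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts)   # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, dim)
        scores = self.gate(x)                              # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)         # keep top-k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens routed to expert e
                if mask.any():                             # run only the chosen experts
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(16, 64)            # 16 tokens, model dimension 64
print(TopKMoE(64)(x).shape)        # torch.Size([16, 64])
```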

These changes, along with the rise of edge-based inference, may leave the AI market facing diverging strategies from key players. Operating system providers, cloud services, and app stores may continue to pursue different paths, with some doubling down on vertical integration to control key niches, while others embrace openness and interoperability to drive broader innovation. These competing strategies will shape the trajectory of AI technologies and determine how benefits are distributed across the ecosystem – from consumers and businesses to governments.

As AI continues to evolve, one thing is clear: the focus of AI in 2025 is shifting out of the cloud. This change could unlock new opportunities for competition, growth, and innovation, but it will also introduce fresh challenges for regulators trying to ensure a level playing field. The real question will be whether regulators can keep pace with the rapid evolution of AI technology and maintain competitive fairness in a world that is increasingly defined by on-device intelligence.