Nebius Group N.V. (Nebius) is a global technology company building full-stack AI cloud infrastructure to support the rapid growth of the AI industry.
The company’s intention is to power AI innovation with dedicated, high-performance infrastructure, including an AI cloud platform purpose-built for AI innovators, from individual developers to startups to the largest enterprises. To do this, the company provides the compute, storage, managed services and critical software they need to train, run, and deploy models and apps quickly and efficiently. The company delivers accelerated compute and data storage solutions that power AI application development and deployment at scale, offering customers a full range of consumption options, from on-demand to fully managed infrastructure to ‘bare metal’ deployments. The company is one of the largest specialized AI cloud providers, with a significant presence in Europe and rapid ongoing expansion in the U.S.
The company’s full-stack approach encompasses data centers, in-house-designed hardware, and an intelligent software layer, enabling the company to deliver accelerated compute clusters, a proprietary cloud platform, and advanced tools for AI model training and inference. This ensures end-to-end optimization that combines the reliability and user experience of a hyperscaler with the flexibility and efficiency of purpose-built AI infrastructure.
The company owns and operates a data center in Finland, and also operates co-location sites in France and Iceland. In early 2025, the company commissioned its first U.S.-based co-location site in Kansas City, Missouri, and began construction of a build-to-suit facility in Vineland, New Jersey. The New Jersey site is a phased development scalable up to 300 MW, with initial capacity expected to be available in the second half of 2025. The company plans to dedicate the facility’s incremental capacity to NVIDIA’s next-generation Blackwell GPUs. As of March 31, 2025, the company’s total capacity stood at approximately 30,000 GPUs, most of which are NVIDIA H200s. The company plans to further expand its data center footprint in other regions.
The company has optimized its AI-native cloud platform for highly intensive, distributed AI workloads. The company’s full-stack solution is built for efficiency and reliability, and optimizes resource allocation through continuous innovation across every layer of the company’s infrastructure. Unlike the majority of ‘neoclouds’, the company built its infrastructure from the ground up, designing servers and racks in-house and embedding innovation in the design of its data centers to maximize compute performance. This gives the company full control over performance optimization, reliability, and cost efficiency. Unlike off-the-shelf hardware, the company’s designs are tailored specifically for AI workloads, enabling optimized power and cooling efficiency, lower latency, and seamless integration with the company’s cloud platform. This not only improves performance and reliability, but also gives the company flexibility on pricing and delivers cost savings for customers by maximizing resource utilization and minimizing hardware bottlenecks, keeping the company’s offering competitive. This deep hardware integration delivers substantial benefits to customers building transformative applications across many diverse industries, including healthcare, robotics, and entertainment.
The company’s purpose-built software allows the company to quickly and efficiently provision compute resources on-demand from a single node to thousands of nodes. This flexibility ensures customers can handle everything from small-scale experiments in the company’s self-service offering, to enterprise-grade AI training and inference, without over-provisioning, and to adjust resources dynamically to meet their evolving needs.
The company’s close-knit and highly experienced team has over a decade of expertise working together to design, scale and operate data centers, cloud-based infrastructure and software solutions. With hundreds of engineers across Europe, the U.S. and Israel, including specialists in data-center construction and operations, hardware R&D, AI cloud platform development, and AI research and development, the company maintains full control over its technology stack, ensuring seamless integration of every aspect from infrastructure to AI services, with 24/7 service globally. The team’s long-standing partnerships with leading chipmakers and OEMs further enhance the company’s infrastructure capabilities.
The company’s customers range from technology companies and AI-native startups to research labs and individual developers building the next generation of AI models, applications and services. They choose the company’s platform for its flexibility, reliability, and comprehensive support for diverse AI workloads.
In addition to the company’s core AI infrastructure business, the company has three distinct businesses that operate under separate brands: Avride, a developer of autonomous driving technology for self-driving vehicles and delivery robots; Toloka, a data partner for AI model training, evaluation, and development; and TripleTen, a leading edtech platform focused on re-skilling individuals for careers in technology.
Core business: Nebius’s full-stack AI cloud offering
Nebius offers a comprehensive and integrated suite of AI cloud solutions, designed to support the entire AI lifecycle – from building and deploying AI models to managing large-scale AI applications.
At its foundation is a highly efficient and sustainable infrastructure layer that delivers scalable compute, storage, and networking resources engineered for high-performance AI workloads. This design offers greater compute capacity per unit of energy when compared to conventional solutions, while ensuring reliability and operational resilience.
Built on top of this robust foundation, the company’s AI cloud platform streamlines AI development and deployment. Completely rebuilt from the ground up and launched in October 2024, Nebius AI Cloud leverages the NVIDIA accelerated computing platform, featuring NVIDIA’s GPUs and networking, and offers key capabilities such as containerization, orchestration, and serverless computing. These features allow AI innovators to effectively scale AI workloads while seeking to maximize flexibility and performance.
At the top layer, the company’s AI and machine learning services provide a comprehensive suite of tools for data preparation, model training and model deployment. In addition to the company’s infrastructure and compute, the company offers pre-built AI applications and development tools that simplify AI adoption – particularly for organizations without dedicated AI engineering teams.
In November 2024, the company significantly extended the application layer of its stack with the launch of Nebius AI Studio, the company’s inference-as-a-service platform. AI Studio gives AI app builders access to dozens of open-source AI models, enabling them to build and ship innovative products quickly and at an optimized cost.
Hardware layer
Data Centers
With decades of experience in developing hundreds of megawatts of greenfield capacity, the company’s team has built large-scale, highly efficient data centers featuring supercomputing and efficiency innovation, including Finland’s first server-heat recovery system. The company leverages its advanced data-center design to enhance unit economics by reducing energy overheads, optimizing IT workload allocation, and lowering server maintenance costs, and to ensure scalability of capacity at each site.
To support the company’s rapid growth and geographical expansion, the company operates three types of data centers:
Greenfield
The company owns the land and manages the power infrastructure, and the company’s engineers design every aspect of the data center. This approach offers the greatest flexibility for optimizing energy efficiency and performance. The company’s Finland data center is a greenfield facility. It features one of the world’s leading power usage effectiveness (PUE) levels under high IT loads, employs a free-cooling system that eliminates the use of water and refrigerants, and has been recovering 15,000-20,000 MWh of server heat annually to warm local homes.
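PUE, the metric cited above, is defined as total facility energy divided by the energy consumed by IT equipment alone, so a facility with low cooling and conversion overhead approaches the ideal ratio of 1.0. The sketch below illustrates the calculation with purely hypothetical figures; they are not Nebius’s actual metrics.

```python
# Power usage effectiveness (PUE) = total facility energy / IT equipment energy.
# All figures below are hypothetical, for illustration only.

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Lower is better; an ideal facility approaches 1.0."""
    return total_facility_kwh / it_equipment_kwh

# A conventional site: significant overhead for cooling, power conversion, lighting.
conventional = pue(total_facility_kwh=15_000, it_equipment_kwh=10_000)  # 1.5

# A free-cooled site: minimal cooling overhead on top of the IT load.
free_cooled = pue(total_facility_kwh=10_800, it_equipment_kwh=10_000)  # 1.08

print(f"conventional PUE: {conventional:.2f}")
print(f"free-cooled PUE:  {free_cooled:.2f}")
```

In this illustration, eliminating most cooling overhead cuts non-IT energy from 50% of the IT load to 8%, which is the kind of gap free cooling and heat recovery are designed to close.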
Build-to-suit
The company partners with a developer who owns the land and secures the power, while the company provides custom specifications for the data-center buildout. This allows the company to drive energy efficiency and infrastructure optimization within the facility. The company’s New Jersey facility follows a build-to-suit model.
Co-location
The company leases capacity at existing data centers through third-party providers, enabling the company to rapidly deploy compute resources. While these facilities are not owned by the company, it applies rigorous selection criteria to ensure they meet the company’s performance, reliability and scalability standards. Operational efficiencies are achieved through the deployment of the company’s in-house-designed racks, optimizing power consumption. The company’s France, Iceland, and Kansas City deployments are all co-location sites.
Hardware
Designing the company’s servers and racks in-house gives the company full control over server prototyping, production and deployment, which is a key factor in reducing operational cost, accelerating time-to-market and scaling AI infrastructure. The company’s servers are engineered to operate at temperatures up to 40°C (104°F), compared to the ASHRAE standard limit of 27°C (81°F). This makes air cooling sufficient to maintain optimal performance for the current generation of chips, even those with high thermal density. Beyond energy saving, the company’s proprietary toolless rack design simplifies maintenance and repairs, so components can be replaced within minutes instead of hours, improving reliability and uptime. This also reduces staffing requirements, allowing one engineer to manage thousands of servers. Furthermore, the company’s streamlined design enhances workplace safety, reducing risks associated with complex traditional server maintenance.
As of March 31, 2025, the company had deployed approximately 30,000 GPUs across the company’s global data center footprint, primarily NVIDIA H200s. The company also plans to take delivery of over 22,000 NVIDIA Blackwell and Blackwell Ultra GPUs starting in 2025; the company’s New Jersey data center will be dedicated to NVIDIA Blackwell and Blackwell Ultra-architecture GPUs.
The company’s hardware stack consists of the following core components:
Compute – the company’s compute solutions include both GPU and CPU-only instances, providing flexibility for diverse workloads.
InfiniBand-connected GPU clusters – the company interconnects its GPU clusters using NVIDIA InfiniBand, ensuring high-speed, low-latency communication for distributed workloads.
Storage – the company offers a range of storage solutions to meet diverse customer demands, including block storage, shared file storage and object storage. The company’s platform combines in-house storage offerings with solutions from leading third-party storage providers, giving customers flexibility to optimize storage based on their individual needs.
Software Layer
Cloud Platform and Value Added Services
The company’s proprietary AI cloud platform is configured to support AI workloads of any scale, from single-GPU virtual machines to large clusters of many thousands of GPUs. It orchestrates resource usage across compute, storage and workload allocation, enabling efficient scaling and management of infrastructure while minimizing performance bottlenecks. For example, the company’s Kubernetes and SLURM orchestrators are built on industry-standard open-source solutions, delivering resilient and extensible infrastructure for managing containerized workloads and services at scale. The company’s Soperator solution, used by many of the company’s customers, is a SLURM-based workload manager for machine learning and high-performance compute clusters that enables robust job scheduling, fault-tolerant training, and a simplified user experience.
Every feature the company develops is designed with an intuitive user interface, covered by comprehensive documentation, and allows for seamless access via API and/or Terraform for programmatic orchestration. The company also prioritizes observability and logging, ensuring that critical information is accessible both through the UI and via an API.
The following graphic shows the company’s AI Platform and Applications and the company’s AI Cloud layers, both further described below.
On top of the company’s AI cloud infrastructure, the company has created a powerful AI Platform and Applications layer for the company’s customers featuring popular third-party and open-source software, which is available as managed services or through the company’s marketplace. This layer includes:
Ready to use LLMs – The company provides access to leading open-source models, including the Llama family, Mistral and DeepSeek-R1. These can also be pre-configured for optimal performance and usability.
Machine-learning operations (MLOps) – The company has built a suite of MLOps tools that streamline the entire machine-learning lifecycle, from data preprocessing and model training to continuous monitoring.
This includes:
MLflow for tracking experiments and efficient model management;
JupyterLab for simplified interactive computing for data and ML workloads;
Flowise for deploying ML pipelines with minimal coding;
Volcano for scheduling high-performance AI and deep learning workloads;
Kubeflow for streamlining deployment, scaling and management of ML workflows;
CVAT for annotating images and videos for computer vision tasks;
Apache Airflow for orchestrating and monitoring complex data workflows across distributed environments;
Apache Spark for high-performance data processing; and
ClearML Agent for working with the ClearML AI Platform to streamline AI adoption and the entire development lifecycle.
Vector databases – Efficient data storage and retrieval are critical for AI workloads and particularly for similarity search and large-scale embeddings. The company’s platform supports leading vector databases, such as Qdrant, Milvus and ClickHouse, enabling customers to store, index and access high-dimensional data with low latency.
Development operations (DevOps) – The company has assembled a suite of DevOps tools that enhance every stage of the software development lifecycle, from code integration and deployment to continuous monitoring and scaling. This includes:
Prometheus Operator, Node Exporter, Grafana, Prometheus and Kube State Metrics for comprehensive monitoring of clusters;
NGINX Ingress Controller for managing network traffic and cert-manager for automating TLS certificate management; and
Sealed Secrets for control of sensitive data, protecting it from unauthorized access.
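The similarity search underpinning the vector databases mentioned above can be sketched in plain Python: embeddings are stored as vectors, and the nearest documents to a query are ranked by cosine similarity. This is a conceptual toy with hypothetical vectors and document names; production systems such as Qdrant or Milvus add approximate indexing (e.g. HNSW) to keep lookups low-latency over billions of embeddings.

```python
# Minimal sketch of vector similarity search with hypothetical toy embeddings.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# A tiny in-memory "index" mapping document IDs to embedding vectors.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.2, 0.1],
    "doc_cars": [0.0, 0.1, 0.9],
}

def search(query: list[float], k: int = 2) -> list[str]:
    """Return the k document IDs most similar to the query vector."""
    ranked = sorted(index, key=lambda d: cosine_similarity(query, index[d]), reverse=True)
    return ranked[:k]

print(search([1.0, 0.0, 0.0]))  # → ['doc_cats', 'doc_dogs']
```

The exhaustive scan here is O(n) per query; the value a dedicated vector database adds is doing the same ranking approximately, in sub-linear time, at scale.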
Other Offerings
AI Studio is the company’s inference-as-a-service offering, designed to streamline application development for foundation model users and app builders. Unlike traditional GPU-per-hour pricing, AI Studio is monetized through a token-based model, offering customers greater flexibility.
Launched in October 2024, AI Studio provides access to dozens of the latest open-source text and multimodal LLMs, including DeepSeek-R1 and the Llama family, with key benefits, including low latency, verified model quality and expert support. Services available to date include real-time inference, batch inference, fine-tuning of models and image generation. Since its launch, AI Studio has attracted nearly 60,000 registered users.
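The difference between the two billing models can be illustrated with simple arithmetic. All prices and volumes below are hypothetical assumptions for illustration, not Nebius’s actual rates: under token-based billing a customer pays only for tokens processed, whereas a dedicated GPU reserved by the hour is billed regardless of utilization.

```python
# Hypothetical comparison of GPU-per-hour vs token-based inference billing.
# Every number here is an illustrative assumption, not an actual Nebius price.

def gpu_hour_cost(hours: float, price_per_hour: float) -> float:
    """Dedicated-GPU billing: pay for reserved time, used or not."""
    return hours * price_per_hour

def token_cost(tokens: int, price_per_million: float) -> float:
    """Token-based billing: pay only for tokens actually processed."""
    return tokens / 1_000_000 * price_per_million

# A bursty workload: 5M tokens served over a month.
per_token_bill = token_cost(tokens=5_000_000, price_per_million=0.50)

# The same workload on a GPU reserved for the whole month (~730 hours).
per_gpu_bill = gpu_hour_cost(hours=730, price_per_hour=2.00)

print(f"token-based:  ${per_token_bill:,.2f}")   # $2.50
print(f"gpu-per-hour: ${per_gpu_bill:,.2f}")     # $1,460.00
```

For low or unpredictable utilization the token model dominates; for a GPU kept saturated around the clock, per-hour reservation can be cheaper per token, which is why offering both consumption options gives customers flexibility.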
The company is also experimenting with other service offerings. For example, in the fourth quarter of 2024 the company introduced in beta TractoAI, the company’s serverless platform designed for scalable AI and Big Data workloads. It eliminates infrastructure management by automatically provisioning and scaling resources, combining serverless flexibility with hyperscaler-like performance. Built on the company’s bare metal and compute cloud, TractoAI optimizes resource allocation dynamically, handling everything from low-GPU tasks to large-scale AI training. It also offers high-throughput and exabyte-scale storage for structured, semi-structured and unstructured data. Clients benefit from flexible compute, enabling them to start small, scale up for tasks that demand more resources, and then scale down again as needed, avoiding overprovisioning and optimizing resource usage. Additionally, TractoAI’s pay-as-you-go model ensures users are billed only for actual compute consumption, making it well suited for workloads with unpredictable demands.
Data center footprint
The company has a broad data-center footprint across Europe and the U.S. As of March 31, 2025, the company had approximately 45 MW of connected capacity across both geographies, and is rapidly expanding it further. The company’s goal is to open more build-to-suit and greenfield data centers, with additional capacity deployments through co-locations as appropriate.
Europe
Mäntsälä, Finland – a greenfield data center built to the company’s own design specifications to optimize power and hardware for greater efficiency. The company signed the agreement for this greenfield data center in April 2015.
Paris, France – in July 2024 the company signed an agreement for its Paris data center, the company’s first co-location facility.
Keflavik, Iceland – in December 2024, the company signed an agreement in connection with adding a cluster of thousands of GPUs at a co-location in Iceland.
The United States
Kansas City – in November 2024, the company signed an agreement for its first co-location data center in the U.S., located in Kansas City, MO.
New Jersey – in February 2025, the company signed an agreement for its first build-to-suit facility, located in Vineland, NJ. The company is working with a developer who will build the facility based on the company’s reference design, and expects to launch the first clusters as early as summer 2025. This facility is a phased development expandable up to 300 MW.
The company is also actively exploring additional sites in the U.S. and other geographies to significantly expand the company’s capacity.
Customers and Go-To-Market Strategy
The company’s primary customers today are independent developers and AI-native technology companies. The company is also focused on building out its offering to meet the needs of frontier AI labs and enterprise customers, which the company sees as potentially important drivers of future revenue growth. Below the company provides more details on these three target customer segments.
Independent developers and AI Native Tech Companies
These are generally VC-backed AI-native technology companies that are building AI-specific solutions and need a full-stack AI cloud service that is flexible, scalable, and can meet their AI workload needs. The AI workloads that these customers run with Nebius include training, fine-tuning and inference using proprietary as well as open-source models. Typically, these customers make use of a range of the different products and services available on the company’s platform, including managed services for workload management and orchestration, as well as MLOps tools. This customer segment also includes independent developers and researchers who are able to access the company’s AI cloud platform via the company’s self-service offering, which provides instant access to GPUs on demand.
Frontier AI Labs
These companies are at the forefront of AI research and development and require massive, scalable compute infrastructure to support the training, fine-tuning and deployment of large-scale AI models, particularly large language models (LLMs) that utilize hundreds of billions or trillions of parameters. Their workloads are computationally intensive, demanding high-performance GPUs, low-latency networking and distributed storage solutions to process vast datasets efficiently.
Enterprise Customers
These include mid-market and larger enterprises that plan to use AI to drive efficiencies and optimized results within their organization. Use cases can range from in-house model development and fine-tuning to the deployment and inferencing of tools built on open-source AI models. The company anticipates that the scale of deployments from enterprise customers and related compute requirements, in particular inference workloads, will grow substantially over time as AI models become more widely available and cost effective to deploy in production systems.
With respect to the company’s go-to-market efforts, the company has made, and continues to make, significant investments in its sales and marketing functions to expand its customer base and build its brand recognition. The company’s direct sales teams across the U.S. and Europe continue to scale, supported by growing pre-sales and post-sales teams that seek to ensure customer success from initial engagement through deployment. This includes dedicated system architects who lead proof-of-concepts and accelerate onboarding, as well as robust ongoing technical and engineering support. Furthermore, the company is expanding its reach through a growing network of strategic channel partners. Independent software vendors (ISVs) focused on AI and ML, value-added resellers (VARs), system integrators (SIs), and other strategic partners are becoming an essential extension of the company’s go-to-market engine. These partners not only have the potential to broaden the company’s distribution, but also enable the company to penetrate new markets, deliver joint solutions, and scale more efficiently.
Other Businesses and Investments
Avride
Avride is a developer of autonomous driving technology for self-driving cars and delivery robots, addressing use cases across ride-hailing, logistics, e-commerce, food and grocery delivery. Avride’s main operations are in Austin, Texas, with additional R&D hubs in Europe, Israel and South Korea.
In 2024, Avride signed a multiyear partnership with Uber to deploy its autonomous vehicles and delivery robots on Uber and Uber Eats in the U.S. As part of this collaboration, Uber Eats launched delivery services utilizing Avride’s sidewalk robots in Austin and Dallas, TX, in 2024, with further expansion to Jersey City, NJ, in February 2025. The partnership also encompasses mobility solutions, which are expected to launch for riders in Dallas in 2025.
Avride has also partnered with Grubhub, deploying its sidewalk robots for last-mile deliveries at the Ohio State University campus. Within the first week of deployment, the number of daily deliveries exceeded 1,000. In February 2025, following the receipt of certification for its delivery robots in Japan in December 2024, Avride entered into a commercial partnership with Rakuten, Japan’s largest e-commerce player, to deploy Avride’s autonomous robots for restaurant and grocery deliveries in central Tokyo.
In March 2025, Avride entered a strategic partnership with Hyundai for the joint development of an autonomous driving platform and the expansion of its fleet. As part of this collaboration, Avride will initially deploy 100 Hyundai Ioniq 5 SUVs retrofitted with autonomous driving technology, with plans for further fleet expansion.
The company is actively exploring third-party investment into Avride, including transactions in which the company may cede control.
Toloka
Toloka is a leading data provider for LLM and generative AI developers, delivering scalable, high-quality data solutions for all stages of AI development, including training, fine-tuning, alignment, and evaluation. By leveraging AI and expert-human input, Toloka ensures the production of complex, high-quality data at scale.
With an R&D hub in Europe and offices in the U.S., Toloka serves a global client base, including AI research labs, foundational model developers, Fortune 500 companies, and GenAI startups. Its network of domain experts, annotators, and writers spans over 20 knowledge areas and 120 subdomains, providing broad industry coverage.
In 2024, Toloka transitioned to a new technology platform. The company has expanded its generative AI data offerings to include red-teaming for AI agents, evaluation of reasoning models, and scalable training data powered by coding and math experts.
The company is in advanced negotiations regarding a potential third-party investment into Toloka, in which the company may cede control.
TripleTen
TripleTen is an edtech platform focused on reskilling individuals for careers in technology. As of December 31, 2024, the company offered six immersive B2C study tracks – software engineering, quality assurance, BI analytics, data science, cybersecurity, and UX/UI design – principally in the U.S. and Latin America, as well as B2B AI-focused courses for both beginners and professionals.
TripleTen operates on a proprietary tech stack and automated platform that enables scalable course development, localization, and expansion at minimal incremental cost.
In 2024, Fortune Magazine recognized TripleTen as the best overall provider of software engineering bootcamps in the U.S. The company is among the top-rated U.S. edtech providers based on employment outcomes and student feedback, with a 4.87/5 rating on major bootcamp review platforms SwitchUp and Course Report as of December 2024. More than 14,000 students enrolled in 2024, a 149% increase from 2023.
Material Investments
The company holds a 28% ownership stake in ClickHouse, an open-source, column-oriented DBMS provider that was spun off from the group in September 2021.
Competition
The company’s primary competitors are specialist AI infrastructure providers, including developers of AI-centric cloud services, providers of bare-metal GPU clusters and GPU-centric data-center companies. This group of competitors includes CoreWeave, Crusoe and Lambda Labs.
In addition, as a full-stack AI cloud provider, the company also faces competition from general-purpose cloud computing providers that are developing AI-specific offerings, such as Amazon (AWS), Google (Google Cloud Platform), Microsoft (Azure) and Oracle.
Avride competes with other major developers of self-driving technologies, including Waymo, Zoox, and others. Toloka competes with other GenAI data providers, including Scale AI, Snorkel AI and Surge AI, while TripleTen primarily competes with a number of U.S.-based edtech bootcamp providers.
Product Development
The company’s product development expenses were $129.7 million in 2024.
History
The company was founded in 1989. It was incorporated under the laws of the Netherlands in 2004. The company was formerly known as Yandex N.V. and changed its name to Nebius Group N.V. in 2024.