
The "spring" of CPUs is coming in 2024!

2024-01-04 09:51:59

As the parameter scale of AI models continues to expand, their demand for computing power is rising sharply. To meet this demand, industries are actively building large-scale computing infrastructure, leaving specialized AI acceleration chips in short supply: they are not only difficult to procure but also costly.

As a result, some enterprises have turned their attention to the CPU (Central Processing Unit), the most widely deployed hardware today. The recent launch of the fifth-generation Intel Xeon Scalable processors has shown the industry once again that the CPU can also handle AI work efficiently, and that running AI on the CPU can be quite "sweet".

    The CPU's New Mission in AI  


Compared with training, AI inference demands relatively modest computing resources. For businesses or industries with lighter inference workloads, the CPU is a more cost-effective choice than dedicated AI acceleration chips. At the same time, because the CPU is the most widely deployed hardware, most enterprises prefer to build on their existing CPU-based IT infrastructure and architecture, avoiding the deployment challenges of heterogeneous platforms. Introducing AI acceleration into traditional architectures is the CPU's new mission in this era.
The fifth-generation Intel Xeon Scalable processor increases the core count to 64 and is equipped with 320MB of L3 cache and 128MB of L2 cache. Both single-core performance and core count are significantly improved over the previous generation. At the same power consumption, the fifth-generation Xeon Scalable processors deliver an average performance gain of 21%, up to 16% higher memory bandwidth, and nearly three times the L3 cache capacity.
Meanwhile, each core of the fifth-generation Xeon Scalable processor features built-in AI acceleration, improving training performance by 29% and inference performance by 42% over the previous generation.

The fifth generation of Xeon Scalable processors also saw significant improvements in AI workload processing. Starting with the fourth generation, Intel Advanced Matrix Extensions (Intel AMX) was introduced as a built-in AI acceleration engine, an innovation that enables the CPU to process AI workloads more efficiently. The Intel AVX-512 instruction set, likewise built into the fifth-generation Xeon, together with faster cores and faster memory, further boosts AI performance, enabling generative AI to run more workloads without a separate AI-specific accelerator. The leap in natural language processing inference performance helps organizations meet responsiveness requirements for workloads such as intelligent assistants, chatbots, predictive text, and language translation. With this processor, developers can run inference and fine-tuning on large language models with up to 20 billion parameters while keeping response latency below 100 milliseconds.
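Whether a given server actually exposes these instruction-set extensions can be verified before deploying AI workloads on it. As a minimal sketch (assuming a Linux host, where the kernel lists CPU feature flags in /proc/cpuinfo; the function name `cpu_supports` is illustrative, not from any library), the following Python snippet reports whether the AMX tile and AVX-512 foundation flags are present:

```python
def cpu_supports(flags, cpuinfo_path="/proc/cpuinfo"):
    """Report which of the requested ISA feature flags the host CPU exposes.

    Reads the first "flags" line of the Linux /proc/cpuinfo file; on systems
    where that file is missing (e.g. non-Linux), every flag is reported
    as unsupported rather than raising an error.
    """
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    present = set(line.split(":", 1)[1].split())
                    return {flag: flag in present for flag in flags}
    except OSError:
        pass
    # File missing or no "flags" line found: report nothing as supported.
    return {flag: False for flag in flags}

# Intel AMX tiles appear as "amx_tile"; the AVX-512 foundation as "avx512f".
support = cpu_supports(["amx_tile", "avx512f"])
print(support)
```

Frameworks such as PyTorch, TensorFlow, and OpenVINO detect these features and dispatch to the accelerated code paths automatically, so a check like this is mainly useful for capacity planning or diagnosing why a deployment is not seeing the expected speedup.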

  Protecting Cloud Service Vendors  


The explosion of generative AI brings new opportunities to the cloud computing industry, but also new challenges. Because large models require enormous computing power, cloud vendors must upgrade data center capacity quickly to meet AI demand while continuing to reduce TCO (total cost of ownership), so that they can offer users reasonably priced computing resources. In addition, AI application development involves storing and using large amounts of privacy-sensitive data in the cloud, so cloud vendors must upgrade their existing hardware infrastructure to keep this data safe and reliable and dispel users' concerns.
The fifth-generation Intel Xeon Scalable processor builds a favorable ecosystem for cloud service vendors on both the hardware and software fronts. On the hardware side, Intel SGX/TDX solutions provide end-to-end hardware-level protection for cloud data. On the software side, Intel has contributed optimizations for the fifth-generation Xeon Scalable processors to the industry-standard frameworks PyTorch and TensorFlow and to the OpenVINO toolkit, enabling cloud vendors and users to quickly tap processor capabilities such as Intel AMX with a low barrier to entry and relieve the computing bottleneck of AI applications.
The fifth-generation Intel Xeon Scalable processor acts as a strong backbone, providing solid computing power for cloud service vendors. It not only reduces operating costs but also builds a strong barrier for data security. More importantly, it streamlines AI application development, letting cloud service vendors also experience the "sweetness" of running AI on the CPU.

  Enterprises Get a First "Taste"  


Intel CEO Pat Gelsinger said at the Intel Innovation 2023 conference: "In this era of rapid development of artificial intelligence technology and industrial digital transformation, Intel maintains a strong sense of responsibility to help developers, making AI technology ubiquitous and making AI more accessible, visible, transparent, and trustworthy."
It is reported that 70% of inference workloads running in data centers today use Intel Xeon Scalable processors. With the launch of the fifth generation, some enterprises have already had a first "taste", and their products have seen significant gains in AI performance.
During the 11.11 shopping festival, Jingdong Cloud handled the surge in business volume with a new generation of servers based on the fifth-generation Intel Xeon Scalable processor. Compared with the previous generation of servers, whole-machine performance improved by 123%, AI computer vision inference performance rose to 138%, and Llama 2 inference performance rose to 151%, comfortably absorbing a 170% year-on-year increase in peak user visits and more than 1.4 billion intelligent customer service inquiries during the promotion.



