Introduction and Investment Thesis
Early this year, I detailed why Intel (INTC) was poised to lead the AI revolution over the coming decade. The widespread adoption of AI will contribute significantly to demand for (Intel) compute silicon and hence will be a growth driver for the company. Intel forecasts AI will represent about a $25 billion opportunity by 2025, up from $3.8 billion in revenue in 2019.
On June 18, Intel launched its third-generation Xeon Scalable platform, codenamed Cooper Lake. This follows a bit over a year after the company’s April 2019 data-centric portfolio launch, which included second-generation Cascade Lake, the 10nm Agilex FPGA and 800 series of 100G Ethernet adapters.
Today’s announcement is equally broad: it covers not just the new Cooper Lake CPUs. Also launching or announced are new SSDs, 3D XPoint DIMMs, the 14nm Stratix 10 NX FPGA for AI acceleration, as well as several platform updates.
While Cascade Lake was aimed at all data center segments, the Cooper Lake platform is aimed mostly at AI acceleration; it won’t be until Ice Lake later this year that general-purpose servers see their next increase in performance. (The Cascade Lake refresh earlier this year improved performance per dollar, but not top-line performance.)
Cooper Lake and the Stratix 10 NX FPGA are two products targeted specifically at AI acceleration, providing up to 2x and 15x performance increases, respectively. Intel also provided a brief status update on Habana and the Movidius VPU3: the former is sampling, the latter in early access.
While Intel isn’t updating on its financials, this latest launch demonstrates the company’s commitment to AI leadership, indicating that my January investment thesis remains on track:
Put another way, Intel’s data center group has the potential to double in size over the next five years from artificial intelligence alone. Intel’s other growth drivers in the data center are cloud computing and (5G) networking.
With its comprehensive A.I. product portfolio, Intel should be positioned to capitalize on this opportunity.
Nvidia (NVDA) has been marching into the data center with its deep learning accelerators, arguably taking meaningful dollar revenue share from Intel, and more recently bolstering its position with the acquisition of Mellanox. It has just launched its next-gen Ampere GPUs with significant performance increases; Intel’s own GPUs for the data center will not launch until next year.
AMD (AMD) is equally focused on GPUs specifically for the data center (CDNA), while in CPUs it has been leapfrogging Intel in core count (64 vs. 28) and process node (7nm vs. 14nm).
Several start-ups are developing many-core chips based on Arm IP.
Nevertheless, no company can offer cloud companies and enterprises such a breadth of IP as Intel. While Cooper Lake still belongs to the set of processors Intel introduced to deal with the 10nm delays, and hence should not be viewed as Intel’s greatest product ever, it does strengthen Intel’s position in the key area of AI acceleration. As the company argues, AI will be infused into everything, so having leading performance (per core) in AI workloads might be a key reason to adopt Intel over AMD.
In summary, today’s announcement includes:
- Cooper Lake for 4S/8S servers, introducing BF16 support
- Roadmap update on Sapphire Rapids in 2021
- Intel Optane persistent memory 200 series
- Intel SSD D7-P5500 and P5600
- Stratix 10 NX for AI, with 15x AI performance with AI Tensor Blocks
- Movidius VPU Gen 3 (Keem Bay) early access
- Habana Goya and Gaudi sampling
- Continued development of oneAPI
- Several Intel Select Solutions updates
Cooper Lake and Optane 200 are shipping today, with OEM availability in the second half. The new SSDs are also shipping today, while the Stratix 10 NX will be available in the second half.
For detailed write-ups about the new products, readers should turn to the tech press, such as AnandTech, ServeTheHome or Tom’s Hardware, but I will briefly detail the key updates below.
Cooper Lake is still built on 14nm with up to 28 cores. Its only major enhancement is support for bfloat16, or BF16 (a numeric format), which is useful for both inference and training. It is part of what Intel calls DLBoost: its AI acceleration feature set for CPUs.
CPUs received a big (2x) improvement in AI performance with first-gen Xeon Scalable (Skylake) thanks to the addition of AVX-512 compute units. Cascade Lake (second-gen) introduced the DLBoost name, as its AVX-512 units added support for INT16 and INT8, delivering 2x-3x performance in AI inference. Cooper Lake’s BF16 is the first DLBoost enhancement to improve AI training performance, and it also improves inference performance compared to FP32. As the name implies, BF16 numbers take up half the space of classical FP32 numbers, and hence, performance is doubled in the best case.
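To make that size argument concrete, here is a minimal Python sketch (my own illustration, not Intel’s implementation) of how bfloat16 relates to float32: it is simply the top 16 bits of the 32-bit representation, so the 8-bit exponent (and hence the dynamic range) is preserved while mantissa precision is cut.

```python
import struct

def to_bfloat16_bits(x: float) -> int:
    """Truncate a float32 to its top 16 bits: the bfloat16 representation.
    (Real hardware typically rounds to nearest even; truncation keeps the
    sketch simple.)"""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16

def from_bfloat16_bits(b: int) -> float:
    """Expand 16 bfloat16 bits back into a float32 by zero-filling the
    dropped mantissa bits."""
    (x,) = struct.unpack(">f", struct.pack(">I", (b & 0xFFFF) << 16))
    return x

# bfloat16 keeps float32's 8-bit exponent, so even very large values
# survive; the cost is roughly 3 decimal digits of precision instead of 7:
pi_bf16 = from_bfloat16_bits(to_bfloat16_bits(3.14159))  # 3.140625
```

Because each value is half the width, twice as many fit per AVX-512 register and per memory transfer, which is where the “up to 2x” best-case speedup comes from.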
Since training is mostly done on large clusters, Cooper Lake has been relegated to the Cedar Island platform for 4S/8S (four- and eight-socket) servers. Ice Lake, on 10nm later this year, will serve the mainstream 1S/2S segment but will lack BF16 support.
Intel has also updated some of its Select Solutions. The company has said in the past that these pre-configured/validated workload-optimized platforms increase cross-selling of its adjacency products (beyond the CPU).
Intel already hinted in November that Sapphire Rapids (SPR) would further improve AI performance quite drastically. Intel revealed that Sapphire Rapids’ DLBoost feature is called AMX (Advanced Matrix Extensions). The company will reportedly disclose more details later this month.
For comparison, Apple’s (AAPL) A13 chip last year introduced a similar feature, also called AMX, which Apple claimed improved AI performance by 6x. Given some comments by Intel’s Raja Koduri last year, I expect a similar or slightly higher improvement from Intel’s version of AMX. Given that AMD still lacks any AI acceleration features in its CPUs, this should further extend Intel’s lead in this increasingly important workload.
Intel also said that the Sapphire Rapids silicon has been powered on. The company disclosed last year that its DG1 GPU was powered on in Q3’19, and DG1 might launch in the third quarter, which suggests Intel takes about a year from power-on to release (although this might be longer in the data center). This implies the SPR launch might happen next summer.
Stratix 10 NX
Intel has been steadily building out its 14nm Stratix 10 family since 2017. For example, last year it introduced PCIe 4.0 support via a new chiplet.
The new Stratix 10 NX is set to bring competition to Xilinx’s (XLNX) 7nm Versal FPGAs, as it is Intel’s first FPGA specifically for AI acceleration.
While remaining on the 14nm node, Intel has replaced the traditional DSPs with AI Tensor Blocks. These feature 15x the arithmetic hardware, improving performance by up to 15x (for INT8). There is support for INT4, INT8, block FP12 and block FP16.
Like the Stratix 10 MX series, the NX also features HBM support, as well as 58G SerDes networking.
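The throughput logic behind these narrow integer formats can be illustrated with a small quantization sketch (my own illustration of symmetric INT8 quantization; the function names are hypothetical, and this is not how the FPGA works internally):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the INT8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(1024).astype(np.float32)
q, scale = quantize_int8(weights)

# Each INT8 value occupies a quarter of an FP32's storage, so four times
# as many operands move through the same wires and multipliers per cycle;
# the cost is a bounded rounding error of at most half a quantization step.
max_err = float(np.max(np.abs(dequantize_int8(q, scale) - weights)))
```

This is why AI accelerators, FPGAs included, lean so heavily on INT8 and INT4 for inference: narrower operands multiply effective throughput for a modest, bounded accuracy cost.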
Intel provided three examples where the NX achieved 2x, 4x and 10x higher performance than Nvidia’s V100. Performance against the new 7nm A100 was not provided, but given these 2x-10x numbers, the NX seems like a viable option in many cases, and it is a big upgrade over prior Stratix 10 FPGAs in any case. Microsoft (MSFT) gave its seal of approval in Intel’s press deck. (Microsoft notably supported AI on the Stratix 10 series with its Project Brainwave.)
Optane persistent memory 200 series
Intel’s 200 series of Optane persistent memory, codenamed Barlow Pass, is reportedly based on its second generation of 3D XPoint. It offers 25% higher memory bandwidth, while memory per socket remains at 4.5TB. (Given the latter point, I would be inclined to conclude that Barlow Pass is actually still based on first-generation 3D XPoint.)
Since Apache Pass launched in April 2019, Intel claims it has been adopted by over 200 of the Fortune 500, is achieving an over 85% conversion from proof of concept to deployment, and has over 270 production wins.
The SSD D7-P5500 and D7-P5600 are Intel’s most advanced TLC 3D NAND SSDs. Based on 96-layer 3D NAND, they deliver 40% lower latency and 33% higher performance, along with several other advanced features.
They also introduce support for PCIe 4.0.
Notably absent were updates on Intel’s silicon photonics and Barefoot Networks switches. The company said earlier this year 200G and 400G would ramp in the first half, but this seems to be somewhat delayed given their absence from today’s data-centric launch.
Also, last year Intel claimed Cooper Lake would scale up to 56 cores (two times 28 cores in a package), but today’s announcement only went up to 28 cores. Reading Intel’s press release from last year, it would seem that these SKUs have been canceled. The company indeed confirmed some changes to the platform earlier this year, noting that it was mostly focused on Ice Lake-SP.
Intel also disclosed that Movidius Keem Bay (VPU3) is now in early access, while its Habana chips are sampling. If I recall correctly, this implies somewhat of a delay in terms of general availability for Keem Bay.
The Cooper Lake platform launch really consists of two parts and one update:
- An AI-focused silicon launch (Cooper Lake, Stratix 10 NX);
- An update of several key adjacency products (SSDs, Optane);
- Progress update: Habana AI accelerators sampling and Movidius edge VPU3 in early access.
Intel’s data-centric portfolio launches like Cooper Lake highlight the unmatched breadth of the company’s portfolio for the data center.
Cooper Lake does not really move the needle when it comes to per-processor performance, given the absence of the previously announced 56-core SKUs; the platform has been relegated mostly to AI acceleration. Cooper Lake does, however, add one key new feature: BF16. Support for BF16 will be important for some customers such as Facebook (FB), as it can double performance for AI training, and this number format continues to become more mainstream.
Intel also disclosed that next year’s Sapphire Rapids will feature yet another new DLBoost capability called AMX, possibly similar to the identically named feature in Apple’s latest A13 chip that delivered a 6x performance improvement.
Also on the AI side, Intel announced the Stratix 10 NX with a staggering 15x AI performance improvement, and this should make it a solid competitor to Xilinx’s offerings.
These two products support the view (and investment thesis), as I explained in my earlier article titled “Intel To Lead The AI Revolution Over The Next Decade”. Intel has a full breadth of silicon products and is making sure AI runs well on each of them. In that way, the company is positioning itself to benefit from the $25 billion market this will become over the coming years.
Intel also introduced some other products. As I have detailed in a previous article, these adjacency products are delivering meaningful growth to the company’s top and bottom lines.
To conclude, the launch was more of an evolutionary step (i.e., upholding the yearly cadence) than a revolution, as should be expected from yet another 14nm refresh. Its key focus on AI, however, does provide tangible benefits and further strengthens Intel’s lead over AMD (in CPUs) in this area.
So, in that regard, this recent data-centric product launch further proved Intel’s ambition to lead in AI performance for every chip category, in this case with solid updates to its CPUs and FPGAs. Hence, AI remains one of the key long-term investment theses for Intel as a growth company.
Disclosure: I/we have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.
Originally published on Seeking Alpha