New P and E cores, Xe2-LPG graphics and new NPU 4 bring more AI performance

Intel this morning lifted the veil on some of the finer architectural and technical details of its upcoming Lunar Lake SoC, the chip that will be the next generation of Core Ultra mobile processors. Once again hosting one of its increasingly regular Tech Tour events for media and analysts, Intel this time set up shop in Taipei just before the start of Computex 2024. During the Tech Tour, Intel revealed many facets of Lunar Lake, including its new P-core design, codenamed Lion Cove, and a new wave of E-cores that look a little more like the pioneering Low Power Island E-cores from Meteor Lake. Also unveiled was the Intel NPU 4, which Intel says delivers up to 48 TOPS, surpassing Microsoft’s Copilot+ requirements for the new era of AI PCs.

Intel’s Lunar Lake represents a strategic evolution in its mobile SoC lineup, building on last year’s launch of Meteor Lake, with a focus on improving power efficiency and optimizing performance across the board. Lunar Lake dynamically allocates tasks to efficiency cores (E-cores) or performance cores (P-cores) based on workload demands, using scheduling mechanisms designed to make the best use of both energy and performance. Once again, Intel Thread Director, along with Windows 11, plays a central role in this process, guiding the operating system scheduler to make real-time adjustments that balance efficiency and computing power based on the intensity of the workload.

Generations of Intel processor architecture

| | Alder/Raptor Lake | Meteor Lake | Lunar Lake | Arrow Lake | Panther Lake |
|---|---|---|---|---|---|
| P-Core Architecture | Golden Cove / Raptor Cove | Redwood Cove | Lion Cove | Lion Cove | Cougar Cove? |
| E-Core Architecture | Gracemont | Crestmont | Skymont | Crestmont? | Darkmont? |
| GPU Architecture | Xe-LP | Xe-LPG | Xe2 | Xe2? | ? |
| NPU Architecture | N/A | NPU 3720 | NPU 4 | ? | ? |
| Active Tiles | 1 (Monolithic) | 4 | 2 | 4? | ? |
| Manufacturing Process | Intel 7 | Intel 4 + TSMC N6 + TSMC N5 | TSMC N3E + TSMC N6 | Intel 20A + more | Intel 18A |
| Segment | Mobile + Desktop | Mobile | LP Mobile | HP Mobile + Desktop | Mobile? |
| Release Date (OEM) | Q4’2021 | Q4’2023 | Q3’2024 | Q4’2024 | 2025 |

Lunar Lake: designed by Intel, built by TSMC

While there are many aspects of Lunar Lake to dive into, it’s perhaps best to start with what’s sure to be the most eye-catching: who’s building it.

Intel’s Lunar Lake tiles are not built in any of its own foundry facilities, a stark departure from historical precedent and even from the recent Meteor Lake, where the compute tile was built on the Intel 4 process. Instead, both tiles of the disaggregated Lunar Lake are manufactured at TSMC, using a mix of TSMC’s N3E and N6 processes. In 2021, Intel decided to free up its chip design groups to use the best possible foundry, whether internal or external, and nowhere is that more evident than here.

Overall, Lunar Lake represents Intel’s second generation of disaggregated SoC architecture for the mobile market, replacing Meteor Lake at the low-power end of the lineup. So far, Intel has revealed that it uses a 4P+4E (8-core) design with Hyper-Threading/SMT disabled, so the total number of threads supported by the processor is simply the number of cores: 4P+4E/8T.

Lunar Lake pairs the work of Intel’s architecture design team with TSMC’s manufacturing process nodes to bring the latest Lion Cove P-cores to the chip, boosting IPC as you would expect from a new generation. At the same time, Intel is also introducing Skymont E-cores, which replace Meteor Lake’s Low Power Island Crestmont E-cores. It should be noted, however, that these E-cores do not connect to the ring bus like the P-cores, making them a sort of hybrid LP E-core, combining the efficiency gains of the more advanced TSMC N3E node with double-digit IPC gains over the previous Crestmont cores.

The entire compute tile, including the P and E cores, is built on TSMC’s N3E node, while the SoC tile is built using the TSMC N6 node.

At a higher level, Intel is once again using its Foveros packaging technology here. The compute tile and the SoC tile (now the “Platform Controller” tile) sit on top of a base tile, which provides high-speed, low-power routing between the tiles as well as connectivity to the rest of the chip and beyond.

In another first for a mainstream Intel Core product, the Lunar Lake platform also includes up to 32GB of LPDDR5X memory on the chip package itself. This is organized as a pair of 64-bit memory chips, providing a total memory interface of 128 bits. As with other vendors using on-package memory, this change means that users cannot simply upgrade DRAM at will, and memory configurations for Lunar Lake will ultimately be determined by the SKUs that Intel chooses to ship.
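To put the 128-bit interface in perspective, here is a minimal sketch of how theoretical peak memory bandwidth follows from bus width and data rate. The two 64-bit chips come from Intel's disclosure; the data rate is not given above, so the 8533 MT/s figure below is an assumed value for illustration only, and the function name is our own.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mtps: float) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes).

    bytes per transfer = bus width / 8
    bandwidth          = bytes per transfer * transfers per second
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps / 1000  # MT/s * bytes -> MB/s -> GB/s


CHIP_WIDTH_BITS = 64   # per memory chip, as disclosed
NUM_CHIPS = 2          # a pair of chips on the package
bus_width_bits = CHIP_WIDTH_BITS * NUM_CHIPS  # 128-bit total interface

# ASSUMPTION: the data rate is not stated in the article; 8533 MT/s is
# used purely as an illustrative LPDDR5X speed grade.
ASSUMED_DATA_RATE_MTPS = 8533

if __name__ == "__main__":
    bw = peak_bandwidth_gbs(bus_width_bits, ASSUMED_DATA_RATE_MTPS)
    print(f"{bus_width_bits}-bit bus at {ASSUMED_DATA_RATE_MTPS} MT/s: {bw:.1f} GB/s peak")
```

This is a ceiling rather than a measured figure; real-world throughput will be lower.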

With Lunar Lake, Intel is also focusing heavily on AI, as the architecture incorporates a new NPU called NPU 4. This NPU is rated for up to 48 TOPS of INT8 performance, making it ready for Microsoft’s Copilot+ AI PCs. This is the bar that all PC SoC vendors are aiming for, including AMD and Qualcomm.

Intel’s integrated GPU will also be a big player here. Although it is not as power-efficient for AI work as the dedicated NPU, the Arc Xe2-LPG brings dozens of additional T(FL)OPS of performance, as well as flexibility that an NPU does not offer. That’s why you’ll also see Intel rate the performance of these chips in terms of total platform TOPS: in this case, 120 TOPS.
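The arithmetic behind those headline numbers can be sketched as follows. The 48 TOPS NPU rating and the 120 TOPS platform total are Intel's figures; the assumption that the platform total is a simple sum of the engines, the 40 TOPS Copilot+ threshold (widely reported, but not stated here), and the function names are ours.

```python
NPU_TOPS = 48        # Intel's rating for NPU 4 (INT8)
PLATFORM_TOPS = 120  # Intel's total platform figure

# ASSUMPTION: 40 NPU TOPS is the widely reported Copilot+ minimum;
# the article itself does not give the number.
ASSUMED_COPILOT_PLUS_MIN_NPU_TOPS = 40


def non_npu_tops(platform_tops: int, npu_tops: int) -> int:
    """TOPS left for the other engines (GPU and CPU),
    assuming the platform total is a simple sum."""
    return platform_tops - npu_tops


def meets_copilot_plus(npu_tops: int,
                       threshold: int = ASSUMED_COPILOT_PLUS_MIN_NPU_TOPS) -> bool:
    """True if the NPU alone clears the assumed threshold."""
    return npu_tops >= threshold


if __name__ == "__main__":
    print("GPU + CPU share:", non_npu_tops(PLATFORM_TOPS, NPU_TOPS), "TOPS")
    print("NPU clears assumed Copilot+ bar:", meets_copilot_plus(NPU_TOPS))
```

Under that assumption, the NPU accounts for less than half of the platform total, with the remainder coming from the GPU and CPU cores.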

Intel’s collaboration with Microsoft further improves workload management with Intel Thread Director, which has been optimized for applications such as the Copilot assistant. As for timing, Intel is setting the stage for a Q3 2024 launch, which puts Lunar Lake in time for the 2024 holiday season.

Intel Lunar Lake: Intel Thread Director update and power management improvements

To say that power efficiency is a key goal for Lunar Lake would be an understatement. Even though Intel holds a significant share of the mobile PC processor market (AMD’s share is still only a fraction of Intel’s), the company has felt pressure in recent years from customer-turned-rival Apple, whose M-series silicon has set the bar for power efficiency. And now, as Qualcomm attempts to do the same for the Windows ecosystem with its upcoming Snapdragon X chips, Intel is preparing to make its own power play.

The Intel Thread Director and power management updates for Lunar Lake bring several significant improvements over Meteor Lake. Thread Director uses a heterogeneous scheduling policy, initially assigning tasks to a single E-core and expanding to other E-cores or P-cores as needed. OS containment zones limit tasks to specific cores, which improves power efficiency while still delivering the performance needed from the cores best suited to the workload at hand. Integration with the power management system and a set of four power management controllers (PMCs) further allows the chip, in concert with Windows 11, to make context-aware adjustments, ensuring optimal performance with minimal energy consumption and waste.
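As a rough illustration of the "start on one E-core, then expand" idea and of containment zones, here is a toy placement model. It is not Intel's algorithm: the real Thread Director supplies hints to the Windows scheduler rather than placing threads itself, and the core names, zone names, and round-robin overflow rule below are all hypothetical.

```python
# Hypothetical core labels for a 4P+4E part.
E_CORES = ["E0", "E1", "E2", "E3"]
P_CORES = ["P0", "P1", "P2", "P3"]


def allowed_cores(containment_zone: str) -> list[str]:
    """Cores a workload may use, in expansion order.

    "efficiency" (hypothetical zone name) confines work to the E-cores;
    any other value allows spilling over to the P-cores.
    """
    if containment_zone == "efficiency":
        return list(E_CORES)
    return E_CORES + P_CORES  # E-cores first, P-cores last


def place_threads(num_threads: int, containment_zone: str = "none") -> dict[str, int]:
    """Toy placement: the first thread lands on a single E-core, later
    threads expand to the remaining E-cores, then to the P-cores.
    Once every allowed core is busy, extra threads wrap around
    (an assumed overflow rule, chosen for simplicity)."""
    cores = allowed_cores(containment_zone)
    load: dict[str, int] = {}
    for i in range(num_threads):
        core = cores[i % len(cores)]
        load[core] = load.get(core, 0) + 1
    return load


if __name__ == "__main__":
    print(place_threads(1))                 # light load stays on one E-core
    print(place_threads(6))                 # spills from E-cores onto P-cores
    print(place_threads(6, "efficiency"))   # contained: E-cores only
```

The contained case shows the trade-off the article describes: the workload never wakes a P-core, at the cost of doubling up threads on the E-cores.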

Lunar Lake’s scheduling strategy is aimed at power-sensitive applications. One example given by Intel is video conferencing: these tasks are kept within the efficiency core cluster, using the E-cores to maintain performance while reducing power consumption by up to 35%, according to data provided by Intel. These improvements are achieved through collaboration with operating system developers such as Microsoft, with the goal of striking the best balance between power consumption and performance.

Turning to Lunar Lake’s power management system, Intel’s SoC power management operates in tailored efficiency, balanced, and performance modes designed to match workload requirements at the time of operation. This multi-layer approach allows the Lunar Lake SoC to operate efficiently. Just like Intel Thread Director, the PMCs can balance power consumption against performance needs.

Intel further plans to improve Thread Director by increasing scenario granularity, implementing AI-based scheduling guidance, and enabling cross-IP scheduling in Windows 11. These improvements essentially amount to workload management designed to improve overall power efficiency and deliver performance where applications need it, without wasting the power budget by assigning lighter tasks to the higher-power P-cores.

Over the next few pages, we’ll explore the new P and E cores and Intel’s update to integrated Arc Xe graphics (Xe2-LPG).

News Source : www.anandtech.com