How to find out what generation Intel processor is. From Sandy Bridge to Coffee Lake: Comparing Seven Generations of Intel Core i7

A little over 8 years ago, Steve Jobs introduced the MacBook Air, a device that opened up a new class of portable laptops - ultrabooks. Since then, there have been many different ultrabooks, but they all had one thing in common - low-voltage processors with a thermal design power (TDP) of 15-17 W. However, in 2015, with the transition to the 14 nm process technology, Intel decided to go even further and introduced a line of Core m processors, which have a TDP of only 4-5 W but should be much more powerful than the Intel Atom line with a similar TDP. The main feature of the new processors is that they can be passively cooled, that is, the fan can be removed from the device. But alas, removing the fan brought a lot of new problems, which we will discuss below.

Comparison with closest competitors

Although Kaby Lake processors have already come out, tests of them are not yet available, so we will confine ourselves to the previous line, Skylake - from a technical point of view, the difference between them is small. For comparison, let's take three processors: the Intel Atom x7-Z8700, one of the most powerful representatives of the Atom line; the Intel Core m3-6Y30, the weakest Core m (I'll explain later why you shouldn't take the more powerful ones); and the Intel Core i3-6100U, a popular representative of the weakest line of "full-fledged" low-voltage processors:

An interesting picture emerges: physically, the Core m3 and i3 are identical, and only the maximum frequencies of the graphics and the processor cores differ, yet the TDP differs threefold, which should not be possible for otherwise identical chips. The Atom has the same TDP as the Core m3 and comparable clocks, but 4 physical cores. However, although there are more cores, they are heavily cut down to reduce heat generation: for example, an i5-6300HQ with 4 "full" physical cores at the same frequencies has a TDP an order of magnitude higher - 45 W. It will therefore be interesting to compare the cut-down and full-fledged architectures at the same TDP.

Processor tests

As established above, the m3 is essentially an i3 squeezed into a TDP three times smaller. It would seem that the difference in performance should be at least twofold, but there are several nuances. First, Intel allows a Core m to ignore its TDP until its temperature reaches a certain mark. This can be seen very clearly in repeated runs of the Cinebench R15 benchmark:

As you can see, the processor scored about 215 points in the first 4 test runs, and then the results stabilized at 185, that is, the performance loss due to this "cheating" by Intel was about 15%. Therefore, it makes no sense to take the more powerful Core m5 and m7 - after 10 minutes of load, they will drop to the level of the Core m3. The result of the i3-6100U, whose operating frequency is only 100 MHz higher than that of the m3-6Y30, is much better - 250 points:

That is, when only the processor is loaded, the difference in performance between the m3 and i3 is 35% - quite a significant result. The Atom, however, showed its best side: although its cores are cut down, having twice as many of them let the processor score 140 points. Yes, the result is still 25% worse than the Core m3, but don't forget about the eightfold difference in price between them.
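The percentages quoted for Cinebench R15 can be rechecked from the scores given in the text. The sketch below is only an illustration: the helper `percent_change` and the constant names are made up for this example, and the scores are the rounded figures from the article, so the results match the text only approximately.

```python
def percent_change(new, old):
    """Relative change of `new` against `old`, in percent."""
    return (new - old) / old * 100.0


# Cinebench R15 scores quoted in the text (rounded figures).
M3_BURST = 215      # Core m3-6Y30, first few runs
M3_SUSTAINED = 185  # Core m3-6Y30, after the results stabilized
I3_6100U = 250      # Core i3-6100U
ATOM_X7 = 140       # Atom x7-Z8700

# Loss of the m3 once it starts respecting its TDP.
throttle_loss = -percent_change(M3_SUSTAINED, M3_BURST)
# Lead of the i3 over the sustained m3 result.
i3_lead = percent_change(I3_6100U, M3_SUSTAINED)
# Deficit of the Atom against the sustained m3 result.
atom_deficit = -percent_change(ATOM_X7, M3_SUSTAINED)

print(f"m3 throttling loss: {throttle_loss:.1f}%")
print(f"i3 lead over m3:    {i3_lead:.1f}%")
print(f"Atom deficit to m3: {atom_deficit:.1f}%")
```

Note that the reference point matters: the i3 lead is measured against the sustained m3 score, not the initial burst score.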

The second caveat is that the TDP covers both the graphics core and the processor cores at the same time, so let's look at the results of the 3DMark 11 Performance test: it is designed for a mid-range PC (which our systems are) and loads both the processor and the graphics at the same time. Here the final difference is about the same: the Core m3 is 30% worse than the i3 (because the Core i3 also runs out of TDP headroom - it needs about 20 W to work at maximum frequencies):
Intel Core m3-6Y30:

Intel Core i3-6100U:

But Intel Atom fails with a bang - the result is 4-5 times worse than that of m3 and i3:

And this, in principle, is expected - Cinebench tests the raw arithmetic performance of the processor and is well suited only for comparing processors of the same architecture, whereas 3DMark gives a varied load, much closer to real life. Still, the eightfold difference in price keeps the Atom afloat.

Energy consumption

As you can see from the tests above, a threefold difference in TDP gives a performance gain of about 35%. However, this is only true under heavy load, which is quite rare for ultrabooks. For convenience, let's take the two 2016 MacBooks, 12" and 13": macOS is optimized equally well on both devices, which lets us compare their power consumption without regard to the operating system (yes, the figures below are for the entire system, but the screens and processors dominate, and since the screens are very similar, only the processors contribute significantly to the difference). And here the difference turns out to be ... only about one and a half watts on average, 7.2 and 8.9 W (moreover, the 13" MacBook has a processor more powerful than the i3-6100U):

What does this mean? It means that under normal load, both processors consume only a few watts, and the Core m does not reach its TDP limit. Intel Atom shows power consumption comparable to the Core m3 (the Microsoft Surface 3, which is well optimized for Windows, is taken as an example):

Conclusions

What is the end result? Intel Atom is a good choice for an inexpensive tablet or netbook on which no one will run anything heavier than 1080p60 video from YouTube. The processor is cheap, so its performance gap to the Core lines can be forgiven.

Intel Core m is a good choice for a performance tablet or a simple ultrabook. Because there is no fan, such a device is completely silent, and in ordinary tasks it is no slower than its more powerful Core i counterparts. However, it is clearly not worth buying for photo or video processing, and even less so for games - performance quickly runs into the low TDP and drops quite significantly even compared to a simple i3.

Finally, the Core i line is a good choice for a high-performance ultrabook. If the system has even a simple discrete graphics card, such a device is on par with gaming laptops of 5 years ago: it easily handles photo processing and light video editing and can run popular games at settings above the lowest. However, any above-average load produces noticeable noise from the small high-speed fan, which can annoy those who like to work in silence at night.

On June 2, Intel announced ten new fifth-generation 14nm Intel Core desktop and mobile processors (codenamed Broadwell-C) and five new 14nm Intel Xeon E3-1200 v4.

Of the ten new 5th Generation Intel Core (Broadwell-C) processors for desktop and mobile, only two are desktop-oriented and have an LGA 1150 socket: the quad-core Intel Core i7-5775C and Core i5-5675C. All other 5th Gen Intel Core processors are BGA and are aimed at notebooks. Brief characteristics of the new Broadwell-C processors are presented in the table.

	Connector	Number of cores / threads	L3 cache size, MB	Frequency nominal / maximum, GHz	TDP, W	Graphics core
Core i7-5950HQ	BGA	4/8	6	2,9/3,7	47	Iris Pro Graphics 6200
Core i7-5850HQ	BGA	4/8	6	2,7/3,6	47	Iris Pro Graphics 6200
Core i7-5750HQ	BGA	4/8	6	2,5/3,4	47	Iris Pro Graphics 6200
Core i7-5700HQ	BGA	4/8	6	2,7/3,5	47	Intel HD Graphics 5600
Core i5-5350H	BGA	2/4	4	3,1/3,5	47	Iris Pro Graphics 6200
Core i7-5775R	BGA	4/8	6	3,3/3,8	65	Iris Pro Graphics 6200
Core i5-5675R	BGA	4/4	4	3,1/3,6	65	Iris Pro Graphics 6200
Core i5-5575R	BGA	4/4	4	2,8/3,3	65	Iris Pro Graphics 6200
Core i7-5775C	LGA 1150	4/8	6	3,3/3,7	65	Iris Pro Graphics 6200
Core i5-5675C	LGA 1150	4/4	4	3,1/3,6	65	Iris Pro Graphics 6200

Of the five new processors of the Intel Xeon E3-1200 v4 family, only three models (Xeon E3-1285 v4, Xeon E3-1285L v4, Xeon E3-1265L v4) have an LGA 1150 socket, and two more models are made in a BGA package and are not intended for self-installation on the motherboard. Brief characteristics of the new Intel Xeon E3-1200 v4 family processors are presented in the table.

	Connector	Number of cores / threads	L3 cache size, MB	Frequency nominal / maximum, GHz	TDP, W	Graphics core
Xeon E3-1285 v4	LGA 1150	4/8	6	3,5/3,8	95	Iris Pro Graphics P6300
Xeon E3-1285L v4	LGA 1150	4/8	6	3,4/3,8	65	Iris Pro Graphics P6300
Xeon E3-1265L v4	LGA 1150	4/8	6	2,3/3,3	35	Iris Pro Graphics P6300
Xeon E3-1278L v4	BGA	4/8	6	2,0/3,3	47	Iris Pro Graphics P6300
Xeon E3-1258L v4	BGA	2/4	6	1,8/3,2	47	Intel HD Graphics P5700

Thus, out of 15 new Intel processors, only five models have an LGA 1150 socket and are aimed at desktop systems. For users, the choice is, of course, small, especially when you consider that the Intel Xeon E3-1200 v4 family of processors are focused on servers, not user PCs.

In the future, we will focus on reviewing the new 14nm processors with the LGA 1150 socket.

So, the main feature of the new fifth-generation Intel Core processors and the Intel Xeon E3-1200 v4 family is the new 14nm core microarchitecture, codenamed Broadwell. In principle, there is no fundamental difference between the Intel Xeon E3-1200 v4 family and the fifth-generation Intel Core desktop processors, so from here on we will refer to all these processors as Broadwell.

In general, it should be noted that the Broadwell microarchitecture is not just Haswell in a 14nm version. Rather, it is a slightly improved Haswell microarchitecture. However, Intel always does this: when switching to a new production process, changes are made to the microarchitecture itself. In the case of Broadwell, we are talking about cosmetic enhancements. In particular, the volumes of internal buffers have been increased, there are changes in the execution units of the processor core (the scheme for performing multiplication and division of floating point numbers has been changed).

We will not consider in detail all the features of the Broadwell microarchitecture (this is a topic for a separate article), but we emphasize once again that we are talking only about cosmetic changes to the Haswell microarchitecture, and therefore one should not expect Broadwell processors to be faster than Haswell processors. Of course, the transition to a new process made it possible to reduce power consumption (at the same clock frequency), but you should not expect any significant performance gains.

Perhaps the most significant difference between the new Broadwell processors and Haswell is the Crystalwell L4 cache. Let's clarify that such an L4 cache was present in Haswell processors, but only in top models of mobile processors, while Haswell processors for desktop PCs with an LGA 1150 socket did not have it.

Recall that some top models of Haswell mobile processors implemented the Iris Pro graphics core with additional eDRAM (embedded DRAM), which solved the problem of insufficient memory bandwidth for the GPU. The eDRAM was a separate die located on the same substrate as the processor die. This die was codenamed Crystalwell.

The eDRAM was 128MB in size and was manufactured using a 22nm process. But most importantly, this eDRAM memory was used not only for the needs of the GPU, but also for the computational cores of the processor itself. That is, in fact, Crystalwell was an L4 cache, shared between the GPU and the processing cores of the processor.

All new Broadwell processors also have a separate 128MB eDRAM die that acts as an L4 cache and can be used by both the graphics core and the processing cores. Moreover, we note that the eDRAM in the 14-nanometer Broadwell processors is exactly the same as in the top-end Haswell mobile processors, that is, it is manufactured using a 22-nanometer process technology.

The next feature of the new Broadwell processors is the new graphics core, codenamed Broadwell GT3e. For desktop and mobile processors (Intel Core i5 / i7), this is Iris Pro Graphics 6200, and for Intel Xeon E3-1200 v4 family processors, this is Iris Pro Graphics P6300 (excluding Xeon E3-1258L v4). We will not delve into the architecture of Broadwell GT3e graphics cores (this is a topic for a separate article) and will only briefly consider its main features.

Recall that the Iris Pro graphics core was previously present only in Haswell mobile processors (Iris Pro Graphics 5100 and 5200), and that these cores have 40 execution units (EUs) each. The new Iris Pro Graphics 6200 and Iris Pro Graphics P6300 cores have 48 EUs, and the way the EUs are organized has also changed. Each individual graphics unit contains 8 EUs, and a graphics module contains three such units. That is, one graphics module contains 24 EUs, and the Iris Pro Graphics 6200 or Iris Pro Graphics P6300 combines two modules, for 48 EUs in total.

As for the difference between the graphics cores Iris Pro Graphics 6200 and Iris Pro Graphics P6300, at the hardware level it is the same (Broadwell GT3e), but their drivers are different. In the Iris Pro Graphics P6300 version, the drivers are optimized for tasks specific to servers and graphics stations.

Before moving on to a detailed examination of the Broadwell test results, let's talk about a few more features of the new processors.

First of all, the new Broadwell processors (including the Xeon E3-1200 v4) are compatible with motherboards based on Intel 9-series chipsets. We cannot claim that every board based on an Intel 9-series chipset will support the new Broadwell processors, but most boards do. However, for this you will have to update the BIOS on the board, and the BIOS must support the new processors. For example, for testing we used the ASRock Z97 OC Formula board; without a BIOS update, the system worked only with a discrete video card, and image output through the graphics core of the Broadwell processors was impossible.

The next feature of the new Broadwell processors is that the Core i7-5775C and Core i5-5675C models have an unlocked multiplier, that is, they are aimed at overclocking. In the Haswell family, such unlocked-multiplier processors made up the K-series, while in the Broadwell family the letter "C" is used instead of "K". The Xeon E3-1200 v4 processors, however, do not support overclocking (their multiplier cannot be raised).

Now let's take a closer look at the processors that came to us for testing. These are the Xeon E3-1285 v4, Xeon E3-1265L v4, Core i7-5775C, and Core i5-5675C. In fact, of the five new models with the LGA 1150 socket, only the Xeon E3-1285L v4 is missing, which differs from the Xeon E3-1285 v4 only in its lower power consumption (65 W instead of 95 W) and slightly lower nominal core clock (3.4 GHz instead of 3.5 GHz). In addition, for comparison we added the Intel Core i7-4790K, the top processor in the Haswell family.

The characteristics of all tested processors are presented in the table:

	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i7-5775C	Core i5-5675C	Core i7-4790K
Technological process, nm	14	14	14	14	22
Connector	LGA 1150	LGA 1150	LGA 1150	LGA 1150	LGA 1150
Number of Cores	4	4	4	4	4
Number of threads	8	8	8	4	8
L3 cache, MB	6	6	6	4	8
L4 cache (eDRAM), MB	128	128	128	128	N / A
Rated frequency, GHz	3,5	2,3	3,3	3,1	4,0
Maximum frequency, GHz	3,8	3,3	3,7	3,6	4,4
TDP, W	95	35	65	65	88
Memory type	DDR3-1333 / 1600/1866		DDR3-1333/1600
Graphics core	Iris Pro Graphics P6300	Iris Pro Graphics P6300	Iris Pro Graphics 6200	Iris Pro Graphics 6200	HD Graphics 4600
Number of GPU execution units	48 (Broadwell GT3e)	48 (Broadwell GT3e)	48 (Broadwell GT3e)	48 (Broadwell GT3e)	20 (Haswell GT2)
Nominal frequency of the graphics processor, MHz	300	300	300	300	350
Maximum GPU frequency, GHz	1,15	1,05	1,15	1,1	1,25
VPro technology	+	+	−	−	−
VT-x technology	+	+	+	+	+
VT-d technology	+	+	+	+	+
Cost, $	556	417	366	276	339

And now, after our express review of the new Broadwell processors, let's move on to testing new products.

Test stand

To test the processors, we used the bench with the following configuration:

Testing technique

Processors were tested using our scripted benchmarks. More precisely, we took the workstation testing methodology as a basis and expanded it with tests from the iXBT Application Benchmark 2015 package and the iXBT Game Benchmark 2015 gaming tests.

Thus, the following applications and benchmarks were used to test the processors:

MediaCoder x64 0.8.33.5680
SVPmark 3.0
Adobe Premiere Pro CC 2014.1 (Build 8.1.0)
Adobe After Effects CC 2014.1.1 (Version 13.1.1.3)
Photodex ProShow Producer 6.0.3410
Adobe Photoshop CC 2014.2.1
ACDSee Pro 8
Adobe Illustrator CC 2014.1.1
Adobe Audition CC 2014.2
Abbyy FineReader 12
WinRAR 5.11
Dassault SolidWorks 2014 SP3 (Flow Simulation Package)
SPECapc for 3ds max 2015
SPECapc for Maya 2012
POV-Ray 3.7
Maxon Cinebench R15
SPECviewperf v.12.0.2
SPECwpc 1.2

In addition, for testing we used games and game benchmarks from the iXBT Game Benchmark 2015 package. Testing in games was carried out at a resolution of 1920 × 1080.

Additionally, we measured the power consumption of the processors in idle and stress modes. For this, we used a specialized hardware and software measurement system inserted into the motherboard's power supply circuits, that is, between the power supply unit and the motherboard.

To create a stressful CPU load, we used the AIDA64 utility (Stress FPU and Stress GPU tests).

Test results

Processor power consumption

So, let's start with the results of testing processors for power consumption. The test results are presented in the diagram.

The most power-hungry processor, as expected, turned out to be the Intel Core i7-4790K with a declared TDP of 88 W. Its real power consumption under stress load was 119 W. At the same time, the temperature of the processor cores was 95 °C and throttling was observed.

Next in terms of power consumption was the Intel Core i7-5775C with a declared TDP of 65 W. For this processor, the power consumption under stress load was 72.5 W. The core temperature reached 90 °C, but throttling was not observed.

Third place in terms of power consumption was taken by the Intel Xeon E3-1285 v4 with a TDP of 95 W. Its power consumption in stress mode was 71 W, and the temperature of the processor cores was 78 °C.

The most economical was the Intel Xeon E3-1265L v4 with a TDP of 35 W. In stress mode, the power consumption of this processor did not exceed 39 W, and the temperature of the processor cores was only 56 °C.

Well, if we focus on the power consumption of processors, then we must admit that Broadwell has a significantly lower power consumption in comparison with Haswell.
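To put the measurements next to the declared TDP values, one can compute the ratio of measured stress power to TDP for each processor. This is only a rough sketch: the dictionary and function names are made up for this example, the numbers are the ones quoted above, and the 39 W figure for the Xeon E3-1265L v4 is an upper bound ("did not exceed").

```python
# (declared TDP, measured power under stress load), both in watts,
# as quoted in the text above.
MEASURED = {
    "Core i7-4790K": (88.0, 119.0),
    "Core i7-5775C": (65.0, 72.5),
    "Xeon E3-1285 v4": (95.0, 71.0),
    "Xeon E3-1265L v4": (35.0, 39.0),  # "did not exceed 39 W"
}


def power_to_tdp_ratio(tdp_w, measured_w):
    """How many times the measured power exceeds (or undershoots) the TDP."""
    return measured_w / tdp_w


ratios = {name: power_to_tdp_ratio(*values) for name, values in MEASURED.items()}

# Print from the largest overshoot to the smallest.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f}x TDP")
```

By this measure, the Core i7-4790K overshoots its TDP the most, while the Xeon E3-1285 v4 is the only processor that stays below it.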

Tests from the iXBT Application Benchmark 2015 package

Let's start with the tests included in the iXBT Application Benchmark 2015. Note that we calculated the integral performance result as the geometric mean of the results in logical groups of tests (video conversion and video processing, video content creation, etc.). To calculate the results in logical groups of tests, the same reference system was used as in the iXBT Application Benchmark 2015.

Full test results are shown in the table. In addition, we present the test results for logical groups of tests on diagrams in a normalized form. The result of the Core i7-4790K processor is taken as the reference.

Logical group of tests	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
Video converting and video processing, points	364,3	316,7	272,6	280,5	314,0
MediaCoder x64 0.8.33.5680, seconds	125,4	144,8	170,7	155,4	132,3
SVPmark 3.0, points	3349,6	2924,6	2552,7	2462,2	2627,3
Video content creation, points	302,6	264,4	273,3	264,5	290,9
Adobe Premiere Pro CC 2014.1, seconds	503,0	579,0	634,6	612,0	556,9
Adobe After Effects CC 2014.1.1 (Test # 1), seconds	666,8	768,0	802,0	758,8	695,3
Adobe After Effects CC 2014.1.1 (Test # 2), seconds	330,0	372,2	327,3	372,4	342,0
Photodex ProShow Producer 6.0.3410, seconds	436,2	500,4	435,1	477,7	426,7
Digital photo processing, points	295,2	258,5	254,1	288,1	287,0
Adobe Photoshop CC 2014.2.1, seconds	677,5	770,9	789,4	695,4	765,0
ACDSee Pro 8, seconds	289,1	331,4	334,8	295,8	271,0
Vector graphics, points	150,6	130,7	140,6	147,2	177,7
Adobe Illustrator CC 2014.1.1, seconds	341,9	394,0	366,3	349,9	289,8
Audio processing, points	231,3	203,7	202,3	228,2	260,9
Adobe Audition CC 2014.2, seconds	452,6	514,0	517,6	458,8	401,3
OCR, points	302,4	263,6	205,8	269,9	310,6
Abbyy FineReader 12, seconds	181,4	208,1	266,6	203,3	176,6
Archiving and unzipping data, points	228,4	203,0	178,6	220,7	228,9
WinRAR 5.11 archiving, seconds	105,6	120,7	154,8	112,6	110,5
WinRAR 5.11 unzip, seconds	7,3	8,1	8,29	7,4	7,0
Integral result of performance, points	259,1	226,8	212,8	237,6	262,7
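The integral result can be reproduced from the group scores in the table above as their geometric mean. The sketch below does this for two of the processors; the variable names are made up for this example, and it assumes the seven "points" rows of the table are exactly the logical groups that enter the mean.

```python
from statistics import geometric_mean

# Group scores in points, copied from the table above in row order:
# video converting, video content creation, digital photo processing,
# vector graphics, audio processing, OCR, archiving.
GROUP_SCORES = {
    "Xeon E3-1285 v4": [364.3, 302.6, 295.2, 150.6, 231.3, 302.4, 228.4],
    "Core i7-4790K": [314.0, 290.9, 287.0, 177.7, 260.9, 310.6, 228.9],
}

# The integral result is the geometric mean of the group scores.
integral = {name: geometric_mean(scores) for name, scores in GROUP_SCORES.items()}

for name, value in integral.items():
    print(f"{name}: {value:.1f}")
```

A geometric mean is a natural choice here because every group score is a ratio to a reference system, so no single group with large absolute numbers dominates the total.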

So, as can be seen from the test results, the Intel Xeon E3-1285 v4 practically does not differ from the Intel Core i7-4790K in integral performance. However, this is an integral result across all the applications used in the benchmark.

At the same time, there are a number of applications in which the Intel Xeon E3-1285 v4 has the advantage. These are MediaCoder x64 0.8.33.5680 and SVPmark 3.0 (video converting and video processing), Adobe Premiere Pro CC 2014.1 and Adobe After Effects CC 2014.1.1 (video content creation), and Adobe Photoshop CC 2014.2.1 and ACDSee Pro 8 (digital photo processing). In these applications, the higher clock speed of the Intel Core i7-4790K does not give it an edge over the Intel Xeon E3-1285 v4.

But in applications such as Adobe Illustrator CC 2014.1.1 (vector graphics), Adobe Audition CC 2014.2 (audio processing), and Abbyy FineReader 12 (text recognition), the advantage is on the side of the higher-frequency Intel Core i7-4790K. It is interesting to note that the tests based on Adobe Illustrator CC 2014.1.1 and Adobe Audition CC 2014.2 load the processor cores to a lesser extent than the other applications.

And of course, there are tests in which the Intel Xeon E3-1285 v4 and Intel Core i7-4790K processors show the same performance. For example, this is a test based on WinRAR 5.11 application.

In general, it should be noted that the Intel Core i7-4790K processor demonstrates higher performance (in comparison with the Intel Xeon E3-1285 v4 processor) precisely in those applications in which not all processor cores are used or the core load is not full. At the same time, in tests where all processor cores are loaded to 100%, the leadership is on the side of the Intel Xeon E3-1285 v4.

Calculations in Dassault SolidWorks 2014 SP3 (Flow Simulation)

We treat the test based on Dassault SolidWorks 2014 SP3 with the additional Flow Simulation package separately, since it does not use a reference system, unlike the tests of the iXBT Application Benchmark 2015.

Recall that this test deals with hydro / aerodynamic and thermal calculations. In total, six different models are calculated, and the results of each subtest are the calculation time in seconds.

Detailed test results are presented in the table.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
conjugate heat transfer, seconds	353,7	402,0	382,3	328,7	415,7
textile machine, seconds	399,3	449,3	441,0	415,0	510,0
rotating impeller, seconds	247,0	278,7	271,3	246,3	318,7
cpu cooler, seconds	710,3	795,3	784,7	678,7	814,3
halogen floodlight, seconds	322,3	373,3	352,7	331,3	366,3
electronic components, seconds	510,0	583,7	559,3	448,7	602,0
Total calculation time, seconds	2542,7	2882,3	2791,3	2448,7	3027,0

In addition, we also provide the normalized result of the calculation speed (the reciprocal of the total calculation time). The result of the Core i7-4790K processor is taken as the reference.
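The normalization described above can be sketched as follows: the speed is the reciprocal of the total calculation time, divided by the same value for the reference Core i7-4790K. The names used here are made up for this example; the times are the totals from the table.

```python
# Total calculation time in seconds, from the last row of the table above.
TOTAL_TIME_S = {
    "Xeon E3-1285 v4": 2542.7,
    "Xeon E3-1265L v4": 2882.3,
    "Core i5-5675C": 2791.3,
    "Core i7-5775C": 2448.7,
    "Core i7-4790K": 3027.0,
}
REFERENCE = "Core i7-4790K"


def normalized_speed(time_s, reference_time_s):
    """Speed (1 / time) relative to the reference speed (1 / reference time)."""
    return reference_time_s / time_s


normalized = {
    name: normalized_speed(time_s, TOTAL_TIME_S[REFERENCE])
    for name, time_s in TOTAL_TIME_S.items()
}

for name, value in normalized.items():
    print(f"{name}: {value:.3f}")
```

A value above 1 means the processor finished the six models faster than the reference.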

As can be seen from the test results, in these specific calculations the leadership is on the side of Broadwell processors. All four Broadwell processors demonstrate faster computation speed compared to the Core i7-4790K processor. Apparently, these specific calculations are influenced by the improvements in the execution units that were implemented in the Broadwell microarchitecture.

SPECapc for 3ds max 2015

Next, let's look at the SPECapc for 3ds max 2015 benchmark results for Autodesk 3ds max 2015 SP1. The detailed results of this test are presented in the table, and the normalized results for the CPU Composite Score and GPU Composite Score - in the diagrams. The result of the Core i7-4790K processor is taken as the reference.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
CPU Composite Score	4,52	3,97	4,09	4,51	4,54
GPU Composite Score	2,36	2,16	2,35	2,37	1,39
Large Model Composite Score	1,75	1,59	1,68	1,73	1,21
Large Model CPU	2,62	2,32	2,50	2,56	2,79
Large Model GPU	1,17	1,08	1,13	1,17	0,52
Interactive Graphics	2,45	2,22	2,49	2,46	1,61
Advanced Visual Styles	2,29	2,08	2,23	2,25	1,19
Modeling	1,96	1,80	1,94	1,98	1,12
CPU Computing	3,38	3,04	3,15	3,37	3,35
CPU Rendering	5,99	5,18	5,29	6,01	5,99
GPU Rendering	3,13	2,86	3,07	3,16	1,74

In the SPECapc for 3ds max 2015 test, Broadwell processors are in the lead. Moreover, while in the subtests that depend on CPU performance (CPU Composite Score) the Core i7-4790K and Xeon E3-1285 v4 demonstrate equal performance, in the subtests that depend on the performance of the graphics core (GPU Composite Score) all Broadwell processors are significantly ahead of the Core i7-4790K.

SPECapc for Maya 2012

Now let's look at the result of another 3D modeling test - SPECapc for Maya 2012. Recall that this benchmark was run in tandem with the Autodesk Maya 2015 package.

The results of this test are presented in the table, and the normalized results are shown in the diagrams. The result of the Core i7-4790K processor is taken as the reference.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
GFX Score	1,96	1,75	1,87	1,91	1,67
CPU Score	5,47	4,79	4,76	5,41	5,35

In this test, the Xeon E3-1285 v4 performs slightly better than the Core i7-4790K; however, the difference is not as significant as in SPECapc for 3ds max 2015.

POV-Ray 3.7

In the POV-Ray 3.7 test (3D model rendering), the Core i7-4790K processor is the leader. In this case, a higher clock speed (with an equal number of cores) gives an advantage to the processor.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
Render average, PPS	1568,18	1348,81	1396,3	1560,6	1754,48

Cinebench R15

In the Cinebench R15 benchmark, the result was mixed. In the OpenGL test, all Broadwell processors significantly outperform the Core i7-4790K processor, which is natural, since they have a more powerful graphics core integrated into them. But in the processor test, on the contrary, the Core i7-4790K processor turns out to be more productive.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
OpenGL, fps	71,88	66,4	72,57	73	33,5
CPU, cb	774	667	572	771	850

SPECviewperf v.12.0.2

In SPECviewperf v.12.0.2 tests, the results are determined primarily by the performance of the processor's graphics core and, moreover, by the optimization of the video driver for certain applications. Therefore, in these tests, the Core i7-4790K processor significantly lags behind the Broadwell processors.

The test results are presented in the table, as well as in normalized form in the diagrams. The result of the Core i7-4790K processor is taken as the reference.

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
catia-04	20,55	18,94	20,10	20,91	12,75
creo-01	16,56	15,52	15,33	15,55	9,53
energy-01	0,11	0,10	0,10	0,10	0,08
maya-04	19,47	18,31	19,87	20,32	2,83
medical-01	2,16	1,98	2,06	2,15	1,60
showcase-01	10,46	9,96	10,17	10,39	5,64
snx-02	12,72	11,92	3,51	3,55	3,71
sw-03	31,32	28,47	28,93	29,60	22,63

Test	Xeon E3-1285 v4	Xeon E3-1265L v4	Core i5-5675C	Core i7-5775C	Core i7-4790K
…	…	…	…	…	2,36
Blender	2,43	2,11	1,82	2,38	2,59
HandBrake	2,33	2,01	1,87	2,22	2,56
LuxRender	2,63	2,24	1,97	2,62	2,86
IOMeter	15,9	15,98	16,07	15,87	16,06
Maya	1,73	1,63	1,71	1,68	0,24
Product Development	3,08	2,73	2,6	2,44	2,49
Rodinia	3,2	2,8	2,54	1,86	2,41
CalculiX	1,77	1,27	1,49	1,76	1,97
Wpccfg	2,15	2,01	1,98	1,63	1,72
IOmeter	20,97	20,84	20,91	20,89	21,13
catia-04	1,31	1,21	1,28	1,32	0,81
showcase-01	1,02	0,97	0,99	1,00	0,55
snx-02	0,69	0,65	0,19	0,19	0,2
sw-03	1,51	1,36	1,38	1,4	1,08
Life sciences	2,73	2,49	2,39	2,61	2,44
Lammps	2,52	2,31	2,08	2,54	2,29
namd	2,47	2,14	2,1	2,46	2,63
Rodinia	2,89	2,51	2,23	2,37	2,3
Medical-01	0,73	0,67	0,69	0,72	0,54
IOMeter	11,59	11,51	11,49	11,45	11,5
Financial Services	2,42	2,08	1,95	2,42	2,59
Monte carlo	2,55	2,20	2,21	2,55	2,63
Black scholes	2,57	2,21	1,62	2,56	2,68
Binomial	2,12	1,83	1,97	2,12	2,44
Energy	2,72	2,46	2,18	2,62	2,72
FFTW	1,8	1,72	1,52	1,83	2,0
Convolution	2,97	2,56	1,35	2,98	3,5
Energy-01	0,81	0,77	0,78	0,81	0,6
srmp	3,2	2,83	2,49	3,15	2,87
Kirchhoff Migration	3,58	3,07	3,12	3,54	3,54
Poisson	1,79	1,52	1,56	1,41	2,12
IOMeter	12,26	12,24	12,22	12,27	12,25
General Operation	3,85	3,6	3,53	3,83	4,27
7Zip	2,48	2,18	1,96	2,46	2,58
Python	1,58	1,59	1,48	1,64	2,06
Octave	1,51	1,31	1,44	1,44	1,68
IOMeter	37,21	36,95	37,2	37,03	37,4

The results of the SPECwpc 1.2 test are not clear-cut. In some scenarios (Media and Entertainment, Product Development, Life Sciences), Broadwell processors show higher results. In others (Financial Services, Energy, General Operation), the advantage is on the side of the Core i7-4790K, or the results are approximately the same.

Game tests

And in conclusion, let's look at the results of testing processors in gaming tests. Recall that we used the following games and game benchmarks for testing:

Aliens vs Predator
World of Tanks 0.9.5
Grid 2
Metro: LL Redux
Metro: 2033 Redux
Hitman: Absolution
Thief
Tomb Raider
Sleeping Dogs
Sniper Elite V2

Testing was carried out at a screen resolution of 1920 × 1080 and in two settings modes: maximum and minimum quality. Test results are presented in the diagrams. In this case, the results are not normalized.

In gaming tests, the results are as follows: all Broadwell processors demonstrate very similar results, which is natural, since they use the same Broadwell GT3e graphics core. And most importantly, with the settings for the minimum quality, Broadwell processors allow you to comfortably play (at FPS more than 40) most games (at a resolution of 1920 × 1080).

On the other hand, if the system uses a discrete graphics card, then there is simply no point in the new Broadwell processors. That is, it makes no sense to replace Haswell with Broadwell. Nor is the price of Broadwell processors attractive enough to be tempting. For example, the Intel Core i7-5775C is more expensive than the Intel Core i7-4790K.

However, Intel does not seem to be betting on Broadwell desktop processors. The range of models is extremely modest, and Skylake processors are on the way, so the Intel Core i7-5775C and Core i5-5675C processors are unlikely to be in special demand.

The Xeon E3-1200 v4 server processor family is a separate segment of the market. For the majority of ordinary home users, such processors are not of interest, but in the corporate sector of the market, these processors may be in demand.

Almost every publication that touches on the performance of modern Intel processors sooner or later attracts several angry reader comments claiming that progress in Intel chips stalled long ago and that there is no point in switching from the "good old Core i7-2600K" to something new. Such remarks will most likely mention, with irritation, performance gains at the negligible level of "no more than five percent per year"; the low-quality internal thermal interface that irreparably spoiled modern Intel processors; or the claim that buying a processor today with the same number of cores as several years ago is the lot of short-sighted amateurs, since it leaves no headroom for the future.

There is no doubt that such remarks are not groundless. However, it is very likely that they exaggerate the existing problems many times over. The 3DNews laboratory has been testing Intel processors in detail since 2000, and we cannot agree with the thesis that their development has come to an end, or that what has been happening at the microprocessor giant over the past few years can only be called stagnation. Yes, fundamental changes to Intel processors are rare, but the chips nevertheless continue to be improved systematically. Therefore, the Core i7 chips you can buy today are certainly better than the models offered a few years ago.

Core generation	Codename	Process technology	Development stage	Release
2	Sandy Bridge	32 nm	Tock (architecture)	Q1 2011
3	Ivy Bridge	22 nm	Tick (process)	Q2 2012
4	Haswell	22 nm	Tock (architecture)	Q2 2013
5	Broadwell	14 nm	Tick (process)	Q2 2015
6	Skylake	14 nm	Tock (architecture)	Q3 2015
7	Kaby Lake	14+ nm	Optimization	Q1 2017
8	Coffee Lake	14++ nm	Optimization	Q4 2017
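For the desktop Core i7 models discussed in this article, the generation can be read from the first digit of the four-digit model number. The sketch below illustrates that rule of thumb using the table above; the function name `core_generation` and the regular expression are hypothetical helpers, and the mapping is an assumption that holds for the listed desktop parts, not a guarantee for every Intel product line.

```python
import re

# Generation -> codename, copied from the table above (desktop Core parts only).
CODENAMES = {
    2: "Sandy Bridge",
    3: "Ivy Bridge",
    4: "Haswell",
    5: "Broadwell",
    6: "Skylake",
    7: "Kaby Lake",
    8: "Coffee Lake",
}

# Assumed naming pattern: "i7-8700K" -> generation digit "8", then a
# three-digit SKU and an optional letter suffix.
MODEL_RE = re.compile(r"i[357]-(\d)\d{3}[A-Z]*\b")


def core_generation(model: str):
    """Return (generation, codename) for a four-digit Core i3/i5/i7 model name."""
    match = MODEL_RE.search(model)
    if match is None:
        raise ValueError(f"unrecognized model name: {model!r}")
    generation = int(match.group(1))
    return generation, CODENAMES.get(generation, "unknown")


if __name__ == "__main__":
    for name in ("Core i7-2700K", "Core i7-4790K", "Core i7-8700K"):
        print(name, core_generation(name))
```

Model names that do not follow the four-digit pattern, such as the first-generation Core i7-920, are rejected rather than guessed.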

In fact, this article is a counterargument to claims that Intel's chosen strategy of gradually developing its consumer CPUs is futile. We decided to gather in one test the senior Intel processors for mainstream platforms from the past seven years and see in practice how far the Kaby Lake and Coffee Lake series have pulled ahead of the "reference" Sandy Bridge, which, after years of hypothetical comparisons and mental contrasts, has become a real icon of processor design in the minds of ordinary users.

What has changed in Intel processors from 2011 to the present

The Sandy Bridge microarchitecture is considered the starting point in the recent history of Intel processors, and this is no accident. Although the first generation of processors under the Core brand was released in 2008 on the Nehalem microarchitecture, almost all the main features of the microprocessor giant's modern mainstream CPUs came into use not then but a couple of years later, when the next processor design, Sandy Bridge, became widespread.

Intel has by now accustomed us to openly unhurried progress in microarchitecture development, with very few innovations that add almost nothing to the per-clock performance of processor cores. But only seven years ago the situation was radically different. In particular, the transition from Nehalem to Sandy Bridge brought a 15-20% increase in IPC (the number of instructions executed per clock cycle), the result of a deep redesign of the cores' logic with an eye to increasing their efficiency.

Sandy Bridge was based on many principles that have not changed since then and have become standard for most processors today. For example, it was there that a separate level-zero cache for decoded micro-operations appeared and a physical register file came into use, which reduces power consumption when out-of-order execution algorithms are running.

But perhaps the most important innovation was that Sandy Bridge was designed as a unified system-on-a-chip aimed simultaneously at all classes of applications: server, desktop and mobile. Most likely, it is precisely because of this feature that public opinion regards it, rather than Nehalem and certainly not Penryn, as the great-grandfather of modern Coffee Lake. However, the sum of all the changes deep in the Sandy Bridge microarchitecture also turned out to be quite significant. Ultimately, this design shed the last traces of kinship with the old P6 (Pentium Pro) that had surfaced here and there in all previous Intel processors.

Speaking about the general structure, one must also remember that a full-fledged graphics core was built into the Sandy Bridge processor die for the first time in the history of Intel CPUs. This block moved inside the processor after the DDR3 memory controller, the shared L3 cache and the PCI Express bus controller. To connect the computing cores and all the other "uncore" parts, Intel engineers implemented a new scalable ring bus in Sandy Bridge, which is still used to connect structural units in subsequent mainstream CPUs.

At the level of the Sandy Bridge microarchitecture itself, one of its key features is support for the AVX family of SIMD instructions, designed to work with 256-bit vectors. By now such instructions have become commonplace, but implementing them in Sandy Bridge required widening some of the execution units. Intel engineers strove to make working with 256-bit data as fast as with narrower vectors. Therefore, along with full-fledged 256-bit execution units, the processor also needed faster access to memory. The load and store units in Sandy Bridge received double the performance, and the read bandwidth of the L1 cache was increased correspondingly.

We should also mention the dramatic changes Sandy Bridge made to the branch prediction unit. Thanks to optimized algorithms and larger buffers, the Sandy Bridge architecture reduced the percentage of branch mispredictions by almost half, which not only had a significant effect on performance but also further reduced the power consumption of this design.

Ultimately, from today's perspective, Sandy Bridge processors could be called an exemplary embodiment of the "tock" phase in Intel's "tick-tock" principle. Like their predecessors, these processors continued to be based on the 32-nm process technology, but the performance increase they offered was more than convincing. It was fueled not only by the updated microarchitecture, but also by clock frequencies raised by 10-15 percent and by the introduction of a more aggressive version of the Turbo Boost technology, 2.0. Considering all this, it is easy to understand why many enthusiasts still remember Sandy Bridge with the warmest words.

The senior offering in the Core i7 family at the launch of the Sandy Bridge microarchitecture was the Core i7-2600K. This processor has a clock speed of 3.4 GHz and can boost itself up to 3.8 GHz at partial load. However, the 32-nm Sandy Bridge parts were distinguished not only by clock frequencies that were relatively high for the time, but also by good overclocking potential. Among Core i7-2600K chips one could often find samples capable of operating at 4.8-5.0 GHz, largely thanks to the high-quality internal thermal interface used in them: flux-free solder.

Nine months after the release of the Core i7-2600K, in October 2011, Intel updated the senior offering in the lineup with a slightly faster model, the Core i7-2700K, whose nominal frequency was raised to 3.5 GHz and maximum turbo frequency to 3.9 GHz.

However, the life cycle of the Core i7-2700K turned out to be short: in April 2012, Sandy Bridge was replaced by an updated design, Ivy Bridge. There was nothing special about it: Ivy Bridge belonged to the "tick" phase, that is, it was a transfer of the old microarchitecture to a new semiconductor process. In this regard the progress was really serious: Ivy Bridge dies were manufactured on a 22-nm process based on three-dimensional FinFET transistors, which were just coming into use at that time.

At the same time, the old Sandy Bridge microarchitecture remained practically intact at the low level. There were only a few cosmetic tweaks that made Ivy Bridge slightly faster and slightly more efficient with Hyper-Threading. True, along the way the "uncore" components were somewhat improved. The PCI Express controller gained compatibility with the third version of the protocol, and the memory controller became more capable and began to support high-speed overclocking DDR3 memory. But in the end, the gain in per-clock performance in the transition from Sandy Bridge to Ivy Bridge was no more than 3-5 percent.

The new process technology gave no serious cause for joy either. Unfortunately, the introduction of the 22-nm process did not allow any fundamental increase in Ivy Bridge clock frequencies. The senior Core i7-3770K received a nominal frequency of 3.5 GHz with the ability to boost to 3.9 GHz in turbo mode, that is, in terms of frequencies it turned out to be no faster than the Core i7-2700K. Only energy efficiency improved, but desktop users traditionally care little about this aspect.

All this, of course, can easily be attributed to the fact that no breakthroughs are supposed to occur at the “tick” stage, but in some ways Ivy Bridge turned out even worse than its predecessors. This concerns overclocking. When bringing this design to market, Intel decided to stop attaching the heat spreader to the semiconductor die with flux-free solder in the final assembly of processors. Starting with Ivy Bridge, ordinary thermal paste was used as the internal thermal interface, and this immediately hit the maximum achievable frequencies. The overclocking potential of Ivy Bridge definitely got worse, and as a result the transition from Sandy Bridge to Ivy Bridge became one of the most controversial moments in the recent history of Intel consumer processors.

Therefore, special hopes were pinned on the next stage of evolution, Haswell. In this generation, which belonged to the "tock" phase, major microarchitectural improvements were expected, and with them at least some push to the stalled progress. To some extent this happened. Introduced in the summer of 2013, the fourth-generation Core processors did indeed gain significant improvements in their internal structure.

The main thing: the theoretical power of Haswell's execution units, expressed in the number of micro-operations executed per cycle, grew by a third compared to previous CPUs. The new microarchitecture not only rebalanced the existing execution units, but also added two more execution ports for integer operations, branching and address generation. In addition, the microarchitecture gained compatibility with the extended set of 256-bit vector instructions, AVX2, which, thanks to three-operand FMA instructions, doubled the architecture's peak throughput.
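The doubling of peak throughput from FMA can be checked with simple arithmetic. The sketch below is a rough model, not a description of Intel's actual pipeline: it assumes one 256-bit add port plus one 256-bit multiply port per core before Haswell, and two 256-bit FMA ports in Haswell, with each FMA counted as two floating-point operations. The function name `peak_flops_per_cycle` and the port tuples are illustrative.

```python
from typing import Sequence


def peak_flops_per_cycle(vector_bits: int, element_bits: int,
                         ops_per_port: Sequence[int]) -> int:
    """Peak floating-point operations per cycle for one core.

    ops_per_port lists, for each vector issue port, how many floating-point
    operations it completes per lane per cycle (1 for add or multiply,
    2 for a fused multiply-add).
    """
    lanes = vector_bits // element_bits
    return lanes * sum(ops_per_port)


# Assumed port layouts (simplified):
AVX_ADD_PLUS_MUL = (1, 1)  # one add port + one multiply port
AVX2_TWO_FMA = (2, 2)      # two FMA ports, each FMA = multiply + add

if __name__ == "__main__":
    for label, ports in (("AVX, add + mul", AVX_ADD_PLUS_MUL),
                         ("AVX2, 2 x FMA", AVX2_TWO_FMA)):
        single = peak_flops_per_cycle(256, 32, ports)
        double = peak_flops_per_cycle(256, 64, ports)
        print(f"{label}: {single} SP / {double} DP FLOPs per cycle")
```

Under these assumptions the peak is reached only by code that consists almost entirely of FMA instructions, which is why ordinary workloads saw far smaller gains.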

In addition, Intel engineers revised the capacity of the internal buffers and increased them where necessary. The scheduler window grew in size. The integer and floating-point physical register files were also enlarged, which improved the processor's ability to reorder instructions. On top of all this, the cache subsystem changed significantly: the L1 and L2 caches in Haswell received buses twice as wide.

It would seem that the listed improvements should be enough to noticeably raise the per-clock performance of the new microarchitecture. But that was not the case. The problem with Haswell's design was that it left the front end of the execution pipeline unchanged, and the x86 decoder retained the same performance as before. That is, the maximum rate of decoding x86 code into micro-operations remained at 4-5 instructions per cycle. As a result, when comparing Haswell and Ivy Bridge at the same frequency and under a load that does not use the new AVX2 instructions, the performance gain was only 5-10 percent.

The image of Haswell's microarchitecture was also spoiled by the first wave of processors released on its basis. Relying on the same 22-nm process technology as Ivy Bridge, the new products were unable to offer high frequencies. For example, the senior Core i7-4770K again received a base frequency of 3.5 GHz and a maximum turbo frequency of 3.9 GHz, that is, there was no progress compared with previous generations of Core.

At the same time, Intel began to run into all sorts of difficulties with the introduction of the next, 14-nm process, so a year later, in the summer of 2014, it brought to market not the next generation of Core processors but a second wave of Haswell, codenamed Haswell Refresh or, in the case of the flagship models, Devil's Canyon. As part of this update, Intel was able to noticeably raise the clock speeds of its 22-nm CPUs, which really breathed new life into them. One example is the new senior processor, the Core i7-4790K, which reached a nominal frequency of 4.0 GHz and a maximum turbo frequency of 4.4 GHz. Surprisingly, this half-gigahertz acceleration was achieved without any process changes, only through simple cosmetic changes to the processor's power delivery and improved thermal conductivity of the thermal paste used under the CPU lid.

However, even the Devil's Canyon family did not become a particular favorite among enthusiasts. Compared with Sandy Bridge, their overclocking results were not outstanding; moreover, reaching high frequencies required complicated “scalping”: removing the processor lid and replacing the stock thermal interface with a material of better thermal conductivity.

Because of the difficulties that dogged Intel in moving mass production to the 14-nm process, the debut of the next, fifth generation of Core processors, Broadwell, turned out to be very muddled. For a long time the company could not decide whether it was worth launching desktop processors with this design at all, since the defect rate exceeded acceptable values when manufacturing large dies. Ultimately, Broadwell quad-cores intended for desktop computers did appear, but, firstly, this happened only in the summer of 2015, nine months later than originally planned, and secondly, just two months after their announcement Intel presented the next-generation design, Skylake.

Nevertheless, from the point of view of microarchitecture development, Broadwell can hardly be called a minor development. Moreover, this generation of desktop processors used solutions that Intel had never resorted to before or since. What made the desktop Broadwell unique was its high-performance integrated Iris Pro graphics core of the GT3e class. This means not only that processors of this family had the most powerful integrated graphics core of the time, but also that they were equipped with an additional 22-nm Crystal Well die, a fourth-level cache based on eDRAM.

The rationale for adding a separate die of fast memory to the processor is quite obvious: a high-performance integrated graphics core needs a frame buffer with low latency and high bandwidth. However, the eDRAM installed in Broadwell was architecturally designed as a victim cache, and the CPU's computing cores could use it as well. As a result, desktop Broadwell became the only mainstream processors of their kind with a 128 MB L4 cache. At the same time, though, the L3 cache on the processor die suffered somewhat: it was reduced from 8 to 6 MB.
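The idea of a victim cache can be illustrated with a toy model: lines evicted from the L3 cache are placed in the L4 instead of being discarded, so a later access to them is served from L4 rather than from memory. The sketch below is a deliberately simplified assumption (fully associative LRU caches counted in lines, with a hypothetical class name `VictimCachedHierarchy`) and does not describe the real behavior of Broadwell's eDRAM.

```python
from collections import OrderedDict


class VictimCachedHierarchy:
    """Toy model: an LRU 'L3' whose evicted lines fall into an LRU 'L4'."""

    def __init__(self, l3_lines: int, l4_lines: int):
        self.l3 = OrderedDict()  # most recently used entry is last
        self.l4 = OrderedDict()
        self.l3_lines = l3_lines
        self.l4_lines = l4_lines

    def _fill_l3(self, address):
        """Insert a line into L3, pushing the evicted victim down into L4."""
        self.l3[address] = True
        self.l3.move_to_end(address)
        if len(self.l3) > self.l3_lines:
            victim, _ = self.l3.popitem(last=False)  # least recently used
            self.l4[victim] = True
            self.l4.move_to_end(victim)
            if len(self.l4) > self.l4_lines:
                self.l4.popitem(last=False)          # L4 victim is dropped

    def access(self, address) -> str:
        """Return which level served the access: 'L3', 'L4' or 'memory'."""
        if address in self.l3:
            self.l3.move_to_end(address)
            return "L3"
        if address in self.l4:
            del self.l4[address]   # the line moves back up into L3
            self._fill_l3(address)
            return "L4"
        self._fill_l3(address)     # missed everywhere: fetch from memory
        return "memory"


if __name__ == "__main__":
    hierarchy = VictimCachedHierarchy(l3_lines=2, l4_lines=4)
    print([hierarchy.access(a) for a in (1, 2, 3, 1, 3, 2)])
```

In this model the L4 only ever holds lines that the L3 has thrown out, which is what distinguishes a victim cache from an ordinary inclusive cache level.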

Some improvements were made to the basic microarchitecture as well. Although Broadwell belonged to the "tick" phase, the rework touched the front end of the execution pipeline. The out-of-order execution scheduler window was enlarged, the second-level translation lookaside buffer (TLB) grew by one and a half times, and the whole translation scheme gained a second miss handler, which made it possible to process two address translations in parallel. Taken together, the innovations increased the efficiency of out-of-order execution and of predicting complex code branches. Along the way, the mechanisms for multiplication were improved, so Broadwell processes these operations significantly faster. As a result, Intel was even able to claim that the microarchitectural improvements raised Broadwell's per-clock performance by about five percent over Haswell.

But despite all this, there was no significant advantage to speak of in the first desktop 14-nm processors. Both the L4 cache and the microarchitectural changes only tried to compensate for Broadwell's main flaw: low clock frequencies. Due to problems with the process technology, the base frequency of the senior member of the family, the Core i7-5775C, was set at only 3.3 GHz, and the turbo frequency did not exceed 3.7 GHz, which was worse than Devil's Canyon by as much as 700 MHz.

A similar story happened with overclocking. The maximum frequencies to which desktop Broadwell could be pushed without advanced cooling methods were in the region of 4.1-4.2 GHz. Therefore, it is not surprising that consumers were skeptical about the release of Broadwell, and processors of this family remained a strange niche solution for those who were interested in a high-performance integrated graphics core. The first full-fledged 14-nm chip for desktop computers that managed to attract the attention of a wide range of users was the microprocessor giant's next project, Skylake.

Skylake, like the previous generation of processors, was manufactured on a 14-nm process. Here, however, Intel was already able to achieve normal clock speeds and overclocking: the senior desktop version of Skylake, the Core i7-6700K, received a nominal frequency of 4.0 GHz and a turbo frequency of up to 4.2 GHz. These values are slightly lower than those of Devil's Canyon, but the newer processors are definitely faster than their predecessors. The fact is that Skylake is a "tock" in Intel's nomenclature, which means significant changes in the microarchitecture.

And they really are there. At first glance, there were not many improvements in the Skylake design, but they were all targeted and eliminated existing weak points in the microarchitecture. In short, Skylake received larger internal buffers for deeper out-of-order execution and higher cache bandwidth. Improvements were made to the branch prediction unit and the front end of the execution pipeline. Division instructions were sped up, and the mechanisms for executing addition, multiplication and FMA instructions were rebalanced. To top it off, the developers worked on improving the efficiency of Hyper-Threading. In total, this gave an improvement of roughly 10 percent in performance per clock compared to previous generations of processors.

In general, Skylake can be characterized as a fairly deep optimization of the original Core architecture, carried out so that no bottlenecks remain in the processor design. On the one hand, thanks to a more powerful decoder (from 4 to 5 micro-ops per clock) and a faster micro-op cache (from 4 to 6 micro-ops per clock), the instruction decoding rate increased significantly. On the other hand, the resulting micro-operations are processed more efficiently, helped by deeper out-of-order execution algorithms and a redistribution of the capabilities of the execution ports, along with a serious revision of the execution rate of a number of ordinary, SSE and AVX instructions.

For example, Haswell and Broadwell each had two ports for floating-point multiplications and FMA operations, but only one port for additions, which did not match real program code well. In Skylake this imbalance was eliminated, and additions began to be performed on two ports. In addition, the number of ports capable of handling integer vector instructions grew from two to three. Ultimately, all this meant that for almost any type of operation in Skylake there are always several alternative ports. In other words, almost all possible causes of pipeline stalls in the microarchitecture were finally eliminated.

Noticeable changes have also affected the caching subsystem: the throughput of the L2 and L3 cache has been increased. In addition, the associativity of the L2 cache was reduced, which ultimately made it possible to improve its efficiency and reduce the penalty when processing misses.

Substantial changes have also taken place at a higher level. In Skylake, the bandwidth of the ring bus, which connects all processor units, has doubled. In addition, the CPU of this generation has a new memory controller, which is now compatible with DDR4 SDRAM. On top of this, a new DMI 3.0 bus with doubled bandwidth was used to connect the processor to the chipset, which made it possible to implement high-speed PCI Express 3.0 lanes through the chipset as well.

However, like all previous versions of the Core architecture, Skylake was another variation on the original design. This means that in the sixth generation of the Core microarchitecture, Intel developers continued to adhere to the tactics of phased implementation of improvements at each development cycle. In general, this is not a very impressive approach, which does not allow you to see any significant changes in performance right away - when comparing CPUs from neighboring generations. But on the other hand, when modernizing old systems, it is not difficult to notice a tangible increase in performance. For example, Intel itself willingly compared Skylake to Ivy Bridge, while demonstrating that in three years the speed of processors has increased by more than 30 percent.
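The per-generation figures quoted in this article compound multiplicatively rather than add up. The sketch below combines them under a loudly stated assumption: the roughly 10 percent gain for Skylake is treated as a gain over Broadwell, although the text only says "previous generations". The dictionary `IPC_GAIN` and the function `compound_gain` are illustrative names, and clock frequency differences are left out.

```python
from math import prod

# Per-clock (IPC) gains over the preceding generation as (low, high)
# fractions, taken from the figures quoted in the text. Treating Skylake's
# ~10% as a gain over Broadwell is an assumption of this sketch.
IPC_GAIN = {
    "Ivy Bridge": (0.03, 0.05),
    "Haswell": (0.05, 0.10),
    "Broadwell": (0.05, 0.05),
    "Skylake": (0.10, 0.10),
}


def compound_gain(generations, bound="low"):
    """Compound the per-generation gains and return the total as a fraction."""
    index = 0 if bound == "low" else 1
    return prod(1.0 + IPC_GAIN[name][index] for name in generations) - 1.0


if __name__ == "__main__":
    since_ivy_bridge = ["Haswell", "Broadwell", "Skylake"]
    for bound in ("low", "high"):
        gain = compound_gain(since_ivy_bridge, bound)
        print(f"Ivy Bridge -> Skylake, {bound} estimate: {gain:.1%}")
```

Under these assumptions the per-clock improvement from Ivy Bridge to Skylake comes to roughly 21 to 27 percent; the higher base frequency of the Core i7-6700K compared with the Core i7-3770K then adds to that figure.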

And in reality this was quite serious progress, because afterwards everything became much worse. After Skylake, any improvement in the per-clock performance of processor cores stopped altogether. The processors currently on the market still use the Skylake microarchitectural design, even though almost three years have passed since its introduction in desktop processors. The unexpected pause happened because Intel was unable to cope with the introduction of the next semiconductor process, 10 nm. As a result, the whole “tick-tock” principle fell apart, forcing the microprocessor giant to improvise and repeatedly re-release old products under new names.

The Kaby Lake generation of processors, which appeared on the market at the very beginning of 2017, became the first and very striking example of Intel's attempts to sell the same Skylake to customers a second time. The close family ties between the two generations were not particularly hidden. Intel said honestly that Kaby Lake is neither a "tick" nor a "tock", but a simple optimization of the previous design. Here the word "optimization" meant some improvements in the structure of the 14-nm transistors, which made it possible to raise clock frequencies without changing the thermal envelope. A special term, "14+ nm", was even coined for the modified process. Thanks to this manufacturing technology, Kaby Lake's senior mainstream desktop processor, the Core i7-7700K, was able to offer users a nominal frequency of 4.2 GHz and a turbo frequency of 4.5 GHz.

Thus, the increase in Kaby Lake frequencies over the original Skylake was about 5 percent, and that was all, which, frankly, cast doubt on the legitimacy of classifying Kaby Lake as the next generation of Core. Up to this point, every new generation of processors, whether it belonged to the "tick" or the "tock" phase, had provided at least some increase in IPC. In Kaby Lake, meanwhile, there were no microarchitectural improvements at all, so it would be more logical to consider these processors simply a second stepping of Skylake.

However, the new version of the 14-nm technical process was still able to prove itself in some ways: the overclocking potential of Kaby Lake in comparison with Skylake grew by about 200-300 MHz, due to which the processors of this series were warmly received by enthusiasts. True, Intel continued to use thermal paste instead of solder under the processor cover, so scalping was necessary to fully overclock Kaby Lake.

Intel did not manage to introduce its 10-nm technology by the beginning of this year either. Therefore, at the end of last year, another family of processors based on the same Skylake microarchitecture came to market: Coffee Lake. But calling Coffee Lake the third guise of Skylake is not entirely correct. Last year was a period of radical paradigm shift in the processor market. AMD returned to the “big game” and managed to break established traditions and create demand for mainstream processors with more than four cores. Intel suddenly found itself in the role of catching up, and the release of Coffee Lake was not so much an attempt to fill the gap before the long-awaited 10-nm Core processors as a reaction to the release of six- and eight-core AMD Ryzen processors.

As a result, Coffee Lake processors received an important structural difference from their predecessors: the number of cores was increased to six, a first for Intel's mainstream platform. However, once again no changes were introduced at the microarchitecture level: Coffee Lake is essentially a six-core Skylake, built from computing cores with exactly the same internal structure, which are equipped with an L3 cache enlarged to 12 MB (according to the standard principle of 2 MB per core) and are connected by the usual ring bus.

However, although we so easily allow ourselves to say "nothing new" about Coffee Lake, it is not entirely fair to claim that there were no changes at all. Although nothing changed in the microarchitecture once again, Intel specialists had to put in a lot of effort to make six-core processors fit into the standard desktop platform. And the result was quite convincing: the six-core processors stayed within the usual thermal envelope and, moreover, did not lose anything in clock frequency.

In particular, the senior representative of the Coffee Lake generation, the Core i7-8700K, received a base frequency of 3.7 GHz, and in turbo mode it can boost to 4.7 GHz. At the same time, the overclocking potential of Coffee Lake, despite its larger die, turned out to be even better than that of all its predecessors. Ordinary owners often push the Core i7-8700K to the 5 GHz mark, and such overclocking can be achieved even without scalping and replacing the internal thermal interface. This means that Coffee Lake, although it grew by extensive means, is a significant step forward.

All this became possible solely thanks to yet another improvement of the 14-nm process. In the fourth year of its use for mass production of desktop chips, Intel achieved truly impressive results. The third version of the 14-nm process ("14++ nm" in the manufacturer's designation) and the rearrangement of the die made it possible to significantly improve performance per watt and raise total computing power. By introducing six cores, Intel perhaps managed to take an even more significant step forward than with any of the previous microarchitecture improvements. And today Coffee Lake looks like a very tempting option for upgrading old systems based on earlier carriers of the Core microarchitecture.

Codename	Process technology	Number of cores	GPU	L3 cache, MB	Number of transistors, billion	Die area, mm²
Sandy Bridge	32 nm	4	GT2	8	1.16	216
Ivy Bridge	22 nm	4	GT2	8	1.2	160
Haswell	22 nm	4	GT2	8	1.4	177
Broadwell	14 nm	4	GT3e	6	N/A	~145 + 77 (eDRAM)
Skylake	14 nm	4	GT2	8	N/A	122
Kaby Lake	14+ nm	4	GT2	8	N/A	126
Coffee Lake	14++ nm	6	GT2	12	N/A	150

Processors and Platforms: Specifications

To compare the last seven generations of Core i7, we took the senior representatives in the respective series - one from each design. The main characteristics of these processors are shown in the following table.

	Core i7-2700K	Core i7-3770K	Core i7-4790K	Core i7-5775C	Core i7-6700K	Core i7-7700K	Core i7-8700K
Codename	Sandy Bridge	Ivy Bridge	Haswell (Devil's Canyon)	Broadwell	Skylake	Kaby Lake	Coffee Lake
Process technology, nm	32	22	22	14	14	14+	14++
Release date (DD.MM.YYYY)	23.10.2011	29.04.2012	2.06.2014	2.06.2015	5.08.2015	3.01.2017	5.10.2017
Cores / threads	4/8	4/8	4/8	4/8	4/8	4/8	6/12
Base frequency, GHz	3.5	3.5	4.0	3.3	4.0	4.2	3.7
Turbo Boost frequency, GHz	3.9	3.9	4.4	3.7	4.2	4.5	4.7
L3 cache, MB	8	8	8	6 (+128 MB eDRAM)	8	8	12
Memory support	DDR3-1333	DDR3-1600	DDR3-1600	DDR3L-1600	DDR4-2133	DDR4-2400	DDR4-2666
Instruction set extensions	AVX	AVX	AVX2	AVX2	AVX2	AVX2	AVX2
Integrated graphics	HD 3000 (12 EU)	HD 4000 (16 EU)	HD 4600 (20 EU)	Iris Pro 6200 (48 EU)	HD 530 (24 EU)	HD 630 (24 EU)	UHD 630 (24 EU)
Max. graphics core frequency, GHz	1.35	1.15	1.25	1.15	1.15	1.15	1.2
PCI Express version	2.0	3.0	3.0	3.0	3.0	3.0	3.0
PCI Express lanes	16	16	16	16	16	16	16
TDP, W	95	77	88	65	91	91	95
Socket	LGA1155	LGA1155	LGA1150	LGA1150	LGA1151	LGA1151	LGA1151v2
Official price	$332	$332	$339	$366	$339	$339	$359

Interestingly, in the seven years since the release of Sandy Bridge, Intel has not been able to significantly increase the clock speeds. Despite the fact that the manufacturing process has changed twice and the microarchitecture has been seriously optimized twice, today's Core i7 has hardly advanced in terms of its operating frequency. The newest Core i7-8700K has a nominal frequency of 3.7 GHz, which is only 6 percent higher than the frequency of the 2011 Core i7-2700K.

However, this comparison is not entirely correct, because Coffee Lake has one and a half times more processing cores. If we focus on the quad-core Core i7-7700K, the increase in frequency looks somewhat more convincing: this processor is faster than the 32-nm Core i7-2700K by a fairly significant 20 percent in megahertz terms. Even so, it can hardly be called an impressive gain: in absolute terms, it translates into an increase of 100 MHz per year.
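The percentages in the last two paragraphs follow directly from the base frequencies in the specification table. A minimal sketch of the arithmetic is shown below; the helper names `percent_gain` and `mhz_per_year` are illustrative, and the seven-year span is the period the article itself uses.

```python
def percent_gain(new_ghz: float, old_ghz: float) -> float:
    """Relative increase in base frequency, in percent."""
    return (new_ghz - old_ghz) / old_ghz * 100.0


def mhz_per_year(new_ghz: float, old_ghz: float, years: float) -> float:
    """Average absolute increase in base frequency per year, in MHz."""
    return (new_ghz - old_ghz) * 1000.0 / years


if __name__ == "__main__":
    base_2700k, base_7700k, base_8700k = 3.5, 4.2, 3.7  # GHz, from the table
    print(f"Core i7-8700K vs Core i7-2700K: {percent_gain(base_8700k, base_2700k):.1f}%")
    print(f"Core i7-7700K vs Core i7-2700K: {percent_gain(base_7700k, base_2700k):.1f}%")
    print(f"Average per year over 7 years: {mhz_per_year(base_7700k, base_2700k, 7):.0f} MHz")
```

The comparison uses base frequencies only; turbo frequencies grew more, from 3.9 GHz to 4.5 GHz and 4.7 GHz respectively.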

There are no breakthroughs in other formal characteristics either. Intel continues to supply all of its processors with an individual L2 cache of 256 KB per core, as well as a shared L3 cache for all cores, the size of which is determined at the rate of 2 MB per core. In other words, the main factor that has made the greatest progress is the number of cores. Core development started with quad-core CPUs, and came to six-core ones. Moreover, it is obvious that this is not the end, and in the near future we will see eight-core versions of Coffee Lake (or Whiskey Lake).

However, as is easy to see, Intel has hardly changed its pricing policy in seven years either. Even the six-core Coffee Lake costs only six percent more than the previous quad-core flagships. All the other top-end Core i7 processors for the mass-market platform have always cost consumers about $330-340.

Curiously, the biggest changes took place not in the processors themselves but in their memory support. The throughput of dual-channel SDRAM has doubled between the release of Sandy Bridge and today: from 21.3 GB/s to 41.6 GB/s. This is another important factor behind the advantage of modern systems compatible with high-speed DDR4 memory.
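
The throughput figures can be related to memory speed with the usual peak-bandwidth formula: transfers per second, times 8 bytes per 64-bit channel, times the number of channels. The sketch below is only an illustration. It assumes DDR3-1333 for the Sandy Bridge figure, which the paragraph does not state, and the function name is hypothetical.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float,
                       channels: int = 2,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# Assumption: dual-channel DDR3-1333 on the Sandy Bridge platform.
ddr3_1333 = peak_bandwidth_gbs(1333)

# The 41.6 GB/s figure corresponds to 2600 MT/s under this formula.
ddr4_quoted = peak_bandwidth_gbs(2600)

print(f"Dual-channel DDR3-1333: {ddr3_1333:.1f} GB/s")
print(f"Dual-channel at 2600 MT/s: {ddr4_quoted:.1f} GB/s")
```

Note that a nominal DDR4-2666 rate would give a slightly higher theoretical value with the same formula, so the quoted 41.6 GB/s should be read as a published specification rather than a result of this calculation.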

In any case, the rest of the platform has evolved along with the processors over these years. Among the main milestones in the development of the platform, in addition to the increase in the speed of compatible memory, it is worth noting the arrival of PCI Express 3.0 support for graphics. Fast memory and a fast graphics bus, along with advances in processor frequencies and architectures, appear to be strong reasons why modern systems are faster than those of the past. DDR4 SDRAM support appeared in Skylake, and the processor's PCI Express bus moved to the third version of the protocol in Ivy Bridge.

In addition, the chipsets accompanying the processors have developed noticeably. Today's Intel 300-series chipsets offer far more interesting features than the Intel Z68 and Z77, which were used in LGA1155 motherboards for Sandy Bridge generation processors. This is easy to verify from the following table, which brings together the characteristics of Intel's flagship chipsets for the mass-market platform.

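	P67 / Z68	Z77	Z87	Z97	Z170	Z270	Z370
CPU compatibility	Sandy Bridge, Ivy Bridge	Sandy Bridge, Ivy Bridge	Haswell	Haswell, Broadwell	Skylake, Kaby Lake	Skylake, Kaby Lake	Coffee Lake
Interface	DMI 2.0 (2 GB/s)	DMI 2.0 (2 GB/s)	DMI 2.0 (2 GB/s)	DMI 2.0 (2 GB/s)	DMI 3.0 (3.93 GB/s)	DMI 3.0 (3.93 GB/s)	DMI 3.0 (3.93 GB/s)
PCI Express standard	2.0	2.0	2.0	2.0	3.0	3.0	3.0
PCI Express lanes	8	8	8	8	20	24	24
PCIe M.2 support	No	No	No	Yes	Yes, up to 3 devices	Yes, up to 3 devices	Yes, up to 3 devices
PCI support	Yes	No	No	No	No	No	No
SATA 6 Gb/s ports	2	2	6	6	6	6	6
SATA 3 Gb/s ports	4	4	0	0	0	0	0
USB 3.1 Gen2 ports	0	0	0	0	0	0	0
USB 3.0 ports	0	4	6	6	10	10	10
USB 2.0 ports	14	10	8	8	4	4	4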
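
The two DMI throughput figures in the table can be reproduced from the link parameters. The sketch below assumes that DMI 2.0 behaves like a four-lane PCI Express 2.0 link (5 GT/s per lane, 8b/10b encoding) and DMI 3.0 like a four-lane PCI Express 3.0 link (8 GT/s per lane, 128b/130b encoding). These parameters are not given in the table, and the function name is hypothetical.

```python
def link_bandwidth_gbs(lanes: int, gt_per_s: float,
                       payload_bits: int, line_bits: int) -> float:
    """Usable one-direction bandwidth of a serial link in GB/s.

    gt_per_s is the raw transfer rate per lane; payload_bits / line_bits
    is the efficiency of the line encoding (8/10 or 128/130).
    """
    return lanes * gt_per_s * payload_bits / line_bits / 8


# Assumed link parameters (see the note above).
dmi2 = link_bandwidth_gbs(lanes=4, gt_per_s=5.0, payload_bits=8, line_bits=10)
dmi3 = link_bandwidth_gbs(lanes=4, gt_per_s=8.0, payload_bits=128, line_bits=130)

print(f"DMI 2.0: {dmi2:.2f} GB/s")
print(f"DMI 3.0: {dmi3:.2f} GB/s")
print(f"Ratio:   {dmi3 / dmi2:.2f}x")
```

With these assumptions DMI 3.0 comes out at a little under 4 GB/s, close to the value in the table and almost twice the DMI 2.0 figure.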

In modern chipsets, the options for connecting high-speed storage have expanded significantly. Most importantly, thanks to the chipsets' transition to the PCI Express 3.0 bus, high-performance builds can now use high-speed NVMe drives, which offer noticeably better responsiveness and faster read and write speeds even compared to SATA SSDs. This alone can be a strong argument in favor of upgrading.

In addition, modern chipsets provide much richer options for connecting additional devices. This is not only about the significant increase in the number of PCI Express lanes, which allows boards to carry several additional PCIe slots in place of conventional PCI. Today's chipsets also have native support for USB 3.0 ports, and many modern motherboards are equipped with USB 3.1 Gen2 ports.

4th generation Intel Core processors (Haswell) belong to the Core i7 and Core i5 lines, are manufactured on a 22 nm process for the LGA 1150 socket, and are intended primarily for 2-in-1 devices that combine the functionality of laptops and tablets, as well as for portable all-in-ones.

Haswell's 4th generation Intel Core processors are primarily designed for ultrabook devices.
They provide 50% longer runtime under active loads than previous generation processors.
High energy efficiency allows selected Ultrabook models to last more than 9 hours on a single charge.

The processors have integrated graphics that provide performance comparable to discrete graphics solutions.
These processors have twice the graphics performance of previous generation Intel processors.

The corporation is ready to present more than 50 different options for 2-in-1 form factor devices in a variety of price categories.

The flagship of this family is the Core i7-4770K, which consists of 1.4 billion transistors and includes, in addition to four x86 cores with Hyper-Threading support, HD Graphics 4600 graphics, a memory controller supporting up to 32 GB of dual-channel DDR3-1600, and 8 MB of L3 cache.

The CPU is clocked at 3.5 GHz (up to 3.9 GHz with Turbo Boost), has an 84 W TDP, and features an unlocked multiplier that allows for some serious overclocking.

4th Generation Intel Core i7 Desktop:

- Intel Core i7-4770T: locked multiplier, 45W TDP, 4 cores, 8 threads, 2.5 GHz base, 3.7 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i7-4770S: locked multiplier, 65W TDP, 4 cores, 8 threads, 3.1 GHz base, 3.9 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i7-4770: locked multiplier, 84W TDP, 4 cores, 8 threads, 3.4 GHz base, 3.9 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i7-4770K: unlocked multiplier, 84W TDP, 4 cores, 8 threads, 3.5 GHz base, 3.9 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel HD Graphics 4600 up to 1250 MHz, LGA-1150

- Intel Core i7-4770R: locked multiplier, 65W TDP, 4 cores, 8 threads, 3.2 GHz base, 3.9 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel Iris Pro 5200 graphics up to 1300 MHz, BGA

- Intel Core i7-4765T: locked multiplier, 35W TDP, 4 cores, 8 threads, 2.0 GHz base, 3.0 GHz Turbo, 1333/1600 MHz DDR3, 8 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

4th Generation Intel Core i5 Desktop:

- Intel Core i5-4670T: locked multiplier, 45W TDP, 4 cores, 4 threads, 2.3 GHz base, 3.3 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i5-4670S: locked multiplier, 65W TDP, 4 cores, 4 threads, 3.1 GHz base, 3.8 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i5-4670K

- Intel Core i5-4670: locked multiplier, 84W TDP, 4 cores, 4 threads, 3.4 GHz base, 3.8 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i5-4570: locked multiplier, 84W TDP, 4 cores, 4 threads, 3.2 GHz base, 3.6 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i5-4570S: locked multiplier, 65W TDP, 4 cores, 4 threads, 2.9 GHz base, 3.6 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

- Intel Core i5-4570T: locked multiplier, 35W TDP, 2 cores, 4 threads, 2.9 GHz base, 3.6 GHz Turbo, 1333/1600 MHz DDR3, 6 MB L3 cache, Intel HD Graphics 4600 up to 1200 MHz, LGA-1150

Intel has come a very long way, from a small chip manufacturer to the world leader in processor production. Over this time, many processor manufacturing technologies have been developed, and the manufacturing process and device characteristics have been greatly optimized.

Much of a processor's performance depends on how the transistors are arranged on the silicon die. This arrangement is called the microarchitecture, or simply the architecture. In this article, we'll look at which Intel processor architectures have been used throughout the company's history and how they differ from each other. Let's start with the oldest microarchitectures and go all the way to new processors and future plans.

As I said, in this article we will not consider the bit width of processors. By architecture we mean the microarchitecture of the chip: the arrangement of transistors on the die, their size, the distance between them, and the manufacturing process all fall under this concept. We will not touch on RISC and CISC instruction sets either.

The second thing to pay attention to is Intel processor generations. You've probably heard many times that this processor is fifth generation, that one is fourth, and another is seventh. Many people think that the generation is what i3, i5 and i7 denote. But i3, i5 and i7 are not generations - they are processor brands. The generation depends on the architecture used.

With each new generation, the architecture improved, processors became faster, more economical and smaller, they emitted less heat, but at the same time they were more expensive. There are few articles on the Internet that would describe all this in full. Now let's look at how it all began.

Intel processor architectures

I'll say right away that you should not expect deep technical details from this article; we will consider only the basic differences that will be of interest to ordinary users.

First processors

First, let's briefly dive into history to understand how it all began. We won't go too far back and will start with 32-bit processors. The first was the Intel 80386; it appeared in 1986 and could run at frequencies up to 40 MHz. Older processors also had a generation count. This processor belongs to the third generation and was made on a 1500 nm process.

The next, fourth generation was the 80486. The architecture used in it was called the 486. The processor ran at 50 MHz and could execute 40 million instructions per second. It had 8 KB of L1 cache and was manufactured on a 1000 nm process.

The next architecture was P5, or Pentium. These processors appeared in 1993; the cache was increased to 32 KB, the frequency to 60 MHz, and the process was reduced to 800 nm. In the sixth generation, P6, the cache size was 32 KB, and the frequency reached 450 MHz. The process was reduced to 180 nm.

Then the company began producing processors based on the NetBurst architecture. It used 16 KB of L1 cache per core and up to 2 MB of L2 cache. The frequency increased to 3 GHz, while the process remained at the same level - 180 nm. 64-bit processors, which could address more memory, first appeared here. There were also many instruction set extensions, and Hyper-Threading technology was added, which presented a single core as two threads and thereby increased performance.

Naturally, each architecture improved over time, with higher frequencies and a finer process. There were also intermediate architectures, but things have been simplified a little here, since this is not our main topic.

Intel Core

NetBurst was replaced in 2006 by the Intel Core architecture. One of the reasons for developing this architecture was the impossibility of raising the frequency further in NetBurst, as well as its very high heat dissipation. The new architecture was designed for multi-core processors, and the L1 cache was increased to 64 KB. The frequency remained at around 3 GHz, but power consumption was greatly reduced, and the process shrank to 65 nm.

Core processors supported Intel VT hardware virtualization, as well as some instruction set extensions, but did not support Hyper-Threading, as they were developed from the P6 architecture, where this was not yet possible.

First generation - Nehalem

From here, generation numbering started over, because all the following architectures are improved versions of Intel Core. The Nehalem architecture replaced Core, which had some limitations such as the inability to raise the clock speed. It appeared in 2008. It uses a 45 nm process and added support for Hyper-Threading technology.

Nehalem processors have 64 KB of L1 cache, 4 MB of L2 cache and 12 MB of L3 cache. The cache is available to all processor cores. It also became possible to integrate a graphics accelerator into the processor. The frequency has not changed, but the performance and the size of the chip have increased.

Second generation - Sandy Bridge

Sandy Bridge appeared in 2011 to replace Nehalem. It uses a 32 nm process, the same amount of L1 cache, 256 KB of L2 cache and 8 MB of L3 cache. Experimental models used up to 15 MB of shared cache.

Also, all devices now come with an integrated graphics accelerator. The maximum frequency was increased, as was overall performance.

Third generation - Ivy Bridge

Ivy Bridge processors are faster than Sandy Bridge, and are manufactured using a 22 nm process technology. They consume 50% less energy than previous models and also offer 25-60% higher performance. The processors also support Intel Quick Sync technology, which allows you to encode video several times faster.

Fourth generation - Haswell

The Haswell generation of Intel processors was developed in 2012. It used the same 22 nm process, but the cache design was changed, the power management mechanisms were improved, and performance rose slightly. The processor supports many new sockets and technologies: LGA 1150, BGA 1364, LGA 2011-3, DDR4, and so on. Haswell's main advantage is that it can be used in portable devices thanks to its very low power consumption.

Fifth generation - Broadwell

This is an improved version of the Haswell architecture, which uses the 14nm process technology. In addition, several architectural improvements were made that resulted in an average performance improvement of 5%.

Sixth Generation - Skylake

The next architecture of Intel Core processors, the sixth-generation Skylake, was released in 2015. This is one of the most significant updates to the Core architecture. The processor uses the LGA 1151 socket; DDR4 memory is now supported, while DDR3 support is retained. Thunderbolt 3.0 is supported, as well as the DMI 3.0 bus, which offers twice the speed. And, by tradition, performance has increased and power consumption has decreased.

Seventh generation - Kaby Lake

The new, seventh generation Core, Kaby Lake, came out this year, with the first processors arriving in mid-January. There weren't many changes here. The 14 nm process is retained, as is the LGA 1151 socket. DDR3L SDRAM and DDR4 SDRAM memory, PCI Express 3.0 and USB 3.1 are supported. In addition, the frequency was slightly increased, and the transistor density was reduced. The maximum frequency is 4.2 GHz.
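
In practice, the generation of a Core processor can usually be read from its model number, and the generation then maps to one of the architectures listed above. The following Python sketch illustrates the idea. The parsing rule is a heuristic assumed for illustration, not an official Intel scheme, and the function names are hypothetical. The lookup table only repeats the codenames and process nodes given in this article.

```python
import re

# Generation number -> (codename, process in nm), as listed in the sections above.
GENERATIONS = {
    1: ("Nehalem", 45),
    2: ("Sandy Bridge", 32),
    3: ("Ivy Bridge", 22),
    4: ("Haswell", 22),
    5: ("Broadwell", 14),
    6: ("Skylake", 14),
    7: ("Kaby Lake", 14),
}


def core_generation(model: str) -> int:
    """Guess the Core generation from a name such as 'Core i7-4770K'.

    Assumed heuristic: take the token after the hyphen and drop trailing
    letter suffixes (K, S, T, U, ...). A 3-character number is treated as
    generation 1, a 4-character number starts with the generation digit,
    and a 5-character number starts with two generation digits.
    """
    match = re.search(r"-(\d[0-9A-Za-z]*)", model)
    if match is None:
        raise ValueError(f"no model number found in {model!r}")
    number = re.sub(r"[A-Za-z]+$", "", match.group(1))
    if len(number) == 3:
        return 1
    if len(number) == 4:
        return int(number[0])
    if len(number) == 5:
        return int(number[:2])
    raise ValueError(f"unexpected model number {number!r}")


def describe(model: str) -> str:
    """Return a one-line description using the table above."""
    gen = core_generation(model)
    if gen in GENERATIONS:
        codename, nm = GENERATIONS[gen]
        return f"{model}: generation {gen}, {codename}, {nm} nm"
    return f"{model}: generation {gen}, codename not listed above"


for name in ("Core i7-920", "Core i7-2700K", "Core i7-4770K", "Core m3-6Y30"):
    print(describe(name))
```

The heuristic ignores exceptions within a generation, for example first-generation parts built on a newer process, so the result should be treated as a first guess and checked against the manufacturer's specifications.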

Conclusions

In this article, we looked at the Intel processor architectures that were used in the past, as well as those that are in use today. Next, the company plans to switch to a 10 nm process, and this generation of Intel processors will be called Cannon Lake. But so far Intel is not ready for this.

Therefore, in 2017 it plans to release an improved version of Skylake codenamed Coffee Lake. There may also be other Intel microarchitectures before the company fully masters the new process. But we will learn about all this over time. I hope this information was helpful to you.
