The Look

For a few generations now, Huawei has been cultivating a specific look on its devices. The machined aluminium unibody, combined with the gaps required for the antennas, meant that the Mate S, the Mate 8, the Mate 9, and the P9 felt like part of the same family. I didn't get that feeling with the base P10 models, and I don't get it with the Mate 10 either. There are three immediate reasons I can think of.

First are the color choices. As I am writing this piece, I have only seen the Mate 10 and Mate 10 Pro in dark colors. When I put them side-by-side with other devices, they do not look significantly different.


Huawei P9, Huawei Mate 10, Huawei Mate 9, LG V30+

This is especially true in low light, and there's no defining 'Huawei' feature. On the rear, the dark color again hides the fact that it is a Huawei device, aside from the perhaps odd way the dual cameras look. There is a band on some of the colors to signify a 'strip' where the cameras are, but this is not part of Huawei's regular look; the strips we have seen to date come on the P9 and P10, not on the Mate units. One caveat to all this: when Huawei launched the P10 in 'Greenery', in collaboration with Pantone, it seemed odd at the time. But now I can pick that phone out of a crowd; it is that distinctive. There is something to be said for being different.

A note on colors: the Mate 10 will be offered in Mocha Brown, Black, Champagne Gold, and Pink Gold. The Mate 10 Pro will be in Midnight Blue, Titanium Gray, Mocha Brown, and Pink Gold. The Mate 10 Porsche Design will be in Diamond Black only.

Second is the fingerprint sensor. This is perhaps more of a personal issue, but to date I have preferred rear fingerprint sensors. Moving to the front for the P10 put me off a little (especially in a dark color), and the fact that the regular Mate 10 now goes this way, with a thin fingerprint sensor, seems a little off-putting.

Third is the display. With most major smartphone manufacturers focusing on this 'all-screen' display technology, there is little room left for OEMs to make an individual mark. Apple, either by luck or by design, got this right. Despite the backlash against the iPhone X over that little notch for the cameras, there is no mistaking that a phone with a notch is an iPhone X. The Mate 10 and Mate 10 Pro do not have the same instantly recognizable look. How to make them obviously recognizable (and different from the iPhone) is for someone paid a lot more than me to think about, but it means the Mate 10 and Mate 10 Pro have the potential to be lost in the crowd. The P11 (if there is one next year) will have to do something on this front.

The Silicon: The Kirin 970

On the silicon side, at the heart of the new Mate 10 phones is the Kirin 970 SoC. The new Kirin 970 is fabbed at TSMC using its smartphone-focused 10nm process. We were expecting Huawei/HiSilicon to be the first SoC vendor to 10nm last year, but its release cycle fell just before 10nm ramped up for mass production. The chip uses the same ARM Cortex-A73 and ARM Cortex-A53 cores as the previous generation, although this time running from more mature blueprints. Last generation, Huawei was first out of the gate with ARM's latest cores, which raised some concerns on the power side, as shown in Matt's review. ARM announced the next-generation A75/A55 cores earlier this year, but those designs were not ready in time for this chip's mass production.


A PCB mockup of the Kirin chip, alongside a 1.4 cm square Core i7 logo

Aside from the A73/A53 cores, the Kirin 970 uses ARM's latest Mali-G72 graphics, this time in an MP12 configuration. That is a 50% increase in shader core count over the previous MP8 design, on top of the architectural improvements from G71 to G72, and the benefit of a 'wider' graphics engine is that it can typically run at lower frequencies, nearer the power-efficiency sweet spot, saving power. In the silicon game of cat and mouse, balancing die size against cost and power, Huawei has accepted added cost and die area in order to reduce power consumption.
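The wider-but-slower trade-off can be sketched with the usual first-order dynamic power relation (P ∝ C·V²·f). The core counts below are from the article; the frequencies and voltages are purely illustrative assumptions, not Huawei's actual operating points.

```python
# Back-of-envelope sketch of why a wider GPU can save power.
# Core counts (8 -> 12) are from the article; clocks and voltages
# are illustrative assumptions only.

def relative_dynamic_power(cores: int, freq_ghz: float, volts: float) -> float:
    """Dynamic power scales roughly with C * V^2 * f; core count
    stands in for switched capacitance C."""
    return cores * volts ** 2 * freq_ghz

# Narrow config: 8 cores pushed to a high clock (needs a higher voltage).
narrow = relative_dynamic_power(cores=8, freq_ghz=1.04, volts=0.90)
# Wide config: 12 cores at a lower clock, closer to the efficiency point.
wide = relative_dynamic_power(cores=12, freq_ghz=0.75, volts=0.80)

# Comparable raw throughput in "core-GHz" terms...
print(8 * 1.04, 12 * 0.75)   # 8.32 vs 9.0
# ...but the wide configuration burns less (relative) power.
print(narrow, wide)
```

Under these assumed numbers the MP12 configuration delivers slightly more aggregate core-GHz while drawing less dynamic power, which is the rationale the article describes: spend die area, save energy.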

HiSilicon High-End Kirin SoC Lineup

                  Kirin 970                Kirin 960                Kirin 950/955
CPU               4x A73 @ 2.40 GHz        4x A73 @ 2.36 GHz        4x A72 @ 2.30/2.52 GHz
                  4x A53 @ 1.80 GHz        4x A53 @ 1.84 GHz        4x A53 @ 1.81 GHz
GPU               ARM Mali-G72MP12         ARM Mali-G71MP8          ARM Mali-T880MP4
                  ? MHz                    1037 MHz                 900 MHz
Memory            2x 32-bit                2x 32-bit                2x 32-bit
                  LPDDR4 @ 1833 MHz        LPDDR4 @ 1866 MHz        LPDDR4 @ 1333 MHz
                                           29.9 GB/s                21.3 GB/s
Interconnect      ARM CCI                  ARM CCI-550              ARM CCI-400
Storage           UFS 2.1                  UFS 2.1                  eMMC 5.0
ISP/Camera        Dual 14-bit ISP          Dual 14-bit ISP          Dual 14-bit ISP
                  (Improved)                                        940MP/s
Encode/Decode     2160p60 Decode           2160p30 HEVC & H.264     1080p H.264
                  2160p30 Encode           Decode & Encode          Decode & Encode
                                           2160p60 HEVC Decode      2160p30 HEVC Decode
Integrated Modem  Kirin 970 LTE            Kirin 960 LTE            Balong LTE
                  (Category 18)            (Category 12/13)         (Category 6)
                  DL = 1200 Mbps           DL = 600 Mbps            DL = 300 Mbps
                  3x20MHz CA, 256-QAM      4x20MHz CA, 64-QAM       2x20MHz CA, 64-QAM
                  UL = 150 Mbps            UL = 150 Mbps            UL = 50 Mbps
                  2x20MHz CA, 64-QAM       2x20MHz CA, 64-QAM       1x20MHz CA, 16-QAM
Sensor Hub        i7                       i6                       i5
NPU               Yes                      No                       No
Mfc. Process      TSMC 10nm                TSMC 16nm FFC            TSMC 16nm FF+
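As a sanity check, the memory bandwidth figures in the table follow directly from the bus width and memory clock, since LPDDR4 transfers data twice per clock over the combined 2x 32-bit (64-bit) bus:

```python
# Reproduce the table's LPDDR4 bandwidth figures from bus width and clock.

def bandwidth_gb_s(clock_mhz: float, bus_bits: int = 64) -> float:
    bytes_per_transfer = bus_bits / 8          # 64-bit bus -> 8 bytes
    transfers_per_s = clock_mhz * 1e6 * 2      # DDR: two transfers per clock
    return bytes_per_transfer * transfers_per_s / 1e9

print(round(bandwidth_gb_s(1866), 1))   # 29.9 (Kirin 960)
print(round(bandwidth_gb_s(1333), 1))   # 21.3 (Kirin 950/955)
print(round(bandwidth_gb_s(1833), 1))   # 29.3 (Kirin 970, not quoted in the table)
```

The 29.3 GB/s figure for the Kirin 970 is derived here from its listed 1833 MHz clock; it is not an official Huawei number.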

The third main feature of the hardware is its new 'Neural Processing Unit', or NPU. This is silicon dedicated to running artificial intelligence calculations and frameworks, in the form of neural networks. As with other task-specific processors, these AI tasks can technically be run on the CPU or GPU, but because an AI network can run at lower precision and has fixed calculation steps, developing specific hardware allows for higher performance at much lower power – the same basic rationale behind GPUs for graphics, ISPs for image processing, etc.

The IP for Huawei's NPU comes from Cambricon Technologies, and at a high level it might be considered similar to NVIDIA's Tensor Cores. We are under the impression the Huawei NPU runs several 3x3x3 matrix multiply engines, whereas the Tensor Cores run 4x4x4. Huawei runs all this in 16-bit floating point mode, and has a listed performance of 1.92 TFLOPS. This is a relatively high number, and for reference is twice the throughput that Apple quotes for its new Neural Engine found in the A11 Bionic processors for the iPhone 8 and iPhone X.
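The quoted 1.92 TFLOPS figure can be sanity-checked against the 3x3x3 matrix engine description. The engine size and FP16 mode are from the article; the clock frequency and engine count below are illustrative assumptions, since Huawei has not disclosed either.

```python
# Rough sanity check on the quoted 1.92 TFLOPS FP16 figure.
# 3x3x3 matrix engines are from the article; the clock and engine
# count are assumptions for illustration.

MACS_PER_3X3_MATMUL = 3 * 3 * 3              # 27 multiply-accumulates
FLOPS_PER_MATMUL = MACS_PER_3X3_MATMUL * 2   # each MAC = 1 mul + 1 add

def peak_tflops(engines: int, clock_ghz: float) -> float:
    """Peak throughput if every engine retires one 3x3 matmul per cycle."""
    return engines * FLOPS_PER_MATMUL * clock_ghz * 1e9 / 1e12

# e.g. ~36 engines at a hypothetical 1.0 GHz lands near the quoted figure
print(peak_tflops(engines=36, clock_ghz=1.0))   # -> 1.944
```

In other words, the headline number is consistent with a few dozen small matrix engines at a mobile-class clock; the exact split between engine count and frequency is unknown.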

The latest unconfirmed reports I have seen put Huawei's NPU at around 25-30% of the full silicon area. They quote 'under 100 mm2' for the total die size, and a total of 5.5 billion transistors. That comes out to a surprising 55 million transistors per square millimeter using TSMC's 10nm process, which is double that of AMD's Ryzen design, and even above Intel's own 48 MTr/mm2 estimate given at its manufacturing day.
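The density arithmetic is straightforward. The Kirin 970 figures (5.5 billion transistors, ~100 mm2) are the unconfirmed numbers above; the Ryzen die figures used for comparison are approximate public numbers for the Zeppelin die and should be treated as assumptions.

```python
# Transistor density arithmetic from the paragraph above.
# Kirin figures are the article's unconfirmed numbers; the Ryzen
# (Zeppelin) die figures (~4.8B transistors, ~213 mm^2) are assumed.

kirin_density = 5.5e9 / 100 / 1e6    # -> 55.0 MTr/mm^2
ryzen_density = 4.8e9 / 213 / 1e6    # -> ~22.5 MTr/mm^2
intel_estimate = 48                  # Intel's stated 10nm figure, MTr/mm^2

print(kirin_density)                   # 55.0
print(round(ryzen_density, 1))         # 22.5
print(kirin_density > intel_estimate)  # True
```

Under these assumptions the Kirin 970 comes out at roughly 2.4x Ryzen's density, in line with the "double" characterization above.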

If Huawei did not have an NPU, the die would be a lot smaller, and here lies a fundamental fact as we move to even smaller process nodes (as in, physically smaller, rather than just a smaller number in the name): it becomes harder and harder to extract pure performance out of a non-parallel design. A chip designer either makes a smaller chip, or spends the transistors on dedicated hardware – supporting a new video encode algorithm, a new DSP, or, in this case, hardware specifically for artificial intelligence networks.

Smartphone as a Desktop

I remember, almost ten years ago, one of Anand's prophecies. It went something like this:

“Give me a smartphone, with all my files, I can dock and use as a PC, and it will revolutionize personal computing.”

At the time, Anand predicted that Microsoft had all the key elements in place: an OS, a smartphone platform, and potentially a gaming platform in the Xbox. All Microsoft had to do was put them all together, although at the time they were focusing on other matters, such as Windows 8 and fixing Windows 8.

Initially we saw Windows RT running on ARM in some hybrid tablets, but the ecosystem did not bite. Eventually we saw Windows' Continuum functionality hit the scene to not a lot of fanfare: it required significant grunt, it appeared on a device from Acer and a device from HP, and it too died a slow death.

Qualcomm is going to push the concept via the Windows on Snapdragon platform, using the Snapdragon 835. Qualcomm is working with Microsoft, and together they are working with most of the major laptop OEMs to provide ARM devices that can run an almost full-blown copy of Windows. These are still laptops though, and not Anand's original vision of a smartphone.

Huawei is going to try and roll its own solution to this. When connecting to a TV, a custom Linux interface will spring up like a traditional desktop operating system, somewhat similar to Samsung's recently launched DeX feature. Bluetooth devices can be connected, and it will have access to all the standard Android apps. The smartphone itself can act as a trackpad for a mouse, or a keyboard, and be connected to something like the MateDock (sold alongside the original Matebook) for additional functionality such as Ethernet, more USB ports, and additional video outputs.

As the headlines for the Mate 10 will be around artificial intelligence, this feature is likely to be left as a footnote for now, similar to how DeX has been on the Galaxy S8 series. In order to get it off the ground, I suspect that Huawei will have to implement some type of 'Desktop Dock' that allows for additional attachments as well as charging at the same time – at this point, Huawei says that users will have to buy a splitter cable to support simultaneous charging. This is the first generation, so there are some rough edges: it only supports displays at their native resolution up to 1920x1080 for now, and when using a Bluetooth device I did notice some lag. Other features, such as something similar to Windows Snap, should be high on the list.

Comments
  • name99 - Monday, October 16, 2017 - link

Think of AI as a pattern recognition engine. What does that imply?
    Well for one thing, the engine is only going to see patterns in what it is FED! So what is it being fed?
    Obvious possibilities are images (and we know how that's working out) and audio (so speech recognition+translation, and again we know how that's working out). A similar obvious possibility could be stylus input and so writing recognition, but no-one seems to care much about that these days.

    Now consider the "smart assistant" sort of idea. For that to work, there needs to be a way to stream all the "activities" of the phone, and their associated data, through the NPU in such a way that patterns can be detected. I trust that, at least the programmers reading this, start to see what sort of a challenge that is. What are the relevant data structures to represent this stream of activities? What does it mean to find some pattern/clustering in these activities --- how is that actionable?

    Now Apple, for a few years now, has been pushing the idea that every time a program interacts with the user, it wraps up that interaction in a data structure that describes everything that's being done. The initial reason for this was, I think, for the on-phone search engine, but soon the most compelling reason for this (and easiest to understand the idea) was Continuity --- by wrapping up an "activity" in a self-describing data structure, Apple can transmit that data structure from, say, your phone to your mac or your watch, and so continue the activity between devices.

    Reason I bring this up is that it obviously provides at least a starting point for Apple to go down this path. But only a starting point. Unlike images, each phone does not generate MILLIONS of these activities, so you have a very limited data set within which to find patterns. Can you get anything useful out of that? Who knows?

Android also has something called Activities, but as far as I can tell they are rather different from the Apple version, and not useful for the sort of issue I described; as far as I know, Android has no such equivalent today. Presumably MS will have to define such an equivalent as part of their copy of (a subset of) Continuity that's coming out with the Fall Creators Update, and perhaps they have the same sort of AI ambitions that they hope to layer upon it?
    Reply
  • Valantar - Tuesday, October 17, 2017 - link

The thing is, the implementation here is a "pattern recognition engine" /without any long-term memory/. In other words: it can't learn/adapt/improve over time. As such, it's as dumb as a bag of rocks. I wholeheartedly agree with not caring if the phone can recognize cats/faces/landscapes in my photos (which, besides, a regular non-AI algorithm can do too, although probably not as well). How about learning the user's preferences in terms of aesthetics, subject matter, gallery culling? That would be useful, especially the last one: understanding what the fifteen photos you just took were focusing on, and then picking the best one in terms of focus, sharpness, background separation, colour, composition, and so on. Sure, also a task an algorithm could do (and does, in some apps), but sufficiently complex that an AI that learns over time would likely do a far better job. Not to mention that an adaptive AI in that situation could regularly present the user with prompts like "I selected this as the best shot. These were the runners-up. Which do you prefer?", which would give valuable feedback to adjust the process. Reply
  • serendip - Tuesday, October 17, 2017 - link

    I do plenty of manual editing and cataloging of photos shot on the phone, in a mirror of the processes I use for a DSLR and a laptop. I don't think an AI will know if I want to do a black and white series in Snapseed from color photos, it won't know which ones to keep or delete, and it won't know about the folder organization I use.

    So what exactly is the AI for?
    Reply
  • tuxRoller - Tuesday, October 17, 2017 - link

It COULD improve over time if they pushed out upgraded NNs. Off-device training with occasional updates to supported devices is going to be the best option for a while. Reply
  • Krysto - Monday, October 16, 2017 - link

Disappointed they aren't using the AI accelerator for more advanced computational photography, like say better identifying a person's face and body and then doing the bokeh around that, or improving dynamic range by knowing exactly which parts to expose more, and so on.

    Auto-switching to a "mode" is really something that other phone makers have had for a few years now in their "Auto" mode.
    Reply
  • Ian Cutress - Monday, October 16, 2017 - link

    This was one of my comments to Huawei. I was told that the auto modes do sub-frame enhancements for clarity, though I'm under the assumption those are the same as previous tools and algorithms and not AI driven. Part of the issue here is AI can be a hammer, but there needs to be nails. Reply
  • melgross - Monday, October 16, 2017 - link

    With the supposed high performance of this neural chip when compared to Apple’s right now, I’m a bit confused.

    Since we don’t know exactly how this works, and we know even less about Apple’s, how can we even begin to compare performance between them?

Hopefully, you will soon have a real review of the 8/8+, as well as the deep dive of the SoC, something which was promised for last year's model but never materialized.

    A comparison between these two new SoCs will be interesting.
    Reply
  • name99 - Monday, October 16, 2017 - link

    Apple gave an "ops per sec" number, Huawei gave a FLOPS number. One was bigger than the other.
    That's all we have.

    There are a million issues with this. Are both talking about 32-bit flops? Or 16-bit? Maybe Apple meant 32-bit FLOPs and Huawei 16-bit?
    And is FLOP actually a useful metric? Maybe in real situations these devices are really limited by their cache or memory subsystems?

To be fair, no-one is (yet) making a big deal about this, precisely because anyone who knows anything understands just how meaningless both the numbers are. It will take a year or three before we have enough experience with what the units do that we CARE about, and so know what to bother benchmarking or how to compare them.

    Baidu have, for example, a supposed NPU benchmark suite, but it kinda sucks. All it tests is the speed of some convolutions at a range of different sizes. More problematically, at least as it exists today, it's basically C code. So you can look up the performance number for an iPhone, but it's meaningless because it doesn't even give you the GPU performance, let alone the NPU performance.
    We need to learn what sort of low-level performance primitives we care about testing, then we need to write up comparable cross-device code that uses the optimal per-device APIs on each device. This will take time.
    Reply
  • melgross - Wednesday, October 18, 2017 - link

    This is the problem I’m thinking about. We don’t have enough info to go by. Reply
  • varase - Tuesday, November 21, 2017 - link

I assumed Apple's numbers were something in the inferences/second ballpark, as this is a neural processor (and MLKit seems to process data produced by standard machine learning models). We know the Apple neural processor is used by Face ID, first as a gatekeeper (detecting attempts to fake it out), and then to check whether the face in view is the same one stored in the secure enclave.

    Flops seems to imply floating point operations/second.

    Color me confused.
    Reply
