Straight from the lab: why a benchmark isn’t the answer to everything

Dominik Bärlocher
Zurich, on 31.01.2018
Responsible for translation: Eva Francis
Benchmarks should be there to give a standardised comparison for smartphones and other technology. But the automated tests forget one rather important thing: the people who’ll be using the devices. Let’s take a look behind the scenes of our test methodology.

Benchmarks promise a lot. Above all, they’re supposed to be a reliable, objective and neutral indication of smartphone performance. This leads some people to the conclusion they’re better than any other test. As a professional phone tester, I can confirm that’s not the case.

It doesn’t often happen that I have two phones of the same type on my desk. But I’ve struck lucky with the LG V30: I have not only an LG V30+, a Korean export model, but also an EU version of the LG V30. Only two things set these devices apart:

  1. The LG V30+ features 128 GB internal memory, while the LG V30 only has 64 GB
  2. The LG V30+ boasts a hybrid dual SIM slot and the LG V30 doesn’t

The rest of the specs are identical. So when I run a benchmark app on both phones, the scores should be the same.

Let the testing begin: methodology for the benchmarks

I’m using the following devices for my benchmark test:

LG V30+ (128GB, Moroccan Blue, 6", Hybrid Dual SIM, 16Mpx)

The app I’m using for the benchmark is Antutu Benchmark with the 3D Add On. There are numerous benchmarks in the Google Play Store, but Antutu has consistently good reviews. After discussing the options with the mobile geeks in the company, we settled on Antutu more or less by chance.

This is where we stumbled across the first problem of benchmark testing. There isn’t just the one benchmark, because anyone can develop and publish their own benchmark app. If benchmarks are to be universal, they’ll have to establish some sort of standard.

But at the moment, no such standard exists. That’s why any result from any benchmark app can be disputed: another app can spit out a different figure that carries just as much weight in the benchmark world as the Antutu score.

The result: V30+ wins

I ran ten rounds of Antutu benchmarks. The mobile geeks couldn’t agree on how to go about it, though. Each of them thought they knew how to make a benchmark better and therefore more meaningful. One rule says that after a benchmark, you’re supposed to leave the phone in the fridge for half an hour so it can cool down. Another says the phone is meant to be in flight mode so that data transfer doesn’t interfere with any of the functions.

The LG V30 and the LG V30+ are almost identical

A benchmark that can be influenced by so many environmental factors and that delivers inconsistent data will obviously be doubted. That’s why I decided to carry out the test like this: I’d take both phones and run the benchmark on each of them ten times in a row. Without taking a break, without putting them in the fridge and without waiting for the right lunar phase.

Round   LG V30    LG V30+
1       162116    169016
2       165968    168973
3       158907    163637
4       160792    160500
5       156781    157918
6       147413    152253
7       148210    149940
8       142798    148834
9       173738    173223
10      165803    168960

A quick analysis:

  • On average, the LG V30 scored 158 252.60 points
  • On average, the LG V30+ scored 161 325.40 points
  • The LG V30 scored the highest single value with 173 738.00 points
  • The LG V30 scored the lowest single value with 142 798.00 points

This shows that, on average, the LG V30+ won the round. The difference was about 3072.80 points, or 1.9%.
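The averages and the gap can be reproduced with a few lines of Python. This is only a minimal sketch built on the ten scores per phone from the table; the variable names are made up for illustration, and the assumption that the 1.9% is measured against the LG V30+ average is mine. The spread between best and worst round isn't part of the analysis above; it's added purely for comparison.

```python
from statistics import mean

# Antutu scores from the ten rounds listed in the table above
v30 = [162116, 165968, 158907, 160792, 156781,
       147413, 148210, 142798, 173738, 165803]
v30_plus = [169016, 168973, 163637, 160500, 157918,
            152253, 149940, 148834, 173223, 168960]

avg_v30 = mean(v30)
avg_v30_plus = mean(v30_plus)

# Gap between the two averages, in points and in percent.
# Assumption: the percentage is taken relative to the LG V30+ average.
gap_points = avg_v30_plus - avg_v30
gap_percent = gap_points / avg_v30_plus * 100

# Spread between the best and worst round on each phone
# (an illustrative extra, not part of the original analysis).
spread_v30 = max(v30) - min(v30)
spread_v30_plus = max(v30_plus) - min(v30_plus)

print(f"LG V30 average:  {avg_v30:.2f}")       # 158252.60
print(f"LG V30+ average: {avg_v30_plus:.2f}")  # 161325.40
print(f"Gap: {gap_points:.2f} points ({gap_percent:.1f}%)")  # 3072.80 points (1.9%)
print(f"Spread LG V30:  {spread_v30}")         # 30940
print(f"Spread LG V30+: {spread_v30_plus}")    # 24389
```

On these numbers, the spread between a single phone's best and worst round is several times larger than the gap between the two averages, which is one more reason not to read too much into a 1.9% lead.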

But something occurred to me as I was carrying out the benchmark tests. The point of the fridge idea is to cool the phone down; the theory is that a cooled phone delivers better and more reliable results. And yet my test contradicts that, at least anecdotally: both phones posted their highest score in the ninth round and their lowest in the eighth, with no cooling break in between. To be absolutely sure, I’d need to carry out a lot more tests, which I’d then declare representative based on nothing at all.
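Which rounds produced the extremes is easy to check. A minimal sketch, again using the scores from the first table; `best_and_worst_round` is a made-up helper name for illustration.

```python
# Antutu scores per round, taken from the LG V30 / LG V30+ table
v30 = [162116, 165968, 158907, 160792, 156781,
       147413, 148210, 142798, 173738, 165803]
v30_plus = [169016, 168973, 163637, 160500, 157918,
            152253, 149940, 148834, 173223, 168960]


def best_and_worst_round(scores):
    """Return the 1-based round numbers of the highest and lowest score."""
    best = scores.index(max(scores)) + 1
    worst = scores.index(min(scores)) + 1
    return best, worst


print(best_and_worst_round(v30))       # (9, 8)
print(best_and_worst_round(v30_plus))  # (9, 8)
```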

What benchmarks tell you

Benchmarks do have a certain significance. Here’s what I found when I compared two completely different phones, an old HTC One M7 from 2013 and a brand new Razer Phone.

Surprisingly, the Razer Phone (2018) won against the HTC One M7 (2013)
Round   HTC One M7   Razer Phone
1       40863        178219
2       40459        180698
3       40227        181227
4       40045        180238
5       40988        177600
6       40814        176727
7       40603        171843
8       40662        175492
9       40467        175660
10      39987        171611

A quick analysis:

  • On average, the Razer Phone scored 176 931.50 points
  • On average, the HTC One M7 scored 40 511.50 points
  • The Razer Phone scored the highest single value with 181 227 points
  • The HTC One M7 scored the lowest single value with 39 987 points

And what does that all mean? The new phone is better than the old one. Who’d have thought it? The difference of 77.10% is completely meaningless.
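For the record, here's how the 77.10% can be arrived at. A minimal sketch with the scores from the table; that the difference is expressed relative to the Razer Phone average is my assumption, and the variable names are invented for illustration.

```python
from statistics import mean

# Antutu scores from the HTC One M7 / Razer Phone table
htc_one_m7 = [40863, 40459, 40227, 40045, 40988,
              40814, 40603, 40662, 40467, 39987]
razer_phone = [178219, 180698, 181227, 180238, 177600,
               176727, 171843, 175492, 175660, 171611]

avg_htc = mean(htc_one_m7)
avg_razer = mean(razer_phone)

# Assumption: the difference is expressed relative to the Razer Phone average.
difference_percent = (avg_razer - avg_htc) / avg_razer * 100

print(f"HTC One M7 average:  {avg_htc:.2f}")       # 40511.50
print(f"Razer Phone average: {avg_razer:.2f}")     # 176931.50
print(f"Difference: {difference_percent:.2f}%")    # 77.10%
print(f"Lowest HTC One M7 score: {min(htc_one_m7)}")  # 39987
```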

Right, time for another set of phones to battle it out. Razer Phone versus Samsung Galaxy Note 8.

With phones that are more or less on a par with each other, it’s not worth doing a benchmark test
Round   Razer Phone   Samsung Galaxy Note 8
1       178219        175360
2       180698        176210
3       181227        176939
4       180238        176036
5       177600        175236
6       176727        176321
7       171843        122752
8       175492        175762
9       175660        176286
10      171611        114510
Samsung Galaxy Note8 EU (64GB, Midnight Black, 6.30", Hybrid Dual SIM, 12Mpx)

The Note 8 gets defeated. But then again, that’s not really surprising when you check the specs. At best, the benchmark test seems to be a game that confirms your theory; at worst, it’s a waste of time.
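How big the defeat looks depends on how the ten rounds are summarised. The sketch below is only an illustration using the scores from the table; comparing mean and median is my choice of summary, not something Antutu reports, and the variable names are made up.

```python
from statistics import mean, median

# Antutu scores from the Razer Phone / Samsung Galaxy Note 8 table
razer_phone = [178219, 180698, 181227, 180238, 177600,
               176727, 171843, 175492, 175660, 171611]
galaxy_note_8 = [175360, 176210, 176939, 176036, 175236,
                 176321, 122752, 175762, 176286, 114510]

for name, scores in (("Razer Phone", razer_phone),
                     ("Galaxy Note 8", galaxy_note_8)):
    print(f"{name}: mean {mean(scores):.2f}, median {median(scores):.2f}")

# Razer Phone: mean 176931.50, median 177163.50
# Galaxy Note 8: mean 164541.20, median 175899.00
```

The Note 8's average is dragged down by rounds 7 and 10. Judged by the median, the two phones are less than 1% apart.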

What benchmarks don’t tell you

At digitec, when we test phones, we go above and beyond the realms of the benchmark tests. At the end of it, you know what they’re like to use day in, day out – you don’t just get a jumble of numbers from an app. Because let’s face it, you’re going to be using your phone on a daily basis, and even the best benchmark score will hide umpteen factors.

It won’t tell you anything about the bit of dirt behind the glass on my LG V30+, which you wouldn’t notice if you just picked it up once for a benchmark test. The camera speed on the Razer Phone wouldn’t have been called into question and the durability of the HTC One M7 wouldn’t have been brought to light.

To uncover these points, to assess them, qualify and quantify them, you need human eyes and hands. At the end of the day, it’s you and not an app who’s going to be holding the phone in your hands and using it to call people, take pictures and send WhatsApps to friends and family. Arbitrary values can be as high as they want, but they’ll never communicate all of this.

And on that note, I’ll carry on testing… just without benchmarks most of the time.


Dominik Bärlocher
Senior Editor, Zurich
Journalist. Author. Hacker. A storyteller searching for boundaries, secrets and taboos – putting the world to paper. Not because I can but because I can’t not.
