Straight from the lab: why a benchmark isn’t the answer to everything

Benchmarks should be there to give a standardised comparison for smartphones and other technology. But the automated tests forget one rather important thing: the people who’ll be using the devices. Let’s take a look behind the scenes of our test methodology.

Benchmarks promise a lot. Above all, they’re supposed to be a reliable, objective and neutral indication of smartphone performance. This leads some people to the conclusion they’re better than any other test. As a professional phone tester, I can confirm that’s not the case.

It’s not often that I have two phones of the same type on my desk. But I’ve struck lucky with the LG V30. Not only do I have an LG V30+, a Korean export model, but also an EU version of the LG V30. Only two things set these devices apart:

  1. The LG V30+ features 128 GB of internal storage, while the LG V30 only has 64 GB
  2. The LG V30+ boasts a hybrid dual SIM slot and the LG V30 doesn’t

The rest of the specs are identical. So when I run a benchmark app on both phones, the scores should be the same.

Let the testing begin: methodology for the benchmarks

I’m using the following devices for my benchmark test:

  • V30 (6", 64GB, 16MP, Cloud Silver)
  • V30+ (6", 128GB, Dual SIM, 16MP, Moroccan Blue)

The app I’m using for the benchmark is Antutu Benchmark with the 3D add-on. There are numerous benchmark apps in the Google Play Store, but Antutu has consistently good reviews. We came across it more or less by chance while discussing options with the mobile geeks in the company.

This is where we stumbled across the first problem of benchmark testing. There isn’t just the one benchmark, because anyone can develop and publish their own benchmark app. If benchmarks are to be universal, they’ll have to establish some sort of standard.

But at the moment, this standard doesn’t exist. That’s why any result from any benchmark app can be disputed: another app can spit out a different figure, which carries just as much weight in the benchmark world as the Antutu score.

The result: V30+ wins

I ran ten rounds of Antutu benchmarks. The mobile geeks weren’t in agreement. Each of them thought they knew how to make a benchmark better and therefore more meaningful. After a benchmark, you’re supposed to leave the phone in the fridge for half an hour so it can cool down. It’s also meant to be in flight mode so that data transfer doesn’t interfere with any of the functions.

The LG V30 and the LG V30+ are almost identical

A benchmark that can be influenced by so many environmental factors and that delivers inconsistent data will obviously be doubted. That’s why I decided to carry out the test like this: I’d take both phones and run the benchmark ten times in a row on each. Without taking a break, without putting them in the fridge and without waiting for the right lunar phase.

A quick analysis:

  • On average, the LG V30 scored 158 252.60 points
  • On average, the LG V30+ scored 161 325.40 points
  • The LG V30 scored the highest single value with 173 738.00 points
  • The LG V30 scored the lowest single value with 142 798.00 points

This shows that, on average, it was the LG V30+ that won the round. The difference was about 3072.80 points or 1.9%.
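
To put the 1.9% into perspective, the figures in the list can be checked with a few lines of Python. This is only a sketch built on the four numbers quoted above. The variable names are illustrative, and since the text doesn’t say which phone’s average the percentage refers to, the V30+ average is assumed as the baseline (using the V30 average instead also rounds to 1.9%). The sketch also sets the gap between the two phones against the distance between the best and worst single run of the LG V30.

```python
# Figures quoted in the article (Antutu points, ten runs per phone).
v30_mean = 158_252.60
v30_plus_mean = 161_325.40
v30_max = 173_738.00  # highest single run, LG V30
v30_min = 142_798.00  # lowest single run, LG V30

# Gap between the two averages, in points and in percent.
# Assumption: the percentage is taken relative to the V30+ average.
gap = v30_plus_mean - v30_mean
gap_pct = gap / v30_plus_mean * 100

# Run-to-run spread of one and the same phone.
spread = v30_max - v30_min
spread_pct = spread / v30_mean * 100

print(f"Gap between phones:    {gap:.2f} points ({gap_pct:.1f}%)")
print(f"Spread within the V30: {spread:.2f} points ({spread_pct:.1f}%)")
print(f"Spread / gap:          {spread / gap:.1f}")
```

The spread between the best and worst run of a single phone comes out at roughly ten times the gap between the two phones, so the V30+ “win” sits well inside the run-to-run noise.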

But something occurred to me as I was carrying out the benchmark tests. Going back to the fridge idea, the purpose is to cool down the phone. The theory is that a cooled phone delivers better and more reliable results. And yet my test contradicts that, at least anecdotally: both phones posted their highest score in the ninth round, after eight back-to-back runs with no cooling, and their lowest in the eighth. To be absolutely sure, I’d need to carry out a lot more tests, which I’d then say were representative based on nothing at all.

What benchmarks tell you

Benchmarks do have a certain significance. Here’s what I found when I compared two completely different phones, an old HTC One M7 from 2013 and a brand new Razer Phone.

Surprisingly, the Razer Phone (2018) won against the HTC One M7 (2013)

A quick analysis:

  • On average, the Razer Phone scored 176 931.50 points
  • On average, the HTC One M7 scored 40 511.50 points
  • The Razer Phone scored the highest single value with 181 227 points
  • The HTC One M7 scored the lowest single value with 39 611 points

And what does that all mean? The new phone is better than the old one. Who’d have thought it? The difference of 77.10% is completely meaningless.
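
How a percentage like this is calculated depends on the baseline. Judging by the arithmetic, the 77.10% seems to be the gap measured against the Razer Phone’s average; that is an inference from the two averages quoted above, not something the text states. The sketch below, with illustrative variable names, shows the same gap expressed three ways.

```python
# Average Antutu scores quoted in the article.
razer_mean = 176_931.50
htc_mean = 40_511.50

diff = razer_mean - htc_mean

# Same gap, different baselines.
pct_of_razer = diff / razer_mean * 100  # relative to the faster phone
pct_of_htc = diff / htc_mean * 100      # relative to the slower phone
ratio = razer_mean / htc_mean           # how many times the HTC score

print(f"Gap: {diff:.2f} points")
print(f"Relative to the Razer Phone: {pct_of_razer:.2f}%")
print(f"Relative to the HTC One M7:  {pct_of_htc:.2f}%")
print(f"Razer score / HTC score:     {ratio:.2f}")
```

Whichever form is used, it says the same thing: the newer phone scores several times higher, which nobody needed a benchmark to predict.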

Right, time for another set of phones to battle it out. Razer Phone versus Samsung Galaxy Note 8.

With phones that are more or less on a par with each other, it’s not worth doing a benchmark test

The Note 8 gets defeated. But then again, that’s not really surprising when you check the specs. At best, the benchmark test seems to be a game that confirms your theory; at worst, it’s a waste of time.

What benchmarks don’t tell you

At digitec, when we test phones, we go above and beyond the realms of the benchmark tests. At the end of it, you know what they’re like to use day in, day out – you don’t just get a jumble of numbers from an app. Because let’s face it, you’re going to be using your phone on a daily basis, and even the best benchmark score will hide umpteen factors.

It won’t tell you anything about the bit of dirt behind the glass on my LG V30+, which you wouldn’t notice if you just picked it up once for a benchmark test. The camera speed on the Razer Phone wouldn’t have been called into question and the durability of the HTC One M7 wouldn’t have been brought to light.

To uncover these points, to assess them, qualify and quantify them, you need human eyes and hands. At the end of the day, it’s you and not an app who’s going to be holding the phone in your hands and using it to call people, take pictures and send WhatsApps to friends and family. Arbitrary values can be as high as they want, but they’ll never communicate all of this.

And on that note, I’ll carry on testing… just without benchmarks most of the time.

Journalist. Author. Hacker. A storyteller searching for boundaries, secrets and taboos – putting the world to paper. Not because I can but because I can’t not.

17 comments

User TheEscalader

You’re absolutely right. When you’re buying a phone for everyday use, benchmarks don’t help, but reviews do.
And cool wallpaper! #fsociety

31.01.2018

User Dominik Bärlocher

Hello, friend.

31.01.2018

User garned

"The LG V30 scored the lowest single value with 172 798.00 points"
According to the table, that should be 142 798.00 points, shouldn’t it?

31.01.2018

User Dominik Bärlocher

You’re right. It’s fixed. Thanks for pointing it out.

31.01.2018

User reze_dig

Tsss... it just mustn’t stutter in office applications; the benchmark doesn’t need to tell me any more than that.
What MATTERS is: battery life (longer = better!!!!), a display that’s easy to read, possibly the camera and splash resistance, and no fantasy prices of 800.– and up.

31.01.2018

User The Merc

I’m glad these things matter to you. Unfortunately, they can’t be tested that way. Stuttering is best spotted in hands-on tests; any other way it gets difficult (or more difficult). Battery life doesn’t need a benchmark: you either trust the specs or test it yourself. Readability also has to be judged with the naked eye in a demo. A camera can have lots of nice specs, but the result also depends on the user’s taste. And in the end, you throw the thing in the water.

31.01.2018

User The Merc

This is more about knowing how to deal with benchmarks. The other data is either subjective or hard to test, and often can’t be measured by an app at all. So if you want a real hands-on comparison, I suggest reviews, research and perhaps even asking acquaintances who own the device.

31.01.2018

User miklagard

@Dominik As a hacker and privacy advocate, you of all people should know not to use the Outlook app, since your mail data is routed via Microsoft’s servers in the USA...

31.01.2018

User fumo

Um, no, our data goes to Ireland.
But tinfoil-hat wearers aren’t too fond of real facts, so you won’t want to believe that anyway ;)

01.02.2018

User miklagard

Dear Fumo,
It’s a little older, but the situation hasn’t got any better..
heise.de/mac-and-i/meldung/...

01.02.2018

User fumo

Dear Miklagard,
It helps to read carefully and not fall for scaremongering. I quote: "This is evident from the privacy policy for Acompli". What they did as a matter of routine before they were taken over by MS is not a good basis for accusations. Court rulings prove that MS’s EU user data can’t be accessed by US authorities because it doesn’t pass over US soil.

01.02.2018

User xerxes300

First off, I haven’t read up on "Antutu Benchmark". My question: how long does a benchmark run actually last? If the answer is under 15 minutes, then benchmarks are even more useless to me. I don’t often hit the limits of a smartphone, but it often gets hot and slows down.

31.01.2018

User xerxes300

Of course, it slows down to protect itself from overheating. Technical term: thermal throttling.

31.01.2018

User Dominik Bärlocher

One round of Antutu takes what feels like three to five minutes.

31.01.2018

User July Sullivan

If you look closely at the table, you’ll notice that the Samsung’s benchmark completely collapses in the 10th run. I mean, it could be a coincidence, a "momentary lag", but it could also mean that the phone is no longer usable after 30 to 50 minutes of continuous load... Dominik, a mission for you :D

31.01.2018

User xerxes300

@July Sullivan I’d also be really pleased to see a "proper" benchmark that lasts at least 15 minutes. 3-minute benchmarks are simple: the phone with the best hardware and software solution wins. But in longer tests, cooling and endurance come into play as well.

01.02.2018

User djdomrep

What would make sense are benchmarks like Prime95 on PCs, which you can leave running for hours, days or even weeks.
Apart from that, these benchmarks test some useless things; see Geekbench.
Personally, I use 3DMark to test devices for gaming, not to find out how many photos I can shoot per second, since you only ever take single shots anyway.
I think gaming is an important point for smartphones, as is a "smooth experience" in use.

02.02.2018