Evaluation of cross-platform
technology Flutter from the user’s
perspective
ADIBBIN HAIDER
Abstract
The aim of this thesis was to evaluate the cross-platform technology Flutter from a user perspective. The key aspects investigated were user-perceived performance, such as startup time and application size, together with the user perception of the applications. To evaluate Flutter, multiple sample applications were developed. To compare the user perception of Flutter and native applications, a user study was conducted using a user experience questionnaire (UEQ). The results suggest that if the performance of the application is vital for the users, a Flutter application is most likely not suitable. The user perception study showed no significant differences between the developed sample applications: neither between the Flutter applications and the native application on Android nor on iOS. Thus, a Flutter application can be a suitable alternative to a native Android application from the user's perspective.
Sammanfattning
The aim of the thesis was to evaluate the cross-platform technology Flutter from a user perspective. The key aspects investigated were user-perceived performance, such as startup time and application size. Additionally, user perception was a key feature investigated. To evaluate the cross-platform technology Flutter, several sample applications were developed. To compare the user perception of Flutter and native applications, a user study was conducted using a UEQ. The results indicated that if the performance of the application is decisive for the users, a Flutter application is likely not suitable. The user perception study showed that there were no significant differences between the developed sample applications: neither between the Flutter applications and the native application on the Android nor on the iOS platform. Thus, a Flutter application can be a suitable alternative to a native Android application from the user's perspective.
Contents
1 Introduction 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Research questions . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Aim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Delimitations . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.6 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.6.1 Sample applications . . . . . . . . . . . . . . . . . . 4
1.6.2 User experience and user interface . . . . . . . . . . . 5
1.6.3 Performance . . . . . . . . . . . . . . . . . . . . . . 5
2 Background 7
2.1 Cross-platform applications . . . . . . . . . . . . . . . . . . . 7
2.1.1 Flutter . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Native mobile applications . . . . . . . . . . . . . . . . . . . 12
2.2.1 Development . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Model-View-ViewModel . . . . . . . . . . . . . . . . . . . . 13
2.4 t-Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5 User experience questionnaire . . . . . . . . . . . . . . . . . 16
3 Method 19
3.1 Developing sample applications . . . . . . . . . . . . . . . . 19
3.1.1 Complexity . . . . . . . . . . . . . . . . . . . . . . . 19
3.1.2 Design pattern . . . . . . . . . . . . . . . . . . . . . 20
3.1.3 User experience and user interface . . . . . . . . . . . 21
3.1.4 Developed applications . . . . . . . . . . . . . . . . . 21
3.2 Performance metrics . . . . . . . . . . . . . . . . . . . . . . 25
3.2.1 Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.2 Startup time . . . . . . . . . . . . . . . . . . . . . . . 26
v
vi CONTENTS
4 Result 30
4.1 User perceived performance . . . . . . . . . . . . . . . . . . 30
4.1.1 Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1.2 Startup time . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 User perception study . . . . . . . . . . . . . . . . . . . . . . 33
4.2.1 Android . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 iOS . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5 Discussion 36
5.1 User perceived performance . . . . . . . . . . . . . . . . . . 36
5.1.1 Flutter overhead . . . . . . . . . . . . . . . . . . . . 36
5.1.2 Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.1.3 Startup time . . . . . . . . . . . . . . . . . . . . . . . 38
5.2 User perception study . . . . . . . . . . . . . . . . . . . . . . 39
5.2.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.3 Sample applications . . . . . . . . . . . . . . . . . . . . . . . 40
5.4 User’s perspective . . . . . . . . . . . . . . . . . . . . . . . . 42
5.5 Design pattern . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.6 Sources of error . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.7 Sustainability & Ethical Aspects . . . . . . . . . . . . . . . . 43
6 Conclusion 44
7 Future work 46
Bibliography 47
Chapter 1

Introduction
1.1 Introduction
In the last decade of the smartphone era, two platforms have dominated the
market: Android and iOS [1, 2, 3, 4]. This has resulted in developers building
applications with identical functionality for separate platforms with different
codebases [5, 6, 7, 8]. Several cross-platform technologies are available that
aim to solve this issue, giving developers the opportunity to "write once, run
anywhere".
One example of such a technology is Flutter, created by Google. Flutter
aims to solve the challenge of developing applications for different platforms
with a single codebase, thus achieving write once, run on several platforms,
such as Android and iOS [9, 10].
The following thesis evaluates the cross-platform technology Flutter based on
a user perspective. The key aspects investigated are user-perceived perfor-
mance such as startup time and application size.
The core of this evaluation is the development of sample applications: one
native application each for Android and iOS, and two Flutter applications for
Android and iOS sharing the same codebase. This results in three codebases
and four sample applications.
Thereafter, a user perception study will be conducted on the sample applica-
tions. Additionally, the performance will be measured while the participants
in the study use the applications.
1.2 Motivation
Research to date shows that Flutter has not been thoroughly investigated or
evaluated within academia, and no previous studies from the user's perspective
have been found. The perception of applications is crucial for both users and
developers. Users want applications that satisfy their needs and meet their
expectations [11]. Developers therefore have an incentive to build applications
that keep users satisfied and meet those expectations, and might be interested
in knowing whether Flutter-developed applications are an alternative from the
user's perspective. Similar research exists for other cross-platform solutions,
but none for Flutter. Hence, this thesis has significant news value and lays a
foundation for further work regarding Flutter.
1.4 Aim
This thesis aims to answer the research question: can a Flutter-developed
application be a viable alternative to a native application from a user-centered
perspective? To tackle this question, sample applications will be developed
using Flutter and natively for Android and iOS. Thereafter, the applications
will be evaluated and compared using a user experience questionnaire.
The comprehensive aim of this thesis is to help developers make informed
decisions. Depending on the results, this thesis may encourage or discourage
developers from building applications using Flutter.
1.5 Delimitations
Attempting to evaluate an entire cross-platform technology is a substantial
effort, considering that Flutter supports multiple platforms such as Android,
iOS, macOS, and the web. Due to limited time and scope, this thesis will solely
focus on evaluating Flutter from a user-centered perspective for the Android
and iOS platforms.
deemed to be applicable for this thesis. In particular, they used a user experi-
ence questionnaire for understanding the user’s perspective. The same method
was used for this thesis.
All of the previous studies focused on one or a few aspects, such as application
size, development productivity, look and feel, performance, taxonomy, or
UI/UX. Cross-platform solutions are broad and consist of multiple technologies;
it is hard or near impossible to research the whole technology and conclude
something objectively.
Studies regarding mobile application taxonomy tend to split applications into
two groups: applications that run on one platform and cross-platform applications
that run on multiple platforms. For single-platform applications, most of
the studies seem to agree on the name native application [13, 14], except
one study by El-Kassas et al. [15]. All of the studies define cross-platform
applications in distinctive ways.
This thesis will not address the differences in mobile application taxonomy.
The thesis will use the name native application for applications that run on a
single platform, and cross-platform application for applications that run on
multiple platforms. Nor will this thesis investigate the differences in
cross-platform taxonomy in regard to the UI toolkit Flutter.
There is a plethora of studies related to cross-platform applications, and it
would be challenging to summarize every aspect of research in the field. For
the purpose of this thesis, thorough research has been conducted into previous
work related to this thesis and its aim. The following sections on sample
applications, look and feel, and performance are highly relevant to this thesis.
cations. One native application for each of the target platforms. One cross-
platform application developed using the Flutter UI toolkit. The sample applications will be simple to use and have limited functionality.
1.6.3 Performance
Multiple studies [17, 12, 5] have focused on researching the performance
of applications developed using cross-platform solutions. More general
research [16, 20, 21, 22, 25] in the area of mobile cross-platform technologies
has also discussed the performance of native applications compared to
cross-platform applications. The overall conclusion seems to be that native
applications tend to perform better than cross-platform developed applications.
In research by Mercado, Munaiah, and Meneely [5] based on analyzing cus-
tomer reviews for Android and iOS applications, users had a more positive
perception of native applications in aspects of performance. The same study
Chapter 2

Background
This chapter presents the underlying theory and evaluation methods used in
the thesis. It starts by briefly describing the concept of cross-platform
development and how Flutter works. Thereafter, a brief description of native
applications and the design pattern Model-View-ViewModel is presented. The
chapter finishes by describing the evaluation method User Experience
Questionnaire.
2.1.1 Flutter
Flutter is Google's approach to cross-platform application development.
Google describes Flutter as a UI toolkit with which developers can build
applications for mobile, web, and desktop from a single codebase². Web
support is currently in a beta phase³, while desktop (Linux, macOS, Windows)
support is still in an alpha phase⁴.
Declarative UI
Flutter is a declarative framework, meaning that Flutter builds the UI to
reflect the current state of the application⁵. In imperative programming, the
developer must explicitly update the UI whenever the application state changes.
In declarative programming, the UI is instead rebuilt to reflect the current
application state. One of the benefits is that there is only one code path for
any state of the UI; the UI is described once for each state⁶. A potential
drawback is that the UI rebuilds itself as soon as the application state
changes, which might be performance heavy.
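To make the contrast concrete, here is a minimal, hedged sketch in Python (none of these names are Flutter APIs; in Flutter, the role of `build_ui` is played by a widget's `build()` method written in Dart). The imperative handler mutates an existing UI object by hand, while the declarative handler simply re-runs a build function that maps the current state to a fresh UI description.

```python
# Illustrative sketch of imperative vs. declarative UI updates.
# All names are invented for this example; they are not Flutter APIs.

# --- Imperative style: mutate the existing UI object on each change.
class Label:
    def __init__(self, text):
        self.text = text

counter = 0
label = Label("count: 0")

def on_increment_imperative():
    global counter
    counter += 1
    label.text = f"count: {counter}"   # the developer updates the UI by hand

# --- Declarative style: the UI is a pure function of the state.
def build_ui(state):
    # One code path describes the UI for *any* state.
    return {"label": f"count: {state}"}

def on_increment_declarative(state):
    new_state = state + 1
    return new_state, build_ui(new_state)  # the framework rebuilds from state
```

The declarative version never touches an existing UI object; it always derives a new description from the state, which is why a single code path suffices, and also why frequent state changes trigger frequent rebuilds.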
Architectural layers
The design of Flutter is based on layering the system. Each layer contains
independent libraries that depend on the layer below. No library has privileged
access to the layer below, and every part of the framework level is designed
to be optional and replaceable.⁷
1. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
2. Flutter - Beautiful native apps in record time, https://flutter.dev/, Accessed 2021-02-08
3. Web support for Flutter, https://flutter.dev/web, Accessed 2021-02-08
4. Desktop support for Flutter, https://flutter.dev/desktop, Accessed 2021-02-08
5. Start thinking declaratively, https://flutter.dev/docs/development/data-and-backend/state-mgmt/declarative, Accessed 2021-02-08
6. Ibid.
7. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
Developers interact with Flutter through the Flutter framework. It provides
a reactive framework with a set of platform, layout, and foundation widgets.
The framework is written in the Dart language, which is also the language used
when developing Flutter applications.⁹
The core of Flutter is the Flutter engine, which is developed in C++. The
engine exposes itself to the Flutter framework through the Dart library dart:ui,
which wraps the engine's C++ code in Dart classes.¹⁰
The engine implements Flutter's core libraries, including animation and
graphics, file and network I/O, accessibility support, plugin architecture, and
a Dart runtime and compile toolchain. It is responsible for rasterizing visuals
whenever a new frame needs to be painted.⁸
8. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
9. Ibid.
10. Ibid.
Framework
Developers will usually interact with Flutter using the framework, which is
composed of a series of layers. The bottom layers are the foundation classes
and building-block services such as animation, painting, and gestures, which
offer common abstractions over the underlying foundation.¹²
Above that is the rendering layer, which provides an abstraction for dealing
with layout. It consists of a tree of renderable objects that can be manipulated
dynamically; if there are any changes, the tree automatically updates the
layout to reflect them.¹³
The widgets layer is an abstraction of the rendering layer. Each rendered
object has a corresponding class in the widgets layer. The Material and
Cupertino packages are two examples built on the widgets layer's components:
components from the Material package implement the Material design language¹⁴,
and components from the Cupertino package implement the iOS design language¹⁵.
The philosophy is that widgets should produce effects appropriate to the
platform conventions, but not adapt automatically when app design choices
are needed¹⁶.¹⁷
The Flutter framework also implements packages with higher-level features
that a developer might need, such as camera and webview support, as well as
platform-agnostic features like characters, HTTP, and animations.¹⁸
11. Ibid.
12. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
13. Ibid.
14. Material Design - https://material.io/
15. Human Interface Guidelines, https://developer.apple.com/design/human-interface-guidelines/ios/overview/themes/, Accessed 2021-02-08
16. Platform specific behaviors and adaptations, https://flutter.dev/docs/resources/platform-adaptations, Accessed 2021-02-14
17. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
18. Ibid.
Widgets
Widgets are the main building blocks of the framework and of an application's
UI; the idea is to use widgets when developing the UI of Flutter
applications¹⁹. Widgets can be either stateless or stateful. Stateless widgets
are immutable, whereas stateful widgets are mutable²⁰ ²¹ ²².
A widget's state is stored in a State object, which consists of the mutable
values of the widget²³. If the widget's state changes, the widget is responsible
for notifying the State object that it has changed²⁴. Thus, when the widget
changes, the framework is notified and rebuilds the UI (see the description of
declarative UI in 2.1.1).
Rendering
The Dart code that paints Flutter's graphics compiles into native code and
uses the open-source 2D graphics library Skia²⁵. Flutter does not use an
abstraction layer over underlying native libraries; it bypasses the platform's
system UI libraries and uses its own widget set. This lets Flutter avoid an
overhead that could be significant and might affect an application's
performance, particularly if there are several interactions between the UI
and the application logic.²⁶
Flutter embeds a copy of Skia as part of the Flutter engine. This lets
developers update their applications and stay up to date with the latest
improvements, even if the platform version does not update. The same goes for
Android, iOS, macOS, or Windows.²⁷
19. Introduction to widgets, https://flutter.dev/docs/development/ui/widgets-intro, Accessed 2021-02-08
20. StatefulWidget class, https://api.flutter.dev/flutter/widgets/StatefulWidget-class.html, Accessed 2021-02-08
21. StatelessWidget class, https://api.flutter.dev/flutter/widgets/StatelessWidget-class.html, Accessed 2021-02-08
22. Adding interactivity to your Flutter app, https://flutter.dev/docs/development/ui/interactive, Accessed 2021-02-08
23. Ibid.
24. StatefulWidget class, https://api.flutter.dev/flutter/widgets/StatefulWidget-class.html, Accessed 2021-02-08
25. Skia Graphics Library - https://skia.org/
26. Flutter architectural overview, https://flutter.dev/docs/resources/architectural-overview, Accessed 2021-02-08
27. Ibid.
Platform embedding
To the underlying operating system of the platform, Flutter applications are
packaged the same way as native applications. The embedders are platform-specific
and written in a language appropriate to the platform: for Android, the
embedder is built using Java and C++, and for iOS using
Objective-C/Objective-C++.²⁸
The platform embedder is the native application that hosts the Flutter content.
It binds together the host operating system and Flutter. When a Flutter app
starts, the platform embedder initializes the Flutter engine, obtains threads
for the UI and rastering, and creates a texture that Flutter can write to. The
embedder is responsible for the application lifecycle, input gestures (mouse,
keyboard, touch, etc.), window sizing, thread management, and platform
messages. Flutter currently includes embedders for Android, iOS, Linux, macOS,
and Windows.²⁹
On Android, Flutter is loaded into the embedder as an Activity. The view is
controlled by a FlutterView, which renders Flutter content as a view or a
texture, depending on the composition and z-ordering of the Flutter content.³⁰
On iOS, Flutter is loaded into the embedder as a UIViewController. The
platform embedder creates a FlutterEngine, which hosts the Dart VM and the
Flutter runtime, and a FlutterViewController. The FlutterViewController
attaches to the FlutterEngine to pass UIKit input events into Flutter and to
display frames rendered by the FlutterEngine using Metal or OpenGL.³¹
They often have the benefit of taking advantage of the device’s various fea-
tures, such as a camera and sensors.
2.2.1 Development
Each platform usually has its development process with its native program-
ming languages. Android applications can be developed by using Kotlin, Java
or C++ [27, 28], while iOS applications can be developed by using Swift or
Objective-C [29].
A native application is often developed with the SDK provided by the platform
vendor. The SDKs include development tools and common user interface
elements, such as buttons, text, and navigation components [27, 29].
This means that the development of native applications is tightly coupled to
its platform, which prevents developers from developing one application for
multiple platforms with reused code. If an application needs to support
multiple platforms, multiple codebases are necessary.
2.3 Model-View-ViewModel
According to Sorensen and Mikailesc [30], developers have used different
design patterns for building applications ever since user interfaces became
necessary. Examples are the Model-View-Controller (MVC) and the
Model-View-Presenter (MVP) patterns. What is presented to the user is the
View, the data visible in the view is the Model, and the Controller or the
Presenter ties the two together.
In 2005, John Gossman presented another design pattern named Model-View-ViewModel
(MVVM). Similar to a Controller or Presenter, the ViewModel glues the Model
and the View together. However, in contrast to a Controller or a Presenter,
the ViewModel does not hold any references to the View. The ViewModel exposes
the data models and objects contained in the view. The responsibilities of
the ViewModel and View thus differ from those in the previous design patterns.
The design pattern is often visualized linearly (see fig. 2.2). [30, 31]
14 CHAPTER 2. BACKGROUND
The MVVM pattern is often visualized linearly to point out the flow of data
and information. The Model is responsible for accessing different data sources
(e.g., databases, files, or servers) and, in general, tends to be thin in an
MVVM implementation. The View represents the data in an appropriate format,
whether graphical or non-graphical, reflecting the state of the data, and it
collects user interactions and events. Similar to the Model, Views in MVVM
are thin and contain minimal code: only the code required to make the View
work and allow user interactions. [31]
In MVVM, most of the code is in the ViewModel. The ViewModel should
represent the view state and is expected to behave according to the view logic,
i.e. the user interactions. The ViewModel handles the communication between
the Model and the View, passing all the necessary data between them in forms
that each can digest. Necessary validation is performed in the ViewModel. [31]
Kouraklis [31] describes this pattern as components working in pairs. The
View is aware of the ViewModel, updates the ViewModel's properties, and
tracks any changes that occur in the ViewModel. As seen in fig. 2.2, the
ViewModel does not hold any references to the View. Similarly, the Model is
not aware of the ViewModel or the View; only the ViewModel has a reference
to the Model. The ViewModel passes events and data to the Model, as pushed
by the View, in forms that the Model can interpret. As the ViewModel holds a
reference to the Model, it tracks any changes in the Model and consequently
pushes the necessary signals to the View.
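The pairwise relationships described above can be sketched in a few lines. The following is an illustrative, framework-free Python sketch (all class and method names are invented for this example): the View subscribes to the ViewModel, the ViewModel references the Model, and the Model knows nothing about the layers above it.

```python
class Model:
    """Thin data layer; knows nothing about the ViewModel or the View."""
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def all(self):
        return list(self._items)


class ViewModel:
    """Holds view state and logic; references the Model, never the View."""
    def __init__(self, model):
        self._model = model
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def add_item(self, item):
        # Called by the View on a user interaction; the data is pushed
        # to the Model in a form the Model can interpret.
        self._model.add(item)
        # Push a signal to whoever subscribed -- no View reference held.
        for callback in self._listeners:
            callback(self._model.all())


class View:
    """Thin presentation layer; aware of the ViewModel only."""
    def __init__(self, view_model):
        self.rendered = []
        view_model.subscribe(self.render)

    def render(self, items):
        self.rendered = items
```

For example, `View(ViewModel(Model()))` wires the three layers together; calling `add_item` on the ViewModel updates the Model and the View re-renders, without the ViewModel ever naming the View class.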
The three mentioned design patterns MVC, MVP, and MVVM are common
design patterns for both Android and iOS development [32, 33]. Lou [32]
concludes that implementing the MVP and MVVM patterns results in a lower
coupling level (i.e. superior testability and modifiability) and lower memory
consumption than the MVC pattern, with no significant performance difference
between the MVP and MVVM patterns.
Aljamea and Alkandari [33] investigate both MVC and MVVM for iOS
development. The authors claim that the MVC pattern tends to produce massive
files containing the code for both the view and the controller, which makes
them challenging to test, reuse, and maintain. Additionally, it increases the
risk of writing errors in the code without recognizing them, as the files are
long and difficult to test. Therefore, the authors conclude their research by
advising developers to use the MVVM pattern, as its separation of concerns
makes the code more testable and modifiable.
2.4 t-Test
According to Quirk [34], two groups are independent of each other whenever
the groups consist of different people: no person can be in both groups, and
every person produces only one numerical value in the statistical measurements.
When studying independent groups, it is often required to statistically test
the central tendencies (mean or median) of the two groups, to determine whether
they are significantly different from each other or not [35].
In 2006, Ruxton [35] evaluated 130 research papers and found that mainly three
different tests were used to find significant differences between two groups:
the Student's t-test, the Mann–Whitney U test, and the t-test for unequal
variances. Among the papers evaluated, the majority (47 out of 71) had a
sample size below 30 participants.
The unequal variance t-test is an adaptation of the Student's t-test: the
Student's t-test assumes equal variances, whereas the unequal variance t-test
does not. As the name implies, it is based on the assumption that the variances
are unequal and/or that the sample sizes are unequal, which makes it produce
more reliable results than the Student's t-test under those conditions. In
practice, unequal sample sizes are the more likely case. Therefore, Ruxton [35]
suggests that the t-test with unequal variances should always be used in
preference to the Student's t-test or the Mann–Whitney U test.
The t-test with unequal variances is formulated as:

t = (X̄₁ − X̄₂) / √(s₁²/N₁ + s₂²/N₂)    (2.1)

where X̄ⱼ, sⱼ, and Nⱼ are the jth sample mean, sample standard deviation, and
sample size, respectively, for j ∈ {1, 2}. If the p-value associated with the
t statistic is below 0.05, there is a significant difference between the groups
with 95% confidence [36].
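Equation 2.1 can be computed directly from the two samples. Below is a plain-Python sketch of the unequal variance (Welch) t statistic; note that deciding significance additionally requires the p-value for this statistic, which library routines such as `scipy.stats.ttest_ind(a, b, equal_var=False)` report.

```python
import math

def welch_t(sample1, sample2):
    """t statistic of Eq. 2.1 for two independent samples with
    (possibly) unequal variances and unequal sample sizes."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Unbiased sample variances (divide by N - 1).
    var1 = sum((x - mean1) ** 2 for x in sample1) / (n1 - 1)
    var2 = sum((x - mean2) ** 2 for x in sample2) / (n2 - 1)
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)
```

Unlike the Student's t-test, no pooled variance is used: each sample's variance is divided by its own sample size, so neither equal variances nor equal group sizes are assumed.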
16 CHAPTER 2. BACKGROUND
UEQ scales
• Attractiveness - Overall impression of the product. Do users like or
dislike the product?
• Perspicuity - Is it easy to get familiar with the product? Is it easy to
learn how to use the product?
• Efficiency - Can users solve their tasks without unnecessary effort?
• Dependability - Does the user feel in control of the interaction?
• Stimulation - Is it exciting and motivating to use the product?
• Novelty - Is the product innovative and creative? Does the product catch
the interest of users?
The UEQ consists of multiple scale items. Each scale item is a question about
the product which the participant should answer or leave blank. The questions
are predefined by the creators of the UEQ; the scale items are not replaceable
with custom questions. The UEQ and its scale items are available in multiple
languages on the UEQ's official website³². Each scale item ranges from −3 to
3, with the middle value 0 considered neutral. One item regarding the
attractiveness of a product looks like this:
attractive ◦ ◦ ◦ ◦ ◦ ◦ ◦ unattractive
Participants in a study evaluating a product should answer the UEQ immediately
after using the product. If participants fill out the UEQ after discussions
about the product, the discussions may influence the results; the goal of the
UEQ is to catch a user's immediate impression of the product, so discussion
about the product should come afterward. [37]
More data collected using the UEQ gives more stable scale means, and thus
more accurate conclusions can be drawn from the data. There is no defined
minimum number of participants for the UEQ. However, according to Schrepp [37],
product evaluations tend to provide reliable results with 20-30 participants.
The reliability of the data also depends on the participants' level of
agreement: the more the participants agree, the lower the standard deviation
of the answers to the items, and the less data is needed for reliable results.
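The relationship between agreement and data reliability can be illustrated numerically. The sketch below uses hypothetical response data (not taken from the study) to compute the mean and sample standard deviation of a set of item scores on the −3..3 scale; the group that agrees more yields the lower standard deviation.

```python
import math

def scale_stats(responses):
    """Mean and sample standard deviation of UEQ item scores (-3..3)."""
    n = len(responses)
    mean = sum(responses) / n
    variance = sum((x - mean) ** 2 for x in responses) / (n - 1)
    return mean, math.sqrt(variance)

# Hypothetical answers from two groups of five participants.
high_agreement = [2, 2, 2, 3, 2]     # participants largely agree
low_agreement = [-3, 3, -2, 3, -1]   # participants disagree strongly
```

Running `scale_stats` on both lists shows the high-agreement group has a much smaller standard deviation, which is exactly why fewer participants suffice when answers cluster tightly.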
The data gathered from the UEQ can be analyzed using an Excel tool. The tool
is provided by the creators of the UEQ and is available on their website³³. It aims
32. User Experience Questionnaire - https://www.ueq-online.org/
33. User Experience Questionnaire - https://www.ueq-online.org/
to make the analysis of the UEQ data as easy as possible. The creators also
provide a tool designed for comparing two different products that have been
evaluated using two separate UEQs.
The comparison tool produces a graph that can be used to check whether the
results differ significantly. It calculates the mean values of the UEQ scales
with a 95% confidence interval, represented by error bars in the graph.
If it were possible to repeat the evaluation indefinitely under the same
conditions, random influences would still affect the results. With a 95%
confidence interval, the interval is the one in which 95% of the scale means
of such hypothetical repetitions would be located. Thus, the confidence
interval shows how accurate the measurements are: if the confidence interval
is relatively large, the measurements are inaccurate.
The comparison tool also conducts a two-sample t-test with a selected α value
of 0.05. If the p-value of the t-test is below α = 0.05, there is a significant
difference between the two versions. This can also be seen in the graph
produced by the tool using the error bars: if the error bars overlap, there is
no significant difference between the versions.
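As a rough illustration of what the comparison graph encodes, the sketch below builds an approximate 95% confidence interval for a scale mean (using the normal critical value 1.96, an assumption for large enough samples) and checks whether two error bars overlap. The function names are invented for this example; the actual UEQ Excel tool may differ in detail.

```python
import math

def conf_interval_95(responses):
    """Approximate 95% confidence interval for a scale mean,
    using the normal critical value 1.96 (large-sample assumption)."""
    n = len(responses)
    mean = sum(responses) / n
    variance = sum((x - mean) ** 2 for x in responses) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

def error_bars_overlap(interval_a, interval_b):
    """True when two (low, high) error bars overlap, which the thesis
    reads as 'no significant difference' in the comparison graph."""
    return interval_a[0] <= interval_b[1] and interval_b[0] <= interval_a[1]
```

A wide interval (large half-width) signals inaccurate measurements, matching the description above: more spread or fewer participants widens the error bars.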
Chapter 3
Method
3.1.1 Complexity
In previous studies (see 1.6.1), the complexity varied as different
functionalities were implemented. Both Abrahamsson and Berntsen [19] and
Fredrikson [18] developed applications that did not use any external APIs and
tions. Therefore, the aim was to use the MVVM pattern when developing the
sample applications for this thesis.
The state diagram reflects the user flow and the different states, which are
identical for all of the developed sample applications. When the user starts
the app, the trending view is shown immediately and trending movies are
fetched from the TMDB API. After startup, the user can switch between the
trending view and the search view at any time; the outer state is thus always
the Trending or Search state. When the user selects a movie, either from the
trending list or from a search result, the inner state changes to the Movie
state. The user can return to the previous state from the Movie state.
Detailed explanation of each state

• Trending

– Fetching: Fetches trending movies immediately after startup
using the TMDB API endpoint
api.themoviedb.org/3/trending/movie/day?page=<PAGE>.
Fetches additional movies when the user reaches the end of the
list (pagination). Movies already loaded remain visible to the user.

– List: The loaded trending movie list is visible to the user.

• Search

– Searching: Searching for the desired movie using
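The outer and inner states described above can be sketched as a minimal state machine. The class and method names here are illustrative only, not taken from the actual implementations:

```python
from enum import Enum, auto

class State(Enum):
    TRENDING = auto()  # outer state: trending list, shown at startup
    SEARCH = auto()    # outer state: search view
    MOVIE = auto()     # inner state: movie details

class AppFlow:
    """Sketch of the user flow: outer Trending/Search states, inner Movie state."""

    def __init__(self):
        self.state = State.TRENDING  # trending view shown immediately at startup
        self._previous = None

    def switch_outer(self, target):
        # The user can switch between Trending and Search at any time
        assert target in (State.TRENDING, State.SEARCH)
        self.state = target

    def select_movie(self):
        # Selecting a movie from either list enters the inner Movie state
        self._previous = self.state
        self.state = State.MOVIE

    def back(self):
        # From Movie, the user returns to the previous outer state
        if self.state is State.MOVIE:
            self.state = self._previous
```

The sketch captures the invariant stated above: the outer state is always Trending or Search, and Movie is only ever entered from, and returned to, one of those two states.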
3.2.1 Size
Measurement of the installer was primarily done by measuring the size of the
installer file. Thus, for Android installer .aab, and likewise for the iOS installer
.ipa the size of the file was measured.
For measuring the application size, i.e. how much disk space the installed
sample applications needed, the built-in platform features were used. On both
Android and iOS, the application size can be found in the settings menu.
Since this information is also available to the user, the measurement reflects
a user-perceived application size.
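The installer-size measurement amounts to reading the size of the built artifact. A small sketch (the file paths are placeholders, assuming default build output locations):

```python
import os

def installer_size_mb(path):
    """Return the size of an installer file (.aab or .ipa) in megabytes."""
    return os.path.getsize(path) / 1_000_000

# Example with placeholder paths:
# installer_size_mb("build/app/outputs/bundle/release/app-release.aab")
# installer_size_mb("build/ios/ipa/app.ipa")
```

Decimal megabytes (1 MB = 1 000 000 bytes) are used here, matching the units reported in table 4.1.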
give reliable results (see more in 2.5) for one product. Thus, it would need
80-120 participants, as there are in total four applications being tested.
However, the study by Hansson and Vidhall [12], which used the same method
for evaluating the cross-platform framework React Native, obtained conclusive
results from 32 participants testing four different applications. Fredrikson [18]
conducted a similar study using the same UEQ method and found conclusive
results with 30 participants testing three different applications.
The UEQ also provided comparison tools for calculating statistically significant
differences between two products. The statistical difference was calculated
using a t-test with unequal variances, which has been the recommended
approach by Ruxton [35] (see 2.4) since 2006. A t-test with unequal variances
is more likely to produce reliable results when evaluating differences between
two independent groups than similar methods. The same study found that the
majority of studies evaluating significant differences between two independent
groups had fewer than 30 participants in their tests.
Based on the findings of Ruxton [35] and the previous studies which used the
same method, by Fredrikson [18] and by Hansson and Vidhall [12], the aim was
to get around 20-30 participants for each comparison, resulting in a total of
40-60 participants. Applying the t-test with unequal variances to the UEQ
results would then indicate whether there were any statistically significant
differences between the applications.
application. Participants were not made aware of which version of the sample
application they tested.
For convenience, and to encourage more individuals to participate in the study,
face-to-face tests were performed. This means an Android or iOS device was
handed to a participant, who then performed the tasks on it. Immediately after
the tasks were performed, they were asked to answer the UEQ. Like the
participants who tested the applications remotely, these participants were not
aware of which version of the sample application they tested. The face-to-face
tests were conducted without giving any additional information that was not
available to the participants who tested the applications remotely on their
own devices.
User tasks
The users’ tasks described below, were chosen to ensure the participants in
both groups tested the application at a minimum. No matter how much time
each participant spent testing the app, participants from both groups had tested
all of the functionalities of the application.
1. View details about any trending movie.

2. Search for any movie. If the search results

(a) include the desired movie → view details about the movie.

(b) do not include the desired movie → search for another movie and
repeat the task until the desired movie is found.
Dataset
When gathering data for the user perception study, the participants were first
asked what their default platform was. If it was Android, they tested a sample
application for the Android platform. Respectively for iOS users, they tested
one of the iOS sample applications.
This resulted in 23 Android users and 23 iOS users, a total of 46 participants
in the study. Of the Android users, 11 tested the Flutter Android application
and 12 tested the native Android application. Of the iOS users, 11 tested the
Flutter iOS application and 12 tested the native iOS application.
There was no specific target group of participants. The only requirement was
that they used an Android or iOS device regularly. This was done to lower the
threshold for recruiting participants to the study; the sample applications
could also be used by anyone.
This led to a group of individuals with different backgrounds, ages, and
levels of technical knowledge. The application was deemed to have such
limited functionality that none of these factors would have had any
significant effect on the study.
Chapter 4
Result
This chapter presents the results of the research conducted for the thesis. It
starts with the results regarding user-perceived performance, with one section
each for application size and startup time. This is followed by the results of
the user perception study conducted using the UEQ method.
          .aab       .ipa
Native    5.4 MB     3.9 MB
Flutter   22.5 MB    76.1 MB

Table 4.1: Application package sizes of the developed sample applications.
The installed application size differs from the application package size, since
different install files are generated for different devices. This means the
installed application sizes also differ between devices. The results for the
installed applications on two test devices are shown in table 4.2.
On Android, the installed Flutter application was 30.33 MB larger than the
native application on a Samsung Galaxy S9 running Android 10. On iOS, the
installed Flutter application was 28.7 MB larger than the native application on
an iPhone XS running iOS 14.3. This corresponds to the Flutter application
being approximately seven times larger than the native Android sample
application and approximately nine times larger than the native iOS sample
application.
Android

The native Android application was tested on seven different devices, and the
Flutter application was tested on another six Android devices, in total 13
unique devices, all of different models. The startup median of the
Flutter-developed application was approximately 130% longer than that of the
native application. Thus, the native Android application results in better
user-perceived performance in regard to startup time.
The slowest startup time was 826 ms on a Samsung Galaxy S8 running Android 9
for the native application, and 936 ms on a Samsung Galaxy S9 running Android
10 for the Flutter application. The fastest startup time was 77 ms on a OnePlus
device running Android 9 for the native application, and 128 ms on a Samsung
Galaxy S20 5G running Android 10 for the Flutter application.
The devices which resulted in the lowest startup time were also the oldest de-
vices used in the study. Samsung Galaxy S20 5G was the newest device used
in the test. The model of the OnePlus which measured 77 ms for startup time
could not be identified from the monitoring tool.
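The relative difference reported above is computed from the startup medians. As a sketch, with hypothetical sample values since the raw per-device measurements were not published:

```python
from statistics import median

def percent_longer(flutter_times_ms, native_times_ms):
    """How much longer (in %) the Flutter startup median is than the native one."""
    f, n = median(flutter_times_ms), median(native_times_ms)
    return (f - n) / n * 100

# Hypothetical illustration: a native median of 200 ms and a Flutter
# median of 460 ms corresponds to a 130% longer startup time.
```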
iOS

The native iOS application was tested on five different devices, and the
Flutter application was tested on another four devices, in total nine devices.
However, there were only three different models. The startup median of the
Flutter-developed application was approximately 230% longer than that of the
native application. Thus, the native iOS application results in better
user-perceived performance in regard to startup time.
The slowest startup time was 716 ms on an iPhone XR running iOS 14.2 for the
native application, and 3.97 s on an iPhone 6 Plus running iOS 12.5 for the
Flutter application. The fastest startup time was 156 ms on an iPhone XS
running iOS 14.3 for the native application, and 105 ms on an iPhone XS
running iOS 14.3 for the Flutter application.
The iPhone 6 Plus was the oldest device and ran the oldest iOS version among
the tested devices, and it resulted in the longest startup time for the Flutter
application. The iPhone XS was among the newest devices and ran the newest iOS
version among the test devices, and it resulted in the fastest startup time for
both the native application and the Flutter application.
4.2.1 Android
The UEQ scale means for the developed Android sample applications can be seen
in figure 4.1. The native Android application scored higher on most UEQ scales.
However, no statistically significant differences can be seen in the results of
the t-test with unequal variances in figure 4.5. Furthermore, in figure 4.1 all
of the error bars overlap, which further indicates that there are no
significant differences between the two applications.
Figure 4.1: Comparison of UEQ scale means for the Android applications.
4.2.2 iOS
The UEQ scale means for the developed iOS sample applications can be seen in
figure 4.2. The Flutter application scored higher on all UEQ scales. As for
the Android platform, there is no statistically significant difference between
the two iOS applications: the results of the t-test with unequal variances
(see table 4.6) show no significant differences, and all of the error bars
overlap.
Figure 4.2: Comparison of UEQ scale means for the iOS applications.
Chapter 5
Discussion
This chapter presents the discussion of the results and findings from the
research for the thesis. It starts by discussing the measurements and results
of the user-perceived performance research, followed by the discussion and
analysis of the user perception study. Lastly, there are sections regarding
sources of error, sustainability, and ethical aspects.
5.1.2 Size
This thesis found that, for both the Android and iOS platforms, the developed
Flutter application was significantly larger than its native counterpart. It is
assumed this is due to necessary Flutter overhead: Flutter bypasses the
platform's UI library and has its own widget system, and therefore it needs to
embed that widget system in the application.
No official documentation was found on whether a built Flutter application
contains all of the widgets or only those necessary for the application. There
is a risk that redundant widgets are included, which could be one reason why
the Flutter applications were significantly larger. However, it is certain
that Flutter includes a copy of the Skia graphics library (see section 2.1.1),
which thereby increases the application size.
Having an embedded widget system and including a copy of Skia might be
detrimental to the application size, but it has benefits as well: Flutter does
not need the abstraction layer that similar cross-platform technologies (see
section 2.1.1) require.
Furthermore, this allows Flutter to be developed without adapting to changes on
each platform, which can benefit both developers and the maintainers of
Flutter. When a platform updates its version and its native UI libraries,
existing Flutter applications are not affected. This benefits developers, as
they only need to adapt to changes in Flutter rather than tracking changes on
each platform. For the maintainers of Flutter, it means not having to adapt to
every new platform release, which likely leaves less time spent on keeping
Flutter working and more time on improving the existing technology, whereas
maintainers of similar technologies might need to spend time adapting their
solution to each new version of every supported platform.
One observation that can be considered is that Flutter-developed applications
seem to take longer to start than both native applications and similar
cross-platform solutions. In prior studies, the startup time is quite similar
between native and cross-platform developed applications [17]. A note of
caution is due here, since there was a significant difference between the
lower and upper bounds of the measured startup times. To increase the
confidence of this conclusion, the startup time would need to be measured on
more devices, and would thus require more participants in the user study.
Another major source of uncertainty is the method used to measure the startup
time, as it does not distinguish between cold and hot startup times. Cold
startup time is the time taken when the application is started from scratch,
where the system's process has not previously been started. Hot startup time
is the time taken when the system's process has already been started. However,
if a significant number of users had tested the application, these differences
would likely not affect the end result. If a vast amount of quantitative
startup time data were available, it would be sufficient to draw conclusions
based on the median measured by Firebase Performance Monitoring. The small
size of the dataset indicates the tool was not an appropriate choice for this
research. In conclusion, it can thus be suggested that Flutter-developed
applications seem to take longer to start than native applications, but the
uncertainty is high.
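One way to separate cold from hot startups on Android is the `adb shell am start -W` command, whose report includes a `TotalTime` field and, on recent Android versions, a `LaunchState` field (COLD, WARM, or HOT). A hedged parsing sketch; the package name and sample output below are illustrative:

```python
import re

def parse_am_start(output):
    """Extract launch state and total startup time (ms) from `am start -W` output."""
    state = re.search(r"LaunchState:\s*(\w+)", output)
    total = re.search(r"TotalTime:\s*(\d+)", output)
    return (state.group(1) if state else None,
            int(total.group(1)) if total else None)

sample = """Starting: Intent { cmp=com.example.movies/.MainActivity }
Status: ok
LaunchState: COLD
TotalTime: 612
WaitTime: 630
Complete"""
# parse_am_start(sample) -> ("COLD", 612)
```

Force-stopping the app before the launch (`adb shell am force-stop <package>`) ensures the measurement is a cold startup rather than a hot one.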
for all of the UEQ scales for both the Android and iOS platforms. The t-test
with unequal variances further validates statistically that there are no
significant differences between the developed applications, implying that the
users have no preference between native and Flutter applications.
These findings suggest that it is possible to achieve the same look and feel as
native applications. This may be explained by the fact that Flutter's
adaptation philosophy is to produce the appropriate effect of the platform
conventions but not to adapt automatically to new changes in the platform (see
section 2.1.1), i.e. Flutter aims to mimic native application conventions. It
can thus be suggested that the results of the conducted user perception study
are only valid as long as this adaptation philosophy stays consistent and up
to date with the platform conventions. If the platform conventions change due
to an updated version and Flutter is not updated to match the newer
conventions, there would be a discrepancy between Flutter-developed
applications and native applications. Consequently, this would affect users,
as they might be accustomed to and expect a certain behavior while using their
device, and thus perceive the Flutter application differently, for better or
worse.
5.2.1 Method
It was harder than expected to convince people to participate in the user
perception study. For many individuals, downloading and installing an
application on their device, testing it, and answering a questionnaire was too
much of a hurdle. Hence, the method might have been the wrong one.
All of this also had to be done remotely due to the COVID-19 pandemic in
Sweden, which made it difficult to understand the participants. Did they
understand the instructions, did they understand the application, or did they
need more initial guidance? Also, did this affect their perception? The UEQ is
still most likely the right tool for the job, but it should probably not have
been conducted remotely.
The features chosen for the sample applications are features present in
production applications, such as searching for items, displaying data in a
scrollable list, and viewing further details about an item. The functionalities
chosen for the sample applications are also present in production applications:
fetching data from an external API, parsing JSON data, and using an endpoint
with pagination. This was deemed to mimic a production application
sufficiently for concluding whether a Flutter-developed application is a viable
alternative to a native application from the user's perspective.
The aim was to reduce the disparities between the sample applications and
production applications, which does not mean that no disparities exist between
them. Consequently, this does affect the results of the thesis. If
applications with other features and functionalities had been chosen, the
results might have been different: it could affect the application size, the
startup time, and the user's perception.
However, the outcome and findings of the user-perceived performance research
would most likely be the same. One of the main findings, and the assumed
biggest factor behind the inferior user-perceived performance, is the built-in
overhead of the Flutter engine. This overhead is a fundamental, constant
building block: no matter which features and functionalities are chosen when
developing a Flutter application, it remains the same. The same issues and
difficulties arising from Flutter's inner workings would arise for any
Flutter-developed application.
Similarly, it would doubtfully change the outcome and findings of the user
perception study, provided that the applications are limited to relatively
simple features and functionalities, as in this study.
If the same research had been conducted on applications dependent on
functionalities only available through native libraries, it would most likely
affect the user perception. Whether in favor of or against the Flutter
application cannot be said until it has been tested. Furthermore, none of the
features were deemed resource-intensive, which makes it difficult to say what
the effects on user perception would be if resource-intensive features were
necessary.
Chapter 6
Conclusion
This thesis aimed to answer the question "can a Flutter-developed application
be a viable alternative to a native application from the user's perspective?".
Based on the results, a few conclusions can be drawn.
If the size of the application is vital for the users, a Flutter application
is most likely not suitable. However, if it is of less importance, a Flutter
application might be a viable alternative to a native application from a
user's perspective.
The user-centered perspective was broken down into two smaller focus areas,
user-perceived performance and user perception. In both aspects, the native
applications seemed to perform better and leave a better impression on the
users.
In terms of performance from a user's perspective, Flutter-developed
applications had a larger application size. The results also indicate that the
Flutter applications take longer to start, although this is uncertain due to
the significant variance in the measured startup times. One of the main
reasons for the inferior user-perceived performance may be Flutter's internal
workings and its built-in overhead.
For user perception, a study was conducted using the User Experience
Questionnaire (UEQ). The aim of the study was to understand user perception
and to see whether there were any significant differences between native and
Flutter-developed applications. The results seem to suggest that there is no
significant difference between a Flutter-developed application and a native
application. There is high uncertainty here, due to the small number of
participants in the study. More participants would likely increase the
confidence of the conclusion; less likely, more participants might change the
outcome, thus indicating a significant difference between the applications.
However, the performed t-tests seem to indicate that there would not be any
significant difference.
Chapter 7
Future work
Bibliography
[1] Margaret Butler. “Android: Changing the mobile landscape”. In: IEEE Pervasive Computing 10.1 (2010), pp. 4–7.
[2] Aijaz Ahmad Sheikh et al. “Smartphone: Android Vs IOS”. In: The SIJ Transactions on Computer Science Engineering & its Applications (CSEA) 1.4 (2013), pp. 141–148.
[3] Markus Pierer. “Mobile platform support”. In: Mobile Device Management. Springer, 2016, pp. 37–42.
[4] Kristiina Rahkema and Dietmar Pfahl. “Empirical study on code smells in iOS applications”. In: Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems. 2020, pp. 61–65.
[5] Iván Tactuk Mercado, Nuthan Munaiah, and Andrew Meneely. “The impact of cross-platform development approaches for mobile applications from the user’s perspective”. In: Proceedings of the International Workshop on App Market Analytics. 2016, pp. 43–49.
[6] Henning Heitkötter, Sebastian Hanschke, and Tim A Majchrzak. “Evaluating cross-platform development approaches for mobile applications”. In: International Conference on Web Information Systems and Technologies. Springer, 2012, pp. 120–138.
[7] Manuel Palmieri, Inderjeet Singh, and Antonio Cicchetti. “Comparison of cross-platform mobile development tools”. In: 2012 16th International Conference on Intelligence in Next Generation Networks. IEEE, 2012, pp. 179–186.
[8] Spyros Xanthopoulos and Stelios Xinogalos. “A comparative analysis of cross-platform development approaches for mobile applications”. In: Proceedings of the 6th Balkan Conference in Informatics. 2013, pp. 213–220.
Swedish UEQ items used in the study (each word pair rated on a 7-point scale, 1–7; English UEQ terms in parentheses):

 1. Irriterande – Njutbar (annoying – enjoyable)
 2. Obegriplig – Begriplig (not understandable – understandable)
 3. Kreativ – Tråkig (creative – dull)
 5. Värdefull – Värdelös (valuable – inferior)
 6. Tråkig – Spännande (boring – exciting)
 7. Ointressant – Intressant (not interesting – interesting)
 8. Oförutsägbar – Förutsägbar (unpredictable – predictable)
 9. Snabb – Långsam (fast – slow)
10. Uppfinningsrik – Fantasilös (inventive – conventional)
11. Hindrande – Stödjande (obstructive – supportive)
12. Bra – Dålig (good – bad)
13. Komplicerad – Enkel (complicated – easy)
15. Bakåtsträvande – I framkant (usual – leading edge)
18. Motiverande – Omotiverande (motivating – demotivating)
20. Ineffektiv – Effektiv (inefficient – efficient)
21. Tydlig – Förvirrande (clear – confusing)
22. Opraktisk – Praktisk (impractical – practical)
23. Strukturerad – Rörig (organized – cluttered)
24. Estetisk – Oestetisk (attractive – unattractive)
26. Konservativ – Innovativ (conservative – innovative)
TRITA-EECS-EX-2021:756
www.kth.se