MeeGo 1.2 Harmattan Developer Documentation Develop for the Nokia N9

Designing for performance

When developing applications, consider performance issues at the very beginning, because the design is usually the main factor behind the performance of the final software. If poor performance is caused by design issues, you cannot improve the performance of your application by micro-optimising the code. Design may cause problems in the following ways, for example:

  • poorly chosen central algorithms
  • inefficient architecture
  • expensive UI design that causes heavy CPU or memory utilisation

Performance problems caused by the design often go unnoticed until late in the project. Once several components are already working together, changes in the design are usually too expensive to carry out.

Building prototypes is usually the best way to test the design. Build the very first prototypes as early as possible to prove the initial design concepts. If it seems that some requirements cannot be fulfilled with the initial design, you can redesign the application when it is still affordable.

Note: Verify performance at all levels, and especially in prototypes, using realistic amounts of data. Otherwise crucial design decisions can go wrong.

Prototypes must provide enough performance statistics for verifying that the required performance level can be reached. In addition to data handling speed, the resource consumption of the prototypes must be realistic enough. Even though production quality cannot be expected since most of the functionality does not typically exist in prototypes, you can prove the design concepts based on the main functions. However, never assume that you can safely use 100 % of any type of system resources (CPU, RAM, IO), since there are some critical tasks that must be executable at any time!

Note: Minimise direct and indirect file accesses and usage of resources (such as RAM and textures).

Improving performance during the design phase

Estimate the speed of your application in advance by considering the following:

  • What is the true, actually measured hardware performance? How fast does it perform certain duties? This covers low-level issues such as speed of copying pieces of memory around the Harmattan device, doing different types of calculations, and basic IO.
  • What are the requirements for the software? What is the software exactly supposed to do? For example, how many pieces of data and how much memory is needed to move around and how fast does it happen?

Based on this information, you can determine what is achievable and what is not. Note, however, that evaluations based on software requirements and hardware expectations only let you estimate the performance of simple use cases in which just some parts of the hardware are taxed at once.

Performance capabilities of more complex use cases in which the final application executes high-level scenarios can only be estimated properly by prototyping with the actual (or as close to actual as possible) hardware.
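As an illustration of measuring low-level hardware performance, the following minimal sketch times repeated memory copies with the C++ standard library. The function name measureCopyThroughputMBps and its parameters are hypothetical and not part of any Harmattan API. The result is only meaningful when the code runs on the target device, and a compiler may still optimise parts of such a loop, so treat the figure as a rough estimate.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `bytes` bytes `rounds` times and returns the
// measured throughput in megabytes per second (0.0 for invalid input).
double measureCopyThroughputMBps(std::size_t bytes, int rounds)
{
    if (bytes == 0 || rounds <= 0)
        return 0.0;

    std::vector<unsigned char> src(bytes, 0x5A);
    std::vector<unsigned char> dst(bytes, 0);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < rounds; ++i) {
        std::memcpy(dst.data(), src.data(), bytes);
        // Feed the result back into the source so that the copy is less
        // likely to be removed by the optimiser.
        src[0] = static_cast<unsigned char>(dst[bytes - 1] + i);
    }
    const auto end = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(end - start).count();
    const double megabytes =
        static_cast<double>(bytes) * rounds / (1024.0 * 1024.0);
    return seconds > 0.0 ? megabytes / seconds : 0.0;
}
```

Dividing the amount of data that a use case must move by the measured throughput gives a lower bound for the time that use case needs.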

Designing testable software

If the design of an application is clean, well-documented and easy to understand, software developers and testers can quickly learn and understand the design and thus make good use of this knowledge in their work. This also makes the software easier to develop and test.

Take testing into account when writing the code of your application: add enough logging and refactor the code so that its logic and execution flow are easy to understand. If the code is easy to understand, it is also easier to modify when needed. In general, planning the testing while you design and write the code can save a lot of time in the long run. Test-driven development (TDD) illustrates how you can start your development by creating the test cases before the actual code is written.

Minimising active working time

Make sure that tasks are done as quickly as possible (in other words, use optimal algorithms) and when the task is finished, the application does not do anything before the user (or some other external party) requires the application to react. If the application wakes up periodically just to check if there is something to do, it diminishes the battery life.
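The difference between waking up periodically and waiting for an external event can be sketched with the C++ standard library as follows. This is not a Harmattan or Qt API: the WorkQueue class and its method names are hypothetical. The consumer blocks inside waitAndPop() without using the CPU until a producer calls push(), instead of checking the queue in a timed loop.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical queue: consumers sleep until work arrives, with no polling.
class WorkQueue {
public:
    // Adds an item and wakes up one waiting consumer.
    void push(int item)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push(item);
        }
        m_ready.notify_one();
    }

    // Blocks until an item is available, then removes and returns it.
    // While blocked, the calling thread does not consume CPU time.
    int waitAndPop()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ready.wait(lock, [this] { return !m_items.empty(); });
        const int item = m_items.front();
        m_items.pop();
        return item;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<int> m_items;
};
```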

In general, keep the amount of work required to carry out a specific task to a minimum. For example, if an object needs to be drawn to the screen, update only those parts of the screen that need to be updated. Each additional pixel costs four (4) bytes of memory and memory handling requires activities from many parts of the device.
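To make the cost of partial updates concrete, the following sketch compares the memory touched by a full-screen redraw with that of a small dirty region. It assumes four bytes per pixel, as stated above, and an 854 x 480 screen. The Rect type and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical rectangle type, in pixels.
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Returns the overlap of two rectangles (zero-sized if they do not overlap).
Rect intersect(const Rect& a, const Rect& b)
{
    const int left = std::max(a.x, b.x);
    const int top = std::max(a.y, b.y);
    const int right = std::min(a.x + a.width, b.x + b.width);
    const int bottom = std::min(a.y + a.height, b.y + b.height);
    return Rect{left, top, std::max(0, right - left), std::max(0, bottom - top)};
}

// Memory touched when redrawing the rectangle, assuming 4 bytes per pixel.
std::size_t bytesToRedraw(const Rect& r)
{
    return static_cast<std::size_t>(r.width) *
           static_cast<std::size_t>(r.height) * 4;
}
```

With these assumptions, a full-screen redraw touches 854 x 480 x 4 = 1 639 680 bytes, whereas a 100 x 40 pixel region touches only 16 000 bytes.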

To perform tasks required by the user as efficiently as possible, certain activities can be carried out in advance. For instance, your application can prerender a set of items that a list object will display, instead of rendering them only when the list is accessed.

Even if not all of the results are used, doing the work in advance prevents unnecessary delays later on. For instance, even if only a few entries of a precalculated sine table are accessed, the increased access speed usually justifies the additional effort.
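A precalculated sine table could look like the following minimal sketch. The SineTable class is hypothetical, and the one-degree resolution is an assumption chosen for brevity. The table costs 360 doubles of memory in exchange for replacing each sine calculation with an array lookup.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical lookup table with one entry per whole degree.
class SineTable {
public:
    // All 360 values are calculated once, when the table is created.
    SineTable()
    {
        const double pi = std::acos(-1.0);
        for (std::size_t deg = 0; deg < m_values.size(); ++deg)
            m_values[deg] = std::sin(static_cast<double>(deg) * pi / 180.0);
    }

    // Returns sin(degrees) from the table; any integer angle is accepted.
    double sinDegrees(int degrees) const
    {
        int d = degrees % 360;
        if (d < 0)
            d += 360;  // map negative angles into the range 0..359
        return m_values[static_cast<std::size_t>(d)];
    }

private:
    std::array<double, 360> m_values;
};
```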

Note: Caching is a compromise between memory usage and performance. Too high memory usage can also reduce performance.

Choosing algorithms and data structures

If the input size is fairly large (at least several dozen items), the algorithmic efficiency usually determines the performance. If the execution time of the chosen algorithm grows very rapidly with the input size, for example as O(n²) or faster, there is not much point in tuning the code. Micro-optimisation efforts only make sense if the algorithmic efficiency is good enough.

Try out different options before you start coding. In general, performance requirements can be used to determine the required speed and amount of data managed by the software on average and in the worst case. Based on this information, you can select the algorithms and data structures. Consider tradeoffs between time and space (CPU usage and memory usage), and possibly between continuous data throughput and latency to handle a single piece of data. If the chosen algorithms and data structures are not suitable, optimisation efforts at code level do not usually solve the situation.
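The following sketch shows how the choice of algorithm and data structure changes the cost of one small task, finding out whether a list contains duplicates. It uses only the C++ standard library, and the function names are hypothetical. The first version compares every pair and takes O(n²) time. The second sorts a copy and takes O(n log n) time. The third trades extra memory for roughly linear time on average.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// O(n^2) time, no extra memory: compares every pair of items.
bool hasDuplicateQuadratic(const std::vector<int>& items)
{
    for (std::size_t i = 0; i < items.size(); ++i)
        for (std::size_t j = i + 1; j < items.size(); ++j)
            if (items[i] == items[j])
                return true;
    return false;
}

// O(n log n) time: sorts a copy so that equal items become neighbours.
bool hasDuplicateSorted(std::vector<int> items)
{
    std::sort(items.begin(), items.end());
    return std::adjacent_find(items.begin(), items.end()) != items.end();
}

// Roughly O(n) time on average, at the cost of an additional hash set.
bool hasDuplicateHashed(const std::vector<int>& items)
{
    std::unordered_set<int> seen;
    for (int item : items)
        if (!seen.insert(item).second)
            return true;  // insertion failed: the item was already seen
    return false;
}
```

For a handful of items the quadratic version is perfectly adequate. The difference only matters as the input grows, which is why realistic data amounts are needed when comparing the options.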

For coding hints and tips with Qt containers, see the Qt documentation online.

Avoiding messaging between processes

Having several process instances (or threads) that need to communicate with one another often complicates matters. Message passing between separate processes usually slows down the application logic as well due to additional interrupts, network delays and possible synchronisation issues.

Since high-level messaging systems add considerable overhead, avoid D-Bus where you can. Instead of D-Bus messaging, consider the following options:

  • If you need to transfer small amounts of data between multiple processes, use sockets. They provide a simple and efficient way to communicate between processes.
  • If you need to transfer large amounts of data between multiple processes, use shared memory. It provides an efficient way to transfer complex or large data structures between processes.
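As an illustration of the socket option, the following minimal sketch passes a small message from a child process to its parent over a Unix domain socket pair, using standard POSIX calls. The function name sendThroughSocketPair is hypothetical, and error handling is reduced to returning an empty string. A real application would normally keep the socket open and integrate it into its event loop.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <string>

// Hypothetical helper: a child process sends `message`, the parent returns
// what it received. Returns an empty string if the setup fails.
std::string sendThroughSocketPair(const std::string& message)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return std::string();

    const pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }

    if (child == 0) {
        // Child process: write the whole message, then exit immediately.
        close(fds[0]);
        const char* data = message.data();
        std::size_t left = message.size();
        while (left > 0) {
            const ssize_t written = write(fds[1], data, left);
            if (written <= 0)
                break;
            data += written;
            left -= static_cast<std::size_t>(written);
        }
        close(fds[1]);
        _exit(0);
    }

    // Parent process: close the unused end, then read until end of stream.
    close(fds[1]);
    std::string received;
    char buffer[256];
    ssize_t count;
    while ((count = read(fds[0], buffer, sizeof buffer)) > 0)
        received.append(buffer, static_cast<std::size_t>(count));
    close(fds[0]);
    waitpid(child, nullptr, 0);
    return received;
}
```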

For more information on basic approaches and optimisation guidelines for inter-process communications, see Inter-process communication guidelines.

Timing of IO activities

Design all IO-related activities, like reading or writing files, or reading from the network, so that they do not disturb UI activities. For example, reading a large amount of data from files when the application is starting up can significantly slow down the application startup. Consider the following options instead:

  • Minimise the data read and write operations. In the worst case, file reading may be blocked for several seconds when the filesystem is used heavily by multiple processes. Avoid write operations especially at startup because they slow down the read operations. If the filesystem is full, write operations will fail completely.
  • To make the application startup faster, read files only after the application has already constructed and displayed (an empty) UI. Since sudden, partial changes in UI layout are always annoying, make sure that the UI layout does not change when the file data is accessed.
  • If some data must be read from the file even before an empty UI can be displayed, read only those pieces of data that are absolutely necessary for the initial setup of the application. That is, do not even try to read all the data at once unless it is faster than any other option. The rest of the data can be read when the user has already seen that the application starts up quickly.
  • If the state of the application has to be saved into a file when the application exits, this can cause a long, perceivable delay. Instead of blocking the UI and writing lots of data when the exit is requested, select one of the following approaches:
    • When a user requests to exit, only close the visible UI and place the application in the background but do not exit yet. Do the state saving activities only after it appears to the user that the application has closed very quickly. The real exit can happen silently after the state-related data has been saved.
    Note: This solution cannot recover state changes if the system suddenly needs to go down.
    • Save state-related changes to a file shortly after the changes have happened and the device is idle. This way the information in the file reflects the latest state of application. When the user wants to exit, nothing needs to be saved to the file anymore, and the application can shut down immediately.
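The second approach, saving changes shortly after they happen, can be sketched as follows. The StateSaver class, its method names and the plain-text file format are all hypothetical. How the application detects that the device is idle is outside the scope of the sketch, which only assumes that saveIfDirty() is called at such a moment.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper that writes the state only when it has changed.
class StateSaver {
public:
    explicit StateSaver(const std::string& path) : m_path(path) {}

    // Records a new state in memory; no file access happens here.
    void setState(const std::string& state)
    {
        if (state != m_state) {
            m_state = state;
            m_dirty = true;
        }
    }

    // Call when the device is idle. Returns true if a write was performed.
    bool saveIfDirty()
    {
        if (!m_dirty)
            return false;  // nothing changed since the last save
        std::ofstream out(m_path, std::ios::trunc);
        if (!out)
            return false;  // keep the dirty flag and retry later
        out << m_state;
        out.flush();
        if (!out)
            return false;
        m_dirty = false;
        return true;
    }

    // If this returns false at exit, the application can close immediately.
    bool hasUnsavedChanges() const { return m_dirty; }

private:
    std::string m_path;
    std::string m_state;
    bool m_dirty = false;
};
```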

Using parallel processing only when necessary

Splitting logical activities over several processes or threads usually causes problems. As a general rule, keep all activities within single process instances. Multithreading is difficult to get right, and when it causes problems, it is especially difficult to debug applications that contain several threads (or several process instances).

However, more than one thread or process instance can improve the usability or performance in the following cases:

  • File IO: using a separate thread to access the file system can help keep the UI responsive.
  • Network IO: using a separate thread for network access can help avoid long network time-outs and keep the UI responsive.
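The file IO case can be sketched with the C++ standard library as follows. This is not a Qt API, and the function names are hypothetical. The file is read on a separate thread, and the calling thread only collects the result when it is needed, so it stays free for UI work in the meantime.

```cpp
#include <fstream>
#include <future>
#include <iterator>
#include <string>

// Reads the whole file; returns an empty string if it cannot be opened.
std::string readWholeFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Starts reading on a separate thread and returns immediately.
// Calling get() on the returned future blocks until the data is available.
std::future<std::string> readFileInBackground(const std::string& path)
{
    return std::async(std::launch::async, readWholeFile, path);
}
```

Because the worker shares no data with the caller except the returned value, this arrangement avoids most of the synchronisation problems that make multithreaded code hard to debug.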

Using virtual functions only when they are really needed

A single virtual function in a class requires that every instantiated object (and instantiated derived object) contain an extra pointer. For simple classes, this can even double the size of objects. Creating objects of a class with virtual functions costs more than creating objects of a class without them, because the pointer to the virtual function table must be initialised. It also takes slightly longer to call virtual functions than non-virtual functions because of the additional level of indirection. This is especially important if you have a large number of such objects or the methods are called very frequently.

However, when virtual functions are really needed, they are the fastest mechanism for building that functionality. Without them, if you need to track an object's type, you must store a type flag in the class, which also increases the size of objects. Then you have to construct a switch statement that selects what to do based on the flag. A virtual function table is faster, less error prone and more flexible than a custom-built switch statement.
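The two alternatives can be compared in a minimal sketch. All type names are hypothetical. PlainSquare has no type information, TaggedShape stores a type flag and dispatches with a switch statement, and Shape uses a virtual function. The exact object sizes depend on the compiler and platform, so the sketch makes no claim about specific numbers.

```cpp
#include <cstddef>

// No type information at all: the object holds only its data.
struct PlainSquare {
    double side;
};

// Alternative 1: a type flag plus a hand-written switch statement.
struct TaggedShape {
    enum Kind { Circle, Square } kind;
    double size;  // radius for circles, side length for squares
};

double area(const TaggedShape& shape)
{
    switch (shape.kind) {
    case TaggedShape::Circle:
        return 3.141592653589793 * shape.size * shape.size;
    case TaggedShape::Square:
        return shape.size * shape.size;
    }
    return 0.0;  // reached only if a new kind is added but not handled
}

// Alternative 2: virtual functions. Each object carries a hidden pointer
// to the virtual function table of its class.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Circle : Shape {
    explicit Circle(double r) : radius(r) {}
    double area() const override { return 3.141592653589793 * radius * radius; }
    double radius;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};
```

On common platforms both sizeof(TaggedShape) and sizeof(Square) are larger than sizeof(PlainSquare), which shows that the type flag does not save memory compared with the virtual function table pointer. Adding a new shape to the second alternative does not require touching any existing switch statement.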