Based on the experience I gained while developing my Zaurus media player, here is a short compendium of optimization rules, tweaks and hints when using SQLite on an embedded system (may apply to other systems as well):
- Simplify the database schema as much as possible – even if that means redundant data or illogical structure
- Don’t generalize the database schema – generalization will mostly sacrifice performance and one can’t afford that on an embedded system with its tight restrictions, even if it is more convenient for the developer.
- Only use relations (via IDs etc.) where absolutely necessary. The overhead for lookup and joining tables is considerable, even with an index on the relation.
- Order the tables correctly in SELECTs. Put a table left-most if it is lacking an index on the relation. More details are here.
In general: Check the order of tables in the SELECT statement. A different permutation may be more optimal. Profile. - Prepare your statements and bind values where applicable. This way you can get rid of the parser and VM creation overhead in tight loops (e.g. when inserting and updating).
- Use transactions – even if you’re just reading the data. This may yield a few milliseconds.
- Use temporary tables for intermediate results. They are fast and stay in cache most of the time. Depending on how your SQLite instance is set up, data will only be swapped into an external file if the cache is saturated.
- Try to avoid using views for data you’re constantly accessing. If you can afford it, create temporary tables and insert data there. This will eliminate the overhead imposed by the view evaluation.
- Avoid sub-queries since they tend to create temporary tables and insertion of the intermediate results into those tables may be expensive.
- Try to use indices only on static data or data that changes rarely. Building an index on live or temporary data can be expensive performance-wise. Only do so if the time required for the data lookup considerably outweights the time required for building the index.
- Alternative to indices: hashkeys – Instead of using indices on very long strings, you may store the hash values of those strings as keys in the same table. A lookup via hash values may be a whole lot more efficient. This method is also very effective when you can’t afford the creation of an index due to performance reasons. Downside: You have to take care of the hashkeys. (See remarks in the comments below.)
- No useless indices. Create indices only if your queries actually use the indices on the table (check with EXPLAIN). Having useless indices around may pollute otherwise precious database cache space.
- Be cache-friendly. Depending on the memory conditions, creating temporary tables and indices may bash the cache. Reloading data back into the cache is expensive.
- Double-check your queries and profile them. The SQLite optimizer doesn’t perform as well as the optimizers of big DBs (Firebird / Interbase, PostgreSQL, Oracle etc.).
- Check compiler settings. A higher optimization setting in your C-compiler may very well yield a few tens of milliseconds. Make sure to inline functions (-O3 for GCC 2.95.x, -O2 for GCC 3.x.x and higher). Optimize for architecture and CPU. Omit stack frame pointers (-fomit-frame-pointer) if you’re not producing executables with debug symbols. This may free an additional register for the compiler to use.
- Disable unused SQLite features. This helps to reduce binary size and may also affect performance.
Here are some additional docs to consider:
http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuning
http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows
http://www.sqlite.org/cvstrac/wiki?p=PerformanceConsiderations
http://www.sqlite.org/optoverview.html
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html