Documentation Source Text

Changes On Branch branch-3.18

Changes In Branch branch-3.18 Excluding Merge-Ins

This is equivalent to a diff from 5c613c450a to 7e41434c2b

2017-05-10
16:38
Update the 35%-faster document to the latest from trunk. (Leaf check-in: 7e41434c2b user: drh tags: branch-3.18)
2017-05-05
16:56
Copy the 35% faster changes from trunk. (check-in: 6fa9ca7f89 user: drh tags: branch-3.18)
12:55
Import the faster-than-filesystem document from trunk. (check-in: 488af3774f user: drh tags: branch-3.18)
2017-04-12
20:00
Update fts5 documentation to reflect new column filter capability. (check-in: 27ba2d2d59 user: dan tags: trunk)
2017-04-11
23:14
Fix a typo on the compile.html page. (check-in: 5c613c450a user: drh tags: trunk)
19:19
Fix typos in the howtocorrupt.html document. (check-in: 20b925f012 user: drh tags: trunk)

Changes to pages/docsdata.tcl.

doc {SQLite As An Application File Format} {appfileformat.html} {
  This article advocates using SQLite as an application file format
  in place of XML or JSON or a "pile-of-file".
}
doc {Well Known Users} {famous.html} {
  This page lists a small subset of the many thousands of devices
  and application programs that make use of SQLite.
}


###############################################################################
heading {Technical and Design Documentation} technical {
  These documents are oriented toward describing the internal
  implementation details and operation of SQLite.  







doc {SQLite As An Application File Format} {appfileformat.html} {
  This article advocates using SQLite as an application file format
  in place of XML or JSON or a "pile-of-file".
}
doc {Well Known Users} {famous.html} {
  This page lists a small subset of the many thousands of devices
  and application programs that make use of SQLite.
}
doc {35% Faster Than The Filesystem} {fasterthanfs.html} {
  This article points out that reading blobs out of an SQLite database
  is often faster than reading the same blobs from individual files in
  the filesystem.
}


###############################################################################
heading {Technical and Design Documentation} technical {
  These documents are oriented toward describing the internal
  implementation details and operation of SQLite.  

Added pages/fasterthanfs.in.

<title>35% Faster Than The Filesystem</title>
<tcl>hd_keywords {faster than the filesystem}</tcl>

<table_of_contents>

<h1>Summary</h1>

<p>Small blobs (for example, thumbnail images)
can be read out of an SQLite database about 35% faster
than they can be read from individual files on disk.

<p>Furthermore, a single SQLite database holding
10-kilobyte blobs uses about 20% less disk space than
storing the blobs in individual files.

<p>The performance difference arises (we believe) because when
reading from an SQLite database, the open() and close() system calls
are invoked only once, whereas
open() and close() are invoked once for each blob
when reading the blobs from individual files.  It appears that the
overhead of calling open() and close() is greater than the overhead
of using the database.  The size reduction arises from the fact that
individual files are padded out to the next multiple of the filesystem
block size, whereas the blobs are packed more tightly into an SQLite
database.

<h1>How These Measurements Are Made</h1>

<p>The performance comparison is accomplished using the
[https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program
found in the SQLite source tree.
To compile the test program, first gather the kvtest.c source file
into a directory with the [amalgamation|SQLite amalgamation] source
files "sqlite3.c" and "sqlite3.h".  Then on unix, run a command like
the following:

<codeblock>
gcc -Os -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c \
    -o kvtest -ldl -lpthread
</codeblock>

<p>Or on Windows with MSVC:

<codeblock>
cl -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c
</codeblock>

<p>
Use the resulting "kvtest" program to
generate a test database with 100,000 random blobs, each 10,000 bytes in
size, using a command like this:

<codeblock>
./kvtest init test1.db --count 100k --size 10k
</codeblock>

<p>
Next, make copies of all the blobs into individual files in a directory
using commands like this:

<codeblock>
mkdir test1.dir
./kvtest export test1.db test1.dir
</codeblock>

<p>
At this point, you can measure the amount of disk space used by
the test1.db database and the space used by the test1.dir directory
and all of its content.  On a standard Ubuntu Linux desktop, the
database file will be 1,024,512,000 bytes in size and the test1.dir
directory will use 1,228,800,000 bytes of space (according to "du -k"),
about 20% more than the database.
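
<p>The directory figure above is consistent with simple block rounding.
The following sketch assumes a 4096-byte filesystem block size (an
assumption; the block size is not stated in this report) and ignores inode
and directory overhead.  The helper name padded_size is made up for
illustration.

```python
def padded_size(nbytes: int, block_size: int = 4096) -> int:
    """Round nbytes up to the next multiple of block_size."""
    blocks = -(-nbytes // block_size)  # ceiling division
    return blocks * block_size

BLOB_SIZE = 10_000      # bytes per blob, as generated by "kvtest init"
BLOB_COUNT = 100_000    # number of blobs

per_file = padded_size(BLOB_SIZE)   # assumed 4096-byte blocks
total = per_file * BLOB_COUNT
database_bytes = 1_024_512_000      # database size reported above

print(per_file, total, round(total / database_bytes, 2))
# prints: 12288 1228800000 1.2
```

<p>Under that assumption each 10,000-byte file occupies three blocks, and
100,000 such files account for the 1,228,800,000 bytes reported by "du -k".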

<p>
Measure the performance for reading blobs from the database and from
individual files using these commands:

<codeblock>
./kvtest run test1.db --count 100k --blob-api
./kvtest run test1.dir --count 100k
</codeblock>

<p>
Depending on your platform, you should see that reads from the test1.db
database file are about 35% faster than reads from individual files in
the test1.dir folder.
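
<p>For readers who want to see the shape of the comparison without building
kvtest, the following is a rough Python sketch.  It is not a substitute
for kvtest.  It assumes a table named kv with an integer key and a blob value
(an illustrative schema), uses only 200 blobs, and reads with ordinary SQL
statements rather than the blob API, so it should not be expected to
reproduce the 35% figure.  All function names are hypothetical.

```python
import os
import sqlite3
import tempfile
import time

def build_fixture(root, count=200, size=10_000):
    """Store the same random blobs in a database and as individual files."""
    db_path = os.path.join(root, "test1.db")
    dir_path = os.path.join(root, "test1.dir")
    os.mkdir(dir_path)
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE kv(k INTEGER PRIMARY KEY, v BLOB)")
    for key in range(1, count + 1):
        data = os.urandom(size)
        conn.execute("INSERT INTO kv(k, v) VALUES(?, ?)", (key, data))
        with open(os.path.join(dir_path, f"{key:06d}"), "wb") as f:
            f.write(data)
    conn.commit()
    conn.close()
    return db_path, dir_path

def read_from_db(db_path, count):
    """Open the database once, then read every blob."""
    conn = sqlite3.connect(db_path)
    total = 0
    for key in range(1, count + 1):
        (blob,) = conn.execute("SELECT v FROM kv WHERE k=?", (key,)).fetchone()
        total += len(blob)
    conn.close()
    return total

def read_from_files(dir_path, count):
    """Open, read, and close one file per blob."""
    total = 0
    for key in range(1, count + 1):
        with open(os.path.join(dir_path, f"{key:06d}"), "rb") as f:
            total += len(f.read())
    return total

def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

with tempfile.TemporaryDirectory() as root:
    db_path, dir_path = build_fixture(root)
    db_bytes, db_secs = timed(read_from_db, db_path, 200)
    fs_bytes, fs_secs = timed(read_from_files, dir_path, 200)
    print(f"database: {db_bytes} bytes in {db_secs:.4f}s")
    print(f"files:    {fs_bytes} bytes in {fs_secs:.4f}s")
```

<p>Timings from a sketch this small are dominated by interpreter overhead
and the operating system's file cache, so treat them as a sanity check only.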

<h2>Variations</h2>

<p>The [-DSQLITE_DIRECT_OVERFLOW_READ] compile-time option causes SQLite
to bypass its page cache when reading content from overflow pages.  This
helps database reads of 10K blobs run a little faster, but not all that much
faster.  SQLite still holds a speed advantage over direct filesystem reads
without the SQLITE_DIRECT_OVERFLOW_READ compile-time option.

<p>Other compile-time options such as using -O3 instead of -Os or
using [-DSQLITE_THREADSAFE=0] and/or some of the other
[recommended compile-time options] might help SQLite to run even faster
relative to direct filesystem reads.

<p>When constructing the test data, try varying the size of the blobs.
The performance advantage will shift toward direct filesystem reads as
the size of the blobs increases, since the cost of invoking open() and close()
will be amortized over more bytes transferred using read().  The break-even
point, the point where it becomes faster to read directly from the filesystem,
will vary from one system to another.  In the other direction, reducing the
blob size provides more advantage to database reads.  With a 5 KB blob size,
reading from the database is twice as fast and uses 60% less space than
blobs stored as individual files.

<p>The --blob-api option causes database reads to occur using the
[sqlite3_blob_open()], [sqlite3_blob_reopen()], and [sqlite3_blob_read()]
interfaces instead of using SQL statements.  Without the --blob-api
option, a separate SQL statement is run to read each blob and
the performance of reading from the database is approximately
the same as the performance of reading directly from files.
This is still a significant finding, since few people would
expect a [full-featured SQL] database to run as fast as direct file reads,
and yet SQLite does.
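
<p>The two read paths can be contrasted in a short Python sketch.  It is an
illustration, not the kvtest implementation.  Python's
Connection.blobopen() (available in Python 3.11 and later) wraps SQLite's
incremental blob I/O, but it has no counterpart to [sqlite3_blob_reopen()],
so a new handle is opened for each blob here.  The kv table and the function
names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv(k INTEGER PRIMARY KEY, v BLOB)")
conn.execute("INSERT INTO kv(k, v) VALUES(1, ?)", (b"x" * 10_000,))
conn.commit()

def read_with_sql(conn, key):
    """Run one SQL statement per blob."""
    row = conn.execute("SELECT v FROM kv WHERE k=?", (key,)).fetchone()
    return row[0]

def read_with_blob_api(conn, key):
    """Use incremental blob I/O; requires Python 3.11 or later."""
    with conn.blobopen("kv", "v", key, readonly=True) as blob:
        return blob.read()

via_sql = read_with_sql(conn, 1)
if hasattr(conn, "blobopen"):
    via_blob = read_with_blob_api(conn, 1)
else:
    via_blob = via_sql  # older Python: fall back to the SQL path
```

<p>Both functions return the same bytes.  The difference is that the second
one bypasses SQL parsing and statement execution for each read.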

<p>The --random option on the "run" command causes the
blobs to be read in a random order.  This causes the performance of database
reads to decrease.  The reason is that the blobs are tightly packed in
the database, rather than being padded out to the next block size as when
they are stored in the filesystem.  Some pages contain parts of
adjacent blobs.  When the blobs are read sequentially, those pages are
only read into memory once and cached and then used to reconstruct
adjacent blobs, but when blobs are read in a random order, those pages
that share parts of two or more blobs tend to be read multiple times,
leading to decreased performance.

<p>The "--mmap SIZE" option on the "run" command causes the database file
to be accessed using mmap() instead of via read().  The SIZE argument is
the size of the memory mapped region, and should be the size of the database
file for maximum performance.  Using "--mmap 1G" causes the database reads
to be almost twice as fast as disk reads even when the --random
option is used.
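
<p>An application can request memory-mapped I/O with the mmap_size pragma.
The sketch below shows the request from Python.  Whether kvtest uses this
same pragma internally is an assumption, and the value that SQLite reports
back may be smaller than requested, or zero, depending on how the library
was compiled.  The helper name enable_mmap is hypothetical.

```python
import os
import sqlite3
import tempfile

def enable_mmap(conn, size_bytes):
    """Request memory-mapped I/O and return the limit SQLite reports.

    Returns None if the pragma yields no row.
    """
    row = conn.execute(f"PRAGMA mmap_size={int(size_bytes)}").fetchone()
    return row[0] if row else None

with tempfile.TemporaryDirectory() as root:
    conn = sqlite3.connect(os.path.join(root, "test1.db"))
    conn.execute("CREATE TABLE kv(k INTEGER PRIMARY KEY, v BLOB)")
    granted = enable_mmap(conn, 1_000_000_000)  # roughly "1G"
    print("mmap_size now:", granted)
    conn.close()
```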

<p>When --random is used and both --blob-api and --mmap are omitted,
reading directly from files on disk is generally a little faster, but
reads from the database are still competitive.

<h1>Other Considerations</h1>

<p>Some other SQL database engines advise developers to store blobs in separate
files and then store the filename in the database.  In that case, where
the database must first be consulted to find the filename before opening
and reading the file, simply storing the entire blob in the database
gives much faster read performance with SQLite.
See the [Internal Versus External BLOBs] article for more information.

<p>This report only looks at the performance of reads, not writes.
Because SQLite implements [atomic commit|power-safe ACID transactions]
we expect that write performance into SQLite will be slower than writing
directly to individual files.  However, if ACID transactions are disabled
via [PRAGMA journal_mode|PRAGMA journal_mode=OFF]
(thus putting SQLite on equal footing with the filesystem) and the
[sqlite3_blob_write()] interface is used, SQLite might well be competitive
with, or even faster than, writes to separate files on disk.  That is an
experiment we have not yet run.
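
<p>As a starting point for such an experiment, the sketch below turns the
rollback journal off and writes 100 blobs in a single transaction.  It uses
plain INSERT statements rather than [sqlite3_blob_write()], measures
nothing, and makes no claim about relative speed.  The kv table and the
file name writes.db are assumptions made for the example.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as root:
    conn = sqlite3.connect(os.path.join(root, "writes.db"))
    # Disable the rollback journal; the pragma reports the resulting mode.
    mode = conn.execute("PRAGMA journal_mode=OFF").fetchone()[0]
    conn.execute("CREATE TABLE kv(k INTEGER PRIMARY KEY, v BLOB)")
    with conn:  # commits on successful exit
        conn.executemany(
            "INSERT INTO kv(k, v) VALUES(?, ?)",
            ((k, os.urandom(10_000)) for k in range(1, 101)),
        )
    stored = conn.execute("SELECT count(*), sum(length(v)) FROM kv").fetchone()
    conn.close()
```

<p>With the journal off, a crash or power loss in the middle of a write can
corrupt the database, which is the trade-off described above.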

<p>Remember that the relative performance of database reads and reads from
the filesystem will depend on both the hardware and the operating system.
Please try the tests above on your own system.  If you encounter cases
where database reads do not perform favorably in comparison to filesystem
reads, please report your findings in the [mailing lists|SQLite mailing list].