Index: pages/docsdata.tcl
==================================================================
--- pages/docsdata.tcl
+++ pages/docsdata.tcl
@@ -279,10 +279,15 @@
   in place of XML or JSON or a "pile-of-file".
 }
 doc {Well Known Users} {famous.html} {
   This page lists a small subset of the many thousands of devices and
   application programs that make use of SQLite.
+}
+doc {35% Faster Than The Filesystem} {fasterthanfs.html} {
+  This article points out that reading blobs out of an SQLite database
+  is often faster than reading the same blobs from individual files in
+  the filesystem.
 }
 ###############################################################################
 heading {Technical and Design Documentation} technical {

ADDED pages/fasterthanfs.in
Index: pages/fasterthanfs.in
==================================================================
--- /dev/null
+++ pages/fasterthanfs.in
@@ -0,0 +1,166 @@

35% Faster Than The Filesystem
hd_keywords {faster than the filesystem}

Summary


Small blobs (for example, thumbnail images) can be read out of an SQLite
database about 35% faster than they can be read from individual files on
disk.

Furthermore, a single SQLite database holding 10-kilobyte blobs uses about
20% less disk space than storing the blobs in individual files.

The performance difference arises (we believe) because when reading from an
SQLite database the open() and close() system calls are invoked only once,
whereas open() and close() are invoked once for each blob when reading the
blobs from individual files.  It appears that the overhead of calling open()
and close() is greater than the overhead of using the database.  The size
reduction arises from the fact that individual files are padded out to the
next multiple of the filesystem block size, whereas the blobs are packed more
tightly into an SQLite database.
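To make the open()/close() argument concrete, the following sketch (not part
of kvtest.c) shows the two read paths side by side.  It assumes a hypothetical
table created as "CREATE TABLE kv(k INTEGER PRIMARY KEY, v BLOB)"; kvtest's
actual schema and error handling differ.

/*
** Sketch: reading one blob from a file versus from an already-open database.
** The filesystem path repeats open()/read()/close() for every blob, while the
** database path reuses one connection and one prepared statement, e.g.
** sqlite3_prepare_v2(db, "SELECT v FROM kv WHERE k=?1", -1, &pStmt, 0).
*/
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include "sqlite3.h"

static int read_blob_from_file(const char *zPath, unsigned char *aBuf, int nBuf){
  int fd = open(zPath, O_RDONLY);      /* one open() per blob */
  int nRead;
  if( fd<0 ) return -1;
  nRead = (int)read(fd, aBuf, nBuf);
  close(fd);                           /* one close() per blob */
  return nRead;
}

static int read_blob_from_db(sqlite3_stmt *pStmt, sqlite3_int64 key,
                             unsigned char *aBuf, int nBuf){
  int nRead = -1;
  sqlite3_bind_int64(pStmt, 1, key);
  if( sqlite3_step(pStmt)==SQLITE_ROW ){
    nRead = sqlite3_column_bytes(pStmt, 0);
    if( nRead>nBuf ) nRead = nBuf;
    if( nRead>0 ) memcpy(aBuf, sqlite3_column_blob(pStmt, 0), nRead);
  }
  sqlite3_reset(pStmt);                /* statement is reused; no open()/close() */
  return nRead;
}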

How These Measurements Are Made


The performance comparison is accomplished using the
[https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program found in the
SQLite source tree.  To compile the test program, first gather the kvtest.c
source file into a directory with the [amalgamation|SQLite amalgamation]
source files "sqlite3.c" and "sqlite3.h".  Then on unix, run a command like
the following:

gcc -Os -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c \
  -o kvtest -ldl -lpthread

Or on Windows with MSVC:

cl -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c

Use the resulting "kvtest" program to generate a test database with 100,000
random blobs, each 10,000 bytes in size, using a command like this:

./kvtest init test1.db --count 100k --size 10k

Next, make copies of all the blobs into individual files in a directory using
commands like these:

mkdir test1.dir
./kvtest export test1.db test1.dir

At this point, you can measure the amount of disk space used by the test1.db
database and the space used by the test1.dir directory and all of its content.
On a standard Ubuntu Linux desktop, the database file will be 1,024,512,000
bytes in size and the test1.dir directory will use 1,228,800,000 bytes of
space (according to "du -k"), about 20% more than the database.  The directory
overhead is what block-size padding predicts: 1,228,800,000 bytes works out to
12,288 bytes, or three 4,096-byte filesystem blocks, for each 10,000-byte blob.

Measure the performance for reading blobs from the database and from
individual files using these commands:

./kvtest run test1.db --count 100k --blob-api
./kvtest run test1.dir --count 100k

Depending on your platform, you should see that reads from the test1.db
database file are about 35% faster than reads from individual files in the
test1.dir folder.

Variations


The [-DSQLITE_DIRECT_OVERFLOW_READ] compile-time option causes SQLite to
bypass its page cache when reading content from overflow pages.  This helps
database reads of 10K blobs run a little faster, though not dramatically so.
SQLite still holds a speed advantage over direct filesystem reads even
without the SQLITE_DIRECT_OVERFLOW_READ compile-time option.

Other compile-time options, such as using -O3 instead of -Os, or using
[-DSQLITE_THREADSAFE=0] and/or some of the other
[recommended compile-time options], might help SQLite to run even faster
relative to direct filesystem reads.

When constructing the test data, try varying the size of the blobs.  The
performance advantage shifts toward direct filesystem reads as the blob size
increases, since the cost of invoking open() and close() is amortized over
more bytes transferred by read().  The break-even point, the point at which
it becomes faster to read directly from the filesystem, will vary from one
system to another.  In the other direction, reducing the blob size gives
database reads an even bigger advantage.  With a 5 KB blob size, reading from
the database is twice as fast, and the database uses 60% less space than the
same blobs stored as individual files.

The --blob-api option causes database reads to occur using the
[sqlite3_blob_open()], [sqlite3_blob_reopen()], and [sqlite3_blob_read()]
interfaces instead of using SQL statements.  Without the --blob-api option,
a separate SQL statement is run to read each blob and the performance of
reading from the database is approximately the same as the performance of
reading directly from files.  This is still a significant finding, since few
people would expect a [full-featured SQL] database to run as fast as direct
file reads, and yet SQLite does.
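In application code, the --blob-api read path looks roughly like the sketch
below.  The table and column names ("kv" and "v") are the same assumptions
used in the earlier sketch, not kvtest.c's actual schema, and error handling
is abbreviated.

/*
** Sketch of reading a range of blobs with the incremental-blob interfaces.
** Assumes the hypothetical kv(k INTEGER PRIMARY KEY, v BLOB) table.
*/
#include <stdlib.h>
#include "sqlite3.h"

static int read_blobs_by_handle(sqlite3 *db, sqlite3_int64 iFirst,
                                sqlite3_int64 iLast){
  sqlite3_blob *pBlob = 0;
  sqlite3_int64 k;
  int rc = SQLITE_OK;

  for(k=iFirst; k<=iLast && rc==SQLITE_OK; k++){
    if( pBlob==0 ){
      /* First row: open a read-only blob handle on column "v". */
      rc = sqlite3_blob_open(db, "main", "kv", "v", k, 0, &pBlob);
    }else{
      /* Later rows: repoint the same handle, which is cheaper than
      ** opening a new handle for every blob. */
      rc = sqlite3_blob_reopen(pBlob, k);
    }
    if( rc==SQLITE_OK ){
      int nByte = sqlite3_blob_bytes(pBlob);
      unsigned char *aData = malloc( nByte>0 ? nByte : 1 );
      if( aData==0 ){
        rc = SQLITE_NOMEM;
      }else{
        rc = sqlite3_blob_read(pBlob, aData, nByte, 0);
        /* ...process the blob content here... */
        free(aData);
      }
    }
  }
  sqlite3_blob_close(pBlob);
  return rc;
}

The loop above assumes consecutive keys; kvtest's --random mode would supply
the keys in shuffled order instead.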

The --random option on the "run" command causes the blobs to be read in a
random order, which decreases the performance of database reads.  The reason
is that the blobs are tightly packed in the database, rather than being
padded out to the next block size as they are when stored in the filesystem,
so some database pages contain parts of adjacent blobs.  When the blobs are
read sequentially, each such page is read into memory once, cached, and then
used to reconstruct the adjacent blobs.  But when the blobs are read in a
random order, pages that hold parts of two or more blobs tend to be read
multiple times, leading to decreased performance.

The "--mmap SIZE" option on the "run" command causes the database file to be
accessed using mmap() instead of read().  The SIZE argument is the size of
the memory-mapped region, and should be at least the size of the database
file for maximum performance.  Using "--mmap 1G" causes the database reads to
be almost twice as fast as disk reads, even when the --random option is used.
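kvtest arranges the memory mapping internally.  In your own programs, the
same effect comes from setting PRAGMA mmap_size on the connection before
reading.  The sketch below assumes a 1 GiB mapping to mirror the "--mmap 1G"
example above.

/* Sketch: open a database and request memory-mapped I/O before reading.
** SQLite caps the requested mapping at the SQLITE_MAX_MMAP_SIZE compile-time
** limit, so an over-large request is harmless. */
#include "sqlite3.h"

static int open_db_with_mmap(const char *zFilename, sqlite3 **ppDb){
  int rc = sqlite3_open(zFilename, ppDb);
  if( rc==SQLITE_OK ){
    rc = sqlite3_exec(*ppDb, "PRAGMA mmap_size=1073741824;", 0, 0, 0);
  }
  return rc;
}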

When --random is used and both --blob-api and --mmap are omitted, reading
directly from files on disk is generally a little faster, but reads from the
database are still competitive.

Other Considerations


Some other SQL database engines advise developers to store blobs in separate
files and then store the filename in the database.  In that case, where the
database must first be consulted to find the filename before opening and
reading the file, simply storing the entire blob in the database gives much
faster read performance with SQLite.  See the [Internal Versus External BLOBs]
article for more information.
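For comparison, the filename-in-the-database pattern looks roughly like the
sketch below; every read pays for a query and for an open()/read()/close()
round trip.  The "docs(id, path)" table is purely illustrative.

/* Sketch of the "external blob" pattern: look up a filename in the database,
** then read the file.  Each read incurs both a query and open()/close().
** The docs(id INTEGER PRIMARY KEY, path TEXT) table is hypothetical. */
#include <fcntl.h>
#include <unistd.h>
#include "sqlite3.h"

static int read_external_blob(sqlite3 *db, sqlite3_int64 id,
                              unsigned char *aBuf, int nBuf){
  sqlite3_stmt *pStmt = 0;
  int nRead = -1;
  int rc = sqlite3_prepare_v2(db, "SELECT path FROM docs WHERE id=?1",
                              -1, &pStmt, 0);
  if( rc!=SQLITE_OK ) return -1;
  sqlite3_bind_int64(pStmt, 1, id);
  if( sqlite3_step(pStmt)==SQLITE_ROW ){
    const char *zPath = (const char*)sqlite3_column_text(pStmt, 0);
    int fd = zPath ? open(zPath, O_RDONLY) : -1;
    if( fd>=0 ){
      nRead = (int)read(fd, aBuf, nBuf);
      close(fd);
    }
  }
  sqlite3_finalize(pStmt);
  return nRead;
}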

This report only looks at the performance of reads, not writes.  Because
SQLite implements [atomic commit|power-safe ACID transactions], we expect
that write performance into SQLite will be slower than writing directly to
individual files.  However, if ACID transactions are disabled via
[PRAGMA journal_mode|PRAGMA journal_mode=OFF] (thus putting SQLite on equal
footing with the filesystem) and the [sqlite3_blob_write()] interface is
used, SQLite might well be competitive with or even faster than writes to
separate files on disk.  That is an experiment we have not yet run.
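If you want to try that experiment yourself, the write path would look
roughly like the sketch below.  It reuses the hypothetical
kv(k INTEGER PRIMARY KEY, v BLOB) table from the earlier sketches; we have
not benchmarked this code.

/* Sketch of the write experiment described above: disable the journal to
** match the filesystem's durability guarantees, insert a zero-filled blob of
** the right size, then fill it with sqlite3_blob_write(). */
#include "sqlite3.h"

static int write_blob(sqlite3 *db, sqlite3_int64 key,
                      const unsigned char *aData, int nData){
  sqlite3_stmt *pStmt = 0;
  sqlite3_blob *pBlob = 0;
  int rc;

  sqlite3_exec(db, "PRAGMA journal_mode=OFF;", 0, 0, 0);

  /* Reserve space for the blob without copying the content through SQL. */
  rc = sqlite3_prepare_v2(db,
          "REPLACE INTO kv(k,v) VALUES(?1, zeroblob(?2))", -1, &pStmt, 0);
  if( rc!=SQLITE_OK ) return rc;
  sqlite3_bind_int64(pStmt, 1, key);
  sqlite3_bind_int(pStmt, 2, nData);
  rc = sqlite3_step(pStmt);
  sqlite3_finalize(pStmt);
  if( rc!=SQLITE_DONE ) return rc;

  /* Overwrite the zero-filled region with the real content. */
  rc = sqlite3_blob_open(db, "main", "kv", "v", key, 1, &pBlob);
  if( rc==SQLITE_OK ) rc = sqlite3_blob_write(pBlob, aData, nData, 0);
  sqlite3_blob_close(pBlob);
  return rc;
}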

Remember that the relative performance of database reads and reads from the
filesystem will depend on both the hardware and the operating system.  Please
try the tests above on your own system.  If you encounter cases where
database reads do not compare favorably with filesystem reads, please report
your findings on the [mailing lists|SQLite mailing list].