Use LZ4 instead of BZ2 to compress the saved database

Posted: Sun Jun 21, 2020 10:39 pm
by yfdyh000
Using LZ4 will balance the speed and space saving.

At present, it may take more than 10 seconds to compress (BZIP2) and write the database (Everything.db) when the program is closed.

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Mon Jun 22, 2020 1:29 am
by void
Please do not use database compression.

The compression is minimal and the extra CPU usage is expensive.

It is only useful if your drive is extremely slow (< 1MBps) and you have plenty of CPU usage available.

I will consider LZ4, thank you for the suggestion.

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Sun Jun 28, 2020 6:59 pm
by Marco77
Zstandard is also an algorithm which is fast to compress and, importantly, to decompress. It was co-designed by the same author as LZ4, Yann Collet.

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Sun Jun 28, 2020 8:25 pm
by vsub
void wrote:
Mon Jun 22, 2020 1:29 am
Please do not use database compression.

The compression is minimal and the extra CPU usage is expensive.

It is only useful if your drive is extremely slow (< 1MBps) and you have plenty of CPU usage available.

I will consider LZ4, thank you for the suggestion.
Isn't the database loaded from the HDD/SSD into RAM and decompressed there, if it is compressed?
If so, does that mean Everything will start faster after a Windows restart when the database is not compressed?

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Mon Jun 29, 2020 3:11 am
by void
No, the database is uncompressed as it is read from disk.

Everything will always read the database from disk with a 64KB buffer, so it reads 64KB chunks at a time.
bz2 will have its own buffers, it will decompress from the 64KB read buffer into its own buffer which is 900KB.
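
For illustration only, the read path described above can be sketched in Python with the standard `bz2` module. This is not Everything's actual code: the function name `load_bz2_stream` and the sample payload are made up, and the 900KB figure is assumed to correspond to bz2's level 9 block size.

```python
import bz2
import io

READ_CHUNK = 64 * 1024  # 64KB read buffer, as described in the post


def load_bz2_stream(src, chunk_size=READ_CHUNK):
    """Read `src` in fixed-size chunks and decompress incrementally."""
    decomp = bz2.BZ2Decompressor()  # keeps its own internal block buffers
    out = bytearray()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Each 64KB chunk is fed to the decompressor as soon as it is read,
        # so the compressed file is never held in memory as a whole.
        out += decomp.decompress(chunk)
    return bytes(out)


# Hypothetical payload standing in for a database file.
payload = b"C:\\Windows\\System32\\example.dll\n" * 50_000
# compresslevel=9 is assumed to match the 900KB buffer mentioned above.
compressed = bz2.compress(payload, compresslevel=9)
restored = load_bz2_stream(io.BytesIO(compressed))
print(len(compressed), len(restored))
```

The point of the sketch is that decompression happens while reading, not as a separate pass after the whole file is in RAM.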

The Everything database is already compressed without bz2.
bz2 just adds another layer of compression.
The "Compress Database" option enables or disables this extra bz2 compression layer.

Enabling bz2 compression will:
Make loading slightly slower for SSDs.
Severely reduce the saving performance of Everything.

The performance difference with loading compressed vs uncompressed is minimal.
The saving performance is severely reduced when enabling compression (high CPU usage).

Everything will report the database load and save timings to the debug console.
So you can check which option works best for you.
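
If you want a rough feel for the trade-off outside of Everything, here is a minimal Python sketch that times saving and loading the same data with and without bz2. Everything in it is an assumption for illustration: the helper names `timed_save` and `timed_load`, the synthetic paths, and the file names. The load timing comes from the OS file cache, not a fresh boot, so treat the numbers as relative only.

```python
import bz2
import os
import tempfile
import time


def timed_save(path, data, compress):
    """Write data to path (optionally via bz2); return (seconds, bytes on disk)."""
    start = time.perf_counter()
    blob = bz2.compress(data, 9) if compress else data
    with open(path, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to flush so write caching skews less
    return time.perf_counter() - start, os.path.getsize(path)


def timed_load(path, compress):
    """Read path back (decompressing if needed); return (seconds, data)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        blob = f.read()
    data = bz2.decompress(blob) if compress else blob
    return time.perf_counter() - start, data


# Synthetic stand-in for an index: 100,000 made-up paths.
data = "\n".join(f"C:\\dir{i % 1000}\\file{i}.txt" for i in range(100_000)).encode()

results = {}
with tempfile.TemporaryDirectory() as tmp:
    for label, compress in (("normal", False), ("bz2", True)):
        path = os.path.join(tmp, f"{label}.db")
        save_s, size = timed_save(path, data, compress)
        load_s, loaded = timed_load(path, compress)
        results[label] = {"save_s": save_s, "load_s": load_s,
                          "size": size, "ok": loaded == data}
        print(f"{label:7} save {save_s:.4f}s  load {load_s:.4f}s  size {size:,} bytes")
```

The timings from the debug console remain the ones to trust, since they measure the real database on your own drive.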

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Mon Jun 29, 2020 10:07 pm
by NotNull
void wrote:
Mon Jun 29, 2020 3:11 am
Everything will report the database load and save timings to the debug console.
What entries should we look for?

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Mon Jun 29, 2020 10:28 pm
by yfdyh000
void wrote:
Mon Jun 29, 2020 3:11 am
No, the database is uncompressed as it is read from disk.

Everything will always read the database from disk with a 64KB buffer, so it reads 64KB chunks at a time.
bz2 will have its own buffers, it will decompress from the 64KB read buffer into its own buffer which is 900KB.

The Everything database is already compressed without bz2.
bz2 just adds another layer of compression.
The "Compress Database" option enables or disables this extra bz2 compression layer.

Enabling bz2 compression will typically make Everything load slightly faster, as there is less I/O.
However, saving performance is severely reduced.

The performance difference with loading compressed vs uncompressed is minimal.
The saving performance is severely reduced when enabling compression (high CPU usage).

Everything will report the database load and save timings to the debug console.
So you can check which option works best for you.

Here are some of my test results:
         Save (secs)  Load (secs)  Size (MB)
BZ2      1.09         0.47         3.22 (63% compression)
Normal   0.07         1.83         8.66
This is with write buffering enabled in Windows, so the save speed is not accurate.
Could you benchmark a database of 100MB or larger? Saving takes several seconds, or even tens of seconds if BZ2 compression is enabled.
A database that large also takes up a noticeable amount of disk space, which a fast algorithm could reduce.

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Tue Jun 30, 2020 6:30 am
by void
What entries should we look for?
Timing debug information is always shown in blue text.

SSD Normal (uncompressed):

db_save_local 85181 folders, 812077 files
saved db: 0.277260 seconds
-This is not accurate because write cache is enabled; see the write-cache-disabled test below for a better idea of performance.

Everything.db size on disk: 46,358,004 bytes

loaded 85180 folders, 812058 files, in 1.497799 seconds
-Load timings from a fresh boot

SSD Compressed:

db_save_local 85180 folders, 812059 files
saved db: 5.362448 seconds
Everything.db size on disk: 20,452,358 bytes

loaded 85180 folders, 812058 files, in 3.791765 seconds
-Load timings from a fresh boot

SSD Normal (uncompressed) Write Cache Disabled:

db_save_local 85180 folders, 812059 files
saved db: 0.803708 seconds

HDD Normal (uncompressed):

db_save_local 228829 folders, 800000 files
saved db: 0.613437 seconds
Everything.db size on disk: 38,289,196 bytes

loaded 228829 folders, 800000 files, in 2.482928 seconds
-Load timings from a fresh boot

HDD Compressed:

db_save_local 228829 folders, 800000 files
saved db: 6.409400 seconds
Everything.db size on disk: 15,518,383 bytes

loaded 228829 folders, 800000 files, in 2.717218 seconds
-Load timings from a fresh boot
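
If you copy the debug console output to a text file, the timing lines can be pulled out with a small script. This is only a sketch: the patterns are based on the lines pasted above and may not match other Everything versions, and `parse_timings` is a made-up helper name.

```python
import re

# Patterns modelled on the debug lines shown above (assumed format).
SAVE_RE = re.compile(r"saved db: (\d+\.\d+) seconds")
LOAD_RE = re.compile(r"loaded (\d+) folders, (\d+) files, in (\d+\.\d+) seconds")


def parse_timings(text):
    """Return whatever save/load timings can be found in the console text."""
    result = {}
    m = SAVE_RE.search(text)
    if m:
        result["save_seconds"] = float(m.group(1))
    m = LOAD_RE.search(text)
    if m:
        result["folders"] = int(m.group(1))
        result["files"] = int(m.group(2))
        result["load_seconds"] = float(m.group(3))
    return result


sample = """db_save_local 85180 folders, 812059 files
saved db: 5.362448 seconds
loaded 85180 folders, 812058 files, in 3.791765 seconds"""

timings = parse_timings(sample)
print(timings)
```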

TL;DR: rough tests for about 1 million files (times in seconds):

               Uncompressed      Compressed
SSD Load       1.49              3.79
SSD Save       0.27              5.36
HDD Load       2.48              2.71
HDD Save       0.61              6.40

Results will vary for your hardware.
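
To put the figures above into ratios, here is a short Python sketch. The numbers are copied from the measurements in this post; the `summarize` helper and the dictionary layout are just for illustration.

```python
# (uncompressed, compressed) pairs copied from the measurements above.
# Times are in seconds, sizes in bytes.
runs = {
    "SSD": {"load": (1.497799, 3.791765),
            "save": (0.277260, 5.362448),
            "size": (46_358_004, 20_452_358)},
    "HDD": {"load": (2.482928, 2.717218),
            "save": (0.613437, 6.409400),
            "size": (38_289_196, 15_518_383)},
}


def summarize(run):
    """Return size reduction (%) and how many times slower compression is."""
    size_u, size_c = run["size"]
    load_u, load_c = run["load"]
    save_u, save_c = run["save"]
    return {
        "size_reduction_pct": 100.0 * (1.0 - size_c / size_u),
        "load_slowdown": load_c / load_u,
        "save_slowdown": save_c / save_u,
    }


summary = {drive: summarize(run) for drive, run in runs.items()}
for drive, s in summary.items():
    print(f"{drive}: size -{s['size_reduction_pct']:.1f}%, "
          f"load x{s['load_slowdown']:.2f}, save x{s['save_slowdown']:.2f}")
```

Keep in mind the SSD save time was measured with write cache enabled, so its save ratio is only a rough indication.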

Re: Use LZ4 instead of BZ2 to compress the saved database

Posted: Tue Jun 30, 2020 3:43 pm
by therube
(Oh, I mentioned - though haven't tried yet, "clearing cache", https://freefilesync.org/forum/viewtopi ... 420#p25052.)

(Void's results, at least with this .db, seem to fly in the face of Google/Mozilla's [supposed] reasoning for lz4'ing everything under the sun.
Also interesting is the rather negligible difference between SSD and HDD, with the HDD even quicker in one instance.
[One day, I'll have an SSD, maybe.]
Oh, & Mozilla's implementation of lz4, while it follows the spec, isn't "standard" [also think, .jar] to the majority of the lz4 [zip] related tools out there.)