Tuesday, March 31, 2020

Project - stage1




For the SPO600 final project, I choose one open source package and optimize it.
The first stage is to choose an open source package and build the software, then benchmark the performance of the current implementation on AArch64 and x86_64 systems. Lastly, I experiment with build options to determine whether they have any impact on the performance.


1. Choose an open source package

I chose Zopfli (https://github.com/google/zopfli). It is a compression algorithm made by Google. Zopfli is written in C for portability. It is a compression-only library. Zopfli is bit-stream compatible with the compression used in gzip, Zip, PNG, HTTP requests, and others.
Compared with other compression tools, Zopfli is slow. Even compared with a fast one, gzip -9, Zopfli is more than 80 times slower.

*https://www.lifehacker.com.au/2013/03/a-look-at-zopfli-googles-open-source-compression-algorithm/
 
The Zopfli GitHub page also says: “Zopfli Compression Algorithm is a compression library programmed in C to perform very good, but slow, deflate or zlib compression.”
So, I want to optimize this package.



2. Build the software
2-1. x86_64
1) Clone the code to the server
I cloned the source code from the Zopfli GitHub repository.


2) Add an image to the server

For my testing, I chose the 10mb png file from https://www.sample-videos.com/download-sample-png-image.php.

3) Benchmark the performance
To benchmark the performance, I used 10mb, 20mb and 30mb files and measured each run with the time command. It reports three values:
Real: elapsed real (wall clock) time used by the process.
User: CPU time that the process used directly (in user mode).
Sys: CPU time used by the system on behalf of the process (in kernel mode).
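The three values can also be collected from a script, which helps when the same run has to be repeated many times. The following is only a minimal sketch, not the method used for the numbers in this post: the helper name time_command is my own, and the child CPU times from os.times() are only filled in on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (real, user, sys) in seconds, like the shell's time."""
    before = os.times()
    start = time.perf_counter()
    # Output is discarded so that printing does not affect the measurement much.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = os.times()
    # children_user / children_system accumulate CPU time of finished child processes.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system
```

A call such as time_command(["./zopfli", "10mb.png"]) would time one compression run; the binary path and the file name here are assumptions about the local setup.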

time    10mb        20mb        30mb
real    2m19.740s   3m38.955s   5m38.456s
user    1m22.996s   3m38.287s   5m37.188s
sys     0m0.163s    0m0.320s    0m0.728s



I chose the 10mb file and executed it 5 times with the -O3 build option.

10mb    1st         2nd         3rd         4th         5th
real    1m23.846s   1m23.742s   1m23.551s   1m24.355s   4m58.047s
user    1m23.571s   1m23.469s   1m23.282s   1m24.060s   1m33.316s
sys     0m0.132s    0m0.133s    0m0.132s    0m0.150s    0m0.154s
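The 5th run has a much longer real time than the other four, so one number that summarizes the five runs depends a lot on how it is computed. As a rough sketch (the helper to_seconds is my own name, and the values are the real times of the five runs above), the mean and the median can be compared like this:

```python
import re
import statistics

# Matches values printed by bash's time builtin, for example "1m23.846s".
_TIME_RE = re.compile(r"^(\d+)m(\d+(?:\.\d+)?)s$")


def to_seconds(text):
    """Convert a time value such as '1m23.846s' to seconds."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Real times of the five -O3 runs on x86_64, copied from the results above.
real_runs = ["1m23.846s", "1m23.742s", "1m23.551s", "1m24.355s", "4m58.047s"]
seconds = [to_seconds(value) for value in real_runs]

print(f"mean   {statistics.mean(seconds):.3f}s")
print(f"median {statistics.median(seconds):.3f}s")
```

The median stays at about 83.8 seconds while the mean rises to about 126.7 seconds because of the single slow run, so the median is probably the safer number to report here.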


4) Experiment with build options
I used the 10mb.png file with various build options.
-O0: no optimization
-O1: first level optimization
-O2: second level optimization
-O3: highest standard optimization level
-Ofast: -O3 plus optimizations that ignore strict standards compliance (for example -ffast-math)
time    -O0         -O1         -O2         -O3         -Ofast
real    3m39.489s   1m41.897s   1m33.908s   2m19.740s   1m23.712s
user    3m39.023s   1m41.606s   1m33.620s   1m22.996s   1m23.455s
sys     0m0.142s    0m0.127s    0m0.138s    0m0.163s    0m0.119s



2-2. AArch64
1) Benchmark the performance
To benchmark the performance, I used the same 10mb, 20mb and 30mb files.
time    10mb        20mb        30mb
real    8m8.914s    18m56.510s  29m23.732s
user    8m7.776s    18m53.535s  29m18.217s
sys     0m0.229s    0m0.847s    0m2.001s


I chose the 10mb file and executed it 5 times with the -O3 build option.

10mb    1st         2nd         3rd         4th         5th
real    8m19.454s   8m42.835s   8m31.470s   8m24.040s   8m8.914s
user    8m18.075s   8m41.506s   8m30.148s   8m22.796s   8m7.776s
sys     0m0.339s    0m0.299s    0m0.339s    0m0.319s    0m0.229s



2) Experiment with build options
I used the 10mb.png file with various build options.
time    -O0         -O1         -O2         -O3         -Ofast
real    23m39.314s  10m5.855s   8m49.292s   8m8.914s    8m7.720s
user    23m36.622s  10m4.457s   8m48.034s   8m7.776s    8m6.445s
sys     0m0.330s    0m0.329s    0m0.310s    0m0.229s    0m0.349s



When I change the build option, the running time also changes. No optimization (-O0) is the slowest on both systems, and -Ofast gives the shortest real time on both. The code itself is not changed at all, only the build option, and the performance still changes. It is really interesting to me.
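To put a number on the effect, each build option can be compared with -O0 as a speedup factor. This is a small sketch using the user times for the 10mb file from the build option results in this post; the names to_seconds and speedups are my own. I use user time instead of real time here on the assumption that it is less affected by other activity on the server.

```python
def to_seconds(text):
    """Convert a time value such as '3m39.023s' to seconds."""
    minutes, rest = text.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


# User times for the 10mb file, copied from the build option results in this post.
user_times = {
    "x86_64": {"-O0": "3m39.023s", "-O1": "1m41.606s", "-O2": "1m33.620s",
               "-O3": "1m22.996s", "-Ofast": "1m23.455s"},
    "AArch64": {"-O0": "23m36.622s", "-O1": "10m4.457s", "-O2": "8m48.034s",
                "-O3": "8m7.776s", "-Ofast": "8m6.445s"},
}


def speedups(times):
    """Return {flag: baseline_seconds / flag_seconds} with -O0 as the baseline."""
    baseline = to_seconds(times["-O0"])
    return {flag: baseline / to_seconds(value) for flag, value in times.items()}


for platform, times in user_times.items():
    for flag, factor in speedups(times).items():
        print(f"{platform:8s} {flag:7s} {factor:.2f}x")
```

With these numbers, -O3 comes out at roughly 2.6 times faster than -O0 on x86_64 and roughly 2.9 times faster on AArch64.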
For stage 2, I will profile the software to determine which part of the code is doing most of the work.



