GRASLAND Hadrien / GrayScott Reloaded - Rust practical / Commits

Commit a316f234
authored 9 months ago by Hadrien Grasland
Add chapter on storage compression
parent 5e58ee1d
Showing 4 changed files with 128 additions and 3 deletions:

exercises/Cargo.lock           +33 −1
exercises/Cargo.toml            +1 −1
handouts/src/28-compressor.md  +92 −0
handouts/src/SUMMARY.md         +2 −1
exercises/Cargo.lock  +33 −1
...
...
@@ -430,6 +430,17 @@ version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5443807d6dff69373d433ab9ef5378ad8df50ca6298caf15de6e52e24aaf54d5"
[[package]]
name = "errno"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f639046355ee4f37944e44f60642c6f3a7efa3cf6b78c78a0d989a8ce6c396a1"
dependencies = [
"errno-dragonfly",
"libc",
"winapi",
]
[[package]]
name = "errno"
version = "0.3.9"
...
...
@@ -440,6 +451,16 @@ dependencies = [
"windows-sys 0.52.0",
]
[[package]]
name = "errno-dragonfly"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aa68f1b12764fab894d2755d2518754e71b4fd80ecfb822714a1206c2aab39bf"
dependencies = [
"cc",
"libc",
]
[[package]]
name = "getrandom"
version = "0.2.15"
...
...
@@ -494,11 +515,13 @@ checksum = "bdcd9b131fd67bb827b386d0dc63d3e74196a14616ef800acf87ca5fef741a10"
dependencies = [
"bitflags 1.3.2",
"cfg-if",
"errno 0.2.8",
"hdf5-derive",
"hdf5-sys",
"hdf5-types",
"lazy_static",
"libc",
"lzf-sys",
"ndarray",
"parking_lot 0.11.2",
"paste",
...
...
@@ -570,7 +593,7 @@ dependencies = [
"bitflags 2.6.0",
"derive_more",
"enum-iterator",
"errno",
"errno
0.3.9
",
"hwlocality-sys",
"libc",
"num_enum",
...
...
@@ -729,6 +752,15 @@ version = "0.4.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "90ed8c1e510134f979dbc4f070f87d4313098b704861a105fe34231c70a3901c"
[[package]]
name = "lzf-sys"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0798d023ce0905e2c77ed96de92aab929ff9db2036cbef4edfee0daf33582aec"
dependencies = [
"cc",
]
[[package]]
name = "malloc_buf"
version = "0.0.6"
...
...
exercises/Cargo.toml  +1 −1
...
...
@@ -9,7 +9,7 @@ gpu = ["dep:directories", "dep:vulkano", "dep:vulkano-shaders"]
[dependencies]
clap = { version = "4.5.7", features = ["derive", "env"] }
directories = { version = "5.0.1", optional = true }
hdf5 = "0.8.1"
hdf5 = { version = "0.8.1", features = ["lzf"] }
hwlocality = "1.0.0-alpha.5"
indicatif = "0.17.8"
iterator_ilp = "2.1.2"
...
...
handouts/src/28-compressor.md  0 → 100644  +92 −0
# Compression
During the last exercise, you may have been disappointed by the fact that an
optimization which has a great impact in microbenchmarks (removing the inner
copy in `Concentrations::current_v()`) had no impact on the full simulation
workload.
This becomes easy to understand once you realize that the simulation is
currently not CPU-bound, but storage-bound. As you can check by running a
system monitor alongside the simulation, the storage device is intermittently
at 100% utilization while the simulation is running, which indicates that it is
very likely to be the bottleneck.
Therefore, we actually need to speed up the I/O. And this means that we must now
pay attention to what our HDF5 writer is doing.
## A closer look at the I/O code
Intuitively, performance while the simulation is running can be affected either
by the way the output dataset is configured in `create()`, or by what we are
doing on every storage write in `write()`:
```rust,ignore
/// Create or truncate the file
///
/// The file will be dimensioned to store a certain amount of V species
/// concentration arrays.
///
/// The `Result` return type indicates that this method can fail and the
/// associated I/O errors must be handled somehow.
pub fn create(file_name: &str, shape: [usize; 2], num_images: usize) -> hdf5::Result<Self> {
    // The ? syntax lets us propagate errors from an inner function call to
    // the caller, when we cannot handle them ourselves.
    let file = File::create(file_name)?;
    let [rows, cols] = shape;
    let dataset = file
        .new_dataset::<Float>()
        .chunk([1, rows, cols])
        .shape([num_images, rows, cols])
        .create("matrix")?;
    Ok(Self {
        file,
        dataset,
        position: 0,
    })
}

/// Write a new V species concentration table to the file
pub fn write(&mut self, result_v: ArrayView2<Float>) -> hdf5::Result<()> {
    self.dataset
        .write_slice(result_v, (self.position, .., ..))?;
    self.position += 1;
    Ok(())
}
```
Obviously, we cannot change much in `write()`, so let's focus on what happens
inside of `create()`. There are two obvious areas of leverage:
- We can change our hardcoded chunk size of 1 image to something larger, and
  see if doing I/O at a higher granularity helps (see the sketch after this
  list).
- We can try to enable additional HDF5 options, such as compression, to reduce
  the volume of data that is eventually sent to the storage device.
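For illustration, here is a minimal sketch of what the first option could look
like. It simply reuses the builder calls from `create()` above; the value of
`images_per_chunk` is an arbitrary placeholder, not a tuned recommendation:

```rust,ignore
// Hypothetical variant of the dataset creation code: group several images per
// HDF5 chunk instead of one. The value 8 is only an illustration.
let images_per_chunk = 8;
let dataset = file
    .new_dataset::<Float>()
    .chunk([images_per_chunk, rows, cols])
    .shape([num_images, rows, cols])
    .create("matrix")?;
```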
In which order should we perform these optimizations? Well, compression is
affected by chunk size, since a larger chunk feeds the compression engine with
more data at once, which can be either good (more patterns to compress) or bad
(worse CPU cache locality slowing down the compression algorithm). Therefore,
we should try to enable compression first, and only tune the chunk size
afterwards.
## Exercise
The course author's previous experience suggests that on modern NVMe storage
devices, only the LZ4/LZO/LZF family of fast compressors is still worthwhile.
Anything more sophisticated, even Zstandard at compression level 1, will result
in a net slowdown.
Therefore, please try to enable LZF dataset compression...
```rust,ignore
let dataset = file
    .new_dataset::<Float>()
    .chunk([1, rows, cols])
    .lzf()
    .shape([num_images, rows, cols])
    .create("matrix")?;
```
...and see if it helps or hurts for this particular computation, on your
storage hardware.
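
If you would like a rough number to complement the system monitor, a minimal
measurement sketch along the following lines can help you compare runs with and
without LZF compression. It assumes a hypothetical `writer` built with the
`create()` method above, an `images` iterator of `ArrayView2<Float>` views, and
an output file called "output.h5"; none of these names come from the exercise
codebase:

```rust,ignore
use std::time::Instant;

// Hypothetical measurement harness: time the write loop and report the final
// file size. `writer`, `images` and "output.h5" are placeholder names.
let start = Instant::now();
for image in images {
    writer.write(image)?;
}
let elapsed = start.elapsed();
let file_size = std::fs::metadata("output.h5")
    .expect("output file should exist")
    .len();
println!("wrote {file_size} bytes in {elapsed:?}");
```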
handouts/src/SUMMARY.md  +2 −1
...
...
@@ -37,7 +37,8 @@
- [Execution](25-simulating.md)
- [Batching](26-batch-system.md)
- [Avoiding copies](27-copyright.md)
- [Compression]()
- [Compression](28-compressor.md)
- [Block tuning]()
- [Async storage]()
- [Harder ideas]()
...
...