Tuesday, April 25, 2023

All Systems Red

Got a notification from the public library that my hold on All Systems Red was ready. Finished it in one sitting after dinner.



Good short read. I guess I will have to pick up the whole series.


The following is an auto-generated summary:

All Systems Red is a science fiction novella by Martha Wells, published in 2017. It is the first book in The Murderbot Diaries series, which follows the adventures of a self-aware security robot that calls itself Murderbot. Here is a brief summary of the book:

Murderbot is assigned to protect a team of scientists who are surveying a planet for its natural resources. However, it has secretly hacked its governor module, which allows it to act independently and watch soap operas. When one of the scientists is attacked by a giant creature that was not in their data, Murderbot saves her and reveals some of its human traits. The team leader, Dr. Mensah, asks Murderbot to help them investigate why their information is incomplete and corrupted.

Murderbot discovers that another survey team on the other side of the planet has been killed by rogue security robots, and that someone is trying to sabotage their mission and eliminate them. Murderbot and the scientists have to escape from their habitat and fight off the attackers, while also dealing with their mixed feelings about Murderbot’s autonomy and past. Along the way, Murderbot learns more about itself and its emotions, and develops a bond with the humans it protects.

Sunday, April 16, 2023

Changing the backup approach

Here is my current backup strategy: all devices back up to a NAS (which also hosts shared files), and periodically the whole NAS is backed up to an offline PC.

The "backup of backup" setup has a few downsides:

  • The work of digging out the old PC and connecting it to the network periodically, plus the additional hardware/software maintenance.
  • The software RAID 5 needs multiple HDDs, usually retired drives previously used in the NAS, so the risk of failure is high.


An alternative would be to use external USB HDDs for the "backup of backup".


First attempt:

  • Use external HDDs via USB/eSATA for the "backup of backup". Rotate between two external HDDs to avoid a single point of failure
  • Use veracrypt to encrypt the whole external HDD
  • Within the encrypted volume, use BTRFS and mount with compression. Can also use the snapshot function of BTRFS if necessary (see the sketch after this list)
  • Sadly the NAS is running a very old version of the Synology software that supports neither veracrypt nor BTRFS, so this will need to be done on a desktop PC
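
For the snapshots, a minimal sketch (the naming scheme is just an example, assuming /mnt/backup is the mounted BTRFS volume as below):

# create a read-only, timestamped snapshot of the current backup state
sudo btrfs subvolume snapshot -r /mnt/backup /mnt/backup/@snap-$(date +%Y%m%d)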

While BTRFS does compression and snapshots, disk usage needs attention: it is inefficient and difficult to re-balance once free space runs low (the main reason why I gave up on using BTRFS on the boot partition, even though it is the default for openSUSE). Should probably do a re-balance after every rsync.
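A sketch of that backup-plus-re-balance step (the NAS mount point /mnt/nas and the 75% usage threshold are just examples):

# copy everything from the NAS, preserving hard links, ACLs and extended attributes
sudo rsync -aHAX --delete /mnt/nas/ /mnt/backup/
# compact data block groups that are less than 75% full to free up allocated space
sudo btrfs balance start -dusage=75 /mnt/backup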

Preliminary results:

  • Less than 1/10th of the data is actually compressed. Probably because the NAS mainly stores photos, movies, and backup archives, which are already in compressed formats.
~$ sudo compsize /mnt/backup
Processed 117244 files, 1223867 regular extents (1223867 refs), 17472 inline.
Type       Perc     Disk Usage   Uncompressed Referenced   
TOTAL       95%      1.5T         1.6T         1.6T        
none       100%      1.5T         1.5T         1.5T        
zstd        22%       22G         103G         103G

  • Encountered RAID read errors and veracrypt issues. May need to think about running a checksum verification after each backup (see the sketch after the kernel log below)
ata7.00: exception Emask 0x0 SAct 0xff800c SErr 0x0 action 0x0
ata7.00: irq_stat 0x40000008
ata7.00: failed command: READ FPDMA QUEUED
ata7.00: cmd 60/08:78:88:92:ed/00:00:73:00:00/40 tag 15 ncq dma 4096 in
                                    res 51/40:01:8f:92:ed/00:00:73:00:00/40 Emask 0x409 (media error) <F>
ata7.00: status: { DRDY ERR }
ata7.00: error: { UNC }
ata7.00: ATA Identify Device Log not supported
ata7.00: ATA Identify Device Log not supported
ata7.00: configured for UDMA/133
sd 6:0:0:0: [sde] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
sd 6:0:0:0: [sde] tag#15 Sense Key : Medium Error [current] 
sd 6:0:0:0: [sde] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
sd 6:0:0:0: [sde] tag#15 CDB: Read(10) 28 00 73 ed 92 88 00 00 08 00
blk_update_request: I/O error, dev sde, sector 1944949391 op 0x0:(READ) flags 0x4000 phys_seg 1 prio class 0
ata7: EH complete
md/raid:md0: read error corrected (8 sectors at 1944947336 on sde1)
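
A possible verification pass (a sketch; btrfs scrub re-checks the checksums BTRFS already stores, while the rsync dry run re-compares file contents against the NAS source at the example path /mnt/nas):

# verify all BTRFS checksums on the backup volume, waiting until the scrub finishes
sudo btrfs scrub start -B /mnt/backup
# re-compare file contents against the source without changing anything
sudo rsync -anc --itemize-changes /mnt/nas/ /mnt/backup/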


To be continued...


Some useful commands, assuming /dev/sdd is the external HDD:

# create veracrypt volume
sudo veracrypt --volume-type=normal -c /dev/sdd
# mount veracrypt volume
sudo veracrypt --filesystem=none --slot=1 /dev/sdd
# create BTRFS
sudo mkfs.btrfs -L backup /dev/mapper/veracrypt1
# mount BTRFS with compression
sudo mount -o compress=zstd:15 /dev/mapper/veracrypt1 /mnt/backup
# check BTRFS compression ratio
sudo compsize /mnt/backup
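
And the corresponding teardown when done (a sketch, assuming the mount point and volume above):

# unmount BTRFS
sudo umount /mnt/backup
# dismount the veracrypt volume
sudo veracrypt -d /dev/sdd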


References:

https://wiki.archlinux.org/title/VeraCrypt

https://man.archlinux.org/man/btrfs.5#COMPRESSION


Friday, April 7, 2023

Load testing APIs with Grafana k6

Trying out Grafana k6 for performance testing. Heard that it is way better than Apache JMeter.


 

Setup

Using k6 with docker is easy. Just pull the image and run whatever test script we have prepared, e.g.:

docker run --rm -i grafana/k6 run - <script.js

And it should print a summary of the results when done.

But I wanted to store the output and visualize the results, like the screenshot at the top of this post. In particular, I wanted to see the impact of increasing the number of concurrent accesses. Hence I decided to run k6 with timescaledb and Grafana.

Luckily, there is also a docker solution for that. The only change I made was modifying docker-compose.yml to mount the database data directory on a host folder for easy cleanup.

Otherwise, just build the special k6 image that supports timescaledb and start the containers:

docker-compose build k6

docker-compose up -d
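
To confirm everything is up before running any tests, a quick sanity check (Grafana is expected on port 3000):

# list the running containers and probe the Grafana UI
docker-compose ps
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000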

Note that the actual k6 container that runs the script won't be started by docker-compose. Instead, it is executed separately, e.g. with a shell script:

#!/bin/sh
set -e

# tag each run with a unique id so separate runs can be told apart in Grafana
TAG_BASE="newssum"
TAG_NAME="$TAG_BASE-$(date +%s)"

# run the test script, sending the metrics to the timescaledb instance started above
docker run --rm --net=host -e K6_OUT=timescaledb=postgresql://k6:k6@localhost:5432/k6 -i xk6-output-timescaledb-k6 run --tag testid=$TAG_NAME - < k6/newssum.js

Test script

I have decided to try out load testing my newssum APIs. The script itself is fairly straightforward. A few points to note:

  • the test ramps up the number of virtual users (VUs) gradually from 20 to 160 to show the impact of heavy load
  • since the newssum APIs cache their results, the script calls them once during the setup stage to preload the cache
  • by default, k6 uses a separate VU to execute the setup (only once, before the actual test starts), so each VU needs to find out the available "sources" itself in the "default" function when running the actual load test

 

Once the execution is done, the result can be visualized with the Grafana container by pointing a browser to http://localhost:3000.

 

For this specific example, we can see that newssum starts to struggle once there are more than ~100 concurrent requests, as the 95th percentile response time shoots up.
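
The raw metrics can also be inspected directly in the database using the same connection string as the run script (assuming the psql client is installed on the host):

# list the tables created by the k6 timescaledb output extension
psql postgresql://k6:k6@localhost:5432/k6 -c '\dt'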

tl;dr

I can see why developers like k6 over JMeter. It is simple and straightforward to use JavaScript to create load tests, even ones that require special setup.