Monday, May 22, 2023

Beasts and Beauty

Beasts and Beauty: Dangerous Tales. A quick read for a long weekend. Recommended.


A generated review:

"Beasts and Beauty: Dangerous Tales" by Soman Chainani is a delightfully wicked and humorous collection of stories that will keep readers hooked from start to finish. With his signature blend of fantasy and wit, Chainani takes classic fairy tales and gives them a dark twist, adding unexpected layers and surprises. The characters are richly developed, each with their own quirks and flaws, and the author's clever writing style adds an extra dose of charm. From mischievous beasts to cunning beauties, this book is a wild ride through a world where nothing is as it seems. Prepare to be enchanted and entertained!



Wednesday, May 10, 2023

Using Google Coral TPU on OpenSUSE




Installing the TPU driver

The official guide only covers the steps for Debian. Third-party compiled binaries are available for Fedora, RHEL, OpenSUSE, and others. The RPM for OpenSUSE Tumbleweed can be downloaded directly here.

Once downloaded, install the RPM file with "sudo zypper install gasket..."
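
For example (the RPM filename below is illustrative, and the device check assumes a PCIe/M.2 Coral):

# install the downloaded driver RPM (the exact filename will differ)
sudo zypper install ./gasket-dkms-*.rpm
# once the driver is loaded (a reboot helps), the TPU should show up as an apex device
ls -l /dev/apex_0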

Installing the userspace runtime driver

The userspace driver can be built from source inside docker:

git clone https://github.com/google-coral/libedgetpu.git
cd libedgetpu
CPU=k8 DOCKER_CPUS="k8" DOCKER_IMAGE="ubuntu:18.04" DOCKER_TARGETS=libedgetpu make docker-build

Copy the files from the "out" folder and register them with ldconfig:

sudo mkdir -p /usr/lib/tpu/lib
sudo cp --preserve=links out/direct/k8/* /usr/lib/tpu/lib
echo /usr/lib/tpu/lib | sudo tee -a /etc/ld.so.conf.d/tpu.conf
sudo ldconfig
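
To confirm that the runtime is now visible to the dynamic linker:

ldconfig -p | grep edgetpu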

Running the examples

The "pycoral" package mentioned in the official guide is kind of confusing. The "pycoral" on PyPI is actually a totally unrelated package. What we need for the TPU can be found here.

Also, there is no support for Python 3.10 yet, so Python 3.9 is needed. To install the packages required to run the example:

python3.9 -m venv venv
. venv/bin/activate
pip install numpy pillow
python -m pip install --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0
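
With the packages installed, a quick sanity check (assuming the driver from the previous section is loaded) is to list the detected TPUs:

python -c "from pycoral.utils.edgetpu import list_edge_tpus; print(list_edge_tpus())"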

Then follow the steps in the official guide to run the example.

$ python3 examples/classify_image.py --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels test_data/inat_bird_labels.txt --input test_data/parrot.jpg
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
12.8ms
2.8ms
2.8ms
2.8ms
2.8ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781



Tuesday, April 25, 2023

All Systems Red

Got a notification from the public library that my hold on All Systems Red was ready. Finished it in one sitting after dinner.



Good short read. I guess I will have to pick up the whole series.


The following is an auto-generated summary:

All Systems Red is a science fiction novella by Martha Wells, published in 2017. It is the first book in The Murderbot Diaries series, which follows the adventures of a self-aware security robot that calls itself Murderbot. Here is a brief summary of the book:

Murderbot is assigned to protect a team of scientists who are surveying a planet for its natural resources. However, it has secretly hacked its governor module, which allows it to act independently and watch soap operas. When one of the scientists is attacked by a giant creature that was not in their data, Murderbot saves her and reveals some of its human traits. The team leader, Dr. Mensah, asks Murderbot to help them investigate why their information is incomplete and corrupted.

Murderbot discovers that another survey team on the other side of the planet has been killed by rogue security robots, and that someone is trying to sabotage their mission and eliminate them. Murderbot and the scientists have to escape from their habitat and fight off the attackers, while also dealing with their mixed feelings about Murderbot’s autonomy and past. Along the way, Murderbot learns more about itself and its emotions, and develops a bond with the humans it protects.

Sunday, April 16, 2023

Changing the backup approach

Here is my current backup strategy: basically, all devices back up to a NAS (which also hosts shared files), and periodically the whole NAS is backed up to an offline PC.

The "backup of backup" setup has a few downsides:

  • Digging out and connecting the old PC to the network periodically, plus the additional hardware and software maintenance.
  • The software RAID 5 needs multiple HDDs, usually retired drives previously used in the NAS, so the risk of failure is high.


An alternative would be using external USB HDDs for the "backup of backup".


First attempt:

  • Use external HDDs via USB/eSATA for the "backup of backup". Rotate between two external HDDs to avoid a single point of failure
  • Use VeraCrypt to encrypt the whole external HDD device
  • Within the encrypted volume, use BTRFS and mount it with compression. The snapshot function of BTRFS can also be used if necessary
  • Sadly, the NAS is running a very old version of the Synology software, which supports neither VeraCrypt nor BTRFS. This will need to be done on a desktop PC.

While BTRFS does compression and snapshots, disk usage needs attention: it becomes inefficient and difficult to re-balance when free space is low (the main reason why I gave up on using BTRFS on the boot partition, even though it is the default for OpenSUSE). Should probably do a re-balance after every rsync, as sketched below.
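
A minimal sketch of such a re-balance, compacting only data chunks that are mostly empty (the 50% threshold is arbitrary):

# compact data chunks that are less than 50% full
sudo btrfs balance start -dusage=50 /mnt/backup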

Preliminary results:

  • Less than 1/10th of the files are actually compressed, probably because the NAS mainly stores photos, movies, and backup archives, which are already in compressed formats.
~$ sudo compsize /mnt/backup
Processed 117244 files, 1223867 regular extents (1223867 refs), 17472 inline.
Type       Perc     Disk Usage   Uncompressed Referenced   
TOTAL       95%      1.5T         1.6T         1.6T        
none       100%      1.5T         1.5T         1.5T        
zstd        22%       22G         103G         103G
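
In theory, mounting with compress-force instead of compress would make BTRFS attempt compression even on files its heuristic skips, though for already-compressed media it mostly wastes CPU:

# force a compression attempt on every file (probably not worth it for this data)
sudo mount -o compress-force=zstd:15 /dev/mapper/veracrypt1 /mnt/backup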

  • Encountered RAID read errors and VeraCrypt issues. May need to verify data checksums after each backup; one option is sketched after the kernel log below
ata7.00: exception Emask 0x0 SAct 0xff800c SErr 0x0 action 0x0
ata7.00: irq_stat 0x40000008
ata7.00: failed command: READ FPDMA QUEUED
ata7.00: cmd 60/08:78:88:92:ed/00:00:73:00:00/40 tag 15 ncq dma 4096 in
                                    res 51/40:01:8f:92:ed/00:00:73:00:00/40 Emask 0x409 (media error) <F>
ata7.00: status: { DRDY ERR }
ata7.00: error: { UNC }
ata7.00: ATA Identify Device Log not supported
ata7.00: ATA Identify Device Log not supported
ata7.00: configured for UDMA/133
sd 6:0:0:0: [sde] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
sd 6:0:0:0: [sde] tag#15 Sense Key : Medium Error [current] 
sd 6:0:0:0: [sde] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
sd 6:0:0:0: [sde] tag#15 CDB: Read(10) 28 00 73 ed 92 88 00 00 08 00
blk_update_request: I/O error, dev sde, sector 1944949391 op 0x0:(READ) flags 0x4000 phys_seg 1 prio class 0
ata7: EH complete
md/raid:md0: read error corrected (8 sectors at 1944947336 on sde1)
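
Since BTRFS checksums all data, one option is to scrub the backup volume after each run (assuming it is still mounted at /mnt/backup):

# read every block and verify it against its checksum; -B waits and prints a summary
sudo btrfs scrub start -B /mnt/backup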


To be continued...


Some useful commands, assuming sdd is the external HDD:

# create veracrypt volume
sudo veracrypt --volume-type=normal -c /dev/sdd
# mount veracrypt volume
sudo veracrypt --filesystem=none --slot=1 /dev/sdd
# create BTRFS
sudo mkfs.btrfs -L backup /dev/mapper/veracrypt1
# mount BTRFS with compression
sudo mount -o compress=zstd:15 /dev/mapper/veracrypt1 /mnt/backup
# check BTRFS compression ratio
sudo compsize /mnt/backup


References:

https://wiki.archlinux.org/title/VeraCrypt

https://man.archlinux.org/man/btrfs.5#COMPRESSION


Friday, April 7, 2023

Load testing APIs with Grafana k6

Trying out Grafana k6 for performance testing. Heard that it is way better than Apache JMeter.

Setup

Using k6 with docker is easy. Just pull the image and run whatever test script we have prepared, e.g.:

docker run --rm -i grafana/k6 run - <script.js
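
where script.js contains the test. A minimal example (the target URL is k6's public demo site; replace it with the API under test):

// minimal k6 script: 10 virtual users for 30 seconds
import http from 'k6/http';
import { sleep } from 'k6';

export const options = { vus: 10, duration: '30s' };

export default function () {
  http.get('https://test.k6.io');
  sleep(1);
}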

And it should output a summary of the results when done.

But I wanted to store the output and visualize the results, like the screenshot at the top of this post. In particular, I wanted to see the impact of increasing the number of concurrent accesses. Hence I decided to run k6 with TimescaleDB and Grafana.

Luckily, there is also a docker solution for that. The only change I made was modifying docker-compose.yml to mount the database data directory onto a host folder for easy clean-up:
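
Something along these lines (the host path is arbitrary; /var/lib/postgresql/data is where the TimescaleDB image keeps its data):

services:
  timescaledb:
    volumes:
      - ./pgdata:/var/lib/postgresql/data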

Other than that, just build the special k6 image that supports TimescaleDB and start the containers:

docker-compose build k6

docker-compose up -d

Note that the actual k6 container that runs the script won't be started by docker-compose. Instead, it is executed separately, e.g. with a shell script:

#!/bin/sh
set -e

TAG_BASE="newssum"
TAG_NAME="$TAG_BASE-$(date +%s)"

docker run --rm --net=host -e K6_OUT=timescaledb=postgresql://k6:k6@localhost:5432/k6 -i xk6-output-timescaledb-k6 run --tag testid=$TAG_NAME - < k6/newssum.js

Test script

I have decided to try out load testing my newssum APIs. The script itself is fairly straightforward. A few points to note:

  • the test ramps up the number of virtual users (VUs) slowly from 20 to 160 to show the impact of heavy load
  • since the newssum APIs cache their results, the script calls each of them once during the setup stage to preload the cache
  • by default, k6 uses a separate VU to execute the setup (only once, before the actual test starts), so each VU needs to discover the available "sources" itself in the "default" function during the actual load test
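
Putting these points together, the script looks roughly like this (the base URL and endpoint paths are illustrative, not the actual newssum routes):

import http from 'k6/http';
import { check } from 'k6';

const BASE_URL = 'http://localhost:8080';  // illustrative

export const options = {
  // ramp the VUs up slowly from 20 to 160, then back down
  stages: [
    { duration: '1m', target: 20 },
    { duration: '5m', target: 160 },
    { duration: '1m', target: 0 },
  ],
};

// runs once in a separate VU before the test: hit every source once
// so the server-side caches are preloaded
export function setup() {
  const sources = http.get(`${BASE_URL}/api/sources`).json();
  sources.forEach((s) => http.get(`${BASE_URL}/api/news/${s}`));
}

// each VU discovers the available sources itself, then hits one at random
export default function () {
  const sources = http.get(`${BASE_URL}/api/sources`).json();
  const s = sources[Math.floor(Math.random() * sources.length)];
  const res = http.get(`${BASE_URL}/api/news/${s}`);
  check(res, { 'status is 200': (r) => r.status === 200 });
}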


Once the execution is done, the results can be visualized with the Grafana container by pointing the browser to http://localhost:3000.


For this specific example, we can see that newssum starts to struggle once there are over ~100 concurrent requests, as the 95th-percentile response time shoots up.

tl;dr

I can see why developers like k6 over JMeter. It is simple and straightforward to use JavaScript to create load tests, even ones that require special setup.