File Format

February 14, 2026 Hackropole #hardware #radio #bash #linux

Link to challenge

Naïve bash

To solve this challenge I started with a pure bash approach. The original file interleaves 4-byte samples as i_0 q_0 i_1 q_1 ... So I start by putting each i_k q_k pair on its own line:

xxd -c 8 -g 4 -p file-format.iq
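As a toy illustration, here is the same command run on a synthetic 16-byte file standing in for file-format.iq (note that in plain-hex mode, -p, xxd ignores the -g grouping, so -c alone controls the line width):

```shell
# two fake I/Q pairs, 4 bytes per sample, standing in for file-format.iq
printf 'IIIIQQQQiiiiqqqq' > demo.iq
xxd -c 8 -p demo.iq
# 4949494951515151   <- i_0 (8 hex chars) followed by q_0
# 6969696971717171   <- i_1 followed by q_1
```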

Then I split each line into its first 8 and last 8 hex characters, and concatenate them:

concat_i=''
concat_q=''
for iq in $(xxd -c 8 -g 4 -p file-format.iq) ; do
    i=$(echo $iq | cut -b -8)
    q=$(echo $iq | cut -b 9-)
    concat_i="${concat_i}${i}"
    concat_q="${concat_q}${q}"
done

Finally I convert it back to bytes and calculate the hash.

#!/bin/bash

concat_i=''
concat_q=''
for iq in $(xxd -c 8 -g 4 -p file-format.iq) ; do
    i=$(echo $iq | cut -b -8)
    q=$(echo $iq | cut -b 9-)
    concat_i="${concat_i}${i}"
    concat_q="${concat_q}${q}"
done
hash=$(echo -n "${concat_i}${concat_q}" | xxd -r -p | sha256sum | cut -d ' ' -f 1)
echo "FCSC{${hash}}"

This works, BUT it is extremely slow (several seconds), because on each iteration of the loop:

  • I spawn two echo and two cut processes
  • I create a new copy of each of my concatenated strings
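As an aside, the forks (first bullet) can be removed while staying in pure bash, by using substring parameter expansion instead of echo and cut. A sketch, demonstrated here on a tiny synthetic file in place of file-format.iq:

```shell
#!/bin/bash
# fork-free substring extraction with ${var:offset:length}
printf 'IIIIQQQQiiiiqqqq' > demo.iq   # stand-in for file-format.iq
concat_i=''
concat_q=''
for iq in $(xxd -c 8 -p demo.iq) ; do
    concat_i+="${iq:0:8}"   # first 8 hex chars = 4-byte I sample
    concat_q+="${iq:8}"     # last 8 hex chars  = 4-byte Q sample
done
echo -n "${concat_i}${concat_q}" | xxd -r -p   # prints IIIIiiiiQQQQqqqq
```

The repeated string copies (second bullet) remain, but dropping the four forks per iteration should already remove most of the per-iteration overhead.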

Pipe operator

To solve both of these issues we can resort to the all-powerful | (pipe) operator: it lets us build the concatenated i and q strings in a much simpler way.

Buffered pipe

#!/bin/bash

i=$(xxd -c 8 -g 4 -p file-format.iq | cut -b -8 | tr -d '\n')
q=$(xxd -c 8 -g 4 -p file-format.iq | cut -b 9- | tr -d '\n')
hash=$(echo -n "${i}${q}" | xxd -r -p | sha256sum | cut -d ' ' -f 1)
echo "FCSC{${hash}}"

This is an improvement, as it runs in less than a second, despite still using explicit assignments (to the variables i, q and hash). But why not solve this without using that much RAM?

Streaming pipe

#!/bin/bash

(
    xxd -c 8 -g 4 -p file-format.iq | cut -b -8 | tr -d '\n'
    xxd -c 8 -g 4 -p file-format.iq | cut -b 9- | tr -d '\n'
) | xxd -r -p | sha256sum | cut -d ' ' -f 1 | awk '{print "FCSC{"$1"}"}'

Now the real question is: how many passes over the data actually happen (before the hashing)? The answer is 6 or 2, depending on how we count.

At a high level, each xxd, cut and tr loops through the whole file's worth of data. But the pipe operator connects the standard output of one program directly to the standard input of the next, so every stage of a pipeline runs concurrently, each program processing the output of the previous one as soon as it becomes available. Thanks to this native concurrency, by the time the initial xxd finishes, the whole pipeline is practically done.
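This streaming behaviour is easy to see with a toy pipeline: the right-hand side consumes each line as soon as the left-hand side emits it, without waiting for the producer to finish.

```shell
#!/bin/bash
# the consumer loop prints "consumed: one" immediately,
# then "consumed: two" 0.2 s later, once the producer emits it
( echo one; sleep 0.2; echo two ) | while read -r line ; do
    echo "consumed: $line"
done
```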

awk scripting

While this is already quite performant, it can still be improved slightly using awk.

This time we will parse the file only once, and we will use awk's native variables. There is still one explicit loop, but it runs inside awk's interpreter rather than as shell commands spawning processes, which makes it far faster.

#!/bin/bash

xxd -p -c 8 file-format.iq | awk '
{
    i_part = i_part substr($0, 1, 8)
    q_part = q_part substr($0, 9, 8)
}
END {
    print i_part q_part
}' | xxd -r -p | sha256sum | awk '{print "FCSC{"$1"}"}'

Benchmarking

After running each of these programs 160 times (10 × nproc), the following shows the average time taken by each:

Program                 Avg Time (s)
./s-naive.sh            11.53767
./s-buffered-pipe.sh     0.01173
./s-streamed-pipe.sh     0.00954
./s-awk.sh               0.00621

What about kernel file caching?

You might be wondering why, in the buffered and streamed pipe versions, I did not store the output of the xxd command in a variable instead of parsing the file twice. I assumed that the first read would make the kernel cache the file, making the second xxd read very efficient. In practice this held: while I was running these programs, file-format.iq never left the page cache. But what if it did?
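For completeness, here is what the single-read variant would look like: capture the hex dump in a variable once and reuse it, at the cost of holding the whole dump in RAM. A sketch, demonstrated on a tiny synthetic file in place of file-format.iq:

```shell
#!/bin/bash
printf 'IIIIQQQQiiiiqqqq' > demo.iq     # stand-in for file-format.iq
hex=$(xxd -c 8 -p demo.iq)              # single read of the file
i=$(cut -b -8 <<< "$hex" | tr -d '\n')  # I half of every line
q=$(cut -b 9- <<< "$hex" | tr -d '\n')  # Q half of every line
echo -n "${i}${q}" | xxd -r -p | sha256sum | cut -d ' ' -f 1
```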

We will be using /proc/sys/vm/drop_caches, a kernel interface whose main use is to drop the page cache, precisely so that disk I/O can be benchmarked from a cold cache.

$ cat cache-check.sh
#!/bin/bash
# COLD RUN (Disk I/O)
FILE=$1
echo "--- Cold Run (Reading from Disk) ---"
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
time ${FILE}

echo -e "\n--- Warm Run (Reading from RAM) ---"
# WARM RUN (Cached)
time ${FILE}

$ ./cache-check.sh ./s-awk.sh
--- Cold Run (Reading from Disk) ---
FCSC{843161934a8e53da8723047bed55e604e725160b868abb74612e243af94345d7}

real 0m0.021s
user 0m0.000s
sys 0m0.016s

--- Warm Run (Reading from RAM) ---
FCSC{843161934a8e53da8723047bed55e604e725160b868abb74612e243af94345d7}

real 0m0.004s
user 0m0.001s
sys 0m0.008s

$ ./cache-check.sh ./s-streamed-pipe.sh
--- Cold Run (Reading from Disk) ---
FCSC{843161934a8e53da8723047bed55e604e725160b868abb74612e243af94345d7}

real 0m0.021s
user 0m0.002s
sys 0m0.021s

--- Warm Run (Reading from RAM) ---
FCSC{843161934a8e53da8723047bed55e604e725160b868abb74612e243af94345d7}

real 0m0.006s
user 0m0.003s
sys 0m0.013s
  • real: the wall-clock time the program took to run
  • user: the CPU time spent in user mode
  • sys: the CPU time spent in kernel mode

The observations here are what we expected:

  • disk I/O is the bottleneck
  • kernel caching of files is a major improvement
  • sys + user > real in the warm run, showing that the pipe operator effectively runs the stages in parallel across cores
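The same effect can be reproduced with any CPU-heavy pipeline: on a multi-core machine, time will typically report user + sys greater than real, because the stages run on different cores.

```shell
#!/bin/bash
# yes produces bytes, head caps them at 10 MB, wc counts them;
# all three run concurrently, so time often shows user+sys > real
time ( yes | head -c 10000000 | wc -c )
```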

If I run these 160 times, I get:

Command                 Cold Avg (s)   Warm Avg (s)   Speedup
./s-buffered-pipe.sh    0.01927        0.01130        1.7x
./s-streamed-pipe.sh    0.01651        0.00767        2.2x
./s-awk.sh              0.01519        0.00625        2.4x

Conclusion

This writeup is a little rabbit hole on bash performance, born of my frustration that my initial script took several seconds to output the flag.

Hopefully you will find it interesting.