It looks like they did it both ways (“raw rate” vs “adjusted rate”):
In the case of the adjusted compression rate, the model’s size is also added to the compressed size, i.e., it becomes (compressed size + number of model parameters) / raw size. This metric allows us to see the impact of model parameters on the compression performance. A very large model might be able to compress the data better compared to a smaller model, but when its size is taken into account, the smaller model might be doing better. This metric allows us to see that.
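To make the quoted definition concrete, here is a minimal sketch of the two rates. The function names and all the numbers are made up for illustration, and it assumes the model size has already been converted to the same unit as the compressed size (e.g. bytes); the quote only says "number of model parameters", so how a parameter count maps to bytes is an assumption on my part.

```python
def raw_rate(compressed_size: float, raw_size: float) -> float:
    """Raw compression rate: compressed size / raw size (lower is better)."""
    return compressed_size / raw_size


def adjusted_rate(compressed_size: float, model_size: float, raw_size: float) -> float:
    """Adjusted rate: (compressed size + model size) / raw size.

    Assumption: model_size is expressed in the same unit as compressed_size.
    """
    return (compressed_size + model_size) / raw_size


# Hypothetical numbers, chosen only to illustrate the trade-off.
RAW_SIZE = 1_000_000_000  # 1 GB of raw data

# Small model: compresses worse, but costs little to store.
small_raw = raw_rate(300_000_000, RAW_SIZE)
small_adj = adjusted_rate(300_000_000, 10_000_000, RAW_SIZE)

# Large model: compresses better, but its own size dominates.
large_raw = raw_rate(100_000_000, RAW_SIZE)
large_adj = adjusted_rate(100_000_000, 2_000_000_000, RAW_SIZE)

print(f"small model: raw={small_raw:.2f}, adjusted={small_adj:.2f}")
print(f"large model: raw={large_raw:.2f}, adjusted={large_adj:.2f}")
```

With these made-up numbers the large model wins on the raw rate but loses on the adjusted rate, because its adjusted rate ends up above 1: storing the model plus the compressed data takes more space than the raw data itself. That is the effect the quoted passage describes.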