Yes. They also mention that using such large models for compression is not practical because their size thwarts any amount of data you might want to compress. But this result gives a good picture into how generalized such large models are, and how well they are able to predict the next tokens for image/audio data at a high accuracy.