You could use ffmpeg or Python to split the video into a sequence of images plus an audio file, AI-upscale the images with Upscayl, and then use ffmpeg to combine the upscaled images and the audio back into a video.
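As a rough sketch of the ffmpeg side of that workflow, something like the following could work. The file names (`input.mp4`, `frames`, `upscaled`, `audio.mka`), the helper function names, and the 30 fps value are all placeholders I made up, and the upscaling itself is assumed to happen separately in Upscayl between the two steps. I'm also assuming the audio codec can be copied into the output container as-is; if not, you would need to re-encode it.

```python
import subprocess


def split_commands(video: str, frames_dir: str, audio: str) -> list[list[str]]:
    """Build ffmpeg commands that dump frames as PNGs and extract the audio."""
    frame_pattern = f"{frames_dir}/%06d.png"  # 000001.png, 000002.png, ...
    return [
        # Write every video frame as a numbered PNG (frames_dir must exist).
        ["ffmpeg", "-i", video, frame_pattern],
        # -vn drops the video; -c:a copy keeps the audio without re-encoding.
        ["ffmpeg", "-i", video, "-vn", "-c:a", "copy", audio],
    ]


def combine_command(frames_dir: str, audio: str, fps: str, output: str) -> list[str]:
    """Build the ffmpeg command that joins upscaled frames and audio."""
    frame_pattern = f"{frames_dir}/%06d.png"
    return [
        "ffmpeg",
        "-framerate", fps,        # must match the source video's frame rate
        "-i", frame_pattern,      # upscaled image sequence
        "-i", audio,              # audio extracted earlier
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",    # widely compatible pixel format
        "-c:a", "copy",
        output,
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order and stop on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Step 1: split. Step 2 (not shown): upscale frames/ into upscaled/ with Upscayl.
    run_all(split_commands("input.mp4", "frames", "audio.mka"))
    # Step 3: recombine, once the upscaled frames exist.
    # run_all([combine_command("upscaled", "audio.mka", "30", "output.mp4")])
```

The recombine step is commented out on purpose, since it only makes sense after the upscaled frames have been written.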
I’ve seen issues in the past where the audio ended up out of sync after recombining, because ffmpeg didn’t output the right number of frames. Someone wrote a Python script to split the video into frames instead, and apparently it works correctly.
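I don't have that script to hand, so the following is not it. It is only a small sketch of how you might check whether the number of extracted frames matches the source before spending time on upscaling. The function names and the `frames` directory are hypothetical. The ffprobe call counts decoded frames in the first video stream, which can be slow on long videos.

```python
from pathlib import Path


def ffprobe_frame_count_command(video: str) -> list[str]:
    """Build an ffprobe command that prints the decoded frame count."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",             # first video stream only
        "-count_frames",                      # decode and count every frame
        "-show_entries", "stream=nb_read_frames",
        "-of", "csv=p=0",                     # print just the number
        video,
    ]


def count_extracted_frames(frames_dir: str) -> int:
    """Count the PNG files that the split step wrote."""
    return sum(1 for _ in Path(frames_dir).glob("*.png"))


def frame_drift_seconds(source_frames: int, extracted_frames: int, fps: float) -> float:
    """Estimate how far video and audio drift apart by the end of the clip.

    A positive value means extra frames were written, so the recombined
    video would run longer than the audio.
    """
    return (extracted_frames - source_frames) / fps
```

For example, `frame_drift_seconds(3000, 3030, 30.0)` gives `1.0`, meaning the picture would be about a second behind the audio by the end. If the counts differ, one thing that might be worth trying is ffmpeg's `-vsync 0` option (`-fps_mode passthrough` on newer builds) when extracting, which tells it not to duplicate or drop frames, though I can't say for sure that this was the cause in the cases I saw.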