I’ve hacked together plenty of bash aliases, functions, and scripts with coreutils myself, but sometimes you need something more robust in terms of error handling, retries, and maintainability, or an actually distributed solution.
xargs on its own may well be more resilient than a distributed crawler, as one would expect, but if I’m tasked with building a distributed data processing pipeline, I want guarantees from the system as a whole, not only from its individual building blocks.
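To make that concrete, below is a minimal sketch of the kind of retry wrapper you end up hand-rolling around xargs; `fetch_one`, `urls.txt`, and the backoff parameters are hypothetical stand-ins, not anything from a real pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: the retry logic you end up hand-rolling around
# xargs. fetch_one, urls.txt, and the backoff are illustrative stand-ins.
set -euo pipefail

fetch_one() {
  local url=$1 attempt
  for attempt in 1 2 3; do
    if curl --fail --silent --max-time 30 "$url" -o "out/$(basename "$url")"; then
      return 0
    fi
    sleep $((attempt * 5))  # crude linear backoff between attempts
  done
  echo "giving up on $url" >&2
  return 1
}
export -f fetch_one

mkdir -p out
# Fetch up to 8 URLs in parallel; xargs keeps going past individual
# failures and exits nonzero at the end if any invocation failed.
xargs -n 1 -P 8 bash -c 'fetch_one "$0"' < urls.txt \
  || echo "some URLs failed after 3 attempts" >&2
```

Even with the retries in place, nothing here tracks which inputs a machine that died mid-run never processed; that bookkeeping is precisely the whole-system guarantee a hand-rolled script does not give you.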
The time and effort spent embedding these guarantees in hacked-together shell scripts running on a dozen machines might be better invested in building on a more solid foundation.