Machine learning is only as good as the dataset it has. Given that English has a HUGE dataset on the internet, it's okay at English, but it makes sense that for other languages it's likely not ideal.
An example would be art. Look up a model trained on a smaller dataset (e.g. fully legal ones where all training data had artist permission) vs. ones trained on a larger dataset where legality wasn't a concern. Night and day difference.