Google is doing this exact same thing with Gemini, the platform behind Bard / Assistant.
Gemini has large scale models, that live in data centers, and handles complex queries. They also have a “Nano” version of the model that can live on a phone and handle simpler on-device tasks.
The smaller models are great for things like natural language UI and smart home controls. It’s also way faster and capable of working offline. A big use case for offline AI has been hiking with the Apple Watch in areas with no reception.