The CUDA moat is pretty deep, but the primitives are starting to solidify and almost no one writes CUDA directly anymore. Increasingly popular libraries are going multi-backend (thanks, Apple silicon).
My guess is that as soon as cheap accelerators with LARGE memory banks hit the market, the libraries will support whatever API those need and CUDA's dominance will be essentially shattered forever.
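To make "going multi-backend" concrete, here is a minimal sketch (all names hypothetical, not any real library's API) of the dispatch pattern these libraries use: each accelerator API registers an availability probe, and user code asks for the best available device instead of hard-coding CUDA. Supporting a new accelerator means adding one entry to the table, not rewriting every model.

```python
# Hypothetical backend table, ordered by preference. Each probe reports
# whether that backend is usable on this machine; the lambdas stand in for
# real driver checks (e.g. PyTorch's torch.cuda.is_available()).
_BACKENDS = [
    ("cuda", lambda: False),  # pretend there's no NVIDIA GPU here
    ("mps",  lambda: True),   # pretend we're on Apple silicon
    ("cpu",  lambda: True),   # always-available fallback
]

def pick_backend() -> str:
    """Return the first backend whose availability probe succeeds."""
    for name, available in _BACKENDS:
        if available():
            return name
    raise RuntimeError("no usable backend")

print(pick_backend())  # -> mps
```

The point is that user code only ever calls `pick_backend()`; swap the CUDA entry for whatever API a new cheap accelerator ships with, and everything downstream keeps working.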
But we are not there yet because making good numerical hardware is fucking hard.
Those accelerators need to be substantially cheaper and (more importantly) they need loads more memory.
The problem is that everyone (chiefly Nvidia, but not only) is afraid of hurting their professional offerings by introducing consumer-grade ML cards. They are not afraid of Joe having to use a smaller model for the AI on his security cameras; they are afraid of large companies ditching their A100 fleets for consumer equipment.
So they segment the market any way they can think of, and Joe gets screwed.
It's classic market failure, really.