8/ Sure, running high-res inference is more expensive, but we add Gram loss late in training (1M iters) and run just 70k iters!
We initially tried from 200k iters, to prevent degradation, but even damaged features can be quickly fixed, reaching the same quality as early use 🤯