I'd like to fine-tune a model that does img2img with a text prompt guiding the output. I think img2img-turbo might be the closest to what I'm after, though by default it uses a fixed prompt; making the prompt variable per sample would take some tweaking of the training code.
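To make the "variable prompt" idea concrete, here is a minimal sketch of the data side only, using nothing but the standard library. It assumes a hypothetical JSONL manifest where each line has `before`, `after`, and `prompt` keys; that layout, and the names `EditExample` and `load_manifest`, are made up for illustration and are not img2img-turbo's own format or API.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class EditExample:
    before: Path  # input image path
    after: Path   # target image path
    prompt: str   # modification text for this specific pair


def load_manifest(manifest_path):
    """Load (before, after, prompt) triplets from a JSONL file.

    Hypothetical layout: one JSON object per line with the keys
    "before", "after" and "prompt"; image paths are relative to the manifest.
    """
    root = Path(manifest_path).parent
    examples = []
    with open(manifest_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            row = json.loads(line)
            prompt = row["prompt"].strip()
            if not prompt:
                raise ValueError(f"line {line_no}: empty prompt")
            examples.append(
                EditExample(root / row["before"], root / row["after"], prompt)
            )
    return examples
```

The point is just that the prompt travels with each pair, so a training loop could encode `example.prompt` per sample instead of one constant string.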
At the moment I only have access to 24 GB of VRAM, which limits my options. What I'm after is training a model to make specific text-based modifications to images, and I have plenty of before/after image pairs, plus the modification prompt for each pair, to train on. Worst case, I can check whether reducing the image size during training makes it feasible on my setup.
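For the image-size fallback, a rough sketch of how the numbers work out. It assumes the backbone wants dimensions divisible by 8 (typical for Stable Diffusion-style VAEs, but worth checking for the actual model), and it uses pixel count only as a crude proxy for activation memory; attention layers can grow faster than that. The helper names are hypothetical.

```python
def scaled_size(width, height, max_side, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side.

    Keeps the aspect ratio (up to rounding) and rounds each side down to a
    multiple of `multiple`. Never upscales. The divisible-by-8 requirement
    is an assumption about the model, not a verified constraint.
    """
    scale = min(1.0, max_side / max(width, height))

    def snap(value):
        return max(multiple, int(value * scale) // multiple * multiple)

    return snap(width), snap(height)


def pixel_ratio(old_size, new_size):
    """Fraction of pixels left after resizing; a rough memory proxy only."""
    return (new_size[0] * new_size[1]) / (old_size[0] * old_size[1])


small = scaled_size(1024, 768, max_side=512)
print(small, pixel_ratio((1024, 768), small))
```

Halving each side leaves a quarter of the pixels, so dropping the resolution is one of the cheaper knobs to try before changing anything else.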