I believe lighting plays a very important part in making an artificially created scene feel realistic, like in 3D modelling. That is why I also think the lighting in these AI generated images is the prime source of what impresses people: no matter how unrealistic or distorted the subject is, the lighting makes it look like a natural part of the background. This is clearly different from poorly Photoshopped images, where the subject feels like a cutout deliberately inserted into the scene.
I am interested to understand how these models grasp the context of the lighting when creating images. Do they draw on training samples that happen to have the exact same lighting positions, or do they add the lighting as an overlay instead? Also, why does the lighting sometimes fail to look convincing, such as when multiple subjects appear together?
To get a bit more technical, they build images in passes, each more coherent than the last. This gives them an understanding of how 'things' relate to one another, knowing them by nothing but that relation. Light plus light source is one example, and the angle of lighting is a deeper layer of that, resolved on a less noisy pass.
This is how it's able to build images from its training data: it sorta understands how each part relates to the others, so it's able to sorta organize an image out of random noise with each pass, eventually creating a 'unique' image inspired by its training data.
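Just to make the "passes" idea concrete, here's a toy sketch of iterative refinement: start from pure noise and nudge it toward coherent structure over many steps. This is not how a real diffusion model works internally (a real one is a trained neural network that predicts and removes noise, conditioned on a prompt); the `denoise_step` function and the `target` array below are made-up stand-ins purely to illustrate the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(image, target, strength=0.2):
    # Crude stand-in for the learned model: nudge the noisy image
    # a fraction of the way toward a coherent structure each pass.
    return image + strength * (target - image)

target = np.linspace(0.0, 1.0, 16)   # pretend this is a coherent image
image = rng.standard_normal(16)      # start from pure random noise

for step in range(25):               # each pass is less noisy than the last
    image = denoise_step(image, target)

# After many passes, the result sits close to a coherent structure.
print(float(np.abs(image - target).max()))
```

The point is just the shape of the process: no single pass produces the image; coherence accumulates across passes, which is why coarse relations (light has a source) settle before finer ones (the exact angle of that light).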
It also gives it that perfect image you mention, because it's specifically trained to produce what looks good to us: it's essentially a function optimised for nothing but that.