Your prompt is confusing. The way it is structured, it reads like a dog looking at a statue, with a person on top of that. So you are describing three figures instead of two, and that is what is causing the issue.
I fixed your prompt:
A dog looking at a "human statue on a pedestal"
My first and second generations produced these images right off the bat.
My prompt is concise and reads evenly, with no confusion as to what is going on.
No biggie, the prompts just need to be precise, that is all. Get rid of unnecessary
...