Pull back on the Visual Intensity slider. If it is pushed too far right, it will give surreal results in most circumstances. Hyper-realism can push images over the edge, so only use it once your image is "locked in" and looks undistorted.
I would try this:
Use one of those images as a structure reference, and push that slider all the way right. Then find a clean, clear photo of realistic people, not a small or low-resolution one, and use that as your style reference. Push that slider all the way right too, and only pull back on it once you have locked in undistorted results. Pulling back on it will introduce styles outside of what you have locked in. After all of this, upvote and downvote the results to train Firefly.
One last thing: make sure your prompt is clear. Run-on sentences and ChatGPT-written prompts only confuse the program. With that many people in the scene, distortions are expected. This affects all AI Gen programs, including the popular Midjourney. You know: 7 fingers, extra arms and legs. AI Gen programs currently understand only shapes and patterns, not human anatomy. Let's first focus on getting human faces correct, then go from there.