If there is one true "silver bullet" problem that gen AI should be solving for creatives, and that Firefly specifically should strive towards, especially for those in archviz, interior design, environment concepting, etc., it is making Firefly brilliant at populating an image, render, or photo of a space with accurate crowd imagery, with controls for how densely to populate the scene.
It should accurately account for lighting and perspective, offer controls for the demographics (general public? VIP event?) and for how densely the scene is filled, and intuitively know to put a few people in the front of the scene and a few in the back. It should also understand how to generate crowds that interact with the elements of the image, like guests seated on a couch or someone ordering a drink at a bar.