Users have asked for many different options. The current behavior is what I'm calling "adaptive per caption": "adaptive" means the background resizes to fit the text, and "per caption" means it uses all lines in the caption together to set the size, so you get one box. What you want (which is how traditional captions work and what many users are asking for) is "adaptive per line," where the background resizes to each line individually.
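To make the difference concrete, here's a rough sketch of how the two modes would size the background. This is not the app's actual code: the function name, the mode names, and the fixed-width character measurement are all assumptions made up for illustration.

```python
def background_boxes(lines, mode, char_width=1.0, line_height=1.0):
    """Return a list of (width, height) background boxes for a caption.

    Hypothetical helper: text width is approximated as
    len(line) * char_width, which real text layout would not do.
    """
    if not lines:
        return []
    widths = [len(line) * char_width for line in lines]
    if mode == "per_caption":
        # One box: as wide as the widest line, as tall as all lines together.
        return [(max(widths), line_height * len(lines))]
    if mode == "per_line":
        # One box per line, each sized to its own line.
        return [(w, line_height) for w in widths]
    raise ValueError(f"unknown mode: {mode}")


caption = ["Hello there,", "hi"]
per_caption = background_boxes(caption, "per_caption")
per_line = background_boxes(caption, "per_line")
```

With a long first line and a short second line, "per_caption" gives a single box as wide as the long line, while "per_line" gives a wide box and a narrow box.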
Here's another thread, but the feature request it links to is for getting a drop shadow on the whole background box.