In principle this is simple. You need to produce a mask so that the wanted areas are white on the mask and the unwanted areas black. Then you just put a white fill layer underneath.
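If it helps to see what the mask is doing, here is a rough sketch of the arithmetic in Python with NumPy. It is only an illustration, not what Photoshop runs internally. The function name `composite_over_white` is made up for this example, and it assumes pixel values and mask values are floats scaled 0.0 to 1.0.

```python
import numpy as np

def composite_over_white(image, mask):
    """Blend an image over a white fill using a mask.

    Assumed conventions (not taken from any particular editor):
    image and mask hold floats in the range 0.0 to 1.0,
    mask 1.0 (white) keeps the image pixel, mask 0.0 (black) shows white.
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    if image.ndim == 3:
        # Colour image: give the mask a channel axis so it broadcasts.
        mask = mask[..., np.newaxis]
    white = np.ones_like(image)
    # Masked-in areas keep the image, masked-out areas show the white layer.
    return mask * image + (1.0 - mask) * white
```

Grey values in the mask give a partial blend, which is why soft mask edges look smoother than hard ones.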
In practice, with that photograph, there are many areas where the wanted content is the same colour and brightness as the unwanted content. Look at the loops in the chain. You will need to go over the mask by hand so that every part of the chain is selected and every gap is deselected, and that will take a long time.
If it were me, I would photograph it again with much better lighting, or get the photographer to do so. Then some simple tone adjustments should do it.
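By "simple tone adjustments" I mean something like a levels move that pushes an already bright background to pure white. As a rough sketch in Python with NumPy, assuming pixel values are floats from 0.0 to 1.0 (the function name `levels` and the sample numbers are only for illustration):

```python
import numpy as np

def levels(image, black_point, white_point):
    """Linear levels adjustment (illustrative sketch).

    Values at or below black_point become 0.0, values at or above
    white_point become 1.0, and everything between is stretched linearly.
    Assumes black_point < white_point and pixel values in 0.0 to 1.0.
    """
    image = np.asarray(image, dtype=np.float64)
    stretched = (image - black_point) / (white_point - black_point)
    return np.clip(stretched, 0.0, 1.0)
```

With a well-lit background sitting at around 0.9, something like `levels(image, 0.0, 0.9)` would send it to pure white while leaving the darker subject mostly alone. That only works when the subject is clearly darker than the background, which is the point of reshooting.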
Dave