Hello Ayush,
Your approach is going in the right direction but, as you mentioned, it is not very optimized. With the Value Processor alone you don't have another option, but if you use a Pixel Processor instead it becomes quite easy and efficient: the Pixel Processor calculates every pixel in parallel, and you can exploit this to get the total bounding box of an input very fast.
Here is an overview of the complete graph. Ignore the first section; it's just an input. You would probably use an Input node for it and build this in a subgraph so that you can reuse it whenever you need it.

The Algorithm:
- Prepare your data: for every pixel greater than 0 (a valid pixel), output its current position in the grid. For every invalid pixel, set the output to 2 in the red and green channels and to -1 in the blue and alpha channels. This generates a mask of 2 and -1 for the min and max values of x and y (with positions in the 0 to 1 range, an invalid pixel can then never win a min or max comparison).

- The next step is the core idea behind the whole thing: you downsample the image step by step until it is a 1x1 texture. In every downsample step you check, for every pixel, whether its 4 neighbours are valid and, if so, output the minimum and maximum values of x and y across all 4 neighbours. You are basically calculating the bounding box of a 2x2 pixel area; in the next step you repeat this and merge the 4 neighbouring bounding boxes into a new bounding box.


- After the last downsample step you have only a 1x1 texture, with all the "mini" bounding boxes merged into one bounding box. You can extract its min and max values from the channels in the following order:
- red: bbox min x
- green: bbox min y
- blue: bbox max x
- alpha: bbox max y
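
If it helps to see the idea outside of the graph, here is a rough CPU sketch of the steps above in plain Python. It is only an illustration, not the actual node setup: the function names are made up, the input is assumed to be a square power-of-two grid, and positions are assumed to be normalized pixel centres in the 0 to 1 range. In the graph each downsample pass would run for all pixels in parallel, while here it is a simple loop.

```python
# Hypothetical CPU stand-in for the Pixel Processor passes described above.
# Every pixel is a tuple (min_x, min_y, max_x, max_y), i.e. (R, G, B, A).

# Sentinels for invalid pixels: 2 never wins a min, -1 never wins a max
# (assumes positions are normalized to the 0..1 range).
INVALID = (2.0, 2.0, -1.0, -1.0)


def prepare(mask):
    """Step 1: valid pixels (> 0) store their own position, others the sentinels."""
    size = len(mask)
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            if mask[y][x] > 0:
                u = (x + 0.5) / size  # assumed: position of the pixel centre
                v = (y + 0.5) / size
                row.append((u, v, u, v))
            else:
                row.append(INVALID)
        grid.append(row)
    return grid


def merge(boxes):
    """Merge several bounding boxes into one by taking per-channel min and max."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def downsample(grid):
    """Step 2: halve the resolution, merging each 2x2 block into one pixel."""
    half = len(grid) // 2
    return [
        [
            merge([
                grid[2 * y][2 * x], grid[2 * y][2 * x + 1],
                grid[2 * y + 1][2 * x], grid[2 * y + 1][2 * x + 1],
            ])
            for x in range(half)
        ]
        for y in range(half)
    ]


def bounding_box(mask):
    """Step 3: repeat the downsample until a single 1x1 pixel is left."""
    grid = prepare(mask)
    while len(grid) > 1:
        grid = downsample(grid)
    return grid[0][0]
```

For example, a 4x4 mask with valid pixels at (x=1, y=1) and (x=2, y=3) gives `bounding_box(mask) == (0.375, 0.375, 0.625, 0.875)`. A completely empty mask returns the untouched sentinels `(2.0, 2.0, -1.0, -1.0)`, which is an easy way to detect "no valid pixel at all".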

Stay healthy and creative,
Marco