Participating Frequently
March 30, 2020
Question

Select based on object size?


Hi. I want to select the writing on very old manuscripts. There are many pages, and I would like to automate as much as I can. I can use the Magic Wand or Color Range to select the writing, but a lot of graininess, tiny ink blots, and other age-related defects get selected as well.

 

Is there some way to include object size as a parameter for selection? For example, the letters, like what I'm typing right now, all fall within a very limited size range. I was wondering if there might be a way to select objects only on that basis. (I especially have a script in mind, fwiw.)

 

Or is there perhaps a better way to accomplish what I'm seeking to do altogether?


4 replies

Participating Frequently
June 5, 2020

Hi. I'm coming back around to my initial question, because my efforts to implement a solution have failed thus far.

Restated: In Photoshop, can I make a selection more precise by taking size into consideration?

 

I know this may not be clear, so I hope this example will help: On an image like this...

...if I hit the magic wand (threshold 60) on one of the lightest spots (and delete, to help you see the results), I get ...

...tons of unwanted "islands" selected in addition to the letters. (The letters are what I want.) Some islands are bigger than the biggest letter; most are smaller. I would like a way to eliminate these islands based on a range of width and/or height in pixels (or at the very least, a range of total pixel area).

 

Some qualifiers and thoughts:

- I'm a novice.

- I'm seeking to repeat this on numerous similar images, so I am looking for a solution that can be automated (in a script or recorded action).

- I have explored at least a few noise-cancelling, blurring-type operations and the like; I haven't found them sufficient. (And I'm very open to hearing more suggestions along these lines that would serve my purposes. Even then, I would still love to hear how to use size as a parameter in selecting objects in Photoshop.)

- "Connected components" with adjustable size parameters (maximums and minimums) seems to be a "thing" in image processing. How do we do that in Photoshop? 

- I know I'm very new and green to all this, so please bear with me... There appear to be many ways to improve the precision of selections based on color (brightness, contrast, RGB channels, etc). Is adding a threshold for pixel width as a parameter complicated? I am really asking, because I really don't know.
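
For anyone wondering what "connected components with size limits" looks like in practice, here is a minimal sketch in plain Python. It is not a Photoshop script and uses no Photoshop API. It assumes the page has already been reduced to a grid of 0/1 values (1 = ink), and the function names (`connected_components`, `bounding_box`, `filter_by_size`) and the toy `page` are made up for illustration.

```python
from collections import deque

def connected_components(mask):
    """Return a list of components; each is a list of (row, col) pixels.

    `mask` is a list of rows of 0/1 values (1 = ink). Uses 8-connectivity,
    so diagonal neighbours count as touching.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill outward from this unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def bounding_box(pixels):
    """Return (top, left, bottom, right), all inclusive."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)

def filter_by_size(mask, min_w, max_w, min_h, max_h):
    """Keep the bounding boxes whose width and height fall within range."""
    kept = []
    for pixels in connected_components(mask):
        top, left, bottom, right = bounding_box(pixels)
        w = right - left + 1
        h = bottom - top + 1
        if min_w <= w <= max_w and min_h <= h <= max_h:
            kept.append((top, left, bottom, right))
    return kept

# Toy page: a 1-pixel speck, a 3x3 "letter", and a wide 1x7 streak.
page = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
]
letters = filter_by_size(page, min_w=2, max_w=5, min_h=2, max_h=5)
print(letters)
```

On the toy page only the 3x3 shape survives: the speck is too small and the streak is too wide. Filtering by total pixel area instead would mean testing `len(pixels)` against a range rather than the bounding-box width and height.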

 

Thanks!

Kukurykus
Legend
April 10, 2020

I did something like it, so anyone willing can play with it and adjust it to their needs:

How to remove small black dots from text page - selecting pixel radius via script

Participating Frequently
April 19, 2020

The script at that link is so helpful! Thank you very much.

 

I indeed aspire to play with it, as you put it. 🙂

 

I tried the script and was impressed, and I am very hopeful it will aid me greatly. I altered the "pat" function value to test the difference that made. I also tried your original code as well as the alteration you made later in the thread.

 

Since I'm a beginner, and since my goal is a little different than the person that needed that script, I really could use some assistance in understanding some things from the code. (Here's another example image, fwiw.)

 

So, the goal is to separate each letter (and ultimately to save each one to a file).

 

1. For one thing, I'm trying to figure out how to apply your code without any alterations at all to the letters themselves - no loss of color or resolution, no filling gaps, etc. (You mentioned some fill-in and some minor thinning and increase in contrast going on in the existing script.)

 

2. Would you mind explaining, if possible, exactly which part of the code uses size (pixel radius) to differentiate between small dots and letters?

 

3. When I run the script and then hit Ctrl+Z backwards through the steps, at one step I see small blue squares. Would you mind just explaining what those are? (My guess is they have to do with the measurements. But I searched, couldn't find anything online, and don't want to do too much guessing.)

Kukurykus
Legend
April 19, 2020

1. Increase / decrease some values in the script to see the effect. Some little changes may help.

2. Try experimenting with the script.

3. I'm not sure, but the blue squares may be path(Items).

 

Maybe I have something wrong set in CC 2020, but it didn't work there (white layer as result).

Then I tried in CS6 and all was OK. You can try with the other layer I'm providing.

It's not the original that was put on Dropbox in that thread, but the result will be the same:

(Just in case you can't get it, I'm also posting how the result looks on my side.)


Tom Winkelmann
Inspiring
March 31, 2020

Please post an example, to see if we can find a workaround for your problem... 😉

Participating Frequently
April 10, 2020

Thanks! Here's a low-resolution example of a manuscript page:

I would like all instances of:

etc. These are rush jobs. But I'd like them a pixel or two outside the perimeters. And each saved as an image file. (And fwiw, there are many pages to this manuscript.)

 

I would really appreciate anyone's suggestions toward a solution. So far, only a couple of angles have come to me, but they are incomplete.

  • One would be to slice the source page into small slices saved into individual files - small enough so that Object Select (or Subject Select?) works (as it seemed to when I tested a small cropped area of a couple of words a while back). Then invert selection and remove background. Then use a plugin that splits "islands" of pixels to layers (such as Troubleshoot fonts, or Split Layer to Islands). Then save layers to files.
  • The other angle brings me back to my original notion of some script that utilizes selection size. I read that selection sizes (height and width) can indeed be accessed (with the Info panel open via Window>Info, W/H are shown near the bottom), and that there are scripts that can access these measurements. So I have still wondered why it wouldn't be possible to have a script that checked through all selections' sizes and only acted on the ones within a certain threshold (e.g., width of 0.5 cm to 1.5 cm) and ignored the rest. (In this case, I think I could just use Color Range to make selections, then rely on the script to winnow things down to what I need.)
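
To make the second angle concrete, here is a rough Python sketch of the "only act on selections within a width range, then pad them by a pixel or two" logic. It is not Photoshop code. It assumes the candidate regions are already available as `(top, left, bottom, right)` pixel boxes from some earlier step, and that the scan resolution in DPI is known. The names `cm_to_px`, `pad_box`, and `select_letter_boxes` and all the numbers are hypothetical.

```python
def cm_to_px(cm, dpi):
    """Convert centimetres to whole pixels at the given resolution."""
    return round(cm / 2.54 * dpi)

def pad_box(box, margin, img_w, img_h):
    """Grow (top, left, bottom, right) by `margin` px, clamped to the image."""
    top, left, bottom, right = box
    return (max(0, top - margin), max(0, left - margin),
            min(img_h - 1, bottom + margin), min(img_w - 1, right + margin))

def select_letter_boxes(boxes, min_cm, max_cm, dpi, margin, img_w, img_h):
    """Keep boxes whose width lies within [min_cm, max_cm], then pad them."""
    min_px, max_px = cm_to_px(min_cm, dpi), cm_to_px(max_cm, dpi)
    kept = []
    for box in boxes:
        top, left, bottom, right = box
        width = right - left + 1
        if min_px <= width <= max_px:
            kept.append(pad_box(box, margin, img_w, img_h))
    return kept

# Assumed example: a 1000x1000 px scan at 254 DPI (so 1 cm = 100 px).
candidates = [
    (10, 10, 12, 12),     # 3 px wide: a speck, too small
    (0, 20, 99, 119),     # 100 px wide: letter-sized
    (200, 0, 260, 399),   # 400 px wide: a stain, too large
]
crops = select_letter_boxes(candidates, min_cm=0.5, max_cm=1.5, dpi=254,
                            margin=2, img_w=1000, img_h=1000)
print(crops)
```

Each box that comes back could then be cropped and saved as its own file with whatever imaging tool is at hand; that step is left out here because it depends on the tool.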

 

All this said, I'm not particular to any method! Any help towards a quality, reusable solution would be wonderful.

 

All the best.

Mylenium
Legend
March 30, 2020

They aren't objects, just pixels, and there is no built-in logic in PS that differentiates this beyond the parameters provided. Adding a specific surface area constraint would be quite convoluted anyway, and taking care of special cases like commas, dots and similar is another exercise. You're also wrong about scripts: they have no magic beyond parametrically controlling existing functions. What you want simply doesn't exist in PS and thus couldn't be manipulated using scripts or actions. One would have to write a whole custom plug-in. That being the case, it's back to good old duplicating layers, processing them with adjustments to increase contrast and such, and basing the selections on that rather than on the original sources.

 

Mylenium

Participating Frequently
March 31, 2020

Thank you for taking the time to respond, and for your explanation.

 

By "object," I just had in mind the lingo of Photoshop's Object Selection tool from the Magic Wand group. Or even "subject" in the sense of the Select Subject tool. But yes, it's all pixels.

 

One reason I had this in mind is because I tried to use these tools. The results are not good. But if I focus in and select a rectangle with just a word or two enclosed, then use the Object Selection tool, it grabs the letters as desired. What I'd like is for this to happen for whole pages, of course.

 

(Another clarification, just in case: I'm trying to get at the letters themselves; the text itself, and even the individual words, do not need to stay intact. I'm not sure if that makes a difference or not.)

 

There is no punctuation in the document I have in mind (a medieval Latin manuscript), so that wouldn't be an issue. And since I'm aiming for the letters, smaller things getting left out would be okay anyways. Again, my hope is for a way to leave out things smaller than letters in general.

 

What I hear you saying is that there is no way to use size per se to dictate what does or what doesn't get selected. Thank you for clarifying that. If you think of any other angle, or any exception to this, I'd greatly appreciate knowing.

 

(For what it's worth, my relatively uneducated guess was that somewhere in the Select Subject tool's Sensei algorithm, size has to matter. Something like: "Hey, the full image is not a subject. It has to be smaller than that." And: "Hey, those little grains, regardless of color and contour, can't be subjects. You have to find something bigger." This is close to my simplistic way of thinking, and maybe the same applies to the Object Selection tool's logic. I don't know. Anyway, if this very loose concept had any semblance of truth, then my hope was to likewise be able to instruct Photoshop to make size matter when making or working with (e.g., subtracting from or adding to) a given selection.)

 

Thanks.