How to remove everything except numbers from chart images?

Explorer, Feb 27, 2023

I receive about 500 to 1,000 images of charts or graphs every day, and I have to remove everything from each image except the numerical values.

 

This is an example of my photos: 

001.jpg 

I select the graph bars with the Magic Wand tool in Photoshop:

002.jpg 

Then I move the selection so that it covers the area up to the gap between the logos and the numerical values:

003.jpg 

Finally, I delete the selected components so that only the numerical values remain:

004.jpg 

I have to run these steps as a Photoshop action because I receive so many images every day.
An action can do this automatically only if the distance between the edge of the graph bars and the first digit of the numerical values is the same in all images. The main problem is that this distance is not the same across the images I receive each day.

 
For example, this is another image in which the numerical values are very close to the graph bars:

005.jpg 

If I run the action recorded for the previous image on this one, some of the numerical values will certainly be removed as well.

 

How can I delete everything in my daily images except the numerical values using a Photoshop action?

Can a Photoshop script move the selection so that it ends exactly before the numerical values?

 

Note: 

  1. I can't use the Color Range tool to select the numerical values, because they have the same color as the rest of the image content.
  2. My daily images have a single layer and are not smart objects.
  3. The number of images I receive daily is very high, so I cannot do this manually. I must use a Photoshop action or another automatic solution.
  4. I can't use an OCR tool, because some of the images contain logos with letters or numbers in them. If the logo numbers are extracted together with the numerical values, there is no way to tell afterwards which numbers came from the logos.
  5. The space between the graph bars and the numerical values is specific to each image and differs from one image to the next.
  6. Within a single image, the space between the graph bars and the numerical values is the same for every row.
  7. All my images have the same resolution.
  8. It is not a problem if the resolution of the images changes.
  9. I know this explanation is complicated, but I could not express my problem more simply.
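
Notes 5 and 6 suggest measuring the gap per image instead of recording a fixed offset. The following is only a minimal sketch of that idea in plain Python, not a Photoshop script. It assumes the image has already been thresholded into a grid where 1 means ink and 0 means white background, and that the numerical values form the rightmost block, separated from everything else by a fully white vertical band. The function names and the `min_gap` parameter are hypothetical.

```python
def blank_column_runs(grid):
    """Return (start, end) pairs, end exclusive, of consecutive all-white columns."""
    width = len(grid[0])
    has_ink = [any(row[x] for row in grid) for x in range(width)]
    runs, start = [], None
    for x, ink in enumerate(has_ink):
        if not ink and start is None:
            start = x                      # a white band begins here
        elif ink and start is not None:
            runs.append((start, x))        # the white band ended at x
            start = None
    if start is not None:
        runs.append((start, width))        # band that reaches the right edge
    return runs


def numbers_start_column(grid, min_gap):
    """First column of the rightmost block that follows a white band of at
    least min_gap columns. Returns None if no such band exists.
    Assumption: the numbers are the rightmost block in the image."""
    width = len(grid[0])
    bands = [(s, e) for s, e in blank_column_runs(grid)
             if e - s >= min_gap and e < width]   # ignore the right margin
    return bands[-1][1] if bands else None


def clear_left_of(grid, x):
    """Return a copy of grid with every column left of x set to white."""
    return [[0] * x + row[x:] for row in grid]


# Toy image: bars (cols 0-2), a logo (col 5), numbers (cols 9-12).
rows = ["1110010001011",
        "1100010001011"]
grid = [[int(c) for c in row] for row in rows]

start = numbers_start_column(grid, min_gap=3)
cleaned = clear_left_of(grid, start)
print(start, ["".join(map(str, r)) for r in cleaned])
```

Because the band is located separately in every image, the varying distance between bars and numbers does not need to be known in advance. Only `min_gap` has to be chosen, and it must be wider than the spacing between digits.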

 

I have been struggling with this problem for about three months, and I hope I can find a solution here.

Sorry for any spelling errors; my English is very poor, and I wrote this text with the help of Google Translate.

TOPICS
Actions and scripting , Windows

Views

15.9K

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, Mar 05, 2023
Use OCR tool for this sample image to understand what I mean: 

The example you posted in your last post lacks the commas in the numbers – why? 

You seem to constantly add new variabilities to the task. 

 

As for OCR: 

Acrobat’s OCR gets me this rtf: 

Screenshot 2023-03-05 at 14.07.44.png

That the negative numbers are ignored altogether seems a bit unexpected to me, but the »2« in line 2 appears distinguishable by size. 

Of course I have no idea how to incorporate Acrobat etc. in a Scripted process.

I work in image editing after all. 

 

I can use OCR tool only in one case. There should be scannable icons among the numerical values and logos: 

Untitled22.jpg 

Or any better way to prevent merging of numeric values and numbers in logos during OCR 

I hope you can fully understand my limitations in using OCR tool with this example

Do you have any influence on how the files are created?

Explorer, Mar 05, 2023

quote

Use OCR tool for this sample image to understand what I mean: 

The example you posted in your last post lacks the commas in the numbers – why? 

You seem to constantly add new variabilities to the task. 

 

As for OCR: 

Acrobat’s OCR gets me this rtf: 

Screenshot 2023-03-05 at 14.07.44.png

That the negative numbers are ignored altogether seems a bit unexpected to me, but the »2« in line 2 appears distinguishable by size. 

Of course I have no idea how to incorporate Acrobat etc. in a Scripted process.

I work in image editing after all. 

 

I can use OCR tool only in one case. There should be scannable icons among the numerical values and logos: 

Untitled22.jpg 

Or any better way to prevent merging of numeric values and numbers in logos during OCR 

I hope you can fully understand my limitations in using OCR tool with this example

Do you have any influence on how the files are created?


By @c.pfaffenbichler

The values in the daily images vary. Today's images might contain eight-digit values with commas, while tomorrow's might contain single-digit or decimal values without commas.

I am not involved in creating the images I receive. The images I post here are edited to illustrate my descriptions.

In another forum, someone was able to remove even the »g« logo in the sample image below with an action that consists of many alternating steps of Surface Blur, Gaussian Blur, Unsharp Mask, and Levels adjustments.

001.jpg 

I asked him to send me the action file he used. I will post it here if I get it.

The values in the photos received daily may be as follows: 

1/23

1.23

123,123,123

123$

123123123

There may be other types of values that I don't remember.

In my opinion, the best approach is to select the graph bars, move the selection so that it ends just before the numerical values, and delete the selected content. The only problem with this idea is that the space between the graph bars and the numerical values differs from image to image. If a script can move the selection accurately to just before the numerical values in each image, removing the extra content is easy.

Community Expert, Mar 05, 2023

Why do you call these images »photos« by the way? 

Are they actually taken with a camera or created all-digitally (somehow)? 

 

Explorer, Mar 05, 2023

quote

Why do you call these images »photos« by the way? 

Are they actually taken with a camera or created all-digitally (somehow)? 

 


By @c.pfaffenbichler



My English is very poor. 

I speak in this forum using Translator.

I use both »photos« and »images« for the files I edit in Photoshop😁

Explorer, Mar 06, 2023

quote

Why do you call these images »photos« by the way? 

Are they actually taken with a camera or created all-digitally (somehow)? 

 


By @c.pfaffenbichler

I ended up removing 90% of the extra content from the images. How can I select the main content by its height and remove the remaining 10% of extra content? Can Photoshop deselect selected content that does not have the desired height?

Community Expert, Mar 06, 2023

I for one do not understand what you mean. 

So please post meaningful screenshots or sketches to clarify what you are talking about. 

People's Champ, Mar 06, 2023

quote

In my opinion, the best way for this project is to select the graph bars and move the selection to before the numerical values and delete the selected content.

By @abolfazl28627254vbil

 

Isn't it faster to select the entire zone with numbers using a polygon lasso or a path, and then apply a mask?

Community Expert, Mar 06, 2023

quote

Isn't it faster to select the entire zone with numbers using a polygon lasso or a path, and then apply a mask?


By @r-bin

 

Ah, but this is in the context of scripting many thousands of variable source images that vary greatly from one to the next...

Explorer, Mar 06, 2023

If Photoshop had feature to set the minimum and maximum diameter of solor range selection, maybe the problem would be solved. 

Untitled54355345435.jpg

Explorer, Mar 06, 2023

quote

 solor


By @abolfazl28627254vbil

solor = color

Community Expert, Mar 06, 2023

quote

If Photoshop had feature to set the minimum and maximum diameter of solor range selection, maybe the problem would be solved. 

Untitled54355345435.jpg


By @abolfazl28627254vbil

You seem to expect vector/text data capabilities from pixel data. 

I think you should accept what Photoshop is and does. (At least at current, what the future may bring – who knows?) 

 

Also: In the sample images you provided in various posts the numbers were quite different in height, so it may be hard to avoid a Scripting approach. 

Explorer, Mar 07, 2023

quote

Also: In the sample images you provided in various posts the numbers were quite different in height, so it may be hard to avoid a Scripting approach. 

By @c.pfaffenbichler

So the only help Photoshop can give me right now is to increase the transparent space between the objects, pushing them to the right, so that I can use an OCR tool afterwards.

Untitled.jpg 

Untitled12.jpg 

If this can be done without converting the selected objects in the image to dedicated layers, editing will be faster. For example, select the transparent space with the Magic Wand tool and expand it in some way. That way, numbers in the logos will not be merged with the numerical values during OCR.

Explorer, Mar 07, 2023

quote

If can do this without converting selected objects in image to dedicated layers, image editing speed will increase. For example, select the transparent space with magic want tool and expand it with any way. By doing this, numbers in the logos will not be merged with the numerical values during OCR.


By @abolfazl28627254vbil

expand = distance

Explorer, Mar 08, 2023

quote

Also: In the sample images you provided in various posts the numbers were quite different in height, so it may be hard to avoid a Scripting approach. 

By @c.pfaffenbichler

So the only help that Photoshop can give me right now is to increase transparent space between objects to right so that I can use OCR tool afterwards. 

Untitled.jpg 

Untitled12.jpg 

If can do this without converting selected objects in image to dedicated layers, image editing speed will increase. For example, select the transparent space with magic want tool and expand it with any way. By doing this, numbers in the logos will not be merged with the numerical values during OCR.


By @abolfazl28627254vbil
So the only thing that can help me now is this idea. If you know a script that can do this, please provide it.

 

 

Community Expert, Mar 08, 2023

I see no point in pursuing this as »the transparent space« could also include the space between numbers, between numbers and units of measurement, between the letters in or elements of the logos, …

How could this be meaningfully automated when it appears to come down to the same problem as before? 

Explorer, Mar 08, 2023

quote

I see no point in pursuing this as »the transparent space« could also include the space between numbers, between numbers and units of measurement, between the letters in or elements of the logos, …

How could this be meaningfully automated when it appears to come down to the same problem as before? 


By @c.pfaffenbichler

By selecting the white background and inverting the selection, we can select the different components of an image. Then, by expanding the selection by the right amount, the bars, logos, and numerical values can be separated accurately.
There is one point I have not mentioned before: within a given day, the font of the numerical values and their unit are the same in all images. The distance between the numerical values and the graph bars is also specific to that day's images, but it decreases by a few millimeters as the number of bars increases.

For example, this is one of the thousands of images that I received today: 

exam1.jpg 

In all the images I received today, the font and the unit of the numerical values are the same. The only thing that differs is the distance between the numerical values and the graph bars. For example, if the number of bars increases to 6, that distance decreases by a few millimeters.

exam2.jpg 

On any given day, the unit and the font of the numerical values are the same in all received images and are specific to that day. The only difference between images is the distance between the numerical values and the graph bars, because the image components have to be scaled down to fit more bars.

The main problem is that this reduction in distance cannot be defined for the Photoshop action feature.

You might think I would reuse today's selection and expansion settings for tomorrow's images. I do not: I choose them each day based on that day's images and the fonts in them.

Maybe I should calculate the minimum distance between the graph bars and the numerical values across the day's images, and then drag the graph bar selection to the right by that minimum distance to cover the logos. For this I need to find the chart image with the highest number of bars among thousands of images. This is not possible manually; I cannot count the bars one by one in the thousands of images I receive daily. Maybe a script can quickly find the image with the highest number of bars.

Explorer, Mar 05, 2023

Another solution that could help me is to group the images by the number of bars in their graphs. For example, all images with 5 bars would go into one folder, all images with 6 bars into another, and so on for every other bar count. I could then run a different action on each group.
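
The grouping described here can be sketched in plain Python. This is an illustration under stated assumptions, not a tested Photoshop workflow: each image is assumed to be available as a thresholded grid (1 = ink, 0 = white), the bars are assumed to be horizontal, and `probe_x` is a hypothetical column index that passes through every bar, for example just right of the axis. Charts in which a zero value produces no bar would be undercounted.

```python
from collections import defaultdict


def count_bars(grid, probe_x):
    """Count separate vertical runs of ink in column probe_x.
    Each run is taken to be one horizontal bar."""
    count, inside = 0, False
    for row in grid:
        ink = bool(row[probe_x])
        if ink and not inside:
            count += 1          # entered a new bar
        inside = ink
    return count


def group_by_bar_count(images, probe_x):
    """Map bar count -> list of image names, keeping input order."""
    groups = defaultdict(list)
    for name, grid in images.items():
        groups[count_bars(grid, probe_x)].append(name)
    return dict(groups)


# Toy data standing in for thresholded chart images.
images = {
    "a.jpg": [[1, 1], [0, 0], [1, 0]],
    "b.jpg": [[1], [0], [1], [0], [1]],
    "c.jpg": [[1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]],
}
groups = group_by_bar_count(images, probe_x=0)
print(groups)
print("most bars:", max(groups), groups[max(groups)])
```

The largest key in `groups` also identifies the images with the most bars, which is the image the minimum-distance idea needs.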

Explorer, Mar 02, 2023

quote

I provided a Script that worked for the sample images; maybe you need to start doing your own trouble-shooting? 

 

I guess the easiest solution would be wrapping the operation in a try-clause – that way the files that don’t yield the intended results would not be processed but at least they should not stop the process for the other files. 


By @c.pfaffenbichler

A script can do this by going through the following steps: 

1) Select the white background using the Magic Wand tool. 

2) Invert the selection. 

3) Expand the selection enough that every logo is fully enclosed, but not so much that the selections of the numerical values, the bars, and the logos merge. (The expand value can be set in the script each time.) 

4) If the script can recognize the order of the selected items, it should first clear everything from the first bar to the last bar, then clear everything from the first logo to the last logo, and leave the rest of the selections unchanged. 
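
The four steps above can be sketched outside Photoshop as follows. This is a hedged illustration in plain Python on a thresholded grid (1 = ink, 0 = white), not the script discussed in this thread. Step 3 is modelled as a square dilation, the selected items are found with a flood fill, and step 4 is approximated by grouping items whose horizontal extents overlap into columns and keeping only the rightmost column. All function names and the `radius` parameter are hypothetical, and the approach assumes the numbers are the rightmost column.

```python
from collections import deque


def dilate(grid, radius):
    """Expand every ink pixel by `radius` pixels in all directions."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def component_boxes(mask):
    """Sorted bounding boxes (left, top, right, bottom), inclusive,
    of the 4-connected ink regions in mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


def group_into_columns(boxes):
    """Cluster boxes whose horizontal extents overlap, left to right."""
    columns, rights = [], []
    for box in sorted(boxes):
        if columns and box[0] <= rights[-1]:
            columns[-1].append(box)
            rights[-1] = max(rights[-1], box[2])
        else:
            columns.append([box])
            rights.append(box[2])
    return columns


def clear_boxes(grid, boxes):
    """Return a copy of grid with every pixel inside the boxes set to white."""
    out = [row[:] for row in grid]
    for left, top, right, bottom in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                out[y][x] = 0
    return out


# Toy image: bars on the left, one-pixel logos, two-part numbers on the right.
rows = ["11100010001010", "0" * 14, "0" * 14, "0" * 14, "11000010001010"]
grid = [[int(c) for c in row] for row in rows]

columns = group_into_columns(component_boxes(dilate(grid, radius=1)))
to_clear = [box for column in columns[:-1] for box in column]
cleaned = clear_boxes(grid, to_clear)
print(["".join(map(str, r)) for r in cleaned])
```

The dilation radius plays the role of the expand value: large enough to join the parts of one number or logo, small enough that neighbouring columns stay apart.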

 

 

Explorer, Mar 02, 2023

I noticed that your script "keep only right row". we must remove two left selected row Because it is impossible to select a number automatically and the selection of a number may be divided into two categories. But the selection of logos and blocks is not two-devide

Community Expert, Mar 02, 2023


@abolfazl28627254vbil wrote:

I noticed that your script "keep only right row". we must remove two left selected row Because it is impossible to select a number automatically and the selection of a number may be divided into two categories. But the selection of logos and blocks is not two-devide


What is that supposed to mean? 

Explorer, Mar 01, 2023

Did you test your method with these actual sample images? Did it also work for these three images? I don't understand how you use a Work Path to remove the blocks and logos. If possible, please record a video of your steps and post it here.

Explorer, Mar 01, 2023

You can't choose based on the height because the height of some logos may be exactly the same as the height of the numbers!

Community Expert, Mar 01, 2023

We can check on the height and width, or area etc.

 

Even if 5% are missed, this has to be better than nothing. The time saved on the bulk can be used on the outliers.
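
As a rough illustration of checking height together with width or area, here is a minimal Python sketch. It assumes each selected item has already been reduced to an inclusive bounding box `(left, top, right, bottom)`. The function names and all thresholds are hypothetical and would have to be tuned for each day's font.

```python
def box_size(box):
    """Width and height of an inclusive (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return right - left + 1, bottom - top + 1


def split_by_size(boxes, min_h, max_h, max_w=None, max_area=None):
    """Split boxes into (keep, reject) using a height window and
    optional width and area limits."""
    keep, reject = [], []
    for box in boxes:
        w, h = box_size(box)
        ok = min_h <= h <= max_h
        if max_w is not None:
            ok = ok and w <= max_w
        if max_area is not None:
            ok = ok and w * h <= max_area
        (keep if ok else reject).append(box)
    return keep, reject


boxes = [
    (0, 0, 39, 9),    # bar: too tall for the height window
    (50, 1, 59, 8),   # logo: same height as a digit, but wider
    (70, 1, 75, 8),   # digit
    (78, 1, 83, 8),   # digit
]
keep, reject = split_by_size(boxes, min_h=7, max_h=9, max_w=8)
print("keep:", keep)
print("reject:", reject)
```

A logo that matches the digit height is still rejected here because of its width. Items that pass every test by coincidence are the outliers that would need manual attention.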

Community Expert, Mar 07, 2023

It seems a Script could handle the sample images (at least those that I downloaded, I might have missed some), but I guess it would  just be a question of time until the next image with different parameters is presented … 

removeEverythingExceptNumbers_mov01.gif

Screenshot 2023-03-07 at 16.11.11.png

Explorer, Mar 07, 2023

quote

It seems a Script could handle the sample images (at least those that I downloaded, I might have missed some), but I guess it would  just be a question of time until the next image with different parameters is presented … 

removeEverythingExceptNumbers_mov01.gifScreenshot 2023-03-07 at 16.11.11.png


By @c.pfaffenbichler

wow it looks great. 

Please check the following with your own script: 

 

1) Chart images whose numerical values have different units, such as $ or KM: 

mesal1.jpg 

mesal2.jpg 

2) A graph in which one of the values is zero, so no bar is drawn for it: 

mesal3.jpg 

3) Does the font of the numerical values affect your script? If some charts have no logos, only graph bars and numerical values, will that affect your script? 

mesal4.jpg 

In all of these images, the only thing the script needs to do is keep the numerical values together with their units.
