
Help cutting and pasting text into Word

New Here, Oct 09, 2020

I have a book in PDF form and I need to cut and paste excerpts into Word. But when I try, the text comes out with many errors. A small amount is recognizable, but most of it is wrong: odd formatting, stray characters, wrong letters, etc. Is there a way I can clean up the PDF to improve the accuracy of the cut and paste so I don't have to fix it all manually? (I have to work with the copy I have; I do not have access to the original book to get a better scan.)

TOPICS
Edit and convert PDFs , Scan documents and OCR

Views

353

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Oct 09, 2020

Hi Emily,

Where did the book PDF come from? Do you know whether it was generated on a computer (e.g., content from MS Word or InDesign that was then converted into a PDF), or could it have been scanned and OCRed?

If it was the latter, then there's not much hope, because whoever did the scanning may have introduced a bunch of mistakes in the process. One thing to try: pick a word that came out dreadful when pasted, then search the PDF for what the word is supposed to be. For example, say the word is "about," and when you copy and paste it, it comes out as "ebut," which is clearly wrong. Go to the PDF and search for "about." Is it found? If so, there's hope. If not, search for "ebut." If that is found instead, the errors are in the PDF's own text and there is no hope.
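The search test above can also be written down as a small decision rule. The following is only a sketch, not an Acrobat feature: it assumes you have already dumped the PDF's text into a plain string by some means (for instance with a PDF library or by saving as text), and the function name and result labels are made up for illustration.

```python
import re


def diagnose_text_layer(text_layer: str, expected: str, garbled: str) -> str:
    """Apply the search test to text already extracted from a PDF.

    text_layer: the PDF's text, obtained however you like (assumption).
    expected:   the word as it should read, e.g. "about".
    garbled:    the word as it came out when pasted, e.g. "ebut".
    """

    def found(word: str) -> bool:
        # Whole-word, case-insensitive match, so "ebut" does not hit "rebut".
        pattern = r"\b" + re.escape(word) + r"\b"
        return re.search(pattern, text_layer, re.IGNORECASE) is not None

    if found(expected):
        # The PDF holds the right word; the damage happens during copy/paste.
        return "text-layer-ok"
    if found(garbled):
        # The PDF itself stores the wrong word, most likely from poor OCR.
        return "text-layer-bad"
    # Neither spelling was found; try another sample word.
    return "inconclusive"


if __name__ == "__main__":
    print(diagnose_text_layer("A chapter ebut migration.", "about", "ebut"))
```

Trying a handful of different sample words gives a more reliable picture than a single one, since one word may simply not appear on the pages you searched.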

 

Meanwhile, what is your OS (and what release) and what version of Acrobat are you using (and what release)?

Thanks and good luck!

New Here, Oct 09, 2020

Hi, Gary,

Thanks for the reply! I'm using macOS Catalina 10.15.4 and Acrobat Pro DC version 2020.012.20043 (I hope that gives the info you were seeking; I'm not much of a tech expert).

I don't know how the PDF was created, unfortunately. I did the test you suggested; I'm afraid it gave the second result, so perhaps I'm out of luck with this particular file.

If this version is a bust, I might be able to get hold of a hard copy of the book in a week or two. If I am able to do that and scan it myself, what is the best way to get excerpts into Word? I'm embarking on a large project that will involve getting sections from many different PDFs into Word, and I'd like to figure out the best way to do this.

Thanks again!

Community Expert, Oct 09, 2020

Hi Emily,

If you are getting the book, do you know what state it is in? The reason I ask is that books generally do not lie very flat on scanners (or photocopiers, for that matter). I'm sure you've seen this. Below I link to an article I wrote on getting good-quality scans, which should help you get past many of the scanning issues. BUT, if you can't get the book to lie flat or (the extreme option) destroy the book, you may have issues there as well.

What you haven't told me yet is how much actually needs to be copied and pasted. This is important because at a certain point you have to ask yourself: do I want to get the job done by this approach that seems pretty reasonable, OR do I want to get the job done?

What I mean is that it might actually be faster to just retype the parts you need to quote rather than do what seems so logical in this day and age: copy and paste!
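To make that trade-off concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical placeholder (typing speed, error rate, seconds per correction), and the function name is made up; plug in figures measured from a sample page of your own scan.

```python
def faster_to_retype(words: int, error_rate: float,
                     seconds_per_fix: float, typing_wpm: float) -> bool:
    """Return True if retyping the excerpt is estimated to beat fixing the paste.

    words:           length of the excerpt in words
    error_rate:      fraction of pasted words that need correcting (0.0 to 1.0)
    seconds_per_fix: average time to spot and fix one bad word
    typing_wpm:      your typing speed in words per minute
    """
    retype_minutes = words / typing_wpm
    fix_minutes = words * error_rate * seconds_per_fix / 60
    return retype_minutes < fix_minutes


if __name__ == "__main__":
    # Hypothetical: half the words are wrong, 10 s per fix, 40 wpm typist.
    print(faster_to_retype(1000, 0.5, 10, 40))
    # Hypothetical: a clean scan where only 2% of the words are wrong.
    print(faster_to_retype(1000, 0.02, 10, 40))
```

With a badly garbled paste the correction time grows with every wrong word, while retyping time depends only on the length of the excerpt, which is why retyping can win for short quotations from a poor scan.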

 

I have to add that about a year ago my sister found a family biography that our mom had been typing. It was in an old Courier font, so good news there, but between her age and the typewriter's old, very slippery platen (the rubber roller that the paper sits against and the keys strike), the page would slowly rotate as she typed, so text ended up overlying text. It was a mess. I did try to scan and OCR it but only got about half of it. The rest I just retyped by hand; it was faster than trying to correct it word by word.

This is all something you will have to determine after trying a scan the way I suggest in the article below and seeing what you get. I do wish you luck!

http://photosbycoyne.com/Gary's_Help/Scanning/clean-scanning.html

New Here, Oct 10, 2020

Thanks for the information. I'm mostly dealing with library books, and it's true that, even if I use the high-quality scanner at my office, it's often not possible to get the pages totally flat. The article is helpful, and I'll need to figure out how to apply your tips to the scanner in my office.

I will eventually need to convert many, many pages, so I've been hoping for a technological solution to reduce the labor. If I have no choice but to transcribe it all, then that's what I'll have to do. But it does seem crazy if in this day and age there's not a better solution. As with many things, the task is more complicated than it seems on the surface. Thanks again!

Community Expert, Oct 10, 2020

Hi Emily,

If it's possible to have that book be a sacrificial book, then technology wins.

When I retired from my day job, I had some 50 years of a professional journal that I did not want to walk away from but knew I did not have space for at home. I was able to borrow a fast FujiScan scanner from a friend and chose to cut up every journal so that it could be run through the speed scanner. The scanning worked like a charm, but FujiScan's OCR software was, well, not good, so after scanning I ran the results through Acrobat and got the best of both worlds.

BTW, I'm assuming that your scanning, copying, and pasting are all within the copyright provisions of the original source material and author? I'd hate for you to go through all this effort and find yourself in some kind of legal morass.

Best,

Gary

New Here, Oct 20, 2020

Thanks, Gary, for the reply, and apologies for the delay in responding. I've been pulled in a couple of other directions and am just now getting back to this. I'm planning to try to scan this source at my office tomorrow, and I'll employ your helpful tips. Fingers crossed I end up with something more usable.

(And, yes, this project will adhere to all copyright provisions. Right now I'm working up a proposal, but for the final work we will secure permission from the copyright holder, and we have a budget for that.)
