Publishing is Always Publishing All

Enthusiast ,
Feb 11, 2013

Hi there.

RH 9 (latest patch)

Windows 7-64-bit

WebHelp output

We have a couple of very large projects (2000+ topics in each) and publishing is taking a long time because even though we have Publish All cleared, it's still sending up all the files.

For example, in one of these large projects, I made changes to two topics, and added an index entry and then generated it and published it. The generation time isn't too bad and is what I would expect.

But the publish time for a change to only two topics and an index entry is ridiculous. It's sending up thousands of files. Here are my results:

Total Files: 2519

Files Published: 2179

Elapsed Time: 18:04

How can we improve this? This is happening for me and my two coworkers on our Doc Team. We are publishing to an internal intranet webserver that we use for documentation reviews from SMEs. We all share the same publishing location. We all upload our content the same way: by mapping a drive to the server and publishing with the File System option.

We are also sharing projects using the open source Mercurial source control system.

Is RH getting confused because multiple authors are sending to one spot? That is, if I publish to that location and someone else then publishes there, does RH no longer know, when I publish again the next day, that I've published there before, and so think it has to send all the files over again?

Thanks in advance!

Community Expert ,
Feb 12, 2013

Jared

No issue with what you say about the time, but I don't think Rh is sending all the files; it just gives that impression.

You will see it running through all the filenames, but it is checking that the local and server copies of each file are synchronised and only uploading what has changed. However, that is more than the two topics. The index has changed, and that affects a number of support files that also have to be uploaded.

Think of it this way. If you uploaded just what has changed using FTP it would be quicker. If you had to stop and manually compare the server and local versions, it would take longer and that is what Rh is doing.

I'm not sure about how source control works here. To the best of my knowledge the process takes what is there and then works as if it were on your PC, in other words it updates your copy. I'm not sure if multiple authors publishing rather than one person doing it is good practice. Maybe ask about the workflow in the source control forum.


See www.grainge.org for RoboHelp and Authoring tips

@petergrainge

Help others by clicking Correct Answer if the question is answered. Found the answer elsewhere? Share it here. "Upvote" is for useful posts.
Enthusiast ,
Feb 12, 2013

Hi Peter. Thanks for responding, but I'm watching the target directory as I type, and it's clearly pushing up all the .htm files. I didn't change these and they have no relation to the files I did modify. The date modified for the .htm files in the publishing directory is the current date and time, not the older date I would expect if it were only doing a compare.

I'll do some more testing.

Community Expert ,
Feb 13, 2013

Let us know how that goes.

Community Expert ,
Feb 21, 2013

Perhaps try doing a Get Latest before generating and publishing, to be sure all the files are exactly the same as the last checkin? There's a .txt file in the publishing directory (at least there is one on a test project I have) that lists MD5 and SHA1 codes, so perhaps these aren't matching for some reason.

You could also try having only one person doing the publishing for a few days, just to see if that makes a difference.

Amber

Enthusiast ,
Feb 27, 2013

Amber, thanks for replying. I'm not sure what you mean by MD5 and SHA1 codes. What are those?

Community Expert ,
Feb 27, 2013

They're codes that can be generated against a file and that are supposed to be unique to that specific file. If the file is changed in any way then the code will no longer match. I assume RH is using these codes to determine if the file has changed since the last upload. So if you all do a Get Latest before publishing, then I'm theorising that the unchanged files will then match those codes on the server (rather than perhaps having older versions being published from your local drive). But it's only a guess on my part.

Amber

(Ah, I like how Wikipedia describes MD5: "also commonly used to check data integrity".)
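
For anyone else wondering what these codes look like in practice, here is a minimal Python sketch using the standard `hashlib` module. It only illustrates the general idea of MD5 and SHA-1 digests; the sample topic text and the `digests` helper are made up for the example, and nothing here is taken from RoboHelp itself.

```python
import hashlib

def digests(data: bytes) -> tuple[str, str]:
    """Return the MD5 and SHA-1 digests of the given bytes as hex strings."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()

# Hypothetical topic content, before and after a one-character edit
original = b"<html><body>Topic text</body></html>"
edited = b"<html><body>Topic text!</body></html>"

print(digests(original))
print(digests(original) == digests(original))  # identical bytes give identical digests
print(digests(original) == digests(edited))    # any edit changes both digests
```

The point is that the digest depends only on the file's bytes, not on its name or date stamp.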

Community Expert ,
Feb 28, 2013

I bet RH is only looking at the date stamps to figure out what's changed. Doing hashes on each file would suck up a lot of time/processing.
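
To show how the two approaches can disagree, here is a rough Python sketch. It assumes nothing about what RH really does; the helper names and the temporary file are invented for the example. A file whose content is identical but whose modified time has moved (for instance because it was regenerated) is flagged by a date-stamp check and not by a hash check.

```python
import hashlib
import os
import tempfile
from pathlib import Path

def changed_by_mtime(path: Path, recorded_mtime: float) -> bool:
    """Date-stamp check: flag the file if its modified time differs from the recorded one."""
    return path.stat().st_mtime != recorded_mtime

def changed_by_hash(path: Path, recorded_md5: str) -> bool:
    """Content check: flag the file only if its bytes hash to a different MD5."""
    return hashlib.md5(path.read_bytes()).hexdigest() != recorded_md5

def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        topic = Path(tmp) / "topic.htm"
        topic.write_bytes(b"<p>unchanged topic</p>")
        recorded_mtime = topic.stat().st_mtime
        recorded_md5 = hashlib.md5(topic.read_bytes()).hexdigest()

        # Simulate the same content being written again an hour later
        later = recorded_mtime + 3600
        os.utime(topic, (later, later))

        return changed_by_mtime(topic, recorded_mtime), changed_by_hash(topic, recorded_md5)

print(demo())  # (True, False): the date stamp moved, the content did not
```

The trade-off is the one mentioned above: the date-stamp check never reads the file, while the hash check has to read every byte of every file.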

Enthusiast ,
Mar 01, 2013
...There's a .txt file in the publishing directory (at least there is one on a test project I have) that lists MD5 and SHA1 codes, so perhaps these aren't matching for some reason.
...
They're codes that can be generated against a file and that are supposed to be unique to that specific file. If the file is changed in any way then the code will no longer match. I assume RH is using these codes to determine if the file has changed since the last upload. So if you all do a Get Latest before publishing, then I'm theorising that the unchanged files will then match those codes on the server (rather than perhaps having older versions being published from your local drive). But it's only a guess on my part.

Okay. I've seen this before. The bsscftp.txt file. There's one for each folder in my project.

"Perhaps these aren't matching," you said. What are they supposed to match? How do I interpret these files? It looks like there are three lines of text for each file in the project.

Here are a few:

100

FILENAME:Assigning_PC-DMIS_Functions_to_buttons_on_the_SpaceMouse_or_SpaceBall.htm    MD5:8466113706989726975102507288529710379995797109686161    SHA-1:541226911911210852100981196582681001225299481005610067821051051007761   

FILENAME:Automating_PC_DMIS.htm    MD5:115102103871041118712253122491027452102116971039710357666161    SHA-1:1208343487111511678837711856857449847288845183556748115437261   

FILENAME:Available_PC_DMIS_Functions_for_SpaceMouse_or_SpaceBall.htm    MD5:75717310010089101114848773841117797779911978110120686161    SHA-1:1166612079687482821021071001161161085410149105100681137010384118717861  

...

And so on.
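
If it helps anyone compare two copies of this file, here is a small Python sketch that reads entries in the layout shown above. The layout is assumed purely from these sample lines (three labelled fields per entry, separated by whitespace), `parse_manifest` is a made-up helper name, and the digit strings are kept as opaque text because the thread does not establish how they are encoded.

```python
import re

# Assumed layout, taken only from the sample lines in this post:
# FILENAME:<name>  MD5:<digits>  SHA-1:<digits>
ENTRY = re.compile(r"FILENAME:(?P<name>.+?)\s+MD5:(?P<md5>\S+)\s+SHA-1:(?P<sha1>\S+)")

def parse_manifest(text: str) -> dict[str, tuple[str, str]]:
    """Map each file name to its (MD5 field, SHA-1 field) pair, both kept as raw text."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:  # skips blank lines and lines without the three fields, such as the leading 100
            entries[match["name"]] = (match["md5"], match["sha1"])
    return entries

sample = (
    "100\n"
    "\n"
    "FILENAME:Automating_PC_DMIS.htm    "
    "MD5:115102103871041118712253122491027452102116971039710357666161    "
    "SHA-1:1208343487111511678837711856857449847288845183556748115437261   \n"
)

for name, (md5_field, sha1_field) in parse_manifest(sample).items():
    print(name, md5_field, sha1_field)
```

Parsing the copy in the publishing location before and after someone else publishes, and comparing the two dictionaries, would show which entries actually changed.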

As for getting the latest, I assume you're referring to source control? I do try to pull the latest changes (we're using Mercurial, not RoboControl), but I don't know if that makes a difference.

Community Expert ,
Mar 18, 2013

Sorry for the delay, I've been away for a couple of weeks.

If RH were checking these codes, the process would be something like this: RH calculates the MD5 for a file and looks up the recorded number for that file in the .txt file. If the recorded number is the same as the newly calculated one, it doesn't upload the file; if it's different (because the file has changed), it uploads it.
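
As a rough illustration of that process, here is a Python sketch. It is not RH's implementation: `publish_changed`, the in-memory `recorded` dictionary (standing in for the .txt file), and the folder names are all hypothetical, and "upload" is just a file copy.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def md5_of(path: Path) -> str:
    """Hex MD5 of the file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def publish_changed(source: Path, target: Path, recorded: dict[str, str]) -> list[str]:
    """Copy only files whose MD5 differs from the recorded one; return their names."""
    uploaded = []
    target.mkdir(parents=True, exist_ok=True)
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        digest = md5_of(path)
        if recorded.get(path.name) != digest:  # new file, or content has changed
            shutil.copy2(path, target / path.name)
            recorded[path.name] = digest       # remember what was just published
            uploaded.append(path.name)
    return uploaded

def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        source, target = Path(tmp) / "output", Path(tmp) / "server"
        source.mkdir()
        (source / "a.htm").write_text("topic A")
        (source / "b.htm").write_text("topic B")
        recorded: dict[str, str] = {}
        first = publish_changed(source, target, recorded)   # nothing recorded yet
        (source / "b.htm").write_text("topic B, edited")
        second = publish_changed(source, target, recorded)  # only b.htm differs
        return first, second

print(demo())  # (['a.htm', 'b.htm'], ['b.htm'])
```

The first run sends everything because nothing is recorded yet; the second sends only the file whose content changed.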

Yes, I meant getting the latest version from source control.

The last suggestion is designating one person to do all publishing jobs for a few days, to see if that makes a difference. It might not be workable longer term, but at least it might indicate a single file that is storing a list of "last changed" topics.

Amber

Enthusiast ,
Mar 19, 2013

It does make a difference to have just one person publishing. Lately, I've been the one pulling the other authors' changes and publishing when they need it. In that case, the publish part of generation only puts up the newly changed or added topics (along with a bunch of files it creates on the fly to handle searching, TOC, index, etc., but that's 'normal').
