Repairing a corrupt topic in an RSC database
I recently discovered that the inclusion of certain
non-alphanumeric characters in a RoboHelp HTML topic's file name
can cause that file in an RSC database to become corrupt. The
corresponding .htm file in my local working folders (Windows
Explorer) is not corrupt. The offending characters (at least those
that I've seen) include:
– (em dash; the en dash doesn't cause a problem)
… (ellipsis, as a single character; three separate periods side by side don't cause a problem)
' (apostrophe)
Oddly, RoboSource Control allows files whose names contain one of these characters to be added to a version control database. These files can also be checked out to a client's local machine, where the client can work in the file without any issues. However, when such a file is checked back in, the corresponding database file becomes corrupt. The corruption takes the form of a change to the file name. For example, a file named "export_files….htm" (where the ellipsis is a single character) is renamed to "export_filesBLAHBLAH" (where "BLAHBLAH" is a random combination of symbols and special characters). The garbage always starts at the offending character and forms the rest of the file name; any portion of the name before the offending character remains unchanged.
I recognize the importance of restricting file names to alphanumeric characters, but if this happens again, what's the best way to repair a corrupt topic in a database? Thinking it through, I've come up with the following process:
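Since RSC accepts these names without complaint, it may be worth checking the local project folder before anything is checked in. Below is a minimal Python sketch of such a check. It assumes a conservative allow-list (ASCII letters, digits, underscore, hyphen, and period), which is stricter than what I actually observed: it would also flag the en dash, which has not caused me a problem. The function names and the project folder are hypothetical, and nothing here talks to RoboHelp or RSC.

```python
import re
from pathlib import Path

# Assumption: only these characters are treated as safe in a topic file name.
SAFE_CHAR = re.compile(r"[A-Za-z0-9_.-]")


def find_risky_names(names):
    """Return (name, offending characters) pairs for names outside the allow-list."""
    risky = []
    for name in names:
        bad = sorted({ch for ch in name if not SAFE_CHAR.fullmatch(ch)})
        if bad:
            risky.append((name, bad))
    return risky


def scan_project(folder):
    """Check every .htm file under a (hypothetical) local project folder."""
    return find_risky_names(p.name for p in Path(folder).rglob("*.htm"))
```

Calling `scan_project` on the local working folder returns a list of file names together with the characters that fall outside the allow-list; an empty list means nothing was flagged.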
1. Delete the corrupt file from the database.
2. Use Windows Explorer on your local machine to make a copy of the corresponding .htm file.
3. Remove any offending characters from the file name of the copy of the .htm file.
4. Open RoboHelp HTML and delete the topic from the project.
5. Import the renamed copy of the .htm file into the Help project.
6. Check in the "new" (restored) .htm file.
Note: I can't remember for certain, but I believe I tried simply renaming the corrupt file in RSC and it wouldn't let me, so I don't think that's an option.
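Steps 2 and 3 above (copy the local .htm file, then strip the offending characters from the copy's name) could be scripted rather than done by hand in Windows Explorer. This is only a sketch under the same assumed allow-list of ASCII letters, digits, underscore, hyphen, and period; the helper names are hypothetical. Steps 1 and 4 through 6 still have to be done in RSC and RoboHelp HTML themselves.

```python
import re
import shutil
from pathlib import Path


def sanitized_name(name, replacement=""):
    """Drop (or replace) every character outside the assumed allow-list."""
    return re.sub(r"[^A-Za-z0-9_.-]", replacement, name)


def copy_with_clean_name(source):
    """Copy `source` into the same folder under a cleaned-up name.

    The original file is left untouched; the path of the copy is returned.
    """
    source = Path(source)
    target = source.with_name(sanitized_name(source.name))
    if target == source:
        raise ValueError(f"nothing to clean in {source.name!r}")
    if target.exists():
        # Refuse to overwrite an existing topic file.
        raise FileExistsError(str(target))
    shutil.copy2(source, target)  # copies contents and timestamps
    return target
```

The copy is written next to the original so that it can then be imported into the Help project as in step 5.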