Splitting a Word file into several files

Participant, Jun 10, 2021

Hello,

I have a Word file of a book, and I would like to split it automatically by chapter.

The problem is that I don't know where to start 😞

Would it be easier to convert it to PDF first?

Thank you for your ideas.

Regards

TOPICS
Advanced techniques, Documentation

Views

131

Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Adobe Community Professional, Jun 13, 2021

Indeed. If you convert it to PDF beforehand, it will be easy to split it by chapter using ColdFusion's cfpdf.
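
For the conversion step itself, one option may be ColdFusion's cfdocument tag, which can convert a Word file to PDF through its srcfile attribute. The following is only a sketch: it assumes the server has the OpenOffice integration configured (which, as far as I know, ColdFusion relies on for Word conversion), and both file paths are hypothetical placeholders. Support for newer Word formats may depend on the ColdFusion version.

```cfml
<!--- Hypothetical paths (not from the thread); replace them with your own. --->
<cfset sourceDoc = "C:\books\mybook.doc">
<cfset targetPDF = "C:\books\mybook.pdf">

<!--- Assumes OpenOffice is configured in the ColdFusion Administrator. --->
<cfdocument format="pdf" srcfile="#sourceDoc#" filename="#targetPDF#" overwrite="yes"></cfdocument>
```

The resulting PDF can then serve as the source for cfpdf.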

Participant, Jun 14, 2021

Hello,

Thank you for the answer.

But which CFPDF option should I use to do this? (CFPDF does so many things!)

Regards, and thank you in advance.

Adobe Community Professional, Jun 14, 2021

You could use action="deletepages". 🙂

In the following example, I extracted the individual chapters from Guy Kawasaki's ebook. 

<!--- Determine the chapter start and end. Preferably manually, for accuracy. --->
<cfset chapterStartPage=[15, 43, 81, 102, 119, 159, 193, 212, 239, 270, 285, 298, 318]>
<cfset chapterEndPage=[41, 80, 101, 118, 158, 191, 211, 238, 269, 284, 297, 316, 365]>

<cfset sourcePDF="C:\Users\bkbk\Desktop\kawasaki\The Art of the Start 2.0 - Guy Kawasaki.pdf">
<cfset chapterDestination="C:\Users\bkbk\Desktop\kawasaki\chapters">

<cfloop from="1" to="#arrayLen(chapterStartPage)#" index="i">
    
    <!--- 
    For example, chapter 3 corresponds to the third item in the above arrays: 
    chapterStartPage[3]=81, chapterEndPage[3]=101. 
    To extract chapter 3, delete all pages between 1 and 80 and 
    between 102 and the end of the book.  
    --->
    <cfset pagesToBeDeleted="1-#chapterStartPage[i]-1#,#chapterEndPage[i]+1#-*">

    <cfpdf
        action = "deletepages"
        pages = "#pagesToBeDeleted#"
        source = "#sourcePDF#"
        overwrite = "yes"
        destination = "#chapterDestination#\chapter #i#.pdf">
</cfloop>
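
The page-range string built in the loop above assumes that every chapter starts after page 1 and ends before the last page. As a hedged sketch, the range could be built by a small helper that also covers those edge cases. The function names formatPageRange and buildPagesToDelete are hypothetical, and the helper takes the total page count as an argument instead of relying on "*".

```cfml
<cfscript>
// Hypothetical helper: formats a single page or a "first-last" range.
function formatPageRange(required numeric first, required numeric last) {
    if (first eq last) {
        return "" & first;
    }
    return first & "-" & last;
}

// Hypothetical helper: returns the value for the pages attribute of
// cfpdf action="deletepages" so that only startPage..endPage remain.
// Returns an empty string when the chapter spans the whole document;
// in that case there is nothing to delete and the file can simply be copied.
function buildPagesToDelete(required numeric startPage, required numeric endPage, required numeric totalPages) {
    var ranges = [];
    if (startPage gt 1) {
        arrayAppend(ranges, formatPageRange(1, startPage - 1));
    }
    if (endPage lt totalPages) {
        arrayAppend(ranges, formatPageRange(endPage + 1, totalPages));
    }
    return arrayToList(ranges);
}
</cfscript>

<!--- Chapter 3 of the example: pages 81 to 101 of a 365-page book. --->
<cfoutput>#buildPagesToDelete(81, 101, 365)#</cfoutput>
```

For chapter 3 this produces the same ranges as the comment in the loop describes: pages 1 to 80 and pages 102 to the end.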

 

Adobe Community Professional, Jun 17, 2021

@ZNB , does that answer your question?

Participant, Jun 24, 2021

Hello,

This is exactly what I was looking for! THANK YOU

However, I still need to find a way to determine the start and end of each chapter automatically.

If I can't find anything else, I thought of asking the client to insert markers such as $start$ and $end$ to identify them.

Does PDF put a special mark at the beginning and end of a chapter?

Thanks in advance.

Regards

Adobe Community Professional, Jun 26, 2021

Hi @ZNB ,

Your expectation is quite reasonable. I also expected that ColdFusion would by now offer a way to extract the chapter metadata of a PDF, especially because ColdFusion, like PDF, is part of the Adobe family.

Perhaps there is such a way. If so, I am unaware of it and would be glad to hear about it.


It may indeed be a good idea to ask the client to put $XstartX$ and $XendX$, respectively, at the start and end of a chapter. Here, X is the chapter number. Using chapter numbers will make the code simpler.
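
As a rough sketch of how such markers could be located, the function below scans an array holding the text of each page and records the page on which each marker appears. It rests on several assumptions: the markers look like $3start3$ and $3end3$ for chapter 3 (one reading of the suggestion above), the page texts have already been extracted by some other means (for example cfpdf's extracttext action, which is not shown here), and the function name findChapterMarkers is hypothetical.

```cfml
<cfscript>
// Hypothetical helper: pageTexts[n] holds the plain text of page n.
// Returns a struct with two structs, startPages and endPages,
// each keyed by chapter number and holding a page number.
function findChapterMarkers(required array pageTexts) {
    var result = structNew();
    var pageNo = 0;
    var k = 0;
    var hits = [];
    result.startPages = structNew();
    result.endPages = structNew();
    for (pageNo = 1; pageNo lte arrayLen(pageTexts); pageNo++) {
        // Start markers, e.g. $3start3$
        hits = reMatch("\$[0-9]+start[0-9]+\$", pageTexts[pageNo]);
        for (k = 1; k lte arrayLen(hits); k++) {
            // Drop the leading "$"; val() then reads the leading chapter number.
            result.startPages[val(mid(hits[k], 2, len(hits[k]) - 1))] = pageNo;
        }
        // End markers, e.g. $3end3$
        hits = reMatch("\$[0-9]+end[0-9]+\$", pageTexts[pageNo]);
        for (k = 1; k lte arrayLen(hits); k++) {
            result.endPages[val(mid(hits[k], 2, len(hits[k]) - 1))] = pageNo;
        }
    }
    return result;
}
</cfscript>
```

The two structs could then be turned into the chapter-start and chapter-end arrays used for the deletepages loop.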

 

In any case, it is possible to get information about PDF chapters automatically. You could do so by using the iText PDF library bundled with ColdFusion to extract the PDF's bookmarks.

In the example below, I extract the bookmarks of the Guy Kawasaki PDF ebook. From them, I could construct an array of chapter start pages, as defined in my previous code. What remains is for you to find a way to define the chapter end pages.


<cfset reader = CreateObject("java", "com.lowagie.text.pdf.PdfReader").init("C:\Users\bkbk\Desktop\kawasaki\The Art of the Start 2.0 - Guy Kawasaki.pdf")> 
<cfset simpleBookmark = createObject("java","com.lowagie.text.pdf.SimpleBookmark")> 
<cfset bookmarks = simpleBookmark.getBookmark(reader)> 

<cfif isNull(bookmarks)> 
	 No bookmarks. 
	<cfabort> 
</cfif>

<cfset chapterStartPages = arrayNew(1)>	
	
<cfset iterator = bookmarks.listIterator()>


<cfloop condition="iterator.hasNext()">
	
	<!--- A HashMap --->
	<cfset bookmark = iterator.next()> 	

	<!--- Debugging code. 
	Shows you an object containing chapter titles and page numbers, 
	if there are any. 
	ColdFusion will tell you that this object is a struct.
	But it is not; it is a HashMap. 
	--->
	<!---<cfdump var="#bookmark#">--->	
	
	<cfoutput>
		<cfif not isNull(bookmark.get('Kids'))>
			<cfloop from="1" to="#arrayLen(bookmark.get('Kids'))#" index="i">
				<cfif bookmark.get('Kids')[i]['Title'] contains "chapter">

					<cfset title = trim(bookmark.get('Kids')[i]['Title'])>
					<!--- The Page value begins with the page number, followed by view settings; keep only the number. --->
					<cfset pageNumber = listGetAt(trim(bookmark.get('Kids')[i]['Page']),1," ")>

					<p>
						Chapter Title: <strong>#title#</strong> <br>
						Chapter Start-Page: <strong>#pageNumber#</strong>
					</p>

					<cfset arrayAppend(chapterStartPages,pageNumber)>
				</cfif>
			</cfloop>
		</cfif>
	</cfoutput>
</cfloop>

<cfdump var="#chapterStartPages#" label="Chapter Start Pages">	

 

 

The output is:

BKBK_0-1624721397256.png
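
For the chapter end pages, one simple approach is to let each chapter end on the page before the next chapter starts, and to let the last chapter run to the final page of the document. The sketch below does this. The function name computeChapterEndPages is hypothetical, and the total page count is passed in as an argument (with iText it could be read from the reader via getNumberOfPages()).

```cfml
<cfscript>
// Hypothetical helper: derives end pages from an ordered array of start pages.
// Chapter i ends one page before chapter i+1 starts; the last chapter
// ends on the last page of the document.
function computeChapterEndPages(required array startPages, required numeric totalPages) {
    var endPages = [];
    var i = 0;
    var count = arrayLen(startPages);
    for (i = 1; i lte count; i++) {
        if (i lt count) {
            arrayAppend(endPages, startPages[i + 1] - 1);
        } else {
            arrayAppend(endPages, totalPages);
        }
    }
    return endPages;
}
</cfscript>

<cfset exampleStarts = [15, 43, 81, 102, 119, 159, 193, 212, 239, 270, 285, 298, 318]>
<cfdump var="#computeChapterEndPages(exampleStarts, 365)#" label="Chapter End Pages">
```

Note that this differs slightly from hand-picked end pages: any blank page or back matter that follows a chapter is attached to that chapter, so the first chapter of the example would end on page 42 instead of 41.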

 
