Using Python to Create Many Files from a Template Document
In one of my jobs, we sometimes create templates for writing content.
We then make a copy for each page of content that needs to be written and upload that into our task tracking tool.
This workflow means that often find myself name 100+ files.
Yes, that is exactly as tedious and time-consuming as it sounds!
Obviously, this is something to script!
Since such a workflow is probably a fairly common task in other places, so I thought it would take an extra step and share it.
Getting Started
I like to start with a checklist of what I expect to accomplish.
I often write this out as pseudocode and this becomes both the structure of my script and comments.
In this case, what I wanted was very simple and looked like this:
# Move to the folder
# Create a copy of the template
I expect to need the following information:
workingFolder # where the template exists and where the new files will be created
template # name of the template document
contentArea = # the content area to help differentiate large numbers of files
updatedDate = # Current year-month "YYYY-MM"
extension # the document extension
pageTitles # a list of titles for each page
Filling in the variables
My next step is to gather all of the information I need and fill in the variables.
This changes from batch to batch, but here's an example:
workingFolder = "~/DocumentNamer"
template = "Content Template v1.0 - 2018 11Nov 08.docx"
contentArea = "Home"
updatedDate = "2018-11"
extension = ".docx"
pageTitles = ["document 1", "document 2"]
We keep the page titles in a spreadsheet.
To get them into the proper format I use a text editor like TextMate or Atom.
Pasting into the editor, each title is on its own line.
I then use a Regular Expression (RegEx) search to find all the content on each line, as well as the carriage return at the end of the line.
I want to keep the content and throw away the carriage return, so I wrap the content in ().
It looks like this:
Find: (.*)\n
In the replace, I can use everything captured in () by calling $1.
I want to wrap each title in quotes and add a comma at the end so that my list of pageTitles
looks like I showed above.
It looks like this:
Replace: "$1",
This efficiently turns
document 1
document 2
into
"document 1", "document 2"
The last step is to copy all of that and paste into the square brackets.
pageTitles = ["document 1", "document 2"]
The Command Line Magic
If this was running in the Terminal, the command would be to simply copy the template to the new file name.
For example:
cp "Content Template v1.0 - 2018 11Nov 08.docx" "document 1"
As a simple command without any kind of piping, to do this from Python, I use call
from the subprocess module.
The command looks like this
call("cp", template, filename)
So, all that's left to do is to loop through all of the pageTitles and create an output file.name.
In Python that looks like this:
for pageTitle in pageTitles:
filename = contentArea + " - " + updatedDate + " - " + pageTitle + extension
Warning: If you are doing more complicated filename creations, make sure to use the os.path module.
The join()
command can save a lot of frustration!
Final Code
The final code looks like this:
#!/usr/bin/env python
import os
from subprocess import call
def main():
"""Create copies of the template for each page/task in Wrike"""
# Variables
workingFolder = "~/DocumentNamer"
template = "Content Template v1.0 - 2018 11Nov 08.docx"
contentArea = "Home"
updatedDate = "2018-12"
extension = ".docx"
# List of file names to create
pageTitles = ["document 1", "document 2"]
# Move to folder
os.chdir(workingFolder)
# create copy of template
for pageTitle in pageTitles:
filename = contentArea + " - " + updatedDate + " - " + pageTitle + extension
call(["cp", template, filename])
print "Created " + filename
if __name__ == '__main__':
main()
Improvements
This is meant to be quick and dirty code that gets used in real life.
It took minutes to write, seconds to update with the new titles, and saves me hours.
As such, it's not very pretty.
There are a couple of things that could be improved with a little effort.
- I usually run this from a text editor.
Instead, it could be modified to run from the command line. - There is no testing to see if anything failed, like moving to the working folder or finding the template properly.
- Using os.path module to properly join path elements.
- Creating the update date automagically using the datetime module.
Any other improvements you can think of? Please let me know!
Hello! Your post has been resteemed and upvoted by @ilovecoding because we love coding! Keep up good work! Consider upvoting this comment to support the @ilovecoding and increase your future rewards! ^_^ Steem On!
Reply !stop to disable the comment. Thanks!
Congratulations @wrashi! You have completed the following achievement on the Steem blockchain and have been rewarded with new badge(s) :
Click here to view your Board of Honor
If you no longer want to receive notifications, reply to this comment with the word
STOP
To support your work, I also upvoted your post!
Do not miss the last post from @steemitboard:
@wrashi, thank you for supporting @steemitboard as a witness.
Here is a small present to show our gratitude
Click on the badge to view your Board of Honor.
Once again, thanks for your support!
Do not miss the last post from @steemitboard: