After experimenting with S3cmd, I immediately started automating my custom backup process. It was quick to get working and easy to improve as I went, thanks to Python!
Calling S3cmd from Python
The trick is to use Python's subprocess module to invoke the command as you would in the Unix/Linux shell, via the call function.
To get a quick idea of how this works, you can play interactively with the Python interpreter, using the ls command for example:
$ python
Python 2.7.11 (default, Dec 5 2015, 14:44:53)
...
>>> import subprocess
>>> CMD = "ls -la"
>>> subprocess.call(CMD, shell=True)
...
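The integer that call returns is the command's exit status, so you can use it to detect failures. A quick sketch in the same interpreter session (the error message is my own):

>>> rc = subprocess.call(CMD, shell=True)
>>> if rc != 0:
...     print("command failed with exit code %d" % rc)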
First, make it work with a simple case
In this first version of the script, I target the screenshots I take every now and then, as shown in my previous article.
Here is the resulting code:
#!/usr/bin/env python
import sys
import subprocess

BUCKET = "s3://contentgardening.com-backups"

def main(argv):
    """ Main function """
    SRC_DIR = "/MYSCREENSHOTS"
    DEST = BUCKET + "/images/screenshots/"
    # Upload all files from the source directory to the S3 destination
    CMD = "s3cmd put --force %s/*.* %s" % (SRC_DIR, DEST)
    subprocess.call(CMD, shell=True)

if __name__ == "__main__":
    main(sys.argv[1:])
Note that the s3cmd executable is referenced just by its name here, for better readability, but you will want to provide its full path. Even better, put the path in a variable, as done for the destination S3 bucket.
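For example, something like the following, assuming s3cmd is installed under /usr/local/bin (check the actual location on your system with `which s3cmd`):

S3CMD = "/usr/local/bin/s3cmd"  # assumed install location, adjust as needed
CMD = "%s put --force %s/*.* %s" % (S3CMD, SRC_DIR, DEST)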
To run it, we simply do:
$ python backup-to-s3--v1.py
The improved and extended version
I later extended the script to take care of different cases, such as backing up software archives which are located under a specific path on my computer.
The first change needed is a "for loop" to go through all the cases and, for each one, call the underlying tool with the right arguments. I used a mapping to provide the parameters for each case, mainly the source directory and the "virtual path" to use in the destination S3 bucket.
I ended up with the following code:
#!/usr/bin/env python
import sys
import subprocess

BUCKET = "s3://contentgardening.com-backups"

FILES_MAPPING = {
    'software': ["/MYSOFTWAREArchivesAndImages", "/software/"],
    'screenshots': ["/MYSCREENSHOTS", "/images/screenshots/"],
    # add other cases here...
}

def main(argv):
    """ Main function """
    backup_types = FILES_MAPPING.keys()
    for backup in backup_types:
        SRC_DIR = FILES_MAPPING[backup][0]
        DEST = BUCKET + FILES_MAPPING[backup][1]
        # Upload the files for this backup case
        CMD = "s3cmd put --force %s/*.* %s" % (SRC_DIR, DEST)
        subprocess.call(CMD, shell=True)
        # Print feedback to know the current files that are backed up
        LSCMD = "s3cmd ls %s" % DEST
        out = subprocess.check_output(LSCMD, shell=True)
        print(out)

if __name__ == "__main__":
    main(sys.argv[1:])
As you can see, there is a second part where I call s3cmd ls to get the current list of file objects in S3, which gives me a kind of feedback after the files are uploaded.
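One thing to be aware of: unlike call, check_output raises a CalledProcessError exception when the command exits with a non-zero status. A minimal sketch of how you could guard against that (the fallback message is my own):

try:
    out = subprocess.check_output(LSCMD, shell=True)
    print(out)
except subprocess.CalledProcessError as e:
    # s3cmd returned a non-zero exit code; report it instead of crashing
    print("listing failed with exit code %d" % e.returncode)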
This is already a good start, and I could make it run regularly via crontab.
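For example, a crontab entry along these lines would run the backup every night at 2am (the script path and name are just an illustration, adjust them to where you keep the script):

0 2 * * * python /path/to/backup-to-s3-v2.py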
Other things to add
There are several features that could be added in the future, such as logging the feedback information or some stats, building a summary report, and even mailing that report to an inbox or pushing it to an app for later analysis.
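As a first step in that direction, the ls feedback could go to a log file instead of stdout, using Python's standard logging module. A minimal sketch, with an assumed log file location:

import logging

# Assumed log file path, for illustration only
logging.basicConfig(filename="/tmp/backup-to-s3.log",
                    level=logging.INFO,
                    format="%(asctime)s %(message)s")
logging.info(out)  # record the s3cmd ls feedback for later analysis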
As I continue improving this, I will share more things that Python scripting allows us to do.