Friday 26 June 2015

Generating e-books with Calibre on Hostmonster

One thing the newly generated Imaginary Realities web site currently lacks, which the old hand-written version had, is epub and PDF versions of the published issues that can be downloaded and read at one's leisure.  Or put away for offline reference, if that's what you're into.

These e-books were generated by Calibre, which is a wonderful piece of work written primarily in Python.  It allows you to add a web page as an e-book, and it will follow the links and put the pages together as an HTML e-book.  Then you can convert that e-book into any number of formats, including the two formats I concentrate on, PDF and epub.  All this is done through a pretty standard media library interface for e-books.  But you can also drive the conversion from the command line if you are willing to put in a bit more work.

As the new web site is generated into HTML using Jinja2, I wanted to take the output and, as a final step, push it through Calibre's ebook-convert command.  Compiling Calibre from source is a complicated endeavour that the author warns against, so I wanted to avoid it.  Instead I downloaded and installed one of the Linux static builds.  By default, it installs to /opt, but it's possible to redirect it to use ~/ as the base directory instead.
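A rough sketch of how a build script might locate the installed command, checking the home directory first and then the default /opt location. The helper name find_ebook_convert and the opt/calibre layout below each base directory are assumptions for illustration; the Calibre installer does not promise this layout.

```python
import os


def find_ebook_convert(base_dirs=None):
    """Return the first executable ebook-convert found, or None.

    Assumes (hypothetically) that Calibre lives in <base>/opt/calibre.
    """
    if base_dirs is None:
        # Home directory install first, then the system-wide default.
        base_dirs = [os.path.expanduser("~"), "/"]
    for base_dir in base_dirs:
        candidate = os.path.join(base_dir, "opt", "calibre", "ebook-convert")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Returning None rather than raising lets the site generator skip e-book output on a machine where Calibre is not installed.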

Once installed, it's a somewhat straightforward matter to generate the e-books (see the source code). Take the command path and the command arguments that map to the original conversion done in the GUI (you have to guess these somehow).

The command:

    import os
    import subprocess

    command = "/home/mememe/opt/calibre/ebook-convert"

The arguments:

    # One list entry per argument, so no shell quoting or splitting is needed.
    standard_arguments = [
        "--disable-font-rescaling",
        "--margin-bottom=72",
        "--margin-top=72",
        "--margin-left=72",
        "--margin-right=72",
        "--chapter=/",
        "--page-breaks-before=/",
        "--chapter-mark=rule",
        "--output-profile=default",
        "--input-profile=default",
        "--pretty-print",
        "--replace-scene-breaks=",     # empty value, what a shell passes for =""
        r"--toc-filter=.*\[\d+\].*",   # raw string keeps the regex backslashes
    ]

And then invoke Calibre for each issue and each format:

        output_basename = "imaginary-realities-v%02di%02d-%04d%02d" % (volume_number, issue_number, year_number, month_number)
        for suffix in ("epub", "pdf"):
            output_filename = output_basename + "." + suffix
            # ebook-convert takes the input file, the output file, then the options.
            ret = subprocess.call([command, html_path, os.path.join(output_path, output_filename)] + standard_arguments)

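
The --toc-filter argument above is a regular expression, and as I understand the option, table of contents entries whose titles match it are dropped. A small sketch of what that pattern catches, using Python's re module; the titles are made up for illustration, and whether Calibre anchors the match the same way is an assumption (the leading and trailing .* make it matter little here).

```python
import re

# Same pattern as the --toc-filter argument: any title containing
# a bracketed number such as [1] or [12].
TOC_FILTER = re.compile(r".*\[\d+\].*")


def filter_toc(titles):
    """Drop titles that match the filter, keeping the rest in order."""
    return [title for title in titles if not TOC_FILTER.match(title)]


# Hypothetical titles, not taken from a real issue.
titles = ["Introduction", "[1]", "See note [12] for details", "Journey [of] Discovery"]
print(filter_toc(titles))  # ['Introduction', 'Journey [of] Discovery']
```

In other words, links that are really footnote references get kept out of the generated table of contents, while ordinary headings (even ones with brackets but no number) survive.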
Unfortunately, if you're on shared hosting, then you're at the mercy of whoever administers it. It turns out that PDF generation uses Qt components, and if you're not on the right version of libc++ or some combination of libraries, then the PyQt extension modules that come with Calibre's static install will simply fail to import. It's not so much a bug in Calibre as Hostmonster offering a dated environment. The Mobi format is also affected by this problem.
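
One way to live with this is to attempt every format and simply note which ones fail, so the rest of the site still gets built. The following is a minimal sketch, not what the repository does: the function name convert_formats and its parameters are hypothetical, and it assumes ebook-convert exits with a non-zero status when a conversion fails.

```python
import subprocess


def convert_formats(command, html_path, output_base, suffixes,
                    extra_args=(), call=subprocess.call):
    """Try each format and return (succeeded, failed) lists of suffixes.

    `call` is injectable so the logic can be exercised without Calibre.
    """
    succeeded, failed = [], []
    for suffix in suffixes:
        output_file = output_base + "." + suffix
        try:
            ret = call([command, html_path, output_file] + list(extra_args))
        except OSError:
            # The command itself could not be started (missing, not executable).
            ret = -1
        if ret == 0:
            succeeded.append(suffix)
        else:
            failed.append(suffix)
    return succeeded, failed
```

On a host like the one described, calling this with ("epub", "pdf") would be expected to report epub as succeeded and pdf as failed, and the templates could then link only to the files that exist.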

Epub is about the only format which will generate without the use of Qt. Unfortunate, but there's not much that can be done about it without investing a lot more work, and there are so many other things that I could do with this project. My iPad 1 (which is a poorly aging piece of junk) will accept both epub and PDF in Apple's e-book reader app. Hopefully, most other modern devices can also handle the epub format.  We'll see!  It might also be that there's a different tool which can be more easily installed or compiled and which will generate PDFs of the same level of quality.

Thursday 25 June 2015

Imaginary Realities source code

The Imaginary Realities web site is generated using Jinja2 from Python flat files.  I've changed the repository from private to public, as there's no real reason to keep it shelved away.  If my hosting goes down, someone else can easily generate their own version of the web site and host it, should that take their fancy.

The git repository is hosted on Bitbucket:

https://bitbucket.org/rmtew/imaginary-realities