Tuesday 20 December 2011

OJS and file conversion

The creation of an online version of Amicus Curiae, the opening salvo of the SAS-Journals project, was based on organising existing articles in PDF form. New content for upcoming editions of the journal will also continue to be uploaded to OJS as finished PDFs. This works well for Amicus, which has long had a print edition and for which PDFs are a by-product of producing it.

However, there are different ways to approach the publication of online journals, and the issues surrounding this were discussed toward the end of the project, notably during the workshop on 20th October. The OJS software already supports the creation of articles 'from scratch', using the XML-galleys plugin. The default output of this is based on NLM's XML schema for publishing journal articles. This approach gives more potential flexibility when it comes to offering different formats for the finished article.

OJS seems reasonably agnostic concerning the formats it will allow for submissions, and this makes life easier for authors. However, editors and journal managers may wish to be more rigid about the format in which articles are presented once published (e.g. must be PDF!). So what options are there for converting submitted manuscripts from one format to another?

This discussion on the PKP BB gives an insight into some of the possibilities.

OpenOffice is generally the go-to tool for an open-source solution to document format conversion. It can be run "headless" (i.e. without the GUI) and used either to batch process conversions or possibly as part of a plugin for OJS.
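To make the batch idea a bit more concrete, here is a minimal sketch in Python. It assumes a LibreOffice-style `soffice` binary on the PATH that understands the `--headless`, `--convert-to` and `--outdir` options; older OpenOffice builds may need a different route. The function names and the `converted` output folder are my own invention for illustration, not anything OJS provides.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format="pdf", outdir="converted", binary="soffice"):
    """Return the argument list for a headless conversion of one file."""
    return [
        binary,
        "--headless",                  # run without the GUI
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert_all(folder, target_format="pdf", outdir="converted"):
    """Convert every .doc and .odt file in `folder`; return the commands that were run."""
    # Assumption: the office suite is installed and exposes `soffice` on the PATH.
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    Path(outdir).mkdir(parents=True, exist_ok=True)
    commands = []
    for source in sorted(Path(folder).iterdir()):
        if source.suffix.lower() in {".doc", ".odt"}:
            cmd = build_convert_command(source, target_format, outdir)
            subprocess.run(cmd, check=True)  # raises if the conversion fails
            commands.append(cmd)
    return commands
```

Calling `convert_all("submissions")` would then walk a (hypothetical) `submissions` folder and write one PDF per manuscript into `converted`.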

The good news for OJS administrators is that lemon8-XML, a web-based application that operates separately from the OxS suite, was released by PKP in 2009. This uses a headless instantiation of OpenOffice to convert Word or OpenOffice files into NXML.

So we could imagine a workflow like this:

OpenOffice & Word documents -> lemon8-XML -> NXML -> XSLT -> PDF & HTML
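As a rough illustration of the NXML -> HTML leg of that workflow, here is a small Python sketch. Note that it is not XSLT: it uses the standard library's ElementTree to do the same kind of element-to-element mapping a stylesheet would, purely so the example runs without extra libraries. The sample document is made up, and only a handful of NLM elements (`article-title`, `sec`, `title`, `p`) are handled.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up, heavily trimmed article using a few NLM element names.
SAMPLE_NXML = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A sample article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </sec>
  </body>
</article>"""


def nxml_to_html(nxml):
    """Map the article title, section titles and paragraphs to simple HTML."""
    root = ET.fromstring(nxml)
    title = root.findtext("front/article-meta/title-group/article-title", default="")
    parts = ["<h1>%s</h1>" % escape(title)]
    for sec in root.findall("body/sec"):
        heading = sec.findtext("title")
        if heading:
            parts.append("<h2>%s</h2>" % escape(heading))
        for p in sec.findall("p"):
            # itertext() flattens any inline markup inside the paragraph.
            parts.append("<p>%s</p>" % escape("".join(p.itertext())))
    return "\n".join(parts)


print(nxml_to_html(SAMPLE_NXML))
```

A real stylesheet would of course cover far more of the schema (references, figures, inline formatting), but the shape of the job is the same.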

It is worth noting that there is significantly more jiggery-pokery required when going from XML to PDF than to HTML. As far as I am aware there would need to be an intermediary step involving something like Apache FOP. I'm sure someone has implemented something like this by now... anyone?

