Publishing: Content Regardless of Format?

January 09, 2014 Data & AI, MarkLogic

“Publishing is about content, not format.” – Wendy Queen, Associate Director of Project Muse, Johns Hopkins University Press

I first heard this quotation at an STM Group Innovation Day I attended in London in early December. The speaker was Sayeed Choudhury of the Data Conservancy at Johns Hopkins University, and notes on his presentation were blogged by Suzanne Kavanagh of the ALPSP later that day. Choudhury’s presentation was about the implications of “Big Data” for scientific and scholarly researchers: how it is collected and preserved, and how researchers and others can discover and share it.

I particularly enjoyed his argument that data are new types of collections that can be created or converted to digital formats for processing by machines. They can change the nature of what libraries offer and enable us all to think about data in different ways – that is, about the methods used to deal with them and the services that can be offered around them. He pointed out that using machines to automate some tasks will require new types of infrastructure, but that no matter how good machines get, there will still be a set of services – particularly interpretative tasks – that only human “experts” can perform. Ultimately this is the only way to preserve the quality and value of the data and content being provided.

Choudhury then presented the quote by his Johns Hopkins colleague Wendy Queen, and it has stayed in the back of my mind because I’m not sure I entirely agree with it. I looked up the dictionary definition of the word “publish”: it is the issuance, public announcement or communication of “printed or otherwise reproduced textual or graphic material, computer software, etc.” That is, publishing is about making something known to the public – the world at large – rather than keeping it private or concealed. For me, therefore, publishing is about content – yes – but it is also very much about format. No matter what the quality of your content, if you can’t publish it in a format or medium that is easy to distribute and consume, then it is irrelevant to all but a small number of people.

I believe this argument is borne out by the impact that printing presses had on the educational and political persuasions of many people in the 15th and 16th centuries. Erasmus and Luther had quality content for sure, but without the ability to distribute it to the masses, their writings would have had far less impact on the public and would have changed opinion and belief much more slowly.

To find out a little more, I tried to track down the original source of the quotation but was not successful. What I did find, however, was this document written by Queen and her colleague Dean Smith, who are the Associate Director and Director respectively of Project Muse at Johns Hopkins. It is about article-level enhancements in the humanities and social sciences and emphasizes the importance of dynamic content, innovation and collaboration at the article level.

I suspect that the point of this article’s titular quotation is not that format is unimportant, but rather that the specific format doesn’t matter as long as it is fit for purpose and easily consumable. Many would also argue that for important content to have maximum impact, authors and publishers should aim to make it as engaging as possible – perhaps through interactivity. Innovation is certainly coming very fast in that area, as is discussed in this article by Bret McCabe about academic publishing confronting its digital future and in this recent news item about the launch of the UCL Big Data Institute in collaboration with Elsevier. In fact, Elsevier has long been a leader in this space, and its projects to enhance the “article of the future” have been making headlines for some time. Springer, Wiley and many others have also made advances in this area.

More than 500 years after the printing press came into widespread use, we have far more choice when it comes to format, although the uptake of digital articles and e-books is still gradual. Inevitably the quality of the content that is served up will vary widely. Perhaps we should look to the human experts that Choudhury mentioned to help us fine-tune our filters and interpretation skills when it comes to scholarly research and content. Experts seem to agree that using technology to promote collaboration during research and peer review will help, but in addition the format of the content needs to be flexible and curated responsibly with a platform-based system. Wendy Queen and Dean Smith point to this in their article:

“A move towards the integration of multiple content formats— journals, books, reference works, datasets, YouTube videos, and others—on a fully-discoverable platform has begun.”

In a previous blog post I covered the importance of good metadata in making content discoverable, and this is emphasized by the quotation above. It seems to me that publishing is about more than just content. Content – preferably high-quality content – is vital, but so are format and discoverability. Publishing is the act of announcing, declaring and communicating with the public, and of making quality data, content and research available to the widest audience for whom it is relevant. Determining relevance is where analytics comes in, but that is material for another article!

Wishing you all a very Happy New Year!

Kate Tickner