Reproducible Research Publication

Sharing, publishing, archiving

Overview

Teaching: 30 min
Exercises: 10 min
Questions
  • How should I publish my efforts?

  • What are the licenses available to me?

Objectives
  • Learn about the most common licenses

Sharing, publishing and archiving research products

What is the difference between sharing, publishing & archiving?

Non-synonymous terms

We’ll be focusing on publishing and archiving

Why, with whom, what, when, where, and how to publish & archive?

For the remaining slides we will assume your manuscript is ready for submission.

Why?

Increased visibility and citations

Citation density for papers with and without publicly available microarray data, by year of study publication. Piwowar & Vision (2013), Data reuse and the open data citation advantage.

Funding agency or journal requirements

Funding agencies

Journals

Why?

Distribution of reporting errors per paper for papers from which data were shared and from which no data were shared. Wicherts et al. (2011), Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results.

Why?

Reproducibility: what’s in it for me?

more efficient, less redundant science - by allowing others to build upon our work

Five selfish reasons to work reproducibly by Florian Markowetz

Who?

Whom do we need to share with?

For research to be reproducible, the research products (data, code) need to be publicly available in a form that people can find and understand.

What?

Catalog the artifacts you produced this morning
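One possible way to start the catalog is to count the files in your project directory by type. The sketch below is only an illustration: the function name `catalog_artifacts` is hypothetical, and it assumes your artifacts live under a single project folder.

```python
from collections import Counter
from pathlib import Path


def catalog_artifacts(root):
    """Count the files under `root` (recursively), grouped by file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Extensions are lowercased; files without one go under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `catalog_artifacts(".")` from the project root returns a `Counter` that you can print and then sort into the share / maybe / no groups by hand.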

Activity outcomes

share? YES!

share? maybe?

share? NO!

Activity outcomes

Advice: One way to determine what you need to publish is to go through and redo the analyses in your paper. Make a note of the data, code, and notes you needed to do each analysis, and make sure all of that is available. This might seem time consuming, but it ensures that what you think you did is what you actually did.

Computing Workflows for Biologists: A Roadmap
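The "make sure all of that is available" step in the advice above can be partly automated. This is a minimal sketch, assuming you have written down the files you needed as paths relative to the project folder; the function name `missing_artifacts` and the example paths are hypothetical.

```python
from pathlib import Path


def missing_artifacts(root, required):
    """Return the paths in `required` (relative to `root`) that are not files."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]
```

For example, `missing_artifacts("my_project", ["data/raw.csv", "analysis.py"])` returns an empty list when both files are present, and otherwise lists what still needs to be added before you publish.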

When?

You can make your code and data public at any point of the research process.

However, at the point of paper submission, the results in your paper should be reproducible, and therefore the data and code used in the paper should be published.

Where?

Discuss: Contrast with journal supplementary materials.

Many repositories

Registry of Research data Repositories

Growth of re3data.org

Only some of these are archival, meaning they commit to retaining your data and products for longer periods of time. This is an important consideration, depending on your funder’s requirements.

How to choose?

What goes where when?

You will likely have different artifacts:

Possible workflow:

University libraries try to help

Libraries often have good resources for data management plans and information and access to repositories. They are particularly good at thinking about data archives.

Librarians are very helpful and super awesome! They’re a great resource.

How to share, publish: file formats

Do’s

Don’ts

how to share, publish: standard data formats

Using standard data formats is sometimes required, but even when it’s not, conforming to standards greatly increases opportunities for re-use and understanding.

how to share, publish: checklist

Documenting your research (in pairs)

Copyright applies to creative works

Typically not copyrightable:

Depends on jurisdiction and case:

Choose A License

Get help choosing a license

software licensing guide

Morin, Andrew, Jennifer Urban, and Piotr Sliz. 2012. A Quick Guide to Software Licensing for the Scientist-Programmer.

Creative Commons

Open is not open to interpretation

The Open Definition sets out principles that define “openness” in relation to data and content. It makes precise the meaning of “open” in the terms open data, open content, and open source:

CC Zero

Dryad requires CC0

Dryad’s use of CC0 to make the terms of reuse explicit has some important advantages:

licenses versus community norms

From the Panton Principles: “[…] in the scholarly research community the act of citation is a commonly held community norm when reusing another community member’s work.”

Let scientists do science without having to talk to lawyers.

Challenges and concerns about publishing data and code

Discussion

What are some of the challenges of publishing research products? What are some of the concerns that people have?

Key Points