Reproducible Quantitative Methods

Lesson 1

Introduction to reproducibility and open science frameworks


Topics and Resources

  1. Course syllabus and expectations

    Here is our class syllabus

    This is a project-based learning course. The idea is to mentor students through a complete reproducible workflow, using a real dataset, with the intention of publishing their work as a manuscript, complete with data and code products. This project-based approach is meant to stimulate the natural scientific curiosity that we wouldn't get from using canned examples to complete the exercises, and to motivate us by offering authorship on conventional scientific products, all while giving us skills that will bring our work into compliance with federal funder guidelines (for example, NSF). This means we'll need to start with some data. The good news is that unused (or under-used) data is everywhere.

      • See Simon Leather’s post on unused data that needs love

      • I want us to have groups established, with appropriate data sets identified, by week 2 of the class. Let's use this etherpad to discuss the available projects. New to etherpad? Think of it as a collective notepad with chat functionality. We can, and will, use these for a lot of purposes in this class. Need a fresh etherpad for a new project? Go here.

  2. Open science, open data, & reproducibility

    What is open science? What is open data? We'll use this time to talk, in very broad terms, about what reproducibility and open science are, how they fit together, and why they're important. For this topic, I'll ask the class to come up with a definition together, and then I'll clarify or tweak it. This is a good opportunity for me to gauge prior knowledge and attitudes. Here are some resources you can use to help form your position:

      • What is open science? from The OpenScience Project (2009)

      • What is open science? from F1000Research (2014)

      • What is open data?

      • Open Data Primer 1

      • Challenges and Responses to Open Data

    Every community inevitably produces its own terminology and jargon, and the open science and reproducible research community is no exception. Please review the Open Research Glossary. It is not only an excellent resource for definitions of terminology commonly used in open science, it's also an example of the community-driven products that are common in the open science community. Check out this article describing how this glossary came about.

  3. Rules and regulations from funders and institutions

    Here's a bit of the legal stuff: the rules and regulations surrounding reproducibility, sharing, and openness.

      • Data Management Plans: Data management plans are now a required part of most federal grant proposals. See the SPARC resource for Data Sharing Requirements by Federal Agency.

      • Enforcement: What sort of teeth do the rules and regulations around sharing and reproducibility have? See Today’s Data, Tomorrow’s Discoveries, NSF's public access plan.

      • Institutional Intellectual Property Policy: Become familiar with your own! It might be hard to find. For example, Michigan State University’s copyright policies are here.

Exercises

  1. Find MSU's IP policy, and discuss

    Sometimes this information can be hard to find, and as we move on to other places, you're going to need to be able to find this information yourself. Search for:

    “Intellectual property”
    “IP rights”
    “Copyright policy”
    “Data sharing policy”

    Look on Office of Research or institutional Technology Transfer websites, or for an institution-wide policy directory.

    Can you interpret the policy in terms of your own work? Here are some questions I'd like you to think about:

    What are the rules or regulations around sharing your particular research products?
    Does the IP policy support the funding mandates?
    Who do you go to with questions about this policy?
    Is the policy different for students vs paid employees?

  2. Sign up for GitHub

    If you don't already have a GitHub account, please sign up! This will be important as we're organizing our groups for our projects. We'll be learning a lot more about GitHub over the course of our projects, but we'll learn it in bits, so don't let the octocats intimidate you.

Discussion

Openness and reproducibility in research

So why bother with reproducibility? What's the big deal about openness? How do they fit together? In this, the first class discussion, let's think about your motivations for taking this class. We'll begin the discussion by watching the video together.

Video

Rethinking Research Data | Kristin Briney | TEDxUWMilwaukee (15:05)

Questions

Do you agree or disagree with Briney’s assertion that publication is advertising? What might make it “advertising” and not “science?”

What concerns do you have about the concept of open data, and what challenges do you see? Why do you support open data or open science?
