Reproducible Quantitative Methods

Lesson 6

Programming in R / Licensing data and software for reuse


Topics and Resources

  1. R Review

    Review last week. Seriously, go over everything you did last week. Remind yourself what each line of code does. Remember, you're new to this and frequent reviews and reminders are important!

  2. The basics of programming in R

    We'll be going over an example of this on Thursday, but if you have little or no programming experience, please give this section a once-over before class.

    These three concepts-- conditionals, loops, and functions-- are essentially the fundamentals of any programming approach.

      • Conditional- Working Definition: A logical statement that allows a computer to follow a different set of instructions depending on whether a condition is true or false (true/false values are known as boolean values).
    Example 1
    Example 2
    Example 3

      • Loop- Working Definition: A set of rules or steps that are repeated over and over until a certain condition is reached. This condition (another true/false, or boolean) is evaluated each time the loop runs to determine if the loop should keep going. Loops let you avoid repeating lines of code, which keeps your program shorter and easier to maintain.
    Example

      • Function- Working Definition: A discrete reusable chunk of code that can be called to perform a specific task.
    Example
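
    The three working definitions above can be sketched in a few lines of base R. This is a minimal illustration, not code from the course materials: the vector `temps_c` and the function `classify_temp` are made-up names, and the 0 and 25 degree cut-offs are arbitrary assumptions.

```r
# Hypothetical data: five temperature readings in degrees Celsius
temps_c <- c(-3, 12, 27, 0, 31)

# Function: a reusable chunk of code that labels a single reading
# (classify_temp is an illustrative name, not part of any package)
classify_temp <- function(temp) {
  # Conditional: follow a different branch depending on TRUE/FALSE tests
  if (temp < 0) {
    "freezing"
  } else if (temp > 25) {
    "hot"
  } else {
    "mild"
  }
}

# Loop: repeat the same step once for every element of temps_c
labels <- character(length(temps_c))
for (i in seq_along(temps_c)) {
  labels[i] <- classify_temp(temps_c[i])
}

print(labels)
```

    Once the function exists, `sapply(temps_c, classify_temp)` should give the same labels without writing the loop by hand.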

    If you don’t use many functions in your own work, consult Quick-R. Look for problems within your dataset that could use any of these approaches; combining them often works well too.

    Example: if you have to make a calculation that requires a conditional or iterating through a loop, write this within a function, and experiment with applying the function you wrote to multiple data objects. Here is sample code from the first iteration of the course that uses a function to replace missing values with estimates:
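
    The course's original sample code is not reproduced here. As a stand-in, the following is a minimal sketch of the same idea, assuming the "estimate" is simply the mean of the observed values; the function name `fill_missing` and the example vectors are hypothetical.

```r
# Hypothetical helper: replace NA values in a numeric vector with an estimate.
# ASSUMPTION: the estimate is the mean of the non-missing values; the original
# course code may have used a different estimator.
fill_missing <- function(x) {
  # Conditional: refuse input that is not numeric
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  estimate <- mean(x, na.rm = TRUE)
  x[is.na(x)] <- estimate
  x
}

# Made-up data objects to try the function on
heights <- c(150, NA, 170, 160)
weights <- c(55, 65, NA, NA)

# Loop: apply the same function to several data objects
measurements <- list(heights = heights, weights = weights)
filled <- list()
for (name in names(measurements)) {
  filled[[name]] <- fill_missing(measurements[[name]])
}

print(filled)
```

    Mean imputation is only one possible choice; whatever estimator you use, compare the data before and after to confirm that only the missing values changed.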

Exercises

  1. Loops, conditionals, and functions
  2. Brainstorm how your new knowledge will aid you in processing your data.


    ProTip(s)


    We've got lots of helpful hints for this section

    Plot early and plot often! When applying functions, loops, and conditionals to your data, check that each operation did what you expected. Simple x-y scatter plots in base R will often do the trick: look for things like impossible values or strange relationships between variables, so you can confirm that the data going into and coming out of your functions follow the expected patterns.
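
    As a rough sketch of this kind of check, the example below plots the input of a function against its output and adds a simple numeric test for impossible values. The data and the function `inches_to_cm` are made up for illustration.

```r
# Made-up example: convert lengths from inches to centimetres
length_in <- c(10, 12.5, 8, 15, 11)
inches_to_cm <- function(x) x * 2.54
length_cm <- inches_to_cm(length_in)

# Quick base R scatter plot of input against output; a unit conversion
# should fall on a straight line through the origin
plot(length_in, length_cm,
     xlab = "Length (in)", ylab = "Length (cm)",
     main = "Check: inches to centimetres")
abline(a = 0, b = 2.54, lty = 2)  # the relationship we expect to see

# Simple numeric checks for impossible values (lengths must be positive)
print(summary(length_cm))
stopifnot(all(length_cm > 0), !anyNA(length_cm))
```

    Points that fall off the dashed line, or a failed `stopifnot()`, would be a sign that the function is not doing what you intended.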

    Helpful commentary. Don't forget to comment your code while you work! This will help both you and future scientists understand what you did and why. Too many comments are better than too few!

    Make it pretty. Autoformat your code in RStudio to make it easier to read and debug. To reindent the selected lines, press Cmd + I on Mac or Ctrl + I on Windows.

    Learn from your mistakes. Learning to use the R help and to look up error messages is more useful than learning the syntax or commands for any specific package.

Discussion

Licensing

For Tuesday: How do data and content licenses differ? What do you need to keep in mind when assigning a license more restrictive than CC0 to your work? Talk about what happens if you are integrating multiple data sets and one is licensed as non-commercial while the other is licensed only as share-alike. Think back to when we discussed MSU's IP policies and revisit what licenses you may or may not be allowed to apply to your work.

Readings

About Creative Commons Licences and Licensing for data reuse

Videos

Open Data Licensing Animation - OERIPR Support (7:01)

A short video introducing the concepts.



Kaitlin Thaney on Open Licencing (1:01:41)

A longer lecture. I'd like to introduce a friend and mentor of mine, the amazing Kaitlin Thaney. At the time this video was recorded, Kaitlin was the Director of the Mozilla Science Lab (she was recently promoted to Director of Mozilla Leadership Networks, and now oversees Mozilla's science, advocacy and development programs). Before she was with Mozilla, Kaitlin was with Creative Commons, a leader in intellectual property law in the era of the internet. CC has shaped how information is shared on the web for nearly two decades now. Kaitlin visited my class last semester and gave a fantastic lecture on the history of intellectual property online and how it enables scientific discovery.

Questions

How do data and content licenses differ?

What is license stacking and what do you need to keep in mind when assigning a license to your work?

What are some things you’d want to consider when selecting a data repository?
