Reproducible Quantitative Methods

Instructor Guide, Lesson 6

Programming in R / Licensing data and software for reuse


Topics and Resources

  1. R Review

    Review last week. Seriously, go over everything you did last week. Remind students what each line of code does. Remember, they’re new to this and frequent reviews and reminders are important!

  2. The basics of programming in R

    These three concepts (conditionals, loops, and functions) are the fundamentals of almost any programming approach.

      • Conditional- Working Definition: A logical statement that allows a computer to follow a different set of instructions depending on whether a condition is true or false (true/false values are known as boolean values).
    Example 1
    Example 2
    Example 3

      • Loop- Working Definition: A set of rules or steps that are repeated over and over until a certain condition is reached. This condition (another true/false, or boolean, value) is evaluated each time the loop runs to determine whether the loop should keep going. Loops help you avoid repeating lines of code and make your program more efficient to write and maintain.
    Example

      • Function- Working Definition: A discrete reusable chunk of code that can be called to perform a specific task.
    Example
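
    The three working definitions above can be illustrated with a minimal base R sketch. All names and values here (`temperature_c`, `population`, `classify_temperature`, the 30-degree threshold) are made up for illustration and do not come from the course materials.

```r
# Conditional: run different code depending on a TRUE/FALSE test.
temperature_c <- 31
if (temperature_c > 30) {
  label <- "hot"
} else {
  label <- "not hot"
}

# Loop: repeat the same steps until a condition is reached.
# The test (population < 1000) is re-evaluated before every pass.
population <- 100
years <- 0
while (population < 1000) {
  population <- population * 2
  years <- years + 1
}

# Function: a reusable chunk of code that performs one task.
# (Hypothetical helper; the threshold argument is an assumption.)
classify_temperature <- function(x, threshold = 30) {
  if (x > threshold) "hot" else "not hot"
}

# Call the function once per element of a vector.
readings <- c(12, 31, 25, 40)
labels <- sapply(readings, classify_temperature)
```

    Running the script line by line and printing `label`, `years`, and `labels` after each part gives students a chance to predict the result before they see it.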

    If you don’t use many functions in your own work, consult Quick-R. Look for problems within your dataset that can be solved with any of these approaches; combining them works as well.

    Example: if you have to make a calculation that requires a conditional or iterating through a loop, write this within a function, and experiment with applying the function you wrote to multiple data objects. Here is sample code from the first iteration of the course that uses a function to replace missing values with estimates:
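
    The original sample code from the course is not reproduced here. As a stand-in, the following is a minimal sketch of the same idea, assuming the estimate is simply the mean of the observed values. The function name `fill_missing` and the example vectors are hypothetical.

```r
# Hypothetical helper: replace NA values in a numeric vector with the
# mean of the observed values (mean imputation is an assumed method,
# chosen only to keep the example short).
fill_missing <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  estimate <- mean(x, na.rm = TRUE)
  # Loop over positions; the conditional fills only the missing ones.
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      x[i] <- estimate
    }
  }
  x
}

# Apply the same function to multiple data objects (made-up data).
site_a <- c(4.0, NA, 6.0)
site_b <- c(10, 20, NA, 30)
filled <- lapply(list(site_a = site_a, site_b = site_b), fill_missing)
```

    The explicit loop is written out to show a conditional inside a loop inside a function. Once students are comfortable with it, the loop body can be replaced by the vectorized form `x[is.na(x)] <- estimate`.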

Exercises

  1. Loops, conditionals, and functions
  2. Let students suggest ideas about how to approach data processing using these methods, and nudge them in directions you’d like.


    ProTip(s)


    We've got lots of helpful hints for this section

    Plot early and plot often! Encourage students to plot their data to ensure that the data going into, and coming out of, functions follow expected patterns. Simple x-y scatter plots using base R will often do the trick. Check for things like impossible values and strange relationships between variables.
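
    A quick check of this kind might look like the sketch below. The data are invented for illustration, and the negative body mass is a deliberately impossible value.

```r
# Made-up example data; the -5 body mass is deliberately impossible.
body_length_mm <- c(52, 61, 58, 70, 66)
body_mass_g <- c(3.1, 4.0, -5, 5.2, 4.6)

# Simple x-y scatter plot using base R.
plot(body_length_mm, body_mass_g,
     xlab = "Body length (mm)", ylab = "Body mass (g)")

# A numeric check to go with the plot: which masses are not positive?
impossible <- which(body_mass_g <= 0)
```

    The outlier is obvious in the plot, and `impossible` holds the position of the offending row so it can be inspected or corrected.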

    Helpful commentary. As you’re explaining analyses and live coding, type what you’re saying up as comments in your script file. This will help both you and your students remember what you did and why. Too many comments are better than too few!

    Make it pretty. Autoformat your code in RStudio to make it easier to read and debug. The shortcut to reindent lines is Cmd + I on Mac or Ctrl + I on Windows.

    Learn from your mistakes. Learning to use the R help and to look up error messages is more useful than learning the syntax or commands for any specific package.

Discussion

Licensing

How do data and content licenses differ? What do you need to keep in mind when assigning a license more restrictive than CC0 to your work? Talk about what happens if you are integrating multiple data sets and one is licensed as non-commercial while the other is licensed only as share-alike. Think back to the Intellectual Property Policy exercise and revisit which licenses you may or may not be allowed to apply to your work.

Readings

About Creative Commons Licences and Licensing for data reuse

Video

Open Data Licensing Animation - OERIPR Support (7:01)

Questions

How do data and content licenses differ?

What is license stacking and what do you need to keep in mind when assigning a license to your work?

What are some things you’d want to consider when selecting a data repository?
